The Lactuca lineage is one of nine lineages in the lettuce subtribe (Cichorieae, Asteraceae) distributed in Europe, Africa, Asia and North America. Within the Lactuca lineage two clades show disjunct Eurasian-North American distributions. One disjunct clade consists of diploids (x = 8) and allotetraploids (x = 17), the former restricted to Eurasia and the latter to North America and the Azores. In contrast, members of the other Eurasian-North American disjunct clade are all diploid (x = 9), like the remainder of the Lactuca lineage (diploid, x = 8 or 9). The aims of the present study were to investigate the migration pathways that led to the disjunct distributions of these two Eurasian-North American clades and the potential progenitors of the allopolyploid taxa. We conducted deep taxon sampling and multi-locus phylogenetic analyses using nuclear ribosomal DNA (ETS and ITS), a low-copy nuclear marker (A44) and five non-coding plastid markers. Divergence time estimations with BEAST and ancestral biogeographic estimations with BioGeoBEARS suggested that both lineages reached North America by the late Miocene. Cloning of the A44 region revealed two sequence copies within allopolyploid individuals that were resolved in divergent clades, and this helped to identify potential progenitors. We provide competing hypotheses for the progenitor species and biogeographic pathways that gave rise to the allotetraploid lineage, and we propose a North American origin for the Azorean endemic. Taxonomic conclusions include raising L. graminifolia var. mexicana to specific rank under the name L. brachyrrhyncha and recognizing that the alleged endemic L. jamaicensis in fact represents the SE Asian L. indica, introduced to Jamaica.
Version of record first published online on 24 August 2018 ahead of inclusion in August 2018 issue.
The most striking floristic similarities among Northern Hemisphere plant disjunctions are observed between North America and E Asia (Gray 1859; Wen 1999; Milne & Abbott 2002; Liu & al. 2017). Disjunct distributions of closely related plant species between continents in the Northern Hemisphere may be explained by factors such as vicariance or long-distance dispersal (LDD; Wen 1999; Manos & Donoghue 2001; Donoghue & Smith 2004). Plant migration between North America and Eurasia has been facilitated by two major land bridges from the Middle Eocene until the Middle Miocene, and even until much more recently as fluctuating land masses would have persisted until the Last Glacial Maximum (LGM). Analyses of disjunct Northern Hemisphere plant lineages using molecular data incorporating molecular clocks, calibration points and ancestral range estimations enable us to test hypotheses to explain plant disjunctions across the Northern Hemisphere (Milne 2006; Drew & al. 2017).
The lettuce subtribe (Lactucinae) contains c. 200 species and is widely distributed in Eurasia, Africa and North America (Kilian & al. 2009); its centres of diversity are in SW Asia and the Sino-Himalayan region (Wang & al. 2013). All North American Lactuca species and Northern Hemisphere intercontinental disjunctions are resolved within the so-called “Lactuca lineage”, which is one of nine major lineages within the lettuce subtribe and contains c. 40 species [clade I in Kilian & al. (2017b); c. 9.1 mya (million years ago)]. Kilian & al. (2017b) revealed that the currently recognized genus Lactuca (currently consisting of up to c. 100 species) is not monophyletic in either its wider or its narrower circumscription. We therefore focus on this monophyletic “Lactuca lineage”, which, according to Kilian & al. (2017b), contains a number of taxa that had previously not been included in the genus Lactuca, whereas a number of taxa that had always been considered members of the genus were in fact resolved outside of the Lactuca lineage.
North American species of the Lactuca lineage belong to three distinct clades (two native and one non-native; Kilian & al. 2017b). The most diverse Lactuca clade native to North America (with some seven species) consists exclusively of allotetraploid species with a chromosome number of 2n = 34, referred to here as the “L. canadensis clade” (after its most widespread species L. canadensis L.; Table 1; Babcock & al. 1937; Feráková 1977; Kilian & al. 2017b). Lactuca watsoniana Trel., a rare perennial tall herb restricted to moist rocky slopes at high altitudes, endemic to the Azores archipelago in the Atlantic Ocean, is also a member of this clade and shares this chromosome number (Dias & al. 2018). The second clade native to North America consists of the widespread L. oblongifolia Nutt., which, in contrast to the allotetraploid L. canadensis clade, is diploid with a chromosome number of 2n = 18. This species is usually named L. pulchella (Pursh) DC. or Mulgedium pulchellum (Pursh) G. Don or treated as a subspecies of the Eurasian L. tatarica L. (L. tatarica subsp. pulchella (Pursh) Stebbins). Lactuca oblongifolia has yet to be sampled in a phylogenetic context, but due to its close morphological affinity to the Eurasian L. tatarica it is expected to be a member of the L. tatarica clade (Kilian & al. 2017b, see Table 1). There is a third clade of the Lactuca lineage present in North America, the “core Lactuca clade” (Kilian & al. 2017b), with four nonnative species: L. serriola L., accidentally introduced first in the late 1890s (Swearingen & Bargeron 2016), has very rapidly colonized the U.S.A. and Canada apart from the northernmost area; L. sativa L. is an escape from cultivation; L. saligna L. is present in many states of the U.S.A. and also in part of E Canada; and L. virosa L. has only a rare and scattered occurrence in the U.S.A. They are all restricted to disturbed sites and roadsides.
Table 1. Names of the two North American Lactuca lineages focused on in the present study. For each lineage the general distribution ranges, species members and chromosome numbers are provided.
Two major land bridges have facilitated the migration of plant taxa across the Northern Hemisphere to North America at different times since the Early Cenozoic and Late Cretaceous: the N Atlantic and the Bering land bridges [NALB and BLB, respectively; Tiffney (1985a); Tiffney (1985b); Donoghue & al. (2001); Sanmartín & al. (2001); Wen & al. (2010); Wen & al. (2016)]. The NALB connected E North America and NE Europe, persisting until c. 15 mya (Milne 2006), yet it is likely that it continually facilitated plant dispersal until much later in the late Miocene (Denk & al. 2010). On the other hand, the BLB connected W North America with E Asia and persisted until the Pliocene, and fluctuating land masses would have facilitated migration until more recently in the Pleistocene, large parts of Beringia having remained unglaciated during the Last Glacial Maximum (Hultén 1937; DeChaine 2008). The BLB is therefore more recent than the NALB (Tiffney 1985b; Wen & al. 2016). Much of the research on Northern Hemisphere disjunct plant lineages to date has focussed on woody lineages that originated during the early Cenozoic, especially those showing W North American-E Asian disjunctions, which is the most common pattern among Northern Hemisphere plant disjunctions (Donoghue & Smith 2004; Liu & al. 2017).
The two Lactuca clades with species native to North America (Table 1) likely arose relatively recently, in the Pliocene (Kilian & al. 2017b; Dias & al. 2018). Within the L. tatarica clade, the closest known relative of the North American L. oblongifolia based on morphology, i.e. L. tatarica, occurs from Central Europe to Far East Russia and is estimated to have originated c. 2.7 mya based on the nuclear ribosomal Internal Transcribed Spacer (nrITS; Kilian & al. 2017b). In contrast, the L. canadensis clade was resolved as sister to the SW European Alpine-Pyrenean L. plumieri (L.) Gren. & Godr. (frequently treated as Cicerbita plumieri (L.) Kirschl.) and originated at least 4.3 mya in the Pliocene according to nrITS analyses for the North American clade (Kilian & al. 2017b) and 3.8 – 4.5 mya according to Dias & al. (2018) for the entire L. canadensis clade. Based on their nrITS and plastid DNA analyses and the fact that no other Lactuca species are native to the Azores, Dias & al. (2018) hypothesized that migration from Europe via the BLB to North America and subsequent LDD to the Azores could explain the current disjunct distribution between the L. canadensis clade and its closest relative, L. plumieri. Because most Azorean endemic species are considered to have their closest relatives in Europe, this colonization pathway to the Azores would be unusual, but it would not be an exception [e.g. Smilax azorica H. Schaef. & P. Schönfelder; Schaefer & Schönfelder (2009)]. The two Lactuca clades native to North America (Table 1) appear to be of independent origins, suggesting that two separate migration events from Eurasia to North America have occurred during the evolutionary history of Lactuca.
Considering that members of these two clades show contrasting distribution patterns and that different land bridges connected North America and Eurasia (BLB and NALB) at different times throughout the Cenozoic, it is possible that the two disjunct Lactuca lineages followed contrasting migration pathways and therefore represent different biogeographic histories. The present study aims to incorporate divergence time and ancestral biogeographic range estimations into the phylogenetic reconstruction of the Lactuca lineage. By conducting deeper taxon sampling within the Lactuca lineage and expanding the marker sampling compared to previous studies (Kilian & al. 2017b; Dias & al. 2018), including a second nuclear ribosomal (nr) DNA marker, the External Transcribed Spacer (nrETS), the present study assesses the migration pathways of the two disjunct Lactuca clades across the Northern Hemisphere.
As previously mentioned, the Lactuca canadensis clade is unique in containing exclusively allotetraploid members of the Lactuca lineage (Table 1). Allotetraploids result from the genome merging of different species (Krak & al. 2013). For the allotetraploid L. canadensis clade, a hybridization event between two species with basic chromosome numbers of x = 8 and 9, giving rise to the Azorean-North American L. canadensis clade of x = 17, is therefore likely (Babcock & al. 1937). Extant species of the Lactuca lineage with a basic chromosome number of x = 17 are exclusive to North America (and the Azores). Furthermore, the only other extant species native to North America (L. oblongifolia) has a basic chromosome number of x = 9; extant species with x = 8 are absent from North America. Based on the distribution areas of extant species, the allopolyploid L. canadensis clade is geographically isolated from at least one of its potential progenitors (the x = 8 lineage).
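The chromosome arithmetic behind this inference can be written out explicitly. The following is a worked sketch, assuming the usual route of hybridization between an x = 8 and an x = 9 diploid followed by genome doubling; the symbols x_1 and x_2 are introduced here only for illustration.

```latex
\begin{aligned}
x_{\text{hybrid}} &= x_1 + x_2 = 8 + 9 = 17,\\
2n_{\text{allotetraploid}} &= 2\,(x_1 + x_2) = 2 \times 17 = 34 .
\end{aligned}
```

This agrees with the count of 2n = 34 reported for the L. canadensis clade, whereas a diploid with x = 9 such as L. oblongifolia has 2n = 18.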
Allopolyploidy is commonly involved in hybrid speciation and polyploidy has been shown to facilitate LDD and evolutionary success within plant lineages (Inda & al. 2008; Linder & Barker 2014). However, allopolyploidy can complicate phylogenetic inference and in order to address questions of ancestral hybridization it is important to combine and compare evidence from differently inherited markers; this can provide valuable insights into the origins and evolutionary history of such lineages (Hughes & al. 2002; Krak & al. 2013). Nuclear data can be highly informative when investigating the origins of putative allopolyploid lineages because they are biparentally inherited, whereas plastid DNA is typically maternally inherited (Chapman & al. 2007). The nrDNA regions are subject to high copy number and concerted evolution that may lead to chimeric sequences, locus loss or duplication, and may cause ribotypes to be eliminated from some lineages, and therefore mislead our interpretations of evolutionary histories and ancestral hybridization (Alvarez & Wendel 2003; Nieto Feliner & Rossello 2007; Zhang & al. 2012). Low-copy nuclear markers are informative in plant phylogenetics because they are biparentally inherited and less susceptible to concerted evolution (Chapman & al. 2007; Duarte & al. 2010); they are therefore useful when nrDNA and plastid DNA regions show incongruence, as in the case of Lactuca (Wang & al. 2013; Kilian & al. 2017b). Low-copy nuclear markers were sampled by Smissen & al. (2011) to untangle multiple allopolyploid lineages within Gnaphalieae (Asteraceae) and by Kilian & al. (2017a) to investigate incongruences between plastid and nrDNA phylogenies within an endemic Socotran lineage also from Gnaphalieae. Therefore, the present study also incorporates sequences of the low-copy nuclear marker A44 for Lactuca (Chapman & al. 2007).
Low-copy nuclear markers may, however, be subject to paralogy, when gene duplications reflect the history of the gene rather than the species' relationship to a most recent common ancestor. They also may be subject to incomplete lineage sorting (ILS), when gene copies coalesce within the duration of an ancestral species and not within the duration of the species (Linder & Rieseberg 2004; Krak & al. 2013). To address these issues it is important to incorporate differentially inherited markers in order to compare phylogenetic reconstructions. Moreover, cloning low-copy nuclear genes can also be informative because all potential sequence copies of a gene within an accession can be estimated and incorporated in phylogenetic reconstructions (Naumann & al. 2011). Issues associated with gene trees not reflecting species trees can then be addressed with higher confidence. Cloning is often performed when ambiguous sites are observed in the direct sequences. However, cloning a number of accessions from a group of taxa (i.e. not only those that show sequence polymorphism in the direct sequence) can also be informative when assessing whether a low-copy nuclear marker may be appropriate for phylogenetic reconstruction (Krak & al. 2012). This helps to confirm that different copies within accessions are monophyletic within diploid, non-hybridizing species and could therefore be useful to investigate the origin of hybrid species within a lineage. In this study, therefore, we also clone the low-copy nuclear marker A44 for a number of samples across the Lactuca lineage.
Phylogenetic studies by Kilian & al. (2017b) were the first to sample the entire subtribe Lactucinae and disentangle generic diversity; their phylogenetic analyses showed high support for the monophyletic Lactuca lineage. The present study aims to expand taxon and marker sampling within the Lactuca lineage from Kilian & al. (2017b) to conduct phylogenetic analyses in order to gain greater resolution within the two disjunct Eurasian-North American clades (the L. canadensis and L. tatarica clades; Table 1). Using the phylogenetic trees in the present study as a backbone, we aim to answer the following questions: (1) What are the ancestral biogeographic ranges estimated for the two native disjunct Eurasian-North American Lactuca clades and what are the most likely migration pathways that have led to their current geographic distributions? (2) Which species are the likely progenitors associated with the hybrid origin of the allotetraploid L. canadensis clade? Because extant members of this x = 17 allopolyploid clade are geographically isolated from putative progenitors (known x = 8 species), we aim to bring together the evidence from our phylogenetic and ancestral biogeographic analyses and discuss it in the context of morphology and historical biogeography, in order to provide hypotheses for the putative progenitors, their likely geographic origins, and the location of the ancestral hybridization event.
Material and methods
Taxon sampling, plant material and dataset — In order to investigate the diversity patterns and origins of native North American Lactuca taxa, we selected our sampling according to the topologies revealed by Kilian & al. (2017b) in the phylogenetic trees based on five non-coding plastid DNA loci and the nuclear ribosomal DNA internal transcribed spacer (nrITS). We focused our sampling on the Lactuca lineage and conducted deeper sampling of the native North American clades and their closest relatives. Three datasets were built (A1, A2 and B). Newly generated sequences are based on herbarium material from the herbaria AZB, B, TENN and TEX/LL (see Appendix 1 – Supplementary Material online (wi.48.48206_Appendix_1.xlsx)). Herbarium specimens from B, GH, GOET (type material), NY, TENN and TEX/LL were used for morphological comparison of taxa. High-resolution scans of further type material were accessed via JSTOR Global Plants (JSTOR 2016+), other specimen scans via GBIF (2015+).
Dataset A1 included 37 ingroup taxa that were sampled for the nrITS and nrETS; both hereon referred to as nrDNA. Dataset A2 included 36 ingroup taxa across the Lactuca lineage that were sampled for the same five noncoding plastid DNA loci used by Kilian & al. (2017b). The outgroup taxa for datasets A1 and A2 consisted of lettuce subtribe taxa that were resolved outside of the Lactuca lineage in both plastid DNA and nrITS analyses by Kilian & al. (2017b), namely Cicerbita alpina (L.) Wallr. and C. muralis (L.) Dumort. of the Cicerbita lineage, and Melanoseris lessertiana (Wall. ex DC.) Decne. and M. violifolia (Decne.) N. Kilian of the Melanoseris lineage. In order to investigate intraspecific variation of widespread taxa, we also extended the number of samples per taxon. Therefore, samples of L. tatarica from across its distribution in Europe and Asia and multiple accessions of the widespread North American L. oblongifolia were included. Multiple individuals were sampled for 19 of the 37 taxa of dataset A1, and for 17 of the 36 taxa in dataset A2 (see Appendix 1 (wi.48.48206_Appendix_1.xlsx)).
The third dataset (B) consisted of a subset of 23 ingroup taxa from datasets A1 and A2, for which the A44 low-copy nuclear region (Chapman & al. 2007) was amplified and sequenced, and a number of samples were cloned (see below). This region was selected for phylogenetic analyses in Lactuca because it was shown to consistently amplify and was phylogenetically informative across the group. The sampling strategy for dataset B aimed to include disjunct Northern Hemisphere clades (Table 1) and at least one member of all other clades resolved in the phylogenetic analyses of the datasets A1 and A2. As for datasets A1 and A2, the outgroup for dataset B was Cicerbita alpina.
DNA extraction — DNA samples were isolated using either the Macherey-Nagel NucleoSpin Plant II kit, following the respective protocol, or the Qiagen DNeasy Plant kit.
Amplification and sequencing — For dataset A1 sequences of the nrITS region published by Wang & al. (2013), Schilling & al. (2015), Kilian & al. (2017b), Dias & al. (2018) and others were used, and 15 were newly generated, using the same conditions as Wang & al. (2013) and Kilian & al. (2017b) and the primer combinations ITS4/ITS5 (White & al. 1990) or ITSA/ITSB (Blattner 1999). All 62 sequences of the nrETS region were newly generated, using the primers AST-1 and 18-S-ETS (Baldwin & Markos 1998; Markos & Baldwin 2001). For PCR amplification of the nrETS region the following protocol was performed: an initial denaturation step of 1 minute 30 seconds at 95 °C, followed by 35 cycles of a 30-second denaturation step at 95 °C, 1 minute of primer annealing at 56 °C and 45 seconds of extension at 72 °C, followed by a final extension step at 72 °C for 10 minutes.
For dataset A2 sequences of the five non-coding plastid DNA loci (petB-petD spacer plus petD intron, psbA-trnH spacer, 5′trnL(UAA)-trnF spacer, rpl32-trnL(UAG) spacer, 5′rps16-trnQ(UUG) spacer) published by Wang & al. (2013), Kilian & al. (2017b) and Dias & al. (2018) were used and 118 were newly generated, using the same conditions as Wang & al. (2013) and Kilian & al. (2017b). The following primer combinations were used: petB-petD spacer plus petD intron were co-amplified with the universal primers PIpetB1411F/PIpetD738R (Löhne & Borsch 2005); the psbA-trnH spacer with the universal primers psbAF/trnHR (Sang & al. 1997); the 5′trnL(UAA)-trnF spacer with the universal primers trnC/trnF (Taberlet & al. 1991); the rpl32-trnL(UAG) spacer with the primers rpl32-F/trnL(UAG) and the 5′rps16-trnQ(UUG) spacer with the primers rps16x1/trnQ(UUG) (Shaw & al. 2007).
For dataset B the primers A44F and A44R from Chapman & al. (2007) were used. For PCR amplification the following protocol was performed: an initial denaturation step of 2 minutes at 95 °C, followed by the first cycle set of 10 cycles with a 30-second denaturation step at 95 °C, 1 minute of primer annealing at 60 °C (-1 °C each cycle) and 1 minute 30 seconds of extension at 72 °C, followed by a second cycle set of 30 cycles with a 30-second denaturation step at 95 °C, 1 minute of primer annealing at 50 °C and 1 minute 30 seconds of extension at 72 °C, then a final extension at 72 °C for 10 minutes. A number of A44 direct sequences revealed additive polymorphic sites; therefore, the amplified A44 regions from 14 samples in dataset B were subject to cloning and then sequenced (see further details below). The A44 dataset used for subsequent alignment and analyses consisted of all cloned and directly sequenced data.
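The touchdown annealing profile described for A44 can be laid out cycle by cycle. The following is a minimal sketch, assuming that the first of the 10 touchdown cycles anneals at 60 °C and each subsequent cycle is 1 °C lower; the function name and parameters are illustrative and not part of the original protocol.

```python
def touchdown_annealing(start_c=60.0, step_c=1.0, touchdown_cycles=10,
                        final_c=50.0, final_cycles=30):
    """Return the annealing temperature (in °C) for every PCR cycle.

    Assumption: cycle 1 anneals at start_c and each later touchdown
    cycle drops by step_c; the second cycle set stays at final_c.
    """
    touchdown = [start_c - step_c * i for i in range(touchdown_cycles)]
    constant = [final_c] * final_cycles
    return touchdown + constant


schedule = touchdown_annealing()
print("Touchdown phase:", schedule[:10])
print("Total cycles:", len(schedule))
```

Under this reading the touchdown phase ends at 51 °C, so the second cycle set at 50 °C continues the 1 °C decrement without a jump.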
For all newly generated sequences, ABI cycle sequencing was carried out by Macrogen Europe or the University of Tennessee MBRF. All newly generated sequences were submitted to the International Nucleotide Sequence Database Collaboration (INSDC); for accession numbers see Appendix 1 (wi.48.48206_Appendix_1.xlsx).
Cloning — A44 PCR amplicons were selected to be cloned in order to investigate allelic variation across the dataset B. A44 PCR amplicons for 14 of the 25 samples of ingroup taxa were subject to cloning and 18 ingroup samples were directly sequenced. Cloning of A44 was conducted to ensure a robust sampling of the taxonomic diversity within Lactuca, using the following sampling strategy: (1) An initial study had revealed that all direct A44 sequences of six members of the allopolyploid L. canadensis clade (Table 1) had additive polymorphic sites (six samples; data not shown); we therefore aimed to gain insight into putative ancestral hybridization in the origin of this allotetraploid lineage by cloning and sequencing the A44 region. (2) We selected diploid taxa that were incongruent between plastid DNA and nrITS trees in Kilian & al. (2017b): this included L. bourgaei and Cephalorrhynchus soongoricus (dataset B) and the ingroup taxa L. macrophylla (Willd.) A. Gray and L. racemosa Willd. (datasets A1, A2 and B). (3) In order to test whether the allelic variation that may be observed within A44 amplicons of the above samples was a product of a putative multi-copy characteristic of the low-copy A44 marker, a further five samples from diploid taxa (one outgroup and four ingroup species) that did not show additive polymorphic sites and were not incongruent between plastid DNA and nrITS analyses in Kilian & al. (2017b) were also cloned; these samples were selected to represent the range of clades across the phylogeny (see Appendix 1 (wi.48.48206_Appendix_1.xlsx)). A44 PCR amplicons were ligated into the vector provided with the TOPO® TA cloning kit. Five to 11 clones were successfully sequenced from each of the 15 samples using the A44F and A44R primers (Appendix 1 (wi.48.48206_Appendix_1.xlsx)).
Alignment — The sequences were aligned automatically using Muscle (Edgar 2004) and manually adjusted in PhyDE v. [version] 0.9971 (Müller & al. 2005). In the case of the non-coding plastid DNA loci a motif-based alignment with the criteria outlined by Kelchner (2002), Borsch & al. (2003) and Löhne & Borsch (2005) was applied to ensure states of homology for microstructural changes. Mononucleotide repeats (microsatellites) and hypervariable sections were excluded from the final alignment if homology assessment was critical. Inverted repeats were reverse complemented and coded as one binary character to account for a one-step mutation. For datasets A1 and A2 indels were coded as binary characters using the Simple Indel Coding (SIC) approach by Simmons & Ochoterena (2000) implemented in Seqstate v. 1.4.1 (Müller 2005), followed by manual checking of the indel files. Indels were not coded for dataset B. In order to assess inter- and intraspecific variation, identical sequences were removed from all alignments. Therefore, the entire concatenated nrDNA dataset A1 in the final alignment for analyses contained 54 unique sequences representing 76 samples, while the entire concatenated plastid DNA dataset A2 contained 52 unique sequences representing 71 samples. In dataset B both direct and cloned low-copy nuclear DNA A44 locus sequences were included.
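The removal of identical sequences amounts to grouping samples by their aligned sequence, so that each unique sequence (ribotype or haplotype) stands for one or more samples. The following is a minimal sketch, assuming exact string identity after upper-casing; how ambiguity codes and gaps were treated in the actual workflow is not stated in the text, and the sample names and sequences below are hypothetical.

```python
def collapse_identical(alignment):
    """Group sample names by identical aligned sequence.

    alignment: dict mapping sample name -> aligned sequence string.
    Returns a dict mapping each unique sequence -> list of sample names.
    Assumption: two sequences count as identical only if their
    upper-cased strings match exactly (gaps included).
    """
    groups = {}
    for sample, sequence in alignment.items():
        groups.setdefault(sequence.upper(), []).append(sample)
    return groups


# Hypothetical toy alignment, not data from the study.
toy = {
    "sample_1": "ACGT-ACGT",
    "sample_2": "ACGT-ACGT",
    "sample_3": "ACGTTACGT",
}
unique = collapse_identical(toy)
print(len(unique), "unique sequences representing", len(toy), "samples")
```

Keeping the list of samples behind each unique sequence is what allows a terminal in the tree to be traced back to all accessions sharing that sequence.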
Phylogeny reconstruction and molecular dating — Aligned matrices were analysed using Maximum Parsimony, Maximum Likelihood and Bayesian Inference. Each analysis was conducted independently for the concatenated nrDNA dataset A1, the concatenated plastid DNA dataset A2, and the nuclear A44 locus dataset B.
Maximum Parsimony (MP) — MP analyses were conducted using the Parsimony Ratchet (Nixon 1999) implemented in PRAP (Müller 2004) in combination with PAUP* v. 4.0b10 (Swofford 2003) on the CIPRES Science Gateway v. 3.3 (Miller & al. 2010). Ratchet settings included 200 iterations with 25 % of the positions randomly upweighted (weight = 2) and ten random addition cycles. A strict consensus tree was constructed from all saved trees. Jackknife (JK) support was calculated in PAUP* by performing a heuristic search with 10 000 JK replicates using the TBR branch swapping algorithm and a deletion of 36.79 % of characters in each replicate. Starting trees were generated via stepwise addition with simple sequence addition.
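The deletion proportion of 36.79 % corresponds to e^-1, the value at which jackknife and bootstrap support are expected to be roughly comparable. The following is a minimal sketch of one jackknife replicate, assuming each character is deleted independently with probability e^-1; this illustrates the resampling idea only and is not a description of how PAUP* implements it internally.

```python
import math
import random

# e^-1 expressed as a fraction; about 0.3679, i.e. 36.79 %.
DELETION_FRACTION = math.exp(-1)


def jackknife_replicate(n_characters, rng):
    """Return the indices of characters retained in one jackknife replicate.

    Assumption: every character is dropped independently with
    probability DELETION_FRACTION.
    """
    return [i for i in range(n_characters)
            if rng.random() >= DELETION_FRACTION]


rng = random.Random(1)  # fixed seed so the sketch is reproducible
kept = jackknife_replicate(1085, rng)
print(f"Deletion fraction: {DELETION_FRACTION:.4f}")
print(f"Characters kept in this replicate: {len(kept)} of 1085")
```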
Maximum Likelihood (ML) — ML bootstrapping (with 1000 replicates) was implemented in RAxML v. 8.0.0 (Stamatakis 2014) using the CIPRES Science Gateway v. 3.3. The data were partitioned, with the GTRGAMMA and BINGAMMA models applied to the nucleotide sequence and indel partitions, respectively, for both bootstrapping phases. The final tree topology was evaluated under the GTRGAMMA model.
Bayesian Inference (BI) — Substitution models for Bayesian analyses were determined for each dataset and their partitions using the bModelTest package (Bouckaert & Drummond 2017) implemented in BEAST v. 2.3.0. BI analyses were performed using BEAUti and BEAST v. 1.8.2 (Drummond & al. 2012).
For the nrDNA dataset A1 the following models were selected: the K81 model for ITS1, TN93 for both the 5.8S rDNA and ITS2 regions, and TVM + G for the ETS region. For the plastid DNA dataset A2 the following models were selected: the K81 model for both the petD spacer plus intron and the 5′trnL(UAA)-trnF spacer, and the TIM model for the following spacers: psbA-trnH, rpl32-trnL(UAG), 5′rps16-trnQ(UUG). For the nuclear A44 locus dataset B the GTR + I + G model was selected. The Dollo model was used for the indel partitions in analyses of datasets A1 and A2 (Drummond & al. 2012; Woodhams & al. 2013).
Divergence time estimations for datasets A1 and A2 — The molecular clock hypothesis was rejected for all regions, so uncorrelated lognormal relaxed clock models were assigned for all analyses (Drummond & al. 2006). Divergence time estimations were conducted only on the datasets A1 (nrITS and nrETS) and separately on A2 (plastid DNA). Kilian & al. (2017b) used two constraints based on Tremetsberger & al. (2012) for their divergence time estimation across the lettuce subtribe: one fossil-based calibration for an outgroup node using a “Cichorium intybus L. type” pollen fossil, and a secondary calibration point for the core Lactucinae node that was taken from the results of Tremetsberger & al. (2012). In the present study we used the divergence time estimations of the Lactuca lineage crown and stem ages from Kilian & al. (2017b) as secondary calibration points in the nrDNA and plastid DNA datasets, i.e. nrITS stem (node unresolved) and crown: 9.23 myr (million years) (HPD [highest posterior density] 6.50–12.26) and plastid DNA stem: 7.76 myr (HPD 5.30–10.57) and crown: 7.18 myr (HPD 4.79–9.67). It is recommended to apply a uniform distribution for secondary calibration constraints, which gives all ages between an upper and a lower bound equal prior probability (Schenk 2016). In order to successfully run the divergence analyses under this more conservative uniform prior it was necessary to provide a starting tree that had been estimated using the less conservative normal prior, applied to the tree height with a mean of 8 myr (3 STD), using a lognormal, uncorrelated relaxed clock and a birth-death speciation process. In order to estimate the starting tree, two independent Markov chain Monte Carlo (MCMC) runs were conducted and each chain was run for 100 million generations, logging parameters every 10 000 generations. Tracer v. 1.6 was used to visualize log files, assess the stationarity on the log-likelihood curves and calculate the burn-in.
The first 10 % saved tree files from each run were discarded, the remaining trees were combined in Logcombiner v. 1.8.2 and Treeannotator v. 1.8.2 was used to estimate the Maximum Clade Credibility (MCC) tree. This was then used as a starting tree in the following divergence time analyses. The ingroup node of the plastid DNA tree was calibrated under a uniform distribution with an age range of 4.70–10.57 myr, the ingroup node of the nrDNA tree under a uniform distribution with an age range of 5–12.26 myr, following the crown ages of the Lactuca lineage in Kilian & al. (2017b). Three independent MCMC runs were conducted. Each chain was run for 100 million generations logging parameters every 10 000 generations. MCC trees were generated following the same method as for the starting tree.
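The MCMC settings above imply a definite number of posterior samples behind each MCC tree. The following is a minimal arithmetic sketch, assuming that trees are logged at the same interval as parameters and ignoring the initial state at generation 0; the function name is illustrative.

```python
def retained_samples(generations, log_every, burnin_percent, n_runs):
    """Number of logged samples left after burn-in, summed over runs.

    Assumption: one sample per log_every generations, and the same
    burn-in percentage is discarded from every run before combining.
    """
    per_run = generations // log_every
    burnin = per_run * burnin_percent // 100
    return (per_run - burnin) * n_runs


# Starting-tree analysis: two runs of 100 million generations each.
print(retained_samples(100_000_000, 10_000, 10, n_runs=2))
# Final dating analysis: three runs of 100 million generations each.
print(retained_samples(100_000_000, 10_000, 10, n_runs=3))
```

Each run therefore contributes 9000 post-burn-in samples, giving 18 000 for the starting tree and 27 000 for the final dated trees under these assumptions.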
Branch support values from the above phylogenetic analyses (MP jackknife (JK), ML bootstrap (BS) and Bayesian posterior probability (PP)) were presented on the time calibrated MCC tree estimated from the BEAST analyses, using Treegraph v. 2 (Stöver & Müller 2010). The topologies from phylogenetic analyses of the three datasets A1, A2 and B were visually inspected for the presence of statistically supported incongruences between datasets.
Ancestral range estimation — Taxon distribution data were taken from the Cichorieae Systematics Portal (Kilian & al. 2009+). Samples were coded for the presence/absence of taxa in five broad geographic regions, which are based on and adapted from the World Wildlife Fund Terrestrial Ecoregions (Olson & al. 2001): W Palearctic (including Macaronesia) (A), E Palearctic (B), Tropical Africa (including the Arabian Peninsula; C), North America (D), and SE Asia and Australasia (E). W Palearctic (A) and E Palearctic (B) are separated by the Ural Mountains. We inferred ancestral distribution patterns using the time-calibrated phylogeny (the MCC tree) from the BEAST analyses, which was analysed using the package Biogeography with Bayesian (and likelihood) Evolutionary Analyses with R scripts (BioGeoBEARS; Matzke 2012, 2013) in R v. 3.2.4. This analysis builds on previous biogeographic range estimation analyses (Ree & Smith 2008) by enabling the incorporation of the parameter +J, which allows for founder-event speciation. Using BioGeoBEARS, we tested all available models (DEC, BAYAREA and DIVALIKE), each with and without the parameter +J. We compared models using AIC scores and tested each model against its +J counterpart using likelihood ratio tests (LRTs).
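The two comparison criteria can be computed directly from the log-likelihoods that BioGeoBEARS reports. The following is a minimal sketch, assuming that each base model has two free parameters (dispersal and extinction rates) and that +J adds a third, so the LRT has one degree of freedom; the log-likelihood values are hypothetical and are not results from this study.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 lnL (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def lrt_p_value_df1(lnl_null, lnl_alt):
    """P-value of a likelihood ratio test with one degree of freedom.

    Uses the identity P(chi2_1 > x) = erfc(sqrt(x / 2)).
    lnl_null belongs to the simpler model (without +J) and lnl_alt
    to the nested alternative (with +J).
    """
    statistic = 2 * (lnl_alt - lnl_null)
    if statistic <= 0:
        return 1.0
    return math.erfc(math.sqrt(statistic / 2))


# Hypothetical log-likelihoods for illustration only.
lnl_dec, lnl_dec_j = -50.0, -45.0
print("AIC DEC:  ", aic(lnl_dec, 2))
print("AIC DEC+J:", aic(lnl_dec_j, 3))
print("LRT p-value:", lrt_p_value_df1(lnl_dec, lnl_dec_j))
```

AIC can rank all six models against one another, whereas the LRT applies only to nested pairs such as DEC versus DEC+J.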
Phylogeny reconstruction: datasets A1 (nrDNA) versus A2 (plastid DNA) — After removal of identical sequences, the ingroup in the alignment for the nrDNA dataset A1 contained 50 unique sequences representing 70 samples, and the ingroup in the plastid DNA dataset A2 contained 49 unique sequences representing 66 samples. Each terminal node in Fig. 1 represents a unique haplotype (plastid DNA) or ribotype (nrDNA; see Appendix 1 (wi.48.48206_Appendix_1.xlsx) for sample details). The alignment of the dataset A1 was 1085 base pairs (bp) long and included 19 coded indels; the alignment for the dataset A2 was 4147 bp long and included 138 coded indels. MP analyses of the nrDNA dataset A1 resulted in 357 parsimony informative characters, a consistency index of 0.67 and a retention index of 0.71. MP analyses of the plastid DNA dataset A2 resulted in 341 parsimony informative characters, a consistency index of 0.74 and a retention index of 0.82.
The results of phylogenetic reconstructions of datasets A1 (nrDNA) and A2 (plastid DNA) are compared in Fig. 1. The ingroup (Lactuca lineage) for both phylogenies received strong statistical support: nrDNA tree: 1 and 91 for PP and JK, respectively, and plastid DNA tree: 1 and 89 for PP and JK, respectively. Nodes with low support (<0.9 PP, <75 BS or <75 JK) were collapsed. Within the Lactuca lineage, both analyses revealed low statistical support for the backbone (collapsed nodes in Fig. 1), yet strong support was observed at the shallower nodes. Topologies within the Lactuca lineage showed some commonalities between both phylogenies; however, there were marked differences. Two well-supported clades in the nrDNA phylogeny are referred to here as nuA (1 and 94 for PP and JK, respectively) and nuB (0.97, 93, 99 for PP, BS and JK, respectively; Fig. 1; Table 2). The L. aurea clade [L. aurea (Vis. & Pančić) Stebbins, L. glareosa Boiss. and L. variabilis Bornm.] was strongly resolved as monophyletic (1, 99, 100 in the nrDNA tree and full support in the plastid DNA tree for PP, BS and JK) yet it showed different topological positions: in the nrDNA tree it was resolved in a trichotomy with nuA and nuB, whereas in the plastid DNA tree this L. aurea clade was nested within a weakly supported (76 JK) clade containing the L. dissecta clade, L. indica clade and L. racemosa clade. The core Lactuca clade (L. saligna, L. serriola and L. virosa) was supported as sister to L. quercina L. (1, 76, 74 for PP, BS and JK, respectively) within the nuA clade of the nrDNA tree but was sister to the L. viminea clade [consisting of L. acanthifolia (Willd.) Boiss., L. orientalis (Boiss.) Boiss. and L. viminea J. Presl & C. Presl] in the plastid DNA tree with strong statistical support (1, 97, 100 for PP, BS and JK, respectively). The L. tatarica clade (Table 2) was resolved with strong statistical support in both trees: full support for PP, BS and JK in the nrDNA tree and 1, 95, 100 for PP, BS and JK, respectively, in the plastid DNA tree. In the nrDNA tree the L. tatarica clade was resolved as sister to the L. viminea clade (0.96, 75, 64 for PP, BS and JK, respectively; Fig. 1; Table 2), whereas in the plastid DNA tree it was resolved as sister to L. quercina (1, 78, 100 for PP, BS and JK, respectively).
Node support with stem and crown node age estimates for the outgroup clades, the Lactuca lineage (ingroup) and the main clades within the Lactuca lineage according to divergence time estimations using nrDNA and plastid DNA datasets; PP: posterior probability; 95 % HPD: highest posterior density; myr: million years; NA: not applicable (no statistical support for node).
In both phylogenies the clade consisting of the Lactuca canadensis clade + L. plumieri was well supported (full support in the nrDNA tree and 1, 87, 100 in the plastid DNA tree for PP, BS and JK, respectively) and within this clade the SW European L. plumieri was resolved as sister to the L. canadensis clade (full support in both trees for PP, BS and JK), while the Azorean L. watsoniana was resolved as sister to a clade consisting of all North American members of the L. canadensis clade (0.87, 61, 67 in the nrDNA tree and full support in the plastid DNA tree for PP, BS and JK). Variation was observed within the North American members of the L. canadensis clade: in the nrDNA phylogeny L. biennis (Moench) Fernald and L. floridana (L.) Gaertn. were both unique and ancestral to a clade representing four taxa [L. canadensis, L. hirsuta Nutt., L. graminifolia Michx. and L. ludoviciana (Nutt.) Riddell (sample a)] that all share a single identical ribotype (terminal name: 4 taxa**; Fig. 1; Table 2) and are sister to a clade consisting of seven unique ribotypes, all from samples of L. graminifolia var. mexicana McVaugh (= L. brachyrrhyncha Greenm.) labelled a–g (Fig. 1; Table 2). In the plastid DNA tree only L. floridana was found to have a unique haplotype, which was resolved as sister to a single haplotype corresponding to all other North American taxa (L. biennis, L. canadensis, L. graminifolia, L. hirsuta and L. ludoviciana; terminal name: 5 taxa***). Lactuca brachyrrhyncha was not sampled in the plastid DNA dataset A2.
In the plastid DNA tree there was strong support for a sister relationship between the Lactuca canadensis clade + L. plumieri and the L. palmensis clade consisting of L. inermis Forssk., L. palmensis Bolle and L. tenerrima Pourr. (1, 87, 100 for PP, BS and JK, respectively). In the nrDNA tree the L. canadensis clade + L. plumieri was resolved within the nuB clade; its topological position with respect to other clades was unresolved but showed an association with the L. dissecta D. Don, L. inermis and L. perennis L. clades (Fig. 1). There were a number of clades that received strong statistical support in both trees, including the L. racemosa clade (L. macrophylla and L. racemosa) and the L. perennis clade (consisting of L. glauciifolia Boiss., L. intricata Boiss., L. perennis and L. undulata Ledeb.).
Phylogeny reconstruction based on the A44 low-copy region: dataset B — The complete alignment of dataset B consisted of 31 samples, of which 13 were cloned and sequenced (5 – 10 colonies sequenced per clone) and 18 represented direct sequences of the A44 region ( Appendix 1 (wi.48.48206_Appendix_1.xlsx)). Of the nine samples of diploid species that were cloned, four showed no allelic variation [Cicerbita alpina, Lactuca sibirica (L.) Benth. ex Maxim. c, L. undulata and L. viminea], one showed seven A44 copies (L. macrophylla), one showed three copies (L. racemosa) and one showed two copies (L. tatarica c; Appendix 1 (wi.48.48206_Appendix_1.xlsx)). Only minor variation was found between sequences from samples that had multiple copies and the sequences from cloned samples were resolved as monophyletic, however sequences of two cloned samples of the diploid L. racemosa were paraphyletic with respect to each other (Fig. 2). The cloned sequences of the additional outgroup diploid members Cephalorrhynchus soongoricus and L. bourgaei also showed no sequence variation. They were resolved in the phylogenetic analyses together with the directly sequenced C. polycladus in the same positions within the Lactuca lineage as in the plastid DNA tree by Kilian & al. (2017b). The inclusion of these taxa negatively affected clade support (tree not shown); they were therefore excluded from the final analysis of dataset B. In contrast to the cloned diploids, all five allopolyploid taxa from the L. canadensis clade (four North American taxa and one of the two samples of the Azorean L. watsoniana (L. watsoniana a; LAC-175; Appendix 1 (wi.48.48206_Appendix_1.xlsx))) exhibited two distinct allelic copies resolved in divergent clades in the Lactuca lineage: clone type A and type B (Fig. 2).
MP analyses of dataset B resulted in 148 parsimony informative characters, a consistency index of 0.62 and a retention index of 0.86. The two clone types A and B for the Lactuca canadensis clade were resolved in distinct clades within the Lactuca lineage (Fig. 2): clone type B within the larger clade (black star; 0.99 PP) of the Lactuca lineage and clone type A within a separate smaller clade in a polytomy with L. plumieri (1; 58; 78 PP, BS and JK, respectively; Fig. 2). Lactuca indica was sister to this clade that contained the North American and Azorean L. canadensis clade and L. plumieri, however with low statistical support (0.64; 58; 78 for PP, BS and JK, respectively). The larger clade (black star) that contained clone type B received high support at the stem and for nearly all shallow nodes within, yet the deeper nodes (at the backbone) were poorly supported and were therefore collapsed and represented as polytomies (Fig. 2). Within this larger clade, strong statistical support was observed for a clade (0.99 PP) containing a polytomy of the L. canadensis clade clone type B (0.99 PP), the L. racemosa clade (1; 93; 98 PP, BS and JK, respectively) and the L. tatarica clade (1; 100; 100 PP, BS and JK, respectively). Within the L. tatarica clade, L. oblongifolia was resolved as sister to a clade consisting of L. tatarica (sample a) and two L. sibirica samples (b and c), all from E Siberia. The only L. quercina that was sampled in the A44 analyses (L. quercina a) from SW Russia was also resolved within the L. tatarica clade and was sister to two sequences from the same cloned sample of L. tatarica (c) from NE Germany (Fig. 2; Appendix 1 (wi.48.48206_Appendix_1.xlsx)). The following four clades were also well supported within the larger clade (black star) of the Lactuca lineage: 1: the L. viminea clade (1; 100; 100 PP, BS and JK, respectively), 2: the core Lactuca clade (L. serriola and L. virosa; 0.99; 75; 76 PP, BS and JK, respectively), 3: the L. 
perennis clade (0.98; 68; 90 PP, BS and JK, respectively) and 4: the L. palmensis clade (full support for PP, BS and JK).
Likelihood statistics for the nrDNA dataset (A1) from ancestral area estimation models implemented in BioGeoBEARS: LnL, Log likelihood; AIC, Akaike Information Criterion; LRT, Likelihood Ratio Test.
Time-calibrated BI phylogeny — The time-calibrated phylogeny based on dataset A1 (nrDNA) is shown in Fig. 3 and that based on A2 (plastid DNA) is provided in Suppl. Fig. 1 (see Supplementary Material online (wi.48.48206_Suppl_Fig_1.pdf)); estimated divergence times for both analyses are provided in Table 2. The stem age of the Lactuca lineage was estimated in the nrDNA tree as 6.6 myr (HPD 1.4–11.8) and in the plastid DNA tree as 8.7 myr (HPD 6.1–12). A late Miocene origin was therefore estimated for the Lactuca lineage. For the purpose of the present study, only the divergence time estimates for nodes relevant to the discussion are described below.
Within clade nuA, the core Lactuca clade was estimated to have diverged from L. quercina c. 2.6 mya (HPD 0.4–4.9) during the late Pliocene-early Pleistocene, and the L. tatarica clade diverged from the L. viminea clade 2.6 mya (HPD 0.5–4.8). Within clade nuB, the L. canadensis-plumieri clade is estimated to have diverged during the early Pliocene c. 4.8 mya (HPD 1.1–8.7) and the following stem nodes were estimated during the late Pliocene-mid Pleistocene for the entire L. canadensis clade and the clade consisting only of North American allopolyploid taxa: 2.4 mya (HPD 0.4–4.5) and 1.6 mya (HPD 0.3–3), respectively.
In the plastid DNA tree, the core Lactuca clade is estimated to have diverged from the L. viminea clade during the early Pliocene c. 3.9 mya (HPD 2.4–5.6) and the L. tatarica clade is estimated to have diverged from L. quercina 4.5 mya (HPD 3–6.2). The clade containing the L. canadensis clade + L. plumieri is estimated to have diverged 3.8 mya (HPD 2.2–5.5) and the following stem nodes were estimated during the late Pliocene-mid Pleistocene for the L. canadensis clade and the clade consisting of only North American allopolyploid taxa: 3.3 mya (HPD 1.6–5.3) and 2.5 mya (HPD 1.1–4.1), respectively.
Ancestral range estimations — BioGeoBEARS analyses identified the BAYAREALIKE +j model as the best-fit model for both the nrDNA (dataset A1; LnL -86.81, LRT < 0.05; Table 3) and plastid DNA datasets (dataset A2; LnL -89.24, LRT < 0.05). Values for range expansion, range extinction and founder event parameters were 0.01, 0 and 0.02, respectively, for the analyses of the plastid DNA dataset and 0.019, 0.017 and 0.024, respectively, for the nrDNA dataset. This suggests that the model relied on range expansion and founder events more than extinction. Ancestral range estimations are given for statistically supported nodes in Fig. 3; corners represent the ranges instantaneously after cladogenesis and estimations at the nodes correspond to ranges instantaneously before cladogenesis for the ancestral biogeographic range analyses of the nrDNA dataset. See Suppl. Fig. 1 (wi.48.48206_Suppl_Fig_1.pdf) for the results of the biogeographic range analyses of the plastid DNA dataset.
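The AIC and LRT comparison behind this model choice can be sketched as follows. This is a minimal illustration, assuming the usual definitions AIC = 2k - 2 lnL and a likelihood ratio statistic of 2(lnL_alt - lnL_null) compared against a chi-square distribution with one degree of freedom (the single extra +J parameter). The log-likelihood of the +J model is the value reported for the nrDNA dataset; the log-likelihood of the model without +J is a hypothetical placeholder, because it is not given in the text, and the function names are likewise illustrative.

```python
import math


def aic(lnl, n_params):
    """Akaike Information Criterion: 2k - 2 lnL (lower values indicate a better fit)."""
    return 2 * n_params - 2 * lnl


def lrt_pvalue(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by one free parameter.

    Returns (statistic, p-value). The p-value uses the closed form of the
    chi-square survival function for 1 degree of freedom: erfc(sqrt(stat / 2)).
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, math.erfc(math.sqrt(stat / 2.0))


LNL_BAYAREALIKE_J = -86.81  # reported in the text (nrDNA dataset), 3 free parameters
LNL_BAYAREALIKE = -95.00    # HYPOTHETICAL placeholder, not a reported value; 2 free parameters

if __name__ == "__main__":
    print("AIC without +J:", aic(LNL_BAYAREALIKE, 2))
    print("AIC with +J:   ", aic(LNL_BAYAREALIKE_J, 3))
    stat, p = lrt_pvalue(LNL_BAYAREALIKE, LNL_BAYAREALIKE_J)
    print(f"LRT statistic = {stat:.2f}, p = {p:.4g}")
```

The +J model is preferred only if its gain in log-likelihood outweighs the penalty for the additional founder-event parameter, which is what both the AIC difference and the LRT p-value assess.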
According to the BAYAREALIKE +j model the ancestral area for the Lactuca lineage was region A (W Palearctic) in both analyses, this remained the dominant ancestral area from the origin of the Lactuca lineage until the early Pliocene. In both trees, according to the BAYAREALIKE +j model the stem node for the L. tatarica clade was inferred to be region A c. 2.6 mya in the nrDNA tree and c. 5.1 mya in the plastid DNA tree and the crown node (Fig. 3) for this clade c. 1.3 mya (nrDNA tree) or 3.7 mya (plastid DNA tree) is AB (W and E Palearctic). There was strong support for a sister relationship between L. oblongifolia and L. sibirica in the plastid DNA tree, unlike in the nrDNA tree where the resolution within this clade was poor and consisted of a polytomy with L. oblongifolia, a clade consisting of L. tatarica individuals and a clade consisting of L. sibirica individuals. According to the plastid DNA tree an LDD event from region AB to D was estimated at the split between L. sibirica and L. oblongifolia c. 2.7 mya ( Suppl. Fig. 1 (wi.48.48206_Suppl_Fig_1.pdf)).
The W Palearctic (region A) was estimated as the ancestral area for the crown node for the clade consisting of the Lactuca canadensis clade + L. plumieri (Fig. 3), according to the BAYAREALIKE +j model. For the plastid DNA dataset, the stem and crown node ancestral ranges for the L. canadensis clade + L. plumieri were ambiguous between A (W Palearctic) and AD (W Palearctic and North America), receiving 67 % and 33 % support, respectively, c. 2.4 mya for the nrDNA dataset and c. 3.3 mya for the plastid DNA dataset.
Lactuca lineages native to North America: origins and intercontinental migration patterns — All analyses in the present study (nrDNA, plastid DNA and A44), confirm that the two Lactuca clades containing species native to North America are of independent origin: one is the Eurasian-North American L. tatarica clade containing one North American taxon: L. oblongifolia (= Mulgedium pulchellum); the other is the L. canadensis clade containing seven North American taxa and the Azorean L. watsoniana (Table 1; Fig. 1 and 2). We first discuss the diversity, origins and migration patterns for both clades inferred from phylogenetic analyses, and time calibration and ancestral biogeographic range estimations. We then discuss the novel insights into the composition and relationships of major clades within the Lactuca lineage revealed in the present study, and some taxonomic conclusions.
Both disjunct Northern Hemisphere Lactuca clades (L. canadensis and L. tatarica clades) are estimated to have originated at around the same geological time and both are of relatively recent minimum age; the late Miocene/early Pliocene to early Pleistocene (estimated stem and crown ages; Fig 3; Table 2). Ancestral biogeographic area estimations, however, indicate potentially different migration routes for the two clades (Fig. 3). The BAYAREALIKE model (+j) is estimated as the best-fitting model to explain ancestral biogeographic range estimations within Lactuca in both the nrDNA and plastid DNA data analyses. In contrast to DEC and DIVALIKE, this model supports widespread sympatry with no vicariance or special “event” during speciation, which may appear unusual for datasets of such widespread flowering plants (Matzke 2013). However, if the +j parameter is estimated with the BAYAREALIKE model, this may provide a good approximation for such ancestral biogeographic ranges. The +j parameter supports jump dispersal as the main dispersal mode and therefore suggests a strong influence of LDD in the evolution of Lactuca. It is important to consider that we sampled broadly across Lactuca in order to capture the diversity among the key clades identified by Kilian & al. (2017b). Therefore, only representatives of clades were sampled. Despite this, there was deeper sampling for the North American Lactuca clades compared to previous studies. We do not know the proportion of extinction within the Lactuca lineage, which could influence the extent (or lack) of vicariance estimated within the present dataset. The better fit of the BAYAREALIKE model compared to DEC and DIVALIKE to our data may also be the result of many taxa in the current dataset exhibiting broad and overlapping distributions. BAYAREALIKE+j has been shown as a good model to explain the biogeographic history of island lineages such as limpets in the Cape Verde Islands (Cunha & al. 
2017) and was also found in biogeographic analyses of the butterfly family Riodinidae [Lepidoptera; Espeland & al. (2015)]. For the purpose of the discussion we will focus on the ancestral biogeographic range estimations for the nrDNA dataset because statistical clade support was higher in these analyses (Fig. 1 and 3). Where appropriate we will also refer to biogeographic range estimations based on the analysis of the plastid DNA dataset provided in the supplementary material (Suppl. Fig. 1 (wi.48.48206_Suppl_Fig_1.pdf)).
The Eurasian-North American Lactuca tatarica clade — The Lactuca tatarica clade (Table 1) includes three blue-flowered perennial diploid (2n = 18, Watanabe 2017) herbs: L. sibirica and L. tatarica, which are widespread in Eurasia, and L. oblongifolia from North America (Table 1). In accordance with the results by Kilian & al. (2017b), the present study resolved the L. tatarica clade in a sister group relationship with the L. viminea clade in the nrDNA tree, but with L. quercina in the plastid DNA tree. The low-copy nuclear A44 tree in the present study also reveals that L. quercina is closely associated with the L. tatarica clade (Fig. 1). Therefore, the two nuclear trees (nrDNA and A44) are inconsistent with respect to the position of L. quercina. This may reflect ILS and/or reticulation but may also indicate that the plastid DNA and nuclear A44 trees represent the species tree better than the nrDNA tree does. This suggestion is further supported by the fact that the AFLP analysis by Koopman & al. (2001) also resolved a sister group relationship of the L. tatarica clade and L. quercina.
The North American Lactuca oblongifolia has often been treated as a subspecies of L. tatarica (L. tatarica subsp. pulchella). However, in the plastid DNA tree it is sister to L. sibirica, whereas in the A44 tree it is sister to the clade including both L. tatarica and L. sibirica [if we disregard the NE German sample of L. tatarica (sample c), see below] and in the nrDNA tree, the topology within this clade is unresolved. Lactuca sibirica occupies the northernmost distribution range of all Eurasian Lactuca species. It is a boreal element extending from E Scandinavia in the west to Kamchatka, N Korea and N Japan in the east, with a distribution area chiefly situated north of c. 55°N in its W part and north of 40–45°N in its Central and E Asian part (Hultén & Fries 1986: map 1916; Meusel & Jäger 1992: map 537d; Dahl 1998). It mainly occurs in open forests and forest openings, at forest edges, or in shrub communities. Lactuca tatarica is characteristic of E European-Asian steppes, and also occurs in salt-steppes, semi-deserts and on sands. Its distribution ranges from E Europe across temperate Asia to E China. The northern limits range from c. 55°N in easternmost Europe to c. 45°N in E China. In Central Asia its area is contiguous to or little overlapping in the north with that of L. sibirica. The westernmost limits of its natural distribution are the sand dunes at the W coast of the Black Sea, while all more westerly occurrences in Europe are anthropogenic (Hultén & Fries 1986b: map 1917; Meusel & Jäger 1992: map 537d). There, L. tatarica was introduced in the second half of the 19th century (British Isles) and early 20th century (Baltic Sea area), likely with shoot-bearing root fragments or diaspores in the ballast sand of sailing ships used in the grain trade from the Ukrainian Black Sea coast (Krisch 1989). In recent decades, the species has also reached further inland, e.g. in Germany (NetPhyD & BfN 2013: 466).
Lactuca oblongifolia occupies a broad area in North America ranging from c. 70°N in the northwest to 25–30°N in the south, and with the easternmost extension (outliers reaching the E coast) at around 45–55°N (Hultén & Fries 1986b: map 1917 under L. tatarica subsp. pulchella; Meusel & Jäger 1992: map 537d under L. pulchella). The three species of the L. tatarica clade are morphologically similar (Lindberg 1936; Stebbins 1939; Kirpicznikov 1964), and L. sibirica and L. tatarica are in fact most reliably distinguished by vegetative features. Kisiel & Michalska (2009) found them to differ phytochemically in the sesquiterpene lactone profiles of their roots, namely in the compositions of germacranolide glycoside. Lactuca sibirica shares a rare compound with L. virosa and L. sativa that is missing in L. tatarica, while L. tatarica exhibits a number of unique constituents, so far not known in other species (Kisiel & Michalska 2009). With respect to the morphology of the Eurasian L. tatarica and North American L. oblongifolia, which occur in the same types of habitats, Stebbins (1939, under L. tatarica subsp. pulchella) concluded that “there is no characteristic in which overlapping has not been found”. It is not clear whether differences exist between the three taxa with respect to the underground parts, because data are scarce and unreliable; most likely the lateral roots of all of them are shoot-bearing. The overlapping morphological characteristics between these three taxa therefore somewhat correspond to the unclear sister group relationships revealed in the present study. The incongruence between the plastid DNA tree and A44 tree may reflect ILS and/or hybridization between L. sibirica and L. tatarica, with L. oblongifolia being a result of the hybridization. Hybridization giving rise to fertile F1 hybrids between L. tatarica and L. sibirica was shown by crossing experiments in Koopman (2001). Several aberrant counts of 2n = 27 (and 36) for material of L. 
tatarica (and L. sibirica) from Russia indexed by Watanabe (2017) may perhaps refer to spontaneous hybrids. It should be noted that cloned A44 amplicon sequences from L. sibirica (c) and from L. tatarica (c) from Siberia and Germany, respectively, are non-divergent and are resolved in their own distinct clades for each species (Fig. 2). The sample of L. bourgaei that was incongruent between plastid DNA and nrITS trees in Kilian & al. (2017b) also showed non-divergent A44 allelic copies after cloning and sequencing in the present study (data not shown). Incorporation of more low-copy nuclear markers will help to further investigate the plastid and nrDNA incongruences, for example via hybrid capture and target enrichment methods (Mandel & al. 2014).
Lactuca tatarica appears to be non-monophyletic in the A44 analyses, in contrast to the plastid and nrDNA analyses. The L. tatarica (a) sample from Siberian Russia is sister to a clade formed by the two accessions of L. sibirica from Siberian Russia; L. oblongifolia is sister to these three samples. In contrast, the L. tatarica (c) sample from NE Germany is resolved in a well-supported clade with L. quercina (a; from SW Russia, Northern Caucasus; Appendix 1 (wi.48.48206_Appendix_1.xlsx); Fig. 1). The position of the L. tatarica sample from NE Germany (c), where the species is a post-1980 neophyte (NetPhyD & BfN 2013: 466), may indicate introgression with L. quercina.
Origin of the North American Lactuca oblongifolia and the Bering land bridge: the L. tatarica clade —Ancestral biogeographic range estimations of the Lactuca tatarica clade were similar in the nrDNA and plastid DNA analyses. The clade was estimated to have originated in the W Palearctic (region A, Fig. 3) in the middle Pliocene, c. 3.4 mya (HPD 2.4–4.6; nrDNA dataset) and c. 5.1 mya (HPD 2.7–7.7; plastid DNA). Subsequent range expansion and diversification of the L. tatarica clade is estimated to have occurred in the Palearctic (region AB, W and E Palearctic, Fig. 3) during the late Pleistocene c. 1.29 mya (nrDNA dataset). In the nrDNA tree, the sister relationships within the L. tatarica clade were unresolved but according to plastid DNA, L. sibirica and L. oblongifolia are sister taxa and are estimated to have diverged from each other c. 3.21 mya in the mid Pliocene ( Suppl. Fig. 1 (wi.48.48206_Suppl_Fig_1.pdf)). Pollen records suggest that North America and Eurasia were connected by two different biogeographic regions in the past: one was the North Atlantic Land Bridge (NALB) that connected E North America and Europe and persisted until c. 15 mya (Milne 2006) yet possibly continually facilitated plant dispersal until much later in the late Miocene (Denk & al. 2010); the other was the Bering land bridge (BLB) that connected W North America with Asia and persisted until the Pliocene but with fluctuating and non-continuous land masses facilitating episodic plant migration until the Pleistocene (Tiffney 1985a; Wen & al. 2016). Pollen and radiocarbon evidence suggest that the BLB vegetation was predominantly graminoid herbaceous tundra with some steppe-like characteristics, similar to the current habitats of L. tatarica and L. sibirica (Elias & Crocker 2008). Lactuca oblongifolia is present in NE Alaska and L. 
sibirica occurs in forest-steppe zones in Far East Siberia including the Chukotka peninsula, the easternmost peninsula of Asia (Stebbins 1939; Kharkevich 1992). These are therefore the only two Lactuca taxa known to currently border the Bering Strait. Furthermore, L. oblongifolia is estimated to have a Pleistocene origin; it is therefore possible that the break up of the BLB through Pleistocene climatic fluctuations led to the geographic isolation of the ancestors of L. sibirica and L. oblongifolia leading to the genetic separation and disjunct distribution of these taxa between Eurasia and North America (Fig. 3; Suppl. Fig. 1 (wi.48.48206_Suppl_Fig_1.pdf)).
The North American-Azorean allotetraploid Lactuca canadensis clade — Phylogenetic analyses in the present study resolve a clade consisting of the North American-Azorean L. canadensis clade (Table 1; Fig. 1–3), in accordance with Dias & al. (2018). Lactuca biennis and L. floridana (Fig. 4A) were resolved as distinct from all other North American members of the L. canadensis clade in the nrDNA tree, in accordance with Schilling & al. (2015), and in the tree based on the A44 dataset, while only L. floridana was distinct in the plastid DNA tree. The finding that the two species L. biennis and L. floridana are sister to the remainder of the North American L. canadensis clade in the nrDNA tree is consistent with their morphological distinctness: they have predominantly pure-white latex, bluish (sometimes white and rarely yellow) corollas (Fig. 4A), fusiform, moderately compressed achenes with (2 or)3(or 4) median ribs on either side and with a stout short (0.1–<1 mm) beak (Fig. 5B, C), and a pappus with an outer row of minute hairs. In contrast, the remainder of the North American L. canadensis clade predominantly have brownish orange latex, yellow corollas (Fig. 4A), achenes that are broadly ellipsoid and strongly flattened usually with a single median rib on either side and with a filiform beak of 1–4.5 mm (Fig. 5D–G), and a pappus without an outer row of minute hairs. The white-flowered Azorean L. watsoniana from the L. canadensis clade (Fig. 4C, D) has pure-white latex, strongly flattened achenes usually with a single median rib on either side but combined with a short, stout beak, and a pappus without an outer row of minute hairs (Fig. 5H).
Allopolyploidy (2n = 34, x = 17; Watanabe 2017; Dias & al. 2018) is unique to the Lactuca canadensis clade within the Lactuca lineage (all other Lactuca lineage species have x = 8 or x = 9). As first assumed by Babcock & al. (1937), members of the L. canadensis clade originated via hybridization between an x = 8 and x = 9 species. The allopolyploid L. canadensis clade is, however, geographically isolated from all extant species that have a homoploid chromosome number of x = 8, and the North American L. oblongifolia of the L. tatarica clade is the only native sympatric x = 9 Lactuca species. Therefore, the ancestral hybridization event that led to the allopolyploid L. canadensis clade may have occurred between species that are now extinct or its members may have occurred in a region that is no longer inhabited by the progenitors, or indeed both scenarios may be the case. Furthermore, there is a geographical disjunction within the L. canadensis clade because L. watsoniana is endemic to the Azores while all other members are restricted to North America (Table 1). Here, we bring together the results of cloning the low-copy nuclear marker A44, phylogenetic analyses of nrDNA, plastid DNA and A44 data, and migration pathways based on ancestral range estimations of the nrDNA tree to discuss the ancestry of this allopolyploid clade and propose potential progenitors, the geographic location of the hybridization event and the migration pathways.
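The chromosome arithmetic behind this hypothesis can be checked in a few lines. The sketch below assumes the simplest case, an allotetraploid whose base number is the sum of the base numbers of its two diploid progenitors; the function name is hypothetical.

```python
def allotetraploid_numbers(x_parent1, x_parent2):
    """Return (x, 2n) for an allotetraploid that combines two diploid genomes."""
    x = x_parent1 + x_parent2  # combined base number of the hybrid genome
    return x, 2 * x            # somatic chromosome number 2n


if __name__ == "__main__":
    x, two_n = allotetraploid_numbers(8, 9)
    print(f"x = {x}, 2n = {two_n}")
```

Only the combination of one x = 8 and one x = 9 genome gives x = 17 and 2n = 34; two x = 8 or two x = 9 genomes would give x = 16 or x = 18 instead.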
Elucidating the potential x = 8 and x = 9 progenitors of the allopolyploid Lactuca canadensis clade — Of the 14 samples from across Lactuca for which the A44 region was cloned, divergent copies were only found in the allotetraploid L. canadensis clade, supporting its suitability as a marker for investigating the hybrid origin of this group. Two divergent A44 clone types in well-supported clades were revealed for all four individuals sampled from North America (L. biennis, L. floridana, L. graminifolia and L. hirsuta) and in one of the two L. watsoniana samples (a; clone types A and B; Fig. 2). The 2n = 16 (x = 8) L. plumieri was resolved within a clade with all sequences from clone type A, consistent with direct sequencing of plastid and nrDNA (Fig. 1). Therefore, the L. canadensis clade likely originated via hybridization between L. plumieri (or an extinct ancestor) and an unknown 2n = 18 (x = 9) Lactuca species (Fig. 2). This is further supported by clear morphological similarities between L. plumieri and the North American L. biennis and L. floridana that are the genetically distinct members of the allopolyploid L. canadensis clade. Like L. biennis and L. floridana, L. plumieri also has blue corollas, moderately compressed achenes with (2 or)3(or 4) median ribs on either side, at most a very short stout beak (Fig. 5A), and a pappus with an outer row of minute hairs.
Analyses of the low-copy nuclear marker A44 enabled us to determine candidates for the x = 9 progenitor species. The largest clade in the A44 tree (black star; Fig. 2) consists of three well-supported clades: one contains all clone type B sequences from samples of the allotetraploid Lactuca canadensis clade, a second represents the L. tatarica clade (+ L. quercina; all members with x = 9) and the third represents the L. racemosa clade (endemic to the Caucasus and x = 8; therefore to be ruled out as a potential progenitor). Members of the L. tatarica clade + L. quercina (x = 9) are therefore candidates in the search for the x = 9 ancestor, because they are closely associated with clone type B in the A44 tree (Fig. 2).
Lactuca indica represents another candidate for the x = 9 progenitor species (or an extinct ancestor of it). This is supported by results of our analyses of the low-copy nuclear marker and morphological observations. Firstly, L. indica is resolved as sister to the clade consisting of the L. canadensis clone type A and the x = 8 L. plumieri, but with low support, in the A44 tree (Fig. 2), a topology differing from those based on directly sequenced plastid and nrDNA (Fig. 1). Secondly, L. indica has strongly flattened achenes with a single median rib on both sides and a filiform beak, features that are conspicuously similar to those of five of the members of the North American L. canadensis clade s. str. (Fig. 5D–G); and, with the exception of the filiform achene beak, L. indica is also similar to the Azorean L. watsoniana (Fig. 5I). In contrast, as mentioned above, achenes and pappus features of the remainder of the L. canadensis clade (L. biennis and L. floridana) are similar to L. plumieri, the most likely x = 8 progenitor (Fig. 5A–C). Thirdly, the presence of a conspicuous bowl-shaped nectary (0.1–0.2 mm high) surrounding the style base and persisting on the pappus disk of the mature achene is a synapomorphy of L. indica (c. 0.2 mm; Fig. 5I) and its sister L. formosana (c. 0.1 mm). There is some variation in the shape and size of the nectaries in Asteraceae (Mani & Saravanan 1999), but no systematic investigations on tribal, subtribal or genus level in the Cichorieae exist. The conspicuous bowl-shaped nectary in species of the L. indica clade has not to our knowledge been addressed before in the literature, and seems exceptional in Lactuca. Therefore, it is notable that bowl-shaped nectaries to some extent are also present in members of the allotetraploid North American L. canadensis lineage: most clearly in L. biennis, where the nectary is c. 0.1 mm high and in L. floridana, where it is 0.05–0.08 mm high (Fig. 5B, C); in L. 
canadensis it is below 0.05 mm, while in the other species below 0.02 mm and therefore hardly noticeable. This may of course be explained by convergence, but the fact that there is also a similarity in achene shape may indicate that a L. indica clade progenitor has contributed to the ancestry of the L. canadensis clade. However, L. indica is, like L. plumieri, sister to clone type A of the L. canadensis clade. To explain this, and as originally proposed by Babcock & al. (1937), a more complex scenario of hybridization and polyploidization events involving three (or more) diploid species (Brysting & al. 2007) may have led to the origin of the allopolyploid lineage. Evidence for hybridization involving different combinations of diploid species, several of them multiple times, has also been shown for Glycine [Fabaceae; Doyle & al. (2004); Bombarely & al. (2014)].
In light of our phylogenetic analyses the following species therefore represent the candidates for progenitors (or the closest extant relatives of the now extinct progenitor species) of the allopolyploid L. canadensis clade: L. plumieri as the x = 8 progenitor, and as the x = 9 progenitor(s) L. indica (based on A44 analyses and morphology) and/or a member of the L. tatarica + quercina clade (based on A44 analyses). In order to narrow down the candidate progenitor species and to decipher the origin of the L. canadensis clade, we need to investigate two key factors more deeply: (1) the geographic origin of the ancestors and the potential geographic location of the hybridization event, and (2) the potential migration pathway(s) that led to the current distribution of the North American-Azorean L. canadensis clade.
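The chromosome-number reasoning behind these candidate pairings can be sketched in a few lines of Python. This is only an illustrative check, not part of the analyses reported here: the names `X8_CANDIDATES`, `X9_CANDIDATES` and `candidate_pairs` are hypothetical, and the taxa and base numbers are limited to those named in the text above.

```python
from itertools import product

# Base chromosome numbers (x) of the candidate progenitors named in the text.
# Variable and function names are hypothetical, chosen for illustration only.
X8_CANDIDATES = {"L. plumieri": 8}
X9_CANDIDATES = {"L. indica": 9, "L. tatarica + quercina clade": 9}
ALLOTETRAPLOID_X = 17  # base number of the allotetraploid L. canadensis clade


def candidate_pairs(group_a, group_b, target):
    """Return progenitor pairs whose base numbers add up to the target."""
    return [
        (a, b)
        for (a, xa), (b, xb) in product(group_a.items(), group_b.items())
        if xa + xb == target
    ]


pairs = candidate_pairs(X8_CANDIDATES, X9_CANDIDATES, ALLOTETRAPLOID_X)
for x8_taxon, x9_taxon in pairs:
    print(f"{x8_taxon} (x = 8) + {x9_taxon} (x = 9) -> x = {ALLOTETRAPLOID_X}")
```

The arithmetic only confirms that each pairing is numerically compatible with x = 17. It cannot discriminate between the x = 9 candidates, which is why the geographic and migration evidence is needed.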
Eurasia to North America migration of the Lactuca canadensis clade via land bridge(s) followed by colonization of the Azores — The Lactuca canadensis clade is estimated to have originated during the late Miocene to early Pliocene c. 4.8 mya (HPD 1.1–8.7) and the allopolyploid L. canadensis clade diverged from L. plumieri during the late Pliocene to early Pleistocene c. 2.7 mya (HPD 1.8–3.8) in the W Palearctic. The W Palearctic therefore represents a potential geographic region for the hybridization event between the Lactuca x = 8 and x = 9 species giving rise to the x = 17 L. canadensis clade (Fig. 3). The Azorean endemic L. watsoniana and the North American L. canadensis clade are estimated to have diverged from one another 1.6 mya (HPD 0.3–3.1; Fig. 3). The ancestral area for the crown node of the L. canadensis clade (node 6; Fig. 3) is estimated as either A (W Palearctic only; 67 %) or AD (W Palearctic with North America; 33 %). The Azores are indeed treated as part of the W Palearctic in our biogeographic analyses. We can, however, exclude the option that the hybridization event occurred in the Azores and was followed by LDD of the x = 17 clade to North America, because no other Lactuca species are native to the Azores and the extinction of both progenitors (x = 8 and x = 9) of the x = 17 lineage in the Azores is unlikely. Dias & al. (2018) therefore proposed a role of the BLB to explain the migration of the L. canadensis lineage from the W Palearctic to North America, and subsequent LDD to the Azores from North America.
Polyploid vigour is likely to have aided the successful colonization of, and rapid dispersal within, North America by Lactuca. Polyploidization events prior to and facilitating LDD have occurred within the Asteraceae. An example is Microseris D. Don, which is diploid in its ancestral area of North America yet allotetraploid in Australia as its derived area (Vijverberg & al. 1999). Similarly, plants of the silversword alliance (Argyroxiphium DC.) are diploid in their ancestral area yet tetraploid in the derived area of Hawaii (Baldwin & Wagner 2010). However, in other cases polyploidy has occurred after dispersal, e.g. in the amphitropical American disjunctions in Polemoniaceae (Johnson & al. 2012) or the Australian Lepidium L. [Brassicaceae; Mummenhoff & al. (2004)]. Qian (1999) showed that 48% of the native North American flowering plant genera are also native to Eurasia and studies using molecular tools have highlighted a strong influence of Beringia or the NALB facilitating the colonization of North America (Donoghue & Smith 2004; Liu & al. 2017). Therefore, we propose a W Palearctic or, considering the E Asian distribution of one of the x = 9 candidates (L. indica), an E Palearctic and not an Azorean origin for the North American L. canadensis clade. To support this further, a North American rather than European origin is likely for the Azorean L. watsoniana, because Alpine-Pyrenean origins, as is the case for L. plumieri (Wegmüller 1994), the x = 8 progenitor, are to our knowledge unknown for Macaronesian native taxa. When Macaronesian taxa do show a European origin, the association is typically with the Mediterranean region (Heleno & Vargas 2015), e.g. Bencomia Webb & Berthel. [Rosaceae; Helfgott & al. (2000)] and Sideritis L. [Lamiaceae; Barber & al. (2002)]. North American origins are, however, observed in Sedum L. (Crassulaceae) native to Madeira (Ham & Hart 1998) and likely explain the origin of the Azorean endemics Smilax azorica [Smilacaceae; Schaefer & Schoenfelder (2009)] and Solidago azorica Seub. [Asteraceae; Schaefer (2015)]. Previous studies also revealed a potential biogeographic relationship between the New World and Macaronesia in Bystropogon L'Hér. [Lamiaceae; Trusty & al. (2005)]. The North American members of the L. canadensis clade are strong dispersers and their broad distribution would facilitate dispersal to the Azores. Potentially two or more introductions to the Azores would have occurred, thereby contributing to the high levels of intra-specific genetic variation in L. watsoniana observed by Dias & al. (2016; see also Stuessy & al. 2014). Polyploid colonizers of islands, in particular neopolyploids, such as this relatively recent allopolyploid lineage in the evolution of Lactuca, are at an advantage for colonization and diversification in archipelagos (Crawford & Stuessy 1997; Stuessy & al. 2014). The current islands of occupancy for L. watsoniana are younger than the estimated age for L. watsoniana in the present study: São Jorge is 1.32 myr and all other islands of occupancy are <1 myr (Johnson & al. 2012; Sibrant & al. 2015; Ávila & al. 2016; Ramalho & al. 2017). However, the western islands of the Azores, i.e. Flores and Corvo, are older [2.16 myr; Azevedo & Ferreira (1999) and 1.5 myr; França & al. (2002), respectively]. Lactuca watsoniana populations may therefore have initially colonized the older, western islands and subsequently gone extinct there, supporting a west-east migration path of the L. canadensis clade from North America (Dias & al. 2018).
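The age comparison in this argument can be made explicit with a short Python sketch. It uses only the island ages and the divergence estimate quoted above; the names `ISLAND_AGES_MYR`, `predates_estimate` and `within_hpd` are hypothetical, and treating an island as available for colonization once its age exceeds the estimate is a simplifying assumption.

```python
# Island ages (myr) and the L. watsoniana divergence estimate (mya),
# as quoted in the text. All names here are hypothetical.
ISLAND_AGES_MYR = {"São Jorge": 1.32, "Flores": 2.16, "Corvo": 1.5}
DIVERGENCE_MYA = 1.6         # point estimate
DIVERGENCE_HPD = (0.3, 3.1)  # HPD interval


def predates_estimate(age, estimate=DIVERGENCE_MYA):
    """True if the island is at least as old as the point estimate."""
    return age >= estimate


def within_hpd(age, hpd=DIVERGENCE_HPD):
    """True if the island age lies inside the HPD interval."""
    low, high = hpd
    return low <= age <= high


for island, age in ISLAND_AGES_MYR.items():
    print(
        f"{island}: {age} myr, "
        f"predates point estimate: {predates_estimate(age)}, "
        f"within HPD: {within_hpd(age)}"
    )
```

Because the HPD interval is wide, every island age listed falls inside it, so the comparison with the point estimate should be read as indicative only.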
Competing hypotheses for the hybridization event, geographic location and migration pathway in the ancestry of the Lactuca canadensis clade — We have provided arguments to support a Palearctic (Eurasian) origin of the North American Lactuca canadensis clade and a North American origin of the Azorean L. watsoniana. Now we discuss the geographic location of the hybridization event and migration pathway of the L. canadensis lineage to North America. To date, Lactuca species that are x = 17 have never been reported from the Palearctic. Furthermore, robust and vigorous allopolyploid lineages that are the result of hybridization between diploid progenitors, such as the members of the L. canadensis clade (Fig. 4B), would likely persist in their region of origin. It is therefore unlikely that the allopolyploid hybrid lineages would have gone extinct in Eurasia. In conclusion, we propose two competing hypotheses for the location of the hybridization event and migration pathway in the ancestry of the L. canadensis lineage. In both hypotheses, dispersal via wind would have facilitated LDD of Lactuca achenes at periods when the NALB and BLB were continuous island chains rather than unbroken land bridges:
Hypothesis 1: Migration via both the NALB and BLB. Lactuca plumieri is the closest extant relative to the x = 8 progenitor species and is restricted to SW Europe with an Alpine-Pyrenean distribution. In contrast, the candidate species L. indica and members of the L. tatarica clade are present in E Asia. The BLB could therefore have facilitated the migration of the x = 9 candidate to North America, and the NALB that of L. plumieri. Ancestors of both may have co-occurred in North America and hybridized, giving rise to the x = 17 lineage there. The diploid progenitors would have subsequently gone extinct in North America.
Hypothesis 2: Migration via the NALB only. The hybridization event may have occurred in a region either where species extinction has been common or where populations of widespread species were subject to extinction due to geological and/or climatic factors. Therefore, the hybridization event may have occurred on the NALB between ancestors of L. sibirica (x = 9), because this species reaches NW Europe, and L. plumieri. While it was generally assumed that the NALB persisted only until c. 15–10 mya, more recent evidence suggests that it likely provided a closely spaced island chain facilitating plant migration until the late Miocene (Tiffney 2008; Denk & al. 2010; Wen & al. 2016). The NALB is a potential explanation for the disjunct distributions seen today in Eurasian and North American native oak species, based on genetic diversity analyses (Denk & al. 2009). The eventual break-up of the NALB would have caused the progenitor species populations in that region to decline, leading to the isolation of the L. canadensis lineage from its L. sibirica and L. plumieri progenitors, which are now either extinct or are restricted to Eurasia.
The present study reveals a potential role for both the NALB and BLB in shaping the biogeographic history of Lactuca. Therefore, we highlight novel biogeographic evidence for more recent migrations across the NALB and BLB to North America compared to previous studies. Analyses in the present study give strong insights into the origin of native Lactuca species in North America, but further research using more genetic loci, e.g. via hybrid capture methods (Mandel & al. 2014), will reveal deeper understanding of the diversity patterns and origins of the North American lineages. It should be noted that because the clone type B clade is itself part of a larger polytomy, we cannot entirely exclude the possibility that the 2n = 18 ancestry is located in one of the other clades of this polytomy. Finally, our considerations are based on a single low-copy nuclear marker. Therefore, to arrive at a clearer hypothesis on the origin of the allotetraploid L. canadensis clade, further research is ongoing using phylogenomic approaches and inference of subgenomes present in allopolyploid lineage taxa, as well as including all 2n = 18 taxa of the entire Lactuca clade.
Novel insights into the diversity of the North American members of the Lactuca canadensis clade — The North American members of the Lactuca canadensis clade comprise some seven species. The clade is widespread in North America, occurring from Yukon in the north to Mexico and Guatemala in the south. Its members occur in a range of habitat types, such as mesic forests and woods, especially in openings and clearings, edges of woods, thickets and forests, stream banks and swamps, prairies, as well as disturbed habitats (McVaugh 1972; Doležalová & al. 2002; Strother 2006). Taxonomy and geographic distribution of the North American clade outside of the U.S.A. and Canada are in need of revision. Members of the North American clade reported for the Caribbean (Acevedo-Rodríguez & Strong 2012) are L. canadensis, L. floridana and L. graminifolia. Whether these occurrences are native or introduced is unclear. A further species, the endemic L. jamaicensis Griseb., was described in the 19th century from Jamaica (Grisebach 1861); our investigation of the type material revealed that it is in fact conspecific with L. indica and must therefore represent a human introduction, most likely via SE Africa in the course of the 18th to 19th century slave and sugarcane trade. For Mexico, Villaseñor (2016) listed five native species of Lactuca, one of which, Lactuca intybacea Jacq., is an introduced species of Launaea Cass. (Kilian 1997) and another, L. oblongifolia (under L. pulchella), is, according to our revision of herbarium material in the course of this study, likely a misidentification of the introduced L. serriola, which also holds true for its report from Texas (Strother 2006, under Mulgedium pulchellum). The other three species reported by Villaseñor (2016) are of the North American clade: L. ludoviciana, L. graminifolia and the largely neglected Mexican L. brachyrrhyncha Greenm. (Greenman 1899).
The most commonly reported species of the North American allopolyploid clade in Mexico (including its southern states) is L. graminifolia. McVaugh (1972) distinguished three varieties: besides the typical variety he described var. arizonica McVaugh from SW U.S.A., and var. mexicana McVaugh for the plants in S Mexico and Guatemala. Lactuca graminifolia var. mexicana from C and S Mexico shows in the nrDNA analysis a unique ribotype resolved in a separate clade sister to the L. canadensis clade (Fig. 1). The seven samples sequenced here for the nrITS region (a–g) come from the southern states of Chiapas and Oaxaca and the central states of Querétaro and San Luis Potosí. The number of nucleotide substitutions, three each in ITS and ETS, suggests that these plants are clearly distinct. The main morphological difference given by McVaugh (1972) is the shorter beak of the achene of 1–1.7(–2) mm compared to (1.5–)2–4 mm in typical L. graminifolia. McVaugh's L. graminifolia var. mexicana strongly resembles L. brachyrrhyncha described from C Mexico, the most conspicuous feature of which is an even shorter beak of only c. 0.5 mm (Fig. 5F, G). Lactuca brachyrrhyncha was apparently neglected by McVaugh and many other botanists. Given the variability of the beak length in L. graminifolia var. mexicana and the otherwise strong similarities with L. brachyrrhyncha, we assume that the type collection of L. brachyrrhyncha represents a population with particularly short-beaked achenes and that both names refer to the same taxon. The leaves in the type collection of L. brachyrrhyncha are all entire and linear-lanceolate, whereas they are deeply pinnatifid in the type material of L. graminifolia var. mexicana, but leaf shape shows high infraspecific variation within American Lactuca species. We conclude from our analysis that the populations in C and S Mexico and Guatemala previously attributed to L. graminifolia are better recognized as a separate species, and L. brachyrrhyncha provides the name for it at species rank.
We know from our cultivations of Lactuca biennis, L. canadensis, L. floridana, L. graminifolia and L. hirsuta at the University of Tennessee Gardens that at least these, but probably all other L. canadensis clade members, are strictly biennial and self-compatible. Four of the five members of the sister group to L. biennis and L. floridana, the widespread and morphologically diverse species L. canadensis, L. graminifolia (incl. var. graminifolia and var. arizonica), L. hirsuta and L. ludoviciana, all exhibit an identical ribotype (sister to L. brachyrrhyncha) and haplotype, according to nrDNA and plastid DNA analyses, respectively. They are not always easy to distinguish morphologically, because of strong variation and often shallow differentiation. Moreover, their status is sometimes not fully clear: L. hirsuta, with a scattered occurrence across a wide area, is usually strikingly distinct from L. canadensis because of its indumentum; it may be suspected, however, that L. hirsuta represents a morph of L. canadensis that occurs repeatedly and individuals of self-compatible variants such as this morph can perpetuate by selfing.
Hybridization is also likely when we take morphological observations into account: Lactuca graminifolia, which begins flowering in early spring, and L. canadensis, which flowers in summer, co-occur in the southeastern states of the U.S.A. and, when the flowering periods overlap, fertile hybrids are known to become locally abundant (Whitaker 1944). Lactuca canadensis can hybridize with L. biennis (forming L. ×morsii B. L. Rob.), and hybrid plants exhibit morphological characteristics intermediate between the two species. This group may exhibit phenotypic plasticity in different habitat types, giving rise to high levels of morphological diversity, while its weak reproductive barriers enable hybridization in regions where species overlap. Self-compatibility has likely facilitated rapid geographic range expansion and the colonization of novel habitats. Recent records of L. canadensis from Brazil as a result of introductions (Monge & al. 2016), and the same explanation for the occurrence of this species in Haiti and the Dominican Republic (Acevedo-Rodríguez & Strong 2012), indicate that L. canadensis is potentially invasive in a number of regions because it favours disturbed habitats and can rapidly become established.
Novel insights into genetic diversity of the Lactuca lineage — In this study, expanded taxonomic sampling of the Lactuca lineage and using a further two nuclear markers, as well as direct sequencing and cloning of a low-copy nuclear marker (A44), resulted in greater resolution within the Lactuca lineage as well as shedding light on the incongruent patterns between nuclear and plastid DNA markers. Here we discuss the outcome for two taxa, L. serriola and the L. racemosa clade.
Lactuca serriola, the wild progenitor of lettuce and originating from SW Asia (Kuang & al. 2008), is known to have a large and variable gene pool and is an invasive species in many parts of Europe, in North America and in Australia (Hooftman & al. 2006; D'Andrea & al. 2017). In the past c. 20 years this taxon has rapidly increased its geographic range across Central and W Europe including NW Europe and parts of Scandinavia, potentially caused by global climate change, increased availability of disturbed ruderal habitats that provide ideal conditions for this pioneer taxon and increased opportunity for hybridization with conspecific crop species (Hooftman & al. 2006). The single L. serriola sample included here from its native range in Europe was distinct from the three samples from its introduced range in North America. It is notable that within the North American samples, infraspecific variation is also found in the plastid DNA dataset. Given the genetic variation observed here, further sampling incorporating more genetic loci and climatic data analyses would be highly beneficial to better understand the causes and level of invasiveness of L. serriola (see also D'Andrea & al. 2017).
Kilian & al. (2017b) resolved a clade in their nrITS tree that consisted of two SW Asian taxa: L. racemosa and L. macrophylla. Both have overlapping distributions in the wider Caucasus region (Meusel & Jäger 1992: 315, map 539d). The nrITS analyses by Kilian & al. (2017b) incorporated two accessions of each taxon and revealed that L. macrophylla was non-monophyletic, suggesting hybridization between these taxa or ILS. All analyses (nrDNA, plastid DNA, and A44, Fig. 1 and 2) in the present study corroborated this clade. According to the plastid DNA analysis, geographic structuring is apparent within L. racemosa between samples from the Caucasus (L. racemosa a; Fig. 1) and a typical high-altitude population (2200 m) in NE Turkey, Sarikanis (L. racemosa b; Fig. 1). Cloned sequences of the A44 region from L. racemosa (sampled from Turkey) and L. macrophylla reveal that neither taxon is monophyletic, because allelic copies from each taxon were resolved in the same clades (Fig. 2). These results therefore support the likely occurrence of hybridization between these taxa, which was relatively recent (Pleistocene) according to their time of origin based on nrDNA divergence time estimations (Fig. 1 and 3). In the Caucasus both taxa occur at high altitudes yet in different ecological regions; L. racemosa is commonly associated with subalpine birch forests of the Greater Caucasus Mountains (Togonidze & Akhalkatsi 2015), whereas L. macrophylla is associated with subalpine tall herbaceous vegetation (Nakhutsrishvili & al. 2006). Currently, a phylogenetic study with a focus on Lactuca in SW Asia is underway and will provide more detail on diversity patterns (Güzel & al. in prep.).
Lactuca brachyrrhyncha Greenm. in Proc. Amer. Acad. Arts 34: 578. 1899. – Syntypes: [Mexico] Tlalnepantla, Valley of Mexico, Federal District, 2225 m, 6 Jul 1898, C. G. Pringle 6883 (E00394943!, GOET001764!, JE00009358!, K000222233!, M0030839!, NY00180415!, S-G-3516!, US00119867!).
= Lactuca graminifolia var. mexicana McVaugh in Contrib. Univ. Michigan Herb. 9: 370. 1972, syn. nov. – Holotype: Mexico, Chiapas, Mpio. Teopisca, 1770 m, 19 Aug 1966, D. E. Breedlove 15041 (MICH1107492!; isotypes: CAS0003144!, ENCB003713!, LL00374396!).
Distribution — C and S Mexico and Guatemala. The northern delimitation of its distribution area to Lactuca graminifolia and possible introgression in the contact areas need further investigation.
Lactuca indica L., Mant. Pl. Alt.: 278. 1771. – Lectotype (designated by Merrill in Bot. Mag. (Tokyo) 51: 192, t. 3. 1937): Osbeck 13, Herb. Linn. No. 950.8 (LINN!).
= Lactuca jamaicensis Griseb., Fl. Brit. W. I.: 384. 1861, syn. nov. – Syntypes: Jamaica, W. T. March 2004 (GOET001769!, K000222235!); Jamaica, St Marys, 1839, G. McNab 61 (GOET001770!); Jamaica, G. McNab (GOET001771!).
Distribution — Used as salad but also occurring as a ruderal. Native to E and (part of) SE Asia. It was introduced in historical times to SE Africa, where it is naturalized. In the 18th or 19th century it was introduced, likely from SE Africa, to Jamaica, which was for a long time its only place of occurrence in the Americas. Only in the late 20th century was the species introduced also to Brazil (Monge & al. 2016).
We thank the Verein der Freunde des Botanischen Garten and Botanischen Museums Berlin and the Hesler Fund of the University of Tennessee for funding of the molecular lab work and the herbaria GH, GOET, US and NY for the loan of material. The use of high performance computing resources at the Scientific Computing Service of the Freie Universität Berlin and of the CIPRES Science Gateway of the San Diego Supercomputer Center at the University of California San Diego is gratefully acknowledged. We are grateful to Bettina Giesicke, Doreen Weigel and Julia Pfitzner for excellent support in the lab, to Phuong Thao Nguyen for support finalizing sample files for submission, to Randy Small for help in development of the A44 marker, and to Thomas Denk and a second anonymous reviewer for their valuable comments and suggestions to improve the final version of the manuscript.