Agricultural practices routinely increase interactions between crops and their wild relatives, often resulting in hybridization (Ellstrand et al., 1999; Warwick and Stewart, 2005). Repeated hybridization events can lead to gene flow, allelic introgression, and significant evolutionary change in wild populations (Ellstrand et al., 1999, 2013). In extreme cases, hybridization can produce new hybrid lineages (e.g., Secale cereale L. × S. montanum Guss. [Sun and Cooke, 1992], Manacus vitellinus Gould × M. candei Parzudaki [McDonald et al., 2001], Helianthus annuus L. × H. debilis Nutt. [Whitney et al., 2006]). More generally, crop gene introgression into wild or weedy relatives could have serious consequences for the genetic diversity and genetic composition of populations, analogous to those diversity consequences created by introducing species into ecological communities (Sax and Gaines, 2003). Introduced alleles compete for space in population gene pools as introduced species compete for niche space in invaded communities. With this premise, one could explore consequences of crop allele introgression using metrics from community ecology. Yet, to date, such studies continue to rely on conventional population genetic parameters to the exclusion of community diversity indices.
Slobodkin (1961) was among the first workers to suggest that introduced species might increase species diversity of recipient communities. Rosenzweig (1985) corroborated this prediction with the caveat that increased diversity would occur only at regional and local scales, with introductions reducing species diversity at global scales. Although community ecologists frequently predict increased species diversity at local or regional scales with species introductions, population biologists often predict the opposite when considering allele introductions, largely due to three influential papers in population genetics. First, Slatkin (1987) demonstrated that directional gene flow has a homogenizing effect on genetic diversity between populations. Second, prior to the recent genetically engineered crop agricultural revolution, Ladizinsky (1985) revealed that crops were generally less genetically diverse than their wild relatives. Finally, Ellstrand et al. (1999) hypothesized that directional gene flow from genetically depauperate and locally abundant crops to their genetically diverse and less locally abundant wild relatives would lead to reductions of genetic diversity in the wild relative, through genetic swamping. Consistent with this hypothesis, near extinction of endemic species involved in crop–wild hybridization events has been documented for multiple crops, including rice (Oryza rufipogon Griff. subsp. formasana Masam. & S. Suzuki [Kiang et al., 1979]), cotton (Gossypium darwinii G. Watt and G. tomentosum Nutt. [Wendel and Percy, 1990]), and mulberry (Morus rubra L. [Burgess et al., 2005]). By contrast, new breeding strategies may lead to increased genetic diversity at single (transgenic breeding) or multiple (mutation breeding) crop-specific loci after cultivars hybridize with wild populations (Lusser et al., 2012; Hartung and Schiemann, 2014). Therefore, the predicted genetic consequences of crop–wild hybridization remain unclear.
Theoretical population geneticists have devised many metrics for quantifying allelic diversity within populations (Berg and Hamrick, 1997). In assessing population-level polymorphisms, so-called alpha-diversity metrics focus on estimating number of alleles per locus or frequency of heterozygous genotypes (e.g., Nei and Roychoudhury, 1974). Unfortunately, these metrics do not distinguish among alleles. Consequently, they can produce equivalent estimates of genetic diversity regardless of whether alleles in the gene pool represent the original (wild) alleles or replacement (conventional or transgenic crop) alleles (e.g., Nei's heterozygosity [HN], allelic frequency [A], temporal changes in allele frequency [Ftemporal], effective population size [Ne]; Schwartz et al., 2007). Furthermore, many of these metrics fail to identify changes in frequency of particular alleles (e.g., measures of evenness that assess the relative frequency of alleles or species, including the Margalef or Brillouin indices; Heip et al., 1998), and where genetic metrics do represent multilocus estimates, the resulting value averages genetic patterns across loci. We expect the averaging effect of multilocus estimates could hinder detection of new alleles introgressed at some loci (e.g., percent polymorphic loci, number of alleles per polymorphic locus), especially when selection intensity varies among loci (Barton and Hewitt, 1985; Hartman et al., 2013). Therefore, genetic introgression may be severe at some loci while nonexistent at other loci, and multilocus estimates may describe an intermediate effect, at best (Baack and Rieseberg, 2007). At worst, diversity measured across loci could miss allele frequency changes and substitutions at specific loci.
If a conservation goal is to maintain a naturally occurring quantity and quality of allelic diversity in populations of wild or weedy relatives (e.g., Allendorf et al., 2001; Mallet, 2005; Edmands, 2007), we must be able to accurately measure changes in allele frequency (alpha-diversity [α-diversity], including the Shannon–Weiner index [H], the Simpson diversity index [D], evenness) and substitutions (beta-diversity [β-diversity], including the Jaccard similarity coefficient or estimates of biotic homogenization). Conclusions made from the genetic consequences of crop allele introgression may be limited by the tools available (because genomic tools may not be available for nonmodel organisms, such as wild relatives) and may benefit from using community ecology metrics, especially those that assess allelic frequency changes (i.e., α-diversity [Whittaker, 1972]), substitutions (i.e., β-diversity [Whittaker et al., 2001]), and homogenization (Olden and Poff, 2003; Olden et al., 2004).
Genetic homogenization can occur when crop alleles introgress into wild populations; the genetic uniqueness of a population is predicted to decline, with a resulting increase in genetic similarity of two hybridizing gene pools (i.e., a decrease in β-diversity [Hobbs and Mooney, 1998; McKinney and Lockwood, 1999; Olden and Poff, 2003; Olden et al., 2004]). Historically, biotic homogenization has been applied to taxonomic or functional groups (Hobbs and Mooney, 1998; McKinney and Lockwood, 1999). To our knowledge, this is the first attempt to use biotic homogenization metrics to characterize genetic homogenization in any biological population. Through the process of introgression, one common allele could be exchanged with another, resulting in no net loss of genetic diversity (Olden and Poff, 2003; Olden et al., 2004), although the quality of genetic diversity could have changed. Alternatively, introgression could replace a rare allele with representatives of an already common allele within the population, thus representing a loss of allelic diversity (Olden and Poff, 2003; Olden et al., 2004). Furthermore, a new rare allele, such as a transgene, may replace some representatives of a resident, common allele, thus increasing allelic diversity. However, rare alleles are expected to be the most easily lost from small populations (Fuerst and Maruyama, 1986). Crop allele introgression into small populations of wild plants should lead to the loss of such alleles but may also introduce new rare alleles from the crop population. Finally, the degree of population divergence could also be measured by using similarity indices (i.e., Jaccard similarity coefficient [Jaccard, 1901]) or cluster analyses (e.g., Pritchard et al., 2000).
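The distinction between the quantity and the identity of alleles can be illustrated with a minimal Python sketch. The allele labels and counts are invented for illustration and are not taken from the beet data set; the function names are hypothetical, and the Simpson index is shown without any finite-sample correction.

```python
import math

def shannon(counts):
    """Shannon index H = -sum(p * ln p) over allele frequencies p."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total)
                for c in counts.values() if c > 0)

def simpson(counts):
    """Simpson diversity 1 - sum(p^2), without finite-sample correction."""
    total = sum(counts.values())
    return 1.0 - sum((c / total) ** 2 for c in counts.values())

def jaccard(alleles_a, alleles_b):
    """Jaccard similarity: shared alleles / alleles seen in either population."""
    a, b = set(alleles_a), set(alleles_b)
    return len(a & b) / len(a | b)

# Hypothetical allele counts at one locus (60 gene copies per population).
wild = {"w1": 40, "w2": 20}
hybrid = {"w1": 40, "c1": 20}   # crop allele c1 has replaced wild allele w2
crop = {"c1": 60}

# Alpha-diversity is unchanged by the substitution ...
print(shannon(wild) == shannon(hybrid), simpson(wild) == simpson(hybrid))
# ... but similarity to the crop gene pool rises from 0.0 to 0.5.
print(jaccard(wild, crop), jaccard(hybrid, crop))
```

In this toy case the substitution of one allele for another leaves H and D untouched, while the Jaccard coefficient registers the shift in composition toward the crop.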
We illustrate this alternative approach to measuring genetic consequences of hybridization using a multilocus data set on wild (Beta vulgaris L. subsp. maritima) and putative crop–wild hybrid beet populations (B. vulgaris subsp. vulgaris × B. vulgaris subsp. maritima (L.) Thell.) scattered throughout Europe (Bartsch et al., 1999). Each putative hybrid population was located near one of three nontransgenic crop populations—sugar beet, red beet, or Swiss chard. We compared the genetic diversity of wild and putative crop–wild hybrid beet populations to estimate consequences of introgression prior to the introduction of genetically engineered B. vulgaris crops. This comparison contributes to a comprehensive risk assessment of the probability that transgenes or other novel crop traits, if introduced, would be transferred to the rare, wild relative, and of the potential evolutionary consequences of such a transfer. Using typical population genetic metrics, often multilocus (e.g., number of alleles, number of alleles per locus, percent polymorphic loci, and Nei's heterozygosity), Bartsch et al. (1999) failed to detect any significant change in genetic diversity in crop–wild hybrid populations relative to wild populations. Here, by comparing species diversity indices with traditional population genetic metrics, we assess the genetic consequences of crop–wild hybridization in the B. vulgaris crop–wild–weed complex and the relative sensitivity of these two approaches to the effects of gene flow. We ask: (1) When parental populations share significant amounts of genetic diversity, which diversity metrics (multilocus vs. single-locus, population genetic vs. community diversity metrics) are sensitive enough to detect hybridization events? and (2) What are the genetic consequences of crop–wild hybridization?
Viewing introgression as analogous to species invasion, we suggest that increased genetic diversity may likewise be an undesirable outcome depending on the protection goal in species conservation. In cases where “genetic integrity” of natural populations is of foremost importance, any detection of “invasive alleles” should be weighed against potential benefits of adaptive evolution in a dynamic environment.
Study system—The common beet (B. vulgaris, Amaranthaceae) has been cultivated for more than 2000 yr in the Mediterranean (Ford-Lloyd and Williams, 1975). Domestication of subspecies vulgaris has resulted in three commonly cultivated forms: red beet, sugar beet, and Swiss chard. Within this species complex, cultivated beet (B. vulgaris subsp. vulgaris) is sexually compatible with a wild relative, sea beet (B. vulgaris subsp. maritima [Bartsch et al., 1999; Hanelt, 2001; Bartsch, 2010]). Cross-pollination between cultivated and wild subspecies is typically mediated by wind pollination, although insect pollination can also occur (Bartsch et al., 1999; Hanelt, 2001; Bartsch, 2010). However, farmers typically remove F1 crop–wild hybrids from vegetative root production fields because the plants are easily identifiable (as they flower earlier than the biennial crop plants). Therefore, we are only concerned with the consequences of gene flow and introgression from the crop plants to the wild relative. Wild B. vulgaris subsp. maritima is typically found along the coasts of Cape Verde, the Canary Islands, and the Atlantic coast of Europe. It is also found as far east as India (Ford-Lloyd and Williams, 1975). Weed beets are annual, hybrid derivatives of crop and wild beet populations, and below we use the term weed beet interchangeably with crop–wild hybrid beet populations (Boudry et al., 1993; Arnaud et al., 2010; Bartsch, 2010).
Data set—We used data for 19 wild populations and 20 putative hybrid populations, and 14 sugar beet cultivars, five Swiss chard cultivars, and seven red beet cultivars (Bartsch et al., 1999). The number of sampled individuals per population or cultivar varied from five to 106. Wild plants were collected from their complete range across Europe and putative hybrid plants were collected from northeastern Italy, one of the most important sugar beet seed production areas in Europe (see fig. 1 in Bartsch et al., 1999).
Data were generated from isozyme analysis of 12 polymorphic loci (for details see Bartsch et al., 1999). Unequal sample sizes can make some populations appear to have greater genetic diversity simply because more plants were sampled. Therefore, we removed from all analyses any population from which fewer than 10 individuals were genotyped (one putatively wild population of the 19 sampled and two putatively hybrid populations of the 20 sampled were removed). Some populations were physically close to each other and may have been a single genetically coherent population, despite (possible) geographic separation. To ensure that each of the 36 wild or putatively hybrid populations and 26 crop cultivars sampled was an evolutionarily independent unit, we used a Bayesian clustering method to group populations within the categories wild, putative hybrid, or crop that could not be considered significantly different (Pritchard et al., 2000). From this clustering, we were left with eight hybrid populations, 10 wild populations, and two crop populations (sugar beet vs. chard/red beet). Each clustered population included more than 30 individuals. To equalize sample size of these populations, we then randomly selected the genotypes of 30 individuals per population to analyze below. The data set is freely available from the Dryad Digital Repository (http://dx.doi.org/10.5061/dryad.2d95h; Campbell et al., 2016).
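The filtering and subsampling steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure actually run on the data: the population names, the list-of-individuals data structure, and the random seed are hypothetical, and the Bayesian clustering step that sits between filtering and subsampling is not reproduced here.

```python
import random

MIN_GENOTYPED = 10    # populations with fewer individuals were dropped (from the text)
SUBSAMPLE_SIZE = 30   # individuals drawn per clustered population (from the text)

def drop_small(populations, minimum=MIN_GENOTYPED):
    """Keep only populations with at least `minimum` genotyped individuals."""
    return {name: inds for name, inds in populations.items()
            if len(inds) >= minimum}

def equalize(populations, size=SUBSAMPLE_SIZE, seed=0):
    """Draw `size` individuals without replacement from each population."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    equalized = {}
    for name, inds in populations.items():
        if len(inds) < size:
            raise ValueError(f"{name} has fewer than {size} individuals")
        equalized[name] = rng.sample(inds, size)
    return equalized

# Hypothetical populations: integers stand in for individual genotype records.
populations = {"wild_A": list(range(50)),
               "wild_B": list(range(7)),
               "hybrid_C": list(range(31))}
kept = drop_small(populations)   # wild_B is removed
balanced = equalize(kept)        # 30 individuals per remaining population
print(sorted(balanced), [len(v) for v in balanced.values()])
```

Sampling without replacement keeps each retained individual at most once, so every population contributes the same number of genotypes to the diversity estimates.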
Genetic metric calculations—To provide statistics comparable with the original Bartsch et al. (1999) paper, we estimated the genetic diversity of each population by calculating the number of alleles (A), number of alleles per polymorphic locus (Ap), percent polymorphic loci (P), and Nei's heterozygosity (HN) (Nei, 1978) for each locus. We then repeated these analyses to calculate a multilocus estimate for each population, to determine whether multilocus analyses might have masked any effects of hybridization and homogenization. For instance, the multilocus estimate of number of alleles per polymorphic locus was calculated as the total number of alleles detected in any locus divided by the number of polymorphic loci in a particular population. Next, we calculated the genetic diversity of each population using the Shannon–Weiner index (H), Simpson diversity index (D), and McIntosh evenness index (E), calculating both single and multilocus estimates using the software Species Diversity and Richness (Seaby and Henderson, 2006). Specifically, for each individual within each population, multilocus genotypes were defined combining the genotypic information of loci (as in Siol et al., 2008). Multilocus diversity was then measured using the Simpson diversity index model corrected for finite sample size (Pielou, 1969):
$$D = 1 - \sum_{i} \frac{n_i (n_i - 1)}{N (N - 1)}$$
where n_i is the number of individuals with multilocus genotype i and N is the total number of multilocus genotypes. Finally, to measure the similarity of genetic composition among hybrid populations with either wild or crop populations, we calculated the Jaccard similarity coefficient for all pairwise combinations of hybrid populations with wild or crop populations. There are many indices for measuring biotic homogenization but, as reviewed by Olden and Poff (2003) and McKinney (2004), most previous studies have used the Jaccard similarity coefficient and so we follow this convention.
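A minimal Python sketch of the multilocus Simpson calculation follows. It assumes the standard finite-sample form D = 1 - sum(n_i(n_i - 1)) / (N(N - 1)), with N read as the number of sampled individuals (one multilocus genotype each). The two locus names echo alleles mentioned in this paper, but the genotypes themselves are invented, and the helper names are hypothetical.

```python
from collections import Counter

def multilocus_genotype(individual):
    """Combine per-locus genotypes into one hashable multilocus genotype.

    `individual` maps locus name -> pair of alleles; allele order within a
    locus is ignored because isozyme genotypes are unphased.
    """
    return tuple((locus, tuple(sorted(alleles)))
                 for locus, alleles in sorted(individual.items()))

def simpson_finite(genotypes):
    """Simpson diversity corrected for finite sample size.

    D = 1 - sum(n_i * (n_i - 1)) / (N * (N - 1)), where n_i is the number of
    individuals carrying genotype i and N the number of individuals (assumed).
    """
    counts = Counter(genotypes)
    n_total = sum(counts.values())
    if n_total < 2:
        raise ValueError("need at least two individuals")
    pairs_same = sum(n * (n - 1) for n in counts.values())
    return 1.0 - pairs_same / (n_total * (n_total - 1))

# Four hypothetical individuals scored at two loci.
sample = [
    {"MDH2": (1, 1), "ACO1": (1, 2)},
    {"MDH2": (1, 1), "ACO1": (2, 1)},   # same genotype as the first
    {"MDH2": (1, 2), "ACO1": (1, 1)},
    {"MDH2": (2, 2), "ACO1": (1, 1)},
]
genotypes = [multilocus_genotype(ind) for ind in sample]
print(round(simpson_finite(genotypes), 4))  # 1 - 2/12 -> 0.8333
```

Here D is the probability that two individuals drawn without replacement carry different multilocus genotypes, which is why the correction uses n_i - 1 and N - 1.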
To assess the effect of hybridization on rare alleles in populations, we first identified alleles that significantly contributed to gene diversity (SIGNALs [see Kamala et al., 2006]; see description of the method in the following sentence) and subtracted this value from the total number of alleles (A) to estimate the number of rare alleles within a population. To identify SIGNALs, we calculated the 95% confidence interval around each allele frequency using the following equation (Snedecor and Cochran, 1967):
where q_ij = 1 − p_ij and N = the number of alleles genotyped. We then used a repeated-measures ANOVA (assuming a Poisson distribution) in SPSS (IBM SPSS Statistics v. 22; IBM, Armonk, New York, USA) to determine whether the number of rare alleles (non-SIGNAL alleles) differed between wild and putative hybrid populations, where locus was the within-subjects repeated measure.
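The rare-allele count can be sketched in Python under two loudly stated assumptions, neither of which is confirmed by the text as it appears here: (1) the confidence interval is the usual normal approximation p ± 1.96·sqrt(p·q/N), and (2) an allele counts as a SIGNAL when the lower bound of that interval is above zero. The allele labels and counts are invented, and the function names are hypothetical.

```python
import math

Z_95 = 1.96  # two-sided 95% normal quantile

def allele_ci(p, n_copies, z=Z_95):
    """ASSUMED interval: p +/- z * sqrt(p * q / N), clipped to [0, 1]."""
    q = 1.0 - p
    half_width = z * math.sqrt(p * q / n_copies)
    return max(0.0, p - half_width), min(1.0, p + half_width)

def count_rare_alleles(allele_counts):
    """Number of alleles at one locus that are not SIGNALs.

    `allele_counts` maps allele -> number of copies in one population.
    ASSUMED rule: an allele is a SIGNAL if its CI lower bound exceeds zero;
    rare alleles = total alleles present (A) minus SIGNALs.
    """
    n_copies = sum(allele_counts.values())
    present = {a: c for a, c in allele_counts.items() if c > 0}
    signals = [a for a, c in present.items()
               if allele_ci(c / n_copies, n_copies)[0] > 0.0]
    return len(present) - len(signals)

# Hypothetical locus: 30 diploid individuals, so 60 gene copies.
locus = {"a": 50, "b": 8, "c": 2}
print(count_rare_alleles(locus))  # only allele "c" fails the rule -> 1
```

Under these assumptions an allele seen in two of 60 gene copies is classed as rare, whereas one seen in eight copies is not; a different interval or threshold would shift that boundary.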
Detecting hybridization with population genetic vs. community ecology diversity metrics—Of the four population genetic metrics and three species diversity indices, all calculated using a multilocus approach, only one (HN) detected a change in genetic diversity in hybrid populations (Fig. 1). Relative to populations of wild sea beet, hybrid populations exhibited significantly larger HN (Mann–Whitney U = 16, z = −2.09, P = 0.037). However, we did not detect differences in other multilocus metrics of genetic diversity, including percent polymorphic loci (U = 40, z = 0.04, P = 0.10), number of alleles per polymorphic locus (U = 40, z = 0.04, P = 0.10), total number of alleles (U = 30, z = −0.84, P = 0.40), Shannon–Weiner's H (U = 23, z = −1.47, P = 0.14), Simpson's D (U = 26, z = −1.20, P = 0.23), or McIntosh's E (U = 47, z = −1.41, P = 0.16).
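As a rough check on how a Mann–Whitney U value translates into z and P, the following Python sketch applies the normal approximation with a continuity correction and no tie correction. The group sizes (10 wild and 8 hybrid populations) are taken from the clustering result reported in the Data set section; that these were the sizes used in the test, and that this approximation was the one applied, are assumptions.

```python
import math

def mann_whitney_z(u, n1, n2, continuity=True):
    """Normal approximation to the Mann-Whitney U test (no tie correction).

    Returns (z, two-sided P). The continuity correction shrinks |U - mean|
    by 0.5 before standardizing.
    """
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    diff = u - mean_u
    if continuity:
        diff = math.copysign(max(abs(diff) - 0.5, 0.0), diff)
    z = diff / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided

# HN comparison from the text: U = 16 with 10 wild and 8 hybrid populations.
z, p = mann_whitney_z(16, 10, 8)
print(round(z, 2), round(p, 3))  # -2.09 0.037
```

With these inputs the sketch returns the z and P reported for HN, which suggests, but does not establish, that a continuity-corrected normal approximation underlies the reported statistics.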
In contrast to the multilocus results, single locus comparisons using species diversity metrics were more informative. Relative to wild sea beet populations, hybrid populations exhibited larger Shannon–Weiner diversity (H, at 11 of 12 loci, sign test: P = 0.003, e.g., Fig. 2), larger Simpson's D (at 10 of 12 loci, P = 0.02), and larger McIntosh's E values (at 11 of 12 loci, P = 0.003). Moreover, single locus comparisons of diversity using traditional population genetic metrics were less sensitive than community ecology metrics to changes in allele diversity or composition. Relative to wild sea beet populations, hybrid populations exhibited more alleles (at 10 of 12 loci, P = 0.02), but hybrid populations did not differ significantly from wild populations in percent polymorphic loci (decreased at nine of 12 loci, P = 0.073) or number of alleles per polymorphic locus (increased at eight of 12 loci, P = 0.19).
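The sign-test probabilities for the single-locus comparisons can be checked with a short Python sketch. It assumes a one-sided test against a binomial null with success probability 0.5 across 12 loci; the one-sided reading is an inference from the reported values rather than something stated in the text.

```python
from math import comb

def sign_test_one_sided(successes, trials):
    """P(X >= successes) for X ~ Binomial(trials, 0.5)."""
    tail = sum(comb(trials, k) for k in range(successes, trials + 1))
    return tail / 2 ** trials

# Number of loci (out of 12) at which hybrids exceeded wild populations.
for loci in (11, 10, 9, 8):
    print(loci, round(sign_test_one_sided(loci, 12), 3))
# 11 0.003
# 10 0.019
# 9 0.073
# 8 0.194
```

These tail probabilities agree, after rounding, with the P values reported above for 11, 10, nine, and eight of 12 loci.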
The consequences of hybridization for genetic diversity—Across the 12 loci, putative hybrid beet populations possessed only one-quarter of the rare alleles found in wild populations (χ2 = 22.5, df = 1, P < 0.001; Fig. 3). Based on the Jaccard similarity coefficient, hybrid populations were 10.8% (±SE = 0.6%) more similar to chard cultivars than wild populations (one-sample t test: t = 18.66, df = 47, P < 0.001). Furthermore, hybrid populations were 3.7% (±SE = 1.1%) more similar to sugar beet cultivars than wild populations (t = 3.45, df = 47, P = 0.001). Finally, based on paired comparisons, hybrid populations were significantly more similar to chard than sugar beet cultivars (paired t test: t = −6.62, df = 47, P < 0.001).
When transgenes or other novel crop traits escape domestic plants and introgress into wild populations, it is relatively easy to use genetic markers to detect their escape. However, multilocus approaches to detecting hybridization between conventional (nontransgenic) crops and wild beet populations appear to be relatively insensitive, because these organisms share a substantial amount of genetic diversity. In contrast, single-locus comparisons of diversity, especially using species diversity metrics, detected increased genetic diversity in putative hybrid weed beet populations that can be attributed to crop–wild hybridization. Furthermore, hybridization may lead to reduced frequencies of rare alleles in wild beet populations. Finally, as predicted, hybridization led to genetic homogenization of wild beet populations, where hybridizing beet populations seem to be more genetically similar to crop cultivars than wild relatives and specifically more similar to chard or red beet than sugar beet cultivars. This approach, when used prior to the introduction of transgenic crops, can provide a baseline estimate of potentially unwanted allelic introgression from nontransgenic crops into wild relatives and contribute to the risk assessment of the potential for transgenes to introgress, and the evolutionary consequences of gene flow from crops to their wild relatives. Below, we compare our conclusions with those of Bartsch et al. (1999), explore some possible explanations for our conclusions, and discuss the limitations of the proposed methods for detecting the genetic consequences of crop–wild hybridization.
Using multilocus estimates of genetic diversity, Bartsch et al. (1999) detected a small increase in genetic variation in weed beets relative to their wild progenitor (differences in percent polymorphic loci [P], number of alleles [A], and number of alleles per locus were minor), with a dramatic increase in heterozygosity (HN). Similarly, using a multilocus approach here, we failed to detect significant genetic diversity differences between weedy and wild populations, except for significantly larger HN in weed beet populations. Upon closer examination of the dynamics of two alleles, MDH2-1 (a common red beet/chard allele) and ACO1-2 (a common sugar beet allele), Bartsch et al. (1999) could detect gene flow from crop to wild populations where alleles common in crop populations (and relatively rare in wild populations) were also found in relatively high frequencies in weed beet populations. Because sugar beet was the focus of genetic transformation breeding and thus assessment of the risk of transgene escape, it was important to document the frequency of transfer of a nontransgenic allele common to sugar beet, ACO1-2, into weed beet populations. Our repeated, single-locus comparisons between weedy and wild populations revealed more pervasive increases in genetic diversity after introgression at 10 or more loci, especially when using community ecology metrics. With the use of the Jaccard similarity coefficient, we can now provide a more refined description of the relative frequency with which weed beet populations hybridize with sugar beet or Swiss chard parental populations, and we find that weed beet populations show more evidence of chard than sugar beet allele introgression.
Recently, another study (Uwimana et al., 2012) attempting to detect crop allele introgression in wild or weedy populations has noted similar methodological difficulties in detecting hybrids as those encountered by Bartsch et al. (1999), where crop-specific alleles are unavailable. Detection using shared alleles may be most problematic when hybridization events are rare, such as between Lactuca serriola L. and L. sativa L. in Europe (7% hybridization rate), when detected using multilocus simple sequence repeat (SSR) markers and analyzed with Bayesian clustering methods (either STRUCTURE or NewHybrids; Uwimana et al., 2012). These types of studies are important when wild gene pools are being conserved as in, for example, in situ germplasm banks. Introgression by crop alleles would diminish the likelihood that wild populations would contain rare but useful genetic diversity for crop breeding. Furthermore, when a crop-specific allele is known (such as a transgene or a naturally occurring allele), detection relies on the successful introgression of those specific alleles at a frequency high enough to be detected (e.g., Warwick et al., 2008). Although our approach may help solve both of the problems encountered by other studies (i.e., detecting crop introgression when crop-specific alleles are unknown and describing the genetic consequences of hybridization beyond detecting hybridization), new approaches to estimating species diversity, specifically information theoretic approaches measuring the degree of divergence (rather than similarity, as measured by current metrics), may also improve our ability to estimate the genetic and evolutionary consequences of crop–wild hybridization (e.g., Abou-Moustafa, 2014).
This study used a pre-existing, high-quality data set based on protein isozymes, a marker type known to reveal rather low genetic variation compared with more recent molecular tools such as microsatellites or single-nucleotide polymorphisms. As we mentioned above, even studies that have employed more modern molecular tools may have difficulty finding markers that distinguish crop from wild populations (e.g., Uwimana et al., 2012), and thus traditional population genetic metrics may not reveal a signature of hybridization, despite ongoing gene flow. Ecological diversity metrics can be used with any codominant molecular marker and may help to identify patterns of gene flow even when distinctive crop alleles are absent because community diversity metrics can detect shifts in not only the quantity of diversity but also the identity of alleles within populations.
Through this process, we applied multiple statistical tests to a single data set, which inflates the probability of a Type I error (Verhoeven et al., 2005; Waite and Campbell, 2006). False positives may be preferable to Type II errors (i.e., false negatives), especially when performing a risk assessment for a rare plant (García, 2003). In addition, the community diversity measures we used to detect crop–wild hybridization belong to Hill's family of measures (Magurran, 2004) and so, more often than not, are expected to produce the same end results. Our results were indeed consistent across these indices, suggesting that they are robust and not due to an inflation of Type I error, and reducing the suspected error associated with multiplicity.
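The relationship among members of Hill's family can be made concrete with a small Python sketch. It uses the standard definition of Hill numbers, in which order 0 gives allele richness, order 1 gives the exponential of the Shannon index, and order 2 gives the inverse of the Simpson concentration. The allele frequencies are invented for illustration, and the function name is hypothetical.

```python
import math

def hill_number(freqs, q):
    """Hill number of order q for frequencies that sum to 1.

    q = 0 -> richness; q = 1 -> exp(Shannon H) (the limit as q -> 1);
    q = 2 -> 1 / sum(p^2), the inverse Simpson concentration.
    """
    p = [f for f in freqs if f > 0]
    if q == 1:
        return math.exp(-sum(f * math.log(f) for f in p))
    return sum(f ** q for f in p) ** (1.0 / (1.0 - q))

# Hypothetical allele frequencies at one locus.
freqs = [0.5, 0.3, 0.2]
for order in (0, 1, 2):
    print(order, round(hill_number(freqs, order), 3))
```

Because these indices are transformations of the same frequency vector, weighted increasingly toward common alleles as the order rises, they tend to move together, which is one reason agreement among them is not fully independent evidence.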
Loci are not individual units; rather they often have levels of linkage disequilibrium associated with each other. Therefore, treating loci as individual units in a diversity measure, as we do, rather than using a multilocus approach, implicitly violates the structure of genomes. However, when this occurs, each linked locus would provide essentially the same message of introgression or lack thereof. When attempting to detect rare events of introgression or events where shared alleles may be exchanged to alter their frequencies, linkage disequilibrium will either accentuate the event, because large portions of the crop genome will introgress, or disguise the event as even more rare, because large portions of the wild genome will be resistant to introgression. When the event is disguised because of linkage disequilibrium in the wild genome, traditional multilocus approaches fail, as revealed here. This tool is more powerful than traditional population genetic metrics in detecting recent introgression events (yes/no), but without the multilocus approach it will not accurately describe the proportion of the genome that has introgressed crop alleles. In that case, a multilocus approach may be more appropriate.
Risk assessment has, rightly, focused on assessing the likelihood that transgenes inserted in sugar beet populations may escape into wild or weed beet populations. From this assessment, we have documented that crossing with crop beet populations has altered the genetic composition of wild beet populations, making them slightly more similar to crop beets. Interestingly, it seems as if crossing specifically with chard or red beets has driven more homogenizing change than crossing with sugar beets in wild beet populations. There may be several reasons for this. First, the putative hybrid weed beet populations tend to be physically closer to the chard or red beet growing locations than the sugar beet breeding locations. Proximity increases the likelihood of mating between taxa. Second, reduced rates of mating between sugar beet and wild beets may be a consequence of some modern sugar beet breeding practices. For instance, breeders extensively use recessive genetic male sterility factors (e.g., Honma et al., 2014). Maintaining and propagating those agents of sterility in sugar beet breeding populations might reduce the chance of genetic introgression and subsequent homogenization if sugar beet pollen carrying such alleles “meets” wild/weedy mother beets.
In summary, we encourage other workers to use species diversity indices to detect introgression, especially in crop–wild systems where populations share a significant amount of genetic diversity. This approach may provide helpful genetic diversity information to determine whether conservation programs are meeting protection goals associated with genetic integrity for endangered wild species (e.g., Cartagena Protocol on Biosafety to the Convention on Biological Diversity, 29 January 2000; http://bch.cbd.int/protocol/parties/).
 The authors acknowledge the Natural Sciences and Engineering Research Council (NSERC) Discovery Grants Program (no. 402305-2011) and Ryerson University for funding; M. Lehnen, J. Clegg, M. Pohl, I. Schuphan, and N. C. Ellstrand for their original work; and A. Caceido for her constructive criticism of the manuscript.