Of the 13 taxa composing the Mallard complex, 4 occur in North America: the sexually monochromatic American Black Duck (A. rubripes), Mexican Duck (A. [platyrhynchos] diazi), and Mottled Duck (A. fulvigula), and the dichromatic Mallard (Anas platyrhynchos). Although morphologically distinct, inferring the evolutionary relationships of this group is confounded by extensive genic sharing due to incomplete lineage sorting and ongoing hybridization. The objective of this study was to examine the underlying cause (i.e. incomplete lineage sorting vs. contemporary gene flow) of phylogenetic uncertainty. Whereas most taxa were fairly structured at mitochondrial DNA, a “starburst” pattern of divergence consistent with a rapid radiation was recovered with 17 nuclear introns. Furthermore, nuclear-based divergence estimates and tests of population structure recovered Florida and West Gulf Coast Mottled Ducks as well-differentiated and genetically diagnosable from each other and the remaining taxa, whereas Mallards, American Black Ducks, and Mexican Ducks were indistinguishable. In general, neither population structure analyses nor coalescent-based gene flow estimates conclusively identified the presence of hybrids or significant gene flow, suggesting that genetic similarity within the group is largely influenced by incomplete lineage sorting. However, we also cannot reject potentially high levels of gene flow. Moreover, inconsistent relationships among species trees indicated that phylogenetic results were sensitive to which individuals were included. Taxa within the New World group are phenotypically distinguishable, yet genetically similar and seemingly lack the apparent reproductive isolation that is consistent with early stages of (incomplete) speciation. Future work should focus on genomic regions under selection to better understand the stage of speciation among the various incipient forms.
Selection can cause rapid phenotypic divergence in the absence of genomic differentiation between populations and species (Orr and Smith 1998, Funk and Omland 2003). Consequently, with the exception of key trait-specific genes, genomic divergence between incipient morphs—particularly for neutral markers—is affected by the time since isolation, population size, and the rate of introgressive hybridization (Grant and Grant 1997a). Such genetic similarities, stochastic lineage sorting, and differing gene histories (i.e. selected vs. neutral) can result in inconclusive and/or discordant phylogenetic relationships among different morphological and genetic markers (Omland 1997a, Carstens and Knowles 2007, Zink and Barrowclough 2008, Humphries and Winker 2011). Reconstructing phylogenetic history, however, can be achieved by maximizing the number of samples and genomic coverage and using Bayesian methods that incorporate genealogical differences across markers (Drummond and Rambaut 2007, Liu 2008, Kubatko et al. 2009). Establishing phylogenetic relationships can help us better understand the cause of phenotypic and genetic discordance, particularly when such discord can lead to incorrect taxonomic designations (Cicero and Koo 2012).
Phenotypic–genetic discords are typically associated with rapid and/or recent adaptive radiations where morphological traits are under strong selection (e.g., trait-based assortative mating and/or niche-based selectivity; Greenberg et al. 1998), whereas the remaining genome is largely influenced by neutral processes (Humphries and Winker 2011). For instance, whereas adaptive traits can cause rapid diversification in avian lineages through premating isolation (Mayr and Ashlock 1991, Grant and Grant 1997a, Price 2008), retained genetic similarities are often attributed to strong dispersal ability (Greenwood 1980), chromosomal stasis (Ellegren 2010), and relatively low levels of reinforcement (Grant and Grant 1997b). Regardless, given suitable genomic coverage, coalescent-based approaches appear capable of resolving such complex histories (Chung and Ané 2011, Leaché and Rannala 2011, Lanier and Knowles 2012).
Phylogenetic relationships within the Mallard complex (Anas platyrhynchos and allies) have proven difficult to resolve, owing to a recent radiation, widespread interspecific hybridization, and substantial phenotypic–mitochondrial–nuclear discordance (Livezey 1991, Johnson and Sorenson 1999, Lavretsky et al. 2014). Of the 14 taxa, the most confounded relationships are those within the New World (NW) group comprising the sexually dichromatic Mallard and the monochromatic American Black Duck (A. rubripes; “Black Duck”), Mottled Duck (A. fulvigula), and Mexican Duck (A. [platyrhynchos] diazi). Mitochondrial (mt) DNA haplotypes are polyphyletic among these taxa, suggesting a recent radiation (Avise et al. 1990, Johnson and Sorenson 1999, McCracken et al. 2001, Lavretsky et al. 2014), and ongoing hybridization between Mallards and each of the monochromatic species complicates phylogenetic inferences (Heusmann 1974, Hubbard 1977, Avise et al. 1990, Dwyer and Baldassarre 1993, Merendino et al. 1993, McCracken et al. 2001, Pérez-Arteaga et al. 2002, Pérez-Arteaga and Gaston 2004, Williams et al. 2005a). Lavretsky et al. (2014), for example, demonstrated that the posterior support for the NW monochromatic taxa doubled when Mallards were excluded, suggesting a confounding effect of contemporary introgression. In the absence of fixed nucleotide differences in mtDNA and nuclear (nu) DNA, allelic frequency differences are paramount to phylogenetic reconstructions. Although coalescent methods account for incomplete lineage sorting, contemporary hybridization can bias tree reconstructions (McDade 1990, 1997, Heled et al. 2013). Consequently, resolution depends on sampling breadth of individuals and loci, and specifically on the number of individuals with mixed ancestries included in the analysis (e.g., the number of F2, F3, etc. hybrid individuals present in datasets). Being phenotypically diagnosable (Palmer 1976, Livezey 1991) but genetically similar (Lavretsky et al. 2014), the NW group is an excellent system for studying phenotypic–genetic discordance that is typically associated with recent radiations (Freeland and Boag 1999, Degnan and Rosenberg 2009, Campagna et al. 2012).
The monochromatic Black Duck, Mottled Duck, and Mexican Duck are endemic to North America (Johnsgard 1978). The Black Duck is distributed east of the Mississippi River and has migratory cycles typical of other North American waterfowl, whereas Mottled Ducks and Mexican Ducks have more restricted distributions and are sedentary. Mottled Ducks are endemic to two disjunct regions, with the first extending along the Texas–Louisiana coastline (West Gulf Coast [WGC]) and the second in Florida (FL) (Stutzenbaker 1988); these allopatric populations are genetically differentiated (McCracken et al. 2001, Williams et al. 2005a). Mexican Duck distributions extend throughout central Mexico and into parts of southern Nevada, New Mexico, and Texas (Hubbard 1977, Pérez-Arteaga et al. 2002, Peters et al. 2014b). In contrast, the dichromatic Mallard has a Holarctic distribution that extends across North America, Europe, and Asia, with strong mitochondrial differences between Eurasia and North America (Avise et al. 1990, Kulikova et al. 2005), but little to no nuclear differentiation among populations across this range (Kraus et al. 2013). Once found primarily west of the Mississippi River, environmental degradation (Livezey 1991, Green 1996, Johnson and Sorenson 1999, Mank et al. 2004) and release programs (Heusmann 1974, Soutiere 1986, Hepp et al. 1988) caused an expansion of the Mallard's range across North America leading to increased interspecific competition and hybridization with the monochromatic endemics. Growing interactions with Mallards have negatively influenced Black Duck populations since the 1950s (Ankney et al. 1987, Avise et al. 1990, Dwyer and Baldassarre 1993, Merendino et al. 1993, Rhymer 2006), leading to concerns over the possibility of extinction by introgressive hybridization (Rhymer and Simberloff 1996). Moreover, the taxonomy of both Mexican Ducks (Hubbard 1977, Pérez-Arteaga et al. 2002) and Mottled Ducks (Bielefeld et al. 2010) have gone through various revisions, and continue to be debated. Given the impact of taxonomic decisions on conservation (Stutzenbaker 1988, Chesser et al. 2011), information on evolutionary relationships and population structure, including estimates of gene flow and molecular differentiation, are required.
Given the weak support for phylogenetic relationships within the NW Mallard group (Lavretsky et al. 2014), the objective of this study was to examine the underlying cause (i.e. incomplete lineage sorting vs. gene flow) of phylogenetic uncertainty. We do this using a five-fold larger sample size, and (1) compare genetic differentiation among taxa in mtDNA and 17 nuclear loci, (2) estimate rates of gene flow and time since divergence between the dichromatic Mallard and each of the monochromatic species, and (3) infer phylogenetic relationships while examining the influence of stochastic sampling (random subsampling of individuals) on species tree reconstructions. In this study, we treat incomplete lineage sorting as the null hypothesis. Alternatively, if gene flow is playing a dominant role, then we expect some individuals to be assignable to taxon-specific populations whereas others will appear to have admixed genomes (i.e. hybrids) and evidence of non-zero gene flow. Finally, contemporary genetic similarities may be the result of recent historical introgression, and even perhaps repeated events due to glacial cycles (Waltari et al. 2007). However, we acknowledge that distinguishing such a scenario from incomplete lineage sorting may not be possible with the current molecular dataset.
We added 98 individuals (19 or 20 individuals per taxon) to the sample of Lavretsky et al. (2014) for a total of 123 individuals comprising 5 recognized species or populations (Table 1; taxonomic designations based on AOU). In general, sampling spanned each taxon's range (Figure 1); however, Black Duck samples were restricted to the most northeastern part of their range where Mallards are sparse, and therefore, this population may be less influenced by recent introgression (Figure 1). Moreover, Black Ducks were collected from the USFWS waterfowl parts collection survey, and therefore likely include individuals migrating from more northern breeding locations where Mallards are absent or rare (Johnsgard 1978). Additionally, samples for Mottled Duck populations were obtained at the Hunter Parts survey, whereas Mexican Ducks were salvaged from hunters in Mexico. In order to limit the influence of hybrids on analyses, individuals were chosen based on established “pure” wing characteristics (Carney 1992); however, we acknowledge that plumage characteristics are ineffective past the F2 stage as hybrids become phenotypically indistinguishable from parental types after multiple backcrosses (Avise et al. 1990, Kirby et al. 2000). Finally, Mallard sequences were obtained from Peters et al. (2014a, 2014b).
Sample sizes of each operational taxonomic unit used in this study.
Genomic DNA was isolated from each sample using a Qiagen DNeasy blood and tissue kit (Qiagen, Valencia, California, USA) following manufacturer's protocol. Previously optimized primers were used to amplify and sequence 17 nuclear introns (Supplemental Material Table S1 (131-04-08-s01.doc); Peters et al. 2012) and 640 bp of the mtDNA control region (Sorenson and Fleischer 1996, Sorenson et al. 1999). PCR and DNA sequencing protocols are described in detail in Lavretsky et al. (2014). Final products were sent to the DNA Analysis Facility at Yale University for automated sequencing on an ABI 3730 (Applied Biosystems, Life Technologies, Carlsbad, California, USA). Sequences were aligned and edited using Sequencher v. 4.8 (Gene Codes, Ann Arbor, Michigan, USA). Sequences were archived in Genbank (see Supplemental Material Table S2 (131-04-08-s02.xlsx)).
Gametic phases of nuclear alleles were either algorithmically determined with the program PHASE v. 2.1.1 (Stephens and Donnelly 2003) or by applying methods described in Peters et al. (2007) for heterozygous sequences containing indels; in the latter case, we compared the ambiguous 3′-end with the unambiguous 5′-end of forward and reverse sequences to resolve the composition and placement of gaps and the linkage of polymorphisms to those gaps. Sequences resolved with the latter method were included as known alleles in PHASE. Additionally, Mallard sequences were all resolved with >95% confidence from a larger dataset that included extensive allele-specific priming (Peters et al. 2014b) and were also treated as known alleles in PHASE runs. PHASE was run for 1,000 iterations after a burn-in of 1,000 steps and a thinning interval of 100. Of the 2,091 sequences (123 individuals × 17 loci), the gametic phases for 1,857 sequences (89%) were resolved with greater than 90% posterior probability. Therefore, we chose the phase reconstructions that received the highest posterior probabilities for each individual per locus for further analyses.
Relationships among Individuals
A mtDNA haplotype network was constructed using the median-joining algorithm in the program Network v. 220.127.116.11 (Bandelt et al. 1999). In addition, unphased nuclear data were concatenated for a total of 5,659 aligned base pairs and a consensus nuclear network was calculated using NeighborNet with equal angle parameters and averaging ambiguous states as implemented in SplitsTree4 (Huson and Bryant 2006). Finally, pairwise Φst estimates for each locus were calculated in Arelquin v. 3.5 (Excoffier and Lischer 2010). We tested for a correlation between Φst values estimated from mtDNA and nuDNA using a Mantel test in the program ZT (Bonnet and Van de Peer 2002).
For each nuclear locus, alleles were coded as 1 to n, where n is the total number of alleles observed for a given locus, and entered into STRUCTURE v. 2.3.4 (Pritchard et al. 2000), which uses Bayesian clustering methods to determine the number of genetic populations and to assign individuals to those populations. We tested K = 1–10 populations using 10 replicates of each value of K and 500,000 Markov chain Monte Carlo (MCMC) steps following a burn-in of 100,000 steps. The optimum K was determined by calculating ΔK in the program STRUCTURE HARVESTER (Earl and VonHoldt 2012). Final STRUCTURE outputs were based on the optimal clustering alignment across all 10 replicates for each optimum K using a FullSearch algorithm as implemented in the program CLUMPP (Jakobsson and Rosenberg 2007). The nuclear data were insufficient for assigning individuals to populations in STRUCTURE employing an Admixture model, which likely resulted from extensive allelic sharing among these taxa. Therefore, we used the No Admixture model and independence among allele frequencies to test for subtle population structuring that may be present (Pritchard et al. 2000). We also partitioned the data into the two major subgroups detected by STRUCTURE (see Results) to test for finer structure that might be masked when analyzing the full data set.
Estimates of Gene Flow and Divergence Time
Rates of gene flow and time since divergence were estimated from the combined mtDNA and nuDNA datasets using isolation with migration (IM) models (Hey and Nielsen 2004, 2007). IM assigns posterior probability density estimates for population sizes, divergence time, and migration rates from non-recombinant sequence fragments using Bayesian Markov chain Monte Carlo (MCMC) algorithms (Nielsen and Wakeley 2001). To meet the assumption of no intralocus recombination, all nuDNA were filtered for recombination using the program IMgc (Woerner et al. 2007) with weight given to maximize fragment length while maintaining the largest proportion of each population per dataset. IM analyses were run for a minimum of 10,000,000 generations following a burn-in of 1,000,000 generations; effective sample sizes (EES) were >50 for all parameters (Hey and Nielsen 2004, 2007).
Time since divergence from Mallards was simultaneously estimated with gene flow rates (i.e. IM; Hey and Nielsen 2004, 2007). Years since divergence (T) was derived as T = t/μ, where t is the time since divergence parameter scaled to the geometric mean of per-locus mutation rates (μ) estimated in IM. We used an average nuclear mutation rate of 1.2 × 10−9 substitutions per site per year (Peters et al. 2008) and an average mitochondrial mutation rate of 4.8 × 10−8 substitutions per site per year (Peters et al. 2005). Multiplying these rates by the per-locus fragment lengths (Supplemental Material Table S1 (131-04-08-s01.doc)) resulted in a geometric mean of 3.2 × 10−6 substitutions per locus per year.
Species Tree Reconstructions
Program *BEAST v. 1.7.4 (Drummond et al. 2012), which uses MCMC to estimate the posterior distribution of the species tree given the results from each gene tree (Heled and Drummond 2012), was used to reconstruct multi-species trees (Coalescent Yule process) using the nuclear data. Given that ignoring recombination provided stronger support for phylogenetic relationships but did not appear to bias topologies within the Mallard complex (Lavretsky et al. 2014; see also Lanier and Knowles 2012), full sequences were used in all phylogenetic analyses. *BEAST ran slowly and failed to converge when analyzing the full nuclear data set of 123 individuals (∼246 alleles per locus) sequenced for 17 loci. Therefore, to effectively run *BEAST, a total of 10 individuals per taxa were randomly chosen without replacement for 2 separate analyses; this was repeated 5 times for a total of 10 species trees. By doing so, we were able to examine the sensitivity of phylogenetic reconstructions to stochastic sampling, as similar and well-supported relationships between replicates would strengthen conclusions. Each locus was tested for the most appropriate substitution and clock models. Base-pair substitution models and rate parameters (i.e. gamma distribution, invariable sites) were tested in MEGA v. 5.1 (Tamura et al. 2011) and ranked based on Bayesian Information Criterion (BIC). Molecular clocks were tested for each locus by reconstructing gene trees in *BEAST v.1.7.1 with a strict clock (null model) or a Bayesian uncorrelated log-normal relaxed clock (alternative model). Bayes Factors (BF) calculated in Tracer v. 1.5 (Rambaut and Drummond 2009) were used to distinguish between models (i.e. a log BF < 3 or BF > −3 provided support for the null hypothesis of a strict clock; Li and Drummond 2012). Species trees were then reconstructed with appropriate substitution and molecular clock models (Supplemental Material Table S3 (131-04-08-s03.doc)). A piecewise linear and constant root population size model with UPGMA starting trees (Sneath and Sokal 1973) was used for each analysis, which consisted of 500,000,000 MCMC iterations with sampling every 5,000 steps for a total of 100,000 trees and a burn-in of 10%. All runs were analyzed in Tracer v. 1.5 (Rambaut and Drummond 2009) to confirm that effective sample sizes (ESS) were ≥100 for all parameters (Rambaut and Drummond 2009). A “consensus” species tree was reconstructed by summarizing the entire posterior tree set derived from all 10 species trees. Finally, posterior tree sets were visualized with the DensiTree program (Bouckaert 2010), and subsequently superimposed over their respective species tree.
Genetic Differentiation and Population Structure
Two lineages corresponding to previously described A and B haplogroups (Avise et al. 1990, Johnson and Sorenson 1999, Kulikova et al. 2005) were observed in the mtDNA haplotype network (Figure 2A). Seven Mallards, three American Black Ducks, one Mexican Duck, and one WGC Mottled Duck had group A haplotypes, whereas all others had B haplotypes. Two notable B-group haplotypes included one consisting of 12 Mexican Ducks, 3 WGC Mottled Ducks, and 2 Mallards, and another with 72% of all FL Mottled Ducks (also see McCracken et al. 2001). All other haplotypes were polyphyletic within the B haplogroup, and some haplotypes were shared between Mallards and Black Ducks and between WGC Mottled Ducks and Mexican Ducks; the FL Mottled Duck was the only taxon that did not share haplotypes with any other taxon. ΦST values corresponded to network patterns; significant ΦST values were observed for all pairwise comparisons except between Mallards and Black Ducks (ΦST = 0.023; Figure 3 and Supplemental Material Table S4 (131-04-08-s04.doc)). Mexican Ducks and WGC Mottled Ducks were similarly differentiated from each other and from the remaining taxa (ΦST = 0.07–0.14), whereas Florida Mottled Ducks were the most differentiated overall (ΦST ≥ 0.31).
Similar mtDNA-like structuring was not observed in nuDNA. Specifically, the NeighborNet appeared “star-like,” demonstrating that taxa were broadly polyphyletic and indicating that many polymorphisms were shared among taxa (Figure 2B). However, the 2 Mottled Duck populations tended to cluster together, suggesting some differentiation in allelic frequencies. These interpretations were further supported by pairwise ΦST values that indicated extensive genomic sharing and similar allele frequencies across taxa (Figure 3 and Supplemental Material Table S4 (131-04-08-s04.doc)). Only 1–2% of the variation was explained by differences among Mallards, Black Ducks, and Mexican Ducks; however, 2.5–6.5% of the total genetic variation was explained by differences between FL and WGC Mottled Ducks (mean ΦST = 0.042) and between each Mottled Duck population and the other 3 species (mean ΦST = 0.024–0.064). On average, ΦST values for mtDNA were about 5 times larger than values for nuDNA, but mtDNA and nuDNA differentiation was significantly correlated among the 10 pairwise comparisons (Mantel test, r = 0.842, P = 0.017). Program STRUCTURE analyses corroborated ΦST estimates. First, the best-supported number of populations was K = 2 when analyzing all 5 populations together. Under this model, 19 of the 24 Black Ducks, 24 of the 25 Mallards, and all Mexican Ducks were assigned to population one, whereas all Mottled Ducks were assigned to population two (Figure 4A). Sub-clade analyses did not provide additional resolution among the Mallards, Black Ducks, and Mexican Ducks; although K = 2 was the best-supported model, only a single Black Duck was assigned to the second population (Figure 4B). However, sub-clade analyses revealed that most FL and WGC Mottled Ducks were assigned to separate populations, although 5 WGC Mottled Ducks clustered with FL Mottled Ducks (K = 2 was the best-supported model; Figure 4C).
Gene Flow and Divergence Estimates
Migration estimates suggested nearly equal bidirectional gene flow between Mallards and each of the monochromatic taxa, and although consistent with low to moderate levels of gene flow, the estimates were also consistent with no gene flow (Figure 5). Specifically, the lowest bin (∼0 gene flow) was contained within the 95% highest posterior distributions for all estimates of gene flow rates. The posterior distributions for gene flow were flat between Black Ducks and Mallards, and from Mallards into Mexican Ducks; thus for these species, the data are consistent with both no gene flow and high rates of gene flow (Figure 5).
Time since divergence suggested that FL Mottled Ducks have been diverging from mallards for the longest time (390,000 years; 95% CI = 230,000–600,000 years), followed by Mexican Ducks (325,000 years; 95% CI = 190,000–600,000 years), WGC Mottled Ducks (245,000 years; 95% CI = 150,000–600,000 years), and Black Ducks (180,000 years; 95% CI = 100,000–400,000 years) (Figure 6). While these divergence times appeared to be staggard, the confidence intervals were broadly overlapping among all pairwise comparisons.
Phylogenetic analyses using the multispecies coalescent most frequently supported the 2 Mottled Duck populations as sister groups (8 of 10 trees) and grouped Mallards, Black Ducks, and Mexican Ducks as a monophyletic group (7 of 10 trees; Figure 7). These 2 groups were also most frequently supported when examining the entire posterior set of trees across runs and the resulting consensus tree. However, the inferred sister relationships varied considerably among the individual species trees. For example, Mallards were recovered as being sister to Black Ducks in 7 trees and sister to Mexican Ducks in 3 trees, and each of these relationships received high posterior support in at least one analysis. Likewise, among the separate analyses, both phylogenetic placements of the WGC Mottled Duck as sister to the FL Mottled Duck or as part of the Mallard–Black–Mexican group received strong posterior support, and one tree had high posterior support for the earliest split being between the Mexican Duck. Integrating results from all 10 trees into a consensus tree, all phylogenetic relationships received low posterior support, suggesting that tree topologies were sensitive to which samples were included in the analysis.
Whereas the majority of pairwise comparisons among species were significantly structured at mtDNA, the group was weakly differentiated across nuclear markers (Figure 3 and Supplemental Material Table S4 (131-04-08-s04.doc)). Differences in sorting rates are likely sufficient to explain much of the variance between these marker types; ΦST values for mtDNA were 5 times larger than, but significantly correlated with, values for nuDNA, which is consistent with expectations based on mtDNA having one-fourth the effective population size of nuDNA (Zink and Barrowclough 2008). This weak differentiation is likely due to a recent and rapid radiation, coupled with gene flow between the Mallard and each of the monochromatic species, which hinders our ability to confidently reconstruct phylogenetic relationships.
Although there were few frequency differences (Figure 3) within the nuDNA dataset, subtle population structure was recovered. Specifically, STRUCTURE results, ΦST values, and coalescent trees all supported the 2 Mottled Duck populations as being most differentiated from the other taxa (Figure 4A) and from each other (Figures 7 and 4C; Supplemental Material Table S4 (131-04-08-s04.doc)). Significant differentiation between these populations is also corroborated by mtDNA, allozymes, and microsatellites (McCracken et al. 2001, Williams et al. 2005b). Elevated levels of differentiation in the Mottled Duck populations as compared with the other taxa might be attributable to their relatively smaller population sizes and sedentary behavior (Stutzenbaker 1988, Ballard et al. 2001, Bielefeld et al. 2010). In addition, the distributions of Mottled Ducks coincide with possible glacial refugia (Waltari et al. 2007), which is consistent with these populations diverging in allopatry since the last glaciation. Such demographic and temporal attributes would result in higher molecular sorting rates in these populations as compared with those with larger population sizes (i.e. Black Ducks or Mallards; Kimura and Ohta 1978) and suggests that neutral genetic drift might explain the population divergence. Interestingly, however, if demographic pressures are the primary cause of marker sorting, then why does the Mexican Duck (also sedentary with a small population size) not show similar trends?
Phylogenetic relationships among Mallards, Black Ducks, and Mexican Ducks remain inconclusive despite examining 18 independent loci. However, whereas Mallards and Black Ducks were not significantly structured at either mtDNA or nuDNA, Mallards and Mexican Ducks were significantly differentiated in mtDNA (Figures 2A, 2B, and 3; Supplemental Material Table S4 (131-04-08-s04.doc)). One possible explanation for the apparent mito–nuclear discordance is that the sorting rate of nuDNA is too slow to track their recent divergence (McCracken and Sorenson 2005, Zink and Barrowclough 2008), which is consistent with the observed correlation and five-fold difference between mtDNA and nuDNA ΦST values. Alternatively, the discord could be a result of a hybridization bias where male Mallards pair with female Mexican Ducks and hybrids backcross into the Mexican Duck population. Although Mallard abundance has steadily declined by ∼4% per year in Mexico (Pérez-Arteaga and Gaston 2004), past hybridization might have introduced Mallard alleles into the population (Scott and Reynolds 1984). Furthermore, the greatest opportunities for contemporary hybridization likely occur in the southwestern part of the U.S. where Mexican Duck populations continue to regularly interact with Mallards, and introgressed alleles have the potential to percolate into southern Mexican Duck populations. Indeed, gene flow estimates were consistent with high levels of gene flow from Mallards into Mexican Ducks (Figure 5), although posterior distributions also included zero gene flow (i.e. complete isolation). More comprehensive sampling of Mexican Ducks across their range is needed to better test hypotheses regarding the nuclear similarity between these species.
Stochastic Sampling and Hybridization
Inconsistent phylogenetic reconstructions based on 17 nuclear loci for Mallards, Black Ducks, Mexican Ducks, and the 2 populations of Mottled Ducks demonstrate the difficulties in resolving evolutionary relationships of recently radiated and currently hybridizing taxa. Despite a substantial increase in sample sizes relative to Lavretsky et al. (2014), relationships remained inconsistent across replicated species trees. The most common species tree was concordant with ΦST estimates and STRUCTURE results, supporting 2 primary lineages: one lineage consisting of Mallards, Black Ducks, and Mexican Ducks, and one consisting of FL and WGC Mottled Ducks (Consensus Tree; Figure 7). However, the regular occurrence of various other relationships demonstrates that reconstructing these phylogenetic relationships was sensitive to stochastic sampling (Figure 7). Although we suspect that the inconsistencies among trees partially resulted from the inclusion of introgressed alleles, IM analyses were unable to conclusively demonstrate gene flow between Mallards and each of the monochromatic species (Figure 5). Furthermore, the posterior distributions of times since divergence were broadly overlapping among all pairwise comparisons when using models that incorporated gene flow (IM; Figure 6), emphasizing the difficulties in reconstructing the history of divergence and phylogenetic relationships within this group. Comparing the results of the isolation–migration models with those from the multispecies coalescent suggests that incomplete lineage sorting due to a rapid radiation might be contributing to phylogenetic uncertainties more so than hybridization. However, using a six-fold larger sample size (but one-third the number of loci), Peters et al. (2014a) found significant evidence of gene flow from Mallards into WGC Mottled Ducks, suggesting that gene flow could be playing a role in the inconsistent placement of WGC Mottled Ducks among phylogenetic trees. Regardless of the cause of inconsistencies among replicated species trees, the strong posterior support observed in some replicates provides a false confidence for relationships within this group. Interpreting the well-supported trees as evolutionarily likely or correct could have significant implications if applied to taxonomy, conservation, etc. (DeSalle et al. 2005, Oyler-McCance et al. 2010).
Future work will benefit from distinguishing between the effects of incomplete lineage sorting and hybridization within datasets. Although increasing sample sizes might offer higher resolution, knowledge about the frequency and geography of ongoing hybridization can further minimize the influence of contemporary introgression by excluding individuals from such areas a priori. For example, increased geographic sampling across the Mexican Duck's range with subsequent genomic assays and comparisons between Mexican Ducks and Mallards could establish parental genotypes and help identify individuals with a hybrid ancestry. This would allow a direct assessment of the influence of hybridization on species tree reconstructions for this group.
Dichromatism is presumed to be under sexual selection in populations where species recognition and the partner's quality must be accurately assessed amidst other species and in short time periods (Johnsgard 1968). However, once selection is relaxed dichromatism can quickly be lost (Wiens 2001), as in numerous island taxa (Webster 1980). Such a scenario has been suggested for Black Ducks; Heusmann (1974) hypothesized that selection favored darker plumage that would be less conspicuous among the dark timber of northeastern North America. Moreover, although “pure” Mexican Ducks are distinguishable from Mallards, their monochromatic plumage is similar to female Mallards (Huey 1961, Hubbard 1977) and likely the ancestral state of the entire Mallard clade (Johnsgard 1961, Omland 1997b, Johnson 1999). Alternatively, while the presence of “vestigial” Mallard characters that have been described in Black Ducks and Mexican Ducks were considered to be due to recent hybridization (Hubbard 1977, Livezey 1991), these may also be remnants of a recent dichromatic ancestor within the NW taxa (Omland 1997b).
Nuclear data revealed that Mallards, Black Ducks, and Mexican Ducks are 3 morphologically differentiated populations that are genetically indistinguishable (Ankney et al 1986, Hepp et al. 1988), much like the sexually dichromatic Chestnut Teal (Anas castanea) and monochromatic Grey Teal (A. gracilis) in Australia (Dhami et al. 2013). The plumage–genetic discrepancy can be explained by either (1) neutral alleles moving freely between populations coupled with selection inhibiting or preventing alleles at other loci from introgressing, or (2) recent divergence among taxa with rapid phenotypic divergence that is not tracked by neutral variation (Winker 2009). Under the first scenario, neutral markers might provide false signals of divergence due to hybridization swamping the evolutionary signal (Palmer 1976, Johnson et al. 1999, McCracken et al. 2001, Kulikova et al. 2004), whereas under the second scenario the time since divergence has been insufficient for drift to have had a major influence on neutral allele frequencies (Avise et al. 1990, Omland 1997b). Furthermore, this group might be best represented by nearly simultaneous divergence and a hard polytomy (Hoelzer and Melnick 1994), rather than a simple bifurcating tree, as has been suggested for other groups of ducks that have undergone a rapid radiation (Bulgarella et al. 2010). The identification of diagnostic markers that might be under selection will be instrumental in understanding the evolutionary histories of these taxa.
Considering Marker Variance in Taxonomy
Species recognition in avian lineages has been the subject of extensive debate due to the high variance of pre- and post-zygotic isolating mechanisms among genera (Grant and Grant 1992, 1997a). Without observable isolating mechanisms, taxonomic status is often based on morphometric data, niche partitioning, genetic relatedness among individuals, and the phylogenetic species concept (Mayr 1963, 1982). Among the NW taxa, extensive genic and phenotypic sharing has led to several taxonomic revisions; currently, 3 of the NW groups are considered species, one pair of subspecies (Mallard & Mexican Duck), and 2 subpopulations (FL & WGC Mottled Ducks; Table 1). However, our results largely disagree with these designations. In particular, the 2 Mottled Duck subpopulations are nearly as divergent from each other as they are from the other taxa, and they might constitute different taxonomic units (e.g., subspecies; Callaghan 2005, Bielefeld et al. 2010). In contrast, Mallard, Black Duck, and Mexican Duck genetic relationships are shallow despite strong morphological differences. The discordance between morphological and genetic traits is suggestive of an adaptive radiation (Freeland and Boag 1999, Degnan and Rosenberg 2009, Campagna et al. 2012) where selective or intrinsic factors influence morphological traits while the remaining genome is largely unaffected (Palmer 1976, Humphries and Winker 2011). A recent radiation is also supported by the “starburst” nuclear tree (Aleixandre et al. 2013; Figure 2B) and the overlapping estimates of time since divergence from Mallards (Figure 6).
In such instances of a rapid radiation accompanied by phenotypic–genetic discordance, a few genes might be responsible for maintaining species integrity (specifically, maintenance of those characters that lead us to recognize different species or subspecies), whereas shared polymorphisms are retained throughout the majority of the genome and/or can freely introgress between species. Under such a scenario, each taxon examined in this study could be considered a different species under the genic species concept (Wu 2001). Alternatively, numerous species develop reproductive barriers only after secondary contact when genetic incompatibilities are built up and lead to species barrier reinforcement (Short 1969, Grant and Grant 1992). Although these species might be genetically cryptic (Grant and Grant 1997a), until speciation genes are uncovered the weak or nonexistent genetic differentiation suggests that the NW taxa may be incipient morphs. In general, selection on genomic regions responsible for species integrity needs to be stronger than gene flow rates in order to resist amalgamation (Slatkin 1987, Charlesworth et al. 1997, Wu 2001). Higher genomic coverage is necessary (i.e. through next-generation sequencing) to successfully uncover and test for the presence/absence of speciation genes, and resolving the evolutionary relationships of the NW Mallards may require thousands of loci (e.g., African rift-lake cichlids; Keller et al. 2012). Nevertheless, speciation is a dynamic process and studies of recently radiated taxa will need to consider the adaptive advantages of populations that are at present unexplained by molecular divergence, but yet maintain species integrity (Price et al. 2003).
We are grateful to Ken Richkus with U.S. Fish and Wildlife Service and the North American Flyway wingbees, Eduardo Carrera and Ducks Unlimited-Mexico, Ruben Del Castillo, Todd Scott and Wingshooters Lodge-Mexico, and Kevin G. McCracken with the Institute of Arctic Biology & Department of Biology and Wildlife at the University of Alaska Fairbanks for the contributions of samples to this study. This research was funded by Ducks Unlimited Richard H. G. Bonnycastle Fellowship in Wetland and Waterfowl Biology, American Museum of Natural History Chapman Grant, the National Science Foundation (DEB-0926162), Ohio Waterfowl Association Graduate fellowship, the UC-Davis Museum of Wildlife and Fish Biology, and a Research Initiation Grant from the Research Council at Wright State University. We also thank A. Engilis Jr. at the UC-Davis Museum of Wildlife and Fish Biology and the anonymous reviewers for comments on previous drafts.