A recent study by Zink et al. (2013) raises questions about how to interpret negative results in studies when the distinctness of a species of conservation concern is in question. Zink et al. found no evidence for genetic or ecological distinctness of the coastal California Gnatcatcher (Polioptila californica californica). We discuss why the genetic markers they chose were not well suited to the question of distinctness and how they overinterpreted negative results in their genetic and ecological analyses. We reanalyze their genetic data and find evidence that several genetic loci show significant differentiation in the coastal California Gnatcatchers. We provide recommendations for best practices in determining distinctness in phenotype, genetics, and ecology for California Gnatcatchers and other populations of conservation concern.
If land developers and conservationists are to work together, they require objective scientific evidence upon which to base their actions. In situations involving threatened or endangered species, the question of whether a population is distinct in phenotype, genetics, or habitat is often key, and all parties must determine how studies that report negative results should be interpreted. What conclusions should be drawn when scientists fail to find evidence for distinctness? Not finding something does not mean it is not there. And the likelihood of an absence being a true absence increases with search effort. These ideas form the basis of statistical hypothesis testing, a core underlying principle of the scientific process.
A recent study by Zink et al. (2013) raises questions about how to interpret negative results and what constitutes reasonable search effort in a case with high conservation stakes. Zink and colleagues found no evidence for genetic or ecological distinctness of populations of the California Gnatcatcher in the coastal sage scrub of southern California (Polioptila californica californica or coastal California Gnatcatcher). This finding, combined with similar results from a previous genetic study (Zink et al. 2000), contrasted with a century of prior work documenting the occurrence of a distinct population of gnatcatchers in southern California based on evidence of physical differences (summarized in Mellink and Rea 1994). On the basis of the negative results in Zink et al. (2013), land developers petitioned the U.S. Department of the Interior and U.S. Fish and Wildlife Service (USFWS) to remove the California Gnatcatcher from listing under the U.S. Endangered Species Act (ESA; Thornton and Schiff 2014), which could potentially open 197,000 acres of currently protected habitat to human development (Sahagun 2014).
Before the current debate, we had no involvement in the issue of the coastal California Gnatcatcher. This Commentary reflects our critique of the basic science in Zink et al. (2013) and its application to taxonomy and conservation. Specifically, we discuss why their choice of genetic markers was not well suited to the question of distinctness. We also address how negative results were overinterpreted in their genetic and ecological analyses. We reanalyze their genetic data and find evidence that several genetic loci actually show significant differentiation in the federally listed coastal California Gnatcatchers. Finally, we provide recommendations for best practices in determining distinctness in phenotype, genetics, and ecology in coastal California Gnatcatchers and in other cases with important taxonomic and conservation implications.
Marker Choice and Search Effort
The California Gnatcatcher was first recognized as a distinct species based on differences from other gnatcatchers in song and morphology (Atwood 1988, Monroe et al. 1989). Phylogenetic data supported this decision (Zink and Blackwell 1998), placing the California Gnatcatcher as the sister species of the Black-tailed Gnatcatcher (P. melanura), with which it occurs sympatrically. Despite the relatively recent decision to elevate the California Gnatcatcher to species level, it is important to recognize that subspecies variation within the group, including the occurrence of a distinct form in southern California, had already been well described over the past century (summarized in Mellink and Rea 1994).
In the first genetic study of these subspecies, Zink et al. (2000) showed that coastal California Gnatcatchers did not possess a unique set of mitochondrial DNA (mtDNA) haplotypes compared to southern populations in Baja California and, therefore, did not meet the criteria set for “reciprocal monophyly.” When each of two populations has a unique set of alleles, and alleles from each set share a more recent common ancestor with one another than with alleles from the other set, biologists term this scenario “reciprocal monophyly” (Kizirian and Donnelly 2004). We will discuss below why reciprocal monophyly is an overconservative threshold for assessing distinctness, but for now we simply note Zink et al.'s (2013) reasoning in selecting markers for their study.
The rationale behind looking at more than a single marker is that mtDNA, being a single inherited unit, can be affected by pressures like natural selection, or even by chance events, and therefore sometimes does not accurately reflect the population's history (Edwards et al. 2005, Edwards and Bensch 2009). To address this potential problem, it is advisable to supplement mtDNA with markers from the nuclear genome. Collecting nuclear data was, in fact, a major recommendation of the USFWS (2011) in response to a previous petition to delist the coastal California Gnatcatchers, which followed the publication of Zink et al. (2000).
For their markers, Zink et al. (2013) looked at 7 nuclear DNA introns, 1 nuclear exon, and 2 mtDNA regions (i.e. 1 more mtDNA region than in their previous study, but still considered 1 linked locus). The problem with this marker choice is that nuclear introns do not mutate as quickly as mtDNA; consequently, unit for unit, they contain less signal of the population history (Hare 2001). Nuclear introns also achieve reciprocal monophyly more slowly than mtDNA because of their larger effective population size: 4 chromosomal copies compared with the haploid mtDNA genome (Palumbi et al. 2001). In short, given that coastal California Gnatcatchers were already known to lack reciprocal monophyly in mtDNA (Zink et al. 2000), one would not expect to find reciprocal monophyly in the handful of additional nuclear markers chosen by Zink et al. (2013). This is not to say that all types of nuclear markers are always a poor choice for assessing distinctness, even with recent divergence. In these cases, however, the nuclear markers should have high mutation rates, as was requested in the case of the gnatcatchers (USFWS 2011), or should be assayed in high numbers (e.g., Wagner et al. 2013).
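The effective-population-size argument can be made explicit with a standard result from neutral coalescent theory. This is offered only as a rough sketch, not as part of any analysis in Zink et al. (2013) or in our reanalysis. The symbol $N_e$ (diploid effective population size), an equal sex ratio, and strictly maternal inheritance of mtDNA are assumptions introduced here for illustration.

```latex
% Expected pairwise coalescence times, in generations, under neutrality.
% Assumptions: diploid effective size N_e, equal sex ratio, maternal mtDNA.
\[
E[T_{\mathrm{nuc}}] = 2N_e, \qquad
E[T_{\mathrm{mt}}] = \frac{N_e}{2}, \qquad
\frac{E[T_{\mathrm{nuc}}]}{E[T_{\mathrm{mt}}]} = 4 .
\]
```

Under these assumptions, lineages at a nuclear locus take on average about four times as long to coalesce as mtDNA lineages, which is why a population pair lacking reciprocal monophyly in mtDNA would not be expected to show it at a handful of introns.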
What makes their marker choice all the more perplexing is that Zink and colleagues have been vocal critics of doing precisely what Zink et al. (2013) did—namely, using nuclear DNA loci to assess population distinctness and reciprocal monophyly (Zink and Barrowclough 2008, Barrowclough and Zink 2009). Given their own negative views about the use of nuclear DNA in phylogeography, it is ironic that Zink and colleagues have chosen to rely on such data to prove their point. At the end of this commentary, we will discuss best practices for marker choice in studies like this.
Reanalysis of Publicly Available Genetic Data
Zink et al. (2013) never tested whether the coastal California Gnatcatchers were genetically distinct. Instead they based their conclusions largely on qualitative patterns in pie charts, like those that appear in their figure 1. They also referred to their table 1, which describes nucleotide diversity within and between populations but does not test for differentiation. They reported a nonsignificant FST value for the ND2 gene, but it appears they conducted a global test across all populations, instead of specifically testing whether coastal California Gnatcatchers were distinct from southern populations.
Their figure 3, which they reported as an analysis of genetic structure across geographic distance, is not an appropriate test because pairwise points are not independent from one another and, therefore, should not be analyzed for significance with regression. Given pairwise comparisons across all populations (not just nearest neighbors), it is far from clear what the expected pattern would be in the case of geographically structured genetic variation. Furthermore, Zink et al. (2013) did not explain which of the 13 original populations were pooled or removed to arrive at the 9 populations used to generate the 36 pairwise points on the plot.
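The non-independence problem can be illustrated with a short sketch. With 9 populations there are 36 unordered pairs, but each population contributes to 8 of them, so the points share information. One widely used alternative to ordinary regression significance is a Mantel-style permutation test, which shuffles whole population labels rather than treating pairs as independent. The code below is a minimal illustration of that swapped-in technique, not a method used by Zink et al. (2013) or in our reanalysis; the function names are hypothetical and the inputs are assumed to be square, symmetric distance matrices.

```python
import itertools
import random


def pairwise_count(n_pops):
    """Number of unordered population pairs among n_pops populations."""
    return n_pops * (n_pops - 1) // 2


def pearson(x, y):
    """Pearson correlation of two equal-length lists (assumes nonzero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel(dist_a, dist_b, n_perm=999, seed=0):
    """Mantel-style test (hypothetical helper).

    Correlates the upper triangles of two distance matrices, then builds a
    null distribution by permuting the population labels of dist_b.
    Returns (observed correlation, one-sided permutation P value).
    """
    n = len(dist_a)
    pairs = list(itertools.combinations(range(n), 2))
    a = [dist_a[i][j] for i, j in pairs]
    observed = pearson(a, [dist_b[i][j] for i, j in pairs])
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        order = list(range(n))
        rng.shuffle(order)  # relabel populations, keeping the matrix structure
        b = [dist_b[order[i]][order[j]] for i, j in pairs]
        if pearson(a, b) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Calling `pairwise_count(9)` reproduces the 36 points on the plot; `mantel` would take a geographic and a genetic distance matrix for the same ordered set of populations.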
Here, we analyze genetic data from Zink et al. (2013) and show that, despite the chosen genes being few and relatively poor for addressing differentiation, statistical tests actually support divergence of the coastal California Gnatcatchers. We downloaded Zink et al.'s (2013) data from GenBank (KC863990–KC864745) and conducted a standard and widely accepted test of population differentiation (Excoffier et al. 1992), analysis of molecular variance (AMOVA), between populations for each locus. We conducted tests by dividing the populations in two separate ways (Figure 1). For test 1, we separated the recognized subspecies californica (Los Angeles south to San Telmo) from more southern populations (Misión San Fernando south to Cabo San Lucas), based on Atwood's (1991) quantitative morphological subspecies boundaries and the USFWS initial listing boundary of 30°N latitude. For test 2, we restricted our analysis to those samples assigned to the northern subspecies californica (Los Angeles to San Diego) and an adjacent, more southern subspecies atwoodi (Ensenada to San Telmo), basing the dividing line on Mellink and Rea's (1994) subspecies boundaries. We did not reanalyze the BFIB-5 locus because these data likely combine alleles from two different (i.e. paralogous) genes, and thus they are not appropriate for phylogeographic analysis (Appendix Figure 2). We assigned sequences from the remaining loci to the populations described above, using DnaSP version 5.10.1 (Librado and Rozas 2009), and exported these as Arlequin project files. We then tested for significant differentiation using an AMOVA in Arlequin version 3.5 (Excoffier and Lischer 2010).
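For readers unfamiliar with AMOVA, the logic of the test described above can be sketched in a few lines. The code is a simplified, one-level illustration (sequences within populations, no higher grouping) that uses raw pairwise differences as the distance; it is not Arlequin's implementation, which offers additional distance corrections and hierarchical levels. The function names are hypothetical, and the statistic computed is a Phi_ST-type fixation index analogous to the FST values we report.

```python
import random


def pairwise_differences(seq_a, seq_b):
    """Count aligned sites at which two sequences differ."""
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


def phi_st(seqs, groups):
    """One-level AMOVA: proportion of molecular variance among groups."""
    n = len(seqs)
    labels = sorted(set(groups))
    k = len(labels)
    d = [[pairwise_differences(a, b) for b in seqs] for a in seqs]
    # Sums of squared deviations are sums of pairwise distances divided by
    # the number of sequences involved.
    ss_total = sum(d[i][j] for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    sizes = []
    for g in labels:
        idx = [i for i in range(n) if groups[i] == g]
        sizes.append(len(idx))
        ss_within += sum(
            d[i][j] for pos, i in enumerate(idx) for j in idx[pos + 1:]
        ) / len(idx)
    ms_within = ss_within / (n - k)
    ms_among = (ss_total - ss_within) / (k - 1)
    # n0 is the average group size adjusted for unequal sampling.
    n0 = (n - sum(s * s for s in sizes) / n) / (k - 1)
    var_among = (ms_among - ms_within) / n0
    total = var_among + ms_within
    return var_among / total if total != 0 else 0.0


def amova_test(seqs, groups, n_perm=999, seed=0):
    """Permutation test: shuffle group labels among sequences.

    Returns (observed Phi_ST, one-sided permutation P value).
    """
    observed = phi_st(seqs, groups)
    rng = random.Random(seed)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if phi_st(seqs, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

In this sketch `seqs` would hold the aligned sequences for one locus and `groups` the population assigned to each sequence (for example, californica versus southern populations in test 1). The estimate can be slightly negative when groups are no more different than random draws, which is conventionally read as no differentiation.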
Results from test 1 show that according to Atwood's (1991) boundaries, the californica subspecies of California Gnatcatchers is significantly differentiated from southern populations at 2 of the 7 nuclear loci: ACON (FST = 0.062, P = 0.014) and TGFB-2 (FST = 0.077, P = 0.0049). Results from test 2 show that 2 of 7 nuclear loci (1 the same as above and 1 different) are significantly differentiated between the northern subspecies californica and atwoodi: ACON (FST = 0.087, P = 0.046) and MC1R (FST = 0.195, P = 0.001); the mtDNA locus ND2 is also significantly differentiated (FST = 0.336, P = 0.016). Another nuclear locus, CEPUS, was nearly significant (FST = 0.060, P = 0.051). This is not terribly surprising upon reinspection of the pie charts in figure 1 of Zink et al. (2013), which shows that northern populations have private alleles at the TGFB-2 and MC1R loci. This is also true of the ACON and ND2 loci, although these data are not shown in figure format in their paper.
In summary, according to our analysis of Zink et al.'s (2013) original data, the threatened californica subspecies of California Gnatcatchers is genetically differentiated from Baja populations at 29% (2 of 7) of the nuclear loci examined. Additionally, californica is differentiated from atwoodi at 29% (2 of 7) of the nuclear loci as well as mtDNA, which suggests that californica has a smaller geographic range and could be restricted to the United States. Importantly, 3 of the 7 nuclear loci examined show differentiation between californica and other populations in at least 1 of 2 different sets of comparisons.
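The counts in this summary follow directly from the P values reported above. The short tally below re-derives them, assuming the conventional threshold of 0.05 for significance; only nuclear loci with reported P values are listed, and the remaining loci are taken to be nonsignificant.

```python
ALPHA = 0.05          # assumed significance threshold
N_NUCLEAR_LOCI = 7    # nuclear loci analyzed after excluding BFIB-5

# Reported P values for nuclear loci in each test.
test1 = {"ACON": 0.014, "TGFB-2": 0.0049}
test2 = {"ACON": 0.046, "MC1R": 0.001, "CEPUS": 0.051}


def significant(pvals, alpha=ALPHA):
    """Return the set of loci whose P value falls below alpha."""
    return {locus for locus, p in pvals.items() if p < alpha}


sig1 = significant(test1)
sig2 = significant(test2)
either = sig1 | sig2  # loci differentiated in at least one comparison

pct1 = round(100 * len(sig1) / N_NUCLEAR_LOCI)
pct2 = round(100 * len(sig2) / N_NUCLEAR_LOCI)
print(sorted(sig1), pct1)
print(sorted(sig2), pct2)
print(sorted(either), len(either))
```

CEPUS falls just above the threshold in test 2 and is therefore excluded from the tally.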
We do not claim that these analyses are the final word on differentiation of coastal California Gnatcatchers, as they are based on only a small fraction of the genome. Additionally, AMOVA tests, because they compare population means and variances, do not speak to diagnosability of individuals. However, given that neutral nuclear DNA loci are lagging indicators of differentiation (Zink and Barrowclough 2008), the population-level differences from the AMOVA results lend credence to a genetic basis for, or parallel to, the phenotypic diagnosability of individuals described in previous work (Atwood 1991, Mellink and Rea 1994). Further, the AMOVA results show that the underlying genetic data contained in Zink et al. (2013) can produce positive results, at odds with their conclusions, when analyzed with a standard method.
Species Concepts Influence Interpretation of Genetic Data
The ESA listing of the coastal California Gnatcatcher was originally based both on its status as a subspecies and on evidence for its distinctness (USFWS 1993). It seems reasonable that taxonomic recognition and distinctness should always go hand in hand, but unfortunately the empirical question of distinctness and the debate over what to call distinct units (sometimes called debate over “species concepts”; Zink and McKitrick 1995) are not as decoupled as they might be. Likely for this reason, the ESA also provides for protection of “distinct population segments” (DPSs), which currently lack taxonomic recognition and are instead designated on the basis of empirical evidence for discreteness, significance, and population status.
Zink et al. (2013) was the latest in a series of works on genetics, taxonomy, and species concepts by the first author. Summarizing these works, Zink has advocated that all populations showing reciprocal monophyly should be recognized as species, whereas all potential taxa not meeting this criterion should go without taxonomic recognition (McKitrick and Zink 1988, Zink 2004, Zink and Johnson 2006). In other words, by defining species through reciprocal monophyly, but requiring the same criterion for subspecies recognition, Zink effectively believes that the subspecies and DPSs covered by the ESA should not be given names (Patten 2010). Zink's views on subspecies are not widely accepted by his peers (Remsen 2005, Winker et al. 2007, Patten 2010) and reflect a rather extreme adherence to the phylogenetic species concept (Mayden 1997, Avise 2000). In fact, many studies support the utility of the subspecies rank, while admitting that subspecies described long ago should be reevaluated with modern methods (Phillimore and Owens 2006, Price 2008, Pruett and Winker 2010, Winker 2010).
The expectation of reciprocal monophyly for taxonomic or legal recognition simply cannot be reconciled with what we know today about how biodiversity is generated. As the field of phylogeography has moved to using next-generation sequencing, we have seen populations distinctive in phenotype, but indistinguishable in mtDNA, become well resolved with thousands of single nucleotide polymorphisms (SNPs). One recent example is the case of Wilson's Warbler (Cardellina pusilla). Until recently, genetic markers, including mtDNA, could only discriminate broad differences between the 1 subspecies in eastern and the 2 subspecies in western North America, but not between the 2 western subspecies (Kimura et al. 2002, Irwin et al. 2011). Using a panel of 96 SNPs, Ruegg et al. (2014) resolved Wilson's Warblers into genetic clusters that conform closely to the subspecies boundaries. They also found more fine-scale differentiation within subspecies that coincides with biogeographic regions. A potential criticism is that, with enough markers, every sampling location (and every individual, for that matter) will eventually show its own distinctive genetic pattern. This does not appear to be the case with the Wilson's Warblers study, in which some geographically distant localities (like those in western Alaska) showed genetic cohesiveness even when the analytical program was asked to continue dividing up the localities into ever finer units (see Ruegg et al. 2014: supplementary fig. 1).
New sequencing methods have also revealed how divergence and speciation can proceed despite gene exchange, leading to situations where reciprocal monophyly is not expected (Rheindt and Edwards 2011). The genome, it turns out, is a porous boundary, especially in birds, which have fewer postzygotic reproductive barriers than other vertebrates (Grant and Grant 1992, Price and Bouvier 2002). When hybridization occurs, some genes flow freely between diverging lineages while other genes resist movement, resulting in monophyly at some genomic regions but not at others (Wu 2001). This is nicely illustrated in the case of the Hooded Crow (Corvus cornix) and Carrion Crow (C. corone). These two species with distinct plumage patterns are not differentiated from one another in mtDNA (Haring et al. 2007). A recent study of whole genomes (Poelstra et al. 2014) revealed that, in fact, most of the genome demonstrates allele sharing between species, partly due to hybridization. Meanwhile, the phenotypic differences appear to be encoded in a small portion of the genome that resists movement between species; and it is these regions alone that show reciprocal monophyly.
How to Interpret “Background Tests” of Niche Differentiation
Another problem with Zink et al. (2013) is the authors' interpretation of their ecological analysis, specifically the niche tests developed by Warren et al. (2008). The way these tests work is that, first, the predicted niches are compared using a randomization procedure to see whether they are statistically different. Warren et al. (2008) called this a “niche identity” test, which provides statistical confidence to what had previously been the subjective endeavor of visualizing niche predictions on a map and eyeballing whether they were different. Using this test, Zink et al. (2013) found strongly significant differentiation between the niches of coastal California Gnatcatchers and other populations. Their figure 7 shows this highly significant niche difference with an arrow far outside the null distribution.
There are confounding factors with this kind of test, however—principal among them that temperature and rainfall show continuous trends with latitude and are thus strongly correlated with geographic distance. No matter what species you might be looking at, or even if you just picked random points on a map, you would find that points at higher latitudes experience cooler and more seasonal temperatures than points at lower latitudes. This is not particularly interesting. To address this, and to provide a more robust test of niche differentiation, researchers developed additional tests that attempt to control for this “background effect” (Warren et al. 2008, McCormack et al. 2010).
When implementing these “background tests,” Zink et al. (2013) found that coastal California Gnatcatchers were not more differentiated in temperature and rainfall than their geographic position would predict, but neither were they more similar. In other words, they failed to reject the null hypothesis that niches and background areas were equally divergent. From this result they concluded that the coastal California Gnatcatcher is a habitat generalist. This is unsupported by their results. The proper interpretation is that by failing to reject the null hypothesis of the “background test,” it cannot be disentangled whether the strong niche differences from the “identity test” represent differences important to the California Gnatcatchers or arise simply from clinal trends in climate variables. No one, however, should doubt whether the coastal California Gnatcatchers live in a different climate than populations to the south; figure 7 in Zink et al. (2013) shows that they most certainly do.
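The randomization logic behind the identity test can be sketched as follows. Warren et al. (2008) compare niche models built from occurrence records; as a stand-in, this sketch summarizes each population's niche as a histogram of a single environmental variable and measures overlap with Schoener's D. That substitution, the function names, and the binning choices are assumptions made for illustration only. The background test differs in that its null distribution comes from points drawn at random from the other population's background region rather than from reshuffled labels; that step is not sketched here.

```python
import random


def schoener_d(p, q):
    """Schoener's D overlap between two discrete distributions (0 to 1)."""
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def binned(values, lo, hi, n_bins):
    """Proportion of values in each of n_bins equal-width bins on [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        b = min(int((v - lo) / width), n_bins - 1)  # top edge joins last bin
        counts[b] += 1
    return [c / len(values) for c in counts]


def identity_test(env_a, env_b, n_bins=10, n_perm=999, seed=0):
    """Identity-style randomization test (hypothetical helper).

    env_a and env_b hold one environmental value per occurrence point.
    Returns (observed overlap, P value for overlap being lower than
    expected if both populations shared a single niche).
    """
    pooled = list(env_a) + list(env_b)
    lo, hi = min(pooled), max(pooled)
    if lo == hi:
        return 1.0, 1.0  # no environmental variation at all

    def overlap(a, b):
        return schoener_d(binned(a, lo, hi, n_bins), binned(b, lo, hi, n_bins))

    observed = overlap(env_a, env_b)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign points to the two populations
        a, b = pooled[:len(env_a)], pooled[len(env_a):]
        if overlap(a, b) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

A low P value from this kind of test says only that the two sets of points occupy different environments. It cannot say whether that difference exceeds what the underlying climatic gradient would produce, which is the question the background test is meant to address.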
Best Practices for Assessing Distinctness
We are in a time of new technologies and advances in genetic and ecological analyses. Because of time and money constraints, not every study can avail itself of the latest methods. But those studies that purport to demonstrate results with important societal consequences have a scientific obligation to use either the best available methods or to couch their conclusions in the appropriate level of uncertainty. Our critique of Zink et al. (2013) is not just that the authors did not always use the latest methods, but that the conclusions they drew from negative results were too sweeping and too assertive, given the already known and acknowledged limitations of both their data and the applied analyses.
So what is the best way to test for distinctness of coastal California Gnatcatchers? Below, we provide recommendations for best practices in this case and in future cases in which the distinctness of potentially threatened or endangered populations is in question.
Discrete phenotypic differences were the original basis for listing the coastal California Gnatcatchers under the ESA (USFWS 1993). “Discrete” means that differences must not vary smoothly from one population to another, but instead must show a discontinuity (i.e. step cline) in their character values. Over the years, many studies on the California Gnatcatcher have delimited changes in phenotypic characters consistent with discrete variation, affirming the distinctness of P. c. californica from subspecies to the south (Grinnell 1926, Van Rossem 1931, Phillips 1980, Atwood 1991). There is evidence for phenotypic discontinuities at even finer scales. Using phenotypic traits, Mellink and Rea (1994) described a new subspecies from within the californica subspecies. This subspecies, P. c. atwoodi, shows discrete differences in the brightness of the white on the breast feathers, among other traits, and on this phenotypic basis it is recognized by authoritative taxonomic references (Dickinson et al. 2003, del Hoyo et al. 2006).
The study of Skalski et al. (2008), cited by Zink et al. (2013), claimed to invalidate the discreteness of the variation found by Atwood (1991), but in reality they analyzed little of the original data. Skalski et al. (2008) also made some questionable claims. For example, they claimed that by showing two trends, one relating breast-plumage brightness to latitude and another relating breast-plumage brightness to specimen age, these results “indicate that the specimens from the more northerly latitudes were collected earlier than the specimens from more southerly latitudes” (their fig. 2). That is certainly one hypothesis. Another possibility is that breast brightness varies with latitude in addition to age. These hypotheses can be disentangled by restricting analysis to specimens of similar collection year, which is exactly what Mellink and Rea (1994) did in their determination that variation was discrete among northern subspecies, including P. c. californica.
The fact remains that, despite all the previous work, the California Gnatcatcher has never been the subject of a comprehensive phenotypic analysis, using modern methods and all available specimens and controlling for potential sources of error like specimen age. In some ways, the methodology for collecting phenotypic data has not changed dramatically in the past 50 years. Researchers still commonly use analog calipers to measure morphological features like the length of the bill or wing. Two recent developments, however, have improved our ability to assess differences in birds, neither of which has yet been applied rigorously to the question of the California Gnatcatchers.
First, plumage reflectance analysis, using a modern spectrophotometer that can measure light over the full range of avian vision, can detect subtle physical distinctions among populations (e.g., Maley and Winker 2007) and test whether the birds themselves are able to notice these differences perceptually (Vorobyev et al. 1998, Maia et al. 2013). Second, multivariate statistics like principal component analysis can often reveal phenotypic variation that would otherwise be overlooked in univariate tests (Milá et al. 2010, Aleixandre et al. 2013). Although the study of Atwood (1991) used precursors of these methods, equipment and analyses have undergone considerable development in the past 20 years. A comprehensive analysis that joins morphological and plumage reflectance data in a multivariate analysis, and which tests for the discreteness of variation against a null hypothesis of smooth clinal change (Patten 2010), is the minimum effort required to confirm or refute the phenotypic distinctness of the coastal California Gnatcatchers.
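One simple way to frame a test of discreteness against smooth clinal change is to compare how well a straight-line cline and a single-step cline fit a trait measured along a transect. The sketch below does this with least squares and a Gaussian AIC. It is an illustration of the general idea only, not the procedure of Patten (2010) or of any study cited here; the function names, the parameter counts (3 for the line, 4 for the step), and the synthetic values in the demonstration are all assumptions, and AIC is a crude guide at small sample sizes.

```python
import math


def rss_linear(x, y):
    """Residual sum of squares from an ordinary least-squares line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))


def rss_step(x, y):
    """Best single-breakpoint step model: returns (RSS, breakpoint)."""
    pts = sorted(zip(x, y))
    best = None
    for cut in range(1, len(pts)):
        if pts[cut - 1][0] == pts[cut][0]:
            continue  # a breakpoint cannot split tied positions
        left = [b for _, b in pts[:cut]]
        right = [b for _, b in pts[cut:]]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        rss = sum((b - ml) ** 2 for b in left) + sum((b - mr) ** 2 for b in right)
        if best is None or rss < best[0]:
            best = (rss, (pts[cut - 1][0] + pts[cut][0]) / 2)
    return best


def aic(rss, n, k):
    """Gaussian AIC up to an additive constant (requires rss > 0)."""
    return n * math.log(rss / n) + 2 * k


# Synthetic demonstration values (not measurements): transect position
# and a trait score with an abrupt shift midway.
position = list(range(12))
trait = [5.1, 4.9, 5.0, 5.2, 4.8, 5.0, 8.1, 7.9, 8.0, 8.2, 7.8, 8.0]

step_rss, breakpoint_pos = rss_step(position, trait)
aic_line = aic(rss_linear(position, trait), len(trait), k=3)
aic_step = aic(step_rss, len(trait), k=4)
print("step model preferred:", aic_step < aic_line)
```

With real specimens, the trait could be a principal component score combining morphology and reflectance, and covariates such as specimen age would need to be controlled before the comparison.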
Methods for genetic analyses have gone through a revolution in the past 10 years with the advent of next-generation sequencing (Lerner and Fleischer 2010). To assess differentiation in the California Gnatcatchers (and other, similar cases), densely sampled SNPs are ideal, like those employed in the Wilson's Warbler study of Ruegg et al. (2014) or in other studies that used even higher numbers of SNPs (Harvey and Brumfield 2014). The technique used to generate these SNPs falls under a broad category of methods called “genotyping by sequencing” (GBS; Baird et al. 2008), which can provide thousands of SNPs with the power to resolve difficult taxonomic conundrums (Wagner et al. 2013), to compare genomic and phenotypic signatures of divergence (Baldassarre et al. 2014), and to uncover loci potentially linked to targets of selection, like those governing plumage color (Parchman et al. 2013). The generation and analysis of GBS data is well developed and affordable (Davey et al. 2010, Narum et al. 2013) and should only become more so with time.
Prior niche modeling on the coastal California Gnatcatchers investigated the extent of their habitat (Rotenberry et al. 2006) and the potential effects of climate change (Preston et al. 2008) but did not compare these habitats with those of populations in Baja California. To test whether these habitats differ, “old-fashioned methods” using measuring tape and sampling grids might trump high-tech solutions. There are few enough sampling points that a field survey of plant species and vegetative characteristics would not be too onerous and would stand the best chance of detecting the qualitatively described habitat differences in the coastal California Gnatcatchers (Atwood 1991, Mellink and Rea 1994).
Niche modeling would help complement these “ground-truthed” data and would provide a more continuous assessment of habitat, compared with the necessarily patchy coverage of any method drawing only from sampling points. Niche modeling should use environmental variables that describe both climate (e.g., rainfall and temperature data used by Zink et al. 2013) and the vegetation itself, like those available from remote-sensing satellites (e.g., vegetative cover, greenness, and height, which were not used by Zink et al. 2013). Remote-sensing data, because they describe aspects of the niche likely to vary over smaller spatial scales than climatic data, may show more promise for finding evolutionarily important differences among gnatcatcher habitats.
Finally, data collection and analysis should be carried out either free of any financial conflict of interest or, at minimum, with such conflicts stated openly. A conflict of interest occurs in any situation in which the impartiality of a study might be undermined by a competing interest, often a funding source. The work of Zink et al. (2013) was funded by developers (Sahagun 2014), which we believe represents a conflict of interest that should have been acknowledged in their paper. The paper does not disclose that the R. Thornton thanked in the Acknowledgments for “securing funding” is a lawyer who has represented developers. Thornton is representing the National Association of Home Builders in the petition to delist the gnatcatcher. The presence of a financial conflict of interest does not necessarily invalidate a study or imply malfeasance. Yet many journals have decided that financial conflicts of interest should be stated openly because of the demonstrated effect of “sponsorship bias” (Lesser et al. 2007). The risk of sponsorship bias is even greater with negative results, which rely more heavily on the researcher's assumptions and interpretation. Especially when it comes to threatened or endangered species, explicit funding statements provide important context for researchers, the media, and the public in assessing a study's rationale, methods, and interpretation.
The manuscript benefited enormously from the advice and critical input of K. Winker, K. Garrett, M. Patten, P. Unitt, A. Zellmer, B. Milá, and an anonymous reviewer. E. Ridley organized and sorted the GenBank data. This project was funded by an endowment to Occidental College by the late Robert T. Moore and Margaret C. Moore.