The coevolutionary interactions of pathogens and their hosts are likely to be a widespread mechanism that results in the maintenance of genetic variation. Alternatively, highly variable species may be in a transient state, with their variation reflecting directional selection and new selection pressures. With those insights in mind, we set about to study the House Finch (Carpodacus mexicanus), some of whose populations are arguably the most variable among North American birds with regard to plumage coloration in males of the same age. In addition, we were also attracted to House Finches by our observations and those of others (McClure 1989, Power and Human 1976) that this species is highly unusual not only for its color variation, but for its remarkably high incidence of disease, particularly avian pox, which of course raised the question of whether pathogens and plumage color might be related. Lastly, the possibility of recent changes in disease incidence was raised by the first published report (Power and Human 1976) of pox disease in mainland populations of this common species, which reported a severe outbreak in 1972. Accordingly we set out to determine whether there is any evidence of a link between plumage color variation and pox and whether extreme variation in color and high pox-incidence might be new conditions.
To address our goals, we collected new data through our own field efforts, studied specimens in museums, reviewed available literature, and amassed data from bird banders over a wide part of the species' current range. Although they do not establish causation, our straightforward data and analyses show strong temporal and spatial links between pox and plumage coloration and add potential new insights to work on this interesting species. We found the following major results: birds in southern mainland California had a much lower incidence of pox disease in the first half of the twentieth century than in the second half; over the same period, red coloration has gone from being characteristic of more than three-fourths of southern California males to a much lower incidence today, with orange and yellow males having become much more common. At present, there are strong macrogeographic associations between high pox incidence and high plumage color variation (Zahn and Rothstein 1999).
Hill's (1990, 1991, 1992, 1993a, b, 1994a, b, c) past work has produced much of what was previously known about plumage color variation and its consequences in this species. Unfortunately Hill's critique of our paper adds confusion, but no new insights, to this interesting system. His frenetic attempt to discredit virtually every aspect of our paper is totally unconvincing, contradicts itself, and highlights serious weaknesses in his own work. The putative major problems that Hill alleges deal with the following: our methodology of representing plumage coloration; our temporal criteria for separating samples in analyzing possible historical changes in pox incidence and plumage coloration; our criteria for analyzing possible patterns in present day macrogeographic variation in plumage color and pox. In addition, Hill misrepresents our paper by alleging that we made statements and conclusions that in fact do not appear in our paper.
We address the misrepresentations first. Hill (2001) repeatedly argues that we concluded that pox disease is the “singular cause” for plumage variation or “the primary or sole source of temporal or geographic variation in male plumage coloration.” In fact, we were careful never to state our conclusions in such absolute terms. For example, our abstract states that high incidences of yellow and orange males “may be related to a high incidence of avian pox” (Zahn and Rothstein 1999). Nowhere does our paper state that pox is the only or major determinant of color variation. We would suggest that Hill did nothing further than read the title of our paper were it not for the fact that even it states there is a “possible relationship” between pox and plumage variation.
Another misrepresentation is Hill's supposed confusion over our suggestion that “the high level of variation [in plumage coloration of male House Finches] is a new phenomenon”. Hill suggests that maybe we meant that “there were few or no yellow or orange males” (Hill 2001) in the early part of the last century, a suggestion that he then rebuts. This is a pointless discussion by Hill because we never stated that “there were few or no yellow or orange males.” Instead, we clearly stressed that early workers, such as Michener and Michener (1931), described the yellow to red range of colors that exists today. Furthermore, our Figure 1 and associated text (Zahn and Rothstein 1999) show that 23.4% of males in a museum sample from the early 1900s were yellow or orange. To further confuse matters, Hill later acknowledges that we recognized that there have always been reports of yellow and orange males and even quotes us as stating “Non-red variants existed historically …” So it is unclear why Hill raised that red herring in the first place.
Hill's claimed confusion over our use of the phrase “high variation” is similarly hard to understand. Given the data we presented, certainly Hill could understand that “high variation” referred not to the total range of hues but to the proportional representation of hues. It should have been clear to Hill that what we meant by higher variation was a switch from populations in which the majority of males were of one color (red) to ones in which there was a much more even representation of three colors. For example, our museum data showed the following color percentages in the early versus the late twentieth century: 76.6% red, 22.3% orange, and 1.1% yellow, versus 51.3% red, 35.9% orange, and 12.8% yellow (Zahn and Rothstein 1999). Clearly, the second sample shows more variation by any use of the word.
Despite Hill's misrepresentations of our conclusions, he would have valid points to make if our results were unreliable because of faulty methodology. We first address Hill's criticisms of our color scoring methodology. One of us (Zahn) did all color scoring by matching male colors to one of 13 color chips that ranged from red to yellow and that are from a widely available source (Smithe 1975). We had a good rationale for using those particular 13 colors because they represented all of the hues we found among several hundred museum specimens as well as all males captured at our Santa Barbara County field sites. Had other hues been present, we would have used additional chips. Furthermore, Smithe's (1975) color chips were appropriate because they represent colors prevalent in birds and are not a “haphazard collection” as Hill incorrectly states. To categorize the 13 Smithe chips, we had 22 people classify each one as red, orange, or yellow. Those people also ranked the chips on a continuum from red to yellow. All 22 people categorized 10 of the 13 chips the same way (i.e. red, orange, or yellow). Amongst the remaining three chips, 21 called one chip orange whereas only one person called it red, 21 called one chip red whereas one called it orange, and 21 called the last chip yellow whereas one called it orange. Clearly our results are replicable. As a final check, we matched the 13 Smithe chips to chips in the Munsell color system (Munsell Color Company 1976). The designations of our 22 judges as red, orange, or yellow agreed in each case with the Munsell system.
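The agreement among the 22 judges can be condensed into a single proportion. The short Python sketch below only tallies the counts given above; the variable names and the choice of summary (the share of individual classifications that match each chip's majority label) are our illustrative assumptions, not an analysis reported in Zahn and Rothstein (1999).

```python
# Illustrative tally of judge agreement on the 13 Smithe chips.
# Counts are taken from the text: 10 chips were classified unanimously,
# and each of the other 3 chips had one dissenting judge out of 22.
JUDGES = 22
CHIPS = 13
unanimous_chips = 10
split_chips = CHIPS - unanimous_chips

# Classifications that match the majority label for their chip.
matching = unanimous_chips * JUDGES + split_chips * (JUDGES - 1)
total = CHIPS * JUDGES
agreement = matching / total

print(f"{matching} of {total} classifications match the majority label "
      f"({agreement:.1%})")
```

Under these counts, 283 of the 286 individual classifications agree with the majority label.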
Hill lambastes us for using the Smithe chips first rather than using Munsell chips directly, and for collapsing our color categorizations into red, orange, and yellow instead of using the true continuum represented by those descriptors. There are two reasons why we chose our color system: historical comparisons were a key goal of ours, and those had to rely on studies in which early workers categorized birds as red, orange, and yellow; and the Munsell chips were not available for all of our work. Addressing the latter reason first, the Munsell chips are part of an expensive set that we were allowed to use only in a single indoor location. Because we wanted to use a common rating system in a number of museums and at a range of field sites, we opted for Smithe's chips. Given the consistency of the ratings by our 22 judges, it is clear that our system is highly replicable and that it introduced no problems. Hill's sarcasm that our method was akin to measuring wing lengths with “finger widths” reflects poorly on his intentions, which should have been an objective attempt to elucidate the truth, not an attempt at ridicule.
Secondly, we conducted most of our analyses in terms of red, orange, or yellow categorizations because we were interested in historical comparisons and in contemporary comparisons involving a large part of the species' range. So we had to rely on other people. Grinnell (1911) and Michener and Michener (1926, 1931) reported their historical data in terms of red, orange, or yellow categorizations and we had to do the same for meaningful comparisons. The former author reported that red males made up 92.5% of 94 male specimens collected before 1911 in coastal areas of southern California counties from Santa Barbara to San Diego. The Micheners reported that 85% of males were red among 1,563 specimens banded at Pasadena, in the 1920s. Our assessment of pre-1950 museum specimens found that 76.6% of 94 were red. Those three sets of historical data are in stark contrast to available mainland southern California data for recent decades (Zahn and Rothstein 1999): W. L. Principe reported that 12.2% of males were red among 459 banded at Pasadena from 1991–1995. Our 1994–1995 banding at four sites in Santa Barbara County found that 29.4% of 323 males were red. Lastly, 51.3% of 78 museum specimens collected after 1959 were red. Although there is currently considerable spatial heterogeneity in coastal regions of southern California, it is clear from those diverse sources of data that there was a large decline in the proportion of red male House Finches during the 1900s.
We applied statistical testing to our own data from museum specimens in two ways. A chi-squared test showed a significant difference in the proportions of red males before 1951 versus after 1959 (χ² = 12.03, df = 1, not 3 as mistakenly cited in our paper, P < 0.001). Hill (2001) never mentions the chi-squared test but criticizes a correlation analysis we presented as showing only a weak relation (rs = 0.26, although the P value was <0.0005). The correlation coefficient was indeed low because the test was inherently conservative. A correlation analysis would be best suited to detecting a progressive color change over the entire time period covered by our data. By contrast, the relationship indicated by our data, and expected under the hypothesis that pox is implicated in the temporal color shift, is that color changed over a short period roughly coinciding with the appearance of pox. Although it was not well suited to our data, we carried out the correlation analysis for two reasons. First, it did not involve any temporal cut-offs chosen by us, so it eliminated any issues concerning arbitrariness. Secondly, we wanted to reflect the fact that there is a continuum of colors, and the correlation analysis allowed us to use the red to yellow ranking of our 13 color chips, not just the three color designations.
Hill further criticizes our museum analyses by raising the issue of collecting biases, but fails to note that collecting biases were minimized because the specimens we assessed were housed in six museums and were therefore collected by a considerable assortment of individuals over many decades. There is no reason why collecting biases would shift from inflating the proportion of red individuals in the early 1900s to deflating that proportion in the late 1900s. Furthermore, the most likely collecting bias in a common species that is a commensal of man, as is the case for House Finches, would have been a preference for uncommon individuals, which may mean that old museum specimens have a disproportionate representation of yellow and orange birds.
Hill's second major criticism is that our choice of temporally partitioning the museum data as pre-1951 versus post-1959 in our analysis of color changes reflects our “preconceived notions” about pox, which is simply false. We explicitly stated that we chose those cut-offs because there were no museum specimens from the 1950s. So our cut-offs were dictated by the independent variable (year), not by the dependent variable or by any preconceived notions. We next pointed out that the lack of specimens from the 1950s provided a fortuitous link with the first published report of pox in California, in which Power and Human (1976) documented a severe outbreak at Santa Barbara in 1972. However, Power and Human's account makes it clear that pox was noted at Santa Barbara in years prior to 1972 and that it occurred over at least a 340 km span of coastal California in the winter of 1972–1973. Three of our 19 museum specimens collected from 1961–1970 had missing toes, providing further evidence for the onset of pox sometime before 1972. Therefore, we decided it was most objective to simply let the data partition themselves, that is, pre-1951 versus post-1959 given the absence of specimens from the 1950s. If Hill objects to that, the solution is to delete our 1961–1970 data, which leaves 51.7% (30 of 58) of birds red in the years after the severe outbreak of pox in 1972 and still yields a significant difference from the pre-1951 period (χ² = 10.05, df = 1, P < 0.01).
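Both chi-squared comparisons of red versus non-red museum specimens can be checked with a few lines of arithmetic. The Python sketch below is only a minimal illustration: the function and variable names are hypothetical, the statistic is the ordinary Pearson chi-squared for a 2 × 2 table without continuity correction (an assumption about how the reported values were obtained), and the count of 72 red males before 1951 and 40 after 1959 are reconstructed from the reported percentages of 94 and 78 specimens.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With df = 1 the upper-tail probability equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# (red, non-red) counts. ASSUMPTION: 72 and 40 are reconstructed from
# 76.6% of 94 and 51.3% of 78; 30 of 58 is stated directly in the text.
pre_1951 = (72, 94 - 72)
post_1959 = (40, 78 - 40)
after_outbreak = (30, 58 - 30)  # post-1959 sample without the 1961-1970 birds

chi2_all, p_all = chi_squared_2x2(*pre_1951, *post_1959)
chi2_late, p_late = chi_squared_2x2(*pre_1951, *after_outbreak)

print(f"pre-1951 vs post-1959:      chi2 = {chi2_all:.2f}, P = {p_all:.5f}")
print(f"pre-1951 vs after outbreak: chi2 = {chi2_late:.2f}, P = {p_late:.5f}")
```

With these reconstructed counts the two statistics come out at about 12.03 and 10.05, matching the values quoted in the text.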
We also note that Hill's multi-pronged but ineffective criticism of our museum data ignores the fact that those data are consistent with all other sources of plumage color data, as noted above. In addition, neither the Micheners nor Grinnell noted pox lesions or a common sign of past pox infection—missing toes—in any of the birds from the early 1900s they examined. We found pox indicators in 37.6% of 663 males we banded in Santa Barbara County between 1993 and 1996. Similarly, McClure (1989) found that about one-third of thousands of House Finches he banded in Ventura County from 1977–1987 had pox at some time in their lives (see also Thompson et al. 1997). Principe, who worked only 8 km from the Michener's 1920s Pasadena site found active pox tumors in 1991–1995 on approximately 25% of males during fall, which is when pox infections peak (Harrison and Harrison 1986, Zahn 1999). Among our museum specimens, none of 94 males collected before 1950 had missing toes compared to 12 of 78 collected after 1959 (P < 0.0001, Fisher exact test). As with our color information, diverse types of data on pox collected by different people all point to the same result: a temporal shift. Hill can focus on the uncertainty concerning the exact timing of the shifts, which our data indicate occurred sometime between 1950–1970, but that does not support his attacks on the issue of whether those shifts occurred at all.
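The Fisher exact comparison of missing toes (0 of 94 specimens before 1950 versus 12 of 78 after 1959) can be reproduced by direct enumeration of the hypergeometric distribution. The Python sketch below is a minimal illustration with hypothetical names; it assumes the usual two-sided definition, which sums the probabilities of all tables with the same margins that are no more probable than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Rows: period; columns: (missing toes, toes intact), as given in the text.
p_value = fisher_exact_two_sided(0, 94, 12, 78 - 12)
print(f"Fisher exact P = {p_value:.2e}")
```

The resulting probability falls below 0.0001, in line with the value reported above.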
Because data indicated historical shifts in both color and pox incidence in southern California, we explored the possibility of a relationship between color and pox by considering the potential physiological links between pox and carotenoids and by determining whether there are currently spatial links between pox and color. The latter approach brings us to Hill's third major criticism, which is that we used unjustifiably subjective criteria to assess current macrogeographic patterns of variation in plumage color. Our macrogeographic analyses employed recent data from four areas, mainland southern California (as described above), the eastern United States, Hawaii, and San Nicolas Island (110 km off the coast of southern California). The eastern data on color come from specimens in two museums and from two banders who sent us feather samples from males they banded. Colors were scored with the same methodology as described above, either at those museums or in Santa Barbara using the banders' feather samples (not by the banders as Hill mistakenly states). As we reported above, that methodology is replicable and objective. We find it odd that Hill attacks the reliability of our new data that show that red coloration predominates in eastern House Finches, because he criticizes us for not citing his data, which he says show the same result! We did not cite his data because of its methodological flaws, which we discuss below.
It is true that some or all of the recent data we cited for three of the areas in our macrogeographic assessments (mainland southern California, Hawaii, and San Nicolas Island) were collected by other people who did not use our methodology. Those people simply categorized birds as red, orange, or yellow. However, such data are reliable as shown by the nearly 100% uniformity in our panel of 22 color judges. The spatial link between decreased proportions of red males and the occurrence of pox is clear from our data. Color categorizations of House Finches have also been acceptable to other researchers (van Riper 1994, Thompson et al. 1997), so our work is not unique in that respect. As with our data on temporal shifts, our data on macrogeographic patterns come from diverse sources, not just our own work. Hill can belittle such evidence as a “hodgepodge” of information, but our methodology shows that red, orange, and yellow categorizations are reliable across different people. Furthermore, all evidence we amassed for both temporal and macrogeographic trends is consistent and its diverse nature is a strength of our paper.
In attacking our paper, Hill not only criticized our methodology, he also touted his methods as superior. Unfortunately, Hill has misrepresented the implications of his scheme for quantification of plumage coloration, and that makes it difficult for others to be certain of the hue of birds he has studied. Hill (1993b) assigns a composite plumage score to each male by summing values based on three characteristics: hue (the red–orange–yellow continuum), chroma (degree of color saturation, such that pink is a low chroma red), and tone (total reflectance). Because those birds vary in all three characteristics (Hill 1998), that method results in one numeric score that represents three distinct variables. Hill equates high composite color scores with increased brightness and redness. Although there may be a correlation here, highly chromatic orange males can have higher scores than low chroma red ones. Furthermore, brightness is a vague term that relates to both chroma, and tone, but not to hue, yet Hill stresses hue when he equates high color scores with red. Hill's (1998) own data on spectrophotometer output show weaknesses in his composite scores based on his visual assessments of hue, chroma and tone (as used in all of his prior papers). Hill (1998) reported that only his hue and chroma values were correlated with readings from the spectrophotometer. Although he pointed out that tone contributes less numerically to his composite color score than do hue and chroma, Hill's (1998) results show that his composite scores have a component—tone—that adds noise and is essentially a random variable. 
In recognition of that, Hill (1998) stated that “I find it relatively easy to assign a hue score to patches of feathers, more difficult to assign a saturation [chroma] score, and very tough to assign a tone score.” Hill's own data and perceptions thus agree with the near unanimity of our 22 judges and because they validate the reliability of color categorizations based on hue, they validate our methodology for categorizing colors.
We chose to focus on only one aspect of coloration—hue—because it is unknown how House Finches integrate hue, chroma, and tone and it is not even clear if those three axes of color variation are completely meaningful to birds given that the Munsell system is based on human perception. In addition, we knew that the historical data on color and recent data collected by other workers, all of which categorized birds as red, orange, or yellow, were based on hue alone. We do not dispute Hill's general findings that female House Finches prefer males with high composite color scores under his scheme, but it is not clear just what is important to females until Hill analyzes the effects of hue, chroma, and tone separately. Thus, conclusions by Hill such as “it appeared to be the red pigmentation of males and not a correlated character that the female House Finches were choosing” (1990) may be invalid. Hill's attempts to determine the cues females use for mate choice are further complicated by the fact that humans and birds perceive color differently and only the latter perceive ultraviolet light, as acknowledged by Hill (1998). Yet even Hill's (1998) spectrophotometric methodology does not involve UV reflectance, which may be a problem because yellow bird plumages often show some reflectance in the UV range (J. Endler pers. comm.). Perceptual differences between humans and birds do not affect our paper (Zahn and Rothstein 1999), as we focussed on differences in plumage reflectance with no assertion as to whether those differences are important to birds.
Besides attacking our methodology and advancing the primacy of his own methods, Hill (2001) presented data that he claims are counter to our findings. The data in Hill's Figure 1 have all been presented before. Hill categorizes his study sites as with and without pox, but presents no data on pox. He argues that birds from pox-free sites are not consistently brighter than birds exposed to pox. Because Hill's composite plumage score confounds three variables (whereas we dealt with just one, hue), comparisons between our data and his are difficult to interpret. For similar reasons, Hill's failure to find greater plumage color-score variation at pox sites is not easily interpreted. It is clear that there is a strong association between pox and increased variation when color is represented by hue only, as in our data. We tested for increased variation in recent years in two ways. First, we used a simple hue scoring system with yellow, orange, and red equaling 1, 2, and 3 respectively, which follows the system of Thompson et al. (1997) except that they called the intermediate category “mixed red and yellow/gold” instead of orange. With that system, our pre-1951 museum series for southern California had a mean score of 2.725 and a variance of 0.214. The post-1959 sample had a mean of 2.385 and a variance of 0.499, and was significantly more variable (F ratio = 2.332, P < 0.001). Because our data are not normally distributed, we also quantified plumage-color diversity by eschewing scores and instead calculating Shannon-Wiener diversity indices for the color classes (Zar 1999). Those indices were 0.2605 and 0.4228 for the pre-1951 and post-1959 samples, respectively. The latter had significantly (P < 0.0001, t = 4.487, df = 167) greater color diversity using Hutcheson's test (Zar 1999).
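The hue-score summary and the diversity index for the post-1959 sample can be recomputed from the color percentages. The Python sketch below is illustrative only: the names are hypothetical, the counts (40 red, 28 orange, 10 yellow of 78 males) are reconstructed from the reported percentages, and the use of the sample variance and of base-10 logarithms are assumptions chosen because they reproduce the reported figures.

```python
import math

# ASSUMPTION: counts reconstructed from 51.3% red, 35.9% orange and
# 12.8% yellow among the 78 post-1959 museum males.
counts = {"yellow": 10, "orange": 28, "red": 40}
scores = {"yellow": 1, "orange": 2, "red": 3}

n = sum(counts.values())
mean = sum(scores[color] * k for color, k in counts.items()) / n

# Sample variance (n - 1 in the denominator) of the hue scores.
variance = sum(k * (scores[color] - mean) ** 2
               for color, k in counts.items()) / (n - 1)

# Shannon-Wiener index of the three color classes, base-10 logarithms.
shannon = -sum((k / n) * math.log10(k / n) for k in counts.values())

# Variance ratio against the pre-1951 variance as reported in the text.
f_ratio = variance / 0.214

print(f"mean = {mean:.3f}, variance = {variance:.4f}")
print(f"Shannon-Wiener index = {shannon:.4f}, F ratio = {f_ratio:.2f}")
```

These reconstructed counts give a mean near 2.385, a variance near 0.4995, and an index near 0.4228, consistent with the post-1959 values reported above. The pre-1951 values are taken as reported rather than recomputed.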
Even if Hill had quantified plumage color in a more interpretable way relative to our assessment of hue, the data in his Figure 1 would have little value in assessing the validity of our results. As shown by our data for Santa Barbara County (Zahn and Rothstein 1999), there is considerable spatial and temporal heterogeneity in plumage color. Among our four sites, one had red males at 52.8% (n = 36 males) and another at only 3.2% (n = 31) in 1994. At the latter site in 1995, red males were at 23.1% (n = 13). The data we used in our temporal and spatial assessments came from numerous sites. All samples were collected over two or more years and many involved hundreds or thousands of birds. By contrast, Hill's data for all samples in his Figure 1 other than Michigan were each collected in a single month at a single site (two nearby sites in the case of New York) and had n values of only 7 to 81 males. The bottom line is that one cannot use a single small sample collected over a short time period in one season to categorize the coloration of House Finches within a region. Ironically, the charge that Hill applied to our museum data, namely that our samples have little global validity because they are clumped in time and space, applies instead to his own data.
Lastly, regarding Hill's data, we address his suggestion that because males at San Jose (a purported pox area) were as red as ones from the East (where pox is rare or absent), the link between pox and color is weakened. Other problems aside (such as Hill's color scoring scheme and the lack of global validity), we note that our data on temporal trends deal solely with southern California and none of our macrogeographic comparisons involve northern California, where San Jose is located. Because northern and southern California differ in many ways, we cannot assess Hill's data. We do not know the incidence of pox in northern California nor the degree of color variation and Hill's sparse data are of little help here.
Hill refers to our Discussion section as “Perhaps the weakest part of the paper …” In that section, we argued that differential dietary uptake of carotenoids is unlikely as a complete explanation for plumage color variation in House Finches. Instead, we recognized that carotenoid pigments in birds, or the precursors of those pigments, must come from the diet but suggested that pigments are not likely to be limiting in nature. We argued that variation in color is more likely to be related to ability to use ingested carotenoids. We further suggested that pox, through direct effects on uptake of carotenoids (such as pathogenic effects on the intestine), and general factors that reduce a bird's condition (e.g. diseases, ectoparasites) are the primary factors responsible for a bird's failure to become red. Hill strongly attacked our suggestions and in doing so gave insufficient weight to mounting evidence concerning carotenoid metabolism in birds (Olson and Owens 1998) and undue importance to his own feeding experiments with captive birds (Hill 1992).
We need not review the evidence concerning carotenoid metabolism, other than to state that it is widely recognized that carotenoids are naturally abundant in plants and that there are links between an animal's ability to use carotenoids and its condition (Hudon 1994, Olson and Owens 1998). For House Finches in particular, Thompson et al. (1997) showed that birds afflicted with pox during molt are more likely to grow non-red feathers than birds not afflicted. The latter study is clearly applicable to our findings indicating spatial and temporal links between pox and color, and it was done in our primary area of focus, southern California. Besides pox, Thompson et al. (1997) found that intense mite infestations during molt were also related to a decreased likelihood of growing red plumage. Hill briefly acknowledges that Thompson et al. (1997) showed that pox affects color, but, instead of admitting that this supports the conclusions in our paper, he argues that the effect does not mean that pox “is the primary or sole source of temporal or geographic variation” in plumage coloration. Of course, we never argued that it was. We merely argued that a bird's condition is likely to influence its coloration and that pox is one of a number of things that can depress condition.
In his early work (e.g. Hill 1992), Hill attributed all plumage color variation in House Finches to diet and differential foraging ability. Evidence for that viewpoint seems to come from three sources. As in a previous rebuttal (Hill 1994) to a critique of his differential foraging ability hypothesis by Hudon (1994), Hill (2001) cites the same single study (Slagsvold and Lifjeld 1985) showing that carotenoids are limiting for birds. But the species in that study is mainly a carnivore, whereas the House Finch is primarily herbivorous, and carotenoids are so widespread in plant matter that they may be limiting only for animals that are primarily carnivorous (Hudon 1994, Olson and Owens 1998). In a second line of putative evidence, bright male House Finches (under Hill's composite scoring scheme) provided more food for their offspring and therefore seemed to be better foragers than dull-colored males (Hill 1991). However, that result is consistent with both our condition hypothesis and Hill's foraging-ability hypothesis because birds in the best condition are likely to be the ones best able to feed both themselves and their offspring. In arguing for the importance of diet, Hill (2001) states that Hill and Montgomery (1994) “provided evidence that there are differences among males in access to nutritional resources during molt.” The latter paper showed that bright males grow feathers more quickly and begin to molt earlier than dull males. Although that result is consistent with bright males being better foragers for all aspects of food, including carotenoids, it is also consistent with such males simply being in better condition as regards all factors affecting condition, including disease. In that paper, Hill and Montgomery (1994) stated that “reduced plumage brightness of males in the drab Alviso population is a result either of reduced access to carotenoid pigments or of reduced ability to metabolize carotenoids (e.g. due to parasites or poor health).” Hill and Montgomery's suggestion that disease may be important in limiting a finch's ability to metabolize carotenoids agrees with our general conclusion yet strangely it is not acknowledged in Hill's (2001) critique.
The third apparent reason for Hill's defense of the importance of diet deals with his feeding experiments, which do indeed demonstrate a clear effect of diet on the coloration of captive House Finches. In those experiments, Hill showed that birds fed a special diet deficient in carotenoids molted into dull plumage. Birds fed the same diet but given a red carotenoid, canthaxanthin, grew bright red plumage. The first of those results was completely predictable, because all workers agree that carotenoid pigments in animals must come from dietary intake (see discussions in Hudon 1994 and Zahn and Rothstein 1999). The second result has no bearing on what occurs in nature as canthaxanthin is not present in the finches' diets nor is it the pigment responsible for their red color. Those results show only that people can control bird coloration by feeding them unnatural diets, a trick long known to zoo keepers. Those results do not show that the range of naturally occurring diets controls or even influences coloration. Even the first workers to use diet to manipulate coloration in captive House Finches, concluded that the dietary intake of carotenoids is necessary but “not completely sufficient to explain color variation in native birds” (Brush and Power 1976).
In questioning the primacy of diet, we noted that there were no major plant perturbations in California in the mid-1900s when House Finch coloration apparently shifted. Hill (2001) retorted that there were “massive changes in the biota starting around the turn of the century.” In fact, the major changes to California's flora that have displaced native plants that dominate the open habitats used by House Finches took place by the mid-1800s (Mensing 1998). Indeed, changes to open habitats occurred so early after the European colonization that there is even considerable controversy concerning the original nature of those habitats (Hamilton 1997).
Remarkably, after criticizing us for doubting his early diet-as-key-factor hypothesis, Hill's (2001) critique acknowledges “that a variety of factors combine to determine expression of carotenoid-based coloration” and that “degree of parasitism” is one of these factors. So Hill is free to modify his hypothesis, but without admitting that his initial diet hypothesis was overly simplistic, whereas we are not. Furthermore, the bottom line of Hill's critique is that after all the disparagement of our methods, logic, and data, he comes to the same general conclusion we reached, namely there is a “possible relationship” between pox and coloration after all.
We thank John Endler and Kathleen Whitney for their valuable comments on this manuscript.