Controversy has sometimes arisen over whether there is a need to accommodate the limitations of survey design in estimating population change from the count data collected in bird surveys. Analyses of surveys such as the North American Breeding Bird Survey (BBS) can be quite complex; it is natural to ask if the complexity is necessary, or whether the statisticians have run amok. Bart et al. (2003) propose a very simple analysis involving nothing more complicated than simple linear regression, and contrast their approach with model-based procedures. We review the assumptions implicit in their proposed method, and document that these assumptions are unlikely to be valid for surveys such as the BBS. One fundamental limitation of a purely design-based approach is the absence of controls for factors that influence detection of birds at survey sites. We show that failure to model observer effects in survey data leads to substantial bias in estimation of population trends from BBS data for the 20 species that Bart et al. (2003) used as the basis of their simulations. Finally, we note that the simulations presented in Bart et al. (2003) do not provide a useful evaluation of their proposed method, nor do they provide a valid comparison to the estimating-equations alternative they consider.


Bart et al. (2003) describe a design-based approach for estimation of population change from count survey data. They promote their method as “simple, self-weighting, and versatile” (p. 371), and conduct simulations based on data from the North American Breeding Bird Survey (BBS) and the International Shorebird Survey. They contrast their estimator with an estimating-equations estimator, and suggest that their procedure is superior to the alternative approach. The estimation technique advocated by Bart et al. (2003) has serious conceptual and practical limitations; its use is likely to lead investigators to invalid conclusions about population change. In this note, we identify a few of these deficiencies, and suggest that users be cautious in implementing methods that stress simplicity while ignoring critical design issues of the survey and important features of the data. We show that the proposed analysis is based on unreasonable assumptions about the nature of count data, that the evaluation of the method is conducted under conditions that predispose it to be favored, and that the Bart et al. (2003) analyses provide biased estimates of population trends from BBS data.

### DESIGN-BASED ANALYSES

Bart et al. (2003) place great importance on the notion that their estimator is design-based, and briefly dismiss model-based analysis with suggestions that it is biased, complicated, and that “different methods perform best in different situations” (p. 367). They say that design-based estimators “assume only that the sampling plan was followed and that the sample size is large enough to make inferences based on the central limit theorem” (p. 367). The second half of this assertion is false: no normality assumption is required for design-based analysis. The first half is crucial, and virtually never satisfied by count surveys such as the BBS, upon which we focus our discussion. Design-based procedures assume that quantities measured at sample sites are used to estimate a population parameter, and that a random sampling scheme provides the basis for assessing the precision of the estimator. To meet these assumptions, the procedure must have the following features: (1) There is a well-defined finite population of sites. (2) The data are collected subject to a clearly defined scheme for random sampling of the sites. (3) Associated with each site there is a quantity that can be measured without error. (4) The population parameter to be estimated is defined as a function of the site-specific quantities. We address each of these points below.

#### Finite population of sites with clearly defined scheme for random sampling

Although there is an element of randomness in selection of starting points of BBS survey routes, the routes cannot be viewed as random selections from a sample frame. The most obvious deficiency is that BBS routes do not survey habitats >0.4 km from secondary roads. Any BBS sampling frame would be restricted to roadsides along a subset of roads. Even along roadsides, route selection procedures are not based on a random selection from a predetermined sampling frame of possible routes, and routes often cross other routes. Sampling intensity varies temporally and regionally, without regard to a preestablished design (Sauer, Fallon, and Johnson 2003). Finally, the definition of the BBS sample unit is vague, since counts cover an unknown area, and the area covered by a route undoubtedly differs due to quality of observers (Link and Sauer 1998a). BBS sampling methods cannot guarantee either a census or a known fixed area of sampling, facts well known to the originators of the survey (Robbins et al. 1986). Consequently, one cannot conceive of a “population of sites” from which BBS routes are sampled, except as an abstraction (i.e., as a model, Link and Sauer 1998a).

From their discussion and simulations, it is clear that Bart et al. (2003) regard BBS routes as a simple random sample from a finite population, sampled as replicates at the continental scale. A great deal of evidence, based on both the limitations of defining a sample frame at local scales and the need to impose regional strata, indicates that this is an oversimplification (e.g., James et al. 1996, Peterjohn et al. 1995, Link and Sauer 1998a, Sauer, Fallon, and Johnson 2003). As Bart et al. (2003) acknowledge, samples from many other surveys, such as the International Shorebird Survey, are even less appropriately viewed as random selections from sampling frames.

#### Measurable quantities at sites

Definition of counts at sites has been the source of considerable controversy in the analysis of count data, as the distinction between counts, indexes, and actual population sizes is often obscured. Bart et al. (2003:368) suggest that analysis of count survey data should begin with

“an estimate of, or an index to, population size *Y_{j}* in year *j*; and that any adjustments to account for change in detection rates (e.g., due to change in average observer ability) have been made. Methods for incorporating such adjustments into the trend analysis, rather than making them beforehand, will be discussed in a future paper. The *Y_{j}* may thus be regarded as the true population sizes, if survey methods permit an unbiased estimate, or more generally, as the expected values of the survey means in each year.”

This statement sets the stage for an inappropriate analysis. In the first sentence *Y_{j}* is the population size; an unnamed “estimate of, or index to” *Y_{j}* is to be used as the basis of trend analysis. In the third sentence, *Y_{j}* is used for the index itself, but with the assurance that it may “be regarded as the true population size.” The qualification that the index may be regarded “more generally, as the expected value of the survey means” is tautological. Are we interested in doing an analysis of trends in the expected values of the survey means, or trends in the animal populations?

The difference between indices, estimates, and true population sizes is neither superficial nor insignificant. It is simply incorrect to treat indices or estimates as true population sizes in trend analysis, without taking into account the manner in which (necessarily model-based) “adjustments” were made. Nor need the scientific community wait for further research to show how to simultaneously adjust data and perform trend analysis; there is already a substantial literature (James et al. 1996, Fewster et al. 2000, Link and Sauer 2002).

#### Site-specific variation

Bart et al. (2003) choose to overlook all variability except that which occurs among sites. Given the amount of within-site variability associated with BBS data, the availability of site-specific covariables associated with observer characteristics known to contribute to this variability, and the documented confounding of such variability with population trend, application of the method to BBS data would be irresponsible. A first step toward a more realistic description of site-specific variation would be to address measurement error (e.g., Fuller 1987); in our view the capacity for model-based analysis of site-specific variation is a necessary component of analysis of BBS data. Count surveys such as the BBS do not have detectability estimation as a component of the design, and the only way to accommodate factors that influence counts is through site- and time-specific covariables (Link and Sauer 1998a, Bennetts et al. 1999). We discuss this in more detail in a later section.
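The confounding of observer quality with population trend can be illustrated with a toy example: hold a route's population constant, let a more skilled observer take over partway through the series, and compare the trend estimated with and without an observer term. The sketch below uses hypothetical counts and stdlib Python only; the within-observer demeaning step is our illustrative device, not the BBS analysis itself.

```python
import statistics

# Hypothetical route: the population is constant, but a better observer
# takes over in year 5 and detects roughly 40% more birds.
counts = [20, 21, 19, 20, 20, 28, 29, 27, 28, 28]
observer = [0] * 5 + [1] * 5
years = list(range(10))

def slope(x, y):
    """Ordinary least-squares slope of y on x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)

# Naive trend, ignoring the observer change:
naive = slope(years, counts)

def demean_within(values, groups):
    """Subtract each group's mean (a fixed-effects / Frisch-Waugh step)."""
    means = {g: statistics.mean([v for v, gg in zip(values, groups) if gg == g])
             for g in set(groups)}
    return [v - means[g] for v, g in zip(values, groups)]

# Trend after controlling for observer identity:
adj = slope(demean_within(years, observer), demean_within(counts, observer))

print(f"trend ignoring observers:   {naive:+.2f} birds/yr")
print(f"trend controlling observer: {adj:+.2f} birds/yr")
```

Within each observer's tenure the counts are flat, so controlling for observer identity removes nearly all of the apparent increase; ignoring it converts an observer change into a spurious positive “trend.”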

#### Defining a parameter of interest

Bart et al. (2003: 368) define trend as follows:

“We assume that a scatterplot of the true population sizes, plotted against time, would reveal a pattern that is well described using an exponential curve. We do not assume the true population sizes fall on this exponential curve (this would be an assumption typical of model-based approaches), only that the exponential curve would describe the pattern in a useful way.”

This statement is misleading and vague. First, one can hardly say that assuming the population sizes fall precisely on an exponential curve is typical of model-based approaches. Second, the assumption as stated is vague: does it mean that

*Y_{j}* = αβ^{j} + ϵ_{j},

where ϵ_{j} are independent and identically distributed mean-zero variables? Or are the errors assumed to be additive on the log scale, viz.,

ln(*Y_{j}*) = ln(α) + *j* ln(β) + ϵ_{j}?

In either case, are we to assume homoscedasticity? The implicit answer seems to be that it does not matter. There seems to be no way to judge, either: the exponential curve is merely a “useful” description. A more precise definition of usefulness is needed.
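The practical force of this objection is easy to demonstrate: fitting the same counts under the two error structures can give noticeably different annual rates of change. A sketch with hypothetical counts (stdlib Python; the ratio 1 + b/a used to summarize the count-scale fit is our own convenience, not a formula from Bart et al.):

```python
import math

# Hypothetical counts; the final year is unusually high.
counts = [50, 55, 48, 60, 58, 66, 70, 64, 75, 150]
years = list(range(10))

def ols(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((a - mx) * (c - my) for a, c in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - b * mx, b

# Errors additive on the count scale: Y_j = alpha + beta*j + eps_j.
a_lin, b_lin = ols(years, counts)
ratio_linear = 1 + b_lin / a_lin          # annual change relative to the year-0 fit

# Errors additive on the log scale: ln(Y_j) = ln(alpha) + j*ln(beta) + eps_j.
a_log, b_log = ols(years, [math.log(c) for c in counts])
ratio_log = math.exp(b_log)               # annual multiplicative change

print(f"count-scale fit: {ratio_linear:.4f} per year")
print(f"log-scale fit:   {ratio_log:.4f} per year")
```

The high final count dominates the count-scale fit but is damped on the log scale, so the two assumed error structures yield clearly different annual rates: the choice is not a technicality.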

At any rate, none of this ambiguity is necessary. From their simulation study, it is apparent that Bart et al. (2003) define *R_{exp}* by β*, where {α*, β*} is the minimizer of

∑_{j} (*Y_{j}* − αβ^{j})²,

using *Y_{j}* to denote a true population total in year *j*.

It is important to note that Bart et al. (2003) choose *R_{exp}* as their parameter, but then estimate a different quantity. The quantity they estimate is *R_{lin}*, defined by *b**, where {*a**, *b**} is the minimizer of

∑_{j} (*Y_{j}* − *a*[1 + (*b* − 1)*j*])².

*b** may often closely approximate the parameter β, as demonstrated in Bart et al.'s (2003) Table 1. (We note that Table 1 was computed assuming population sizes falling precisely on the linear curve, a condition which favors the approximation.) On the other hand, it is possible that the approximation will not be so close: for example, the collection of population sizes {1207, 1009, 251, 512, 655, 356, 377, 469, 556, 389, 659, 266, 673, 477, 871, 609, 743, 1074, 1064, 1150} has *R_{exp}* = 1.0218 and *R_{lin}* = 1.0321. It might be suggested for these data that the exponential curve is not a “useful” description of the pattern. However, usefulness cannot be defined, and discrepancy from the posited pattern cannot be measured, without resort to a model-based analysis.
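For readers who wish to verify such discrepancies, the two least-squares fits are straightforward to compute. The sketch below (stdlib Python) uses a grid search with a closed-form profile for α on the exponential fit, and summarizes the straight-line fit as 1 + b*/a*; both are our implementation conveniences rather than Bart et al.'s exact computations, so the resulting values need not match the quoted figures exactly.

```python
# Population sizes from the example in the text.
Y = [1207, 1009, 251, 512, 655, 356, 377, 469, 556, 389,
     659, 266, 673, 477, 871, 609, 743, 1074, 1064, 1150]
years = range(1, len(Y) + 1)

# Exponential fit Y_j ~ alpha * beta**j: for fixed beta the least-squares
# alpha has a closed form, so a fine grid search over beta suffices.
def exp_sse(beta):
    x = [beta ** j for j in years]
    alpha = sum(xi * yi for xi, yi in zip(x, Y)) / sum(xi * xi for xi in x)
    return sum((yi - alpha * xi) ** 2 for xi, yi in zip(x, Y))

betas = [0.95 + 0.0001 * k for k in range(1001)]   # search 0.95 .. 1.05
R_exp = min(betas, key=exp_sse)

# Straight-line fit Y_j ~ a + b*j, re-expressed as an annual ratio.
n = len(Y)
mj, mY = sum(years) / n, sum(Y) / n
b = sum((j - mj) * (y - mY) for j, y in zip(years, Y)) / sum((j - mj) ** 2 for j in years)
a = mY - b * mj
R_lin = 1 + b / a

print(f"R_exp ~ {R_exp:.4f}, R_lin ~ {R_lin:.4f}")  # the two parameters disagree
```

Running this confirms the qualitative point: the exponential and linear fits to the same series imply different annual rates of change, so the choice of parameter matters.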

Our experience has been that much pointless discussion on the topic of trend analysis arises due to a failure to begin by defining just what trend is. The methods described by Bart et al. (2003) have the virtue of being based on a precise (though unclearly articulated) notion of trend, but we argue that analytical methods should be based on the definition, rather than an approximation. We suggest that a better definition of trend is the geometric mean rate of change over a particular interval: this definition does not require qualitative assessments as to the usefulness of exponential or linear patterns, and seems to closely approximate the informal usage of the term “trend.” See Link and Sauer (1998a) for a discussion of this definition of trend.
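For a series of annual population sizes *Y_1*, …, *Y_T*, the geometric mean rate of change is simply (*Y_T*/*Y_1*)^{1/(T−1)}, the constant annual ratio that carries the population from its initial to its final size. (In practice, as in Link and Sauer 1998a, the endpoints would be modeled yearly indices rather than raw counts; the raw-count version below is only a sketch.)

```python
def geometric_mean_trend(Y):
    """Geometric mean annual rate of change over the series:
    the constant ratio r satisfying Y[0] * r**(T - 1) == Y[-1]."""
    T = len(Y)
    return (Y[-1] / Y[0]) ** (1 / (T - 1))

# A hypothetical population that doubles over 11 years has
# trend 2**(1/10), about 1.072 per year, whatever the in-between
# pattern looks like:
print(geometric_mean_trend([100, 130, 90, 140, 150, 120, 160, 170, 150, 180, 200]))
```

The definition depends only on the interval's endpoints, so it needs no judgment about whether an exponential or linear curve “usefully” describes the intervening pattern.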

### HISTORICAL PERSPECTIVE

We view the Bart et al. (2003) approach as motivated by the same general design considerations that guided earlier investigations of count survey data, but as failing to adopt the lessons learned by applications of those methods. The innovation claimed by Bart et al. (2003) is that site-specific trends can be aggregated in a design-based framework to estimate an overall trend. Geissler and Noon (1981) and Geissler (1984) used similar ratio estimators of site-specific population change in estimation of a composite change as an average of site-specific change. These early efforts and a number of subsequent developments (Geissler and Sauer 1990, James et al. 1990, Link and Sauer 1994) shared the notion of averaging site-specific estimates, but were generally complicated by two contentious technical issues: (1) the appropriate way to estimate change for each route in the face of needed covariates and model-fitting problems (Link and Sauer 1997a), and (2) the need to weight each route to accommodate detectability and consistency of survey coverage. Bart et al. (2003) dismiss these issues as either model-based complications or as items requiring future innovations. These issues are critical components of the analysis; for example, observer covariates are an absolute necessity to avoid bias in trend estimates (e.g., Sauer et al. 1994, Link and Sauer 1998b). In the next section, we specify the consequences of omitting observer covariates.

### SITE-SPECIFIC COVARIATES

The need for site-specific covariates for factors that influence counting efficiency is well established for the BBS and most count-based surveys. In the BBS, Sauer et al. (1994) clearly documented the bias in estimation associated with the failure to include observer information as covariates in BBS analysis. Kendall et al. (1996) documented the presence of further observer effects associated with the first year of counting on routes. Link and Sauer (1998b) explicitly modeled a “new observer” effect that expresses the increase in counts associated with improvement in observer quality over time in the BBS. James et al. (1996) included observer covariates in their semiparametric analysis of BBS data. A purely design-based estimation procedure cannot accommodate observer covariates. Here, we illustrate the consequences of their omission. Bart et al. (2003) analyze data from 20 species in the BBS. Although they label the analysis “true trend,” it is actually a complete analysis of a subset of the larger BBS dataset, which, even in its entirety, would not provide a true trend at the population level. We analyzed these same species for the same time period using all available data, but following the estimating-equations approach described in Link and Sauer (1994). Note that this is not the estimating-equations procedure followed by Bart et al. (2003; see below). We performed our analyses with and without controls for observer effects, and predicted that omission of observer effects would lead to more positive trend estimates overall, as documented in the literature (Sauer et al. 1994, Link and Sauer 1998a). Letting *EEW* denote the estimating-equations estimator *with* observer effects, and *EEWO* the estimating-equations estimator *without* observer effects, we conducted paired *t*-tests and found results consistent with our prediction (mean difference *EEWO* − *EEW* = 0.20% per year; one-sided paired *t*-test, *t*_{19} = 2.05, *P* = 0.03). We believe that the Bart et al. (2003) analysis, as implemented in this paper for BBS data, will lead to biased estimates because of its failure to incorporate observer effects.
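The paired comparison reported above is elementary to reproduce on any set of matched trend estimates. A sketch with hypothetical per-species values (not the actual 20-species BBS estimates), using stdlib Python:

```python
import math
import statistics

# Hypothetical per-species trend estimates (% per year); EEWO omits
# observer effects, EEW includes them.  Not the real BBS values.
eewo = [0.8, -0.2, 1.5, 0.4, 2.1, -1.0, 0.6, 1.2]
eew  = [0.6, -0.5, 1.4, 0.1, 1.9, -1.1, 0.3, 1.0]

d = [x - y for x, y in zip(eewo, eew)]   # paired differences
n = len(d)
t = statistics.mean(d) / (statistics.stdev(d) / math.sqrt(n))

# One-sided test of H0: mean difference <= 0 at alpha = 0.05; the
# critical value for t with n - 1 = 7 df is about 1.895.
print(f"t = {t:.3f}; reject H0: {t > 1.895}")   # t comes out near 7.2 here
```

The test is on the per-species differences, so each species serves as its own control; with the real estimates one would use 20 pairs and 19 degrees of freedom, as reported above.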

We also note that avoiding the bias due to ignoring observer effects comes at an unavoidable cost: adding controls for observers diminishes precision. In the present case, variances from estimates with observer effects were almost twice as large as variances from estimates without observer effects. It is, however, a false economy to suppose that one may use an unrealistically small estimate of precision in making inferences about population trends, or in planning future studies. It is sometimes true that biased estimates have smaller mean squared error than corresponding unbiased estimates: there is a trade-off between bias and precision. However, it is also sometimes true that biases lead to incorrect inferences. Conclusions drawn from monitoring programs and associated analyses must withstand scrutiny. Systematic bias is a fatal flaw, unless demonstrably of inconsequential magnitude.
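The trade-off described here can be made concrete with a small simulation: a biased, lower-variance estimator can beat the unbiased one in mean squared error when the bias is small relative to the truth, and lose badly otherwise. A sketch with hypothetical numbers (stdlib Python):

```python
import random
import statistics

def empirical_mse(true_value, estimator, reps=20_000, n=10, sd=2.0, seed=1):
    """Monte Carlo MSE of estimator(sample mean) for Normal(true_value, sd) samples."""
    rng = random.Random(seed)
    sq_errs = []
    for _ in range(reps):
        xbar = statistics.mean(rng.gauss(true_value, sd) for _ in range(n))
        sq_errs.append((estimator(xbar) - true_value) ** 2)
    return statistics.mean(sq_errs)

def unbiased(x):
    return x

def shrunk(x):
    return 0.5 * x   # biased toward zero, but with one quarter the variance

# When the true trend is small, the biased estimator has smaller MSE:
print(empirical_mse(1.0, shrunk), "<", empirical_mse(1.0, unbiased))
# When the true trend is large, the bias dominates and inference fails:
print(empirical_mse(4.0, shrunk), ">", empirical_mse(4.0, unbiased))
```

The second scenario is the one that matters for monitoring: a systematic bias of unknown size cannot be traded away for precision, because one never knows which regime applies.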

### SIMULATIONS

Many practitioners, seeing the simulation results presented in Bart et al. (2003), would be convinced that the method they present is at least as good as other published works, and in some cases much better than existing methods. We believe that deficiencies of the simulations limit their usefulness.

First, the “data sets” considered (and treated as hypothetical populations) correspond to no real populations, despite Bart et al.'s desire to “make maximum use of real data” (p. 369). As noted previously, BBS data include a large component of variability due to observers. This, and other sources of variation that could well be confounded with population change, have been ignored. Bart et al. (2003) have simply defined away many of the possible problems that necessitate a model-based analysis. In our view, so narrow a specification of what is to be estimated is not relevant to real survey data.

The satisfactory performance of Bart et al.'s estimator is simply a consequence of the design-based sampling in their simulation, and the satisfactory approximation of *R_{exp}* by *R_{lin}* in the hypothetical populations. The relevant questions to be addressed are whether the design-based estimator will work when sampling is not design based and when nonlinear patterns of change exist, but the Bart et al. simulations provide no information on these questions. In addition, to apply Bart et al.'s proposal to the BBS, the question must be addressed as to whether observer effects can be overlooked. None of these questions has been addressed.

Using our *EEWO* (estimating-equations estimator *without* observer effects) and *EEW* (estimating-equations estimator *with* observer effects) results described above, we compared our results to the Bart et al. (2003) results. We predicted that in the absence of controls for observer effects our results would correspond to those of Bart et al. (2003), but that with observer effects controlled for, the trend estimates would be lower. Letting *D* denote Bart et al.'s design-based estimator, we conducted paired *t*-tests, finding *P*-values of 0.48 for comparison of *D* and *EEWO*, and 0.03 for comparison of *D* and *EEW*, consistent with our predictions.

The simulations presented by Bart et al. (2003) present an alternative estimating-equations estimator in a rather poor light. The comparison is unfair: the estimating-equations estimator does not estimate *R_{exp}*, but rather a precision- and abundance-weighted average of site-specific trends. The performance of this estimator of population trend must depend on the definition of population trend, and on the appropriateness of the weights applied. We cannot comment on the weights chosen by Bart et al. (2003), except to note that they are not the same as those described in Link and Sauer (1994) and Geissler and Sauer (1990), appearances notwithstanding. Although it is not our intent to defend the estimating-equations procedures discussed in Bart et al. (2003) or those presented by Link and Sauer (1994), we direct readers to other simulations (Thomas 1996) and comparisons (Link and Sauer 1996, Sauer, Hines, and Fallon 2003) of analyses based on estimating equations and other route-regression approaches, which we believe more satisfactorily evaluate the performance of alternative methodologies.

### DISCUSSION

Analyses of count data can be controversial, and a great deal of caution is needed to avoid weakening our credibility as scientists and managers by making unwarranted statements based on flawed analyses. Simplicity is a great virtue in analysis of survey data, but, as the comment attributed to Einstein says, “Things should be made as simple as possible—but no simpler.” The risk is that excessive simplicity may compromise the credibility of results obtained from count survey data. In our view, the notion that BBS data can be effectively treated as a random sample of population sizes is wrong, and the primary failure of the Bart et al. (2003) approach is that it perpetuates this view by ignoring important features of the analysis and by structuring simulations of BBS data as though the counts were actual populations. Their approach also ignores the lessons from the history of the survey. The original conception of the BBS was that of a design-based survey, but this view was abandoned when it became apparent that model-based adjustments were needed to accommodate uneven survey coverage and covariates that influence counts. While it is useful to occasionally evaluate assumptions and to refine analyses, new analyses should not ignore features that have been shown to be important in past analyses. Although we have focused our discussion on the BBS, we note that even greater constraints exist on analysis of other continental-scale surveys such as the Christmas Bird Count (Link and Sauer 1999) and the International Shorebird Survey (Howe et al. 1989).

Modern approaches to the analysis of BBS data reflect the necessity of accommodating the large changes in number and consistency of routes surveyed over time, and attempts to analyze or simulate the survey must appropriately incorporate the variation induced by these logistical constraints. Sauer, Fallon, and Johnson (2003, their table 4) document that the amount of missing data in the BBS varies regionally, and often exceeds that considered by Bart et al. (2003). The BBS database is characterized by constant addition of new survey routes, which adds a further challenge for analysts. Pattern in mean counts can be induced by adding new survey routes, and this observation has been used as evidence of the failure of simple design-based approaches to analysis of BBS data (Geissler and Noon 1981, James et al. 1990). Rather than develop methods that ignore these complications, a need exists to inform users about appropriate analyses and to identify situations when simple approaches are inadequate. With increased availability of BBS data via the Internet, there is increased risk of misuse of the information due to simplistic analyses. Clearly, a need exists for more extensive metadata associated with the survey to guide users to appropriate analyses, and the metadata provided by Sauer, Hines, and Fallon (2003) are an initial attempt to address some of these issues.
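The induced-pattern problem is easy to reproduce: if newly added routes tend to enter the sample with lower counts, the mean count declines over time even though every route's population is constant. A small simulation with hypothetical values:

```python
import statistics

# Each route has a constant true count; no population is changing.
# Three established routes (high-count habitat) are surveyed in all
# years; new, lower-count routes are added as the survey grows.
YEARS = 10
established = [40.0, 35.0, 30.0]             # surveyed every year
added = {2: 12.0, 4: 10.0, 6: 8.0, 8: 6.0}   # year added -> constant count

mean_count = []
for year in range(YEARS):
    active = established + [c for y0, c in added.items() if year >= y0]
    mean_count.append(statistics.mean(active))

print([round(m, 2) for m in mean_count])
# The mean declines purely because the sample composition changes:
# a spurious "trend" with every population held constant.
```

An analysis that treats the yearly mean count as an index of abundance would report a steep decline here; an analysis that models route-level change, or otherwise accounts for the changing sample, would not.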

Although this commentary has focused on the technical aspects of trend estimation, we also note that implementation of a procedure such as that proposed by Bart et al. (2003) has strategic implications for bird conservation. Increasing information needs for management and increased sophistication of analysis methods provide an opportunity to better integrate monitoring data with scientific and management activities such as adaptive resource management (Ruth et al. 2003). Analyses such as that described by Bart et al. (2003) step away from these opportunities by rejecting model-based approaches that allow direct modeling of the influence of environmental variables on counts and by focusing on the very limited goal of trend analysis. We agree with James et al. (1996) on the limits of use of trend information in science and management, and suggest that any modern analysis of BBS or International Shorebird Survey data should have the capability of directly modeling both more-general aspects of population change and covariates that influence detectability and population size.

We suggest that investigators interested in estimating population change from the BBS or other count-based bird surveys use one of the many approaches that accommodate the constraints of the surveys. Examples of these methods include hierarchical models (Link and Sauer 2002), overdispersed Poisson models (Link and Sauer 1997b), and generalized additive models (James et al. 1996, Fewster et al. 2000), as well as more traditional approaches (e.g., Sauer and Droege 1990). Most of these analyses are readily available using computer programs or Internet-based programs; estimating-equations and generalized-additive-model estimation approaches are presently available on the BBS Analysis and Summary Internet site (Sauer, Hines, and Fallon 2003).

## Acknowledgments

We thank J. D. Nichols, G. W. Pendleton, and an anonymous reviewer for comments on the manuscript.

## LITERATURE CITED

*In* J. R. Sauer and S. Droege [eds.], Survey designs and statistical methods for the estimation of avian population trends. USDI Fish and Wildlife Service Biological Report 90(1), Washington, DC.

*In* J. R. Sauer and S. Droege [eds.], Survey designs and statistical methods for the estimation of avian population trends. USDI Fish and Wildlife Service Biological Report 90(1), Washington, DC.

*In* T. E. Martin and D. Finch [eds.], Ecology and management of Neotropical migrant birds. Cambridge University Press, New York.