Sauer et al. (2004) advocate the use of trend estimation models that adjust counts for differences among observers. We agree that such adjustments are sometimes needed, and we noted (Bart et al. 2003) that they may readily be carried out prior to using the estimation method we described. Including observer covariates, however, is not always necessary and substantially reduces precision, as Sauer et al. (2004) acknowledge. Furthermore, under plausible conditions, including observer covariates introduces bias rather than removing it. In addition, the weighting scheme used in the estimating-equations approach may introduce bias. Our method avoids these sources of bias, is simpler and more flexible than the estimating-equations approach (e.g., carrying out power and sample-size calculations is much easier with our approach), and has smaller standard errors than the estimating-equations approach, especially when counts fluctuate widely. Model-based methods, including the estimating-equations approach, also have advantages, particularly in assessing complex influences on the counts. We recommend that analysts consider both approaches; comparing results obtained with the different methods may be especially informative.
Trend Estimation with a Linear Model: Reply to Sauer et al.
Resumen. Sauer et al. (2004) recommend the use of trend-estimation models that adjust counts for differences among observers. We agree that such models can be useful, and we suggested that these adjustments can readily be incorporated before using the estimation method we described. We introduced our method because it is simpler and more flexible than the estimating-equations method (e.g., power and sample-size calculations are much easier with our method), and because ours performed better than the estimating-equations method when counts fluctuated widely. In addition, the weighting procedure used in the estimating-equations method may introduce bias, whereas the linear procedure we described is self-weighting and is not susceptible to this error. The estimating-equations method, however, also offers advantages, particularly its ability to handle complex models. We recommend that analysts consider both procedures; comparing the results obtained with the two methods may be particularly informative.
In their commentary, Sauer et al. (2004) acknowledge that our method of trend estimation (Bart et al. 2003) performed “in some cases much better than existing methods” but they are concerned that analysts using our approach might not adjust counts for observer differences, a step they view as essential for unbiased estimates. They also raise questions about the North American Breeding Bird Survey (BBS) and about design-based analytic methods in general. Our main responses, presented in detail below, are (1) we noted that analysts can make adjustments for observer effects, or other spurious influences, prior to employing our method; (2) such adjustments are sometimes useful, but they reduce precision and in some cases introduce, rather than remove, bias and should thus be undertaken cautiously rather than automatically; (3) we disagree with some of Sauer et al.'s (2004) comments about the BBS but note, more importantly, that our purpose was not to evaluate the BBS but rather to describe a general trend estimation method; and (4) their criticisms of design-based methods apply with equal force to the model-based approaches they favor.
Sauer et al. (2004) assert that analysts should always use methods that adjust counts for observer effects. We believe that these adjustments are valuable in some cases, and we noted (Bart et al. 2003) that they can be made prior to using the trend estimation method we described. For example, to adjust for observer effects one might carry out multiple regression, with observer covariates, and record the slope and mean from this regression rather than from the simple regression we used to illustrate our method. Other methods might be used to adjust the counts for weather influences, change in extraneous noise, or other spurious influences. Thus, our method can readily be combined with initial steps to adjust the counts in any way that the analyst deems suitable.
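As a concrete illustration of this pre-adjustment step, the sketch below fits an ordinary least-squares regression with an observer dummy variable and compares the resulting slope with the unadjusted one. The data, the observer change-point, and the effect sizes are all hypothetical, invented only to show the mechanics; this is not the exact regression of Bart et al. (2003).

```python
import numpy as np

# Hypothetical counts from one route surveyed by two observers over 10 years.
# Observer B (years 5-9) tends to detect more birds than observer A.
years = np.arange(10)
observer = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])  # 0 = A, 1 = B
rng = np.random.default_rng(1)
counts = 20 + 0.5 * years + 4.0 * observer + rng.normal(0, 1, 10)

# Simple regression (no observer covariate): the slope absorbs the
# jump in detection when observer B takes over.
X_simple = np.column_stack([np.ones(10), years])
slope_simple = np.linalg.lstsq(X_simple, counts, rcond=None)[0][1]

# Regression with an observer dummy: the year slope is now adjusted
# for the observer change, and can be carried into the trend method.
X_adj = np.column_stack([np.ones(10), years, observer])
slope_adj = np.linalg.lstsq(X_adj, counts, rcond=None)[0][1]

print(f"unadjusted slope: {slope_simple:.2f}")
print(f"observer-adjusted slope: {slope_adj:.2f}")
```

In this contrived example the unadjusted slope overstates the trend because the observer change coincides with the passage of time; the adjusted slope is recorded and then passed to the trend-estimation step.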
We do not agree that counts should always be adjusted, as they are in the estimating-equations approach (Link and Sauer 1994). The approach described by Link and Sauer (1994) includes the assumption that detection rates within observers show no long-term trends. But observers' abilities improve during the first several years they conduct surveys, and later in life the proportion of birds they detect declines as hearing and visual acuity decline. By calculating observer-specific trends, the Link and Sauer method confounds change in detection rate with change in population size. For example, if detection rates are increasing for a majority of the surveyors (due either to increasing skill or familiarity with the route), then the estimated trend will be positive even if population size is stable. If detection rates are decreasing, the trend will be negative even if the population is stable. These within-surveyor trends do not cause bias in the method we introduced. Our method, like any method based on an index, requires that there be no long-term trend in the “index ratio,” the ratio of the expected survey result to population size (Bart et al. 1997). That assumption, however, might readily be met. For example, even if most surveyors were early in their career, the proportion of observers with k years experience, k = 1, 2, 3, …, might be about the same each year due to the annual arrival of new surveyors and disappearance of previous participants. Thus, while long-term trend in average detection rates is a serious problem, the Link and Sauer (1994) method may be significantly biased even if no such trend occurs. In contrast, our method is essentially unbiased under that condition.
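The turnover argument above can be made concrete with a small simulation. The detection-rate function and the five-year service pattern below are assumptions chosen purely for illustration: each individual observer's detection rate rises every year, yet because retirees are replaced by novices the distribution of experience, and hence the survey-wide mean detection rate (the index ratio), is identical every year.

```python
import numpy as np

# Assumed form: detection rate rises with an observer's experience k.
def detection(k):
    return 0.5 + 0.05 * min(k, 5)

# Steady-state turnover: each year one new observer starts and the
# longest-serving observer retires, so experience levels are always 1..5.
n_years = 10
mean_rates = []
experience = np.arange(1, 6)           # one observer at each level
for year in range(n_years):
    mean_rates.append(np.mean([detection(k) for k in experience]))
    experience = experience + 1        # everyone gains a year of experience...
    experience[experience > 5] = 1     # ...and the retiree is replaced by a novice

# Every individual's detection rate trends upward, but the survey-wide
# mean detection rate shows no trend at all.
print(mean_rates)
```

Under this condition a method that estimates observer-specific trends attributes the within-observer improvement to population change, while an index-based method that requires only a stable index ratio is unaffected.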
A separate reason for concern about the method Link and Sauer advocate is worth mentioning. In their method, data from long-term observers are weighted more heavily than data from short-term observers. Long-term surveyors, however, probably have declining detection rates as noted above. Estimates for regions with a small number of observers, and a few long-term observers, thus may have negative bias. In our method, results are not weighted by number of years the surveyor participates (or any other measure of within-route precision).
It is also worth noting, as Sauer et al. (2004) acknowledge, that including observer covariates reduces precision. In simulations Sauer et al. (2004) describe, variances were almost twice as large when adjustments for observer covariates were included. Thus, including such adjustments, if they are not necessary, is costly.
In summary, adjusting counts with observer or other covariates reduces precision, is not always necessary, may introduce bias rather than remove it, and may need to be carried out in different ways with different data sets (e.g., correcting for observer differences and for weather effects may require quite different approaches). That is why we separated these two efforts in our description. We did not intend, however, to imply that counts should never be adjusted to remove spurious influences.
We now briefly respond to the other comments made by Sauer et al. (2004). Headings refer to their paper. Space limits preclude our addressing all of their comments.
We noted (Bart et al. 2003:367) that design-based methods “assume only that the sampling plan was followed and that the sample size is large enough to make inferences based on the central limit theorem.” Sauer et al. (2004) say that the first assumption is “crucial, and virtually never satisfied,” but this is incorrect. For example, in the BBS, the sampled population is the roadsides (not the entire region) and a well-defined sampling plan is used to select locations from this population. Sauer et al. (2004) also claim that the second part of our sentence is incorrect, noting that “no normality assumption is required for design-based analysis.” We did not say that normality is required for design-based methods. Our point, in fact, was just the opposite: under the central limit theorem the t-distribution may be used, if the sample size is large enough, regardless of the underlying distribution.
Finite population of sites
Sauer et al. (2004) argue that BBS “routes cannot be viewed as selections from a sample frame” because they “do not survey habitats >0.4 km from secondary roads” and they say that “Bart et al. (2003) regard BBS routes as a simple random sample”, apparently because we treated a data set, derived from the BBS and used to evaluate our trend estimation method, as a simple random sample. As noted above, the BBS is a (stratified) random sample from a well-defined sampling frame (roadsides). More to the point, however, we were not attempting to evaluate the BBS or make claims about how BBS data should be analyzed; we simply used this data set to construct a hypothetical population. We agree that trend estimates based on BBS data, for specified areas (as opposed to our hypothetical population), should be analyzed with methods for stratified sampling. The extension from simple random sampling to stratified sampling is straightforward in general and for our trend estimation method: calculate point and interval estimates for each stratum and combine them using stratum sizes as weights. This feature is incorporated into a comprehensive trend analysis program we are preparing for general distribution.
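The combine-across-strata step can be sketched as follows. The stratum estimates, standard errors, and sizes are hypothetical, and the sketch assumes strata are sampled independently so that variances add; it is a minimal illustration of the weighting, not the program mentioned above.

```python
import numpy as np

# Hypothetical per-stratum trend estimates (annual % change) with their
# standard errors, combined using stratum sizes (e.g., land area) as weights.
trends = np.array([1.2, -0.4, 0.8])      # point estimates by stratum
ses    = np.array([0.5, 0.3, 0.6])       # standard errors by stratum
sizes  = np.array([100.0, 250.0, 150.0]) # stratum sizes (weights)

w = sizes / sizes.sum()
trend_overall = np.sum(w * trends)
# Independent stratum samples: variances combine as a weighted sum.
se_overall = np.sqrt(np.sum(w**2 * ses**2))

ci = (trend_overall - 1.96 * se_overall, trend_overall + 1.96 * se_overall)
print(trend_overall, se_overall, ci)
```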
Measurable quantities at sites
Sauer et al. object to our definition of Yj as either (a) population size in year j, or (b) as the expected value of the survey result in year j, but we do not understand their concern. Our first case arises when, for example, subjects in a random sample of plots are enumerated so that density per plot yields an unbiased estimate of population density (and thus population size can be estimated). The second case arises with index methods in which, by definition, the detection rate is not estimated. We simply provide a definition and notation that accommodate either case.
Sauer et al. (2004) object to our focus on among-site variability as opposed to within-site variability, and they refer to the “documented confounding of such (within-site) variability with population trend.” Actually, such documentation is rare in the bird survey literature, and we believe they overlook the fact that within-site variability does not, by itself, cause any bias; only a long-term trend in the index ratio does. We agree, however, that such trends in the index ratio must be detected and removed or trend estimates will be biased. In some cases, Sauer et al.'s (2004) focus on observer detection rates and their method (Link and Sauer 1994) for removing trend due to this factor will be appropriate. In other cases, other factors or methods may be more appropriate. By separating adjustment of counts from estimation of the trend, we let analysts tailor this effort to the specific features of their data set.
Defining a parameter of interest
Sauer et al. (2004) question the accuracy of the approximation (Rlin) we used for our parameter of interest (Rexp), but they present an example in which the pattern of counts is distinctly U-shaped. Such data should not be analyzed using either our method or the estimating-equations approach, a point we emphasized in our report. We agree that deciding whether to use an exponential curve to describe trend is sometimes difficult and that better guidelines for making this decision might be useful. The lack of such guidelines, however, hinders decisions about whether to use the estimating-equations approach just as much as it affects whether to use our method.
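The relationship between the exponential-trend parameter and its linear approximation can be illustrated with a small sketch. The population sizes are hypothetical, and the two constructions below (slope of log abundance on year, versus linear slope scaled by mean abundance) are standard textbook forms rather than the exact estimators of Bart et al. (2003); the point is only that they agree closely when the trend is modest and roughly exponential.

```python
import numpy as np

# Hypothetical population sizes following a modest exponential decline.
years = np.arange(20)
N = 1000 * 0.98 ** years   # true annual rate of change: -2%

# Exponential description: slope of log(N) on year gives the annual rate.
b_exp = np.polyfit(years, np.log(N), 1)[0]
r_exp = np.exp(b_exp) - 1

# Linear approximation: slope of N on year, scaled by mean abundance.
b_lin = np.polyfit(years, N, 1)[0]
r_lin = b_lin / N.mean()

print(f"r_exp = {r_exp:.4f}, r_lin = {r_lin:.4f}")
```

For a strongly U-shaped series neither quantity describes the pattern of change, which is why such data should not be summarized by a single trend parameter under either approach.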
Sauer et al. (2004) carried out analyses using the estimating-equations approach with and without observer covariates and found consistent differences, and they argue that because we did not correct for observer effects, our approach is biased. But as noted above (a) analysts who want to include these adjustments can readily do so and then employ our method, and (b) it is not entirely clear which approach produces smaller bias because the “with observer covariates” approach confounds within-observer trends with trend in population size. We thus regard this analysis as unnecessary and inconclusive. We agree with their view that factors which might cause a long-term trend in the detection ratio should not be ignored.
Sauer et al. (2004) acknowledge that our method performed well but they are concerned that our population did not correspond perfectly to a real population. Our data set, however, was collected on real BBS routes and is far more realistic than the data sets others (e.g., Link and Sauer 1994) have used to evaluate model performance. They also point out again that we excluded observer effects. As we have stressed above, we regard adjustment of counts as an important, but separate, issue. Sauer et al. (2004) assert that the real question is how well the method will work when sampling is not design-based and trends are not linear. We disagree with this view. We did not (and would not) recommend our method except when a random sampling plan has been followed and trends are approximately linear. Sauer et al. (2004) then state that the estimating-equations approach does not estimate the parameter (Rexp) that we defined but rather “a precision- and abundance-weighted average of site-specific trends”. This definition, however, is too vague to be useful. Managers and researchers want to know how population size is changing. The parameter we estimate, Rexp (the annual rate of change of an exponential curve, fit to the population sizes), provides a clear description of how population size is changing. Sauer and colleagues need to explain more clearly how their parameter relates to change in population size.
Sauer et al. (2004) say again that we believe “BBS data can be effectively treated as a random sample of population sizes.” BBS locations are randomly selected from roadsides and thus can be treated as a random sample from this population. Questions do arise about spatial limits of the sampled population (i.e., how far from the road the surveyed area extends), and about the magnitude of selection bias (i.e., a difference between the roadside and regionwide trend), but these issues were not the focus of our report and are just as problematic if the estimating-equations approach is used. Furthermore, we do not view BBS counts as estimates of population sizes; they are an index. Sauer et al. (2004) also state that model-based adjustments are needed to compensate for uneven coverage through time, but we disagree. Such adjustments can also be applied using design-based methods, though this is usually not done due to concern that the adjustment may be correlated with the response variable. Sauer et al. (2004) note that the fraction of data missing in BBS data sets sometimes exceeds the fraction we used in the simulations. This is true, but they give no reason for believing results would have been different had a higher fraction been used. They note that addition of new routes can in some cases lead to bias. We agree, but this issue is distinct from estimating trend in the existing data set; both our method and the estimating-equations approach would need modification to remove bias due to this cause. Finally, they urge that reliable methods be used and note that many other trend-estimation methods have been developed. We recommend that analysts consider using these methods.
All of them, however, are complex, make assumptions that are difficult to evaluate, and are difficult for most users to implement, whereas our method is simple to understand and use, makes only limited assumptions, and can easily be used in combination with adjustments for observer differences or other influences. In addition, power and sample-size analyses are much easier using our method than using the other methods.
In conclusion, we reiterate that models which adjust counts for observer differences or other spurious influences are useful tools, especially for exploring complex interactions between observer effects, annual effects, and effects of environmental variables. These approaches, however, inevitably entail more assumptions than our approach, and when these assumptions are incorrect may result in larger bias than analysis of the uncorrected counts. We prefer an analytic strategy in which several approaches, making different assumptions, are available so that the most plausible assumptions for the particular data set may be identified and employed in the analysis. Our method facilitates this approach and, in addition, is simpler and outperforms the estimating-equations approach when counts fluctuate widely, as was true in the shorebird data set we investigated.