Ruffed grouse Bonasa umbellus populations in North America have declined as forests have matured and the extent of early successional forest habitat required by the species has diminished. When wildlife species decline because of habitat loss, determining where to focus habitat management efforts is difficult because both the wildlife population and the required habitat(s) are usually limited in distribution. We adopted a relatively new ecological modeling method, partitioned Mahalanobis D2, which allowed us to predict the distribution of potential ruffed grouse habitat across a landscape of management concern where high quality habitat was uncommon. We used presence data derived from radio-telemetry locations, and GIS habitat data from publicly available sources to create competing partitioned Mahalanobis D2 models. The competing models identified important habitat variables and predicted ruffed grouse habitat distribution at 1-ha and 25-ha scales in southwestern Rhode Island, USA. The 1- and 25-ha models produced comparable overall classification accuracy (83.1% and 81.4%, respectively) but differed substantially in the area of predicted habitat (4,475.5 ha and 10,133.8 ha, respectively). We selected the more conservative 1-ha model as the ‘best’ model, and expanded it to a larger landscape extent. Once expanded, the model predicted 11,463 ha (15.5% of total land area) of potential ruffed grouse habitat for a 735-km2 landscape in southwestern Rhode Island. This model identified those areas with varying proximities to the following features as likely to contain ruffed grouse habitat: early successional forests, river and stream corridors, mixed conifer forests, conifer forests, shrub wetlands and deciduous forests. Early successional forests were the most consistent component of habitat used by grouse, despite the fact that this habitat type was uncommon in our study area (< 1% of total land area). Our model can be used to identify areas of existing ruffed grouse habitat for management focus.
The ruffed grouse Bonasa umbellus (hereafter grouse) is a popular North American game bird whose population has declined >50% range-wide over the last 40 years (Butcher & Niven 2007), most notably in the eastern USA (Rusch et al. 2000, Dessecker & McAuley 2001). Grouse depend on deciduous forests in the first stages of woody succession following disturbance (Bump et al. 1947, Rusch et al. 2000, Dessecker & McAuley 2001). Current land management practices in the north-eastern USA minimize or eliminate several forms of natural disturbance, such as wildfire and American beaver Castor canadensis, which were historically important for maintaining early successional habitat (Askins 2000, Lorimer & White 2003). Most current anthropogenic disturbances (e.g. housing development) do not typically allow forests to regenerate, and thus early successional habitat is becoming increasingly rare in the region (Brooks 2003). The generally accepted hypothesis for the observed grouse population decline is a concomitant decline in the availability of early successional forests (Dessecker & McAuley 2001). This hypothesis is supported by the observation that numerous other species that require similar early successional habitat also are declining in the region. (Askins 2000, Litvaitis 2001, Fuller & DeStefano 2003).
To effectively assess conservation options, biologists require information on the distribution of species' habitat at scales relevant to management goals (Scott et al. 2002). However, when wildlife species decline because of habitat loss, determining where to focus management efforts may be difficult because the species and its habitat may be limited in distribution. Grouse are known to use multiple early successional age classes (Bump et al. 1947, Gullion 1984b, Rusch et al. 2000), where each age class provides different structure required for various life history stages (e.g. open understories in pole-stage stands for nesting vs dense understories in saplingstage stands for brood rearing; Gullion 1984a, Rusch et al. 2000). When multiple seral stages are unavailable, other habitats may serve as surrogates to provide the structural diversity required by grouse. For example, grouse in Rhode Island commonly select forests with pitch pine Pinus rigida and scrub oak Quercus ilicifolia, presumably because they provide high woody stem density, which is important year-round for cover (Endrulat et al. 2005). InPennsylvania, however, these habitats were avoided in an area where a greater amount of saplingstage forest was available (Scott et al. 1998). Thus, surrogate habitats and their orientation on the landscape are likely important determinants of grouse distribution in areas where high quality early successional habitat is rare.
Although methods exist to assess grouse habitat quality based on site-level habitat surveys (e.g. Cade & Sousa 1985), these are only applicable at relatively small scales and for optimum habitat. In contrast, we required information on the distribution of potential grouse habitat for a landscape where high quality habitat was uncommon and multiple surrogate habitats were available and known to be used by grouse. Multivariate statistical models that use geographic information systems (GIS) data are effective tools to estimate the probability of habitat occurrence (Guisan & Zimmermann 2000, Scott et al. 2002, Beissinger et al. 2006). So-called species distribution models (SDM) vary substantially in methodology (for reviews, see Guisan & Zimmerman 2000, Scott et al. 2002, Guisan & Thuiller 2005, Beissinger et al. 2006, Elith et al. 2006), but all rely on the concept of ecological niche (Guisan & Zimmermann 2000, Guisan & Thuiller 2005). An implicit assumption in all SDMs is that the habitat characteristics used to construct the model can adequately characterize this environmental niche (Browning et al. 2005). A second related assumption is that the species being modeled is in pseudo-equilibrium with its environment, an assumption that is necessary to project models built using spatially- and temporally-limited data to larger scales (Guisan & Thuiller 2005). Many SDMs require information regarding species presence and absence, but there are many situations where defining true absences can be difficult, especially when dealing with small populations. Reasons for failure to detect species presence include inconspicuous individuals, inadequate survey effort and/or design, suitable but unoccupied habitat, and truly unsuitable habitat (Hirzel et al. 2001, 2002, Rotenberry et al. 2002, Grahm et al. 2004). These concerns may be especially relevant for grouse, which can be difficult to reliably detect using standard survey protocols (Zimmerman & Gutiérrez 2007).
Our main objective was to predict the distribution of potential grouse habitat in a landscape of management concern in order to inform management authorities and support future research. We adopted a relatively new ecological modeling technique, partitioned Mahalanobis D2 (hereafter partitioned D2) that has been used to predict habitat distribution for rare wildlife species given only data on species presence (Rotenberry et al. 2002, Browning et al. 2005, Rotenberry et al. 2006, Watrous et al. 2006). Partitioned D2 is a modification of the Mahalanobis distance method, which has been widely applied (Clark et al. 1993, Farber & Kadmon 2003, Tsoar et al. 2007, Alloche et al. 2008). Partitioned D2 differs from the Mahalanobis method in that it partitions the full Mahalanobis D2 into separate components that represent independent environmental relationships, the most consistent of which define the species minimum habitat requirements (Rotenberry et al. 2006). The procedure involves conducting a principle components analysis (PCA), which identifies habitat relationships based on the multivariate mean and variance of environmental variables measured at locations where the species is present (Browning et al. 2005, Rotenberry et al. 2006). Unlike traditional interpretation of PCA, partitioned D2 focuses on the lowest principle components, which describe the most consistent patterns in the species' habitat use. Environmental variables that load highly on the lowest components are those variables that consistently occur where the species is found, and in that sense represent the species' minimum habitat requirements (Browning et al. 2005, Rotenberry et al. 2006). Thus, locations with unknown occupancy that contain the minimum habitat requirements are assumed to have a high probability of providing habitat for the species (Rotenberry et al. 2006). Partitioned D2 shares the assumptions inherent to all distribution models as described above, and also assumes that the species habitat can be described in terms of multivariate means and variance (Browning et al. 2005, Rotenberry et al. 2006). Additionally, partitioned D2 assumes that the distribution of minimum habitat requirements limits the species' distribution (Rotenberry et al. 2006). We used partitioned D2 because it allowed us to predict grouse habitat distribution in southwestern Rhode Island using presence-only data in a situation where reliable absence data were unavailable.
Our study was conducted in the Arcadia Management Area (hereafter Arcadia) and surrounding private lands in Washington County, Rhode Island, USA (41°32′N, 71°43′W) (Fig. 1). Arcadia is managed by the Rhode Island Department of Environmental Management as a multiple use recreation area. We selected this site so we could build on previous grouse research in the area (Endrulat et al. 2005) that was conducted as part of the Appalachian Cooperative Grouse Research Project (ACGRP; Norman et al. 2004). Additionally, the site is considered an area of management concern because of recently observed declines in grouse populations (Fig. 2; Tefft 1999, 2007). We used raster GIS data, which is most easily analyzed in square dimensions, to define our study area as a 16,900 ha rectangle that fully encompassed all portions of Arcadia (see Fig. 1). Arcadia covered 6,604 ha of this area, and the remainder consisted of private lands (10,296 ha).
The dominant forest type in Rhode Island, historically, was oak Quercus spp. and chestnut Castanea dentata forest (Butler & Wharton 2002). Like most of southern New England, Rhode Island was almost completely cleared of forests for agriculture and fuel wood by the dawn of the American industrial revolution (Butler & Wharton 2002). Around the turn of the 20th century, chestnut blight eliminated the remaining mature chestnuts (Russell 1987), which changed the dominant forest composition from oak-chestnut to oak-hickory Carya spp. (Butler & Wharton 2002). Forest regeneration in abandoned agricultural areas resulted in a dramatic increase in early successional habitat during the early to mid-20th century (Lorimer 2001, Brooks 2003), but by the end of the century, mature second-growth forests covered most of the undeveloped land in the state (Butler & Wharton 2002), and early successional habitat was less common than pre-settlement levels (Brooks 2003). Our study area is representative of southern New England forests in that when it was last surveyed (1995), the study area was ∼78% forested, and 55% of the total land cover was second-growth deciduous forest (Rhode Island Geographic Information Systems (RIGIS): http://www.edc.uri.edu/rigis) dominated by mature red oak Q. rubra, white oak Q. bicolor, beech Fagus grandifolia and hickory.
Input data used to construct partitioned D2 models
Grouse location data
Past applications of partitioned D2 have typically relied on presence data derived from point counts (Rotenberry et al. 2002, Rotenberry et al. 2006) or discrete landscape features used by one or more individuals (Browning et al. 2005, Watrous et al. 2006). Standard roadside surveys conducted during the breeding season when male grouse display and are most conspicuous routinely fail to detect individuals (Zimmerman & Gutiérrez 2007), and do not account for breeding habitat used by females. We therefore used radio-telemetry to define areas with known grouse presence in our study area.
We captured grouse in Arcadia during 1999–2001 using cloverleaf interception traps (Gullion 1965), and fitted them with necklace-style radio-collars as described in Endrulat et al. (2005). Unless otherwise noted, values are presented as means ± SE. Point locations of radio-collared grouse were collected diurnally by taking at least three bearings within a 30-min period from stations with known UTM coordinates. To ensure independent observations, no more than one location was collected per day, with an average of 5.6 (± 5.7) days between serial locations. Methods followed those of the ACGRP including removal of locations with >800 m Geometric Mean Distance from telemetry stations prior to final analysis (see Whitaker 2003 for complete ACGRP telemetry criteria). For our study, we used 1,210 radio-locations from 28 radio-collared grouse (on average 44 ± 6 locations per bird) that included females and males (N = 7 and 21, respectively) and adult and juvenile age classes (N = 17 and 11, respectively). Telemetry locations included a minimum of 47 points (average = 100.8 ± 36.5; ∼4% of total data set) per month and a minimum of 206 points (average = 302.5 ± 104.1; ∼17% of total data set) per season (i.e. spring, summer, fall and winter). Trapping and handling protocol were approved by the University of Rhode Island Institutional Animal Care and Use Committee (IAUCUC #: AN00-09-009).
Defining grouse presence
In our telemetry data set, radio-relocations and home range estimates for individual birds had a high degree of overlap. Additionally, sampling was not uniform throughout the study area, and the density of marked individuals did not necessarily reflect actual grouse density. Because partitioned D2 predicts potential habitat based on variation in environmental variables at locations where a species is present, we could not identify habitat use on a perindividual basis because habitat values at any given location (in our case, delineated by raster GIS cells) could be input into the model multiple times (once for each grouse that used the location), thus over-representing the importance of that location.
We pooled our radio-telemetry locations among individuals and seasons to identify presence locations that were used by one or more grouse. Since the radio-telemetry data included grouse of both genders and different age classes, and the data were collected evenly across all seasons (see previous section), we assumed that the presence locations are a representative sample of all regional habitat types used by grouse annually. We created grids that covered the entire study area, and classified each cell as either occupied, or as having unknown occupancy. We repeated this process for two different grid resolutions with 1-ha and 25-ha cells. Cells in the 1-ha resolution grid, which represented a site-level scale (Johnson 1980), were considered occupied if they contained at least one radio-location. We chose 1 ha for site-level scale because it approximates movement rates of non-dispersing grouse (109 ± 7 m/hour; Fearer 1999). The second grid represented a home range scale (Johnson 1980) with 25-ha resolution, which approximates the mid-point of published grouse home range sizes (varies from 7.3–49.1 ha, depending on age and gender; Whitaker et al. 2007). Here, we used 50% kernel home range estimates (Endrulat et al. 2005) to identify cells as occupied or unknown based on whether at least one estimated home range overlapped a given cell. Based on these criteria, at the 1-ha scale, we used 468 1-ha grid cells as discrete locations with known grouse presence. At the 25-ha scale, we used 70 25-ha grid cells as presence locations. The total area identified as used by grouse at the 1- and 25-ha scales was 468 ha and 1,750 ha, respectively.
Mean and standard error for habitat characteristics measured at two scales of ruffed grouse presence in western Rhode Island, USA. Variables at 1-ha scale are the distance (meters × 100) from the center of each cell to the center of the nearest neighboring cell of each habitat type. Elevation is given as relative to the average elevation in the study area. Variables at the 25-ha scale are the percent coverage of each habitat type within each cell.
GIS habitat variables
We identified forest habitat types, forested wetlands, stream corridors and elevation from publicly available GIS data (RIGIS: http://www.edc.uri.edu/rigis; Table 1). We converted categorical coverages into continuous variables of proximity to (1 ha) and percent coverage of (25 ha) that reflected habitat configuration. At the 1-ha scale, habitat variables were a measure of the distance from the center of a focal grid cell to the center of the nearest cell of each habitat feature (see Table 1). At the 25-ha scale, we assumed that grouse select an area based on the total availability of resources within a home range, so habitat variables were a measure of composition within each grid cell (i.e. percent cover of habitat types; see Table 1). At this scale we also assumed that more heterogeneous habitat would be more attractive to grouse, and thus included indices of habitat type richness (total number of habitat types), evenness (relative abundance of habitat types) and diversity (a composite of richness and evenness; DeJong 1975) as variables in the model. Grouse are nonmigratory (Rusch et al. 2000), and thus any important seasonal habitats must be contained within their annual home range. In contrast to a standard binary habitat value (habitat type is present vs absent), these distance-based and percent coverage variables allowed us to partially insulate our model from potential seasonal bias of location data. By using distance and percent cover based variables, grouse habitat use could be influenced not only by habitat occupied at the time of survey, but also by surrounding habitat that may have been occupied during a later season.
We found no existing GIS data that specifically identified regenerating early successional forest habitat. Although rare in Rhode Island, early successional forest is an important component of grouse habitat, so, we created a GIS coverage of young forest habitat by interpreting leaf-off 1:5,000 scale digital color orthophotographs. Specifically, we systematically searched 100-ha grid cells for characteristics associated with a recently disturbed and regenerating forest (i.e. visible breaks in the forest canopy, obvious woody regeneration, and clearly defined boundaries). We only considered patches with area >0.4 ha, the minimum area required to support a breeding pair of grouse (Gullion 1984c). We used known patches of regenerating forest as reference sites when interpreting photos, and visited a randomly selected subset (20%) of digitized patches to ground truth photo interpretation accuracy. This process identified 85.0 ha of early successional forest habitat in the study area and 279.8 ha in the expanded area. Individual patches of early successional habitat ranged from 0.4 ha to 13.6 ha, with an average size of 1.6 ha (±1.79 SD), and patches larger than 4.0 ha were uncommon. We calculated continuous values for this coverage as described above.
Collinearity between habitat variables has been identified as a potential cause of instability in partitioned D2 results (Rotenberry et al. 2002, Browning et al. 2005). We therefore created a correlation matrix and eliminated one variable from pairs where r>|0.70|. We used this criterion to eliminate the wetlands variable, which was collinear with the forested wetlands variable at both scales, and the habitat type diversity and richness variables, which were collinear with habitat type evenness at the 25-ha scale.
One assumption inherent to all SDMs is that the species is in equilibrium with its environment (Guisan & Zimmerman 2000, Guisan & Thuiller 2005), i.e. that the species occupies the full range of ecological conditions that can support it. This assumption can be violated if a species' distribution is limited by constraints such as dispersal or competition (Svenning & Skov 2004, Guisan & Thuiller 2005) that prevent the species from occupying otherwise suitable habitat. At a range-wide scale, dispersal limitations likely exclude grouse from some geographic areas that contain suitable habitat (Gullion 1984a). However, as we were interested in predicting habitat distribution at a more localized scale, we did not expect that dispersal would limit grouse distributions in our study. A second critical assumption is that the species' minimum habitat requirements are included in model construction. This assumption is often difficult to fully meet because coarse-scale geospatial data is rarely adequate to describe the underlying biological processes that drive habitat use (Scott et al. 2002). Nevertheless, geospatial data is useful to describe patterns, and we selected variables based on structural characteristics common to certain habitat types and spatial scales that we speculated would drive grouse habitat use.
Partitioned D2 model construction
We used SAS code (SAS Institute 2002) provided by Rotenberry et al. (2006) to create two partitioned D2 models of grouse habitat similarity at 1-ha and 25-ha scales. Complete descriptions of the theory and mechanics behind partitioned D2 can be found elsewhere (Rotenberry et al. 2002, Browning et al. 2005, Rotenberry et al. 2006, Watrous et al. 2006), but here we provide a brief outline of the modeling procedure. We performed a PCA for each model to describe variance in the presence data. This partitioned the potential full Mahalanobis D2 model (Clark et al. 1993) into p components, where p was equal to the number of habitat variables included. Some subset, k, of these components is selected to formulate the partitioned D2 model (our selection procedure is described below). Unlike traditional interpretation of a PCA, only principal components with low eigenvalues are considered for inclusion in k. These low components contain the least variance in the data, and thus represent the most consistent aspects of the species' habitat use (Rotenberry et al. 2006). Variables that load highly on components selected for k are assumed to be the characteristics most closely associated with the species' habitat distribution (minimum habitat requirements; Browning et al. 2005, Rotenberry et al. 2006).
Using eigenvectors and eigenvalues from each of the k selected components, and the same habitat variables measured at locations with unknown occupancy (hereafter unknown locations), we were able to calculate a cumulative distance statistic, D2(k), for all unknown locations. D2(k) summarizes the cumulative multivariate distance between habitat values at an unknown location, and the mean of values for all presence locations. This provides a measure of the degree of similarity between the unknown location and the mean of habitat values at all presence locations. D2(k) values are difficult to interpret, because values can range from 0 (completely similar) to infinity (Browning et al. 2005, Rotenberry et al. 2006). Consequently, we calculated P-values by comparing D2(k) to an approximating χ2-distribution (Browning et al. 2005, Rotenberry et al. 2006, Watrous et al. 2006), which yielded an indexed output of values ranging from 0 to 1 that represented the probability that habitat present at any unknown location contained grouse habitat. We calculated P-values from D2(k) for all points (both presence and unknown) at 100 m (1-ha model) and 500 m (25-ha model) intervals within our study area, and converted point data to a raster GIS coverage with equivalent cell size to create a predictive habitat probability map for each model.
Selection of D2(k) and test of model stability
We tested model stability and placed lower bounds on our selection of k using crossvalidation (Browning et al. 2005). While crossvalidation is useful for removing unstable components with 0 or near-zero eigenvalues, the selection of the upper bounds on k is still largely qualitative. To place an upper bound on k, we decided to consider only components with non-zero eigenvalues <1.0, and we only selected the subset of components that explained as near, but no more than, 20% of the cumulative variance in the data. Although subjective, these criteria provided us with guidelines that were consistent between competing models, and improved our ability to directly compare each model.
Model evaluation and comparison between models
We considered all variables with eigenvector loadings ≥|0.45| to be the most important variables on each partition, as these variables ultimately would have the greatest influence on model prediction. We used methods similar to Browning et al. (2005) to identify a habitat threshold, which is necessary to separate potential habitat from non-habitat, and to identify patch structure and total habitat area. When selecting a threshold P-value, there is a tradeoff between classification accuracy and model specificity, since a low threshold P-value will correctly classify more points, but will do so by predicting a larger area as probable habitat. This in turn will lead to incorrectly classified non-habitat, and increased error of commission. Thus, the optimum threshold P-value will strike a balance between accuracy and specificity. To define an optimum threshold, we classified P-values into groups of 0.05 from 0.0 to 1.0 (i.e. 0.0, 0.05, 0.10 … 1.0.), identified the percentage of correct classifications for each group, and divided this value by the percentage of the study area identified as probable grouse habitat. This produced a ratio of accuracy:specificity, and we assumed that the threshold value with the lowest ratio represents the optimum habitat threshold for a given model.
Because PCA does not offer a traditional measure of goodness-of-fit or effect size, we evaluated model performance using jackknife resampling (Manly 1998, Browning et al. 2005). As our data set consisted of a large number of presence locations, removal of only one presence location would have little influence on model outcome and would tend to overestimate model accuracy. We used the raw telemetry data points and calculated a 90% kernel density estimate, pooled across individuals, which identified 14 clusters of point locations that presumably corresponded to 14 discrete areas of core grouse habitat. We withheld clusters one at a time with replacement, and assessed model accuracy based on the average classification within withheld habitat clusters.
We compared models based on reclassification accuracy and by comparing predicted overall accuracy (non-resampled) with potential habitat area at threshold. To identify the ‘best’ model, we assessed how accurately each model classified the maximum number of locations while still identifying a relatively small area as potential habitat. Thus, the ‘best’ model would achieve an optimum balance between accuracy and predicted habitat area. We also visually evaluated map outputs from each model based on our familiarity with the study area to ensure that outputs were reasonable.
Eigenvalues from resampled (crossvalidation) and full partitioned Mahalanobis D2 models of ruffed grouse habitat use in western Rhode Island, USA. Crossvalidation eigenvalues are averaged for all crossvalidation iterations and given with SD of crossvalidation iterations.
Expansion of model coverage
Once satisfied with the ‘best’ model's performance at the study area scale, we expanded the model by calculating D2(k) and its associated P-values for a much larger area (hereafter the expanded area). We required an expanded area that was large enough to be useful for evaluating grouse habitat distribution at a landscape scale as well as an area of management interest. At the same time, we wanted to minimize the degree to which we exceeded the model's level of inference in areas where habitat conditions differed from the original study area. The resulting expanded area covered 735 km2 (approximately the southwestern ¼ of the state), included the majority of state-controlled Wildlife Management Areas in the region (see Fig. 1), and was approximately 4.5 times larger than the original study area.
Selection of D2(k) and test of model stability
Models at both scales included p = 10 principle components. We selected k by including components with non-zero eigenvalues < 1.0 that described <20% of the total variance for each model. This resulted in a selection of D2(k) based on principal components (PCs) 7–10 for both models. In general, crossvalidation results showed that component eigenvalues were relatively stable among iterations for each model (Table 2). In each case, eigenvalues averaged across iterations tracked closely with full model eigenvalues for each component, with small standard deviations (see Table 2). No iterations produced components with eigenvalues of 0 or close to 0, suggesting overall stability for both models. Additionally, every iteration resulted in the same selection of k = 4.
Model thresholds and area of predicted habitat
Both models retained similar levels of accuracy at threshold, but differed substantially in the amount of total habitat area predicted. The 1-ha model had the greatest (non-resampled) classification accuracy (83.1%), and identified 27.6% (4,475.5 ha) of the study area as potential habitat at a threshold of 0.15. The 25-ha model had similar classification accuracy (81.4%), and identified 62.5% (10,133.8 ha) of the study area as potential habitat at a threshold value of 0.25. Average resampled accuracy was >50%, and averaged P-values were greater than model threshold levels for both models. The 25-ha model had an average reclassification accuracy of 0.57, and average P-value of 0.39, whereas the 1-ha model had similar values of 0.54 and 0.30, respectively.
Eigenvalues, variance (%) and cumulative variance explained (%) (A) and eigenvector loadings (B) from competing partitioned Mahalanobis D2 models of ruffed grouse habitat use in western Rhode Island, USA. Cumulative variance begin with the lowest principle component. Variables at 1-ha scale are the distance (meters × 100) from the center of each cell to the center of the nearest neighboring cell of each habitat type. Variables at the 25-ha scale are the percent coverage of each habitat type within each cell.
Important habitat variables
For each model, a number of habitat variables were identified as important correlates to grouse habitat use. For the 1-ha model, D2(k) included PCs 7–10, all of which had eigenvalues ≤ 0.48 and explained up to 14% of the overall variance (Table 3). Early successional proximity loaded highly on PC10, riparian corridor and conifer forest proximity loaded most highly on PC9, shrub wetland and mixed conifer forest proximity loaded highly on PC8, and deciduous forest, conifer forest and shrub wetland proximity loaded highly on PC7 (see Table 3).
For the 25-ha model, D2(k) included PCs 7–10, which all had eigenvalues ≤0.61, and explained up to 16% of the overall variance (see Table 3). Deciduous forest and mixed conifer forest percent coverage loaded highly on PC10, shrub wetland and early successional coverage loaded highly on PC9, and habitat type richness and deciduous forest coverage loaded highly on PC8. No variables loaded > 0.45 on PC7 (see Table 3).
Model evaluation and final model selection
The 25-ha model produced higher jackknife reclassification accuracy than the 1-ha model, but the latter model performed considerably better than chance. When overall (non-resampled) accuracy was considered, both models performed well (>80%) at threshold, with the 1-ha model capturing the greatest classification accuracy. When we compared habitat probability maps for the two models (Fig. 3), the 1-ha model depicted a clear patch structure that was consistent with our knowledge of grouse distributions in the study area, whereas the 25-ha full model produced a confusing output with no patch structure. Of the two, the 1-ha model also predicted a smaller land area as probable habitat (126% less total land area), and thus was more conservative. We selected the 1-ha model as the ‘best’ model, and expanded its extent.
Distribution of potential grouse habitat
The 1-ha model of grouse habitat distribution identified approximately 15.5% (11,463 ha) of the expanded area as potential grouse habitat (see Fig. 3). The largest single habitat patch was in Arcadia (1,661.8 ha), and two other large patches were identified to the north and east of this patch. The second largest patch (920.2 ha) fell partially within the Tillinghast management area. Other patches of habitat tended to be smaller (<400ha) and relatively evenly spaced throughout the expanded area. State wildlife areas contained 3,201 ha of identified grouse habitat, whereas the remaining 8,262 ha was located on privately owned property.
Habitat characteristics associated with grouse presence
It is well documented that grouse require diverse resources that are provided by multiple habitat types and/or structures (Bump et al. 1947, Gullion 1984a, Rusch et al. 2000, Norman et al. 2004). Multiple surrogate habitat types played a large role in our model's predictions, and areas that were located near all or most of these habitat types were consistently predicted as grouse habitat. Conversely, areas that were relatively homogeneous were typically predicted as non-habitat. The important habitat types that influenced our model's predictions likely provide various resources that are consistent with current knowledge of grouse habitat requirements.
Grouse typically select habitat with high woody stem density and abundant herbaceous vegetation (Bump et al. 1947, Rusch 2000, Dessecker & McAuley 2001, Haulton et al. 2003, Whitaker et al. 2006) which are both common in early successional forests (Dessecker & McAuley 2001). However, individual patches of early successional habitat in our study area were typically too small (average = 1.6 ha ± 1.7), and lacked the seral diversity necessary to support a grouse home range. As such, we speculate that grouse utilize several habitat types as surrogate sources of woody stem density and herbaceous vegetation. Abundant moisture and nutrients in shrub wetlands support dense shrub tangles and a well-developed herbaceous layer. Forests with mixed coniferous and deciduous species (e.g. pitch pine and scrub oak forests) typically have diverse crown height and structure, and hence allow sunlight to reach the forest floor and promotes shrub growth that provides increased stem density. Riparian corridors also typically have a well-developed herbaceous layer because of abundant soil moisture and nutrients. We suggest that grouse in our study consistently used areas in close proximity to mixed conifer forests because these areas provide high woody stem density, riparian corridors because they provide a dense herbaceous layer, and shrub wetlands because they provide both of these habitat components.
Mast fruits, and especially hard mast, are important to the ecology of grouse inhabiting oakhickory forests such as those found in Rhode Island. In years with abundant mast crops, grouse home range size decreased (Whitaker 2003), and reproductive output increased (Devers et al. 2007). Deciduous forests in Rhode Island typically contain multiple mast species (e.g. red and white oaks and beech) and provide the most consistent source of hard mast in the state, which may explain why grouse locations were consistently located near deciduous stands.
Conifer forests in Rhode Island typically contain large stands of mature white pine with an open understory, little woody stem density, and minimal mast production. Such conifer stands can provide excellent concealment for avian predators and are typically avoided by grouse (Gullion 1970, Gullion & Alm 1983). Conifer forest was an important variable in the 1-ha model, but areas that were predicted as potential habitat did not typically contain this habitat type. Thus, we speculate that the importance of this variable in our model likely reflects consistent grouse avoidance of mature conifer stands.
Patterns in habitat availability and their implications
The 1-ha model predicted 4,524 ha (27.9%) of potential grouse habitat in the study area, and 11,463 ha (15.5%) of potential habitat in the expanded area. Recent surveys suggest that grouse densities in our study area are extremely low (E. Blomberg, unpubl. data), and that populations have declined substantially (Tefft 1999, 2007). Extensive use of surrogate habitats suggests that reduced availability of high-quality early successional habitat may negatively effect demographic rates and limit grouse populations in the study area. Consistent with this idea, Endrulat et al. (2005) found that grouse in our study area occupied considerably larger territories than reported in previous studies of grouse home range. Also, Devers (2005) reported lower survival and reproduction for grouse in our study area compared to other study sites in the southern part of the grouse's range. Habitat models based onindividuals in marginal habitat likely include extensive low-quality habitat, and our model's relatively large area of predicted habitat likely reflects the overall low quality of grouse habitat in Rhode Island. Given recent declines, it is important to note that habitat identified by our model may not represent truly suitable habitat, hence our reliance on the term probable habitat throughout this paper. Whether current conditions in Rhode Island are adequate to maintain viable populations at low densities remains unclear, although recent downward population trends for grouse in the state (see Fig. 2; Tefft 1999, 2007) suggest they are not.
In southwestern Rhode Island, privately controlled lands contained approximately 72% of the predicted grouse habitat in the expanded area (Fig. 4). This suggests that private lands management should be a priority for grouse conservation in the state. However, privately controlled forestland in Rhode Island typically consists of small properties (average = 5.2 ha, > 80% of private parcels < 4.0 ha; Butler & Wharton 2002) that may be too small and isolated to provide adequate grouse habitat. Conversely, maintenance of evenly dispersed patches of high-quality grouse habitat on state-owned areas may provide source populations for adjacent areas. In either case, management effectiveness will depend on the factors that influence population response to habitat manipulation at the landscape scale; questions that remain unanswered. In light of this uncertainty, future research should focus on how landscape-scale habitat availability, distribution and quality influence grouse population dynamics. Predictive habitat distribution models such as ours should prove useful for designing and implementing this future work.
Performance and evaluation of partitioned D2 models
Our study is the first to evaluate partitioned D2 models based in part on the total extent of predicted habitat. If we had used only classification accuracy to evaluate model performance, we would have considered the 25-ha model a strong model even though its accuracy was achieved only because a much larger area (which included non-habitat) was predicted as potential grouse habitat. In contrast, the 1-ha model had similar accuracy, but did so without predicting an unduly large area of potential habitat. We suggest that both classification accuracy and the extent of predicted habitat be used to evaluate SDMs regardless of analysis method, especially when population numbers are low and required habitat is likely to be uncommon.
A potential source of bias in this or any SDM is groupings of presence data, such as age, gender or seasonally used habitats, which may bias results if one group is over-represented and driving model results. For example, if location data were collected primarily during one season (such as a summer field season), inferences about annual habitat use could not be made. In our telemetry data set locations from different ages, genders and seasonal habitats were evenly represented (see Methods), and thus we consider our model results robust with respect to our presence data.
Calenge et al. (2008), recently suggested that partitioned D2 may be sensitive to inclusion of widely available habitat types, which may influence model results by having universally low variance in the study area. These authors propose a modification of partitioned D2 to deal with this non-trivial issue by performing an additional partition of D2(k) that incorporates environmental availability into the model. We did not include this additional step into our modeling process because we felt that the environmental variables in our model had strong biological justification, and thus merited inclusion regardless of availability. Nevertheless, the bulk of our variables were relatively limited in availability in the study area, and as such were unlikely to have the negative effect on model results suggested by Calenge et al. (2008). This is supported by the fact that early successional forest had the most limited availability (<0.5% of total land cover) of any variable in our data set, loaded highly on the lowest PC, and thus was the most consistent variable utilized by grouse. One notable exception is deciduous forest, which comprises the bulk of the study area (∼55% of total land cover), but is clearly linked to hard mast production that is crucial as a winter food source for grouse (Whitaker 2003, Devers et al. 2007). Although deciduous forest was identified as an important habitat variable, it loaded highly on the last PC (PC7) included in our analysis, which indicates lower importance of the deciduous forest compared to those variables loading highly on PCs 8–10.
Implications for management
Partitioned D2 provided us with an efficient statistical technique to predict the distribution of potential grouse habitat in Rhode Island. We suggest our model be used as an approximate estimate of grouse habitat distribution to focus field surveys and identify sites with high potential when planning management. For example, the model identified a sizeable patch of potential grouse habitat in the Tillinghast Management Area, which was recently (2006) acquired through a joint purchase by The Rhode Island Department of Environmental Management, The Nature Conservancy, and the town of West Greenwhich. If field surveys confirm grouse presence or habitat potential, managers can work to create high-quality early successional habitat to benefit an existing grouse population on this state-managed property.
Early successional forest was the most consistent habitat used by grouse in our study (as evident by the variables high loading on the lowest principle component; Rotenberry et al. 2006), and there is clearly a need to create more early successional forest to enhance grouse populations in the region (Dessecker & McAuley 2001). Availability of more early successional habitat in Rhode Island would likely decrease grouse reliance on surrogate habitats, improve survival and reproductive rates, and bolster future population viability. Management agencies should continue to focus efforts on increasing the availability of high-quality early successional habitat using established forest management techniques (e.g. Gullion 1984b, Jones et al. 2004, Storm et al. 2003, Whitaker 2003), as the availability of these areas will likely continue to limit grouse populations in Rhode Island, and throughout the eastern USA.
we thank J.T. Rotenberry, K.L. Preston and S.T. Knick for making their SAS code freely available, and D.M. Browning and K.S. Watrous for providing helpful advice on the logistics of creating a partitioned D2 model. E. Endrulat thanks those cooperators and technicians already acknowledged in Endrulat et al. 2005. This project was supported by the University of Rhode Island Department of Natural Resources Science, Rhode Island Agricultural Experiment Station, Rhode Island Champlin Foundations, Rhode Island Department of Environmental Management, and The Ruffed Grouse Society. This is contribution number 5220 of the Rhode Island Agricultural Experiment Station. We thank H. Ginsberg, J. Heltshe, J.M. Reed, D.F. Stauffer, N. Yoccoz and one anonymous reviewer for comments that greatly improved earlier versions of this paper.