Waterfowl management is one of the more visible conservation success stories in the United States. It is authorized and supported by appropriate legislative authorities, based on large-scale monitoring programs, and widely accepted by the public. The process is one of only a limited number of large-scale examples of effective collaboration between research and management, integrating scientific information with management in a coherent framework for regulatory decision-making. However, harvest management continues to face some serious technical problems, many of which focus on sequential identification of the resource system in a context of optimal decision-making. The objective of this paper is to provide a theoretical foundation for adaptive harvest management, the approach currently in use in the United States for regulatory decision-making. We lay out the legal and institutional framework for adaptive harvest management and provide a formal description of regulatory decision-making in terms of adaptive optimization. We discuss some technical and institutional challenges in applying adaptive harvest management and focus specifically on methods of estimating resource states for linear resource systems.

The regulation of waterfowl harvests in the United States involves public announcements, deliberations, and joint decision making by the federal and state governments. The federal government derives its responsibility for establishing sport-hunting regulations from the Migratory Bird Treaty Act of 1918 (as amended), which implements provisions of international treaties for migratory bird conservation. The Act directs the Secretary of Agriculture to periodically adopt hunting regulations for migratory birds, “having due regard to the zones of temperature and to the distribution, abundance, economic value, breeding habits, and times and lines of migratory flight of such birds” (U.S. Department of the Interior 1975). The responsibility for managing migratory bird harvests has since been passed to the Secretary of the Interior and the U.S. Fish and Wildlife Service. Other legislative acts, such as the National Environmental Policy Act, the Endangered Species Act, the Administrative Procedure Act, the Freedom of Information Act and the Regulatory Flexibility Act, provide additional responsibilities in the development of hunting regulations, and help define the nature of the regulatory process (Blohm 1989).

An essential element of harvest regulations is the annual collection and analysis of data on breeding population status, harvest levels, survival, production, migration and other population characteristics (Smith, Blohm, Kelly & Reynolds 1989). Long-term databases of monitoring data are used to estimate key population parameters such as survivorship and reproduction, and to predict harvest impacts on population dynamics (Nichols, Conroy, Anderson & Burnham 1984, Johnson, Sparling & Cowardin 1987, Johnson, Nichols & Schwartz 1992). The information thus accumulated is folded into waterfowl population ‘models’, which in turn are used to inform the regulations process (Cowardin & Johnson 1979, Johnson, Nichols, Conroy & Cowardin 1988, Williams & Nichols 1990).

Though biologists have long recognized a need for informative harvest management (Anderson & Burnham 1976, Nichols et al. 1984, Montalbano, Johnson, Miller & Rusch 1988, Williams & Nichols 1990), traditionally the regulation of waterfowl harvests has not focused on uncertainty about the impacts of regulations. Nor has the opportunity to use regulations to reduce uncertainty been exploited. An unfortunate result is an unnecessarily slow rate of learning about population dynamics, and a correspondingly slow rate of improvement in management over time.

## The harvest regulations process

Most waterfowl hunting regulations are established annually, within a timetable that is constrained by the timing of biological surveys and the need to give states and the public an opportunity to influence regulations. The annual regulatory cycle includes analysis and interpretation of biological data, development of regulatory proposals and solicitation of public comment, leading in turn to the promulgation and publication of hunting regulations in the autumn of each year.

A key component of the regulatory process consists of data collected each year on population status, habitat conditions, production, harvest levels and other system attributes of management interest (Smith et al. 1989). Population and habitat monitoring is essential for discerning resource status and modifying hunting regulations in response to changes in environmental conditions. The system of waterfowl monitoring in North America is unparalleled in its scope and is made possible only by the cooperative efforts of the U.S. Fish and Wildlife Service, the Canadian Wildlife Service, state and provincial wildlife agencies and various research institutions.

Each year monitoring data are used to estimate key population parameters such as survival and reproductive rates, and to associate levels of harvest with various regulatory scenarios (Martin, Pospahala & Nichols 1979). These and other estimators are combined to produce and refine dynamic population models, which describe how waterfowl abundance varies in response to harvest and uncontrolled environmental factors (Williams & Nichols 1990). These models in turn are used to inform the regulations process, on the assumption that population status is directly related to harvest and that harvest can be predicted as a function of hunting regulations (Johnson, Williams, Nichols, Hines, Kendall, Smith & Caithamer 1993). By building on accumulated monitoring data, these models constantly evolve to reflect a growing understanding of waterfowl population dynamics and the impacts of harvest.

Unfortunately, the modeling of waterfowl populations and their harvest continues to be characterized by great uncertainty. In many cases, the sheer number and complexity of hunting regulations, combined with inadequate replication and experimental controls, has precluded reliable inference about the relationship between regulations and harvests (Nichols & Johnson 1989). Managers know even less about the impact of harvest on subsequent waterfowl population size. Particularly problematic in this regard are questions about the nature of density-dependent population regulation, which provides the theoretical basis for sustainable exploitation (Hilborn, Walters & Ludwig 1995). Uncertainties about the relationships among hunting regulations, harvest and population size constitute a principal source of controversy in the regulations-setting process.

In response to frustrations about the continuing lack of the biological understanding needed to inform harvest regulations, in 1995 the U.S. Fish and Wildlife Service implemented a new system of harvest regulation under the name of adaptive harvest management (AHM). This system is a rather formalized example of Adaptive Resource Management (Holling 1978, Walters 1986), which often is described in terms of ‘management by experiment’ or ‘learning by doing’ (Walters & Holling 1990). An appropriate definition for AHM is ‘management in the face of uncertainty, with a focus on its reduction’ (Williams & Johnson 1995). An adaptive approach to harvest regulations emphasizes uncertainty about regulatory effects, and incorporates uncertainty as a factor guiding management actions (Johnson et al. 1993). The goal is to reduce uncertainty over time, and thereby improve long-term management.

## Operational elements of adaptive harvest management

Adaptive harvest management explicitly accounts for uncertainty and the value of information in the regulatory process, recognizing that there are at least four identifiable sources (Williams 1997). The first and most obvious is uncontrollable (and sometimes unrecognized) environmental variation, which influences biological processes and induces stochasticity in population dynamics. The second is uncertainty about resource status, called partial observability to indicate limitations in one's ability to observe a resource system through monitoring. A third source of uncertainty is referred to as partial controllability, to emphasize the limited influence of management decisions on harvest and other actions. Finally, structural uncertainty concerns the lack of understanding (or lack of agreement) about the structure of biological relationships, such as the influence of harvest on survivorship. Environmental variation, structural uncertainty, partial observability and partial controllability all limit a manager's ability to make informed regulatory decisions (Nichols, Johnson & Williams 1995).

Along with an institutional framework and appropriate monitoring programs as described above, four elements are definitive of the adaptive process for setting waterfowl regulations:

1) An array of regulatory options that are available for the regulation of waterfowl harvest. These options include various combinations of regulations representing, e.g. ‘restrictive’, ‘liberal’, and ‘moderate’ regulations, with possible constraints on allowable fluctuations from year to year. The set of feasible regulatory options can be limited or expanded as the need and desirability to do so is recognized by management.

2) An objective function by which to evaluate and compare these options. The general form of the objective function is a weighted sum of harvests (or harvest utilities) over some recognized time frame. This is in keeping with traditional goals for waterfowl harvest management, and ensures that the focus is on harvest and harvest opportunity. A long time frame and harvest utilities that devalue harvest at low population sizes provide a conservation perspective, by preventing excessive harvest in the short term and thereby ensuring long-term sustainability.

3) A set of waterfowl models representing an array of meaningful hypotheses about the impact of regulations on waterfowl populations. For example, the set currently in use includes models that incorporate the hypothesis of additive hunting mortality, and others that incorporate the hypothesis of completely compensatory hunting mortality. These models are used to gauge the consequences of different regulations. At present four models are used, each developed from data bases that have accrued as a result of waterfowl monitoring and research programs.

4) Measures of reliability for the models used in selecting harvest regulations. Reliability measures are used to weight the model outputs and are updated each year as additional data about resource status and the impacts of regulation become available. The notion of reliability is included in the process as an acknowledgment that the ‘correct’ or best approximating model for use in evaluating regulatory options is not known with certainty, and this uncertainty should be incorporated somehow in the procedure for evaluating and selecting regulations.
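The objective function of element 2 above can be sketched in a generic form (the devaluation function u and the horizon notation here are illustrative, not the operational AHM specification):

```latex
% Weighted sum of harvest utilities over the time frame {t, ..., T}:
% u(h_tau, x_tau) devalues harvest h_tau when population x_tau falls
% below a goal, which promotes long-term sustainability.
V(A_t \mid x_t) = E\!\left[\, \sum_{\tau=t}^{T} u(h_\tau, x_\tau) \,\right]
```

The expectation is taken over environmental variation and partial controllability, consistent with the formal development later in the paper.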

Adaptive harvest management is framed in terms of sequential decision making under uncertainty, in which one annually observes the state of the resource system (e.g. population size and relevant environmental features) and takes some management action (e.g. hunting regulations). An immediate return accrues as a result, which is expressed as a function of the benefits and costs that are relevant to the stated objectives of management. In response to the combined influence of management actions and uncontrolled environmental variation, the resource system subsequently evolves to a new state. The manager then observes the new system state, makes a new decision, accumulates additional returns, and the system evolves to yet another state (Fig. 1). And so on. The goal of management is to make a sequence of such decisions, each based on information about current system status, so as to maximize management returns over an extended time frame.

A major advantage of AHM is the explicit acknowledgement of alternative hypotheses describing the effects of regulations and other environmental factors on population dynamics. These hypotheses are codified in a set of system models, each of which has an associated weight reflecting its ability to describe system dynamics. Each year the weights are updated by comparing the model-specific prediction of changes in population size against the actual change observed from the monitoring program. By iteratively updating model weights and optimizing regulatory choices, the process eventually should identify which model is most appropriate to describe the dynamics of the managed population, and thereby should allow more effective, because better informed, management.
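The annual weight update can be sketched as a Bayesian calculation. This is a minimal sketch assuming normally distributed prediction errors; the function name and all numerical values are hypothetical, and operational AHM uses model-specific predictive distributions derived from the monitoring program:

```python
import math

def update_weights(weights, predictions, sds, observed):
    """Bayesian update of model weights from one year's monitoring estimate.

    weights     -- prior model weights p_i(t), summing to 1
    predictions -- each model's predicted population size
    sds         -- assumed standard deviations of prediction error
    observed    -- population size estimated from the monitoring program
    """
    # Normal likelihood of the observation under each model (an assumed
    # error distribution for illustration only).
    likes = [math.exp(-0.5 * ((observed - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
             for m, s in zip(predictions, sds)]
    posterior = [w * l for w, l in zip(weights, likes)]
    total = sum(posterior)
    return [p / total for p in posterior]

# The model predicting closer to the observed population gains weight.
new_w = update_weights(weights=[0.5, 0.5], predictions=[7.0, 9.0],
                       sds=[1.0, 1.0], observed=7.2)
```

Weights shift toward the first model because its prediction lies nearer the observation; over repeated years the weight on the most appropriate model should approach 1.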

## Population dynamics and the value of harvest

In what follows we describe AHM in a context of optimal adaptive control theory. Structural uncertainty is characterized here with multiple models of population dynamics over a discrete time frame, along with model-specific measures of uncertainty about which model is most appropriate. Without loss of generality as to optimal management, we combine environmental variation and partial controllability into a single stochastic factor z_{t} affecting population dynamics. Resource status is characterized by x_{t}, recognizing that x_{t} includes attributes that are definitive of resource state. Management action at time t is designated by a_{t}, and policies describing actions over the remainder of the time frame are designated by A_{t}.

With these notational conventions, consider a biological population that annually is subjected to harvest, with management actions that are based on resource status x_{t} and the projected effects on future resource states. Models depicting population responses play prominently in the assessment of impacts. Several models of the form

x_{t+1} = f_{i}(x_{t}, a_{t}, z_{t})

are assumed to be available, where a_{t} and z_{t} represent management controls and random variation, respectively. Initially one does not know which model most appropriately represents population change in response to management. This uncertainty is captured in a set p_{i}(t) of likelihoods that express one's confidence in the models at time t. The notation p_{i}(t) allows for evolving likelihood values in response to accumulating information about management controls and population responses. By affecting population dynamics, management can influence the evolution of the likelihoods, and thereby promote learning.

Benefits and costs attend the implementation of harvest controls over time, and these can be captured in a utility function that itself may be model-specific. For simplicity we describe utilities as functions of the current resource state and action, recognizing that the utility function might also represent an average of utilities across potential outcome states. Thus, R_{i}(a_{t}|x_{t}) is the utility for model i if the resource status is x_{t} and action a_{t} is taken. In the case of waterfowl harvest, the utility is given in terms of harvest yield, recognizing an aversion to harvest decisions that would result in an expected population size below the goal of the North American Waterfowl Management Plan (NAWMP; U.S. Department of the Interior, Environment Canada, and Secretario de Desarrollo Social Mexico 1994). The capacity of available breeding habitat to promote population growth is considered in determining an optimal regulatory decision for x_{t}. Thus, liberal hunting regulations could be appropriate even if the population is below the NAWMP goal, if current habitat conditions are expected to result in good production of young. On the other hand, restrictive regulations may be appropriate when reproductive success is expected to be low, even if populations are at or above the NAWMP goal.

In general, an overall value for harvest utility that accounts for model uncertainty is the average

R(a_{t}|x_{t}, p_{t}) = Σ_{i} p_{i}(t) R_{i}(a_{t}|x_{t})

based on model-specific utilities R_{i}(a_{t}|x_{t}) and model likelihoods p_{i}(t). If there is only a single model under consideration, or if the likelihood is assumed to be p_{i}(t) = 1 for model i, the utility corresponding to action a_{t} simplifies to

R(a_{t}|x_{t}, p_{t}) = R_{i}(a_{t}|x_{t})

Each of the population models characterizes transitions of the population over time, as influenced by such factors as survivorship, recruitment and migration, along with the controls affecting them. These factors always are subject to environmental variation and other stochastic factors, including randomness in the effects of controls. Thus, the projected resource status x_{t+1} for model i inherits a probability distribution P_{i}(x_{t+1}|x_{t}, a_{t}) from environmental and other sources of variation. The challenge is to choose harvest controls that will maximize aggregate harvest utility in the face of stochastic effects, while also accounting for uncertainties about the biological processes that drive population dynamics.
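The model-specific distribution P_{i}(x_{t+1}|x_{t}, a_{t}) can be approximated by Monte Carlo simulation. The balance-equation model below is a hypothetical illustration; all parameter values are assumptions for the sketch, not estimates from AHM:

```python
import random

def project_next_state(x, harvest_rate, survival, recruit_mean, recruit_sd,
                       n=10000, seed=1):
    """Monte Carlo approximation of P_i(x_{t+1} | x_t, a_t) for one
    hypothetical model: survivors of harvest plus stochastic recruitment."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        # z_t: combined environmental variation entering through recruitment.
        recruits = max(0.0, rng.gauss(recruit_mean, recruit_sd))
        draws.append(x * (1.0 - harvest_rate) * survival + recruits)
    return draws

draws = project_next_state(x=8.0, harvest_rate=0.15, survival=0.7,
                           recruit_mean=2.0, recruit_sd=0.5)
mean_next = sum(draws) / len(draws)   # approximates E(x_{t+1} | x_t, a_t)
```

Repeating the simulation under each model i, and under each candidate action a_{t}, yields the state-transition distributions needed for the optimization described below.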

## Dynamic programming

Adaptive harvest management (AHM) utilizes a variant of stochastic dynamic programming, an iterative procedure for identifying optimal state-specific actions for dynamic systems. Stochastic dynamic programming can be described in terms of an observable system

x_{t+1} = f(x_{t}, a_{t}, z_{t})

with a_{t} and z_{t} representing management controls and random variation, respectively, at time t. Let policy A_{t} specify a state-specific control a_{τ} for every state x_{τ} at every time in a time frame {t, t+1, …, T}. A value V(A_{t}|x_{t}) can be associated with A_{t} by accumulating utilities over the remainder of the time frame:

V(A_{t}|x_{t}) = E[ Σ_{τ=t}^{T} R(a_{τ}|x_{τ}) ]

where the expectation is with respect to environmental variation and partial controllability over the time frame. The notation V(A_{t}|x_{t}) indicates that the accumulation of utilities begins at time t, the start of the time frame for A_{t}. It also expresses the fact that accumulated utilities are conditional on the population state x_{t}, in that V(A_{t}|x_{t}) can (and usually does) vary for different population states x_{t}.

Decomposing the sum above into current and future utilities, we have

V(A_{t}|x_{t}) = R(a_{t}|x_{t}) + Σ_{x_{t+1}} p(x_{t+1}|x_{t}, a_{t}) V(A_{t+1}|x_{t+1})

with the transition probabilities p(x_{t+1}|x_{t}, a_{t}) capturing environmental variation and partial controllability. These probabilities often are assumed to be stationary, in that they change through time only as a result of controls. However, stationarity is not a theoretical requirement, and non-stationary transition probabilities can be denoted in the above expression simply by conditioning explicitly on time, as in p(x_{t+1}|x_{t}, a_{t}, t).

Values for the aggregate utilities V(A_{t}|x_{t}) can be obtained for every possible policy A_{t} over the time frame. Thus, by proper choice of A_{t} these values can be optimized. A backward iteration algorithm to determine an optimal policy is given by the Hamilton-Jacobi-Bellman (HJB) equation

V*(x_{t}) = max_{a_{t}} { R(a_{t}|x_{t}) + Σ_{x_{t+1}} p(x_{t+1}|x_{t}, a_{t}) V*(x_{t+1}) }

(Stengel 1994), with V*(x_{t}) the optimal value of aggregate utility corresponding to state x_{t} at time t.

The iterative application of the HJB equation starting at the end of the time frame is known as stochastic dynamic programming (Bellman & Dreyfus 1962, Dreyfus & Law 1977). The optimal policy A*(x_{t}) thus described identifies optimal actions for all population states at all times in the time frame, along with a field of optimal values V*(x_{t}) for all population states and times. An optimal policy having been found with dynamic programming, one need only identify the population state x_{t} at a particular time, and then apply the control specified by the policy for that state at that time.
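Backward iteration of the HJB equation can be sketched for a discrete toy system. The states, actions, utilities and transition probabilities below are hypothetical illustrations, not the operational AHM models:

```python
def backward_induction(states, actions, T, utility, trans):
    """Backward iteration of the HJB equation:
    V*(x,t) = max_a { R(a|x) + sum_x' p(x'|x,a) V*(x',t+1) }."""
    V = {(x, T + 1): 0.0 for x in states}          # terminal values beyond the frame
    policy = {}
    for t in range(T, -1, -1):                     # proceed backward in time
        for x in states:
            best_a, best_v = None, float("-inf")
            for a in actions:
                v = utility(x, a) + sum(p * V[(xn, t + 1)]
                                        for xn, p in trans(x, a).items())
                if v > best_v:
                    best_a, best_v = a, v
            V[(x, t)], policy[(x, t)] = best_v, best_a
    return policy, V

# Toy system: population 'low' (0) or 'high' (1); restrictive vs. liberal seasons.
def utility(x, a):                     # immediate harvest return R(a_t | x_t)
    return x if a == "L" else 0.1 * x

def trans(x, a):                       # p(x_{t+1} | x_t, a_t)
    if a == "L":
        return {0: 0.8, 1: 0.2} if x == 1 else {0: 0.9, 1: 0.1}
    return {1: 0.9, 0: 0.1} if x == 1 else {1: 0.5, 0: 0.5}

policy, V = backward_induction(states=[0, 1], actions=["R", "L"], T=10,
                               utility=utility, trans=trans)
```

In this toy example the optimal policy is liberal in the final year (no future value to protect) but restrictive in early years when the population is low, reflecting the trade-off between current and future harvest.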

## Optimization in adaptive harvest management

Now consider the control of a population for which several models describing population dynamics are available, but the most appropriate model is not known with certainty, i.e. p_{i}(t) ≠ 1. Policy value is given in terms of accumulated harvest utilities, averaged over all models based on the model likelihoods:

V(A_{t}|x_{t}, p_{t}) = Σ_{i} p_{i}(t) V_{i}(A_{t}|x_{t})

This expression can be further decomposed into current and future utilities by

V(A_{t}|x_{t}, p_{t}) = R(a_{t}|x_{t}, p_{t}) + Σ_{i} Σ_{x_{t+1}} p_{i}(t) P_{i}(x_{t+1}|x_{t}, a_{t}) V_{i}(A_{t+1}|x_{t+1})

The term p_{i}(t) P_{i}(x_{t+1}|x_{t}, a_{t}) in the latter expression can be replaced by p_{i}(t+1) p̄(x_{t+1}|x_{t}, a_{t}) via Bayes' theorem

p_{i}(t+1) = p_{i}(t) P_{i}(x_{t+1}|x_{t}, a_{t}) / Σ_{j} p_{j}(t) P_{j}(x_{t+1}|x_{t}, a_{t})

(Lee 1992), so that

V(A_{t}|x_{t}, p_{t}) = R(a_{t}|x_{t}, p_{t}) + Σ_{x_{t+1}} p̄(x_{t+1}|x_{t}, a_{t}) V(A_{t+1}|x_{t+1}, p_{t+1})

where p̄(x_{t+1}|x_{t}, a_{t}) = Σ_{i} p_{i}(t) P_{i}(x_{t+1}|x_{t}, a_{t}) is the transition probability averaged over models.

A value V(A_{t}|x_{t}, p_{t}) for the average accumulated utility can be obtained for every possible policy A_{t} over the time frame, starting at any particular time t and any combination (x_{t}, p_{t}). By proper choice of A_{t} these values can be optimized, with a solution algorithm that is based on

V*(x_{t}, p_{t}) = max_{a_{t}} { R(a_{t}|x_{t}, p_{t}) + Σ_{x_{t+1}} p̄(x_{t+1}|x_{t}, a_{t}) V*(x_{t+1}, p_{t+1}) }    (1)

This is a stochastic dynamic programming problem, though complicated somewhat by the characterization of system state by (x_{t}, p_{t}). Transitions via Bayes' theorem are required for p_{t}, and transitions for x_{t} are given in terms of the transition probabilities p̄(x_{t+1}|x_{t}, a_{t}).

The optimization problem can be solved by iterative application of Equation 1, starting at the end of the time frame and proceeding backward in time. An optimal solution consists of a policy A*(x_{t}, p_{t}) that identifies a specific action for every combination (x_{t}, p_{t}) of resource state x_{t} and likelihood state p_{t}, along with a field of optimal values V*(x_{t}, p_{t}) for all resource states and model likelihoods at all times in the time frame. To implement the optimal policy, at each time one must (i) determine the resource status, (ii) update the likelihoods with Bayes' theorem, and (iii) apply the regulatory control specified by the optimal policy for the resource state and set of updated likelihoods (Table 1).
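Once the optimal policy has been computed, the annual implementation reduces to a table lookup on the current state and likelihoods. A minimal sketch; the coarse two-way discretization, the threshold values and the policy entries are all hypothetical stand-ins for the full (x_{t}, p_{t}) grid:

```python
def annual_decision(x_hat, weights, policy_table):
    """One regulatory cycle: discretize the observed resource state and the
    updated model likelihoods, then look up the action prescribed by the
    precomputed optimal policy A*(x_t, p_t)."""
    pop_class = "low" if x_hat < 7.0 else "high"
    model_class = "additive" if weights[0] >= 0.5 else "compensatory"
    return policy_table[(pop_class, model_class)]

# Hypothetical policy table: restrictive when the population is low and the
# additive-mortality model carries most of the weight.
policy = {
    ("low", "additive"): "restrictive",
    ("low", "compensatory"): "moderate",
    ("high", "additive"): "moderate",
    ("high", "compensatory"): "liberal",
}
action = annual_decision(x_hat=6.2, weights=[0.7, 0.3], policy_table=policy)
```

The important point is that the optimization is done offline; the within-year decision requires only the current survey result and the updated model likelihoods.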

We note that when p_{i}(t) = 1 the optimal policy and values for (x_{t}, p_{t}) are A_{i}*(x_{t}) and V_{i}*(x_{t}), respectively, for a single model i. This intuitive result follows from the fact that if p_{i}(t) = 1, p̄(x_{t+1}|x_{t}, a_{t}) = P_{i}(x_{t+1}|x_{t}, a_{t}) throughout the remainder of the time frame, so the computing algorithm (1) reduces to the HJB equation for model i:

V_{i}*(x_{t}) = max_{a_{t}} { R_{i}(a_{t}|x_{t}) + Σ_{x_{t+1}} P_{i}(x_{t+1}|x_{t}, a_{t}) V_{i}*(x_{t+1}) }    (2)

This formula describes a straightforward stochastic dynamic programming problem, which can be solved by iterative application of Equation 2 as described above.

## Optimal decision-making with partial observability

In almost all management applications, resource status is not known with certainty, and instead must be estimated at each time with field data. An estimate x̂_{t} inherits a distribution from data collected in the field, conditional on the field sampling design and the actual population size x_{t}. Let y_{t} represent field data collected at time t, and Y_{t} represent the accumulation of data up to t. Each year's monitoring effort adds to the accumulation of data, by Y_{t+1} = {Y_{t}, y_{t+1}}. Assume that an estimate x̂_{t} of resource status can be obtained as a function x̂_{t} = x̂_{t}(y_{t}|Y_{t-1}) of the data accumulated up to time t. Since y_{t} is conditional on x_{t}, the estimate x̂_{t} inherits conditional distributions f_{1}(x_{t}|x̂_{t}) and f_{2}(x̂_{t}|x_{t}) from y_{t}. The transition from x̂_{t} = x̂_{t}(y_{t}|Y_{t-1}) to x̂_{t+1} = x̂_{t+1}(y_{t+1}|Y_{t}) is given in terms of the model-specific probabilities

P_{i}(x̂_{t+1}|x̂_{t}, a_{t}) = Σ_{x_{t}} f_{1}(x_{t}|x̂_{t}) Σ_{x_{t+1}} P_{i}(x_{t+1}|x_{t}, a_{t}) f_{2}(x̂_{t+1}|x_{t+1})    (3)

Under these conditions a solution of the optimization problem for model i is obtained by iterative application of the HJB equation, expressed in terms of estimated rather than actual system state:

V_{i}*(x̂_{t}) = max_{a_{t}} { R̄_{i}(a_{t}|x̂_{t}) + Σ_{x̂_{t+1}} P_{i}(x̂_{t+1}|x̂_{t}, a_{t}) V_{i}*(x̂_{t+1}) }    (4)

where R̄_{i}(a_{t}|x̂_{t}) = Σ_{x_{t}} f_{1}(x_{t}|x̂_{t}) R_{i}(a_{t}|x_{t}) is the utility averaged over the states consistent with the estimate.

## Table 1.

Optimal regulatory choices for midcontinent mallards during the 1999 hunting season. This strategy is based on very restrictive (VR), restrictive (R), moderate (M) and liberal (L) regulatory alternatives, along with current model weights, based on a dual objective of maximizing long-term cumulative harvest and achieving a population goal of 8.7 million. An appropriate regulatory action is identified for each combination of population and habitat conditions, based on information from resource surveys. Table cells with no regulatory entries correspond to season closure.

A key feature in incorporating partial observability is the statistical association between x_{t} and x̂_{t}, from which are derived the conditional distributions f_{1}(x_{t}|x̂_{t}) and f_{2}(x̂_{t}|x_{t}). These distributions derive from the stochastic structure of the statistic x̂_{t} = x̂_{t}(y_{t}|Y_{t-1}), which is parameterized by x_{t}. The distribution f_{2}(x̂_{t}|x_{t}) arises naturally from x̂_{t} = x̂_{t}(y_{t}|Y_{t-1}), based on sampling variation in y_{t}. On the other hand, the distribution f_{1}(x_{t}|x̂_{t}) can be quite difficult to derive and is a subject of considerable theoretical interest (see below).

The extension to adaptive optimization with multiple models is straightforward. Then the HJB equation 4 becomes

V*(x̂_{t}, p_{t}) = max_{a_{t}} { R̄(a_{t}|x̂_{t}, p_{t}) + Σ_{x̂_{t+1}} p̄(x̂_{t+1}|x̂_{t}, a_{t}) V*(x̂_{t+1}, p_{t+1}) }    (5)

where V*(x̂_{t}, p_{t}) and V*(x̂_{t+1}, p_{t+1}) are defined as before and

R̄(a_{t}|x̂_{t}, p_{t}) = Σ_{i} p_{i}(t) R̄_{i}(a_{t}|x̂_{t})

To identify the transition probabilities p̄(x̂_{t+1}|x̂_{t}, a_{t}) in Equation 5 one must determine the transition probability p̄(x_{t+1}|x_{t}, a_{t}) for every state x_{t} for which there is a non-zero probability f_{1}(x_{t}|x̂_{t}), and calculate the average utility R̄(a_{t}|x̂_{t}, p_{t}) for all states with non-zero probabilities. These requirements result in a substantial increase in computations, well beyond what is required for a solution with a single model.

## Sequential identification of resource state

In the development above we assumed an estimator x̂_{t} = x̂_{t}(y_{t}|Y_{t-1}) and described the system transitions in terms of x̂_{t}. In what follows we generalize this situation somewhat by describing system transitions directly in terms of the accumulated observations. Thus, at each time in the time frame the system is observed, a management action a_{t} then is taken and the system evolves to a new state x_{t+1} at time t+1. The system is assumed to be only partially observable, so that x_{t} cannot be observed directly, and information about the system state must be obtained from y_{t}. System dynamics are recorded in terms of the transition of observations from y_{t} to y_{t+1}, rather than transitions from x_{t} to x_{t+1}. Transition probabilities, expressing the stochastic influence of environmental uncertainty, partial controllability and partial observability, can be represented in terms of

P(y_{t}|Y_{t-1}, a_{t-1}) = Σ_{x_{t-1}} Σ_{x_{t}} f_{2}(y_{t}|x_{t}) f_{ß}(x_{t}|x_{t-1}, a_{t-1}) f_{1}(x_{t-1}|Y_{t-1})    (6)

where Y_{t} = {Y_{t-1}, y_{t}} accumulates observations up to time t. Here we use f_{1} and f_{2} to denote distributions based on observation data y_{t}, in contrast to Equation 3 in which they represent distributions for the estimators x̂_{t}. We also characterize the transition probabilities by f_{ß}(x_{t}|x_{t-1}, a_{t-1}), with structural uncertainty now represented by the parameter ß.

Of the three probability distributions in Equation 6, the distribution

f_{ß}(x_{t}|x_{t-1}, a_{t-1})    (7)

captures the notion of a sequential linkage of system states, whereby the transition from x_{t-1} to x_{t} is influenced by management control a_{t-1} and is parameterized by ß (recall that the subscript i was used earlier to denote a limited suite of models). This distribution is based on stochastic models of population dynamics which themselves express hypothesized relationships among system, environmental and control variables. This formulation allows structural uncertainty to be expressed in terms of variation in the parameter ß.

The component f_{2}(y_{t}|x_{t}) of Equation 6 essentially specifies that system state informs system observations, in that variation in the observations y_{t} is conditional on state x_{t}. The idea is that observations are tied to the system state, with stochastic variation in y_{t} as a result of random sampling. This variation can be modelled with field data, based on an assumed form for f_{2}(y_{t}|x_{t}).

The lead distribution in Equation 6

f_{1}(x_{t-1}|Y_{t-1})    (8)

reflects the fact that the actual system state x_{t-1} is conditionally associated with accumulated observations Y_{t-1}. Identifying distribution (8) can be especially problematic, in large part because the conditioning variable Y_{t-1} is itself subject to stochastic and time-varying influences (i.e. sampling variability) that are not present in x_{t-1}. The stochastic identification of x_{t-1} is an example of statistical ‘calibration’ (Graybill 1976), whereby the value of a conditional predictor variable is sought given one or more values of a stochastic response variable. The difficulty here is that the predictor variable itself evolves stochastically according to the process equation

x_{t} = f_{ß}(x_{t-1}, a_{t-1}) + e_{t}    (9)

where e_{t} can be thought of as a general environmental white noise process with an assumed mean of 0 and dispersion W_{t}. Equation 9 provides the biological basis for distribution (6). To simplify notation, we suppress the subscript ß in the argument below, recognizing, however, the potential for structural uncertainty.

In what follows, we focus on the stochastic prediction of x_{t} based on the accumulated data Y_{t} up to time t, and in particular we seek the distribution f(x_{t}|Y_{t}), as represented by estimates of the conditional mean E(x_{t}|Y_{t}) and conditional dispersion Σ_{x_{t}|Y_{t}}. To begin, consider a linear dynamic system, with x_{t} a vector of time-specific state variables (in our case population and/or habitat status) and y_{t} a vector of time-specific observation variables (from e.g. breeding grounds surveys). The idea is to say something about the (actual) state x_{t} at each point in time, given a record of observations Y_{t} up to t. State transition equations for a linear system can be expressed as

x_{t} = F_{t} x_{t-1} + e_{t}    (10)

where F_{t} is a full rank n×n matrix of potentially time-varying but non-random parameters. A key structural assumption is that the environmental vector e_{t} adds to the process component F_{t} x_{t-1}. Then the first and second system moments are given by

E(x_{t}) = F_{t} E(x_{t-1})    (11)

and

Σ_{x_{t}} = F_{t} Σ_{x_{t-1}} F′_{t} + W_{t}    (12)

Note that the notation for system control has been suppressed; controls are incorporated below.

The system is assumed here to be partially observable, with observation equations

y_{t} = H_{t} x_{t} + ε_{t}    (13)

where H_{t} is a k×n matrix of potentially time-varying but non-random parameters and r(H_{t}) = min{k, n}. The vector ε_{t} represents sampling variation, and is assumed to have 0 mean and dispersion V_{t}. Then

E(y_{t}) = H_{t} E(x_{t})    (14)

and

Σ_{y_{t}} = H_{t} Σ_{x_{t}} H′_{t} + V_{t}    (15)

The linkage between system state and observation state in Equation 13 is controlled by the magnitude of dispersion V_{t}, and also by the rank of the transform H_{t}. A full rank transform with 0 dispersion is tantamount to complete observability.

The problem of estimating x_{t} is non-trivial because the transition to x_{t} each time is conditional on the previous state variable value x_{t-1}, which itself is unobservable. An iterative procedure utilizes the conditional distribution of x_{t}, given the current data y_{t} and an unbiased estimate x̂_{t-1} based on previous data. The procedure is based on the joint distribution

(x_{t}, y_{t}) ~ ( [E(x_{t}), E(y_{t})], [[Σ_{x_{t}}, Σ_{x_{t},y_{t}}], [Σ_{y_{t},x_{t}}, Σ_{y_{t}}]] )

of x_{t} and y_{t}, where E(x_{t}) and Σ_{x_{t}} are given as in Equations 11 and 12, E(y_{t}) and Σ_{y_{t}} are given as in Equations 14 and 15 and

Σ_{x_{t},y_{t}} = Σ_{x_{t}} H′_{t}

From this distribution it follows that, conditional on observations y_{t}, the distribution of system state x_{t} has the mean

E(x_{t}|y_{t}) = E(x_{t}) + Σ_{x_{t}} H′_{t} (H_{t} Σ_{x_{t}} H′_{t} + V_{t})^{-1} [y_{t} - E(y_{t})]    (16)

and dispersion

Σ_{x_{t}|y_{t}} = Σ_{x_{t}} - Σ_{x_{t}} H′_{t} (H_{t} Σ_{x_{t}} H′_{t} + V_{t})^{-1} H_{t} Σ_{x_{t}}    (17)

(Graybill 1976: 106). Substituting the estimates x̂_{t-1} and Σ_{x_{t-1}|y_{t-1}} for the expected values in Equation 16 yields the iterative algorithm

x̂_{t} = F_{t} x̂_{t-1} + R_{t} H′_{t} (H_{t} R_{t} H′_{t} + V_{t})^{-1} (y_{t} - H_{t} F_{t} x̂_{t-1})    (18)

with R_{t} = F_{t} Σ_{x_{t-1}|y_{t-1}} F′_{t} + W_{t}, for updating the estimate x̂_{t-1} at time t-1 to x̂_{t} at t based on the data y_{t}. A number of points are noteworthy in Equations 16–18:

- The estimate x̂_{t} in expression (18) is a linear combination of two components, the first of which is simply the propagation of x̂_{t-1} with the system transition matrix F_{t}, absent any updating with data. The second adjusts this propagation with a factor that accounts for the components y_{t}.
- Likewise, the conditional dispersion in expression (17) is a linear combination of two components, the first of which is the dispersion of x_{t} based on the system transition Equation 10, absent environmental and observability considerations for time t. The second component adjusts this dispersion with a factor that accounts for the latter sources of variation.
- If F_{t}, H_{t}, V_{t} and W_{t} are stationary, the evolution of moments for (x_{t}|y_{t}) displays the following patterns:
  - Both the estimate x̂_{t} and the dispersion tend to increase or decrease through time depending on the eigenvalues of F.
  - Both environmental dispersion W and observability dispersion V influence the evolution of x̂_{t} through the second term in expression (18).
  - Environmental variation and partial observability also influence the evolution of the conditional dispersion, through the second term in expression (17).
- The following conditions contribute to a large adjustment in the mean of (x_{t}|y_{t}) through time:
- The following conditions contribute to the reduction in dispersion of (x_{t}|y_{t}) through time:

With this formulation it is easy to see why adaptive decision-making is so much easier in the presence of complete observability. If V_{t} = 0 and H_{t} is of dimension n and full rank, the estimate in expression (18) becomes

x̂_{t} = H_{t}^{-1}y_{t}

and the dispersion of (x_{t}|y_{t}) in expression (17) becomes

Σ(x_{t}|y_{t}) = 0.

In essence, complete observability means that x_{t} and y_{t} encode the same information, so that the system state x_{t} at time t is known with certainty once the data y_{t} are available. The adaptive management job then becomes one of controlling an observable process with temporally varying dispersion R_{t} that is known at each point in time.
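This collapse of the filter under complete observability is easy to verify numerically. The sketch below is a minimal NumPy illustration, not part of the paper; all numbers are invented for the demonstration, and the matrix names mirror the symbols in the text.

```python
import numpy as np

# Illustrative numbers only (not from the paper): a two-dimensional state
# with a full-rank observation matrix H and zero observation dispersion V.
F = np.array([[0.9, 0.1], [0.0, 0.8]])
H = np.array([[2.0, 0.0], [1.0, 1.0]])   # dimension n, full rank
R = np.array([[0.5, 0.1], [0.1, 0.4]])   # propagated dispersion R_t
y = np.array([4.0, 3.0])
x_prior = F @ np.array([1.0, 1.0])       # propagation F_t x_{t-1}

K = R @ H.T @ np.linalg.inv(H @ R @ H.T)   # filter gain with V_t = 0
x_hat = x_prior + K @ (y - H @ x_prior)    # update of Equation 18
P = R - K @ H @ R                          # conditional dispersion, Equation 17

# The estimate collapses to H^{-1} y and the conditional dispersion to zero
print(np.allclose(x_hat, np.linalg.solve(H, y)))   # True
print(np.allclose(P, np.zeros((2, 2))))            # True
```

Algebraically, the gain reduces to H^{-1} when V_{t} = 0 and H_{t} is invertible, which is why the data determine the state exactly.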

Equation 18 is a version of the Kalman filter (Kalman 1960, Kalman & Bucy 1961), with filter gain R_{t}H′_{t}(H_{t}R_{t}H′_{t} + V_{t})^{-1} influencing both the conditional means and the conditional dispersions. On the assumption of multivariate normality for v_{t} and w_{t}, it can be shown that the Kalman filter provides a minimum-variance estimator of system state. Note that data-based adjustments occur in the estimation of x̂_{t}, but not in the temporal updating of dispersion, since the latter is influenced by sampling variation only through the sampling dispersion V_{t} in the filter gain. A readable description of the Kalman filter is given by Meinhold & Singpurwalla (1983), though their approach differs from the development here. A theoretically comprehensive description is given by Stengel (1994).
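For readers who want to experiment with the filter, the update of Equation 18 can be sketched in a few lines of NumPy. The function below is a minimal illustration under the assumptions of the text (linear transitions, additive noise); the example system is hypothetical.

```python
import numpy as np

def kalman_update(x_prev, P_prev, y, F, H, W, V):
    """One step of the filter in Equation 18: propagate the estimate with F,
    then adjust it with the gain R H'(H R H' + V)^{-1} applied to the
    prediction error. Returns the updated estimate and its conditional
    dispersion (Equation 17)."""
    R = F @ P_prev @ F.T + W                      # propagated dispersion R_t
    K = R @ H.T @ np.linalg.inv(H @ R @ H.T + V)  # filter gain
    x_pred = F @ x_prev                           # propagation, absent data
    x_new = x_pred + K @ (y - H @ x_pred)         # data-based adjustment
    P_new = R - K @ H @ R                         # conditional dispersion
    return x_new, P_new

# Hypothetical two-state system in which only the first state is observed
F = np.array([[0.9, 0.1], [0.0, 0.8]])
H = np.array([[1.0, 0.0]])
W = 0.02 * np.eye(2)
V = np.array([[0.1]])
x_hat, P = np.zeros(2), np.eye(2)
x_hat, P = kalman_update(x_hat, P, np.array([0.7]), F, H, W, V)
```

Note that even the unobserved second state is adjusted by the data, through the off-diagonal elements of the gain.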

### Incorporating harvest controls

One of the attractions of this formulation is that additional complexity can be accommodated without much effort. For example, additive controls can be included simply by incorporating an additional term a_{t-1} in transition Equation 10:

x_{t} = F_{t}x_{t-1} + a_{t-1} + e_{t}.

Then Equation 11 for the mean E(x_{t}) becomes

E(x_{t}) = F_{t}E(x_{t-1}) + a_{t-1},

Equation 16 for the conditional mean becomes

E(x_{t}|y_{t}) = E(x_{t}) + Σx_{t}H′_{t}(Σy_{t})^{-1}[y_{t} - E(y_{t})]

and the updating algorithm in Equation 18 is

x̂_{t} = F_{t}x̂_{t-1} + a_{t-1} + R_{t}H′_{t}(H_{t}R_{t}H′_{t} + V_{t})^{-1}{y_{t} - H_{t}[F_{t}x̂_{t-1} + a_{t-1}]}.

Note that because a_{t} is assumed to be non-random, Equation 17 for the dispersion of (x_{t}|y_{t}) remains unchanged. It follows that controls in a linear system can influence the system state (and thus the data-based estimate x̂_{t} of system state), but not the corresponding dispersion. This is not surprising, since a_{t} essentially constitutes a linear, non-random influence on the system mean. Equation 17 still describes a (relatively!) simple algorithm for calculating the dispersion trajectory, one that is especially easy to use if the transfer matrices are constant over time. Then the eigenstructure of the gain will be diagnostic of change in dispersion, with patterns that exhibit either explosive growth or exponential decay over the long term depending on the eigenvalues.

One can introduce partial controllability into the problem by assuming that the control a_{t} is random,

a_{t} ~ (ā_{t}, U_{t}),

where the distribution mean ā_{t} represents an 'intended' control at each point in time and the dispersion U_{t} represents uncontrolled variation about the mean (as with the other stochastic model components, we assume here that there is no autocorrelation). Then the system mean becomes

E(x_{t}) = F_{t}E(x_{t-1}) + ā_{t-1},

the system dispersion becomes

Σx_{t} = F_{t}Σx_{t-1}F′_{t} + U_{t-1} + W_{t},

and the updating algorithm becomes

x̂_{t} = F_{t}x̂_{t-1} + ā_{t-1} + R_{t}H′_{t}(H_{t}R_{t}H′_{t} + V_{t})^{-1}{y_{t} - H_{t}[F_{t}x̂_{t-1} + ā_{t-1}]},

with

R_{t} = F_{t}Σ(x_{t-1}|y_{t-1})F′_{t} + U_{t-1} + W_{t}.

Note that stochastic controls, like deterministic controls, influence the system state through the mean value ā_{t-1}. However, unlike deterministic controls they also inflate the system dispersion. Again this is not surprising, since a stochastic control simply adds another random element to the system transitions.
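The contrast between the two kinds of control is easy to verify numerically. In the sketch below (illustrative numbers only; the symbol U is our label for the control dispersion, not notation from the paper), a deterministic control leaves the dispersion recursion of Equation 12 unchanged, while a stochastic control adds the term U:

```python
import numpy as np

F = np.array([[0.95, 0.0], [0.1, 0.9]])
W = 0.02 * np.eye(2)       # environmental dispersion W_t
U = 0.05 * np.eye(2)       # dispersion of the stochastic control (our notation)
P_prev = 0.1 * np.eye(2)   # dispersion of x_{t-1}

P_det = F @ P_prev @ F.T + W       # deterministic control: Equation 12 unchanged
P_sto = F @ P_prev @ F.T + U + W   # stochastic control: extra term U

# Every component variance is inflated by the stochastic control
print(np.all(np.diag(P_sto) > np.diag(P_det)))   # True
```

The difference P_sto - P_det is exactly U, the price of partial controllability.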

### Special cases

There are several interesting special cases that correspond to restrictions on the sources of system variation. Examples include:

Constant system state: On the assumption that x_{t} = x_{t-1} = x over the time frame, system 'dynamics' really consist of variation in the system observations through time:

y_{t} = H_{t}x + v_{t}.

This formulation is simply the general linear model of classical estimation theory, with x playing the role of the parameter vector ß to be estimated and H_{t} playing the role of the predictor data matrix X. Estimation in this case proceeds in a straightforward manner, according to the usual computing algorithms for general linear models (Graybill 1976).

We note that a filter-like form can be used for recursive updating of a least-squares estimator. To illustrate this, consider a univariate response y_{t} at each point in time, with y′_{t} = [y′_{t-1}, y_{t}] and H′_{t} = [H′_{t-1}, h′_{t}] representing the accumulation of data up to time t. If x̂_{t-1} and its dispersion P_{t-1} are the least-squares estimates based on the data through time t-1, then

x̂_{t} = x̂_{t-1} + K_{t}(y_{t} - h′_{t}x̂_{t-1}),

with

K_{t} = P_{t-1}h_{t}(V_{t} + h′_{t}P_{t-1}h_{t})^{-1}.

The dispersion of x̂_{t} also can be expressed recursively as

P_{t} = P_{t-1} - P_{t-1}h_{t}(V_{t} + h′_{t}P_{t-1}h_{t})^{-1}h′_{t}P_{t-1},

or in terms of the estimator gain K_{t} as

P_{t} = (I - K_{t}h′_{t})P_{t-1}.

The recursive approach to least-squares estimation is described in Walters (1986).
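The recursive form can be checked against a batch least-squares fit. The sketch below is illustrative only: it assumes unit error variance (V_{t} = 1, so the gain is P h(1 + h′Ph)^{-1}) and uses simulated data rather than anything from the paper.

```python
import numpy as np

def rls_step(beta, P, h, y):
    """One recursive least-squares update for a univariate response y with
    predictor row h, assuming unit error variance."""
    K = P @ h / (1.0 + h @ P @ h)          # estimator gain K_t
    beta_new = beta + K * (y - h @ beta)   # adjust by the prediction error
    P_new = P - np.outer(K, h @ P)         # dispersion update (I - K h') P
    return beta_new, P_new

rng = np.random.default_rng(0)
H = rng.normal(size=(50, 3))                 # accumulated predictor rows
beta_true = np.array([1.0, -2.0, 0.5])
y = H @ beta_true + 0.1 * rng.normal(size=50)

# A diffuse starting dispersion makes the recursion approximate plain least squares
beta, P = np.zeros(3), 1e8 * np.eye(3)
for h_t, y_t in zip(H, y):
    beta, P = rls_step(beta, P, h_t, y_t)

beta_batch = np.linalg.lstsq(H, y, rcond=None)[0]  # batch estimate for comparison
print(np.allclose(beta, beta_batch, atol=1e-4))    # True
```

Processing the rows one at a time reproduces the batch estimate, which is the sense in which the filter 'accumulates' the information in H_{t}.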

Observation error only: In this case the system state varies through time according to Equation 10, except that process variation is assumed to be captured in the transition matrix F_{t}. In effect, there are no stochastic components in the state transitions, and the state trajectory is completely determined by the initial state x_{0}. The effect of this assumption is registered in a simplified dispersion for x_{t} in Equation 12, which no longer contains a term W_{t} for the dispersion of e_{t}. All other computing forms are operative, and the overall effect is to increase the system gain: basically, with less noise in the system there is more information in the accumulated data about x_{0}.

Process variation only: In this case the system contains random 'process variation' e_{t}, but it is observed perfectly. Without loss of generality, the observation Equation 13 is simplified to y_{t} = x_{t}, with H_{t} = I and V_{t} = 0. As mentioned above, the estimation issue then vanishes, and the problem reduces to the optimal control of an observable stochastic system.

## Discussion

It is worth keeping in mind that the theoretical developments of the previous section are really about the (apparently) small problem of identifying *f*_{1}(x_{t-1}|Y_{t-1}) in the decomposition of transition probabilities in the Hamilton-Jacobi-Bellman equation. By imposing rather stringent linearity conditions on the system transitions as in Equation 10, it is possible to derive efficient algorithms for estimating or 'projecting' the system state x_{t} at each point in the time frame. On the assumption that stochasticities are normally distributed, the Kalman filter can be shown to produce an optimal estimator x̂_{t}.

It also is worth keeping in mind the limitations of the Kalman filter, especially for biological systems with severe non-linearities and non-additive variance components. The linear system transitions in Equation 10 can be generalized naturally as in Equation 9, to include mathematical non-linearities as well as non-additivity in the stochastic environmental influences. Either or both features require a different approach to the derivation of *f*_{1}(x_{t-1}|Y_{t-1}). A possible attack on non-linear system dynamics is by way of a neighbourhood analysis around a system equilibrium, utilizing a Taylor series expansion to produce a quasi-linearized approximation of the system equations. This approach leads into the realm of extended Kalman filters for non-linear systems, a mathematically complex issue that nevertheless is worth exploring in the context of adaptive harvest management.

The focus of the development presented here is on the identification of resource state in the case of partial observability. Together with other recent technical developments (Williams 1996a,b, Williams, Nichols & Conroy in press, B. Lubow, pers. comm.), the preceding material provides the framework needed to actually carry out adaptive optimization. Despite the recent advances on this particular topic, technical challenges nevertheless remain in the implementation of adaptive management. Some of these challenges involve optimization computations, others involve possible changes in the nature of the objective function and still others involve more fundamental issues associated with the development of the *a priori* model set.

There are very substantial computational requirements with the approach presented here for projecting future resource state in the face of partial observability, and presented elsewhere (Williams 1996a,b, Williams et al. in press) for projecting future information state. Software development for this purpose is proceeding rapidly, however, and should permit application of this approach in the very near future (B. Lubow, pers. comm.). The stochastic dynamic programming approach to computing optimal state-specific policies (Lubow 1994, 1995) has been applied successfully to the adaptive harvest management of midcontinent mallard ducks *Anas platyrhynchos* in North America (Williams & Johnson 1995, Williams, Johnson & Wilkens 1996, Johnson, Moore, Kendall, Dubovsky, Caithamer, Kelley & Williams 1997, Johnson & Williams 1999), a problem involving a single population and four competing models of system dynamics. Recent decisions to consider the use of adaptive management for additional populations of mallards and perhaps other waterfowl species could increase the dimension of the optimization problem, adding substantially to the computational load. Although most of our practical experience is based on the management of hunting regulations, recent attention has focused on habitat acquisition and management (e.g. Johnson, Williams & Schmidt 1996, Johnson, Anderson, Baydack, Nelson, Ringelman, Koneff, Bailey, Martin & Rubec 1997). The simultaneous incorporation of both habitat and harvest actions into an adaptive framework is certainly possible, but the existence of different temporal scales of the two classes of action (e.g. annual hunting regulations, planting of nesting cover at multi-year periods, one-time acquisition of nesting or wintering habitat) may complicate both the modeling and the optimization algorithms.

The objective function currently used in mallard management focuses on the size of the harvest. Because waterfowl hunting in the United States is not a commercial enterprise, it can be argued that the objective function should incorporate measures of hunter ‘satisfaction’ that include factors in addition to harvest (Ringelman 1997, Johnson & Case 2000). Development of appropriate metrics for hunter satisfaction, monitoring programs for these metrics, and models relating satisfaction to hunting regulations represent an important technical challenge (also see Johnson & Case 2000).

The optimal policies and learning identified via the outlined approach to adaptive management are conditional on the members of the model set. This conditional nature of the process encourages substantial care and effort in the development of the competing system models. If none of the models closely approximates system dynamics in response to management actions, then the information state of the process will not evolve in the desired manner to decrease uncertainty. Because even models that are well-grounded empirically may not perform well when conditions change, we recommend that biologists focus on mechanistic models to the degree practicable (e.g. Johnson et al. 1993, Williams et al. in press). If the modelled processes themselves change over time, then the information state can be expected to evolve in the direction of the models best approximating reality. The success of the adaptive process in this situation should be a function of the relative rates of change in the underlying process versus the information state.

In addition to these technical challenges, the continuation and expansion of adaptive management of waterfowl harvests will involve a number of institutional challenges. Perhaps the most important of these are political in nature. Adaptive harvest management was implemented by the U.S. Fish and Wildlife Service in 1995, and habitat conditions have since remained favourable for mallard production. The combination of reasonable mallard populations and good habitat conditions has led to liberal hunting regulations during each of the years under adaptive management. Political pressure for changes in regulatory decisions tends to be much greater in years of poor populations and/or habitat conditions. It is hoped that harvest managers can withstand political pressures when these conditions occur, and continue to establish hunting regulations according to the policies developed through the adaptive optimization process.

We believe that political input to the adaptive management process should largely focus on the development of objective functions and the identification of regulatory alternatives. Political influence is to be expected with these two components of the adaptive management process, but politically motivated changes should be restricted to periods of program reassessment and should not occur frequently. Frequent changes in the regulatory packages, for example, limit one's ability to predict harvest rates from regulations, and thereby retard the reduction of a principal source of uncertainty (partial controllability).

Another institutional challenge involves the technical, and rather mathematical, nature of this formal approach to adaptive management. Work on this topic requires a level of technical expertise that is not common among wildlife managers. The group of scientists and managers responsible for the initial efforts with mallards is relatively small, and it is essential to maintain this group and to train other technicians in this methodology.

Despite these and other challenges (e.g. Johnson & Case 2000), we believe that adaptive processes of the type described in this paper will become increasingly important in North American waterfowl harvest management. The political and scientific selective pressures that resulted in the consideration and adoption of this process in 1995 are even stronger and more compelling today (Nichols 2000). We thus believe that the approach merits the additional research to deal with the remaining technical challenges and to expand the approach to other populations and species.