Stochastic control

From Wikipedia, the free encyclopedia

Stochastic control, or stochastic optimal control, is a subfield of control theory that deals with uncertainty either in the observations or in the noise that drives the evolution of the system. The designer assumes, in a Bayesian probability-driven fashion, that random noise with a known probability distribution affects the evolution and observation of the state variables. Stochastic control aims to design the time path of the controlled variables that performs the desired control task with minimum cost, suitably defined, despite the presence of this noise.[1] The context may be either discrete time or continuous time.

Certainty equivalence

An extremely well-studied formulation in stochastic control is that of linear quadratic Gaussian control. Here the model is linear, the objective function is the expected value of a quadratic form, and the disturbances are purely additive. A basic result for discrete-time centralized systems is the certainty equivalence property:[2] that the optimal control solution in this case is the same as would be obtained in the absence of the additive disturbances. This property is applicable to all centralized systems with linear equations of evolution, quadratic cost function, and noise entering the model only additively; the quadratic assumption allows for the optimal control laws, which follow the certainty-equivalence property, to be linear functions of the observations of the controllers.
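
As an illustration, the setting described above can be written out as follows. The notation is an assumption introduced here, not taken from the cited sources: state x_t, control u_t, zero-mean additive disturbance ε_t, matrices A and B, and cost weights Q, R, and Q_T.

```latex
\begin{aligned}
x_{t+1} &= A x_t + B u_t + \varepsilon_t, \qquad \mathbb{E}[\varepsilon_t] = 0, \\
J &= \mathbb{E}\Big[\sum_{t=0}^{T-1} \big( x_t^{\top} Q x_t + u_t^{\top} R u_t \big) + x_T^{\top} Q_T x_T \Big], \\
u_t^{*} &= -K_t \, \hat{x}_t, \qquad \hat{x}_t = \mathbb{E}\big[ x_t \mid \text{observations up to } t \big].
\end{aligned}
```

In this sketch the gain K_t depends only on A, B, Q, R, and Q_T, and not on the covariance of ε_t. This is the sense in which the solution matches the one obtained without the additive disturbances. When the state is observed without noise, the estimate equals the state itself.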

Any deviation from the above assumptions—a nonlinear state equation, a non-quadratic objective function, noise in the multiplicative parameters of the model, or decentralization of control—causes the certainty equivalence property not to hold. For example, its failure to hold for decentralized control was demonstrated in Witsenhausen's counterexample.

Discrete time

In a discrete time context, the decision-maker observes the state variable, possibly with observational noise, in each time period. The objective may be to optimize the sum of expected values of a nonlinear (possibly quadratic) objective function over all the time periods from the present to the final period of concern, or to optimize the value of the objective function as of the final period only. At each time period new observations are made, and the control variables are to be adjusted optimally. Finding the optimal solution for the present time may involve iterating a matrix Riccati equation backwards in time from the last period to the present period.
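
The backward iteration mentioned above can be sketched for the finite-horizon linear-quadratic case. This is a minimal illustration under assumed notation, not the method of any cited source: the function name `lqr_backward_riccati` and the symbols A, B, Q, R, and Q_final are hypothetical, and the state is assumed to be observed without noise.

```python
import numpy as np


def lqr_backward_riccati(A, B, Q, R, Q_final, horizon):
    """Iterate the matrix Riccati equation backward from the final period.

    Assumed model: x[t+1] = A x[t] + B u[t] + noise, with cost
    sum_t (x'Qx + u'Ru) + x_T' Q_final x_T.
    Returns (gains, costs): gains[t] is K_t in u_t = -K_t x_t,
    costs[t] is the cost-to-go matrix P_t, and costs[-1] is Q_final.
    """
    P = np.asarray(Q_final, dtype=float)
    gains = []
    costs = [P]
    for _ in range(horizon):
        # K = (R + B'PB)^{-1} B'PA, solved without forming the inverse.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # P_prev = Q + A'P(A - BK), the Riccati update.
        P = Q + A.T @ P @ (A - B @ K)
        # Symmetrize to limit round-off drift.
        P = 0.5 * (P + P.T)
        gains.append(K)
        costs.append(P)
    gains.reverse()
    costs.reverse()
    return gains, costs


if __name__ == "__main__":
    # Hypothetical two-state example (position and velocity).
    A = np.array([[1.0, 1.0], [0.0, 1.0]])
    B = np.array([[0.0], [1.0]])
    Q = np.eye(2)
    R = np.array([[1.0]])
    gains, _ = lqr_backward_riccati(A, B, Q, R, Q, horizon=20)
    print("Gain for the present period:", gains[0])
```

The covariance of the additive noise never enters the recursion, which is consistent with the certainty equivalence property described earlier. The noise affects the expected cost but not the feedback gains.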

In the discrete-time case with uncertainty about the parameter values in the transition matrix (giving the effect of current values of the state variables on their own evolution) and/or the control response matrix of the state equation, but still with a linear state equation and quadratic objective function, a Riccati equation can still be obtained for iterating backward to each period's solution even though certainty equivalence does not apply.[2] (ch. 13)[3] The discrete-time case of a non-quadratic loss function but only additive disturbances can also be handled, albeit with more complications.[4]
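
A scalar sketch shows how such a Riccati recursion can look when the coefficients themselves are random. It is an illustration under stated assumptions, not the derivation in the cited references: the state equation is x[t+1] = a_t x[t] + b_t u[t] + e_t, the pair (a_t, b_t) is drawn independently each period with known means, variances, and covariance, and the additive noise e_t has zero mean and is uncorrelated with the coefficients. The function name and all parameter names are hypothetical.

```python
def scalar_riccati_random_coeffs(a_mean, a_var, b_mean, b_var, ab_cov,
                                 q, r, p_final, horizon):
    """Backward recursion for scalar LQ control with random coefficients.

    Cost: sum_t (q x_t^2 + r u_t^2) + p_final x_T^2.
    Returns the gains k_0..k_{T-1} in u_t = -k_t x_t.
    """
    # Second moments of the random coefficients.
    e_a2 = a_mean ** 2 + a_var
    e_b2 = b_mean ** 2 + b_var
    e_ab = a_mean * b_mean + ab_cov

    p = float(p_final)
    gains = []
    for _ in range(horizon):
        # Minimize r u^2 + p E[(a x + b u)^2] over u.
        k = p * e_ab / (r + p * e_b2)
        # Value-function coefficient for the preceding period.
        p = q + p * e_a2 - (p * e_ab) ** 2 / (r + p * e_b2)
        gains.append(k)
    gains.reverse()
    return gains


if __name__ == "__main__":
    certain = scalar_riccati_random_coeffs(1.0, 0.0, 1.0, 0.0, 0.0,
                                           1.0, 1.0, 1.0, horizon=10)
    uncertain = scalar_riccati_random_coeffs(1.0, 0.0, 1.0, 1.0, 0.0,
                                             1.0, 1.0, 1.0, horizon=10)
    print("Gain with known b:    ", certain[0])
    print("Gain with uncertain b:", uncertain[0])
```

Because the variances enter the recursion through the second moments, the gain changes when the uncertainty changes. This is how the failure of certainty equivalence shows up in the sketch.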

Continuous time

If the model is in continuous time, the controller knows the state of the system at each instant of time. The objective is to maximize either an integral of, for example, a concave function of a state variable over a horizon from time zero (the present) to a terminal time T, or a concave function of a state variable at some future date T. As time evolves, new observations are continuously made and the control variables are continuously adjusted in optimal fashion.

In finance

In a continuous time approach in a finance context, the state variable in the stochastic differential equation is usually wealth or net worth, and the controls are the shares placed at each time in the various assets. Given the asset allocation chosen at any time, the determinants of the change in wealth are usually the stochastic returns to assets and the interest rate on the risk-free asset. The field of stochastic control has developed greatly since the 1970s, particularly in its applications to finance. Robert Merton[5] used stochastic control to study optimal portfolios of safe and risky assets. His work and that of Black and Scholes changed the nature of the finance literature. Major mathematical treatments are due to W. Fleming and R. Rishel[6] and W. Fleming and M. Soner.[7] These techniques were applied by J. L. Stein to the U.S. financial crisis of the 2000s.[8]

The maximization, say of the expected logarithm of net worth at a terminal date T, is subject to stochastic processes on the components of wealth. In this case, in continuous time the Itô equation is the main tool of analysis. In the case where the objective is the integral of a concave utility function over a horizon (0, T), dynamic programming is used. There is no certainty equivalence as in the older literature, because the coefficients of the control variables (that is, the returns received by the chosen shares of assets) are stochastic.
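
For the terminal log-wealth objective, a worked sketch in the simplest setting may help. The setting is an assumption made here for illustration: one risk-free asset with constant rate r, one risky asset with constant drift μ and volatility σ, a Brownian motion B_t, and a bounded share π_t of wealth W_t held in the risky asset.

```latex
\begin{aligned}
dW_t &= W_t \big[ r + \pi_t (\mu - r) \big] \, dt + W_t \, \pi_t \, \sigma \, dB_t, \\
d \ln W_t &= \big[ r + \pi_t (\mu - r) - \tfrac{1}{2} \pi_t^{2} \sigma^{2} \big] \, dt + \pi_t \, \sigma \, dB_t
  \quad \text{(by It\^o's formula)}, \\
\mathbb{E}[\ln W_T] &= \ln W_0 + \mathbb{E} \int_0^T \big[ r + \pi_t (\mu - r) - \tfrac{1}{2} \pi_t^{2} \sigma^{2} \big] \, dt, \\
\pi^{*} &= \frac{\mu - r}{\sigma^{2}}.
\end{aligned}
```

The third line uses the fact that the stochastic integral has zero expectation when π_t is bounded. The integrand is then maximized pointwise in π_t, which gives the constant share in the last line. The control multiplies the random return, so the volatility σ appears in the optimal share and the noise cannot be ignored as it can under certainty equivalence.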

References

  1. ^ Definition from Answers.com
  2. ^ a b Chow, Gregory C., Analysis and Control of Dynamic Economic Systems, Wiley, 1976.
  3. ^ Turnovsky, Stephen, "Optimal stabilization policies for stochastic linear systems: The case of correlated multiplicative and additive disturbances," Review of Economic Studies 43(1), 1976, 191-94.
  4. ^ Mitchell, Douglas W., "Tractable risk sensitive control based on approximate expected utility," Economic Modelling, April 1990, 161-164.
  5. ^ Robert Merton, Continuous Time Finance, Blackwell (1990)
  6. ^ W. Fleming and R. Rishel, Deterministic and Stochastic Optimal Control (1975)
  7. ^ W. Fleming and M. Soner, Controlled Markov Processes and Viscosity Solutions, Springer (2006)
  8. ^ J. L. Stein, Stochastic Optimal Control and the US Financial Crisis, Springer-Science (2012).

See also