Statistical manifold

From Wikipedia, the free encyclopedia

In mathematics, a statistical manifold is a Riemannian manifold each of whose points is a probability distribution. Statistical manifolds provide the setting for the field of information geometry, and the Fisher information metric supplies the Riemannian metric on these manifolds.

Examples

The family of all normal distributions, parametrized by the expected value μ and the variance σ² > 0, with the Riemannian metric given by the Fisher information matrix, is a statistical manifold. Its geometry is modeled on hyperbolic space: the Fisher metric on this two-parameter family has constant negative curvature.
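The Fisher metric of the normal family can be checked numerically. The sketch below is illustrative only: it uses the coordinates (μ, σ) rather than (μ, σ²), and the function names, integration window, and step count are assumptions made for this example. It approximates g_ij = E[∂_i log p · ∂_j log p] with a midpoint rule, which should come out close to the standard closed form diag(1/σ², 2/σ²), that is, ds² = (dμ² + 2 dσ²)/σ², a rescaled upper half-plane metric.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of the normal distribution N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def score(x, mu, sigma):
    """Partial derivatives of log p(x) with respect to (mu, sigma)."""
    d_mu = (x - mu) / sigma ** 2
    d_sigma = ((x - mu) ** 2 - sigma ** 2) / sigma ** 3
    return (d_mu, d_sigma)

def fisher_metric_normal(mu, sigma, half_width=12.0, steps=20000):
    """Approximate g_ij = E[score_i * score_j] by a midpoint rule.

    The integral is truncated to mu +/- half_width * sigma; both
    half_width and steps are illustrative choices, not tuned values.
    """
    lo = mu - half_width * sigma
    dx = 2.0 * half_width * sigma / steps
    g = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        weight = normal_pdf(x, mu, sigma) * dx
        s = score(x, mu, sigma)
        for i in range(2):
            for j in range(2):
                g[i][j] += s[i] * s[j] * weight
    return g

g = fisher_metric_normal(mu=1.0, sigma=2.0)
```

With σ = 2 the diagonal entries should be near 1/σ² and 2/σ², and the off-diagonal entries near zero. The metric does not depend on μ, which reflects the translation symmetry of the family.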

A simple example of a statistical manifold, taken from physics, is the canonical ensemble: it is a one-dimensional manifold, with the temperature T serving as the coordinate. For each fixed temperature T, one has a probability distribution; for a gas of atoms, it is the distribution of the velocities of the atoms. As the temperature T varies, the probability distribution varies with it.
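As a rough illustration of temperature as a coordinate, the sketch below replaces the gas of atoms with a hypothetical system having a handful of discrete energy levels, in units where Boltzmann's constant is 1. These simplifications and all function names are assumptions made for this example. For Boltzmann probabilities p_i(T) ∝ exp(−E_i/T), the derivative of log p_i with respect to T is (E_i − ⟨E⟩)/T², so the one-dimensional Fisher metric should equal Var(E)/T⁴. The code compares that expression with a finite-difference estimate.

```python
import math

def boltzmann(energies, T):
    """Canonical-ensemble probabilities p_i = exp(-E_i / T) / Z, with k_B = 1."""
    weights = [math.exp(-e / T) for e in energies]
    Z = sum(weights)
    return [w / Z for w in weights]

def fisher_information_T(energies, T, h=1e-5):
    """Fisher information in the coordinate T, via central differences of log p_i."""
    p = boltzmann(energies, T)
    p_up = boltzmann(energies, T + h)
    p_dn = boltzmann(energies, T - h)
    total = 0.0
    for pi, pu, pd in zip(p, p_up, p_dn):
        dlog = (math.log(pu) - math.log(pd)) / (2.0 * h)
        total += pi * dlog ** 2
    return total

def energy_variance(energies, T):
    """Variance of the energy under the Boltzmann distribution at temperature T."""
    p = boltzmann(energies, T)
    mean = sum(pi * e for pi, e in zip(p, energies))
    return sum(pi * (e - mean) ** 2 for pi, e in zip(p, energies))

levels = [0.0, 1.0, 2.0]   # hypothetical energy levels
T = 1.5
numeric = fisher_information_T(levels, T)
closed_form = energy_variance(levels, T) / T ** 4
```

The two values should agree to within the finite-difference error. The metric shrinks at high temperature, where a given change in T alters the distribution less.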

Another simple example, taken from medicine, is the probability distribution of patient outcomes in response to the quantity of medicine administered. For a fixed dose, some patients improve and some do not: this is the base probability space. If the dose is varied, the probabilities of the outcomes change, so the dose is the coordinate on the manifold. For this to be a smooth manifold, one would have to measure outcomes in response to arbitrarily small changes in dose; the example is therefore not practically realizable unless one has a pre-existing mathematical model of dose-response in which the dose can be varied continuously.
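To make the dose-response example concrete, the sketch below assumes a logistic model for the probability of improvement, p(d) = 1/(1 + exp(−(a + b·d))). The model, the coefficients a and b, and the function names are all hypothetical and chosen only for illustration. Each dose d then selects a Bernoulli distribution, the Fisher information in the coordinate d is p′(d)²/(p(1 − p)) = b²·p(1 − p), and the Fisher-Rao distance between two doses has the closed form 2·|arcsin √p₁ − arcsin √p₂|.

```python
import math

def response_probability(dose, a=-2.0, b=0.8):
    """Hypothetical logistic dose-response curve: probability that a patient improves."""
    return 1.0 / (1.0 + math.exp(-(a + b * dose)))

def dose_fisher_information(dose, a=-2.0, b=0.8):
    """Fisher information of the Bernoulli outcome with respect to the dose."""
    p = response_probability(dose, a, b)
    return (b ** 2) * p * (1.0 - p)

def dose_distance(d1, d2, a=-2.0, b=0.8):
    """Fisher-Rao distance between the outcome distributions at two doses."""
    p1 = response_probability(d1, a, b)
    p2 = response_probability(d2, a, b)
    return 2.0 * abs(math.asin(math.sqrt(p1)) - math.asin(math.sqrt(p2)))

info_mid = dose_fisher_information(2.5)   # dose at which p = 1/2 for these coefficients
gap = dose_distance(1.0, 4.0)
```

Under this assumed model the information peaks where p = 1/2: near that dose, a small change in dose changes the outcome distribution the most.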

Definition

Let X be an orientable manifold, and let (X,\Sigma,\mu) be a measure space on X. Equivalently, let (\Omega, \mathcal{F},P) be a probability space on \Omega=X, with sigma-algebra \mathcal{F}=\Sigma and probability measure P=\mu.

The statistical manifold S(X) of X is defined as the space of all measures \mu on X (with the sigma-algebra \Sigma held fixed). Note that this space is infinite-dimensional; it is commonly taken to be a Fréchet space. The points of S(X) are measures.

Rather than dealing with the infinite-dimensional space S(X), it is common to work with a finite-dimensional submanifold, defined by considering a set of probability distributions parameterized by some smooth, continuously varying parameter \theta. That is, one considers only those measures that are selected by the parameter. If the parameter \theta is n-dimensional, then, in general, the submanifold will be n-dimensional as well. All finite-dimensional statistical manifolds can be understood in this way.
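The idea of a parameterized family can be sketched for distributions on a finite set, where each family is a map from \theta to a list of probabilities. The helper below is a minimal illustration under that assumption: the names fisher_matrix, categorical3, and binomial2 are hypothetical, and derivatives are taken by central differences. It computes the n-by-n Fisher information matrix of a family with an n-dimensional parameter and applies it to a two-parameter family (all distributions on three outcomes) and to a one-parameter subfamily of it (binomial with two trials).

```python
import math

def fisher_matrix(family, theta, h=1e-6):
    """Fisher information matrix of a discrete family at the parameter theta.

    family maps a list of parameters to a list of strictly positive
    probabilities; partial derivatives of log p_k use central differences.
    """
    n = len(theta)
    p = family(theta)
    dlog = []
    for i in range(n):
        up = list(theta)
        dn = list(theta)
        up[i] += h
        dn[i] -= h
        p_up, p_dn = family(up), family(dn)
        dlog.append([(math.log(a) - math.log(b)) / (2.0 * h)
                     for a, b in zip(p_up, p_dn)])
    return [[sum(pk * dlog[i][k] * dlog[j][k] for k, pk in enumerate(p))
             for j in range(n)]
            for i in range(n)]

def categorical3(theta):
    """All distributions on three outcomes: a two-dimensional family."""
    p1, p2 = theta
    return [p1, p2, 1.0 - p1 - p2]

def binomial2(theta):
    """Binomial distribution with two trials: a one-dimensional subfamily."""
    (t,) = theta
    return [(1.0 - t) ** 2, 2.0 * t * (1.0 - t), t ** 2]

g_cat = fisher_matrix(categorical3, [0.2, 0.3])   # 2 x 2 matrix
g_bin = fisher_matrix(binomial2, [0.3])           # 1 x 1 matrix
```

For the three-outcome family the entries should be close to δ_ij/p_i + 1/p_3, and for the binomial family close to 2/(θ(1 − θ)). The size of the matrix equals the number of parameters, matching the dimension of the submanifold.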