# Disintegration theorem

In mathematics, the disintegration theorem is a result in measure theory and probability theory. It rigorously defines the idea of a non-trivial "restriction" of a measure to a measure zero subset of the measure space in question. It is related to the existence of conditional probability measures. In a sense, "disintegration" is the opposite process to the construction of a product measure.

## Motivation

Consider the unit square in the Euclidean plane R2, S = [0, 1] × [0, 1]. Consider the probability measure μ defined on S by the restriction of two-dimensional Lebesgue measure λ2 to S. That is, the probability of an event ES is simply the area of E. We assume E is a measurable subset of S.

Consider a one-dimensional subset of S such as the line segment Lx = {x} × [0, 1]. Lx has μ-measure zero; every subset of Lx is a μ-null set; since the Lebesgue measure space is a complete measure space,

${\displaystyle E\subseteq L_{x}\implies \mu (E)=0.}$

While true, this is somewhat unsatisfying. It would be nice to say that μ "restricted to" Lx is the one-dimensional Lebesgue measure λ1, rather than the zero measure. The probability of a "two-dimensional" event E could then be obtained as an integral of the one-dimensional probabilities of the vertical "slices" ELx: more formally, if μx denotes one-dimensional Lebesgue measure on Lx, then

${\displaystyle \mu (E)=\int _{[0,1]}\mu _{x}(E\cap L_{x})\,\mathrm {d} x}$

for any "nice" ES. The disintegration theorem makes this argument rigorous in the context of measures on metric spaces.

## Statement of the theorem

(Hereafter, P(X) will denote the collection of Borel probability measures on a metric space (X, d).)

Let Y and X be two Radon spaces (i.e. separable metric spaces on which every probability measure is a Radon measure). Let μ ∈ P(Y), let π : YX be a Borel-measurable function, and let ${\displaystyle \nu }$P(X) be the pushforward measure ${\displaystyle \nu }$ = π(μ) = μ ∘ π−1. Then there exists a ${\displaystyle \nu }$-almost everywhere uniquely determined family of probability measures {μx}xXP(Y) such that

• the function ${\displaystyle x\mapsto \mu _{x}}$ is Borel measurable, in the sense that ${\displaystyle x\mapsto \mu _{x}(B)}$ is a Borel-measurable function for each Borel-measurable set BY;
• μx "lives on" the fiber π−1(x): for ${\displaystyle \nu }$-almost all xX,
${\displaystyle \mu _{x}\left(Y\setminus \pi ^{-1}(x)\right)=0,}$
and so μx(E) = μx(E ∩ π−1(x));
• for every Borel-measurable function f : Y → [0, ∞],
${\displaystyle \int _{Y}f(y)\,\mathrm {d} \mu (y)=\int _{X}\int _{\pi ^{-1}(x)}f(y)\,\mathrm {d} \mu _{x}(y)\mathrm {d} \nu (x).}$
In particular, for any event EY, taking f to be the indicator function of E,[1]
${\displaystyle \mu (E)=\int _{X}\mu _{x}\left(E\right)\,\mathrm {d} \nu (x).}$

## Applications

### Product spaces

The original example was a special case of the problem of product spaces, to which the disintegration theorem applies.

When Y is written as a Cartesian product Y = X1 × X2 and πi : YXi is the natural projection, then each fibre π1−1(x1) can be canonically identified with X2 and there exists a Borel family of probability measures ${\displaystyle \{\mu _{x_{1}}\}_{x_{1}\in X_{1}}}$ in P(X2) (which is (π1)(μ)-almost everywhere uniquely determined) such that

${\displaystyle \mu =\int _{X_{1}}\mu _{x_{1}}\,\mu \left(\pi _{1}^{-1}(\mathrm {d} x_{1})\right)=\int _{X_{1}}\mu _{x_{1}}\,\mathrm {d} (\pi _{1})_{*}(\mu )(x_{1}),}$

which is in particular

${\displaystyle \int _{X_{1}\times X_{2}}f(x_{1},x_{2})\,\mu (\mathrm {d} x_{1},\mathrm {d} x_{2})=\int _{X_{1}}\left(\int _{X_{2}}f(x_{1},x_{2})\mu (\mathrm {d} x_{2}|x_{1})\right)\mu \left(\pi _{1}^{-1}(\mathrm {d} x_{1})\right)}$

and

${\displaystyle \mu (A\times B)=\int _{A}\mu \left(B|x_{1}\right)\,\mu \left(\pi _{1}^{-1}(\mathrm {d} x_{1})\right).}$

The relation to conditional expectation is given by the identities

${\displaystyle \operatorname {E} (f|\pi _{1})(x_{1})=\int _{X_{2}}f(x_{1},x_{2})\mu (\mathrm {d} x_{2}|x_{1}),}$
${\displaystyle \mu (A\times B|\pi _{1})(x_{1})=1_{A}(x_{1})\cdot \mu (B|x_{1}).}$

### Vector calculus

The disintegration theorem can also be seen as justifying the use of a "restricted" measure in vector calculus. For instance, in Stokes' theorem as applied to a vector field flowing through a compact surface Σ ⊂ R3, it is implicit that the "correct" measure on Σ is the disintegration of three-dimensional Lebesgue measure λ3 on Σ, and that the disintegration of this measure on ∂Σ is the same as the disintegration of λ3 on ∂Σ.[2]

### Conditional distributions

The disintegration theorem can be applied to give a rigorous treatment of conditional probability distributions in statistics, while avoiding purely abstract formulations of conditional probability.[3]