= Disintegration theorem =

In mathematics, the disintegration theorem is a result in measure theory and probability theory. It rigorously defines the idea of a non-trivial "restriction" of a measure to a measure zero subset of the measure space in question. It is related to the existence of conditional probability measures. In a sense, "disintegration" is the opposite process to the construction of a product measure.

==Motivation==
Consider the unit square $S = [0,1]\times[0,1]$ in the Euclidean plane $\mathbb{R}^2$. Consider the probability measure $\mu$ defined on $S$ by the restriction of two-dimensional Lebesgue measure $\lambda^2$ to $S$. That is, the probability of an event $E\subseteq S$ is simply the area of $E$. We assume $E$ is a measurable subset of $S$.

Consider a one-dimensional subset of $S$ such as the line segment $L_x = \{x\}\times[0, 1]$. $L_x$ has $\mu$-measure zero, and since the Lebesgue measure space is a complete measure space, every subset of $L_x$ is a $\mu$-null set:
$E \subseteq L_{x} \implies \mu (E) = 0.$

While true, this is somewhat unsatisfying. It would be nice to say that $\mu$ "restricted to" $L_x$ is the one-dimensional Lebesgue measure $\lambda^1$, rather than the zero measure. The probability of a "two-dimensional" event $E$ could then be obtained as an integral of the one-dimensional probabilities of the vertical "slices" $E\cap L_x$: more formally, if $\mu_x$ denotes one-dimensional Lebesgue measure on $L_x$, then
$\mu (E) = \int_{[0, 1]} \mu_{x} (E \cap L_{x}) \, \mathrm{d} x$
for any "nice" $E\subseteq S$. The disintegration theorem makes this argument rigorous in the context of measures on metric spaces.
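
As a numerical illustration of the slicing formula, the following minimal sketch takes the hypothetical event $E = \{(x,y)\in S : y \le x^2\}$ (chosen only for this example), whose vertical slice $E\cap L_x$ has one-dimensional Lebesgue measure $x^2$, and approximates the integral of the slice lengths with a midpoint rule. The function names are illustrative assumptions.

```python
def slice_length(x: float) -> float:
    """Length of the vertical slice E ∩ L_x for E = {(x, y) in S : y <= x**2}."""
    return x * x


def integrate_slices(n: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral over [0, 1] of the slice lengths."""
    h = 1.0 / n
    return sum(slice_length((i + 0.5) * h) for i in range(n)) * h


# The area of E under the parabola y = x**2 on the unit square is 1/3.
area = integrate_slices()
```

The computed value agrees with the area $1/3$ of $E$ up to discretization error, as the formula predicts for this particular set.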

==Statement of the theorem==
(Hereafter, $\mathcal{P}(X)$ will denote the collection of Borel probability measures on a topological space $(X, T)$.)
The assumptions of the theorem are as follows:
- Let $Y$ and $X$ be two Polish spaces (i.e. separable completely metrizable spaces).
- Let $\mu\in\mathcal{P}(Y)$.
- Let $\pi : Y\to X$ be a Borel-measurable function. Here one should think of $\pi$ as a function to "disintegrate" $Y$, in the sense of partitioning $Y$ into $\{ \pi^{-1}(x)\ |\ x \in X\}$. For example, for the motivating example above, one can define $\pi((a,b)) = a$ for $(a,b) \in [0,1]\times [0,1]$, which gives $\pi^{-1}(a) = \{a\} \times [0,1]$, a slice we want to capture.
- Let $\nu \in\mathcal{P}(X)$ be the pushforward measure $\nu = \pi_{*}(\mu) = \mu \circ \pi^{-1}$. This measure provides the distribution of $x$ (which corresponds to the events $\pi^{-1}(x)$).

The conclusion of the theorem: There exists a $\nu$-almost everywhere uniquely determined family of probability measures $\{\mu_x\}_{x\in X} \subseteq \mathcal{P}(Y)$, which provides a "disintegration" of $\mu$ into $\{\mu_x\}_{x\in X}$, such that:
- the function $x \mapsto \mu_{x}$ is Borel measurable, in the sense that $x \mapsto \mu_{x} (B)$ is a Borel-measurable function for each Borel-measurable set $B\subseteq Y$;
- $\mu_x$ "lives on" the fiber $\pi^{-1}(x)$: for $\nu$-almost all $x\in X$, $\mu_{x} \left( Y \setminus \pi^{-1} (x) \right) = 0,$ and so $\mu_x(E) =\mu_x(E\cap\pi^{-1}(x))$;
- for every Borel-measurable function $f : Y \to [0,\infty]$, $\int_{Y} f(y) \, \mathrm{d} \mu (y) = \int_{X} \int_{\pi^{-1} (x)} f(y) \, \mathrm{d} \mu_x (y) \, \mathrm{d} \nu (x).$ In particular, for any event $E\subseteq Y$, taking $f$ to be the indicator function of $E$, $\mu (E) = \int_X \mu_x (E) \, \mathrm{d} \nu (x),$ which shows that the family $\{\mu_x\}_{x \in X}$ is a regular conditional probability.
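
The statement can be illustrated in the simplest setting of finite sets, which are Polish spaces under the discrete topology. The following is a minimal sketch, not a general implementation: the function `disintegrate`, the example measure `mu`, and the test function `f` are all hypothetical choices made for this illustration. It computes the pushforward $\nu$ and the fiber measures $\mu_x$, then checks the integration identity exactly using rational arithmetic.

```python
from collections import defaultdict
from fractions import Fraction


def disintegrate(mu, pi):
    """Disintegrate a finite probability measure mu (dict: point -> mass) along pi.

    Returns (nu, fibers): nu is the pushforward of mu under pi, and fibers[x]
    is the probability measure mu_x supported on the fiber pi^{-1}(x).
    Only points x with nu(x) > 0 receive a fiber measure.
    """
    nu = defaultdict(Fraction)
    for y, mass in mu.items():
        nu[pi(y)] += mass
    fibers = {
        x: {y: mass / total for y, mass in mu.items() if pi(y) == x}
        for x, total in nu.items()
        if total > 0
    }
    return dict(nu), fibers


# Hypothetical example: Y is a set of pairs (a, b) and pi projects onto a.
mu = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(1, 8), (0, 2): Fraction(1, 4),
    (1, 0): Fraction(1, 6), (1, 1): Fraction(1, 3),
}
pi = lambda y: y[0]
nu, fibers = disintegrate(mu, pi)

# Check the integration identity for an arbitrary non-negative test function f.
f = lambda y: y[0] + y[1] ** 2
lhs = sum(f(y) * mass for y, mass in mu.items())
rhs = sum(nu[x] * sum(f(y) * m for y, m in fibers[x].items()) for x in fibers)
```

In this discrete case $\mu_x$ is simply $\mu$ restricted to the fiber and renormalized by $\nu(\{x\})$; the theorem is needed precisely when fibers have measure zero and such a division is unavailable.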

==Applications==

===Product spaces===

The original example was a special case of the problem of product spaces, to which the disintegration theorem applies.

When $Y$ is written as a Cartesian product $Y = X_1\times X_2$ and $\pi_i : Y\to X_i$ is the natural projection, then each fibre $\pi_1^{-1}(x_1)$ can be canonically identified with $X_2$ and there exists a Borel family of probability measures $\{ \mu_{x_{1}} \}_{x_{1} \in X_{1}}$ in $\mathcal{P}(X_2)$ (which is $(\pi_1)_*(\mu)$-almost everywhere uniquely determined) such that
$\mu = \int_{X_{1}} \mu_{x_{1}} \, \mu \left(\pi_1^{-1}(\mathrm d x_1) \right)= \int_{X_{1}} \mu_{x_{1}} \, \mathrm{d} (\pi_{1})_{*} (\mu) (x_{1}),$
which is in particular
$\int_{X_1\times X_2} f(x_1,x_2)\, \mu(\mathrm d x_1,\mathrm d x_2) = \int_{X_1}\left( \int_{X_2} f(x_1,x_2) \mu(\mathrm d x_2\mid x_1) \right) \mu\left( \pi_1^{-1}(\mathrm{d} x_{1})\right)$
and
$\mu(A \times B) = \int_A \mu\left(B\mid x_1\right) \, \mu\left( \pi_1^{-1}(\mathrm{d} x_{1})\right).$

The relation to conditional expectation is given by the identities
$\operatorname E(f\mid \pi_1)(x_1)= \int_{X_2} f(x_1,x_2) \mu(\mathrm d x_2\mid x_1),$
$\mu(A\times B\mid \pi_1)(x_1)= 1_A(x_1) \cdot \mu(B\mid x_1).$
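
As a worked illustration (the density below is a hypothetical choice made for this example), let $X_1 = X_2 = [0,1]$ and let $\mu$ have density $p(x_1,x_2) = x_1 + x_2$ with respect to $\lambda^2$. The marginal $(\pi_1)_*(\mu)$ then has density
$p_1(x_1) = \int_0^1 (x_1 + x_2)\,\mathrm d x_2 = x_1 + \tfrac12,$
and one version of the disintegration is
$\mu(\mathrm d x_2\mid x_1) = \frac{x_1 + x_2}{x_1 + \tfrac12}\,\mathrm d x_2.$
For $f(x_1,x_2) = x_2$ this gives
$\operatorname E(f\mid \pi_1)(x_1) = \int_0^1 x_2\,\frac{x_1+x_2}{x_1+\tfrac12}\,\mathrm d x_2 = \frac{3x_1+2}{6x_1+3},$
and integrating against the marginal recovers the unconditional expectation:
$\int_0^1 \frac{3x_1+2}{6x_1+3}\left(x_1+\tfrac12\right)\mathrm d x_1 = \int_0^1 \left(\tfrac{x_1}{2}+\tfrac13\right)\mathrm d x_1 = \tfrac{7}{12} = \int_{X_1\times X_2} x_2\,\mu(\mathrm d x_1,\mathrm d x_2).$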

===Vector calculus===
The disintegration theorem can also be seen as justifying the use of a "restricted" measure in vector calculus. For instance, in Stokes' theorem as applied to a vector field flowing through a compact surface $\Sigma \subset \mathbb{R}^3$, it is implicit that the "correct" measure on $\Sigma$ is the disintegration of three-dimensional Lebesgue measure $\lambda^3$ on $\Sigma$, and that the disintegration of this measure on $\partial\Sigma$ is the same as the disintegration of $\lambda^3$ on $\partial\Sigma$.

===Conditional distributions===
The disintegration theorem can be applied to give a rigorous treatment of conditional probability distributions in statistics, while avoiding purely abstract formulations of conditional probability. The theorem is related to the Borel–Kolmogorov paradox, for example.

==See also==

- Ionescu-Tulcea theorem
- Joint probability distribution
- Copula (statistics)
- Conditional expectation
- Borel–Kolmogorov paradox
- Regular conditional probability
- Lifting theory
