= Giry monad =

In mathematics, the Giry monad is a construction that assigns to a measurable space a space of probability measures over it, equipped with a canonical sigma-algebra. It is one of the main examples of a probability monad.

It is implicitly used in probability theory whenever one considers probability measures which depend measurably on a parameter (giving rise to Markov kernels), or when one has probability measures over probability measures (such as in de Finetti's theorem).

Like many constructions that can be iterated, it has the category-theoretic structure of a monad on the category of measurable spaces.

==Construction==

The Giry monad, like every monad, consists of three structures:

- A functorial assignment, which in this case assigns to a measurable space $X$ a space of probability measures $PX$ over it;
- A natural map $\delta_X:X\to PX$ called the unit, which in this case assigns to each element of a space the Dirac measure over it;
- A natural map $\mathcal{E}_X:PPX\to PX$ called the multiplication, which in this case assigns to each probability measure over probability measures its expected value.

===The space of probability measures===

Let $(X, \mathcal{F})$ be a measurable space.
Denote by $PX$ the set of probability measures over $(X, \mathcal{F})$.
We equip the set $PX$ with a sigma-algebra as follows. First of all, for every measurable set $A\in \mathcal{F}$, define the map $\varepsilon_A:PX\to\mathbb{R}$ by $p\longmapsto p(A)$.
We then define the sigma-algebra $\mathcal{PF}$ on $PX$ to be the smallest sigma-algebra which makes the maps $\varepsilon_A$ measurable, for all $A\in\mathcal{F}$ (where $\mathbb{R}$ is assumed equipped with the Borel sigma-algebra).

Equivalently, $\mathcal{PF}$ can be defined as the smallest sigma-algebra on $PX$ which makes the maps
$p\longmapsto\int_X f \,dp$
measurable for all bounded measurable $f:X\to\mathbb{R}$.

The assignment $(X,\mathcal{F})\mapsto (PX,\mathcal {PF})$ is part of an endofunctor on the category of measurable spaces, usually denoted again by $P$. Its action on morphisms, i.e. on measurable maps, is via the pushforward of measures.
Namely, given a measurable map $f:(X,\mathcal{F})\to(Y,\mathcal{G})$, one assigns to $f$ the map $f_*:(PX,\mathcal {PF})\to(PY,\mathcal {PG})$ defined by
$f_*p\,(B)=p(f^{-1}(B))$
for all $p\in PX$ and all measurable sets $B\in\mathcal{G}$.
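
For finitely supported measures, the pushforward amounts to adding up the mass of all points that share the same image. The following Python sketch illustrates this under the assumption that a probability measure is encoded as a dictionary from outcomes to probabilities. The name `pushforward` and the dictionary encoding are illustrative choices, not part of the construction above.

```python
from fractions import Fraction

def pushforward(f, p):
    """Return f_*p for a finite distribution p given as {outcome: probability}."""
    q = {}
    for x, weight in p.items():
        # All mass sitting on x moves to f(x); points with equal image are summed.
        q[f(x)] = q.get(f(x), 0) + weight
    return q

# A fair die pushed forward along the parity map k -> k mod 2.
die = {k: Fraction(1, 6) for k in range(1, 7)}
parity = pushforward(lambda k: k % 2, die)

# Mass that p assigns to the preimage of B = {1}, i.e. to the odd faces.
preimage_mass = sum(w for k, w in die.items() if k % 2 in {1})
```

In this example `parity[1]` agrees with `preimage_mass`, which is the defining identity $f_*p\,(B)=p(f^{-1}(B))$ for $B=\{1\}$.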

===The Dirac delta map===

Given a measurable space $(X,\mathcal{F})$, the map $\delta:(X,\mathcal{F})\to(PX,\mathcal{PF})$ maps an element $x\in X$ to the Dirac measure $\delta_x\in PX$, defined on measurable subsets $A\in\mathcal{F}$ by
$\delta_x(A) = 1_A(x) =
\begin{cases}
1 & \text{if }x\in A, \\
0 & \text{if }x\notin A.
\end{cases}$
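
As a minimal illustration in the finite case, a Dirac measure can be encoded as a dictionary with a single entry. The helper names `dirac` and `measure` below are hypothetical, and the dictionary encoding is an assumption made only for this sketch.

```python
def dirac(x):
    """Dirac measure at x, encoded as {outcome: probability}."""
    return {x: 1}

def measure(p, A):
    """Evaluate the finite distribution p on a subset A (a Python set)."""
    return sum(w for y, w in p.items() if y in A)

delta_3 = dirac(3)
on_odds = measure(delta_3, {1, 3, 5})    # 3 lies in the set
on_evens = measure(delta_3, {2, 4, 6})   # 3 does not lie in the set
```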

===The expectation map===

Let $\mu\in PPX$, i.e. a probability measure over the probability measures over $(X,\mathcal{F})$. We define the probability measure $\mathcal{E}\mu\in PX$ by
$\mathcal{E}\mu(A) = \int_{PX} p(A)\,\mu(dp)$
for all measurable $A\in\mathcal{F}$.
This gives a measurable, natural map $\mathcal{E}:(PPX,\mathcal{PPF})\to(PX,\mathcal{PF})$.

====Example: mixture distributions====

A mixture distribution, or more generally a compound distribution, can be seen as an application of the map $\mathcal{E}$.
To illustrate, take a finite mixture. Let $p_1,\dots,p_n$ be probability measures on $(X,\mathcal{F})$, and consider the probability measure $q$ given by the mixture
$q(A) = \sum_{i=1}^n w_i\,p_i(A)$
for all measurable $A\in\mathcal{F}$, for some weights $w_i\ge 0$ satisfying $w_1+\dots+w_n=1$.
We can view the mixture $q$ as the average $q=\mathcal{E}\mu$, where the measure on measures $\mu\in PPX$, which in this case is discrete, is given by
$\mu = \sum_{i=1}^n w_i\,\delta_{p_i} .$
More generally, the map $\mathcal{E}:PPX\to PX$ can be seen as the most general, non-parametric way to form arbitrary mixture or compound distributions.
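
The finite mixture above can be sketched in Python as follows. The encoding is an assumption of this example: a measure on $X$ is a dictionary from outcomes to probabilities, and a finitely supported measure on measures is a list of `(measure, weight)` pairs. The function name `expectation` is illustrative.

```python
from fractions import Fraction

def expectation(mu):
    """Average a finitely supported measure on measures.

    mu is a list of (p, weight) pairs, each p a dict {outcome: probability}.
    """
    q = {}
    for p, weight in mu:
        for x, mass in p.items():
            # Contribution of p to q({x}) is weight * p({x}).
            q[x] = q.get(x, 0) + weight * mass
    return q

p1 = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
p2 = {"b": Fraction(1, 4), "c": Fraction(3, 4)}
mu = [(p1, Fraction(1, 3)), (p2, Fraction(2, 3))]  # weights w_1 = 1/3, w_2 = 2/3
q = expectation(mu)
```

Here `q` assigns mass 1/6 to `"a"`, 1/3 to `"b"` and 1/2 to `"c"`, which matches $q(A)=w_1\,p_1(A)+w_2\,p_2(A)$ on singletons.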

The triple $(P,\delta,\mathcal{E})$ is called the Giry monad.
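
Like every monad, the triple satisfies unit laws: averaging the Dirac measure at $p$ returns $p$, and averaging the pushforward of $p$ along $\delta$ also returns $p$. The sketch below checks both on one finite example. It assumes the same illustrative encoding as before (dictionaries for measures, lists of `(measure, weight)` pairs for measures on measures), and is a sanity check on a single input rather than a proof.

```python
from fractions import Fraction

def dirac(x):
    """Dirac measure at x as {outcome: probability}."""
    return {x: 1}

def expectation(mu):
    """Average a list of (measure, weight) pairs into a single measure."""
    q = {}
    for p, weight in mu:
        for x, mass in p.items():
            q[x] = q.get(x, 0) + weight * mass
    return q

p = {"a": Fraction(1, 4), "b": Fraction(3, 4)}

# Dirac measure at the point p of PX, then average.
left = expectation([(p, 1)])

# Pushforward of p along delta (each x is sent to dirac(x)), then average.
right = expectation([(dirac(x), w) for x, w in p.items()])
```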

==Relationship with Markov kernels==

One of the properties of the sigma-algebra $\mathcal{PF}$ is that given measurable spaces $(X,\mathcal{F})$ and $(Y,\mathcal{G})$, we have a bijective correspondence between measurable functions $(X,\mathcal{F})\to(PY,\mathcal{PG})$ and Markov kernels $(X,\mathcal{F})\to(Y,\mathcal{G})$. This allows one to view a Markov kernel, equivalently, as a measurably parametrized probability measure.

In more detail, given a measurable function $f:(X,\mathcal{F})\to(PY,\mathcal{PG})$, one can obtain the Markov kernel $f^\flat:(X,\mathcal{F})\to(Y,\mathcal{G})$ as follows:
$f^\flat(B|x) = f(x)(B)$
for every $x\in X$ and every measurable $B\in\mathcal{G}$ (note that $f(x)\in PY$ is a probability measure).
Conversely, given a Markov kernel $k:(X,\mathcal{F})\to(Y,\mathcal{G})$, one can form the measurable function $k^\sharp:(X,\mathcal{F})\to(PY,\mathcal{PG})$ mapping $x\in X$ to the probability measure $k^\sharp(x)\in PY$ defined by
$k^\sharp(x)(B) = k(B|x)$
for every measurable $B\in\mathcal{G}$.
The two assignments are mutually inverse.
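
For a finite space $Y$ the two assignments can be sketched in Python. The names `flat` and `sharp` mirror the notation above but are otherwise hypothetical, as is the example kernel `noisy_bit`. Measures are assumed to be dictionaries from outcomes to probabilities, and measurable sets are Python sets.

```python
from fractions import Fraction

def flat(f):
    """Turn f: X -> PY (dict-valued) into a kernel (B, x) -> f(x)(B)."""
    return lambda B, x: sum(w for y, w in f(x).items() if y in B)

def sharp(k, outcomes):
    """Turn a kernel (B, x) -> k(B|x) on a finite Y back into a map X -> PY."""
    return lambda x: {y: k({y}, x) for y in outcomes}

def noisy_bit(x):
    """Send a bit x to itself with probability 9/10, and flip it otherwise."""
    return {x: Fraction(9, 10), 1 - x: Fraction(1, 10)}

kernel = flat(noisy_bit)
recovered = sharp(kernel, outcomes=[0, 1])
```

In this example `recovered(x)` equals `noisy_bit(x)` for both bits, illustrating that the round trip returns the original function.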

From the point of view of category theory, we can interpret this correspondence as an adjunction
$\mathrm{Hom}_\mathrm{Meas} (X,PY) \cong \mathrm{Hom}_\mathrm{Stoch} (X,Y)$
between the category of measurable spaces and the category of Markov kernels. In particular, the category of Markov kernels can be seen as the Kleisli category of the Giry monad.
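
On finite spaces, composition in the Kleisli category reduces to the Chapman-Kolmogorov formula $(h\circ k)(\{z\}|x)=\sum_y h(\{z\}|y)\,k(\{y\}|x)$. The following sketch assumes kernels are encoded as Python functions returning dictionaries. The name `kleisli_compose` and the two-state weather chain are illustrative assumptions.

```python
from fractions import Fraction

def kleisli_compose(k, h):
    """Compose kernels k: X -> PY and h: Y -> PZ into a kernel X -> PZ."""
    def composite(x):
        out = {}
        for y, w in k(x).items():
            for z, m in h(y).items():
                # Sum over the intermediate state y.
                out[z] = out.get(z, 0) + w * m
        return out
    return composite

def step(state):
    """One step of a hypothetical two-state weather chain."""
    if state == "sun":
        return {"sun": Fraction(3, 4), "rain": Fraction(1, 4)}
    return {"sun": Fraction(1, 2), "rain": Fraction(1, 2)}

two_steps = kleisli_compose(step, step)
after_two = two_steps("sun")
```

Starting from `"sun"`, the two-step distribution assigns 11/16 to `"sun"` and 5/16 to `"rain"`.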

==Product distributions==

Given measurable spaces $(X,\mathcal{F})$ and $(Y,\mathcal{G})$, one can form the measurable space $(X,\mathcal{F})\times (Y,\mathcal{G})=(X\times Y, \mathcal{F}\times\mathcal{G})$ with the product sigma-algebra, which is the product in the category of measurable spaces.
Given probability measures $p\in PX$ and $q\in PY$, one can form the product measure $p\otimes q$ on $(X\times Y, \mathcal{F}\times\mathcal{G})$. This gives a natural, measurable map
$(PX,\mathcal{PF})\times (PY,\mathcal{PG})\to \big(P(X\times Y), \mathcal{P(F\times G)}\big)$
usually denoted by $\nabla$ or by $\otimes$.

The map $\nabla:PX\times PY\to P(X\times Y)$ is in general not an isomorphism, since there are probability measures on $X\times Y$ which are not product distributions, for example the joint distribution of two correlated random variables.
However, the maps $\nabla:PX\times PY\to P(X\times Y)$ and the isomorphism $1\cong P1$ make the Giry monad a monoidal monad, and so in particular a commutative strong monad.
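
A small finite example shows both the product map and a joint distribution outside its image. The sketch assumes measures are dictionaries from outcomes to probabilities; `product` and `marginals` are hypothetical helper names.

```python
from fractions import Fraction

def product(p, q):
    """Product measure of p and q on pairs (x, y)."""
    return {(x, y): a * b for x, a in p.items() for y, b in q.items()}

def marginals(joint):
    """Return the two marginal distributions of a joint distribution."""
    px, py = {}, {}
    for (x, y), w in joint.items():
        px[x] = px.get(x, 0) + w
        py[y] = py.get(y, 0) + w
    return px, py

half = Fraction(1, 2)
coin = {0: half, 1: half}

independent = product(coin, coin)           # uniform on the four pairs
correlated = {(0, 0): half, (1, 1): half}   # two perfectly correlated bits
px, py = marginals(correlated)
```

Both marginals of `correlated` are fair coins, yet `product(px, py)` differs from `correlated`, so this joint distribution is not a product distribution.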

==Further properties==

- If a measurable space $(X,\mathcal{F})$ is standard Borel, so is $(PX,\mathcal{PF})$. Therefore the Giry monad restricts to the full subcategory of standard Borel spaces.

- The algebras for the Giry monad include compact convex subsets of Euclidean spaces, as well as the extended positive real line $[0,\infty]$, with the algebra structure map given by taking expected values. For example, for $[0,\infty]$, the structure map $e:P[0,\infty]\to [0,\infty]$ is given by
$p \longmapsto \int_{[0,\infty)} x\,p(dx)$
whenever $p$ is supported on $[0,\infty)$ and has finite expected value, and $e(p)=\infty$ otherwise.
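
For finitely supported measures on $[0,\infty]$, the structure map and the algebra law $e(\mathcal{E}\mu)=\int e(p)\,\mu(dp)$ (the law of total expectation) can be checked numerically. The sketch below assumes the dictionary encoding of measures and the list-of-pairs encoding of measures on measures. The names `expected_value` and `expectation` are illustrative.

```python
import math
from fractions import Fraction

def expected_value(p):
    """Structure map e for a finitely supported p on [0, inf]."""
    if any(x == math.inf and w > 0 for x, w in p.items()):
        return math.inf  # positive mass at infinity forces e(p) = infinity
    return sum(x * w for x, w in p.items() if w > 0)

def expectation(mu):
    """Average a list of (measure, weight) pairs into a single measure."""
    q = {}
    for p, weight in mu:
        for x, mass in p.items():
            q[x] = q.get(x, 0) + weight * mass
    return q

p1 = {0: Fraction(1, 2), 2: Fraction(1, 2)}   # expected value 1
p2 = {2: Fraction(1, 4), 6: Fraction(3, 4)}   # expected value 5
mu = [(p1, Fraction(1, 3)), (p2, Fraction(2, 3))]

lhs = expected_value(expectation(mu))             # average first, then take e
rhs = sum(w * expected_value(p) for p, w in mu)   # take e first, then average
```

Both sides evaluate to 11/3 in this example.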

== See also ==

- Mixture distribution
- Compound distribution
- de Finetti theorem
- Measurable space
- Markov kernel
- Monad (category theory)
- Monad (functional programming)
- Category of measurable spaces
- Category of Markov kernels
- Categorical probability
