In probability theory, a Markov kernel (or stochastic kernel) is a map that plays the role, in the general theory of Markov processes, that the transition matrix does in the theory of Markov processes with a finite state space.[1]
Formal definition
Let $(X,{\mathcal {A}})$, $(Y,{\mathcal {B}})$ be measurable spaces. A Markov kernel with source $(X,{\mathcal {A}})$ and target $(Y,{\mathcal {B}})$ is a map $\kappa \colon X\times {\mathcal {B}}\to [0,1]$ with the following properties:
* The map $x\mapsto \kappa (x,B)$ is ${\mathcal {A}}$-measurable for every $B\in {\mathcal {B}}$.
* The map $B\mapsto \kappa (x,B)$ is a probability measure on $(Y,{\mathcal {B}})$ for every $x\in X$.

(In other words, it associates to each point $x\in X$ a probability measure $\kappa (x,\cdot )$ on $(Y,{\mathcal {B}})$ such that, for every measurable set $B\in {\mathcal {B}}$, the map $x\mapsto \kappa (x,B)$ is measurable with respect to the $\sigma$-algebra ${\mathcal {A}}$.)[2]
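For intuition, the two defining properties can be checked directly when both spaces are finite; the following Python sketch (with an illustrative two-row kernel, not taken from the text) encodes $\kappa$ as a table of probability mass functions:

```python
# A Markov kernel on finite spaces: kappa(x, B) sums the mass of row x over B.
# X = {"a", "b"} and Y = {0, 1} here are illustrative finite sets.
rows = {
    "a": {0: 0.5, 1: 0.5},
    "b": {0: 0.2, 1: 0.8},
}

def kappa(x, B):
    """kappa(x, B): probability, starting from x, of landing in the set B."""
    return sum(p for y, p in rows[x].items() if y in B)

# Property 2: B -> kappa(x, B) is a probability measure for every x.
for x in rows:
    assert abs(kappa(x, {0, 1}) - 1.0) < 1e-12   # total mass 1
    assert kappa(x, set()) == 0                  # empty set has mass 0
# Additivity on disjoint sets:
assert abs(kappa("a", {0, 1}) - (kappa("a", {0}) + kappa("a", {1}))) < 1e-12
```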
Examples
Simple random walk: Take $X=Y=\mathbb {Z}$ and ${\mathcal {A}}={\mathcal {B}}={\mathcal {P}}(\mathbb {Z} )$; then the Markov kernel $\kappa$ with
$$\kappa (x,B)={\frac {1}{2}}\mathbf {1} _{B}(x-1)+{\frac {1}{2}}\mathbf {1} _{B}(x+1),\quad \forall x\in \mathbb {Z} ,\ \forall B\in {\mathcal {P}}(\mathbb {Z} ),$$
describes the transition rule for the random walk on $\mathbb {Z}$, where $\mathbf {1}$ is the indicator function.
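A minimal sketch of this kernel in Python (the function name is illustrative):

```python
def kappa(x, B):
    """Simple random walk kernel on the integers:
    from x, step to x - 1 or x + 1, each with probability 1/2."""
    return 0.5 * (1 if x - 1 in B else 0) + 0.5 * (1 if x + 1 in B else 0)

assert kappa(0, {-1, 1}) == 1.0   # the walk surely moves to a neighbour
assert kappa(0, {1}) == 0.5
assert kappa(5, {0}) == 0.0       # 0 is not a neighbour of 5
```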
Galton–Watson process: Take $X=Y=\mathbb {N}$, ${\mathcal {A}}={\mathcal {B}}={\mathcal {P}}(\mathbb {N} )$; then
$$\kappa (x,B)={\begin{cases}\mathbf {1} _{B}(0)&\quad x=0,\\P[\xi _{1}+\dots +\xi _{x}\in B]&\quad {\text{else,}}\end{cases}}$$
with i.i.d. random variables $\xi _{i}$.
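To make the second case concrete, one can fix an assumed offspring law; with Bernoulli($p$) offspring the sum $\xi _{1}+\dots +\xi _{x}$ is Binomial($x,p$), so $\kappa$ can be evaluated exactly:

```python
from math import comb

# Galton-Watson kernel under an assumed Bernoulli(p) offspring law: each of
# the x current individuals has 1 child with probability p, else 0, so the
# next generation size xi_1 + ... + xi_x is Binomial(x, p).
p = 0.5

def kappa(x, B):
    if x == 0:
        return 1.0 if 0 in B else 0.0   # extinction is absorbing
    return sum(comb(x, j) * p**j * (1 - p)**(x - j) for j in B if 0 <= j <= x)

assert kappa(0, {0}) == 1.0                      # state 0 stays at 0
assert abs(kappa(2, {0, 1, 2}) - 1.0) < 1e-12    # total mass 1
assert abs(kappa(2, {1}) - 0.5) < 1e-12          # C(2,1) * 0.5^2 = 0.5
```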
General Markov processes with finite state space: Take $X=Y$, ${\mathcal {A}}={\mathcal {B}}={\mathcal {P}}(X)={\mathcal {P}}(Y)$ and $|X|=|Y|=n$; then the transition rule can be represented as a stochastic matrix $(K_{ij})_{1\leq i,j\leq n}$ with
$$\sum _{j\in Y}K_{ij}=1$$
for every $i\in X$. In the convention of Markov kernels we write
$$\kappa (i,B)=\sum _{j\in B}K_{ij},\quad \forall i\in X,\ \forall B\in {\mathcal {B}}.$$
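A small numerical sketch of this correspondence, with an assumed $2\times 2$ stochastic matrix:

```python
# Transition matrix of a two-state chain, read as a Markov kernel:
# kappa(i, B) = sum over j in B of K[i][j].
K = [
    [0.9, 0.1],
    [0.4, 0.6],
]

def kappa(i, B):
    return sum(K[i][j] for j in B)

for i in range(len(K)):
    assert abs(kappa(i, {0, 1}) - 1.0) < 1e-12  # each row is a probability vector
assert kappa(0, {1}) == 0.1
```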
Markov kernels defined by a density: Let $\nu$ be a $\sigma$-finite measure on $(Y,{\mathcal {B}})$ and $k\colon X\times Y\to [0,\infty )$ a measurable function satisfying
$$\int _{Y}k(x,y)\,\nu (\mathrm {d} y)=1$$
for all $x\in X$; then the mapping
$$\kappa \colon X\times {\mathcal {B}}\to [0,1],\qquad \kappa (x,B)=\int _{B}k(x,y)\,\nu (\mathrm {d} y),$$
defines a Markov kernel.[3]
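A toy instance, taking $\nu$ to be the counting measure on a finite $Y$ so that the integrals reduce to sums (the density values below are assumptions for illustration):

```python
# Density-defined kernel with nu = counting measure on a finite Y,
# so the integral over B becomes a sum over the points of B.
Y = [0, 1, 2]

def k(x, y):
    """Assumed density: fixed normalised weights, independent of x for simplicity."""
    weights = {0: 1.0, 1: 2.0, 2: 1.0}
    return weights[y] / sum(weights.values())

def kappa(x, B):
    return sum(k(x, y) for y in B)   # integral of k(x, .) over B w.r.t. counting measure

assert abs(kappa(7, set(Y)) - 1.0) < 1e-12   # normalisation holds for every x
assert abs(kappa(7, {1}) - 0.5) < 1e-12
```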
Properties
Semidirect product
Let $(X,{\mathcal {A}},P)$ be a probability space and $\kappa$ a Markov kernel from $(X,{\mathcal {A}})$ to some $(Y,{\mathcal {B}})$. Then there exists a unique measure $Q$ on $(X\times Y,{\mathcal {A}}\otimes {\mathcal {B}})$ such that
$$Q(A\times B)=\int _{A}\kappa (x,B)\,\mathrm {d} P(x),\quad \forall A\in {\mathcal {A}},\ \forall B\in {\mathcal {B}}.$$
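On finite spaces the defining integral becomes a sum, so $Q$ can be computed directly; a sketch with assumed values for $P$ and $\kappa$:

```python
# Semidirect product on finite spaces: Q(A x B) = sum over x in A of P({x}) * kappa(x, B).
P = {"a": 0.3, "b": 0.7}                          # probability measure on X (assumed)
rows = {"a": {0: 0.5, 1: 0.5}, "b": {0: 0.2, 1: 0.8}}  # kernel rows (assumed)

def kappa(x, B):
    return sum(p for y, p in rows[x].items() if y in B)

def Q(A, B):
    return sum(P[x] * kappa(x, B) for x in A)

# Q is a probability measure on the product space:
assert abs(Q(set(P), {0, 1}) - 1.0) < 1e-12
# The marginal of Q on X recovers P:
assert abs(Q({"a"}, {0, 1}) - P["a"]) < 1e-12
assert abs(Q({"a"}, {0}) - 0.15) < 1e-12   # 0.3 * 0.5
```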
Regular conditional distribution
Let $(S,Y)$ be a Borel space, $X$ an $(S,Y)$-valued random variable on the probability space $(\Omega ,{\mathcal {F}},P)$ and ${\mathcal {G}}\subseteq {\mathcal {F}}$ a sub-$\sigma$-algebra. Then there exists a Markov kernel $\kappa$ from $(\Omega ,{\mathcal {G}})$ to $(S,Y)$, such that $\kappa (\cdot ,B)$ is a version of the conditional expectation $E[\mathbf {1} _{\{X\in B\}}\mid {\mathcal {G}}]$ for every $B\in Y$, i.e.
$$P[X\in B\mid {\mathcal {G}}]=E[\mathbf {1} _{\{X\in B\}}\mid {\mathcal {G}}]=\kappa (\omega ,B),\quad P\text{-a.s.},\ \forall B\in Y.$$
It is called the regular conditional distribution of $X$ given ${\mathcal {G}}$; it is not uniquely defined, but any two versions agree $P$-almost surely.
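A finite worked example may help: take $\Omega$ a fair die roll, ${\mathcal {G}}$ generated by the partition into odd and even outcomes, and $X(\omega )=\omega \bmod 3$; then $\kappa (\omega ,B)$ is the conditional probability of $\{X\in B\}$ given the atom containing $\omega$ (all concrete choices below are illustrative):

```python
from fractions import Fraction

# Regular conditional distribution on a finite probability space:
# Omega is a fair die roll, G is generated by the partition {odd, even},
# and X(w) = w mod 3.
Omega = [1, 2, 3, 4, 5, 6]
P = {w: Fraction(1, 6) for w in Omega}

def atom(w):
    """The atom of G containing w: all outcomes with the same parity."""
    return [v for v in Omega if v % 2 == w % 2]

def X(w):
    return w % 3

def kappa(w, B):
    """kappa(w, B) = P[X in B | G](w); constant on each atom of G."""
    A = atom(w)
    return sum(P[v] for v in A if X(v) in B) / sum(P[v] for v in A)

# On the even atom {2, 4, 6}, X takes the values 2, 1, 0 with probability 1/3 each:
assert kappa(2, {0}) == Fraction(1, 3)
assert kappa(2, {0, 1, 2}) == 1
# G-measurability: kappa(w, .) agrees for all w in the same atom.
assert kappa(4, {1}) == kappa(2, {1})
```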
References
§36. Kernels and semigroups of kernels