In probability theory, Kolmogorov's inequality is a so-called "maximal inequality" that bounds the probability that the partial sums of a finite collection of independent random variables exceed some specified value. The inequality is named after the Russian mathematician Andrey Kolmogorov.
Statement of the inequality
Let $X_1, \dots, X_n : \Omega \to \mathbf{R}$ be independent random variables defined on a common probability space $(\Omega, F, \Pr)$, with expected value $\mathrm{E}[X_k] = 0$ and variance $\operatorname{Var}[X_k] < +\infty$ for $k = 1, \dots, n$. Then, for each $\lambda > 0$,
$$\Pr\left(\max_{1\le k\le n}|S_k|\ge\lambda\right)\le\frac{1}{\lambda^2}\operatorname{Var}[S_n]\equiv\frac{1}{\lambda^2}\sum_{k=1}^{n}\operatorname{Var}[X_k],$$
where $S_k = X_1 + \dots + X_k$.
The usefulness of this result lies in the fact that it bounds the worst-case deviation of the random walk at any intermediate time by the variance of its value at the end of the time interval.
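As an illustrative sketch (not part of the source), the bound can be checked empirically for a simple $\pm 1$ random walk, where each step has mean 0 and variance 1, so $\operatorname{Var}[S_n] = n$ and the right-hand side is $n/\lambda^2$:

```python
import random

# Monte Carlo sketch of Kolmogorov's inequality for a +/-1 random walk.
# Each step X_k is +1 or -1 with equal probability, so E[X_k] = 0 and
# Var[X_k] = 1, giving the bound Pr(max_k |S_k| >= lam) <= n / lam^2.

def max_abs_partial_sum(n: int) -> float:
    """Return max_{1<=k<=n} |S_k| for one sampled walk."""
    s, m = 0.0, 0.0
    for _ in range(n):
        s += random.choice((-1.0, 1.0))
        m = max(m, abs(s))
    return m

random.seed(0)
n, lam, trials = 100, 25.0, 20000
hits = sum(max_abs_partial_sum(n) >= lam for _ in range(trials))
empirical = hits / trials     # empirical frequency of the maximal event
bound = n / lam**2            # (1 / lam^2) * Var[S_n] = 100 / 625 = 0.16
```

Here the observed frequency stays well below the bound 0.16, as the inequality guarantees (the bound is far from tight for this walk).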
Proof
The following argument is due to Kareem Amin and employs discrete martingales. As argued in the discussion of Doob's martingale inequality, the sequence $S_1, S_2, \dots, S_n$ is a martingale.
Without loss of generality, we can assume that $S_0 = 0$ and $S_i \ge 0$ for all $i$.
Define $(Z_i)_{i=0}^{n}$ as follows. Let $Z_0 = 0$, and

$$Z_{i+1} = \begin{cases} S_{i+1} & \text{if } \displaystyle\max_{1\le j\le i} S_j < \lambda \\ Z_i & \text{otherwise} \end{cases}$$

for all $i$.
Then $(Z_i)_{i=0}^{n}$ is also a martingale. Since the increments $S_i - S_{i-1}$ are independent and have mean zero,
$$\begin{aligned}\sum_{i=1}^{n}\mathrm{E}[(S_i-S_{i-1})^2] &= \sum_{i=1}^{n}\mathrm{E}[S_i^2 - 2S_iS_{i-1} + S_{i-1}^2] \\ &= \sum_{i=1}^{n}\mathrm{E}\left[S_i^2 - 2(S_{i-1} + S_i - S_{i-1})S_{i-1} + S_{i-1}^2\right] \\ &= \sum_{i=1}^{n}\mathrm{E}\left[S_i^2 - S_{i-1}^2\right] - 2\,\mathrm{E}\left[S_{i-1}(S_i - S_{i-1})\right] \\ &= \mathrm{E}[S_n^2] - \mathrm{E}[S_0^2] = \mathrm{E}[S_n^2],\end{aligned}$$

where the cross terms $\mathrm{E}[S_{i-1}(S_i - S_{i-1})]$ vanish because $S_{i-1}$ and the increment $S_i - S_{i-1}$ are independent and the increment has mean zero.
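The identity above, that $\mathrm{E}[S_n^2]$ equals the sum of the expected squared increments because the cross terms vanish, can be checked numerically. The following sketch (illustration only, names are ours) uses i.i.d. uniform steps on $[-1, 1]$, which have mean 0 and variance $1/3$:

```python
import random

# Numerical check that E[S_n^2] = sum_i E[(S_i - S_{i-1})^2] for
# independent mean-zero steps. With uniform(-1, 1) steps, each term
# E[(S_i - S_{i-1})^2] = Var[X_i] = 1/3, so both sides should be n/3.

random.seed(1)
n, trials = 5, 200000
sum_sq_increments = 0.0
sn_sq = 0.0
for _ in range(trials):
    s = 0.0
    for _ in range(n):
        x = random.uniform(-1.0, 1.0)   # one increment S_i - S_{i-1}
        sum_sq_increments += x * x
        s += x
    sn_sq += s * s
lhs = sum_sq_increments / trials   # estimates sum_i E[(S_i - S_{i-1})^2]
rhs = sn_sq / trials               # estimates E[S_n^2]
```

Both estimates come out close to $n/3 \approx 1.67$, confirming that the cross terms contribute nothing on average.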
The same is true for $(Z_i)_{i=0}^{n}$. Thus
$$\begin{aligned}\Pr\left(\max_{1\le i\le n} S_i \ge \lambda\right) &= \Pr[Z_n \ge \lambda] \\ &\le \frac{1}{\lambda^2}\mathrm{E}[Z_n^2] = \frac{1}{\lambda^2}\sum_{i=1}^{n}\mathrm{E}[(Z_i - Z_{i-1})^2] \\ &\le \frac{1}{\lambda^2}\sum_{i=1}^{n}\mathrm{E}[(S_i - S_{i-1})^2] = \frac{1}{\lambda^2}\mathrm{E}[S_n^2] = \frac{1}{\lambda^2}\operatorname{Var}[S_n]\end{aligned}$$

by Chebyshev's inequality.
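The first equality in the chain above rests on the freezing construction: $Z$ follows the walk until the running maximum first reaches $\lambda$, then stays put, so the events $\{\max_i S_i \ge \lambda\}$ and $\{Z_n \ge \lambda\}$ coincide on every sample path. A short sketch (our own illustration, not from the source) verifies this path by path:

```python
import random

# Sketch of the stopped process Z from the proof: Z tracks the walk S
# until the running maximum first reaches lam, then freezes. On every
# sample path, {max_i S_i >= lam} and {Z_n >= lam} are the same event.

random.seed(2)
n, lam, trials = 50, 6.0, 10000
agree = 0
for _ in range(trials):
    s, z, max_s, frozen = 0.0, 0.0, 0.0, False
    for _ in range(n):
        s += random.choice((-1.0, 1.0))
        max_s = max(max_s, s)
        if not frozen:
            z = s                 # Z_{i+1} = S_{i+1} while max_j S_j < lam
            if z >= lam:
                frozen = True     # Z_{i+1} = Z_i from now on
    agree += (max_s >= lam) == (z >= lam)
```

On every one of the sampled paths the two indicator events agree, which is exactly what lets Chebyshev's inequality, applied to $Z_n$, control the maximum of the whole walk.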
This inequality was generalized by Hájek and Rényi in 1955.
References
Billingsley, Patrick (1995). Probability and Measure. New York: John Wiley & Sons. ISBN 0-471-00710-2. (Theorem 22.4)
Feller, William (1968) [1950]. An Introduction to Probability Theory and Its Applications, Vol. 1 (3rd ed.). New York: John Wiley & Sons. xviii+509. ISBN 0-471-25708-7.
This article incorporates material from Kolmogorov's inequality on PlanetMath, which is licensed under the Creative Commons Attribution/Share-Alike License.