In mathematics, concentration inequalities provide probability bounds on how a random variable deviates from some value (e.g. its expectation). The laws of large numbers of classical probability theory state that sums of independent random variables are, under very mild conditions, close to their expectation with high probability. Such sums are the most basic examples of random variables concentrated around their mean. Recent results show that such behavior is shared by other functions of independent random variables.
Markov's inequality
If X is any random variable and a > 0, then
$$\Pr(|X|\geq a)\leq \frac{\mathrm{E}(|X|)}{a}.$$
The proof follows from $\mathrm{E}(|X|)\geq \mathrm{E}\big(|X|\,\mathbf{1}_{\{|X|\geq a\}}\big)\geq a\Pr(|X|\geq a)$.
We can extend Markov's inequality to a strictly increasing and non-negative function $\Phi$. We have
$$\Pr(X\geq a)=\Pr(\Phi(X)\geq \Phi(a))\leq \frac{\mathrm{E}(\Phi(X))}{\Phi(a)}.$$
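As a quick sanity check, the sketch below estimates both sides of Markov's inequality by Monte Carlo simulation for an exponentially distributed non-negative random variable. The distribution, sample size, and thresholds are illustrative choices, not part of the original text.

```python
import numpy as np

# Monte Carlo check of Markov's inequality for a non-negative random variable.
# Here X ~ Exponential(1), so E(|X|) = 1; the thresholds a are arbitrary.
rng = np.random.default_rng(0)
samples = rng.exponential(scale=1.0, size=1_000_000)

for a in (1.0, 2.0, 5.0):
    empirical = np.mean(samples >= a)       # estimated Pr(X >= a)
    markov_bound = samples.mean() / a       # E(X) / a
    print(f"a={a}: Pr(X>=a) ~= {empirical:.4f} <= {markov_bound:.4f}")
```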
Chebyshev's inequality
Chebyshev's inequality is the special case of the generalized Markov inequality obtained by taking $\Phi(x)=x^{2}$ and applying it to the random variable $|X-\mathrm{E}(X)|$.
If X is any random variable with finite variance and a > 0, then
$$\Pr(|X-\mathrm{E}(X)|\geq a)\leq \frac{\mathrm{Var}(X)}{a^{2}},$$
where $\mathrm{Var}(X)$ is the variance of $X$, defined as
$$\mathrm{Var}(X)=\mathrm{E}\left[(X-\mathrm{E}(X))^{2}\right].$$
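The following sketch checks Chebyshev's inequality by simulation for a Uniform[0, 1] variable (variance 1/12); the distribution, sample size, and deviation thresholds are illustrative assumptions only.

```python
import numpy as np

# Monte Carlo check of Chebyshev's inequality: Pr(|X - E(X)| >= a) <= Var(X)/a^2.
# X ~ Uniform[0, 1] is used purely for illustration.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=1_000_000)
mean, var = x.mean(), x.var()

for a in (0.2, 0.3, 0.4):
    empirical = np.mean(np.abs(x - mean) >= a)   # estimated Pr(|X - E(X)| >= a)
    chebyshev_bound = var / a**2                 # Var(X) / a^2
    print(f"a={a}: Pr(|X-E(X)|>=a) ~= {empirical:.4f} <= {chebyshev_bound:.4f}")
```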
Asymptotic Behavior of Binomial Distribution
If a random variable X follows the binomial distribution with parameters $n$ and $p$, then the probability of getting exactly $k$ successes in $n$ trials is given by the probability mass function
$$f(k;n,p)=\Pr(X=k)={n \choose k}p^{k}(1-p)^{n-k}.$$
Let $S_{n}=\sum_{i=1}^{n}X_{i}$, where the $X_{i}$'s are i.i.d. Bernoulli random variables with parameter $p$. Then $S_{n}$ follows the binomial distribution with parameters $n$ and $p$. The Central Limit Theorem suggests that when $n\to\infty$, $S_{n}$ is approximately normally distributed with mean $np$ and variance $np(1-p)$, and, writing $\sigma=\sqrt{np(1-p)}$,
$$\lim_{n\to\infty}\Pr[a\sigma<S_{n}-np<b\sigma]=\int_{a}^{b}\frac{1}{\sqrt{2\pi}}e^{-x^{2}/2}\,dx.$$
For $p=\lambda/n$, where $\lambda>0$ is a constant, the limiting distribution of the binomial distribution $f(k;n,p)$ is the Poisson distribution $\mathrm{Pois}(\lambda)$.
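To illustrate the two limiting regimes numerically, the sketch below compares the binomial pmf with its normal approximation (p fixed, n large) and its Poisson approximation (p = λ/n, λ fixed). All parameter values (n, p, λ, and the evaluation points k) are arbitrary choices made for this illustration.

```python
import math

def binom_pmf(k, n, p):
    # Binomial probability mass function f(k; n, p).
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# Normal approximation around the mean for n = 1000, p = 0.3.
n, p = 1000, 0.3
mu, sigma = n * p, math.sqrt(n * p * (1 - p))
for k in (280, 300, 320):
    normal = math.exp(-((k - mu) ** 2) / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))
    print(f"k={k}: binomial {binom_pmf(k, n, p):.5f} vs normal {normal:.5f}")

# Poisson approximation for p = lam / n with lam = 4.
lam = 4.0
p_small = lam / n
for k in range(6):
    poisson = math.exp(-lam) * lam**k / math.factorial(k)
    print(f"k={k}: binomial {binom_pmf(k, n, p_small):.5f} vs Poisson {poisson:.5f}")
```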
General Chernoff Inequality
The Chernoff bound gives exponentially decreasing bounds on tail distributions of sums of independent random variables. Let $X_{1},\dots,X_{n}$ denote independent (not necessarily identically distributed) random variables satisfying $X_{i}\geq \mathrm{E}(X_{i})-a_{i}-M$ for $1\leq i\leq n$. Writing $X=\sum_{i=1}^{n}X_{i}$, we have the lower tail inequality:
$$\Pr[X\leq \mathrm{E}(X)-\lambda]\leq e^{-\frac{\lambda^{2}}{2(\mathrm{Var}(X)+\sum_{i=1}^{n}a_{i}^{2}+M\lambda/3)}}.$$
If $X_{i}$ satisfies $X_{i}\leq \mathrm{E}(X_{i})+a_{i}+M$ for $1\leq i\leq n$, we have the upper tail inequality:
$$\Pr[X\geq \mathrm{E}(X)+\lambda]\leq e^{-\frac{\lambda^{2}}{2(\mathrm{Var}(X)+\sum_{i=1}^{n}a_{i}^{2}+M\lambda/3)}}.$$
If $X_{1},\dots,X_{n}$ are i.i.d., $|X_{i}|\leq 1$, and $\sigma^{2}$ is the variance of $X_{i}$, a typical version of the Chernoff inequality is:
$$\Pr[|X|\geq k\sigma]\leq 2e^{-k^{2}/4}.$$
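As a hedged illustration of the upper tail inequality above, the sketch below checks it by simulation for i.i.d. Rademacher variables (values ±1 with equal probability), for which the condition $X_{i}\leq \mathrm{E}(X_{i})+a_{i}+M$ holds with $a_{i}=0$ and $M=1$. The sample sizes and λ values are arbitrary choices.

```python
import numpy as np

# Monte Carlo check of the general upper tail Chernoff bound for i.i.d.
# Rademacher variables X_i in {-1, +1}: E(X_i) = 0, a_i = 0, M = 1,
# and Var(X) = n for the sum X = sum_i X_i.
rng = np.random.default_rng(2)
n, trials = 100, 50_000
signs = rng.integers(0, 2, size=(trials, n)) * 2 - 1   # entries are +1 or -1
X = signs.sum(axis=1)                                  # X = sum_i X_i, E(X) = 0
var_X = float(n)

for lam in (10, 20, 30):
    empirical = np.mean(X >= lam)                      # estimated Pr(X >= E(X) + lam)
    bound = np.exp(-lam**2 / (2 * (var_X + 0.0 + 1.0 * lam / 3)))
    print(f"lambda={lam}: Pr(X >= lambda) ~= {empirical:.5f} <= {bound:.5f}")
```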
Hoeffding's Inequality
Hoeffding's inequality can be stated as follows:
Suppose $X_{1},X_{2},\dots,X_{n}$ are independent and the $X_{i}$ are almost surely bounded; that is, assume for $1\leq i\leq n$ that
$$\Pr(X_{i}\in[a_{i},b_{i}])=1.$$
Then, for the empirical mean of these variables
$$\overline{X}=\frac{X_{1}+\cdots+X_{n}}{n}$$
we have the inequalities (Hoeffding 1963, Theorem 2 [1]):
$$\Pr\left(\overline{X}-\mathrm{E}[\overline{X}]\geq t\right)\leq \exp\left(-\frac{2t^{2}n^{2}}{\sum_{i=1}^{n}(b_{i}-a_{i})^{2}}\right),$$
$$\Pr\left(\left|\overline{X}-\mathrm{E}[\overline{X}]\right|\geq t\right)\leq 2\exp\left(-\frac{2t^{2}n^{2}}{\sum_{i=1}^{n}(b_{i}-a_{i})^{2}}\right).$$
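The following sketch checks the two-sided Hoeffding bound by simulation for the empirical mean of independent Uniform[0, 1] variables, so that $a_{i}=0$ and $b_{i}=1$ for every $i$. The values of n, the number of trials, and the thresholds t are illustrative assumptions.

```python
import numpy as np

# Monte Carlo check of Hoeffding's inequality for the mean of n Uniform[0, 1]
# variables; here sum_i (b_i - a_i)^2 = n and E(mean) = 0.5.
rng = np.random.default_rng(3)
n, trials = 100, 50_000
means = rng.uniform(0.0, 1.0, size=(trials, n)).mean(axis=1)

for t in (0.05, 0.10, 0.15):
    empirical = np.mean(np.abs(means - 0.5) >= t)   # two-sided deviation of the mean
    hoeffding = 2 * np.exp(-2 * t**2 * n**2 / n)    # 2 exp(-2 t^2 n^2 / sum (b_i - a_i)^2)
    print(f"t={t}: Pr(|mean - 0.5| >= t) ~= {empirical:.5f} <= {hoeffding:.5f}")
```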
Bennett's Inequality
Let $X_{1},\dots,X_{n}$ be independent random variables, and assume (for simplicity but without loss of generality) that they all have zero expected value. Further assume $|X_{i}|\leq a$ almost surely for all $i$, and let
$$\sigma^{2}=\frac{1}{n}\sum_{i=1}^{n}\mathrm{Var}(X_{i}).$$
Then for any t ≥ 0,
$$\Pr\left(\sum_{i=1}^{n}X_{i}>t\right)\leq \exp\left(-\frac{n\sigma^{2}}{a^{2}}\,h\!\left(\frac{at}{n\sigma^{2}}\right)\right),$$
where $h(u)=(1+u)\log(1+u)-u$.[2]
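A minimal numerical check of Bennett's inequality, assuming centered Uniform[−1/2, 1/2] variables so that the zero-mean and boundedness conditions hold with $a=1/2$ and $\mathrm{Var}(X_{i})=1/12$. The values of n, the number of trials, and the thresholds t are illustrative choices only.

```python
import numpy as np

# Monte Carlo check of Bennett's inequality for sums of centered
# Uniform[-0.5, 0.5] variables: zero mean, |X_i| <= a with a = 0.5.
rng = np.random.default_rng(4)
n, trials = 100, 50_000
a = 0.5
sigma2 = 1.0 / 12.0                                   # per-variable variance
sums = rng.uniform(-0.5, 0.5, size=(trials, n)).sum(axis=1)

def h(u):
    # h(u) = (1 + u) log(1 + u) - u, as in Bennett's bound.
    return (1 + u) * np.log(1 + u) - u

for t in (3.0, 5.0, 7.0):
    empirical = np.mean(sums > t)                     # estimated Pr(sum_i X_i > t)
    bennett = np.exp(-(n * sigma2 / a**2) * h(a * t / (n * sigma2)))
    print(f"t={t}: Pr(sum > t) ~= {empirical:.5f} <= {bennett:.5f}")
```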
References