de Moivre–Laplace theorem

From Wikipedia, the free encyclopedia
As n grows large, the shape of the binomial distribution begins to resemble the smooth Gaussian curve.

In probability theory, the de Moivre–Laplace theorem is a normal approximation to the binomial distribution and a special case of the central limit theorem. It states that the binomial distribution of the number of "successes" in n independent Bernoulli trials, each with success probability p, is approximately a normal distribution with mean np and standard deviation √(np(1−p)), provided that n is large and p is fixed with 0 < p < 1.

The theorem appeared in the second edition of The Doctrine of Chances by Abraham de Moivre, published in 1738. De Moivre did not use the term "Bernoulli trials" in that book; instead he wrote about the probability distribution of the number of times "heads" appears when a coin is tossed 3600 times.[1]

Theorem

As n grows large, for k in the neighborhood of np (for example, with (k − np)/√(npq) remaining bounded) we can approximate[2][3]

{n \choose k}\, p^k q^{n-k} \simeq \frac{1}{\sqrt{2 \pi npq}}\,e^{-\frac{(k-np)^2}{2npq}}, \qquad p+q=1,\ p, q > 0

in the sense that the ratio of the left-hand side to the right-hand side converges to 1 as n → ∞.
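The convergence of the ratio can be checked numerically. The following is a minimal sketch, not part of the cited sources: the function names and the choice p = 0.3, x = 1 are illustrative assumptions, and the binomial term is evaluated in log space with `math.lgamma` so that large n does not overflow.

```python
import math

def binom_pmf(n, k, p):
    """Binomial term C(n, k) p^k q^(n-k), computed via logarithms."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)

def normal_term(n, k, p):
    """Right-hand side of the theorem: Gaussian density with mean np, variance npq."""
    q = 1 - p
    return math.exp(-(k - n * p) ** 2 / (2 * n * p * q)) / math.sqrt(2 * math.pi * n * p * q)

def ratio_at_fixed_x(n, p, x):
    """Ratio of left- to right-hand side at the integer k nearest np + x*sqrt(npq)."""
    q = 1 - p
    k = round(n * p + x * math.sqrt(n * p * q))
    return binom_pmf(n, k, p) / normal_term(n, k, p)

# Illustrative values only: p = 0.3 and x = 1 are arbitrary choices.
for n in (10, 100, 1000, 10000, 100000):
    print(n, ratio_at_fixed_x(n, 0.3, 1.0))
```

The printed ratios should move toward 1 as n increases, though not necessarily monotonically, because k is rounded to an integer.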

Proof

According to Stirling's formula, the factorial of a large number n can be replaced by the approximation

n! \simeq  n^n e^{-n}\sqrt{2 \pi n}\qquad \text{as } n \to \infty.
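As a quick sanity check of Stirling's formula, the sketch below compares n! with the approximation for a few values of n. The helper name `stirling_ratio` is an assumption introduced for illustration; both sides are evaluated as logarithms so that large n does not overflow.

```python
import math

def stirling_ratio(n):
    """Return n! divided by n^n e^(-n) sqrt(2 pi n), computed in log space."""
    log_factorial = math.lgamma(n + 1)                       # ln(n!)
    log_stirling = n * math.log(n) - n + 0.5 * math.log(2 * math.pi * n)
    return math.exp(log_factorial - log_stirling)

for n in (1, 10, 100, 1000):
    print(n, stirling_ratio(n))
```

The ratio stays slightly above 1 and approaches 1 as n grows, which is the sense in which the two sides are asymptotically equal.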

Thus

\begin{align}
{n \choose k} p^k q^{n-k} & = \frac{n!}{k!(n-k)!} p^k q^{n-k} \\
& \simeq \frac{n^n e^{-n}\sqrt{2\pi n} }{\left (k^ke^{-k}\sqrt{2\pi k} \right ) \left ((n-k)^{n-k}e^{-(n-k)}\sqrt{2\pi (n-k)} \right )} p^k q^{n-k}\\
& = \left (\frac{\sqrt{2\pi n} }{\sqrt{2\pi k} \sqrt{2\pi (n-k)} }\right ) \cdot \left (\frac{e^{-n}}{e^{-k}e^{-(n-k)} }\right) \cdot \left (\frac{n^n }{k^k(n-k)^{n-k}} p^k q^{n-k}\right )\\
& = \sqrt{\frac{n}{2\pi k(n-k)}}\cdot 1 \cdot \left (n^n\left(\frac{p}{k}\right)^k{\left(\frac{q }{n-k }\right)}^{(n- k)}\right ) \\
& = \sqrt{\frac{n}{2\pi k(n-k)}} \left ( n^{n-k} n^k {\left(\frac{p}{k}\right)}^k {\left(\frac{q}{n-k}\right)}^{(n-k)}\right )\\
& = \sqrt{\frac{n}{2\pi k(n-k)}} \left ( \left(\frac{np }{k }\right)^k \left(\frac{nq }{n-k }\right)^{(n- k)} \right)\\
& = \sqrt{\frac{n}{2\pi k(n-k)}} \left ( \frac{k}{np}\right)^{-k} \left(\frac{n-k}{nq }\right)^{-(n- k)} \\
& = \sqrt{\frac{n}{2\pi k(n-k)}} \left ( 1+x\sqrt{\frac{q}{np}}\right)^{-k} \left(1-x\sqrt{\frac{p}{nq}}\right)^{-(n - k)} && x :=\frac{k-np}{\sqrt{npq}} \\
& = \sqrt{\frac{n}{2\pi k(n-k)} \frac{n^{-2}}{n^{-2}} } \left(1+x\sqrt{\frac{q}{np}}\right)^{-k} \left(1-x\sqrt{\frac{p}{nq}}\right)^{-(n - k)} \\
& = \sqrt{\frac{n^{-1}}{2\pi k(n-k) n^{-2}}} \left(1+x\sqrt{\frac{q}{np}}\right)^{-k} \left(1-x\sqrt{\frac{p}{nq}}\right)^{-(n - k)} \\
& = \sqrt{\frac{n^{-1}}{2\pi \frac{k}{n}\frac{(n-k)}{n}}} \left(1+x\sqrt{\frac{q}{np}}\right)^{-k} \left(1-x\sqrt{\frac{p}{nq}}\right)^{-(n - k)}\\
& = \sqrt{\frac{n^{-1}}{2\pi \frac{k}{n}\left(1-\frac{k}{n}\right)}} \left(1+x\sqrt{\frac{q}{np}}\right)^{-k} \left(1-x\sqrt{\frac{p}{nq}}\right)^{-(n - k)} \\
&\simeq \sqrt{\frac{n^{-1}}{2\pi p(1-p)}} \left(1+x\sqrt{\frac{q}{np}}\right)^{-k} \left(1-x\sqrt{\frac{p}{nq}}\right)^{-(n - k)}  &&\text{since } k = np + x\sqrt{npq} \text{ gives } \tfrac{k}{n} \to p \text{ as } n \to \infty\\
& =\sqrt{\frac{1}{2\pi npq}} \left(1+x\sqrt{\frac{q}{np}}\right)^{-k} \left(1-x\sqrt{\frac{p}{nq}}\right)^{-(n - k)} && p+q=1\\
& = \frac{1}{\sqrt{2\pi npq}} \exp \left \{ \ln \left [\left(1+x\sqrt{\frac{q}{np}}\right)^{-k} \left(1-x\sqrt{\frac{p}{nq}}\right)^{-(n - k)}\right ] \right \}&& e^{\ln(y)} = y \\
& = \frac{1}{\sqrt{2\pi npq}} \exp \left \{ \ln \left[\left(1+x\sqrt{\frac{q}{np}}\right)^{-k}\right ]+\ln\left [\left(1-x\sqrt{\frac{p}{nq}}\right)^{-(n-k)}\right ] \right \} \\
& = \frac{1}{\sqrt{2\pi npq}} \exp \left \{ -k\ln\left [1+x\sqrt{\frac{q}{np}} \right ] -(n-k)\ln\left [1-x\sqrt{\frac{p}{nq}}\right ] \right \} \\
& = \frac{1}{\sqrt{2\pi npq}} \exp \left \{  -\left(np+x\sqrt{npq}\right) \ln\left [1+x\sqrt{\frac{q}{np}} \right ] - \left(nq-x\sqrt{npq}\right) \ln\left [1-x\sqrt{\frac{p}{nq}}\right ] \right \}  
\end{align}

The last line follows from the definition of x, which gives k = np + x√(npq) and n − k = nq − x√(npq). Expanding ln(1 ± u) in a Taylor series about u = 0, we arrive at:

\begin{align}
\frac{1}{\sqrt{2\pi npq}} \exp &\left \{-\left(np+x\sqrt{npq}\right)\left(x\sqrt{\frac{q}{np}}-\frac{x^2q}{2np}+\cdots \right) -\left(nq-x\sqrt{npq}\right)\left(-x\sqrt{\frac{p}{nq}}-\frac{x^2p}{2nq}-\cdots \right)\right \} \\
& =\frac{1}{\sqrt{2\pi npq}} \exp \left \{ -\left(x\sqrt{npq}-\tfrac{1}{2}x^2q+x^2q+\cdots \right)-\left(-x\sqrt{npq}-\tfrac{1}{2}x^2p+x^2p+\cdots \right) \right \}\\
& =\frac{1}{\sqrt{2\pi npq}} \exp \left \{ -\left(x\sqrt{npq}+\tfrac{1}{2}x^2q+\cdots \right)-\left(-x\sqrt{npq}+\tfrac{1}{2}x^2p+\cdots \right) \right \}\\
& =\frac{1}{\sqrt{2\pi npq}} \exp \left \{ -x\sqrt{npq}-\tfrac{1}{2}x^2q+x\sqrt{npq}-\tfrac{1}{2}x^2p-\cdots \right \} \\
& =\frac{1}{\sqrt{2\pi npq}} \exp \left \{ -\tfrac{1}{2}x^2\left(q+p\right)-\cdots \right \} \\
& =\frac{1}{\sqrt{2\pi npq}} \exp \left \{ -\tfrac{1}{2}x^2-\cdots \right \}\\
&\simeq \frac{1}{\sqrt{2\pi npq}} \exp \left \{ -\tfrac{1}{2}x^2 \right \} && \text{for bounded } x \text{, the omitted terms are } O\!\left(x^3/\sqrt{n}\right) \to 0 \text{ as } n \to \infty\\
&= \frac{1}{\sqrt{2\pi npq}} \exp \left \{ -\frac{1}{2} \left(\frac{k-np}{\sqrt{npq}}\right)^2 \right \} \\
&= \frac{1}{\sqrt{2\pi npq}} \exp \left \{ - \frac{(k-np)^2}{2npq} \right \} 
\end{align}
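The step that discards the higher-order terms can be illustrated numerically. The sketch below evaluates the exponent from the end of the first chain of equalities, −k ln(1 + x√(q/np)) − (n − k) ln(1 − x√(p/nq)), and compares it with −x²/2 for a fixed x. The function name `exact_exponent` and the sample values p = 0.3, x = 2 are assumptions made for illustration, and k is treated as the real number np + x√(npq) rather than rounded to an integer.

```python
import math

def exact_exponent(n, p, x):
    """Exponent before the Taylor expansion, with k = np + x*sqrt(npq)."""
    q = 1 - p
    s = math.sqrt(n * p * q)
    k = n * p + x * s            # so that n - k = nq - x*sqrt(npq)
    return (-k * math.log1p(x * math.sqrt(q / (n * p)))
            - (n - k) * math.log1p(-x * math.sqrt(p / (n * q))))

x = 2.0                          # illustrative fixed value of x
for n in (100, 10_000, 1_000_000):
    gap = exact_exponent(n, 0.3, x) - (-0.5 * x * x)
    print(n, gap)
```

For fixed x the printed gap shrinks as n grows, consistent with the discarded terms vanishing in the limit.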

Thus,

{n \choose k}p^kq^{n-k}\simeq \frac{1}{\sqrt{2\pi npq}}e^{-\frac{(k-np)^2}{2npq}}.
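As a worked illustration, the coin-tossing setting mentioned in the introduction (n = 3600, p = 1/2) can be used to compare the two sides summed over a range of k. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the range 1770 to 1830 is taken from the passage quoted in note 1.

```python
import math

def binom_pmf(n, k, p):
    """Binomial term C(n, k) p^k q^(n-k), computed via logarithms."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)

def normal_term(n, k, p):
    """Gaussian approximation to the binomial term at k."""
    q = 1 - p
    return math.exp(-(k - n * p) ** 2 / (2 * n * p * q)) / math.sqrt(2 * math.pi * n * p * q)

n, p = 3600, 0.5
lo, hi = 1770, 1830              # range of head counts from the quoted passage

exact = sum(binom_pmf(n, k, p) for k in range(lo, hi + 1))
approx = sum(normal_term(n, k, p) for k in range(lo, hi + 1))
print("exact binomial sum:  ", exact)
print("sum of normal terms: ", approx)
```

The two sums should agree closely. Neither needs to match the figure 0.682688 quoted in note 1 exactly, since summing over the integers from 1770 to 1830 includes both endpoints.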

Notes

  1. ^ Walker, Helen M (1985). "De Moivre on the law of normal probability". In Smith, David Eugene. A source book in mathematics. Dover. p. 78. ISBN 0-486-64690-4. "But altho’ the taking an infinite number of Experiments be not practicable, yet the preceding Conclusions may very well be applied to finite numbers, provided they be great, for Instance, if 3600 Experiments be taken, make n = 3600, hence ½n will be = 1800, and ½√n 30, then the Probability of the Event’s neither appearing oftner than 1830 times, nor more rarely than 1770, will be 0.682688." 
  2. ^ Papoulis, A.; Pillai, S. U. Probability, Random Variables, and Stochastic Processes (4th ed.).
  3. ^ Feller, W. (1968). An Introduction to Probability Theory and Its Applications, Volume 1. Wiley. ISBN 0-471-25708-7. Section VII.3.