Bernoulli

Parameters: $0 < p < 1$, $p \in \mathbb{R}$
Support: $k \in \{0, 1\}$
PMF: $\begin{cases} q = 1-p & \text{for } k = 0 \\ p & \text{for } k = 1 \end{cases}$
CDF: $\begin{cases} 0 & \text{for } k < 0 \\ 1-p & \text{for } 0 \leq k < 1 \\ 1 & \text{for } k \geq 1 \end{cases}$
Mean: $p$
Median: $\begin{cases} 0 & \text{if } q > p \\ 0.5 & \text{if } q = p \\ 1 & \text{if } q < p \end{cases}$
Mode: $\begin{cases} 0 & \text{if } q > p \\ 0, 1 & \text{if } q = p \\ 1 & \text{if } q < p \end{cases}$
Variance: $p(1-p) = pq$
Skewness: $\dfrac{1-2p}{\sqrt{pq}}$
Excess kurtosis: $\dfrac{1-6pq}{pq}$
Entropy: $-q\ln(q) - p\ln(p)$
MGF: $q + pe^{t}$
CF: $q + pe^{it}$
PGF: $q + pz$
Fisher information: $\dfrac{1}{p(1-p)}$
In probability theory and statistics, the Bernoulli distribution, named after the Swiss scientist Jacob Bernoulli,[1] is the probability distribution of a random variable which takes the value 1 with success probability $p$ and the value 0 with failure probability $q = 1 - p$. It can be used to represent a coin toss where 1 and 0 would represent "heads" and "tails" (or vice versa), respectively. In particular, unfair coins would have $p \neq 0.5$.
The Bernoulli distribution is a special case of the two-point distribution, for which the two possible outcomes need not be 0 and 1. It is also a special case of the binomial distribution: the Bernoulli distribution is a binomial distribution with $n = 1$.
Properties
If $X$ is a random variable with this distribution, we have:

$\Pr(X=1) = 1 - \Pr(X=0) = 1 - q = p.$
The probability mass function $f$ of this distribution, over possible outcomes $k$, is

$f(k;p) = \begin{cases} p & \text{if } k = 1, \\ 1-p & \text{if } k = 0. \end{cases}$
This can also be expressed as

$f(k;p) = p^{k}(1-p)^{1-k} \quad \text{for } k \in \{0,1\}.$
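The closed form $p^{k}(1-p)^{1-k}$ can be evaluated directly; here is a minimal Python sketch (the function name is illustrative, not from the source):

```python
def bernoulli_pmf(k, p):
    """PMF of the Bernoulli distribution: p**k * (1-p)**(1-k) for k in {0, 1}."""
    if k not in (0, 1):
        raise ValueError("k must be 0 or 1")
    return p**k * (1 - p)**(1 - k)

# For p = 0.3: success has probability 0.3, failure has probability 0.7.
print(bernoulli_pmf(1, 0.3))
print(bernoulli_pmf(0, 0.3))
```

Note how the exponents select the right factor: for $k=1$ the $(1-p)$ term vanishes (exponent 0), and for $k=0$ the $p$ term vanishes.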
The Bernoulli distribution is a special case of the binomial distribution with $n = 1$.[2]
The kurtosis goes to infinity for high and low values of $p$, but for $p = 1/2$ the two-point distributions, including the Bernoulli distribution, have a lower excess kurtosis (namely −2) than any other probability distribution.
The Bernoulli distributions for $0 \leq p \leq 1$ form an exponential family.
The maximum likelihood estimator of $p$ based on a random sample is the sample mean.
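This can be checked numerically: estimating $p$ by the sample mean recovers the true success probability on simulated data. A minimal sketch (function name and parameter values are illustrative):

```python
import random

def bernoulli_mle(sample):
    """The maximum likelihood estimate of p for Bernoulli data is the sample mean."""
    return sum(sample) / len(sample)

random.seed(0)
p_true = 0.3
# Draw 100,000 Bernoulli(0.3) observations.
sample = [1 if random.random() < p_true else 0 for _ in range(100_000)]
p_hat = bernoulli_mle(sample)
print(p_hat)  # close to 0.3
```

With $n$ observations the estimator has standard deviation $\sqrt{pq/n}$ (from the variance and i.i.d. sampling), so at $n = 100{,}000$ the estimate is typically within a few thousandths of the true value.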
Mean
The expected value of a Bernoulli random variable $X$ is

$\operatorname{E}(X) = p$

This is because for a Bernoulli distributed random variable $X$ with $\Pr(X=1) = p$ and $\Pr(X=0) = q$ we find

$\operatorname{E}[X] = \Pr(X=1) \cdot 1 + \Pr(X=0) \cdot 0 = p \cdot 1 + q \cdot 0 = p$
Variance
The variance of a Bernoulli distributed $X$ is

$\operatorname{Var}[X] = pq = p(1-p)$

We first find

$\operatorname{E}[X^{2}] = \Pr(X=1) \cdot 1^{2} + \Pr(X=0) \cdot 0^{2} = p \cdot 1^{2} + q \cdot 0^{2} = p$

From this follows

$\operatorname{Var}[X] = \operatorname{E}[X^{2}] - \operatorname{E}[X]^{2} = p - p^{2} = p(1-p) = pq$
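The mean and variance derivations above can be reproduced directly from the two-point pmf; a minimal Python sketch (function name is illustrative):

```python
def bernoulli_moments(p):
    """Mean and variance computed termwise from the two outcomes, as in the derivation."""
    q = 1 - p
    mean = 1 * p + 0 * q             # E[X]   = p*1 + q*0 = p
    second = 1**2 * p + 0**2 * q     # E[X^2] = p*1^2 + q*0^2 = p
    var = second - mean**2           # Var[X] = E[X^2] - E[X]^2 = p - p^2
    return mean, var

mean, var = bernoulli_moments(0.3)
print(mean, var)  # approximately 0.3 and 0.21 (= p*q)
```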
Skewness
The skewness is

$\frac{q-p}{\sqrt{pq}} = \frac{1-2p}{\sqrt{pq}}$

When we take the standardized Bernoulli distributed random variable

$\frac{X - \operatorname{E}[X]}{\sqrt{\operatorname{Var}[X]}}$

we find that this random variable attains $\frac{q}{\sqrt{pq}}$ with probability $p$ and attains $-\frac{p}{\sqrt{pq}}$ with probability $q$. Thus we get

$\begin{aligned}
\gamma_{1} &= \operatorname{E}\left[\left(\frac{X - \operatorname{E}[X]}{\sqrt{\operatorname{Var}[X]}}\right)^{3}\right] \\
&= p \cdot \left(\frac{q}{\sqrt{pq}}\right)^{3} + q \cdot \left(-\frac{p}{\sqrt{pq}}\right)^{3} \\
&= \frac{1}{\sqrt{pq}^{\,3}}\left(pq^{3} - qp^{3}\right) \\
&= \frac{pq}{\sqrt{pq}^{\,3}}(q-p) \\
&= \frac{q-p}{\sqrt{pq}}
\end{aligned}$
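The closed form can be cross-checked against the third standardized moment computed directly over the two outcomes, exactly as in the derivation above (a minimal sketch; function names are illustrative):

```python
import math

def bernoulli_skewness(p):
    """Closed form (1 - 2p) / sqrt(pq)."""
    q = 1 - p
    return (1 - 2 * p) / math.sqrt(p * q)

def skewness_from_moments(p):
    """E[((X - p)/sigma)^3]: value q/sigma with probability p, -p/sigma with probability q."""
    q = 1 - p
    sigma = math.sqrt(p * q)
    return p * (q / sigma) ** 3 + q * (-p / sigma) ** 3

# The two expressions agree for any p in (0, 1); skewness is 0 at p = 0.5.
for p in (0.1, 0.5, 0.9):
    assert abs(bernoulli_skewness(p) - skewness_from_moments(p)) < 1e-12
```

The sign tells the story: for $p < 1/2$ the distribution is skewed toward 1 (positive skewness), for $p > 1/2$ toward 0, and at $p = 1/2$ it is symmetric.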
Related distributions

If $X_{1}, \dots, X_{n}$ are independent, identically distributed (i.i.d.) random variables, all Bernoulli distributed with success probability $p$, then

$Y = \sum_{k=1}^{n} X_{k} \sim \mathrm{B}(n, p)$

(binomial distribution). The Bernoulli distribution is simply $\mathrm{B}(1, p)$.
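The sum-of-Bernoullis characterization gives a direct way to sample from $\mathrm{B}(n, p)$ and to sanity-check its mean $np$; a minimal simulation sketch (names and parameter values are illustrative):

```python
import random

random.seed(1)
n, p, trials = 10, 0.3, 50_000

def binomial_draw(n, p):
    """One Binomial(n, p) draw as the sum of n independent Bernoulli(p) draws."""
    return sum(1 if random.random() < p else 0 for _ in range(n))

draws = [binomial_draw(n, p) for _ in range(trials)]
empirical_mean = sum(draws) / trials
print(empirical_mean)  # close to n*p = 3.0
```

Since each Bernoulli term contributes mean $p$ and variance $pq$, the sum has mean $np$ and variance $npq$, matching the binomial distribution.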
Notes
1. James Victor Uspensky: Introduction to Mathematical Probability, McGraw-Hill, New York 1937, page 45
2. McCullagh and Nelder (1989), Section 4.2.2.
References
McCullagh, Peter; Nelder, John (1989). Generalized Linear Models (2nd ed.). Boca Raton: Chapman and Hall/CRC. ISBN 0-412-31760-5.
Johnson, N. L.; Kotz, S.; Kemp, A. (1993). Univariate Discrete Distributions (2nd ed.). Wiley. ISBN 0-471-54897-9.