User:Hilikus44

Contributions I have made to Wikipedia

One-Dimensional Examples of Sufficient Statistics

Normal Distribution

If X_1,...,X_n are independent and normally distributed with expected value θ (a parameter) and known finite variance \sigma^{2}, then T(X_1^n)=\overline{X}=\frac1n\sum_{i=1}^nX_i is a sufficient statistic for θ.

To see this, consider the joint probability density function of X_1^n=(X_1,...,X_n). Because the observations are independent, the pdf can be written as a product of individual densities, i.e.,

\begin{align}
f_{X_1^n}(x_1^n) 
  &= \prod_{i=1}^n \tfrac{1}{\sqrt{2\pi\sigma^2}}\, e^{\frac{-(x_i-\theta)^2}{2\sigma^2}} 
   = (2\pi\sigma^2)^{-n\over2}\, e^{ {-1\over2\sigma^2}\sum_{i=1}^n(x_i-\theta)^2} \\
  &= (2\pi\sigma^2)^{-n\over2}\, e^{ {-1\over2\sigma^2}\sum_{i=1}^n( (x_i-\overline{x}) - (\theta-\overline{x}) )^2} \\
  &= (2\pi\sigma^2)^{-n\over2}\, e^{ {-1\over2\sigma^2} (\sum_{i=1}^n(x_i-\overline{x})^2 + \sum_{i=1}^n(\theta-\overline{x})^2 - 2\sum_{i=1}^n(x_i-\overline{x})(\theta-\overline{x})) }.
\end{align}

Then, since \sum_{i=1}^n(x_i-\overline{x})(\theta-\overline{x}) = (\theta-\overline{x})\sum_{i=1}^n(x_i-\overline{x}) = 0 (the factor \theta-\overline{x} does not depend on i, and the deviations from the sample mean sum to zero),

\begin{align}
f_{X_1^n}(x_1^n) 
  &= (2\pi\sigma^2)^{-n\over2}\, e^{ {-1\over2\sigma^2} (\sum_{i=1}^n(x_i-\overline{x})^2 + n(\theta-\overline{x})^2) } \\
  &= (2\pi\sigma^2)^{-n\over2}\, e^{ {-1\over2\sigma^2} \sum_{i=1}^n(x_i-\overline{x})^2}\, e^{ {-n\over2\sigma^2}(\theta-\overline{x})^2 }.
\end{align}

The joint density of the sample takes the form required by the Fisher–Neyman factorization theorem, by letting

\begin{align} 
h(x_1^n)= (2\pi\sigma^2)^{-n\over2}\, e^{ {-1\over2\sigma^2} \sum_{i=1}^n(x_i-\overline{x})^2},\,\,\, 
g_{\theta}(x_1^n)= e^{ {-n\over2\sigma^2}(\theta-\overline{x})^2 }.
\end{align}

Since h(x_1^n) does not depend on the parameter \theta, and g_{\theta}(x_1^n) depends on x_1^n only through the function T(X_1^n)=\overline{X}=\frac1n\sum_{i=1}^nX_i, the Fisher–Neyman factorization theorem implies that T(X_1^n)=\overline{X}=\frac1n\sum_{i=1}^nX_i is a sufficient statistic for \theta.
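A minimal numerical sketch of this factorization (not part of the article itself): assuming NumPy and SciPy are available, and using arbitrary example values for \theta, \sigma^2 and the sample, the check below confirms that the joint log-density equals \log h(x_1^n) + \log g_{\theta}(x_1^n), where g_{\theta} uses the data only through the sample mean.

# Illustrative sketch: verify the Fisher–Neyman factorization numerically
# for the normal example. The parameter values and sample are arbitrary
# assumptions chosen only for the check.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
theta, sigma, n = 2.5, 1.3, 8             # assumed example values
x = rng.normal(theta, sigma, size=n)      # a sample x_1, ..., x_n
xbar = x.mean()                           # T(x) = sample mean

# Joint log-density computed directly as the sum of individual normal log-pdfs
log_f = norm.logpdf(x, loc=theta, scale=sigma).sum()

# Factorized form: log h(x) + log g_theta(x), with g depending on x only via xbar
log_h = -0.5 * n * np.log(2 * np.pi * sigma**2) - ((x - xbar)**2).sum() / (2 * sigma**2)
log_g = -n * (theta - xbar)**2 / (2 * sigma**2)

print(np.isclose(log_f, log_h + log_g))   # True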

Exponential Distribution

If X_1,...,X_n are independent and exponentially distributed with expected value θ (an unknown real-valued positive parameter), then T(X_1^n)=\sum_{i=1}^nX_i is a sufficient statistic for θ.

To see this, consider the joint probability density function of X_1^n=(X_1,...,X_n). Because the observations are independent, the pdf can be written as a product of individual densities, i.e.,

\begin{align}
f_{X_1^n}(x_1^n) 
  &= \prod_{i=1}^n {1 \over \theta}  \, e^{ {-1 \over \theta}x_i }
   =               {1 \over \theta^n}\, e^{ {-1 \over \theta} \sum_{i=1}^nx_i }.
\end{align}

The joint density of the sample takes the form required by the Fisher–Neyman factorization theorem, by letting

\begin{align} 
h(x_1^n)= 1,\,\,\, 
g_{\theta}(x_1^n)= {1 \over \theta^n}\, e^{ {-1 \over \theta} \sum_{i=1}^nx_i }.
\end{align}

Since h(x_1^n) does not depend on the parameter \theta, and g_{\theta}(x_1^n) depends on x_1^n only through the function T(X_1^n)=\sum_{i=1}^nX_i, the Fisher–Neyman factorization theorem implies that T(X_1^n)=\sum_{i=1}^nX_i is a sufficient statistic for \theta.
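A minimal numerical sketch (not part of the article itself), assuming NumPy and SciPy and arbitrary example values: with h(x_1^n)=1, the check below confirms that the joint log-density equals \log g_{\theta}(x_1^n), which uses the data only through the sum of the observations.

# Illustrative sketch: verify the factorization numerically for the
# exponential example. Parameter value and sample size are arbitrary assumptions.
import numpy as np
from scipy.stats import expon

rng = np.random.default_rng(1)
theta, n = 3.0, 10                        # mean theta (scale parameter), sample size
x = rng.exponential(theta, size=n)
t = x.sum()                               # T(x) = sum of the observations

# Joint log-density as the sum of individual exponential log-pdfs (mean theta)
log_f = expon.logpdf(x, scale=theta).sum()

# Factorized form: h(x) = 1, g_theta(x) = theta^{-n} * exp(-t / theta)
log_g = -n * np.log(theta) - t / theta

print(np.isclose(log_f, log_g))           # True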


Two-Dimensional Examples of Sufficient Statistics

Uniform Distribution (with two parameters)

If X_1,...,X_n\, are independent and uniformly distributed on the interval [\alpha, \beta]\, (where \alpha\, and \beta\, are unknown parameters), then T(X_1^n)=(\min_{1 \leq i \leq n}X_i,\max_{1 \leq i \leq n}X_i)\, is a two-dimensional sufficient statistic for (\alpha\, , \, \beta).

To see this, consider the joint probability density function of X_1^n=(X_1,...,X_n). Because the observations are independent, the pdf can be written as a product of individual densities, i.e.,

\begin{align}
f_{X_1^n}(x_1^n) 
  &= \prod_{i=1}^n ({1 \over \beta-\alpha}) \mathbf{1}_{ \{ \alpha \leq x_i \leq \beta \} } \\
  &= ({1 \over \beta-\alpha})^n \mathbf{1}_{ \{ \alpha \leq x_i \leq \beta, \, \forall \, i = 1,\cdots,n\}} \\
  &= ({1 \over \beta-\alpha})^n \mathbf{1}_{ \{ \alpha \, \leq \, \min_{1 \leq i \leq n}x_i \} } \mathbf{1}_{ \{ \max_{1 \leq i \leq n}x_i \, \leq \, \beta \} }.
\end{align}

The joint density of the sample takes the form required by the Fisher–Neyman factorization theorem, by letting

\begin{align} 
h(x_1^n)= 1,\,\,\, 
g_{(\alpha \, , \, \beta)}(x_1^n)= ({1 \over \beta-\alpha})^n \mathbf{1}_{ \{ \alpha \, \leq \, \min_{1 \leq i \leq n}x_i \} } \mathbf{1}_{ \{ \max_{1 \leq i \leq n}x_i \, \leq \, \beta \} }.
\end{align}

Since h(x_1^n) does not depend on the parameter (\alpha\, , \, \beta), and g_{(\alpha \, , \, \beta)}(x_1^n) depends on x_1^n only through the function T(X_1^n)=(\min_{1 \leq i \leq n}X_i,\max_{1 \leq i \leq n}X_i)\,, the Fisher–Neyman factorization theorem implies that T(X_1^n)=(\min_{1 \leq i \leq n}X_i,\max_{1 \leq i \leq n}X_i)\, is a sufficient statistic for (\alpha\, , \, \beta).
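A minimal numerical sketch (not part of the article itself), assuming NumPy and arbitrary example values for \alpha and \beta: the joint uniform density of a sample can be recomputed from (\min, \max) alone, which is the content of the factorization above.

# Illustrative sketch: the joint uniform density depends on the sample
# only through (min, max). Parameter values and sample size are arbitrary assumptions.
import numpy as np

rng = np.random.default_rng(2)
alpha, beta, n = 1.0, 4.0, 6
x = rng.uniform(alpha, beta, size=n)
t = (x.min(), x.max())                    # T(x) = (min, max)

def joint_density(x, alpha, beta):
    # Product of the individual uniform densities on [alpha, beta]
    inside = np.all((alpha <= x) & (x <= beta))
    return inside / (beta - alpha)**len(x)

def g(t, alpha, beta, n):
    # g_{(alpha, beta)} from the factorization: uses x only via (min, max)
    lo, hi = t
    return (alpha <= lo) * (hi <= beta) / (beta - alpha)**n

print(np.isclose(joint_density(x, alpha, beta), g(t, alpha, beta, n)))  # True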

Gamma Distribution

If X_1,...,X_n\, are independent and distributed as \Gamma(\alpha \, , \, \beta)\,, where \alpha\, (the shape) and \beta\, (the scale) are unknown parameters, then T(X_1^n)=( \prod_{i=1}^n{X_i} , \sum_{i=1}^n{X_i} )\, is a two-dimensional sufficient statistic for (\alpha\, , \, \beta).

To see this, consider the joint probability density function of X_1^n=(X_1,...,X_n). Because the observations are independent, the pdf can be written as a product of individual densities, i.e.,

\begin{align}
f_{X_1^n}(x_1^n) 
  &= \prod_{i=1}^n ({1 \over \Gamma(\alpha) \beta^{\alpha}}) x_i^{\alpha -1} e^{{-1 \over \beta}x_i} \\
  &= ({1 \over \Gamma(\alpha) \beta^{\alpha}})^n (\prod_{i=1}^n x_i)^{\alpha-1} e^{{-1 \over \beta} \sum_{i=1}^n{x_i}}.
\end{align}

The joint density of the sample takes the form required by the Fisher–Neyman factorization theorem, by letting

\begin{align} 
h(x_1^n)= 1,\,\,\, 
g_{(\alpha \, , \, \beta)}(x_1^n)= ({1 \over \Gamma(\alpha) \beta^{\alpha}})^n (\prod_{i=1}^n x_i)^{\alpha-1} e^{{-1 \over \beta} \sum_{i=1}^n{x_i}}.
\end{align}

Since h(x_1^n) does not depend on the parameter (\alpha\, , \, \beta), and g_{(\alpha \, , \, \beta)}(x_1^n) depends on x_1^n only through the function T(X_1^n)=( \prod_{i=1}^n{X_i} , \sum_{i=1}^n{X_i} )\,, the Fisher–Neyman factorization theorem implies that T(X_1^n)=( \prod_{i=1}^n{X_i} , \sum_{i=1}^n{X_i} )\, is a sufficient statistic for (\alpha\, , \, \beta).
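A minimal numerical sketch (not part of the article itself), assuming NumPy and SciPy, the shape–scale parametrization used above, and arbitrary example values: the check below confirms that the joint log-density can be recovered from the product and the sum of the observations alone.

# Illustrative sketch: verify the factorization numerically for the gamma
# example (shape alpha, scale beta). Parameter values and sample size are
# arbitrary assumptions.
import numpy as np
from scipy.stats import gamma
from scipy.special import gammaln

rng = np.random.default_rng(3)
alpha, beta, n = 2.0, 1.5, 7
x = rng.gamma(alpha, beta, size=n)        # numpy's gamma takes (shape, scale)
prod_x, sum_x = x.prod(), x.sum()         # T(x) = (product, sum)

# Joint log-density as the sum of individual gamma log-pdfs
log_f = gamma.logpdf(x, alpha, scale=beta).sum()

# Factorized form: h(x) = 1, and g depends on x only through (prod_x, sum_x)
log_g = (-n * (gammaln(alpha) + alpha * np.log(beta))
         + (alpha - 1) * np.log(prod_x) - sum_x / beta)

print(np.isclose(log_f, log_g))           # True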