# User:Hilikus44

Contributions I have made to Wikipedia

## One-Dimensional Examples of Sufficient Statistics

### Normal Distribution

If $X_1,...,X_n$ are independent and normally distributed with expected value $\theta$ (a parameter) and known finite variance $\sigma^{2}$, then $T(X_1^n)=\overline{X}=\frac1n\sum_{i=1}^nX_i$ is a sufficient statistic for $\theta$.

To see this, consider the joint probability density function of $X_1^n=(X_1,...,X_n)$. Because the observations are independent, the pdf can be written as a product of the individual densities:

\begin{align}
f_{X_1^n}(x_1^n) &= \prod_{i=1}^n \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x_i-\theta)^2}{2\sigma^2}} \\
&= (2\pi\sigma^2)^{-\frac{n}{2}}\, e^{ -\frac{1}{2\sigma^2}\sum_{i=1}^n(x_i-\theta)^2 } \\
&= (2\pi\sigma^2)^{-\frac{n}{2}}\, e^{ -\frac{1}{2\sigma^2}\sum_{i=1}^n\left( (x_i-\overline{x}) - (\theta-\overline{x}) \right)^2 } \\
&= (2\pi\sigma^2)^{-\frac{n}{2}}\, e^{ -\frac{1}{2\sigma^2}\left( \sum_{i=1}^n(x_i-\overline{x})^2 + \sum_{i=1}^n(\theta-\overline{x})^2 - 2\sum_{i=1}^n(x_i-\overline{x})(\theta-\overline{x}) \right) }.
\end{align}

Then, since $\sum_{i=1}^n(x_i-\overline{x})(\theta-\overline{x}) = (\theta-\overline{x})\sum_{i=1}^n(x_i-\overline{x}) = 0$ (the factor $(\theta-\overline{x})$ does not depend on $i$, and the deviations from the sample mean sum to zero), and since $\sum_{i=1}^n(\theta-\overline{x})^2 = n(\theta-\overline{x})^2$,

\begin{align}
f_{X_1^n}(x_1^n) &= (2\pi\sigma^2)^{-\frac{n}{2}}\, e^{ -\frac{1}{2\sigma^2}\left( \sum_{i=1}^n(x_i-\overline{x})^2 + n(\theta-\overline{x})^2 \right) } \\
&= (2\pi\sigma^2)^{-\frac{n}{2}}\, e^{ -\frac{1}{2\sigma^2} \sum_{i=1}^n(x_i-\overline{x})^2 }\, e^{ -\frac{n}{2\sigma^2}(\theta-\overline{x})^2 }.
\end{align}

The joint density of the sample takes the form required by the Fisher–Neyman factorization theorem, by letting

\begin{align} h(x_1^n)= (2\pi\sigma^2)^{-n\over2}\, e^{ {-1\over2\sigma^2} \sum_{i=1}^n(x_i-\overline{x})^2},\,\,\, g_{\theta}(x_1^n)= e^{ {-n\over2\sigma^2}(\theta-\overline{x})^2 }. \end{align}

Since $h(x_1^n)$ does not depend on the parameter $\theta$, and $g_{\theta}(x_1^n)$ depends on $x_1^n$ only through the function $T(X_1^n)=\overline{X}=\frac1n\sum_{i=1}^nX_i,$

the Fisher–Neyman factorization theorem implies $T(X_1^n)=\overline{X}=\frac1n\sum_{i=1}^nX_i$ is a sufficient statistic for $\theta$.
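As a quick numerical sanity check (not part of the proof), the factorization above can be verified in Python. The helper names `joint_pdf` and `factorized` are illustrative, not from any library; the known variance is fixed at an assumed value:

```python
import math
import random

SIGMA2 = 1.0  # known variance (assumed value for illustration)

def joint_pdf(xs, theta, sigma2=SIGMA2):
    """Joint density of i.i.d. N(theta, sigma2) observations."""
    return math.prod(
        math.exp(-(x - theta) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)
        for x in xs
    )

def factorized(xs, theta, sigma2=SIGMA2):
    """h(x) * g_theta(x): theta enters only through the sample mean."""
    n = len(xs)
    xbar = sum(xs) / n
    h = (2 * math.pi * sigma2) ** (-n / 2) * math.exp(
        -sum((x - xbar) ** 2 for x in xs) / (2 * sigma2)
    )
    g = math.exp(-n * (theta - xbar) ** 2 / (2 * sigma2))
    return h * g

random.seed(0)
sample = [random.gauss(1.5, 1.0) for _ in range(10)]
for theta in (-1.0, 0.0, 2.5):
    # the two expressions agree for every theta, as the algebra above shows
    assert math.isclose(joint_pdf(sample, theta), factorized(sample, theta))
```

Note that only `g` involves $\theta$, and it does so purely via `xbar`, which is exactly what sufficiency of $\overline{X}$ requires.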

### Exponential Distribution

If $X_1,...,X_n$ are independent and exponentially distributed with expected value $\theta$ (an unknown, positive, real-valued parameter), then $T(X_1^n)=\sum_{i=1}^nX_i$ is a sufficient statistic for $\theta$.

To see this, consider the joint probability density function of $X_1^n=(X_1,...,X_n)$. Because the observations are independent, the pdf can be written as a product of the individual densities:

\begin{align} f_{X_1^n}(x_1^n) &= \prod_{i=1}^n {1 \over \theta} \, e^{ {-1 \over \theta}x_i } = {1 \over \theta^n}\, e^{ {-1 \over \theta} \sum_{i=1}^nx_i }. \end{align}

The joint density of the sample takes the form required by the Fisher–Neyman factorization theorem, by letting

\begin{align} h(x_1^n)= 1,\,\,\, g_{\theta}(x_1^n)= {1 \over \theta^n}\, e^{ {-1 \over \theta} \sum_{i=1}^nx_i }. \end{align}

Since $h(x_1^n)$ does not depend on the parameter $\theta$, and $g_{\theta}(x_1^n)$ depends on $x_1^n$ only through the function $T(X_1^n)=\sum_{i=1}^nX_i$,

the Fisher–Neyman factorization theorem implies $T(X_1^n)=\sum_{i=1}^nX_i$ is a sufficient statistic for $\theta$.
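The same kind of numerical sanity check applies here; again, `joint_pdf` and `factorized` are illustrative names, and the exponential density is parameterized by its mean $\theta$, matching the text:

```python
import math

def joint_pdf(xs, theta):
    """Joint density of i.i.d. Exponential observations with mean theta."""
    return math.prod(math.exp(-x / theta) / theta for x in xs)

def factorized(xs, theta):
    """h(x) * g_theta(x): theta enters only through the sum of the sample."""
    h = 1.0  # h does not depend on theta (here it is constant)
    g = theta ** (-len(xs)) * math.exp(-sum(xs) / theta)
    return h * g

sample = [0.3, 1.2, 2.5, 0.7]
for theta in (0.5, 1.0, 3.0):
    # joint density and factorized form agree for every theta
    assert math.isclose(joint_pdf(sample, theta), factorized(sample, theta))
```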

## Two-Dimensional Examples of Sufficient Statistics

### Uniform Distribution (with two parameters)

If $X_1,...,X_n\,$ are independent and uniformly distributed on the interval $[\alpha, \beta]\,$ (where $\alpha\,$ and $\beta\,$ are unknown parameters), then $T(X_1^n)=(\min_{1 \leq i \leq n}X_i,\max_{1 \leq i \leq n}X_i)\,$ is a two-dimensional sufficient statistic for $(\alpha\, , \, \beta)$.

To see this, consider the joint probability density function of $X_1^n=(X_1,...,X_n)$. Because the observations are independent, the pdf can be written as a product of the individual densities:

\begin{align}
f_{X_1^n}(x_1^n) &= \prod_{i=1}^n \frac{1}{\beta-\alpha}\, \mathbf{1}_{ \{ \alpha \leq x_i \leq \beta \} } \\
&= \left(\frac{1}{\beta-\alpha}\right)^n \mathbf{1}_{ \{ \alpha \leq x_i \leq \beta, \, \forall \, i = 1,\cdots,n \} } \\
&= \left(\frac{1}{\beta-\alpha}\right)^n \mathbf{1}_{ \{ \alpha \, \leq \, \min_{1 \leq i \leq n}x_i \} }\, \mathbf{1}_{ \{ \max_{1 \leq i \leq n}x_i \, \leq \, \beta \} }.
\end{align}

The joint density of the sample takes the form required by the Fisher–Neyman factorization theorem, by letting

\begin{align} h(x_1^n)= 1,\,\,\, g_{(\alpha \, , \, \beta)}(x_1^n)= \left(\frac{1}{\beta-\alpha}\right)^n \mathbf{1}_{ \{ \alpha \, \leq \, \min_{1 \leq i \leq n}x_i \} }\, \mathbf{1}_{ \{ \max_{1 \leq i \leq n}x_i \, \leq \, \beta \} }. \end{align}

Since $h(x_1^n)$ does not depend on the parameter $(\alpha\, , \, \beta)$, and $g_{(\alpha \, , \, \beta)}(x_1^n)$ depends on $x_1^n$ only through the function $T(X_1^n)=(\min_{1 \leq i \leq n}X_i,\max_{1 \leq i \leq n}X_i)\,$,

the Fisher–Neyman factorization theorem implies $T(X_1^n)=(\min_{1 \leq i \leq n}X_i,\max_{1 \leq i \leq n}X_i)\,$ is a sufficient statistic for $(\alpha\, , \, \beta)$.
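Here, too, the factorization can be checked numerically. In this sketch (illustrative helper names again), the indicator functions become boolean comparisons, and the parameter pair enters `factorized` only through the sample minimum and maximum:

```python
import math

def joint_pdf(xs, a, b):
    """Joint density of i.i.d. Uniform[a, b] observations (0 outside the box)."""
    dens = 1.0
    for x in xs:
        dens *= 1.0 / (b - a) if a <= x <= b else 0.0
    return dens

def factorized(xs, a, b):
    """h(x) * g_(a,b)(x): (a, b) enters only through min(xs) and max(xs)."""
    h = 1.0
    # booleans act as 0/1 indicator factors
    g = (1.0 / (b - a)) ** len(xs) * (a <= min(xs)) * (max(xs) <= b)
    return h * g

sample = [0.4, 1.7, 0.9, 2.2]
for a, b in [(0.0, 3.0), (0.5, 2.0), (1.0, 5.0)]:
    # both forms vanish together when the sample leaves [a, b]
    assert math.isclose(joint_pdf(sample, a, b), factorized(sample, a, b))
```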

### Gamma Distribution

If $X_1,...,X_n\,$ are independent and each follows a $\Gamma(\alpha \, , \, \beta)$ distribution, where $\alpha\,$ and $\beta\,$ are unknown parameters, then $T(X_1^n)=\left( \prod_{i=1}^n X_i , \sum_{i=1}^n X_i \right)\,$ is a two-dimensional sufficient statistic for $(\alpha\, , \, \beta)$.

To see this, consider the joint probability density function of $X_1^n=(X_1,...,X_n)$. Because the observations are independent, the pdf can be written as a product of the individual densities:

\begin{align}
f_{X_1^n}(x_1^n) &= \prod_{i=1}^n \frac{1}{\Gamma(\alpha) \beta^{\alpha}}\, x_i^{\alpha -1} e^{-\frac{x_i}{\beta}} \\
&= \left(\frac{1}{\Gamma(\alpha) \beta^{\alpha}}\right)^n \left(\prod_{i=1}^n x_i\right)^{\alpha-1} e^{-\frac{1}{\beta} \sum_{i=1}^n x_i}.
\end{align}

The joint density of the sample takes the form required by the Fisher–Neyman factorization theorem, by letting

\begin{align} h(x_1^n)= 1,\,\,\, g_{(\alpha \, , \, \beta)}(x_1^n)= ({1 \over \Gamma(\alpha) \beta^{\alpha}})^n (\prod_{i=1}^n x_i)^{\alpha-1} e^{{-1 \over \beta} \sum_{i=1}^n{x_i}}. \end{align}

Since $h(x_1^n)$ does not depend on the parameter $(\alpha\, , \, \beta)$, and $g_{(\alpha \, , \, \beta)}(x_1^n)$ depends on $x_1^n$ only through the function $T(X_1^n)=\left( \prod_{i=1}^n X_i , \sum_{i=1}^n X_i \right)\,$,

the Fisher–Neyman factorization theorem implies $T(X_1^n)=\left( \prod_{i=1}^n X_i , \sum_{i=1}^n X_i \right)\,$ is a sufficient statistic for $(\alpha\, , \, \beta)$.
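A final numerical check, under the same caveats as before (illustrative names, not a library API). Note the Gamma density here uses the scale parameterization $e^{-x/\beta}$, matching the formulas above, and the parameters enter `factorized` only through $\prod x_i$ and $\sum x_i$:

```python
import math

def joint_pdf(xs, alpha, beta):
    """Joint density of i.i.d. Gamma(alpha, beta) observations (scale beta)."""
    c = 1.0 / (math.gamma(alpha) * beta ** alpha)  # normalizing constant
    return math.prod(c * x ** (alpha - 1) * math.exp(-x / beta) for x in xs)

def factorized(xs, alpha, beta):
    """h(x) * g_(alpha,beta)(x): parameters enter only via prod(xs), sum(xs)."""
    n = len(xs)
    h = 1.0
    g = (
        (math.gamma(alpha) * beta ** alpha) ** (-n)
        * math.prod(xs) ** (alpha - 1)
        * math.exp(-sum(xs) / beta)
    )
    return h * g

sample = [0.8, 1.5, 2.4, 0.6]
for alpha, beta in [(0.5, 1.0), (2.0, 0.7), (3.5, 2.0)]:
    # joint density and factorized form agree for every parameter pair
    assert math.isclose(joint_pdf(sample, alpha, beta), factorized(sample, alpha, beta))
```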