= Normal-inverse-gamma distribution =

- Probability density function: $\frac {\sqrt{\lambda}} {\sigma\sqrt{2\pi} } \, \frac{\beta^\alpha}{\Gamma(\alpha)} \, \left( \frac{1}{\sigma^2} \right)^{\alpha + 1} \exp \left( -\frac { 2\beta + \lambda (x - \mu)^2} {2\sigma^2} \right)$
- Mean: $\operatorname{E}[x] = \mu$; $\operatorname{E}[\sigma^2] = \frac{\beta}{\alpha - 1}$, for $\alpha > 1$
- Mode: $x = \mu \; \textrm{(univariate)}, x = \boldsymbol{\mu} \; \textrm{(multivariate)}$; $\sigma^2 = \frac{\beta}{\alpha + 1 + 1/2} \; \textrm{(univariate)}, \sigma^2 = \frac{\beta}{\alpha + 1 + k/2} \; \textrm{(multivariate)}$
- Variance: $\operatorname{Var}[x] = \frac{\beta}{(\alpha -1)\lambda}$, for $\alpha > 1$; $\operatorname{Var}[\sigma^2] = \frac{\beta^2}{(\alpha -1)^2(\alpha -2)}$, for $\alpha > 2$; $\operatorname{Cov}[x, \sigma^2] = 0$, for $\alpha > 1$

In probability theory and statistics, the normal-inverse-gamma distribution (or Gaussian-inverse-gamma distribution) is a four-parameter family of multivariate continuous probability distributions. It is the conjugate prior of a normal distribution with unknown mean and variance.

==Definition==
Suppose

$x \mid \sigma^2, \mu, \lambda\sim \mathrm{N}(\mu,\sigma^2 / \lambda) \,\!$
has a normal distribution with mean $\mu$ and variance $\sigma^2 / \lambda$, where

$\sigma^2\mid\alpha, \beta \sim \Gamma^{-1}(\alpha,\beta) \!$
has an inverse-gamma distribution. Then $(x,\sigma^2)$
has a normal-inverse-gamma distribution, denoted as
$(x,\sigma^2) \sim \text{N-}\Gamma^{-1}(\mu,\lambda,\alpha,\beta) \! .$

($\text{NIG}$ is also used instead of $\text{N-}\Gamma^{-1}.$)

The normal-inverse-Wishart distribution is a generalization of the normal-inverse-gamma distribution that is defined over multivariate random variables.

==Characterization==

===Probability density function===

 $f(x,\sigma^2\mid\mu,\lambda,\alpha,\beta) = \frac {\sqrt{\lambda}} {\sigma\sqrt{2\pi} } \, \frac{\beta^\alpha}{\Gamma(\alpha)} \, \left( \frac{1}{\sigma^2} \right)^{\alpha + 1} \exp \left( -\frac { 2\beta + \lambda(x - \mu)^2} {2\sigma^2} \right)$

For the multivariate form where $\mathbf{x}$ is a $k \times 1$ random vector,

 $f(\mathbf{x},\sigma^2\mid\boldsymbol{\mu},\mathbf{V}^{-1},\alpha,\beta) = |\mathbf{V}|^{-1/2} (2\pi)^{-k/2} \, \frac{\beta^\alpha}{\Gamma(\alpha)} \, \left( \frac{1}{\sigma^2} \right)^{\alpha + 1 + k/2} \exp \left( -\frac { 2\beta + (\mathbf{x} - \boldsymbol{\mu})^T \mathbf{V}^{-1} (\mathbf{x} - \boldsymbol{\mu})} {2\sigma^2} \right),$

where $|\mathbf{V}|$ is the determinant of the $k \times k$ matrix $\mathbf{V}$. This last equation reduces to the first form if $k = 1$, so that $\mathbf{x}, \mathbf{V}, \boldsymbol{\mu}$ are scalars and $\mathbf{V}^{-1} = \lambda$.
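
As an illustration, the univariate density above can be evaluated on the log scale, which avoids overflow for small $\sigma^2$. This is only a minimal sketch: the function names `nig_logpdf` and `nig_pdf` are hypothetical and do not come from any particular library.

```python
import math

def nig_logpdf(x, sigma2, mu, lam, alpha, beta):
    """Log of the univariate normal-inverse-gamma density at (x, sigma2).

    Hypothetical helper: the argument names mirror the symbols in the text.
    """
    if sigma2 <= 0:
        return -math.inf  # the density is zero outside sigma2 > 0
    return (
        0.5 * math.log(lam / (2.0 * math.pi * sigma2))   # sqrt(lambda) / (sigma * sqrt(2 pi))
        + alpha * math.log(beta) - math.lgamma(alpha)    # beta^alpha / Gamma(alpha)
        - (alpha + 1.0) * math.log(sigma2)               # (1 / sigma^2)^(alpha + 1)
        - (2.0 * beta + lam * (x - mu) ** 2) / (2.0 * sigma2)  # exponent
    )

def nig_pdf(x, sigma2, mu, lam, alpha, beta):
    """Density obtained by exponentiating the log-density."""
    return math.exp(nig_logpdf(x, sigma2, mu, lam, alpha, beta))
```

For $\mu = 0$ and $\lambda = \alpha = \beta = 1$, evaluating at $(x, \sigma^2) = (0, 1)$ should reproduce $e^{-1}/\sqrt{2\pi}$, as can be read off the formula directly.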

==== Alternative parameterization ====
It is also possible to let $\gamma = 1 / \lambda$, in which case the pdf becomes

 $f(x,\sigma^2\mid\mu,\gamma,\alpha,\beta) = \frac {1} {\sigma\sqrt{2\pi\gamma} } \, \frac{\beta^\alpha}{\Gamma(\alpha)} \, \left( \frac{1}{\sigma^2} \right)^{\alpha + 1} \exp \left( -\frac{2\gamma\beta + (x - \mu)^2}{2\gamma \sigma^2} \right)$

In the multivariate form, the corresponding change would be to regard the covariance matrix $\mathbf{V}$ instead of its inverse $\mathbf{V}^{-1}$ as a parameter.

===Cumulative distribution function===
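Integrating the density over its first argument from $-\infty$ to $x$, with $\sigma^2$ held fixed, gives

 $F(x,\sigma^2\mid\mu,\lambda,\alpha,\beta) = \frac{e^{-\frac{\beta}{\sigma^2}} \left(\frac{\beta }{\sigma ^2}\right)^\alpha \left(\operatorname{erf}\left(\frac{\sqrt{\lambda} (x-\mu )}{\sqrt{2} \sigma }\right)+1\right)}{2 \sigma^2 \Gamma (\alpha)}$

This expression is still a density in $\sigma^2$; the joint cumulative distribution function additionally requires integrating it over the second argument.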

==Properties==

===Marginal distributions===

Given $(x,\sigma^2) \sim \text{N-}\Gamma^{-1}(\mu,\lambda,\alpha,\beta)$ as above, $\sigma^2$ by itself follows an inverse-gamma distribution:

$\sigma^2 \sim \Gamma^{-1}(\alpha,\beta) \!$

while $\sqrt{\frac{\alpha\lambda}{\beta}} (x - \mu)$ follows a t distribution with $2 \alpha$ degrees of freedom.

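Proof for $\lambda = 1$: in this case the marginal density of $x$ is

$f(x \mid \mu,\alpha,\beta) = \int_0^\infty d\sigma^2 \, f(x,\sigma^2\mid\mu,\alpha,\beta) = \frac{1}{\sqrt{2\pi}} \, \frac{\beta^\alpha}{\Gamma(\alpha)} \int_0^\infty d\sigma^2 \left( \frac{1}{\sigma^2} \right)^{\alpha + 1/2 + 1} \exp \left( -\frac { 2\beta + (x - \mu)^2} {2\sigma^2} \right) .$

Except for the normalization factor, the expression under the integral coincides with the inverse-gamma density

$\Gamma^{-1}(x; a, b) = \frac{b^a}{\Gamma(a)} \frac{e^{-b/x}}{x^{a+1}} ,$

with $\sigma^2$ in place of its argument $x$, $a = \alpha + 1/2$, and $b = \frac { 2\beta + (x - \mu)^2} {2}$.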

Since the inverse-gamma density integrates to one, $\int_0^\infty dx \, \Gamma^{-1}(x; a, b) = 1$, it follows that $\int_0^\infty dx \, x^{-(a+1)} e^{-b/x} = \Gamma(a) b^{-a}$, and therefore

$\int_0^\infty d\sigma^2
\left( \frac{1}{\sigma^2} \right)^{\alpha + 1/2 + 1} \exp \left( -\frac { 2\beta + (x - \mu)^2} {2\sigma^2}
\right)
= \Gamma(\alpha + 1/2) \left(\frac { 2\beta + (x - \mu)^2} {2} \right)^{-(\alpha + 1/2)}$

Substituting this expression and keeping only the factors that depend on $x$,

$f(x \mid \mu,\alpha,\beta) \propto_{x} \left(1 + \frac{(x - \mu)^2}{2 \beta} \right)^{-(\alpha + 1/2)} .$

The density of the generalized Student's t-distribution has the shape

$t(x | \nu,\hat{\mu},\hat{\sigma}^2)
\propto_x
\left(1+\frac{1}{\nu} \frac{ (x-\hat{\mu})^2 }{\hat{\sigma}^2 } \right)^{-(\nu+1)/2}$.

Comparing the two expressions, the marginal distribution $f(x \mid \mu,\alpha,\beta)$ is a t-distribution with $2 \alpha$ degrees of freedom:

$f(x \mid \mu,\alpha,\beta) = t(x | \nu=2 \alpha, \hat{\mu}=\mu, \hat{\sigma}^2=\beta/\alpha )$.

In the multivariate case, the marginal distribution of $\mathbf{x}$ is a multivariate t distribution:

$\mathbf{x} \sim t_{2\alpha}(\boldsymbol{\mu}, \frac{\beta}{\alpha} \mathbf{V}) \!$
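
The univariate marginal result can be checked numerically. The sketch below integrates the joint density over $\sigma^2$ and compares it with a Student's t density with $2\alpha$ degrees of freedom, location $\mu$ and squared scale $\beta/(\alpha\lambda)$, which is what the statement about $\sqrt{\alpha\lambda/\beta}\,(x - \mu)$ implies. All function names, the integration limits and the step count are assumptions chosen for illustration.

```python
import math

def nig_pdf(x, sigma2, mu, lam, alpha, beta):
    """Joint univariate normal-inverse-gamma density (hypothetical helper)."""
    log_f = (0.5 * math.log(lam / (2.0 * math.pi * sigma2))
             + alpha * math.log(beta) - math.lgamma(alpha)
             - (alpha + 1.0) * math.log(sigma2)
             - (2.0 * beta + lam * (x - mu) ** 2) / (2.0 * sigma2))
    return math.exp(log_f)

def marginal_x_numeric(x, mu, lam, alpha, beta, lo=-12.0, hi=12.0, steps=6000):
    """Integrate the joint density over sigma2 with the midpoint rule.

    The substitution s = log(sigma2) is used, so d(sigma2) = sigma2 * ds.
    The limits lo and hi are assumed wide enough for moderate parameters.
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        sigma2 = math.exp(lo + (i + 0.5) * h)
        total += nig_pdf(x, sigma2, mu, lam, alpha, beta) * sigma2
    return total * h

def student_t_pdf(x, nu, loc, scale2):
    """Density of a location-scale Student's t with squared scale scale2."""
    log_c = (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
             - 0.5 * math.log(nu * math.pi * scale2))
    z2 = (x - loc) ** 2 / scale2
    return math.exp(log_c - 0.5 * (nu + 1.0) * math.log1p(z2 / nu))

def marginal_x_closed_form(x, mu, lam, alpha, beta):
    """Marginal of x: t with 2*alpha degrees of freedom and scale2 = beta / (alpha * lam)."""
    return student_t_pdf(x, 2.0 * alpha, mu, beta / (alpha * lam))
```

For moderate parameter values the two functions should agree closely; a large discrepancy would point to an error in the parameter mapping rather than in the quadrature.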

===Scaling===
Suppose

$(x,\sigma^2) \sim \text{N-}\Gamma^{-1}(\mu,\lambda,\alpha,\beta) \! .$

Then for $c>0$,
$(cx,c\sigma^2) \sim \text{N-}\Gamma^{-1}(c\mu,\lambda/c,\alpha,c\beta) \! .$

Proof: Let $(x,\sigma^2) \sim \text{N-}\Gamma^{-1}(\mu,\lambda,\alpha,\beta)$ and fix $c>0$. Defining $Y=(Y_1,Y_2)=(cx,c \sigma^2)$, the change-of-variables formula shows that the PDF of $Y$ evaluated at $(y_1,y_2)$ is $1/c^2$ times the PDF of a $\text{N-}\Gamma^{-1}(\mu,\lambda,\alpha,\beta)$ random variable evaluated at $(y_1/c,y_2/c)$. Hence

$f_Y(y_1,y_2)=\frac{1}{c^2} \frac {\sqrt{\lambda}} {\sqrt{2\pi y_2/c} } \, \frac{\beta^\alpha}{\Gamma(\alpha)} \, \left( \frac{1}{y_2/c} \right)^{\alpha + 1} \exp \left( -\frac { 2\beta + \lambda(y_1/c - \mu)^2} {2y_2/c} \right) = \frac {\sqrt{\lambda/c}} {\sqrt{2\pi y_2} } \, \frac{(c\beta)^\alpha}{\Gamma(\alpha)} \, \left( \frac{1}{y_2} \right)^{\alpha + 1} \exp \left( -\frac { 2c\beta + (\lambda/c) \, (y_1 - c\mu)^2} {2y_2} \right).\!$

The right-hand expression is the PDF of a $\text{N-}\Gamma^{-1}(c\mu,\lambda/c,\alpha,c\beta)$ random variable evaluated at $(y_1,y_2)$, which completes the proof.
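
The scaling identity can also be checked numerically at individual points. The following sketch compares the change-of-variables form of the density of $(cx, c\sigma^2)$ with the density under the rescaled parameters; the function names are hypothetical and the check is not a substitute for the proof.

```python
import math

def nig_pdf(x, sigma2, mu, lam, alpha, beta):
    """Joint univariate normal-inverse-gamma density (hypothetical helper)."""
    log_f = (0.5 * math.log(lam / (2.0 * math.pi * sigma2))
             + alpha * math.log(beta) - math.lgamma(alpha)
             - (alpha + 1.0) * math.log(sigma2)
             - (2.0 * beta + lam * (x - mu) ** 2) / (2.0 * sigma2))
    return math.exp(log_f)

def scaled_pdf_change_of_variables(y1, y2, c, mu, lam, alpha, beta):
    """Density of (c*x, c*sigma2): original density at (y1/c, y2/c) times 1/c^2."""
    return nig_pdf(y1 / c, y2 / c, mu, lam, alpha, beta) / c ** 2

def scaled_pdf_rescaled_parameters(y1, y2, c, mu, lam, alpha, beta):
    """Density with parameters (c*mu, lam/c, alpha, c*beta), as the scaling property states."""
    return nig_pdf(y1, y2, c * mu, lam / c, alpha, c * beta)
```

Evaluating both functions at the same $(y_1, y_2)$ for any $c > 0$ should give values that agree up to floating-point rounding.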

===Exponential family===

Normal-inverse-gamma distributions form an exponential family with natural parameters $\textstyle\theta_1=\frac{-\lambda}{2}$, $\textstyle\theta_2=\lambda \mu$, $\textstyle\theta_3=\alpha$, and $\textstyle\theta_4=-\beta+\frac{-\lambda \mu^2}{2}$ and sufficient statistics $\textstyle T_1=\frac{x^2}{\sigma^2}$, $\textstyle T_2=\frac{x}{\sigma^2}$, $\textstyle T_3=\log \big( \frac{1}{\sigma^2} \big)$, and $\textstyle T_4=\frac{1}{\sigma^2}$.
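
As a short check of these parameters, one can expand the logarithm of the univariate density. The grouping below is a sketch that assumes the factor $(\sigma^2)^{-3/2}/\sqrt{2\pi}$ is absorbed into a base measure $h(x,\sigma^2)$, a symbol introduced here only for illustration, so that $\theta_3 = \alpha$ multiplies $T_3$:

$\log f(x,\sigma^2\mid\mu,\lambda,\alpha,\beta) = \theta_1 T_1 + \theta_2 T_2 + \theta_3 T_3 + \theta_4 T_4 + \log h(x,\sigma^2) + \tfrac{1}{2}\log\lambda + \alpha\log\beta - \log\Gamma(\alpha), \qquad h(x,\sigma^2) = \frac{1}{\sqrt{2\pi}} \left( \frac{1}{\sigma^2} \right)^{3/2} .$

The terms $\theta_1 T_1 + \theta_2 T_2$ together with the $-\lambda\mu^2/(2\sigma^2)$ part of $\theta_4 T_4$ come from expanding $-\lambda(x-\mu)^2/(2\sigma^2)$, and the remaining part of $\theta_4 T_4$ is $-\beta/\sigma^2$.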

===Kullback–Leibler divergence===

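The Kullback–Leibler divergence measures the difference between two probability distributions, here two normal-inverse-gamma distributions with different parameters.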

== Posterior distribution of the parameters ==
See the articles on normal-gamma distribution and conjugate prior.

== Interpretation of the parameters ==
See the articles on normal-gamma distribution and conjugate prior.

== Generating normal-inverse-gamma random variates ==
Generation of random variates is straightforward:
1. Sample $\sigma^2$ from an inverse-gamma distribution with parameters $\alpha$ and $\beta$
2. Sample $x$ from a normal distribution with mean $\mu$ and variance $\sigma^2/\lambda$
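
The two steps above can be sketched with the Python standard library alone. The function name `sample_nig` is hypothetical. The sketch assumes that $\beta$ is the scale parameter of the inverse-gamma distribution, so that $1/\sigma^2$ is gamma distributed with shape $\alpha$ and rate $\beta$; `random.gammavariate` takes a shape and a scale, hence the `1.0 / beta`.

```python
import math
import random

def sample_nig(mu, lam, alpha, beta, rng=random):
    """Draw one (x, sigma2) pair from a normal-inverse-gamma distribution.

    Hypothetical helper; rng may be the random module or a random.Random instance.
    """
    # Step 1: sigma2 ~ Inverse-Gamma(alpha, beta), drawn as the reciprocal
    # of a Gamma(shape=alpha, scale=1/beta) variate.
    precision = rng.gammavariate(alpha, 1.0 / beta)
    sigma2 = 1.0 / precision
    # Step 2: x | sigma2 ~ Normal(mu, sigma2 / lam); gauss expects a standard deviation.
    x = rng.gauss(mu, math.sqrt(sigma2 / lam))
    return x, sigma2

# Usage: draw a reproducible batch and compare sample means with the moments.
rng = random.Random(0)
draws = [sample_nig(1.0, 2.0, 5.0, 4.0, rng) for _ in range(20000)]
mean_x = sum(d[0] for d in draws) / len(draws)
mean_sigma2 = sum(d[1] for d in draws) / len(draws)
```

With these parameters the sample means should be close to $\operatorname{E}[x] = \mu = 1$ and $\operatorname{E}[\sigma^2] = \beta/(\alpha - 1) = 1$, up to Monte Carlo error.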

== Related distributions ==
- The normal-gamma distribution is the same distribution parameterized by precision rather than variance
- The normal-inverse-Wishart distribution is a generalization of this distribution that allows for a multivariate mean and a completely unknown positive-definite covariance matrix $\sigma^2 \mathbf{V}$, whereas in the multivariate normal-inverse-gamma distribution the covariance matrix is regarded as known up to the scale factor $\sigma^2$

== See also ==
- Compound probability distribution
