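Parameters: $\boldsymbol\mu_0 \in \mathbb{R}^D$, location (vector of reals); $\lambda > 0$ (real); $\boldsymbol\Psi \in \mathbb{R}^{D \times D}$, inverse scale matrix (pos. def.); $\nu > D - 1$ (real)
Support: $\boldsymbol\mu \in \mathbb{R}^D$; $\boldsymbol\Sigma \in \mathbb{R}^{D \times D}$, covariance matrix (pos. def.)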
In probability theory and statistics, the normal-inverse-Wishart distribution (or Gaussian-inverse-Wishart distribution) is a multivariate four-parameter family of continuous probability distributions. It is the conjugate prior of a multivariate normal distribution with unknown mean and covariance matrix (the inverse of the precision matrix).
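Suppose
$$\boldsymbol\mu \mid \boldsymbol\mu_0, \lambda, \boldsymbol\Sigma \sim \mathcal{N}\left(\boldsymbol\mu_0, \tfrac{1}{\lambda}\boldsymbol\Sigma\right)$$
has a multivariate normal distribution with mean $\boldsymbol\mu_0$ and covariance matrix $\tfrac{1}{\lambda}\boldsymbol\Sigma$, where
$$\boldsymbol\Sigma \mid \boldsymbol\Psi, \nu \sim \mathcal{W}^{-1}(\boldsymbol\Psi, \nu)$$
has an inverse Wishart distribution. Then $(\boldsymbol\mu, \boldsymbol\Sigma)$ has a normal-inverse-Wishart distribution, denoted as
$$(\boldsymbol\mu, \boldsymbol\Sigma) \sim \mathrm{NIW}(\boldsymbol\mu_0, \lambda, \boldsymbol\Psi, \nu).$$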
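Probability density function
$$f(\boldsymbol\mu, \boldsymbol\Sigma \mid \boldsymbol\mu_0, \lambda, \boldsymbol\Psi, \nu) = \mathcal{N}\left(\boldsymbol\mu \,\middle|\, \boldsymbol\mu_0, \tfrac{1}{\lambda}\boldsymbol\Sigma\right) \mathcal{W}^{-1}(\boldsymbol\Sigma \mid \boldsymbol\Psi, \nu)$$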
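The joint density is the product of the multivariate normal density of $\boldsymbol\mu$ given $\boldsymbol\Sigma$ and the inverse Wishart density of $\boldsymbol\Sigma$. Substituting the two component densities and collecting the powers of $|\boldsymbol\Sigma|$ gives the following closed form, written here as a sketch under the usual parameterization, with $D$ the dimension of $\boldsymbol\mu$ and $\Gamma_D$ the multivariate gamma function:

$$
\begin{aligned}
f(\boldsymbol\mu, \boldsymbol\Sigma \mid \boldsymbol\mu_0, \lambda, \boldsymbol\Psi, \nu)
&= \underbrace{\frac{\lambda^{D/2}}{(2\pi)^{D/2}\,|\boldsymbol\Sigma|^{1/2}}
   \exp\left\{-\frac{\lambda}{2}(\boldsymbol\mu-\boldsymbol\mu_0)^{T}\boldsymbol\Sigma^{-1}(\boldsymbol\mu-\boldsymbol\mu_0)\right\}}_{\text{normal factor}}
   \\
&\quad\times
   \underbrace{\frac{|\boldsymbol\Psi|^{\nu/2}}{2^{\nu D/2}\,\Gamma_D(\nu/2)}\,
   |\boldsymbol\Sigma|^{-(\nu+D+1)/2}
   \exp\left\{-\frac{1}{2}\operatorname{tr}\left(\boldsymbol\Psi\boldsymbol\Sigma^{-1}\right)\right\}}_{\text{inverse Wishart factor}}
   \\
&= \frac{\lambda^{D/2}\,|\boldsymbol\Psi|^{\nu/2}\,|\boldsymbol\Sigma|^{-(\nu+D+2)/2}}
        {(2\pi)^{D/2}\,2^{\nu D/2}\,\Gamma_D(\nu/2)}
   \exp\left\{-\frac{1}{2}\operatorname{tr}\left(\boldsymbol\Psi\boldsymbol\Sigma^{-1}\right)
   -\frac{\lambda}{2}(\boldsymbol\mu-\boldsymbol\mu_0)^{T}\boldsymbol\Sigma^{-1}(\boldsymbol\mu-\boldsymbol\mu_0)\right\}
\end{aligned}
$$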
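By construction, the marginal distribution over $\boldsymbol\Sigma$ is an inverse Wishart distribution, and the conditional distribution over $\boldsymbol\mu$ given $\boldsymbol\Sigma$ is a multivariate normal distribution. The marginal distribution over $\boldsymbol\mu$ is a multivariate t-distribution with $\nu - D + 1$ degrees of freedom, location $\boldsymbol\mu_0$, and scale matrix $\boldsymbol\Psi / (\lambda(\nu - D + 1))$, where $D$ is the dimension of $\boldsymbol\mu$.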
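Because the joint density factors into the inverse Wishart marginal of the covariance and the conditional normal of the mean, its logarithm can be evaluated as a sum of two standard log-densities. The following is a minimal sketch in Python, assuming NumPy and SciPy are available; the function name `niw_logpdf` and its argument names are illustrative choices, not part of any library.

```python
import numpy as np
from scipy.stats import invwishart, multivariate_normal


def niw_logpdf(mu, sigma, mu0, lam, psi, nu):
    """Joint log-density of (mu, sigma) under NIW(mu0, lam, psi, nu).

    Hypothetical helper: evaluates log p(sigma) + log p(mu | sigma).
    """
    mu = np.asarray(mu, dtype=float)
    mu0 = np.asarray(mu0, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    psi = np.asarray(psi, dtype=float)

    # Marginal of the covariance: inverse Wishart with scale matrix psi
    # and nu degrees of freedom.
    log_p_sigma = invwishart.logpdf(sigma, df=nu, scale=psi)

    # Conditional of the mean given the covariance: multivariate normal
    # with mean mu0 and covariance sigma / lam.
    log_p_mu = multivariate_normal.logpdf(mu, mean=mu0, cov=sigma / lam)

    return float(log_p_sigma + log_p_mu)
```

Working on the log scale avoids underflow when the dimension is large or the covariance is far from the scale matrix.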
Posterior distribution of the parameters
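Suppose the sampling density is a multivariate normal distribution
$$\boldsymbol{y}_i \mid \boldsymbol\mu, \boldsymbol\Sigma \sim \mathcal{N}_p(\boldsymbol\mu, \boldsymbol\Sigma),$$
where $\boldsymbol{y}$ is an $n \times p$ matrix and $\boldsymbol{y}_i$ (of length $p$) is row $i$ of the matrix.
When the mean and covariance matrix of the sampling distribution are unknown, we can place a normal-inverse-Wishart prior on the mean and covariance parameters jointly:
$$(\boldsymbol\mu, \boldsymbol\Sigma) \sim \mathrm{NIW}(\boldsymbol\mu_0, \lambda, \boldsymbol\Psi, \nu).$$
The resulting posterior distribution for the mean and covariance matrix is also normal-inverse-Wishart:
$$(\boldsymbol\mu, \boldsymbol\Sigma \mid \boldsymbol{y}) \sim \mathrm{NIW}(\boldsymbol\mu_n, \lambda_n, \boldsymbol\Psi_n, \nu_n),$$
where
$$\boldsymbol\mu_n = \frac{\lambda\boldsymbol\mu_0 + n\bar{\boldsymbol{y}}}{\lambda + n}, \qquad \lambda_n = \lambda + n, \qquad \nu_n = \nu + n,$$
$$\boldsymbol\Psi_n = \boldsymbol\Psi + \boldsymbol{S} + \frac{\lambda n}{\lambda + n}(\bar{\boldsymbol{y}} - \boldsymbol\mu_0)(\bar{\boldsymbol{y}} - \boldsymbol\mu_0)^{T}, \qquad \boldsymbol{S} = \sum_{i=1}^{n}(\boldsymbol{y}_i - \bar{\boldsymbol{y}})(\boldsymbol{y}_i - \bar{\boldsymbol{y}})^{T},$$
and $\bar{\boldsymbol{y}}$ is the sample mean of the rows of $\boldsymbol{y}$.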
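The conjugate update can be sketched in a few lines of NumPy. The example below assumes the standard update rules for this prior (posterior mean as a weighted average of prior mean and sample mean, with the scale matrix gaining the scatter matrix and a shrinkage term); the function name `niw_posterior` and its argument names are hypothetical.

```python
import numpy as np


def niw_posterior(y, mu0, lam, psi, nu):
    """Posterior NIW parameters given an (n, p) data matrix y.

    Hypothetical helper returning (mu_n, lam_n, psi_n, nu_n).
    """
    y = np.asarray(y, dtype=float)
    mu0 = np.asarray(mu0, dtype=float)
    psi = np.asarray(psi, dtype=float)

    n = y.shape[0]
    ybar = y.mean(axis=0)

    # Scatter matrix S = sum_i (y_i - ybar)(y_i - ybar)^T.
    centered = y - ybar
    scatter = centered.T @ centered

    # Posterior mean: weighted average of prior mean and sample mean.
    mu_n = (lam * mu0 + n * ybar) / (lam + n)
    lam_n = lam + n
    nu_n = nu + n

    # Scale matrix: prior scale + scatter + shrinkage term that accounts
    # for the disagreement between the sample mean and the prior mean.
    diff = (ybar - mu0).reshape(-1, 1)
    psi_n = psi + scatter + (lam * n / (lam + n)) * (diff @ diff.T)

    return mu_n, lam_n, psi_n, nu_n
```

For instance, with three observations `[[1, 2], [3, 4], [5, 6]]`, a zero prior mean, `lam = 1`, identity `psi`, and `nu = 4`, the posterior mean is `[2.25, 3.0]`, which lies between the prior mean and the sample mean `[3, 4]`.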
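To sample from the joint posterior of $(\boldsymbol\mu, \boldsymbol\Sigma)$, one first draws $\boldsymbol\Sigma \mid \boldsymbol{y} \sim \mathcal{W}^{-1}(\boldsymbol\Psi_n, \nu_n)$ and then draws $\boldsymbol\mu \mid \boldsymbol\Sigma, \boldsymbol{y} \sim \mathcal{N}_p(\boldsymbol\mu_n, \boldsymbol\Sigma / \lambda_n)$, where $\boldsymbol\mu_n$, $\lambda_n$, $\boldsymbol\Psi_n$, and $\nu_n$ are the posterior parameters. To draw from the posterior predictive distribution of a new observation, draw $\tilde{\boldsymbol{y}} \mid \boldsymbol\mu, \boldsymbol\Sigma, \boldsymbol{y} \sim \mathcal{N}_p(\boldsymbol\mu, \boldsymbol\Sigma)$, given the already drawn values of $\boldsymbol\mu$ and $\boldsymbol\Sigma$.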
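The three-stage draw described above (covariance, then mean, then a new observation) might be implemented as follows. This is a sketch assuming NumPy and SciPy; the function `sample_posterior_predictive` is a hypothetical name, and it takes the posterior parameters as inputs rather than computing them.

```python
import numpy as np
from scipy.stats import invwishart


def sample_posterior_predictive(mu_n, lam_n, psi_n, nu_n, n_draws, seed=None):
    """Draw (mu, sigma, y_new) triples from an NIW posterior.

    Hypothetical helper; mu_n, lam_n, psi_n, nu_n are the posterior
    parameters, supplied by the caller.
    """
    rng = np.random.default_rng(seed)
    mu_n = np.asarray(mu_n, dtype=float)
    psi_n = np.asarray(psi_n, dtype=float)
    p = mu_n.shape[0]

    mus = np.empty((n_draws, p))
    sigmas = np.empty((n_draws, p, p))
    y_new = np.empty((n_draws, p))

    for k in range(n_draws):
        # Step 1: covariance from the inverse Wishart posterior marginal.
        sigma = invwishart.rvs(df=nu_n, scale=psi_n, random_state=rng)
        # Step 2: mean from the conditional normal, covariance sigma / lam_n.
        mu = rng.multivariate_normal(mu_n, sigma / lam_n)
        # Step 3: new observation from the sampling distribution,
        # using the mu and sigma just drawn.
        y_new[k] = rng.multivariate_normal(mu, sigma)
        mus[k] = mu
        sigmas[k] = sigma

    return mus, sigmas, y_new
```

Each row of `y_new` is one posterior predictive draw; reusing the same `mu` and `sigma` within an iteration is what propagates parameter uncertainty into the prediction.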
Generating normal-inverse-Wishart random variates
Generation of random variates is straightforward:
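- Sample $\boldsymbol\Sigma$ from an inverse Wishart distribution with parameters $\boldsymbol\Psi$ and $\nu$.
- Sample $\boldsymbol\mu$ from a multivariate normal distribution with mean $\boldsymbol\mu_0$ and covariance matrix $\tfrac{1}{\lambda}\boldsymbol\Sigma$.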
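The two steps above can be vectorized so that many variates are generated at once. The sketch below assumes NumPy and SciPy; `sample_niw` is a hypothetical name, and the use of a Cholesky factor to draw the conditional normal is an implementation choice made here, not something prescribed by the procedure.

```python
import numpy as np
from scipy.stats import invwishart


def sample_niw(mu0, lam, psi, nu, size, seed=None):
    """Draw `size` variates (mu, sigma) from NIW(mu0, lam, psi, nu).

    Hypothetical helper returning arrays of shape (size, D) and (size, D, D).
    """
    rng = np.random.default_rng(seed)
    mu0 = np.asarray(mu0, dtype=float)
    psi = np.asarray(psi, dtype=float)
    d = mu0.shape[0]

    # Step 1: covariance matrices from the inverse Wishart distribution.
    # reshape handles size == 1, where SciPy returns a single (D, D) matrix.
    sigmas = invwishart.rvs(df=nu, scale=psi, size=size, random_state=rng)
    sigmas = np.reshape(sigmas, (size, d, d))

    # Step 2: means from N(mu0, sigma / lam). With L the Cholesky factor of
    # sigma / lam and z standard normal, mu0 + L z has the required law.
    chol = np.linalg.cholesky(sigmas / lam)
    z = rng.standard_normal((size, d))
    mus = mu0 + np.einsum("nij,nj->ni", chol, z)

    return mus, sigmas
```

A quick sanity check is to compare the average of the drawn covariance matrices with the inverse Wishart mean, which equals the scale matrix divided by `nu - D - 1` when `nu > D + 1`.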
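Related distributions
- The normal-Wishart distribution is essentially the same distribution parameterized by precision rather than variance. If $(\boldsymbol\mu, \boldsymbol\Sigma) \sim \mathrm{NIW}(\boldsymbol\mu_0, \lambda, \boldsymbol\Psi, \nu)$, then $(\boldsymbol\mu, \boldsymbol\Sigma^{-1}) \sim \mathrm{NW}(\boldsymbol\mu_0, \lambda, \boldsymbol\Psi^{-1}, \nu)$.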
- The normal-inverse-gamma distribution is the one-dimensional equivalent.
- The multivariate normal distribution and inverse Wishart distribution are the component distributions out of which this distribution is made.
References
- Murphy, Kevin P. (2007). "Conjugate Bayesian analysis of the Gaussian distribution."
- Gelman, Andrew, et al. (2014). Bayesian Data Analysis. Vol. 2, p. 73. Boca Raton, FL, USA: Chapman & Hall/CRC.
- Bishop, Christopher M. (2006). Pattern Recognition and Machine Learning. Springer Science+Business Media.