Talk:Mixture distribution

WikiProject Statistics (Rated C-class, Mid-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

This article has been rated as C-Class on the quality scale.
This article has been rated as Mid-importance on the importance scale.

I have only seen these two terms used synonymously. 06:30, 31 October 2007 (UTC)


Should we rename this to Mixture distribution so as to include the discrete case? —3mta3 (talk) 12:27, 24 June 2009 (UTC)


I suggest renaming this to Mixture family, similar to how we have the exponential family. Unlike all other “distribution” articles, this model cannot be completely described by a finite number of parameters: there always remains freedom in choosing the “mixed-in” densities p_i(x), just as there is a similar freedom in the exponential family.  … stpasha »  15:20, 16 January 2010 (UTC)


I think the text should include a Simulation section. It's very easy to simulate such a variable: first, sample an index i from the discrete set {1, 2, ..., n} with weights w_i, then simulate from the i-th distribution. Is there any source for this? Albmont (talk) 17:29, 1 July 2009 (UTC)
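The two-step procedure described above can be sketched in a few lines of Python. This is only an illustration, not a sourced algorithm: the function name `sample_mixture`, the three normal components, and their weights are all made up for the example.

```python
import random

def sample_mixture(weights, samplers, rng):
    """Draw one value from a finite mixture (hypothetical helper).

    weights  -- the mixture weights w_1, ..., w_n
    samplers -- one callable per component; each takes the RNG and returns a draw
    """
    # Step 1: pick a component index i with probability proportional to w_i.
    i = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    # Step 2: simulate from the i-th component distribution.
    return samplers[i](rng)

# Assumed example: a mixture of three normal distributions.
rng = random.Random(0)
weights = [0.5, 0.3, 0.2]
component_means = [-2.0, 0.0, 3.0]
samplers = [
    lambda r: r.gauss(-2.0, 0.5),
    lambda r: r.gauss(0.0, 1.0),
    lambda r: r.gauss(3.0, 0.8),
]

draws = [sample_mixture(weights, samplers, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)

# The mixture mean is the weighted average of the component means.
expected_mean = sum(w * m for w, m in zip(weights, component_means))
print(sample_mean, expected_mean)
```

With enough draws, the sample mean should land close to the weighted average of the component means, which is a quick sanity check that the sampler mixes the components in the intended proportions.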

Incorrect diagram

An IP editor placed this in the main text: "-this is wrong it shows the concept, but the "mixed normal pdf" should be underneath the 3 pdf's because each pdf integrates to 1 on its own." I have hidden both this and the diagram for now. Melcombe (talk) 15:55, 28 June 2010 (UTC)

Incorrect formula for variance (2nd moment; Section 'Moments')

There appears to be a mistake in the formula for the variance of a mixture distribution: The general formula is:

\operatorname{E}[(X - \mu)^j] = \sum_{i = 1}^n w_i \operatorname{E}[(X_i - \mu_i + \mu_i - \mu)^j] = \sum_{i=1}^n \sum_{k=0}^j \binom{j}{k} (\mu_i - \mu)^{j-k} w_i \operatorname{E}[(X_i- \mu_i)^k]

For j=2 (Variance), we get:

\operatorname{E}[(X - \mu)^2] = \sigma^2 = \sum_{i=1}^n \sum_{k=0}^2 \binom{2}{k} (\mu_i - \mu)^{2-k} w_i \operatorname{E}[(X_i- \mu_i)^k]
= \sum_{i=1}^n \left[ 1 \cdot (\mu_i - \mu)^{2} \cdot w_i \cdot 1 + 2 \cdot (\mu_i - \mu) \cdot w_i \cdot 0 + 1 \cdot 1 \cdot w_i \cdot \operatorname{E}[(X_i- \mu_i)^2] \right]
= \sum_{i=1}^n w_i \left( (\mu_i - \mu)^{2} + \operatorname{E}[(X_i- \mu_i)^2] \right)
= \sum_{i=1}^n w_i \left( (\mu_i - \mu)^{2} + \sigma_i^2 \right)

This is different to the formula from the article:

 \operatorname{E}[(X - \mu)^2] = \sigma^2 = \sum_{i = 1}^n w_i (\mu_i^2 + \sigma_i^2) - \mu^2 .

I believe the mistake is that

 (\mu_i - \mu)^{2} \neq (\mu_i^2 - \mu^2)
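For what it is worth, the two expressions can be compared numerically. The termwise inequality above does hold, but expanding (\mu_i - \mu)^2 and using \sum w_i = 1 together with \mu = \sum w_i \mu_i gives \sum w_i (\mu_i - \mu)^2 = \sum w_i \mu_i^2 - \mu^2, so the weighted sums may still coincide. The sketch below checks this on made-up weights, means, and variances (all values are assumptions chosen for illustration).

```python
import math

# Assumed example parameters for a three-component mixture.
weights = [0.5, 0.3, 0.2]      # w_i, summing to 1
means = [-2.0, 0.0, 3.0]       # mu_i
variances = [0.25, 1.0, 0.64]  # sigma_i^2

# Mixture mean: mu = sum_i w_i * mu_i
mu = sum(w * m for w, m in zip(weights, means))

# Formula derived in this thread: sum_i w_i * ((mu_i - mu)^2 + sigma_i^2)
var_centered = sum(
    w * ((m - mu) ** 2 + v) for w, m, v in zip(weights, means, variances)
)

# Formula quoted from the article: sum_i w_i * (mu_i^2 + sigma_i^2) - mu^2
var_raw = sum(
    w * (m ** 2 + v) for w, m, v in zip(weights, means, variances)
) - mu ** 2

print(var_centered, var_raw)
print(math.isclose(var_centered, var_raw, abs_tol=1e-9))
```

On these values both expressions evaluate to the same number up to floating-point rounding, which suggests the article's formula and the one derived here are two ways of writing the same quantity when the weights sum to 1.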


I altered the statement that the variance formula applied to normal distributions only. It is general.

Source: — Preceding unsigned comment added by Svein Olav Nyberg (talk · contribs) 22:25, 17 November 2015 (UTC)