Talk:Mixture model

WikiProject Statistics (Rated C-class, Mid-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

This article has been rated as C-Class on the quality scale.
This article has been rated as Mid-importance on the importance scale.

The second definition is a specialization of the first. Also, the EM algorithm can be generalized so that it applies to distributions other than Gaussians, and I think this will actually make the text simpler. I will be bold and fix it. MisterSheik 16:05, 15 February 2007 (UTC)
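
To illustrate the generalization mentioned above, here is a minimal sketch of EM in which the component family is a plug-in rather than hard-coded as Gaussian. The function names (`em_mixture`, `poisson_pmf`, `poisson_weighted_mle`) and the choice of Poisson components are assumptions made for this illustration only; they are not taken from the article.

```python
import math

def poisson_pmf(x, lam):
    # Poisson probability mass function, evaluated via logs for stability.
    return math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))

def poisson_weighted_mle(xs, weights):
    # Weighted maximum-likelihood estimate of the Poisson rate (a weighted mean).
    # The small floor keeps the rate strictly positive so log(lam) stays defined.
    return max(sum(w * x for w, x in zip(weights, xs)) / sum(weights), 1e-9)

def em_mixture(xs, init_params, density, weighted_mle, n_iter=50):
    """EM for a K-component mixture of an arbitrary component family.

    density(x, theta) evaluates one component; weighted_mle(xs, weights)
    refits one component from responsibility-weighted data.
    """
    params = list(init_params)
    K = len(params)
    phi = [1.0 / K] * K  # mixing weights, initialised uniformly
    for _ in range(n_iter):
        # E-step: responsibility of component k for observation i.
        resp = []
        for x in xs:
            joint = [phi[k] * density(x, params[k]) for k in range(K)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: update mixing weights and component parameters.
        for k in range(K):
            r_k = [r[k] for r in resp]
            phi[k] = sum(r_k) / len(xs)
            params[k] = weighted_mle(xs, r_k)
    return phi, params
```

Only the two family-specific functions would change for another distribution; the E-step and the mixing-weight update stay the same. The sketch assumes every observation has nonzero density under at least one component.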

The fruit punch example needs elaboration to make it cogent for a novice.

I might be mistaken, but I believe that the "Topics in a Document" example is redundant: the first point suggests using a Dirichlet prior to determine the topic mixture, while the second sub-bullet of point 2 suggests that instead of mixture modeling you could use Latent Dirichlet Allocation (LDA). But isn't LDA just a Bayesian mixture model with a Dirichlet prior? If not, that point should be clarified as well. (talk) 21:13, 29 March 2012 (UTC)
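
One common way to frame the question above is to compare the two generative processes side by side. The sketch below assumes the usual textbook reading: a mixture model draws a single topic per document, whereas LDA draws per-document topic proportions from a Dirichlet and then a topic for each word. All function names and the toy parameters are hypothetical, and `phi` (the corpus-level mixing weights, which a Bayesian mixture would itself give a Dirichlet prior) is simply passed in.

```python
import random

def dirichlet(alpha, rng):
    # Sample from a Dirichlet by normalising independent Gamma draws.
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    s = sum(g)
    return [v / s for v in g]

def mixture_document(n_words, phi, topics, rng):
    # Mixture model: ONE topic is drawn for the whole document.
    z = rng.choices(range(len(phi)), weights=phi)[0]
    words = rng.choices(range(len(topics[z])), weights=topics[z], k=n_words)
    return [z] * n_words, words

def lda_document(n_words, alpha, topics, rng):
    # LDA: document-specific topic proportions, then a topic for EACH word.
    theta = dirichlet(alpha, rng)
    zs = rng.choices(range(len(theta)), weights=theta, k=n_words)
    words = [rng.choices(range(len(topics[z])), weights=topics[z])[0] for z in zs]
    return zs, words
```

Under this reading the Dirichlet sits at different levels in the two models: over corpus-wide mixing weights in the Bayesian mixture, and over each document's own topic proportions in LDA.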

Unexplained removal of section

Hi, please see these two unexplained edits that removed the entire "Recent Papers" section. This was done months ago, but can anyone tell if it was warranted? I don't know enough about the subject, but it seems worthy of a little investigation. Thanks. Nelson50T 15:49, 17 April 2009 (UTC)

Merging with Mixture density

The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
Time to close discussion. No consensus for merge. -- P 1 9 9   17:24, 24 February 2014 (UTC)

I'm planning on merging mixture density and mixture model, and probably placing them in the latter article. Clearly, mixture densities and mixture models are different things. However, in practice, there is an enormous amount of overlap. In particular, you cannot discuss a mixture model without careful discussion of a mixture density. The result is that, currently, there is a huge amount of overlap between the two pages. It's not clear to me that there's any way to separate the two that simultaneously:

  1. Avoids significant amounts of duplicated info
  2. Makes each page coherent unto itself
  3. Makes each page fairly self-contained, rather than requiring constant reference to the other page

I suggest placing them both under mixture model, which includes a discussion of mixture densities as part of theoretical preliminary discussion. Benwing (talk) 09:13, 27 September 2010 (UTC)

Keep separate - to enable reasonable access for those arriving via categories, without their being forced to read material of no relevance. Overlaps can successfully be dealt with by using the "Main" template and restricting the amount that is duplicated. Melcombe (talk) 14:21, 27 September 2010 (UTC)
Comment - Have you looked at the overlap between the two pages? There is an enormous amount of duplication, suggesting that there isn't that much material "of no relevance" to the other topic. Even if the pages are kept separate, some sections will be consolidated in only one of the articles. For example, the section on estimation of a mixture density will have almost all its current content ripped out and would mostly just refer to the appropriate section for mixture models. Benwing (talk) 00:58, 28 September 2010 (UTC)
Have you looked at the context written into the start of "mixture model" and at the categories in which it has been placed? Specifically, as a part of cluster analysis and as a model used within cluster analysis? Melcombe (talk) 08:39, 28 September 2010 (UTC)
Keep separate - I can see the arguments for merging, but right now the distribution article seems to address more of the pure mathematical properties of mixture distributions while the model article addresses creating and using them. Which seems about right since they are related but somewhat different concepts. If the articles were very short I think the argument for merging would be strong, but since both the articles are fairly long (and technical) I think it makes it easier to follow the concepts if we keep them separate, with links to the other article where appropriate. Rlendog (talk) 16:41, 7 November 2013 (UTC)

The above discussion is closed. Please do not modify it. Subsequent comments should be made in a new section.


You cannot, nor ever should want to, say "N random variables". Quit it. (talk) 18:41, 2 December 2014 (UTC)


The figures in plate notation use k as the index for the components. In the text, i runs over all N observations as well as over all K clusters. Either change the figures or change the text. Anne van Rossum (talk) 14:34, 8 July 2015 (UTC)
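
For reference, the convention used in the figures (i over the N observations, k over the K components) can be sketched as follows. This is only an illustration of the indexing; the function name `sample_mixture` and the Gaussian components are assumptions, not something taken from the article.

```python
import random

def sample_mixture(N, phi, mus, sigmas, seed=0):
    """Draw N observations from a K-component Gaussian mixture.

    i indexes the N observations; k indexes the K components.
    """
    rng = random.Random(seed)
    K = len(phi)
    z, x = [], []
    for i in range(N):
        # z[i] is the component label k assigned to observation i.
        k = rng.choices(range(K), weights=phi)[0]
        z.append(k)
        # x[i] is drawn from the parameters of component k.
        x.append(rng.gauss(mus[k], sigmas[k]))
    return z, x
```

Keeping the two indices distinct means that an expression such as `mus[z[i]]` reads unambiguously as "the mean of the component assigned to observation i".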