Talk:Multinomial distribution

WikiProject Statistics (Rated Start-class, Mid-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

This article has been rated as Start-Class on the quality scale.
This article has been rated as Mid-importance on the importance scale.

This page should be edited to conform to the layout of the other distributions. 08:30, 6 March 2007 (UTC)

You can either use sigma and pi for expressing sum and product, or you can use three dots. Why use both? Bo Jacoby 09:43, 15 September 2005 (UTC)

note about conflation of categorical and multinomial distribution

I found this note in the opening section very confusing. The categorical distribution is a special case of the multinomial distribution, so I don't consider that to be a conflation of the two things. To me the note implied that treating them as the same is technically incorrect. It is technically correct; it just might be slightly misleading. I was going to edit this section to make this clearer but wanted to make sure I wasn't misinterpreting anything. emschorsch (talk) 22:48, 22 February 2016 (UTC)

Constraint on probabilities

This is just a matter of style, but I honestly think the page is clearer when the constraint on the probabilities is expressed in an equation next to the density. This way, the reader doesn’t have to “imagine” trials before realizing that the p's sum to one. And this density is expressed in a similar way on Wolfram’s page: Steve8675309 15:03, 4 February 2007 (UTC)

It is not just a matter of style to say "It is also required..." when you're not talking about things that are required. Michael Hardy 20:13, 4 February 2007 (UTC)

When Casella and Berger present this distribution in Statistical Inference, they show that the p’s sum to one (p.180 in 2nd edition). They also describe the trials and outcomes experiment that this distribution can model, but did not think the equation was so “redundant” that it needed to be omitted. Maybe it’s just more obvious to you than it is to Casella and Berger. Have a nice day! Steve8675309 18:06, 5 February 2007 (UTC)

No one ever said it was so redundant that it needs to be omitted. I said it was redundant to include it twice. Michael Hardy 20:45, 5 February 2007 (UTC)
...and what do you mean by saying they "showed that" the sum is 1? If it's one of the hypotheses, it's not something that one "shows" (i.e. proves). Michael Hardy 20:46, 5 February 2007 (UTC)

OK, now I've looked in Casella and Berger, 2nd edition, and they do NOT "show that" the sum is 1; rather they state as one of their hypotheses that the sum is 1. (In other words, I'm beginning to realize you were confused about language here.) Anyway, I've moved the statement to an appropriate place in the article. Once in a MASH episode, someone was reading some step-by-step instructions on how to defuse a bomb to some brave soldier essaying that task. After the completion of one of the instructions, the next instruction was "But first...". That's what it's like when you say "It is also required that..." AFTER the conclusion is stated. Michael Hardy 17:02, 6 February 2007 (UTC)

Is there any reason why the Introduction and Properties sections use uppercase "Xi" to describe the number of outcomes, whereas the Specification section uses lowercase "xi"? Ams12358 21:28, 30 May 2007 (UTC)

Sigh........ Yes there is a very good reason. I for one, and all sensible people, would be outraged if you tried anything so absurd as to do otherwise. Your comment was very confusing: you said "Xi", when you did NOT mean the Greek capital letter Ξ (Xi) but something else. I looked all over the page for that Greek letter before realizing you had something else in mind.
The reason is of course that in one section one is speaking of random variables, thus using capital letters, and in the other section one is using dummies. This makes it possible to write things like P(X = x).
The capital X and the lower-case x obviously refer to two different things.
(If I seem annoyed at the question it's because I wonder why people don't learn things like this LONG before they reach the point where they're studying this kind of article. Blame the public schools, I guess?) Michael Hardy 22:59, 30 May 2007 (UTC)
OK, now I've made the article explicit about the difference in meaning between capital X and lower-case x. We should think about the best way to make it unnecessary to keep answering this same question over and over and over again on different talk pages like this. Michael Hardy 23:08, 30 May 2007 (UTC)
Just because it's an everyday fact of life to experts like yourself, doesn't mean to say it's common knowledge. The very fact that it needs explaining repeatedly suggests it's a sticking point for many people! Bearing in mind that Wikipedia is supposed to be an encyclopedia, the vast majority of people reading an article won't be experts in it, and may sometimes stray out of their depth. This is understandable, and I guess we all do it, following up interesting leads to subjects we know little about. The kind thing to do would be to direct people to an explanation. For example, there could be a page called Probability Notations, or similar, linked from every probability page. (The probability notation on Wikipedia is actually in a bit of a mess, so such a page could also inform authors as to the appropriate symbols.) -- (talk) 18:30, 20 March 2008 (UTC)
The page Notation in probability and statistics exists, but it is written as a glossary rather than as a guide for newcomers. This could be expanded into an explanation of the principles behind the notation. Then it could either be referenced in the text of every article or added to the Statistics box. (talk) 01:58, 12 September 2014 (UTC)

Modified the multiset coefficient in accordance with the multiset coefficient article. Stephan sand 18:46, 12 June 2007 (UTC)

To fix: covariance of multinomial

Can someone fix the formula for the covariance of the multinomial for when n=1? In this case, the covariance for i≠j is obviously 0, whereas the current formula gives -p_i*p_j. Thanks. Simon Lacoste-Julien (talk) 00:07, 12 February 2009 (UTC)

Simon Lacoste-Julien, this is the covariance of the random variable. Perhaps you are thinking of the sample covariance? PDBailey (talk) 03:28, 12 February 2009 (UTC)
Simon Lacoste-Julien: The formula on 2016-01-16 seems correct even when n = 1: the random vector will have one component equal to 1, and the other components will be 0. DavidMCEddy (talk) 19:04, 16 January 2016 (UTC)
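
A quick check of the n = 1 case discussed above may help. The sketch below enumerates the k possible outcomes of a single trial (the unit vectors) and computes the covariance matrix exactly. The function name `covariance_single_trial` and the example probabilities are illustrative assumptions, not taken from the article.

```python
def covariance_single_trial(p):
    """Exact covariance matrix of a multinomial vector with n = 1.

    The only possible outcomes are the unit vectors e_0, ..., e_{k-1},
    where e_i occurs with probability p[i].
    """
    k = len(p)
    outcomes = [[1.0 if a == i else 0.0 for a in range(k)] for i in range(k)]
    # E[X_a] = sum_i p_i * e_i[a]
    mean = [sum(p[i] * outcomes[i][a] for i in range(k)) for a in range(k)]
    # E[X_a X_b] = sum_i p_i * e_i[a] * e_i[b]; this is zero whenever a != b
    second = [[sum(p[i] * outcomes[i][a] * outcomes[i][b] for i in range(k))
               for b in range(k)] for a in range(k)]
    return [[second[a][b] - mean[a] * mean[b] for b in range(k)]
            for a in range(k)]


p = [0.2, 0.3, 0.5]  # hypothetical example probabilities
cov = covariance_single_trial(p)
```

Because exactly one component equals 1, the product X_i X_j is always 0 for i ≠ j, so E[X_i X_j] = 0 and Cov(X_i, X_j) = 0 - p_i p_j = -p_i p_j. The off-diagonal entries of `cov` are therefore negative rather than zero, which agrees with the replies above.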

Moore–Penrose pseudoinverse of var(X)?

Can something like the binomial inverse theorem be used to obtain a simple expression for the Moore-Penrose pseudoinverse of var(X)?

var(X) is singular, because the sum of the X_i's must be n. A generalized inverse is just diag(p)^(-1), but this is not the Moore-Penrose pseudoinverse.

The binomial inverse theorem can be used to produce a simple formula for the inverse of any proper subset of a multinomial random vector, but it breaks down for the full vector. DavidMCEddy (talk) 19:04, 16 January 2016 (UTC)
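
The claim about the generalized inverse can be checked numerically. The sketch below assumes var(X) = n(diag(p) - p pᵀ) and includes a 1/n factor in the candidate inverse (the factor disappears for n = 1). It tests the defining property Σ G Σ = Σ and then tests whether G Σ is symmetric, which is one of the Penrose conditions. The helper names and the example values of n and p are illustrative assumptions.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def close(a, b, tol=1e-12):
    """True if two matrices agree entrywise within tol."""
    return all(abs(x - y) <= tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))


n = 4                 # hypothetical number of trials
p = [0.2, 0.3, 0.5]   # hypothetical probabilities
k = len(p)

# Covariance matrix: n * (diag(p) - p p^T)
sigma = [[n * ((p[i] if i == j else 0.0) - p[i] * p[j]) for j in range(k)]
         for i in range(k)]
# Candidate generalized inverse: diag(p)^(-1) / n
g = [[(1.0 / (n * p[i]) if i == j else 0.0) for j in range(k)]
     for i in range(k)]

# Defining property of a generalized inverse: sigma @ g @ sigma == sigma
is_generalized_inverse = close(matmul(matmul(sigma, g), sigma), sigma)

# Moore-Penrose additionally requires g @ sigma to be symmetric
g_sigma = matmul(g, sigma)
g_sigma_is_symmetric = close(g_sigma, transpose(g_sigma))
```

For these values `is_generalized_inverse` is true while `g_sigma_is_symmetric` is false, since G Σ works out to I - 1 pᵀ, which is symmetric only when all the p_i are equal. This supports the statement that diag(p)^(-1) is a generalized inverse but not the Moore-Penrose pseudoinverse; it does not by itself give a formula for the pseudoinverse.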

Sampling from a multinomial

This section could be expanded a lot. There are many different ways to sample from a multinomial, some of which are subtle. This paper gives a linear time/space algorithm that preprocesses a multinomial so that it's possible to draw samples in O(1) amortized time. In fact, there's some discussion on the Alias method page, which is pretty minimal. ChrisDyerCMU (talk) 06:06, 7 February 2012 (UTC)
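
For readers following this thread, here is a minimal sketch of the alias method mentioned above, in the variant usually attributed to Vose. It is not necessarily the algorithm of the paper referred to. Preprocessing takes time and space linear in the number of categories, and each category draw then costs one uniform index plus one uniform comparison. All function names are illustrative assumptions.

```python
import random


def build_alias_table(p):
    """Preprocess probabilities p (assumed to sum to 1) in O(k) time."""
    k = len(p)
    scaled = [x * k for x in p]          # rescale so the average is 1
    prob = [1.0] * k                     # chance of keeping column i itself
    alias = list(range(k))               # fallback category for column i
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]              # column s keeps s with this chance
        alias[s] = l                     # otherwise it yields l
        scaled[l] = scaled[l] + scaled[s] - 1.0   # mass l donated to column s
        (small if scaled[l] < 1.0 else large).append(l)
    # Anything left over has scaled mass 1 up to rounding, so prob stays 1.0.
    return prob, alias


def draw_category(prob, alias, rng=random):
    """Draw one category index in O(1) time."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


def sample_multinomial(n, p, rng=random):
    """Draw multinomial counts by tallying n independent category draws."""
    prob, alias = build_alias_table(p)
    counts = [0] * len(p)
    for _ in range(n):
        counts[draw_category(prob, alias, rng)] += 1
    return counts


rng = random.Random(0)                   # fixed seed, only for repeatability
counts = sample_multinomial(1000, [0.2, 0.3, 0.5], rng)
```

Strictly speaking, each draw here is a draw from the categorical distribution; the multinomial vector is the tally of n such draws, so the total cost is O(k + n). When n is much larger than k, other approaches (for example, drawing successive conditional binomials) may be preferable.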


I went to this page to find the proper notation for a multinomially distributed random variable, but I cannot find it. What I mean is the notation or whatever is the correct one. I think it should be added to the page. — Preceding unsigned comment added by (talk) 00:13, 6 March 2015 (UTC)