# Talk:Dirichlet distribution

WikiProject Statistics (Rated Start-class, Mid-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

WikiProject Mathematics (Rated Start-class, Mid-importance)
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Field: Probability and statistics

## Delta

Should that be a Dirac delta, not a Kronecker delta? If it were Kronecker, then the distribution would be finite everywhere but nonzero only on a set of measure zero.

According to David J.C. MacKay and Linda C. Bauman Peto "A Hierarchical Dirichlet Language Model" it should be a Dirac delta function.

It is clearly a Dirac delta since it has to be defined over real numbers, and not integers.

## Question regarding chained Dirichlet distributions

If I draw a probability distribution $X\sim Dir(\alpha)$, and then another distribution $Y\sim Dir(rX)$ for some constant r, is the marginal distribution of Y Dirichlet? A5 15:12, 20 April 2006 (UTC)
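Not an answer, but a quick Monte Carlo sketch for anyone who wants to experiment with this. It draws Y by normalising independent Gamma variates (a standard way to sample a Dirichlet) and compares the sample variance with what the law of total variance gives. The parameter values and the function names are only illustrative. Note that a single Dirichlet with mean $\alpha/\alpha_0$ and a suitably chosen concentration reproduces the same means, variances and covariances, so matching the first two moments does not settle the question; higher moments would have to be compared.

```python
import numpy as np

def sample_chained(alpha, r, n, rng):
    """Draw n samples of Y, where X ~ Dir(alpha) and Y | X ~ Dir(r * X)."""
    alpha = np.asarray(alpha, dtype=float)
    x = rng.dirichlet(alpha, size=n)            # shape (n, K), rows sum to 1
    g = rng.gamma(shape=r * x)                  # independent Gamma(r * x_i, 1) draws
    return g / g.sum(axis=1, keepdims=True)     # normalising each row gives Dir(r * x)

def predicted_variance(alpha, r):
    """Var[Y_i] = E[Var(Y_i | X)] + Var(E[Y_i | X]), written in closed form."""
    alpha = np.asarray(alpha, dtype=float)
    a0 = alpha.sum()
    m = alpha / a0                              # E[Y_i] = E[X_i] = alpha_i / alpha_0
    return m * (1 - m) * (a0 + r + 1) / ((a0 + 1) * (r + 1))

rng = np.random.default_rng(0)
alpha, r = [2.0, 3.0, 4.0], 5.0                 # illustrative values only
y = sample_chained(alpha, r, 200_000, rng)
print("sample mean     ", y.mean(axis=0))
print("sample variance ", y.var(axis=0))
print("predicted var   ", predicted_variance(alpha, r))
```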

## Dummy questions

Where it says: "The Dirichlet distribution is conjugate to the multinomial distribution in the following sense: if", should the "X~Mult(X)" be "X~Mult(alpha)"? The notation Mult(X) doesn't make sense to me, since X is a random variable, not a parameter. Chongman 01:28, 19 April 2007 (UTC)
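In case it helps with the notation, here is a minimal sketch of the conjugate update. The names `p` and `counts` are mine, not the article's: the Dirichlet draw `p` is the probability vector, and it is that draw (a random variable) which serves as the parameter of the multinomial. The prior values and the number of trials are arbitrary.

```python
import numpy as np

def posterior_alpha(alpha, counts):
    """Conjugate update: a Dir(alpha) prior plus multinomial counts gives Dir(alpha + counts)."""
    return np.asarray(alpha, dtype=float) + np.asarray(counts, dtype=float)

rng = np.random.default_rng(1)
alpha = np.array([1.0, 2.0, 3.0])     # prior parameters (illustrative)
p = rng.dirichlet(alpha)              # p ~ Dir(alpha): a random probability vector
counts = rng.multinomial(50, p)       # counts | p ~ Mult(50, p): the parameter is the draw p
post = posterior_alpha(alpha, counts)
print("posterior parameters:", post)
print("posterior mean:      ", post / post.sum())
```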

I think the sum that equals 1 in the second paragraph should run over all the alphas, not over all the x. Am I right or wrong? Munibert 16:29, 15 March 2007 (UTC)

I now realise that the alphas stand for events (from N) and the x are the probabilities, so I was wrong. But why does one use this terminology, which seems to reverse the usage from multinomial distributions? Munibert 16:35, 15 March 2007 (UTC)

Can someone please explain what this distribution reflects? For the normal distribution, the authors go to great lengths to cite examples of everyday values that follow a normal distribution... can't someone add an example like this for the Dirichlet distribution? --Maximilianh 05:18, 4 June 2006 (UTC)

The "cutting up strings" text recently added provides a minimal example of where this distribution would come up. Having others would be good, of course. BSVulturis 20:47, 12 March 2007 (UTC)
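For readers who prefer to see numbers, a small sketch of the string-cutting picture: each sampled row is one unit-length string cut into K pieces, and the average piece lengths come out at $\alpha_i/\alpha_0$. The parameter values are chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
alpha = np.array([6.0, 2.0, 2.0])               # illustrative parameters
pieces = rng.dirichlet(alpha, size=100_000)     # each row: piece lengths of one unit string

print(pieces[:3])                               # a few individual cuts; every row sums to 1
print("average lengths:", pieces.mean(axis=0))
print("alpha / alpha_0:", alpha / alpha.sum())
```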

Why is this distribution called a continuous distribution, when the cumulative distribution is not continuous? It should be neither continuous nor discrete. Albmont 18:58, 16 October 2006 (UTC)

It's now defined without the Dirac delta function that was there before. Someone will have to check, but I think it's now defined over a set that we integrate over, and the integral is continuous. MisterSheik 21:36, 27 February 2007 (UTC)

What does this mean? "The characteristic function χ ensures that the density is zero unless..." The page doesn't define a characteristic function. I'm going to change it to say that the sum over the x's is defined as 1, which is what I think it means... MisterSheik 20:05, 26 February 2007 (UTC)

I just stumbled over the very first figure in this article, showing the probability densities. 1. I think it would be useful to explain $\alpha$ a bit more, like: $\alpha=(\alpha_x, \alpha_y, \alpha_z)=(6,2,2), (3,7,5), \ldots$ 2. I think the last two choices for $\alpha$ got twisted; from the pictures the order should be $(2,3,4), (6,2,6)$ rather than $(6,2,6), (2,3,4)$ (at least if my clock moves according to the standard). —Preceding unsigned comment added by 87.113.20.9 (talk) 10:25, 25 January 2008 (UTC)

The last sentence says "The variance around this mean varies inversely with α0." This seems to contradict $\mathrm{Var}[X_i] = \frac{\alpha_i (\alpha_0-\alpha_i)}{\alpha_0^2 (\alpha_0+1)}$. —Preceding unsigned comment added by 217.133.67.206 (talk) 15:05, 4 April 2008 (UTC)
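One way to reconcile the two statements, offered as a sketch: hold the mean $m_i = \alpha_i/\alpha_0$ fixed and substitute $\alpha_i = m_i\alpha_0$ into the variance formula. It then reduces to $m_i(1-m_i)/(\alpha_0+1)$, so the variance is inversely proportional to $\alpha_0+1$ rather than to $\alpha_0$ exactly. The snippet below checks that identity with exact rational arithmetic; the mean vector is an arbitrary example.

```python
from fractions import Fraction

def dirichlet_variance(alpha, i):
    """Var[X_i] = alpha_i (alpha_0 - alpha_i) / (alpha_0^2 (alpha_0 + 1)), computed exactly."""
    a0 = sum(alpha)
    return Fraction(alpha[i] * (a0 - alpha[i])) / (a0**2 * (a0 + 1))

mean = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]   # fixed mean vector (illustrative)
for a0 in (10, 100, 1000):
    alpha = [m * a0 for m in mean]                         # same mean, growing concentration
    v = dirichlet_variance(alpha, 0)
    assert v == mean[0] * (1 - mean[0]) / (a0 + 1)         # the simplified form
    print(a0, float(v))
```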

Maybe I'm fighting windmills here, but I simply can't see where the Jacobian has gone during the derivation of the Dirichlet distribution from Gamma distributions. It seems that the author has simply transformed the independent variables from Y to X and plugged those new Xs into the formula of the density f(Y1,...,Yk). But for obtaining the density of the Xs, don't you also need to multiply f with the Jacobian? — Preceding unsigned comment added by 130.60.6.54 (talk) 09:44, 16 June 2011 (UTC)
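A sketch of where the Jacobian goes, assuming the derivation starts from independent $Y_i \sim \mathrm{Gamma}(\alpha_i, 1)$ with $S = \sum_i Y_i$ and $X_i = Y_i/S$ (the unit scale is my assumption; I have not checked it against the article text). Change variables from $(y_1,\ldots,y_K)$ to $(x_1,\ldots,x_{K-1},s)$, writing $x_K = 1-\sum_{i<K}x_i$:

$$y_i = s\,x_i \ (i<K), \qquad y_K = s\Bigl(1-\sum_{i=1}^{K-1}x_i\Bigr), \qquad \left|\frac{\partial(y_1,\ldots,y_K)}{\partial(x_1,\ldots,x_{K-1},s)}\right| = s^{K-1}$$

$$f(x_1,\ldots,x_{K-1},s) = \left[\prod_{i=1}^{K}\frac{(s x_i)^{\alpha_i-1}e^{-s x_i}}{\Gamma(\alpha_i)}\right]s^{K-1} = \frac{s^{\alpha_0-1}e^{-s}}{\prod_{i=1}^{K}\Gamma(\alpha_i)}\prod_{i=1}^{K}x_i^{\alpha_i-1}$$

$$f(x_1,\ldots,x_{K-1}) = \int_0^\infty f(x_1,\ldots,x_{K-1},s)\,ds = \frac{\Gamma(\alpha_0)}{\prod_{i=1}^{K}\Gamma(\alpha_i)}\prod_{i=1}^{K}x_i^{\alpha_i-1}$$

So the Jacobian is needed: it supplies the factor $s^{K-1}$, which combines with $\prod_i s^{\alpha_i-1} = s^{\alpha_0-K}$ to give $s^{\alpha_0-1}$, and integrating out $s$ then produces the $\Gamma(\alpha_0)$ in the normalising constant.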

## Uniform Dirichlet Distribution

Can somebody please explain to me what it is? Diego Torquemada (talk) 07:08, 23 April 2008 (UTC)

## First equation appears wrong

The first equation doesn't make sense. The product ranges from i=1 to i=K, which implies that there is an x[K] in the product. However, the left hand side makes it clear that x[K] is not present in the function. It's hard to have any confidence in the article as a whole when the defining equation is inconsistent: could someone who understands this please fix it?

213.162.107.11 (talk) 07:30, 12 June 2009 (UTC)

-- I am wondering about that myself. I think the author was trying to capture something about the fact that it is only defined on the K-1 dimensional "simplex" -- i.e. the vars x1 ... xK have to sum to 1. I am going to try to figure this out and update the article. --Ivan --74.56.167.228 (talk) 16:24, 19 November 2009 (UTC)

@Schmock: for 2 vars, (x) and (1-x) makes sense. Notice that the left-hand side of the Beta distribution function shows only x, so it is all kosher. Now, in the Dirichlet definition you have x_1 ... x_{K-1} on the LHS, and x_K appears on the right-hand side. This equation is syntactically wrong and doesn't make sense to the user until they read the following paragraph.

With the change I made, the equation stands on its own, and we clarify that we are on a simplex and that x_1 ... x_K must sum to 1. --74.13.201.69 (talk) 19:52, 5 December 2009 (UTC)
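To make the two readings concrete, here is a rough sketch of the density written as a function of only the K-1 free coordinates, with x_K filled in from the sum-to-one constraint. The function name and argument layout are mine. With K = 2 it reduces to the Beta density, as in the comparison above.

```python
import math

def dirichlet_pdf(x_free, alpha):
    """Dirichlet density as a function of x_1..x_{K-1}; x_K is implied by the constraint."""
    if len(x_free) != len(alpha) - 1:
        raise ValueError("expected K-1 free coordinates for K parameters")
    x = list(x_free) + [1.0 - sum(x_free)]          # recover x_K = 1 - (x_1 + ... + x_{K-1})
    if any(xi <= 0.0 for xi in x):
        return 0.0                                  # point lies outside the open simplex
    # Work in log space: log Gamma(alpha_0) - sum log Gamma(alpha_i) + sum (alpha_i - 1) log x_i
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    log_kernel = sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x))
    return math.exp(log_norm + log_kernel)

print(dirichlet_pdf([0.2, 0.3], [1.0, 1.0, 1.0]))   # flat case: constant density on the simplex
print(dirichlet_pdf([0.25], [2.0, 3.0]))            # K = 2: the Beta(2, 3) density at x = 0.25
```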

## math style

The agreed-upon math style is to not use math tags around things like $K$-dimensional vector. It's harder to read in certain browsers. Crasshopper (talk) 00:48, 23 January 2011 (UTC)

## Scaled Dirichlet distribution

So. I think the assumptions in the Wiki page are not correct. I believe that the provided PDF is only valid when equal scaling is applied to all shape parameters, i.e. $a_1= a_2=\cdots=a_k$. If you want to bias the distribution toward particular variables, then you will need to use a scaled PDF. See Compositional Data Analysis: Theory and Applications (Section 10.2) for more information. — Preceding unsigned comment added by Datahipster (talk • contribs) 22:42, 26 November 2011 (UTC)

## Support

It is stated that each x_i must lie in (0,1) for x to be in the support. However, the support is, by definition, a closed set, so, for example, x = (0,1,0) should be in the support, right? Anyway, even if some definition of "support" allows for this use, it is inconsistent with the infobox, which may be confusing to nonmathematicians like me. Franknarf11 (talk) 06:23, 3 October 2013 (UTC)

## error in right panel

In the right panel, under "support", it says: Σxi = 1. Shouldn't it be Σxi < 1? At least this is how it appears in the Mathematica help. — Preceding unsigned comment added by Olimak9000 (talk • contribs) 17:17, 6 January 2014 (UTC)
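A possible explanation, assuming the Mathematica help lists only the first K-1 coordinates (I have not verified this): the two conditions are then the same set written in different coordinates. With all K coordinates the sum is exactly 1; with only K-1 of them the sum is strictly less than 1, and the remainder is the last coordinate. A small sketch, with hypothetical helper names and using the open-interval convention:

```python
import math

def in_support_full(x):
    """All K coordinates given: each strictly inside (0, 1), summing to 1."""
    return all(0.0 < xi < 1.0 for xi in x) and math.isclose(sum(x), 1.0)

def in_support_free(x_free):
    """Only K-1 coordinates given: each positive, with sum strictly below 1."""
    return all(xi > 0.0 for xi in x_free) and sum(x_free) < 1.0

x_free = [0.2, 0.3]
x_full = x_free + [1.0 - sum(x_free)]     # append the implied last coordinate
print(in_support_free(x_free), in_support_full(x_full))
```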