This is an old revision of this page, as edited by 128.100.109.6(talk) at 19:40, 29 January 2010(This doesn't make sense, there is no x[K] in the parameter list, but it is in the sum.). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
for all x1, ..., xK–1 > 0 satisfying x1 + ... + xK–1 < 1, where xK is an abbreviation for 1 – x1 – ... – xK–1. The density is zero outside this open (K − 1)-dimensional simplex.
The mode of the distribution is the vector (x1, ..., xK) with xi = (αi − 1)/(α0 − K), where α0 = α1 + ... + αK, valid when all αi > 1.
The Dirichlet distribution is conjugate to the multinomial distribution in the following sense: if
X ~ Dir(α) and β = (β1, ..., βK),
where βi is the number of occurrences of i in a sample of n points from the discrete distribution on {1, ..., K} defined by X, then
X | β ~ Dir(α + β).
This relationship is used in Bayesian statistics to estimate the hidden parameters, X, of a categorical distribution (discrete probability distribution) given a collection of n samples. Intuitively, if the prior is represented as Dir(α), then Dir(α + β) is the posterior following a sequence of observations with histogram β.
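A minimal sketch of this conjugate update (the prior and observed samples below are hypothetical; standard library only):

```python
from collections import Counter

def dirichlet_posterior(alpha, samples, K):
    """Posterior Dirichlet parameters alpha + beta, where beta_i counts
    occurrences of category i (1..K) among the observed samples."""
    beta = Counter(samples)
    return [alpha[i] + beta.get(i + 1, 0) for i in range(K)]

# Hypothetical prior and observations over K = 3 categories.
alpha = [2.0, 1.0, 1.0]
samples = [1, 3, 3, 2, 3]                      # histogram beta = (1, 1, 3)
print(dirichlet_posterior(alpha, samples, 3))  # [3.0, 2.0, 4.0]
```

With no observations the posterior equals the prior, as expected of a conjugate pair.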
From the above equation it can be seen that the derived probability density function factorizes into two independent parts, a beta-distributed part and a Dirichlet-distributed part; integrating out one of them yields the result.
Related distributions
If, for i = 1, ..., K,
Yi ~ Gamma(αi, θ) independently,
then
V = Y1 + ... + YK ~ Gamma(α0, θ), where α0 = α1 + ... + αK,
and
X = (X1, ..., XK) = (Y1/V, ..., YK/V) ~ Dir(α1, ..., αK).
Though the Xi are not independent of one another, they can be seen to be generated from a set of independent gamma random variables. Unfortunately, since the sum V is lost in forming X, it is not possible to recover the original gamma random variables from these values alone. Nevertheless, because independent random variables are simpler to work with, this reparametrization can still be useful for proofs about properties of the Dirichlet distribution.
The following is a derivation of the Dirichlet distribution from the gamma distribution.
Let Yi, i = 1, 2, ..., K, be independent random variables, each following a gamma distribution with shape parameter αi and the same scale parameter θ:
Yi ~ Gamma(αi, θ).
Setting V = Y1 + ... + YK and Xi = Yi/V, then changing variables and integrating out V, we get the following Dirichlet distribution:
f(x1, ..., xK−1; α1, ..., αK) = (Γ(α0) / (Γ(α1) ⋯ Γ(αK))) x1^(α1−1) ⋯ xK^(αK−1), with α0 = α1 + ... + αK,
where XK is an abbreviation for 1 − X1 − X2 − ... − XK−1.
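The change of variables can be sketched as follows (standard derivation; α0 denotes the sum of the αi):

```latex
% Joint density of independent Y_i ~ Gamma(alpha_i, theta):
\begin{aligned}
f(y_1,\dots,y_K)
 &= \prod_{i=1}^{K}
    \frac{y_i^{\alpha_i-1} e^{-y_i/\theta}}{\Gamma(\alpha_i)\,\theta^{\alpha_i}} .\\
\intertext{Substituting $y_i = v x_i$ with $v=\sum_i y_i$ and
$\sum_i x_i = 1$ (Jacobian $v^{K-1}$) gives}
f(x_1,\dots,x_{K-1},v)
 &= \frac{v^{\alpha_0-1} e^{-v/\theta}}{\Gamma(\alpha_0)\,\theta^{\alpha_0}}
    \cdot
    \frac{\Gamma(\alpha_0)}{\prod_{i=1}^{K}\Gamma(\alpha_i)}
    \prod_{i=1}^{K} x_i^{\alpha_i-1},
 \qquad \alpha_0=\sum_{i=1}^{K}\alpha_i .\\
\intertext{The first factor is a $\mathrm{Gamma}(\alpha_0,\theta)$ density in
$v$, so integrating out $v$ leaves the Dirichlet density}
f(x_1,\dots,x_{K-1})
 &= \frac{\Gamma(\alpha_0)}{\prod_{i=1}^{K}\Gamma(\alpha_i)}
    \prod_{i=1}^{K} x_i^{\alpha_i-1}.
\end{aligned}
```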
Multinomial opinions in subjective logic are equivalent to Dirichlet distributions.
Random number generation
Gamma distribution
A fast method to sample a random vector x = (x1, ..., xK) from the K-dimensional Dirichlet distribution with parameters (α1, ..., αK) follows immediately from this connection. First, draw K independent random samples y1, ..., yK from gamma distributions, each with density
Gamma(αi, 1) = yi^(αi−1) e^(−yi) / Γ(αi),
and then set
xi = yi / (y1 + ... + yK).
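This sampler can be sketched with Python's standard library alone (the parameter values below are illustrative):

```python
import random

def sample_dirichlet(alpha):
    """Draw one sample from Dir(alpha) by normalizing independent
    Gamma(alpha_i, 1) draws."""
    y = [random.gammavariate(a, 1.0) for a in alpha]
    total = sum(y)
    return [v / total for v in y]

x = sample_dirichlet([2.0, 3.0, 5.0])
print(x)        # a point in the open 2-simplex
print(sum(x))   # 1.0 up to floating-point rounding
```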
Marginal beta distributions
A less efficient algorithm[2] relies on the univariate marginal and conditional distributions being beta and proceeds as follows. Simulate x1 from a Beta(α1, α2 + ... + αK) distribution. Then simulate x2, ..., xK−1 in order, as follows. For j = 2, ..., K − 1, simulate φj from a Beta(αj, αj+1 + ... + αK) distribution, and let xj = (1 − x1 − ... − xj−1) φj. Finally, set xK = 1 − x1 − ... − xK−1.
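The beta-marginal procedure can be sketched with the standard library's random.betavariate (unoptimized, for illustration):

```python
import random

def sample_dirichlet_beta(alpha):
    """Sample Dir(alpha) from univariate beta draws: each x_j is the
    remaining stick length times Beta(alpha_j, sum of later alphas)."""
    K = len(alpha)
    x = []
    remaining = 1.0
    for j in range(K - 1):
        phi = random.betavariate(alpha[j], sum(alpha[j + 1:]))
        x.append(remaining * phi)
        remaining -= x[-1]
    x.append(remaining)  # x_K = 1 - x_1 - ... - x_{K-1}
    return x

print(sample_dirichlet_beta([2.0, 3.0, 5.0]))  # sums to 1
```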
Intuitive interpretations of the parameters
String cutting
One example use of the Dirichlet distribution is cutting strings (each of initial length 1.0) into K pieces with different lengths, where each piece has a designated average length but some variation in the relative sizes of the pieces is allowed. The αi/α0 values specify the mean lengths of the cut pieces of string resulting from the distribution, and the variance around each mean varies inversely with α0.
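A small Monte Carlo sketch of this interpretation (the target mean lengths are hypothetical; the sampler normalizes gamma draws as described above):

```python
import random

def dirichlet_sample(alpha):
    """Sample Dir(alpha) by normalizing independent Gamma(alpha_i, 1) draws."""
    y = [random.gammavariate(a, 1.0) for a in alpha]
    s = sum(y)
    return [v / s for v in y]

# Hypothetical target mean piece lengths: 0.5, 0.3, 0.2.
for a0 in (10.0, 1000.0):  # larger alpha_0 -> tighter around the means
    alpha = [0.5 * a0, 0.3 * a0, 0.2 * a0]
    cuts = [dirichlet_sample(alpha) for _ in range(2000)]
    means = [sum(c[i] for c in cuts) / len(cuts) for i in range(3)]
    print(a0, [round(m, 2) for m in means])
```

In both runs the empirical means land near (0.5, 0.3, 0.2); the individual cuts scatter far less for the larger α0.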
Pólya's urn
Consider an urn containing balls of K different colors. Initially, the urn contains α1 balls of color 1, α2 balls of color 2, and so on. Now perform N draws from the urn, where after each draw, the ball is placed back into the urn with an additional ball of the same color. In the limit as N approaches infinity, the proportions of different colored balls in the urn will be distributed as Dir(α1,...,αK).[3]
For a formal proof, note that the proportions of the different colored balls form a bounded [0,1]K-valued martingale, hence by the martingale convergence theorem, these proportions converge almost surely and in mean to a limiting random vector. To see that this limiting vector has the above Dirichlet distribution, check that all mixed moments agree.
Note that each draw from the urn modifies the probability of drawing a ball of any one color from the urn in the future. This modification diminishes with the number of draws, since the relative effect of adding a new ball to the urn diminishes as the urn accumulates increasing numbers of balls. This "diminishing returns" effect can also help explain how large α values yield Dirichlet distributions with most of the probability mass concentrated around a single point on the simplex.
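A minimal simulation of the urn scheme (standard library only; the initial ball counts are illustrative):

```python
import random

def polya_urn(alpha, draws):
    """Simulate Polya's urn: start with alpha_i balls of color i; after
    each draw, return the ball plus one extra ball of the same color.
    Returns the color proportions after the given number of draws."""
    counts = list(alpha)
    for _ in range(draws):
        color = random.choices(range(len(counts)), weights=counts)[0]
        counts[color] += 1
    total = sum(counts)
    return [c / total for c in counts]

# As draws grows, the proportions approach a single Dir(alpha) sample.
print(polya_urn([1.0, 2.0, 3.0], 10_000))
```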
^ Connor, Robert J. (1969). "Concepts of Independence for Proportions with a Generalization of the Dirichlet Distribution". Journal of the American Statistical Association. 64 (325): 194–206. doi:10.2307/2283728.
^ A. Gelman, J. B. Carlin, H. S. Stern, and D. B. Rubin (2003). Bayesian Data Analysis (2nd ed.). p. 582. ISBN 1-58488-388-X.
^ Blackwell, David (1973). "Ferguson distributions via Polya urn schemes". Ann. Stat. 1 (2): 353–355. doi:10.1214/aos/1176342372.