Talk:Gamma distribution: Difference between revisions

Revision as of 23:31, 17 March 2013

Statistics B‑class High‑importance

	This article is within the scope of WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
B	This article has been rated as B-class on Wikipedia's content assessment scale.
High	This article has been rated as High-importance on the importance scale.

Mathematics B‑class High‑priority

	This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
B	This article has been rated as B-class on Wikipedia's content assessment scale.
High	This article has been rated as High-priority on the project's priority scale.

KL Divergence

Anyone mind if I rephrase the Kullback-Leibler Divergence? The pdf is defined above in terms of shape k and scale theta, but the KL divergence uses theta as shape, and inverse scale beta = theta^-1. Pretty confusing. — Preceding unsigned comment added by 152.3.196.217 (talk) 20:45, 21 September 2011 (UTC)[reply]

Important Expectations

I've found it important to know some expectations which I couldn't find on this page. For example, for the $g(x;\alpha ,\beta )$ inverse scale parametrization,

E[\ln {x}]=\psi (\alpha )-\ln {\beta }

Where should this expectation go? Bazugb07 (talk) 16:58, 20 May 2009 (UTC)[reply]
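
The identity above can be checked numerically. The following is a minimal sketch, assuming the rate (inverse scale) parametrization in which X has shape α and rate β; the helper names `digamma`, `expected_log` and `simulated_log_mean` are made up for this illustration, and the digamma routine is a simple recurrence-plus-asymptotic-series approximation rather than a library call.

```python
import math
import random

def digamma(x):
    """Approximate psi(x) for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    # Shift x upward using psi(x) = psi(x + 1) - 1/x until the series is accurate.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # psi(x) ~ ln x - 1/(2x) - 1/(12 x^2) + 1/(120 x^4) - 1/(252 x^6)
    result += math.log(x) - 0.5 / x - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result

def expected_log(alpha, beta):
    """E[ln X] = psi(alpha) - ln(beta) for shape alpha and rate beta."""
    return digamma(alpha) - math.log(beta)

def simulated_log_mean(alpha, beta, n=20000, seed=0):
    """Monte Carlo estimate of E[ln X]."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale), so the rate beta is inverted.
    return sum(math.log(rng.gammavariate(alpha, 1.0 / beta)) for _ in range(n)) / n

print(expected_log(3.0, 2.0), simulated_log_mean(3.0, 2.0))
```

The two printed values should agree to within Monte Carlo error; with the scale parametrization the same formula reads ψ(k) + ln θ.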

Implementing cdf

So I'm trying to get the cdf into gnuplot to generate a graph except it's not working right

cgamma(x,k,t) = igamma(k, x/t) / gamma(k)

set xtics 0,2
set ytics 0,0.1
set samples 1001
set terminal postscript enhanced color solid lw 2 "Times-Roman" 27
set output

plot [0:100] \
    cgamma(x,1,2) title "{/Times-Italic k} = 1, {/Symbol q} = 2", \
    cgamma(x,2,2) title "{/Times-Italic k} = 2, {/Symbol q} = 2", \
    cgamma(x,3,2) title "{/Times-Italic k} = 3, {/Symbol q} = 2", \
    cgamma(x,4,2) title "{/Times-Italic k} = 4, {/Symbol q} = 2", \
    cgamma(x,5,2) title "{/Times-Italic k} = 5, {/Symbol q} = 2", \
    cgamma(x,5,.5) title "{/Times-Italic k} = 5, {/Symbol q} = 0.5"

but this doesn't give me the correct plots

the first two tend to 1
third tends to 0.5
fourth tends to like .175
fifth & sixth tend to the same value at about 0.05

The cgamma function above is the incomplete gamma function over the gamma function...as shown in the article. Which is wrong: the cdf in the article or my gnuplot setup? Cburnett 09:00, 10 Mar 2005 (UTC)

I use the following gnuplot definitions (using names similar to those used by R/Splus):

_ln_dgamma(x, a, b) = a*log(b) - lgamma(a) + (a-1)*log(x) - b*x
dgamma(x, shape, rate) =\
 (x<0)? 0 :\
 (x==0)? ((shape<1)? 1/0 : (shape==1)? rate : 0) :\
 (rate==0)? 0 :\
 exp(_ln_dgamma(x, shape, rate))
pgamma(x, shape, rate) = (x<0)? 0 : igamma(shape, x*rate)

The problem is that "incomplete gamma function" is ambiguous, referring sometimes to the regularized incomplete gamma function. --MarkSweep 16:50, 10 Mar 2005 (UTC)
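
To make the ambiguity concrete, here is a rough Python sketch. It assumes, as the gnuplot documentation states, that `igamma` is already the regularized function; the names `regularized_lower_gamma`, `gamma_cdf` and `doubly_divided` are invented for this illustration. Dividing the regularized function by Γ(k) a second time gives limits of 1/Γ(k), which matches the 1, 1, 0.5, roughly 0.17 and roughly 0.04 plateaus reported above.

```python
import math

def regularized_lower_gamma(a, x, terms=500):
    """P(a, x) = lower_incomplete_gamma(a, x) / Gamma(a), via its power series."""
    if x <= 0:
        return 0.0
    term = 1.0 / a          # n = 0 term of sum x^n / (a (a+1) ... (a+n))
    total = term
    for n in range(1, terms):
        term *= x / (a + n)
        total += term
        if term < total * 1e-16:
            break
    # Prefactor x^a e^(-x) / Gamma(a), evaluated in log space to avoid overflow.
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))

def gamma_cdf(x, k, theta):
    """CDF for shape k and scale theta: the regularized function, no extra division."""
    return regularized_lower_gamma(k, x / theta)

def doubly_divided(x, k, theta):
    """Reproduces the original mistake: dividing by Gamma(k) a second time."""
    return regularized_lower_gamma(k, x / theta) / math.gamma(k)

for k in (1, 2, 3, 4, 5):
    print(k, gamma_cdf(100.0, k, 2.0), doubly_divided(100.0, k, 2.0))
```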

PDF/CDF confusion

Ok, so I wrote gnuplot code taken directly from the pdf listed in the article. It's under commons:Image:Gamma distribution pdf.png (see the cdf as well that uses MarkSweep's code from above) but if I take your pdf implementation I get much different curves. The CDF drastically does not agree (see yellow line).

If I can't take the PDF & CDF from the article and get correct plots then I think we need to change the article (even if it's gnuplot that's wrong and explain how some plotters could implement, say, the incomplete gamma function differently). Cburnett 19:47, 10 Mar 2005 (UTC)

I was just about to point out that the PDFs visually don't integrate to the CDFs. I also realize that the gnuplot snippet I posted above uses the same parameterization that R uses, which I suspect is different from the first parameterization used in the article. There is always the issue whether one should use a scale parameter directly or use its inverse instead. I suspect that's the underlying confusion here. --MarkSweep 20:07, 10 Mar 2005 (UTC)

Ah, there we go. I was using theta and you were using beta. I inverted the rate parameter and got the matching CDF. Uploading new one. Cburnett 20:13, 10 Mar 2005 (UTC)

There. *sigh* Finally. :) Cburnett 20:23, 10 Mar 2005 (UTC)

Aargh. I was working on the same plots and just replaced both your versions with new matching PDFs and CDFs. Note that the width is now 1300px, so that scaling it down to 325px will be easier or look better. I used fewer examples to avoid cyan-on-white and yellow-on-white. --MarkSweep 20:40, 10 Mar 2005 (UTC)

Related to this, the mgf is expressed in terms of alpha and beta too. What do you think of adding a note about alternative parameterizations and their uses (see e.g. the more convenient parameterization in Exponential distribution#Bayesian inference)? --MarkSweep 20:47, 10 Mar 2005 (UTC)

mgf updated to match the rest. I don't really care what parameterization we use as long as it's consistent. I prefer Greek letters for parameters (just cuz I guess....) so I'd go for the alpha/beta notation over the k/theta. Either way.... More plots to generate if we change it. Cburnett 21:44, 10 Mar 2005 (UTC)

alternative parameterization

Isn't there an alternative parameterization for the gamma? Is there a general method by which Wikipedia deals with these? --Pdbailey 17:47, 16 Apr 2005 (UTC)

Okay, foot in mouth... I was confused. --Pdbailey 18:04, 16 Apr 2005 (UTC)

There is in fact an alternative parametrization ... but I'm too lazy to look it up now. Rp 01:46, 6 May 2006 (UTC)[reply]

Relation to Maxwell-Boltzmann

I removed the following text which doesn't make any sense: " $Y\sim \mathrm {Maxwell} (\beta )$ is a Maxwell-Boltzmann distribution if $X\sim \mathrm {Gamma} (\alpha =3/2,\beta )$ ." I couldn't figure out how to fix it by reading Maxwell-Boltzmann distribution A5 23:13, 16 April 2006 (UTC)[reply]

It's fixed (but check it, please). PAR 00:30, 17 April 2006 (UTC)[reply]

Real-world examples

It would be interesting to see some real-world examples of gamma distributions; the article is a bit technical at the moment. It would also be nice to learn why they might be distributed in that way. I know for instance that reaction times in psychological experiments are usually gamma-distributed (rather than the normal distribution that is assumed in the statistical tests based on them) but I'm not sure why. Junes 10:30, 25 May 2006 (UTC)[reply]

Yes, more insight into the function in plain language (vs formulae) would be very nice!

Gamma distribution is used to model claim severity in the general insurance industry 195.28.231.13 11:41, 20 February 2007 (UTC)[reply]

I also would like more plain language on what generates such distributions and when they are found. Some can be found in the Wikipedia article on the exponential distribution, which is a special case of the gamma. It is described as the natural distribution of intervals between events that occur at a constant random rate (for example, phone calls within a certain period where the rate is steady). Timothy Mak on the AllStat list has also told me that gamma is the expected distribution of time until n events have occurred.

Also on the AllStat list (in the archive) I have read that on "theoretical grounds" the distribution of rates of return across companies could be expected to be gamma distributed.

I also have read James V. Bradley's article on the "L-shaped" distribution of response times, among other things, and the implications for normality-assuming statistics. Of course some just assume that the sampling distribution, the distribution of sample means around the population one, is normal, which is often true even when the actual data are distributed nonnormally. I have also heard of the insurance example.

I am currently working with an L-shaped distribution of the number of times a court case is "distinguished" relative to those it is "followed". I think it might be a gamma, and specifically an exponential, distribution, but am having trouble finding a way to test this hypothesis.

Yours Sincerely,

Alan E. Dunne 24.235.165.89 15:32, 26 March 2007 (UTC)[reply]

Kotz and Norman Lloyd Johnson, in Continuous Univariate Distributions (1970), chapter 17, give several examples of the "time-to-event" and "insured casualty" types, and also fibre-diameter measurements of wool tops, "internal comparisons in multipurpose experiments" and unspecified "medical applications".

Yours Sincerely,

Alan E. Dunne 24.235.165.89 15:45, 2 April 2007 (UTC), With Respect[reply]

Further work with my ratio of court citation types has shown that it does not approximate the exponential but rather a gamma distribution with k less than 1 (a bit more than 0.5). I would be interested to hear what this might mean.

Yours Sincerely

Alan E. Dunne

Cite? Approximation when some x = 0?

The approximation provided for k is very useful in general, but what is one to do when $x_{i}=0$ , for some i?

Also, it would be nice to have a cite, here, but neither of the listed references has this formula, AFAICT.

Ken K 30 Oct 2006

Theta or 1/theta ?

To me it seems that in the probability density function we should have $\theta \,^{-k}$ instead of $\theta \,^{k}$ since all the other characteristics that are shown seem to be calculated with $1/\theta \,$ . I checked http://mathworld.wolfram.com/GammaDistribution.html to verify this but I would be happier if a more experienced wiki-editor/mathematician changed the article. Sorry if I am incorrect about this.

Artagas 20 Nov 2006

Both parameterizations exist, and both are already covered in the article (someone introduced a mistake recently, now corrected). In the first version, with parameters $(k,\theta )$, the parameter θ is a scale parameter. The second parameterization, in terms of $(\alpha ,\beta )$, uses an inverse scale parameter and has advantages when the Gamma distribution is used as a conjugate prior (see e.g. exponential distribution#Bayesian inference). --MarkSweep (call me collect) 02:59, 20 November 2006 (UTC)[reply]

Graphs of pdf misleading

The graph presented on the main page for k=1, $\theta =2$ is misleading. The function actually diverges to infinity as x tends to zero under these parameters. This is an interesting property of the gamma distribution and should be indicated in the graph.

isn't it just an exponential distribution in the case of k=1? (in other words, what you say is not true.) MisterSheik 02:07, 27 March 2007 (UTC)[reply]

Confusion about parameter names

With Respect

In the article gamma is a function of x, k, and theta, or alternatively parametrized by alpha and beta, but there seem to be many other names floating around. Gamma is sometimes said to have parameters r and lambda, or n and lambda. I have seen alpha and beta called A and B. There is also a 1/lambda parameter. Which of alpha or beta is 1/theta and which parameter equals one are also sources of confusion. I have also seen k (I think) called kappa.

There are also references on the Allstat list and elsewhere to "three-parameter gamma" and a constant a.

Yours Sincerely,

Alan E. Dunne

24.235.165.89 15:55, 2 April 2007 (UTC)[reply]

They're just different names of the same things. Except, three-parameter gamma, which just has a location parameter, and is a trivial modification. I think it would be more confusing to include it. MisterSheik 17:44, 17 April 2007 (UTC)[reply]

infinitely divisible

It would be nice to add that the Gamma distribution is infinitely divisible, and to provide its Lévy measure.

Yes... MisterSheik 17:12, 17 April 2007 (UTC)[reply]

exponential family

In the article it says

The Gamma distribution is a two-parameter exponential family ...

Shouldn't it rather be "... is a one-parameter exponential family ..." ? I am aware that the Gamma distribution itself has two parameters, but in the context of exponential families, the number of parameters has a different meaning, in my opinion. Unfortunately this distinction is not made in the article on exponential families (maybe it should be?). Can another mathematician/statistician verify this? 134.60.66.52 14:37, 17 April 2007 (UTC)[reply]

No, the exponential family also has two parameters :) MisterSheik 17:06, 17 April 2007 (UTC)[reply]

Yes, meanwhile I noticed that too. I apologize for my mistake, I got confused by a particular setting where one of the parameters was considered a nuisance parameter, effectively making it a one parameter exponential family. Sorry again. --134.60.66.52 12:30, 18 April 2007 (UTC)[reply]

No worries... :) MisterSheik 12:41, 18 April 2007 (UTC)[reply]

Image

It would be useful if the images at the top of the page would include variations of θ for constant k. Currently, no two curves have the same k value. --EyrianAtWork 13:50, 11 July 2007 (UTC)[reply]

I disagree. θ is a scale parameter, and the curve doesn't change shape with θ. A plot in the scale parameter article might be useful though. -- Aastrup 20:19, 18 July 2007 (UTC)[reply]

Gamma or Γ?

I must admit that most of my Statistics books aren't written in English, but when it comes to the two ways of notation

X\sim \Gamma (k,\theta )\,\,\mathrm {or} \,\,X\sim {\textrm {Gamma}}(k,\theta )

it is clearly the first which is used most often. This is why I'm changing the notation in the article. Aastrup 21:58, 18 July 2007 (UTC)[reply]

The use of \Gamma is confusing, as it refers to the gamma function. Either Gamma(a,b) or Ga(a,b) is more common.

Gentry White —Preceding unsigned comment added by 152.1.95.168 (talk) 19:20, 15 July 2008 (UTC)[reply]

Do not use the gamma function symbol to refer to a gamma distribution. There are too many dang gamma symbols on the page, which makes it confusing which is being referred to. Gamma(a, b) is obviously the clearest. —Preceding unsigned comment added by 69.204.243.36 (talk) 22:08, 7 December 2008 (UTC)[reply]

Another plea to restrict $\Gamma$ to the gamma function (as it's universal throughout mathematics) and use Gamma for the distribution.

A similar thing happens with the beta distribution. Since its parameters are conventionally $\alpha$ and $\beta$, it's just unacceptable to refer to the distribution as $\beta (\alpha ,\beta )$! Much better to write $\mathrm {Beta} (\alpha ,\beta )$, and similarly $\mathrm {Gamma} (\cdot ,\cdot )$ (choose your parameters). --88.109.216.145 (talk) 23:02, 5 November 2009 (UTC)[reply]

The top of the webpage says not to confuse the "Gamma" distribution with the Gamma function. Throughout the webpage, however, the single-parameter $\Gamma$ function (Greek symbol) is used, but nowhere is this explained as the "Gamma" function referred to at the top. This should be explained, and a link to the Wikipedia page for the Gamma function should be provided.

~~~~

Proof of some of the basic stuff

I've made this little example on the page concerning characteristic functions:

The Gamma distribution with scale parameter θ and a shape parameter k has the characteristic function

(1-\theta \,i\,t)^{-k}

Now suppose that we have

X\sim \Gamma (k_{1},\theta ){\mbox{ and }}Y\sim \Gamma (k_{2},\theta )

with X and Y independent from each other, and we wish to know what the distribution of X + Y is. The characteristic functions are

\varphi _{X}(t)=(1-\theta \,i\,t)^{-k_{1}},\,\qquad \varphi _{Y}(t)=(1-\theta \,i\,t)^{-k_{2}}

which by independence and the basic properties of characteristic functions leads to

\varphi _{X+Y}(t)=\varphi _{X}(t)\varphi _{Y}(t)=(1-\theta \,i\,t)^{-k_{1}}(1-\theta \,i\,t)^{-k_{2}}=\left(1-\theta \,i\,t\right)^{-(k_{1}+k_{2})}

This is the characteristic function of the gamma distribution with scale parameter θ and shape parameter k₁ + k₂, and we therefore conclude

X+Y\sim \Gamma (k_{1}+k_{2},\theta )

The result can be extended to n independent gamma distributed random variables with the same scale parameter, and we get

\forall i\in \{1,\ldots ,n\}:X_{i}\sim \Gamma (k_{i},\theta )\qquad \Rightarrow \qquad \sum _{i=1}^{n}X_{i}\sim \Gamma \left(\sum _{i=1}^{n}k_{i},\theta \right)

I think it might be nice to have on the gamma distribution page as well. Any thoughts? - Aastrup 12:00, 28 July 2007 (UTC)[reply]
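
The additive property derived above can be sanity-checked by simulation. This is only a sketch: it compares the first two sample moments of the sum with those of Γ(k₁ + k₂, θ), which is consistent with the result but is not a proof of it. The function name `sum_of_gammas` and the chosen shapes are illustrative assumptions.

```python
import random
import statistics

def sum_of_gammas(shapes, theta, n=20000, seed=1):
    """Draw n realisations of X_1 + ... + X_m with X_i ~ Gamma(shape k_i, scale theta)."""
    rng = random.Random(seed)
    # random.gammavariate(alpha, beta) uses alpha as shape and beta as scale.
    return [sum(rng.gammavariate(k, theta) for k in shapes) for _ in range(n)]

shapes = [1.5, 3.5]
theta = 2.0
draws = sum_of_gammas(shapes, theta)

k_total = sum(shapes)
sample_mean = statistics.mean(draws)
sample_var = statistics.variance(draws)

# Gamma(k_total, theta) has mean k_total * theta and variance k_total * theta**2.
print(sample_mean, k_total * theta)
print(sample_var, k_total * theta ** 2)
```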

sir i need some information about statistical distributions

Sir, please tell me the real-life applications and some solved examples of the gamma, Weibull and exponential distributions. I will be very thankful to you. You can send this information to "a_smile4me@yahoo.com". I will wait for your response. —Preceding unsigned comment added by 58.65.201.212 (talk) 18:48, 20 October 2007 (UTC)[reply]

Generating variables

Can anyone review the changes made by ClaudeLo? I wrote the original algorithm (rather adapted and fixed it based on some book) but no longer study math and my work is not math-related, so I do not quite trust my skills. -- Paul Pogonyshev (talk) 00:04, 22 November 2007 (UTC)[reply]

I'm not sure if the current version is these changes, but the current algorithm does not work for me. It consistently generates too many extremely high values. I also can't tell what the point of V1 is in that algorithm; it chooses between two branches, but there should be no need to do that in an acceptance-rejection method.

Here is a quickly hacked together C implementation. It's easily verified due to the simple rejection criterion:

double rand_flat() {
    double val = (double) rand() / (double) RAND_MAX;
    return val;
}

double gamma(const double k, const double theta)
{
    double delta = k - trunc(k);
    double x, y;
    bool done = 0;
    do {
        //   Generate x and y — independent uniformly distributed on (0, 1] variables.
        x = rand_flat();
        y = rand_flat();
        // Accept if y < P(x)
        if (y < (pow(x, delta - 1) * exp(-x) / tgamma(delta))) { done = 1; }
    } while (!done);
 
    double rest = 0;
    for (int i = 0; i < trunc(k); i++) {   
        rest += log(rand_flat());
    }

    double val = (x - rest) / theta;
    return val;
}

--Dyfrgi (talk) 09:26, 6 January 2010 (UTC)[reply]

The code above is bogus for a simple reason: it samples from the area below the probability density but assumes that the density lives within the unit square, which is false. Close to zero the PDF far exceeds 1, and appreciable mass is distributed on x>1. Sorry! —Preceding unsigned comment added by 140.247.239.43 (talk) 05:04, 6 March 2010 (UTC)[reply]
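
For comparison, here is a sketch of a rejection sampler whose proposal covers the whole support rather than the unit square. It is not the algorithm from the article or from the C code above; it is the Marsaglia and Tsang (2000) method, swapped in here only because its acceptance test is short. The function name `sample_gamma` is an assumption of this example, and Python's built-in `random.gammavariate` can serve as a cross-check.

```python
import math
import random

def sample_gamma(k, theta, rng=random):
    """Draw one Gamma(shape=k, scale=theta) variate (Marsaglia-Tsang squeeze method)."""
    if k <= 0 or theta <= 0:
        raise ValueError("k and theta must be positive")
    if k < 1.0:
        # Boost for small shapes: if G ~ Gamma(k + 1) and U ~ Uniform(0, 1],
        # then G * U**(1/k) ~ Gamma(k).
        u = 1.0 - rng.random()
        return sample_gamma(k + 1.0, theta, rng) * u ** (1.0 / k)
    d = k - 1.0 / 3.0
    c = 1.0 / math.sqrt(9.0 * d)
    while True:
        x = rng.gauss(0.0, 1.0)       # normal proposal, unbounded support
        v = (1.0 + c * x) ** 3
        if v <= 0.0:
            continue                  # proposal fell outside the valid region
        u = 1.0 - rng.random()        # in (0, 1], so log(u) is defined
        if math.log(u) < 0.5 * x * x + d * (1.0 - v + math.log(v)):
            return d * v * theta      # scale the unit-scale variate by theta

rng = random.Random(42)
xs = [sample_gamma(2.5, 2.0, rng) for _ in range(20000)]
print(sum(xs) / len(xs))  # should be close to k * theta = 5
```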

Four standard references to reliable algorithms for generating gamma variates have been inserted above the explicit algorithm that is given without proof or source. It does not appear to be the algorithm in Ahrens and Dieter. Can anyone identify a source for the algorithm given? Does the article need to include an explicit (but unsourced) algorithm when one is available in the linked PDF of Ahrens and Dieter? Mathstat (talk) 11:49, 26 February 2012 (UTC)[reply]

Parameter Estimation

Is the MLE parameter estimation for theta wrong? should it be k*xbar instead of xbar/k? --67.109.70.3 (talk) 01:41, 19 June 2008 (UTC)[reply]

Nah, it seems correct. Recall that the mean = k * theta and is estimated by xbar. So it makes sense that theta is estimated by xbar/k. --Fangz (talk) 11:27, 19 June 2008 (UTC)[reply]
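
A quick simulation illustrates the point. This sketch assumes the shape k is known and only the scale is estimated; `scale_estimate` is a name invented for the example.

```python
import random
import statistics

def scale_estimate(samples, k):
    """Estimate the scale theta for a known shape k, using mean = k * theta."""
    return statistics.mean(samples) / k

rng = random.Random(7)
k_true, theta_true = 4.0, 0.5
# random.gammavariate(alpha, beta): alpha is the shape, beta is the scale.
samples = [rng.gammavariate(k_true, theta_true) for _ in range(20000)]

theta_hat = scale_estimate(samples, k_true)   # xbar / k, close to theta_true
wrong = k_true * statistics.mean(samples)     # k * xbar, close to k**2 * theta_true

print(theta_hat, wrong)
```

With these settings xbar/k lands near 0.5, while k*xbar lands near 8, so only the former recovers the scale.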

I think I have found a mistake of the same kind in Chow, V. T., Maidment, D. and Mays, L. W., "Applied Hydrology", 1988, McGraw-Hill, International Edition. In table 11.5.1 the scale parameter is named lambda and corresponds to the inverse of theta, the scale parameter described in this article. It is stated that the estimator for lambda is sx/sqrt(k), where sx is the standard deviation and k is the shape parameter. Lambda being the inverse of theta, its estimator should be sqrt(k)/sx. Could anyone confirm this? J. (talk) 12:33, 19 December 2008

Reference for entropy

Can someone post a reference for the entropy computation for the gamma density ? It's mentioned on the side table without a reference. —Preceding unsigned comment added by 128.95.224.31 (talk) 17:55, 13 October 2008 (UTC)[reply]

As in the reference from the Differential entropy article:

Lazo, A. and P. Rathie. On the entropy of continuous probability distributions. IEEE Transactions on Information Theory, 1978, 24(1): 120-122. 129.242.167.37 (talk) 15:54, 26 October 2011 (UTC)[reply]

Right margin too near to tables in some math articles

(Question moved from MediaWiki talk:Common.css; this would have been best asked at the Wikipedia:Help desk)

With Firefox 3.0.7 on Ubuntu, when I look at Gamma distribution, I see that the right margin of the left text column is too near to the right column where the picture and the table are. There should be some more margin between the two columns. --Pot (talk) 08:46, 11 March 2009 (UTC)[reply]

This article uses the template {{Probability distribution}}. Look at the documentation for the template and you will see it has a parameter for marginleft. I suggest you start with a value of 2em and fudge from there. The default margin is 1em; if you think that should be changed, discuss it on the template talk. -- Gadget850 (Ed) talk 10:47, 11 March 2009 (UTC)[reply]

Relation to dirac distribution

I'm pretty sure it shouldn't be in the article, but it's interesting anyway. The Gamma(α,c/α) distribution tends to a Dirac distribution centered at c as α → ∞. This is easily visible by taking β=4/α (so that αβ=4): for large α it tends to a Dirac distribution. (Or you can see that the variance is c^2/α, which tends to zero in such a process.) —Preceding unsigned comment added by 131.155.212.223 (talk) 13:49, 4 November 2010 (UTC)[reply]

Gamma distribution in rainfall analysis

There is a comment on the main page that a citation is needed in relation to the use of the gamma distribution for rainfall analysis.

There is a good discussion of the use of the gamma distribution to fit monthly rainfall data on p303-307 of Jones et al. (2009)

Jones, O., Maillardet, R. and Robinson, A. (2009) Introduction to scientific programming and simulation using R. CRC Press. [1]

Tony.ladson (talk) 01:29, 1 September 2011 (UTC)[reply]

Distinguishing between k and alpha

What's the purpose of using two different letters for the shape parameter in the two different representations of the distribution? I think it just adds clutter.

Domminico (talk) 19:12, 25 April 2012 (UTC)[reply]

MLE

Rather than waving our swords around showing how clever we are at solving for the MLE under the standard parameterisation, would it not be simpler to present the alpha, mu parameterisation and solve that instead? — Preceding unsigned comment added by 90.218.192.173 (talk) 19:56, 11 October 2012 (UTC)[reply]

[1] Jones, O., Maillardet, R. and Robinson, A. (2009) Introduction to scientific programming and simulation using R. CRC Press.

@@ Line 192: / Line 192: @@
 :A similar thing happens with the beta distribution. Since its parameters are conventionally <math>\alpha</math> and <math>\beta</math>, it's just unacceptable to refer to the distribution as <math>\beta(\alpha, \beta)</math>! Much better to write <math>\mathrm{Beta}(\alpha, \beta)</math>, and similarly <math>\mathrm{Gamma}(\cdot, \cdot)</math> (choose your parameters). --[[Special:Contributions/88.109.216.145|88.109.216.145]] ([[User talk:88.109.216.145|talk]]) 23:02, 5 November 2009 (UTC)
+The top of the webpage says not to confuse the "Gamma" distribution with the Gamma function.  Throughout the webpage, however, the single-parameter <math>Gamma<\math> function (greek symbol) is used but nowhere is this explained as the "Gamma" function referred to at the top.  This should be explained, and a link to the wikipedia page for the Gamma function should be provided.
+~~~~
 ==Proof of some of the basic stuff==