# Talk:Exponential distribution

WikiProject Mathematics (Rated B-class, High-importance)
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Field: Probability and statistics
One of the 500 most frequently viewed mathematics articles.
WikiProject Statistics (Rated B-class, Top-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.


## The graphs

I don't understand how to read the two graphs at the top. What's on the axes? —Preceding unsigned comment added by 87.48.41.14 (talk) 12:28, 20 January 2010 (UTC)

In both graphs, the thing on the horizontal axis is the values of the random variable that is exponentially distributed. In the second graph, the thing on the vertical axis is probability—that of being less than the corresponding value on the horizontal axis. In the first graph, the unit is probability divided by the units of measurement of the thing on the horizontal axis. See probability density function. Michael Hardy (talk) 19:01, 20 January 2010 (UTC)

Why not the same axis titles as used for the normal distribution? The P on the y-axis of the pdf plot is misleading (densities vs. chances). — Preceding unsigned comment added by 134.58.253.57 (talk) 14:54, 10 April 2013 (UTC)
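The density-versus-probability distinction raised in this thread can be checked numerically. The following is a minimal sketch, assuming an arbitrary illustrative rate of 1.5; the function names are hypothetical and not taken from the article.

```python
import math

def exp_pdf(x, lam):
    """Density: probability per unit of x (vertical axis of the first graph)."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def exp_cdf(x, lam):
    """Probability that the variable is <= x (vertical axis of the second graph)."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

lam = 1.5  # illustrative rate, an assumption for this sketch

# A density may exceed 1, which a probability never can.
print(exp_pdf(0.0, lam))

# Integrating the density over [0, 1] (midpoint rule) recovers the CDF at 1.
n = 100_000
h = 1.0 / n
area = sum(exp_pdf((i + 0.5) * h, lam) for i in range(n)) * h
print(abs(area - exp_cdf(1.0, lam)) < 1e-8)
```

Because the density at 0 equals the rate, any rate above 1 gives a curve that rises above 1, which is why labelling that axis "P" can mislead.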

## moment-generating functions

I noticed that the Gaussian and Uniform distributions have a section on the mgf, but this page lacks such a section. 75.142.209.115 (talk) 07:21, 30 October 2008 (UTC)

Is there a reason the moment generating function of the exponential is written as

$\left(1-\frac{t}{\lambda}\right)^{-1}$

instead of the more familiar, and easy to understand

$\frac{\lambda}{\lambda -t}$

in the sidebox? --129.34.20.19 (talk) 20:48, 11 August 2010 (UTC)
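For reference, the two forms are algebraically identical. A short derivation, assuming the rate parameterization with density $\lambda e^{-\lambda x}$ on $x \ge 0$:

```latex
M_X(t) = \operatorname{E}\!\left[e^{tX}\right]
       = \int_0^\infty e^{tx}\,\lambda e^{-\lambda x}\,dx
       = \frac{\lambda}{\lambda - t}
       = \left(1-\frac{t}{\lambda}\right)^{-1},
\qquad t < \lambda .
```

Dividing numerator and denominator of $\lambda/(\lambda - t)$ by $\lambda$ gives the second form, so the choice between them is purely presentational.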

## Exponential

I understand that the infobox text reads "Name: Exponential" and that makes sense, but what is displayed is just "Exponential" and that makes no sense; it's an adjective that's missing a noun to modify. Alternatively, we could change the infobox to display "{{{name}}} distribution"? PAR 10:31, 1 Apr 2005 (UTC)

Take the discussion here please: Template talk:Probability distribution Cburnett 15:04, 1 Apr 2005 (UTC)

Does this page need to be cited more for the technical parts? I ask specifically because I am unable to explain the memoryless property to somebody, and would like to cite more resources. —Preceding unsigned comment added by 151.190.254.108 (talk) 12:58, 2 June 2008 (UTC)

## Rayleigh & Expo fishyness

Something's fishy about the formula relating the Rayleigh and the exponential distribution. The parameters λ and β have to be related somehow. 141.140.6.60 22:36, 7 May 2005 (UTC)

That's right. I will fix it right now. Also, you seem to know this subject. How about some help with the other Category:probability distributions? We could sure use it. We are trying to put full infoboxes in all of them. PAR 23:57, 7 May 2005 (UTC)
Yup, I'll continue to do little things here and there. I forgot to log in for the edits to this article. AxelBoldt 00:53, 8 May 2005 (UTC)
I got the relationship from Statistical Inference by Casella & Berger (ISBN 0534243126), but I don't think it gave the parameter relationship (it might be lambda = beta). I'd have to pull out some paper and confirm by transforming it. Unfortunately, I'm in the middle of moving and it's boxed up. Cburnett 06:36, May 8, 2005 (UTC)

## Email query

From an email to the Foundation:

"Could [you] explain why the Exponential distribution can be generated by taking the natural log of the uniform distribution? Is there any analytical proof? The author cited the quantile function, but it is not clear how. Thanks."

The part of the article quoted was:

"Given a random variate U drawn from the uniform distribution in the interval (0,1], the variate $T = -\ln(U)/\lambda$ has an exponential distribution with parameter λ. This follows from the form of the quantile function given above and yields a convenient way to produce exponentially distributed values using a random number generator on a computer, for instance to conduct simulation experiments."

I've directed the correspondent here - can anyone help? Thanks -- sannse (talk) 19:33, 15 January 2006 (UTC)

This may be the wrong place. I've added a pointer to inverse transform sampling method here, and the (rather trivial) formal proof for the general case there. --MarkSweep (call me collect) 23:37, 15 January 2006 (UTC)
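The inverse transform idea can be sketched in a few lines. This is an illustration only, assuming the transformation in question is T = −ln(U)/λ with U uniform on (0, 1]; the function name and the rate value are hypothetical.

```python
import math
import random

def sample_exponential(lam, rng):
    """Draw one Exp(lam) variate by inverting the CDF F(t) = 1 - exp(-lam * t)."""
    u = 1.0 - rng.random()       # rng.random() lies in [0, 1), so u lies in (0, 1]
    return -math.log(u) / lam    # quantile function evaluated at a uniform draw

rng = random.Random(0)           # fixed seed so the run is repeatable
lam = 2.0                        # illustrative rate, an assumption for this sketch
draws = [sample_exponential(lam, rng) for _ in range(200_000)]
mean = sum(draws) / len(draws)
print(mean)                      # should land near 1 / lam
```

The reasoning is that P(T ≤ t) = P(U ≥ e^{−λt}) = 1 − e^{−λt}, which is exactly the exponential CDF. Keeping u away from 0 avoids taking the logarithm of zero.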

## Comment

Sorry if I'm doing this wrong, I'm a Wikipedia newbie. I just wanted to mention that I got confused for a while in the Bayesian inference section because the Gamma distribution is not parameterized the same way as on the Gamma distribution page. I'm not an expert, so there might be a good reason for this which is over my head. If not, it would probably be nice to change it to make it more consistent with the gamma distribution page so that the resulting likelihood is

$p(\lambda) = \mathrm{Gamma}(\lambda \,;\, \alpha + n, (\beta + n \overline{x})^{-1}).$

--BenE 19:20, 14 July 2006 (UTC)
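The parameterization issue can be made concrete with a small sketch of the conjugate update. It assumes a Gamma prior on λ with shape α and rate β (an assumption; the thread does not state the article's exact convention), and the function name is hypothetical.

```python
def gamma_posterior(alpha, beta, data):
    """Update a Gamma(shape=alpha, rate=beta) prior on the exponential rate.

    Returns the posterior shape, the posterior rate, and the equivalent scale.
    """
    n = len(data)
    total = sum(data)            # equals n times the sample mean
    shape = alpha + n
    rate = beta + total
    return shape, rate, 1.0 / rate

# Illustrative prior and observations (made-up numbers for this sketch).
shape, rate, scale = gamma_posterior(2.0, 1.0, [0.5, 1.5, 1.0])
print(shape, rate, scale)
print(shape / rate)              # posterior mean of the rate
```

The same posterior is written with second parameter β + n x̄ under the rate convention and with (β + n x̄)^{-1} under the scale convention, which is the form proposed in the comment above.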

## Confusion with alternative spec.

Well, the text takes non-alternative spec., but I see at least the last two formulas of Related distributions are using alter. spec. So, I have modified them to non-alter. spec. Ping my talk page if you need a discussion. Thanks. --Amr 12:04, 1 September 2006 (UTC)

## Occurrence and applications - Gamma distribution

"In queueing theory, the inter-arrival times (i.e. the times between customers entering the system) are often modeled as exponentially distributed variables. The length of a process that can be thought of as a sequence of several independent tasks is better modeled by a variable following the gamma distribution (which is a sum of several independent exponentially distributed variables)."

This is quoted from the current page. Isn't the exponential distribution just a special case of the gamma distribution (alpha=1, beta=1/lambda)? If so, it doesn't make sense to me to say that gamma distributions are better when the exponential is itself a gamma distribution. Should this section of text be changed or clarified? --JRavn talk 22:09, 7 February 2007 (UTC)

## Sum of exponential distributions

It currently says that sum of exponential distributions is distributed as Gamma(n;lambda). Looks like it should be Gamma(n;1/lambda) in order to be consistent with the page about the Gamma distribution. 128.84.154.137 20:39, 28 October 2007 (UTC)

Where does it say that? I can't find it. Michael Hardy 22:57, 28 October 2007 (UTC)
Related distributions, the one before the last. 128.84.154.13 23:46, 11 November 2007 (UTC)
I was just about to write a similar comment. It sure looks like an inconsistency to me. —Preceding unsigned comment added by 83.130.133.199 (talk) 18:03, 28 December 2007 (UTC)

Yes, I think this is false too. Should be 1/lambda —Preceding unsigned comment added by 129.67.134.154 (talk) 15:42, 29 January 2008 (UTC)

I think the sum of n i.i.d. exponential variables with mean lambda should give a gamma distribution with parameters n and lambda rather than 1/lambda. I would appreciate a proof if otherwise.

It would probably be best to mention exponential distribution with mean 1/lambda. I've seen both mean lambda and mean 1/lambda distributions in texts, and it should be clarified somewhere in the article.--161.7.96.220 (talk) 22:17, 28 June 2011 (UTC)
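The disagreement in this thread comes down to rate versus scale. A simulation sketch, assuming the exponential variables have rate λ (mean 1/λ): their sum should then match a gamma distribution with shape n and scale 1/λ, which is the same as shape n and rate λ. The numbers below are illustrative.

```python
import random

def mean_var(xs):
    """Return the sample mean and the population variance of xs."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / len(xs)
    return m, v

rng = random.Random(1)                 # fixed seed for repeatability
lam, n, trials = 2.0, 5, 100_000       # illustrative values

# random.expovariate takes the rate; random.gammavariate takes shape and scale.
sums = [sum(rng.expovariate(lam) for _ in range(n)) for _ in range(trials)]
gammas = [rng.gammavariate(n, 1.0 / lam) for _ in range(trials)]

sum_mean, sum_var = mean_var(sums)
gam_mean, gam_var = mean_var(gammas)
print(sum_mean, gam_mean)              # both near n / lam
print(sum_var, gam_var)                # both near n / lam**2
```

If the exponential were instead parameterized by its mean, the second gamma parameter would be that mean itself, which is consistent with both readings given above.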

## Language: Model? Or something else?

"used to model" is used three times on this page. I think that this wording is a bit weak and could be strengthened. "Used to model" implies that the distribution is used out of convenience or an approximate fit. In all three cases where the phrase is used (such as with a Poisson process), the distribution can be derived mathematically from the basic assumptions. Thus, those assumptions are the modeling assumption and the exponential distribution is a result of the model, not really part of the model itself. Contrast this with, for example, the use of a beta distribution either as a Bayes prior or as the family for a parametric model. If these distributions are used mainly because of convenience or goodness of fit, then they are "used to model". I think the exponential distribution is a bit of a different animal, however. I'm going to change this but you may want to check over my changes because I'm not exactly sure what words/phrases to use. Cazort 14:55, 4 November 2007 (UTC)

## Shift parameter?

Why not discuss the exponential distribution with a rate parameter $\lambda$ and a shift parameter a, focusing especially on finding unbiased estimators for these parameters? —Preceding unsigned comment added by 84.83.33.64 (talk) 15:26, 11 February 2009 (UTC)

## index of the minimum

I am thinking to add the following to the section on Distribution of the minimum of exponential random variables

The index of the variable which achieves the minimum is distributed according to the law
$\Pr(X_k=\min\{\,X_1,\dots,X_n\,\})=\frac{\lambda_k}{\lambda_1+\ldots+\lambda_n}$

Can someone verify my calculations or find a reference to this fact? (Igny (talk) 18:47, 1 May 2009 (UTC))

First find the conditional probability that X1 ≤ X2 and ... and X1 ≤ Xn given the value of X1.
That is a random variable that is just a function of X1. Its expected value is the unconditional probability that you seek (see law of total probability). The expected value is a simple integral. That proves the result.
But I'd rather write
${}+\cdots+{}\,$
than
${}+\ldots+{}\,$
Michael Hardy (talk) 00:25, 2 May 2009 (UTC)

OK, I can comment in a more leisurely way now. First we have

$\Pr(X_k > x) = e^{-\lambda_k x}. \,$

So (using independence)

$\Pr(X_k > X_1 \mid X_1) = e^{-\lambda_k X_1}\text{ for }k\in\{\,2,\dots,n\,\}. \,$

Then, again using independence,

\begin{align} \Pr(X_2 > X_1\text{ and }\dots\text{ and }X_n > X_1 \mid X_1) & = e^{-\lambda_2 X_1}\cdots e^{-\lambda_n X_1} \\ & = e^{-(\lambda_2+\cdots+\lambda_n)X_1}. \end{align}

Therefore, by the law of total probability,

\begin{align} & {} \quad \Pr(X_2 > X_1\text{ and }\dots\text{ and }X_n > X_1) = E\left( e^{-(\lambda_2+\cdots+\lambda_n)X_1} \right) \\ & = \int_0^\infty e^{-(\lambda_2+\cdots+\lambda_n)x} f_{X_1}(x)\,dx = \int_0^\infty e^{-(\lambda_2+\cdots+\lambda_n)x} \lambda_1 e^{-\lambda_1 x}\,dx \\ & = \int_0^\infty e^{-(\lambda_1+\cdots+\lambda_n)x} \lambda_1 \,dx = \frac{\lambda_1}{\lambda_1+\cdots+\lambda_n}. \end{align}

Michael Hardy (talk) 03:35, 2 May 2009 (UTC)

Ok I have added that. My reasoning was to prove (by integration) that
$\Pr(X_1<X_2)=\frac{\lambda_1}{\lambda_1+\lambda_2}$
and then use induction over $\min\{\min\{X_1,\ldots,X_k\},X_{k+1}\}$ (Igny (talk) 18:28, 2 May 2009 (UTC))
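The result in this thread, that variable k attains the minimum with probability λ_k divided by the sum of all rates, can be checked by simulation. This is a sketch with made-up rates; the function name is hypothetical.

```python
import random

def argmin_frequencies(rates, trials, rng):
    """Estimate how often each independent exponential variable is the smallest."""
    counts = [0] * len(rates)
    for _ in range(trials):
        draws = [rng.expovariate(r) for r in rates]
        counts[draws.index(min(draws))] += 1
    return [c / trials for c in counts]

rng = random.Random(2)                       # fixed seed for repeatability
rates = [1.0, 2.0, 3.0]                      # illustrative rates
freqs = argmin_frequencies(rates, 200_000, rng)
expected = [r / sum(rates) for r in rates]   # lambda_k / (lambda_1 + ... + lambda_n)
print(freqs)
print(expected)
```

The empirical frequencies should agree with the closed form to within sampling error.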

## confusion between gamma and phase-type distributions

The article currently says "Both an exponential distribution and a gamma distribution are special cases of the phase-type distribution"

but gamma distributions are only phase-type when the shape parameter (k) is an integer. —Preceding unsigned comment added by 66.184.77.15 (talk) 18:16, 4 March 2011 (UTC)

## About the confidence interval section

I am not a mathematician. And I was shocked to see that the section about the confidence interval includes a link to a geology journal. Can any expert comment and/or revert the change? --Jbarcelo (talk) 10:28, 15 April 2011 (UTC)

Why are you shocked that this article references a paper in a geology journal? The Journal of Structural Geology is peer-reviewed and published by Elsevier, a major scientific publishing house, so appears to be a reliable source. I agree, however, that the content of this new section does not appear to be very useful, as the exact CI is given in the section exponential distribution#Maximum likelihood and is quite straightforward, involving nothing more complicated than percentile points of the chi-squared distribution. Suggest removal of this new material (added on 30 March by an IP user in Rome with no other edits) if no-one objects. Qwfp (talk) 13:41, 15 April 2011 (UTC)

## Relationship To Poisson Should Be Made Explicit

The article states there is a relationship between the exponential and Poisson distributions, but I don't see a passage that states this relationship precisely. I think it would be best to state the relationship explicitly rather than leave readers to infer it from the traditional use of the symbol lambda in both distributions.

Tashiro (talk) 15:10, 18 October 2011 (UTC)
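One standard way to state the relationship: if the gaps between successive events are independent exponential variables with rate λ, then the number of events in an interval of length T follows a Poisson distribution with mean λT. A simulation sketch under that assumption, with illustrative values and a hypothetical function name:

```python
import math
import random

def count_arrivals(lam, horizon, rng):
    """Count events in [0, horizon] when the gaps between events are Exp(lam)."""
    t, k = 0.0, 0
    while True:
        t += rng.expovariate(lam)
        if t > horizon:
            return k
        k += 1

rng = random.Random(4)                     # fixed seed for repeatability
lam, horizon, trials = 3.0, 1.0, 100_000   # illustrative values
counts = [count_arrivals(lam, horizon, rng) for _ in range(trials)]

mean = sum(counts) / trials
var = sum((c - mean) ** 2 for c in counts) / trials
p_zero = counts.count(0) / trials
print(mean, var)                           # both near lam * horizon for a Poisson count
print(p_zero, math.exp(-lam * horizon))    # P(no events) against exp(-lam * T)
```

The last line ties the two distributions together directly: seeing no events by time T is the same event as the first waiting time exceeding T.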

## Entropy confusion

I'm puzzled about the expression for entropy, 1 - ln(lambda). The number 1 has no units, but lambda does, its units are the inverse of those for x. Since we can't add two quantities with different units, what is the meaning of this entropy expression? — Preceding unsigned comment added by 173.160.49.201 (talk) 05:08, 13 December 2011 (UTC)

Interesting observation. The entropy measures information content in natural units (counts bits), so lambda should not possess a physical dimension. I think the best way is to interpret lambda and x as dimensionless numbers, which happen to represent the number of expected events per unit interval (lambda) and multiples of the unit interval (x). --131.220.161.244 (talk) 14:23, 8 May 2013 (UTC)
This is differential entropy, not ordinary entropy which has units of bits or nats (or bans or other such units). Since the pdf is continuous, the amount of information carried by the random variable is infinite – i.e., it is impossible to represent the random variable perfectly using any code with finite expected length. At least I think that's the case – not 100% sure. Please double-check. —2001:4898:1A:2:5044:C97C:EBFE:DF8B (talk) 16:25, 8 May 2013 (UTC)
I just thought a bit more about this. It is true that when the pdf is continuous, the amount of information carried by the random variable is infinite. Thus, some (expected) distortion must always be introduced when representing the random variable, and as the expected amount of introduced distortion becomes smaller and smaller, the necessary expected bit rate will increase to infinity. However, there is a limit on the necessary expected bit rate as a function of expected distortion (for non-zero distortion). Please see the rate–distortion theory article. The differential entropy acts as a constant offset in determining the necessary bit rate (or nat rate) as a function of the distortion in the limit as the distortion becomes infinitely small – at least when distortion is measured by mean squared error. The fact that it acts as a constant offset may explain why it is called differential entropy. Since the offset is an offset to the necessary expected bit rate, which is in units of bits (or nats), the differential entropy can, in some sense, be interpreted as having units of bits (or nats). In the formula used in this article, it seems to be using nats —2001:4898:0:FFF:0:5EFE:A53:2043 (talk) 18:44, 8 May 2013 (UTC)
When it comes to thinking about the relationship between the bit rate (or differential entropy) and λ (or the mean μ = 1/λ) the thing to notice is that as λ becomes smaller (and the mean μ becomes larger), the pdf of the random variable becomes more dispersed – thus, as μ increases, the expected bit rate necessary to represent the random variable within a given expected (infinitely-small amount of) distortion must also increase. As the mean μ increases, the values of the random variable become more dispersed, the differential entropy h = 1 + ln(μ) increases, and the necessary rate in nats also increases by the same amount that h increases. —2001:4898:1A:2:5044:C97C:EBFE:DF8B (talk) 19:12, 8 May 2013 (UTC)
I think it is not correct to say that λ and x should be interpreted as dimensionless numbers. For example, if we estimate that babies are born in a particular hospital ward at a rate of approximately one every 30 minutes, but we think the birth events are otherwise rather unpredictable, we can model the expected time between births as being exponentially distributed. We then have μ = 1/λ = 30 minutes per birth on average, so μ is in units of minutes per birth, and λ is in units of babies being born per minute, and x is in units of minutes between births. (Here, the birth process is modelled as a Poisson process.) —2001:4898:E0:2019:B581:94D:4C40:2BDC (talk) 19:39, 8 May 2013 (UTC)
the question of the OP still stands open...--92.228.199.196 (talk) 21:23, 9 May 2013 (UTC)
What does OP mean? What question is this referring to? —2001:4898:E0:2019:48B:5779:5448:A8D1 (talk) 22:45, 9 May 2013 (UTC)
it is the question of the original poster that started this section (the lines below Entropy Confusion)...--131.220.161.244 (talk) 08:33, 10 May 2013 (UTC)
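A worked derivation may help with the units question. It assumes the density $f(x)=\lambda e^{-\lambda x}$ on $x \ge 0$ and entropy measured in nats; the scale factor $c$ is a symbol introduced here for illustration.

```latex
h(X) = -\int_0^\infty f(x)\,\ln f(x)\,dx
     = -\operatorname{E}\!\left[\ln\lambda - \lambda X\right]
     = -\ln\lambda + \lambda\cdot\frac{1}{\lambda}
     = 1 - \ln\lambda .

% Changing the unit of measurement, Y = cX with c > 0, gives rate \lambda / c:
h(Y) = 1 - \ln\!\left(\frac{\lambda}{c}\right) = h(X) + \ln c .
```

So the numerical value of the differential entropy shifts by $\ln c$ when the unit changes. On this reading, $\ln\lambda$ is shorthand for the logarithm of the numerical value of λ in whatever unit has been chosen, and only differences of differential entropies are unit-free.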

## Memoryless distributions

In this article it states in the Properties section on Memorylessness: The exponential distributions and the geometric distributions are the only memoryless probability distributions

I know the exponential is the only memoryless continuous distribution, but in the Geometric distribution article it states: The geometric distribution is by far the most common memoryless discrete distribution.

Implying there are more?

146.232.65.201 (talk) 19:11, 5 September 2013 (UTC)
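The memoryless property itself, P(X > s + t | X > s) = P(X > t), is easy to check for both families from their survival functions. A minimal sketch, assuming the geometric distribution is the one supported on 1, 2, 3, ... (the values and function names are illustrative):

```python
import math

def exp_survival(x, lam):
    """P(X > x) for an exponential variable with rate lam."""
    return math.exp(-lam * x)

def geom_survival(k, p):
    """P(X > k) for a geometric variable on 1, 2, 3, ... with success probability p."""
    return (1.0 - p) ** k

lam, s, t = 0.7, 1.3, 2.1     # illustrative values
exp_ok = math.isclose(exp_survival(s + t, lam) / exp_survival(s, lam),
                      exp_survival(t, lam))

p, m, n = 0.25, 3, 4          # illustrative values
geom_ok = math.isclose(geom_survival(m + n, p) / geom_survival(m, p),
                       geom_survival(n, p))
print(exp_ok, geom_ok)
```

This only confirms that both families satisfy the property; it does not by itself settle whether other discrete distributions do.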

## The MLE for $\lambda$ is NOT an unbiased estimator!

I have now removed a passage from the /* Maximum likelihood */ section which stated that the MLE is an unbiased estimator of $\lambda$. This is wrong: the proof presented relied on the step E(1/x)=1/E(x), which is false, as follows from applying Jensen's inequality to 1/x.

At some point I would like to expand on this section to show why the estimator is biased. If someone wishes to help, I have a Word file with the proof (written in MathType), but I am not sure if it should be added to the article or somewhere else. Thoughts?

Tal Galili (talk) 19:35, 4 March 2014 (UTC)
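The bias can be seen in a quick simulation. For a sample of size n ≥ 2 from a rate-λ exponential, the MLE n / Σxᵢ has expectation nλ/(n − 1), so multiplying by (n − 1)/n removes the bias. A sketch with illustrative values and a hypothetical function name:

```python
import random

def mle_rate(sample):
    """Maximum likelihood estimate of the rate: n divided by the sample total."""
    return len(sample) / sum(sample)

rng = random.Random(3)                    # fixed seed for repeatability
lam, n, trials = 2.0, 5, 200_000          # illustrative values

estimates = [mle_rate([rng.expovariate(lam) for _ in range(n)])
             for _ in range(trials)]
avg_mle = sum(estimates) / trials
avg_corrected = avg_mle * (n - 1) / n     # average of the estimator (n - 1) / sum(x)

print(avg_mle)                            # near n * lam / (n - 1), above lam
print(avg_corrected)                      # near lam
```

The overshoot is largest for small samples and vanishes as n grows, consistent with the MLE being biased but consistent.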

## Should the summary table at the top right include Exp(\lambda) as notation for this distribution?

Should the summary table at the top right include Exp(\lambda) as notation for this distribution? Other distributions seem to include the standard notation in the summary table. — Preceding unsigned comment added by Craniator (talk • contribs) 15:29, 18 March 2014 (UTC)