Talk:Log-normal distribution

WikiProject Statistics (Rated B-class, Top-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

This article has been rated as B-Class on the quality scale.
This article has been rated as Top-importance on the importance scale.

Derivation of Log-normal Distribution

How do you derive the log-normal distribution from the normal distribution?

By letting X ~ N(\mu, \sigma^2) and finding the distribution of Y = exp X.

D. Clason — Preceding unsigned comment added by 128.123.198.136 (talk) 00:21, 11 November 2011 (UTC)
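
The answer above can be spelled out as a short change-of-variables sketch. It assumes X ~ N(\mu, \sigma^2) with \sigma > 0, writes \Phi for the standard normal cdf, and uses the names F_Y and f_Y (introduced here only for illustration) for the cdf and density of Y = exp X:

```latex
\begin{align}
F_Y(y) &= P(e^X \le y) = P(X \le \ln y)
        = \Phi\!\left(\frac{\ln y - \mu}{\sigma}\right), \qquad y > 0, \\
f_Y(y) &= \frac{d}{dy} F_Y(y)
        = \frac{1}{y\,\sigma\sqrt{2\pi}}
          \exp\!\left(-\frac{(\ln y - \mu)^2}{2\sigma^2}\right).
\end{align}
```

The factor 1/y comes from differentiating \ln y; for y \le 0 the density is zero because exp X is always positive.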

Old talk

Hello. I have changed the intro from "log-normal distributions" to "log-normal distribution". I do understand the notion that, for each pair of values (mu, sigma), it is a different distribution. However, common parlance is to call all the members of a parametric family by a collective name -- normal distribution, beta distribution, exponential distribution, .... In each case these terms denote a family of distributions. This causes no misunderstandings, and I see no advantage in abandoning that convention. Happy editing, Wile E. Heresiarch 03:42, 8 Apr 2004 (UTC)

In the formula for the maximum likelihood estimate of the logsd, shouldn't it be over n-1, not n?

Unless you see an error in the math, I think it's OK. The n-1 term usually comes in when doing unbiased estimators, not maximum likelihood estimators.
You're right; I was confused.

QUESTION: Shouldn't there be a square root at the ML estimation of the standard deviation? User:flonks

Right - I fixed it, thanks. PAR 09:15, 27 September 2005 (UTC)
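
For reference, a minimal Python sketch of the two points raised in this thread, assuming the sample is positive and i.i.d. log-normal. The function name `lognormal_mle` is made up for illustration and is not taken from the article. The variance of the logarithms is divided by n (the maximum likelihood choice, not the unbiased n-1 version), and the square root turns it into an estimate of \sigma.

```python
import math

def lognormal_mle(sample):
    """Return (mu_hat, sigma_hat), the ML estimates of the log-scale parameters."""
    logs = [math.log(x) for x in sample]   # work with the normally distributed logs
    n = len(logs)
    mu_hat = sum(logs) / n
    # Divide by n, not n - 1: this is the ML estimate, which is biased.
    var_hat = sum((v - mu_hat) ** 2 for v in logs) / n
    # The square root gives the estimate of sigma rather than sigma^2.
    return mu_hat, math.sqrt(var_hat)
```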

Could I ask a question?

If Y = a^2 and a is a log normal distribution, then what kind of distribution is Y?

a is a lognormal distribution
so log(a) is a normal distribution
log(a^2) = 2 log(a) is also a normal distribution
a^2 is a lognormal distribution --Buglee 00:47, 9 May 2006 (UTC)

One should say rather that a has---not is---a lognormal distribution. The object called a is a random variable, not a probability distribution. Michael Hardy 01:25, 9 May 2006 (UTC)
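
The argument above can be checked numerically. This is only a sketch under the assumption that a has log-scale parameters (mu, sigma), so that a^2 has parameters (2 mu, 2 sigma); the helper names `lognormal_cdf` and `square_params` are hypothetical.

```python
import math
from statistics import NormalDist

def lognormal_cdf(x, mu, sigma):
    """CDF at x > 0 of a variable whose natural log is N(mu, sigma^2)."""
    return NormalDist(mu, sigma).cdf(math.log(x))

def square_params(mu, sigma):
    """Log-scale parameters of a^2: log(a^2) = 2 log(a) is N(2 mu, (2 sigma)^2)."""
    return 2.0 * mu, 2.0 * sigma
```

Since P(a^2 <= y) = P(a <= sqrt(y)), the value `lognormal_cdf(math.sqrt(y), mu, sigma)` should agree with `lognormal_cdf(y, *square_params(mu, sigma))` for any y > 0.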


Maria 13 Feb 2007: I've never written anything in Wikipedia, so I apologise if I am doing the wrong thing. I wanted to note that the following may not be clear to the reader: in the formulas, E(X)^2 represents the square of the mean, rather than the second moment. I would suggest one of the following solutions: 1) skip the parentheses around X and represent the mean by EX. Then it is clear that (EX)^2 will be its square. However, one might wonder about EX^2 (which should represent the second moment...) 2) skip the E operator and put a letter there, i.e. let m be the mean and s the standard deviation. Then there will be no confusion. 3) add a line at some point in the text giving the notation: i.e. that by E(X)^2 you mean the square of the first moment, while the second moment is denoted by E(X^2) (I presume). I had to invert the formula myself in order to figure out what it is supposed to mean.

I've just attended to this. Michael Hardy 00:52, 14 February 2007 (UTC)

A mistake?

I think there is a mistake here: the density function should include a term in sigma squared divided by two, and the mean of the log normal variable becomes mu - sigma^2/2. Basically, I think the author forgot the Ito term.

I believe the article is correct. See for example http://mathworld.wolfram.com/LogNormalDistribution.html for an alternate source of the density function and the mean. They are the same as shown here, but with a different notation. (M in place of mu and S in place of sigma). Encyclops 00:23, 4 February 2006 (UTC)
Either the graph of the density function is wrong, or the expected value formula is wrong. As you can see from the graph, as sigma decreases, the expected value moves towards 1 from below. This is consistent with the mean being exp(mu - sigma^2/2), which is what I recall it as. 69.107.6.4 19:29, 5 April 2007 (UTC)
Here's your mistake. You cannot see the expected value from the graph at all. It is highly influenced by the fat upper tail, which the graph does not make apparent. See also my comments below. Michael Hardy 20:19, 5 April 2007 (UTC)

I've just computed the integral and I get

e^{\mu + \sigma^2/2}.\,

So with μ = 0, as σ decreases to 0, the expected value decreases to 1. Thus it would appear that the graph is wrong. Michael Hardy 19:57, 5 April 2007 (UTC)

...and now I've done some graphs by computer, and they agree with what the illustration shows. More later.... Michael Hardy 20:06, 5 April 2007 (UTC)

OK, there's no error. As the mode decreases, the mean increases, because the upper tail gets fatter! So the graphs and the mean and the mode are correct. Michael Hardy 20:15, 5 April 2007 (UTC)

You're right. My mistake. The mean is highly influenced by the upper tail, so the means are actually decreasing to 1 as sigma decreases. It just looks like the means approach from below because the modes do. 71.198.244.61 23:50, 7 April 2007 (UTC)

Question on the example charts on the right. Don't these have μ of 1, not 0 (as listed)? They're listed as 1. If the cdf hits 0.5 at 1 for all of them, shouldn't expected value be 1? —Preceding unsigned comment added by 12.17.237.67 (talk) 18:28, 15 December 2008 (UTC)

The expected value is e^{\mu+\sigma^2/2}, not μ. /Pontus (talk) 19:19, 16 December 2008 (UTC)
Yet the caption indicates the underlying µ is held fixed at 0. In which case we should see the expected value growing with sigma. —Preceding unsigned comment added by 140.247.249.76 (talk) 09:13, 29 April 2009 (UTC)
Expected value is not the value y at which P[X<y] = P[X > y]. Rookie mistake. —Preceding unsigned comment added by 140.247.249.76 (talk) 09:24, 29 April 2009 (UTC)
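
To make the resolution of this thread concrete, here is a small Python sketch that tabulates the mode, median and mean for \mu = 0 and shrinking \sigma. It uses the standard closed forms exp(\mu - \sigma^2), exp(\mu) and exp(\mu + \sigma^2/2); the function name `lognormal_summary` is invented for this illustration.

```python
import math

def lognormal_summary(mu, sigma):
    """Return (mode, median, mean) for log-scale parameters mu and sigma."""
    mode = math.exp(mu - sigma ** 2)        # peak of the density
    median = math.exp(mu)                   # point where the cdf equals 1/2
    mean = math.exp(mu + sigma ** 2 / 2.0)  # pulled upward by the fat upper tail
    return mode, median, mean

# With mu = 0 the median stays at 1 while mode and mean approach it from opposite sides.
for sigma in (1.0, 0.5, 0.25, 0.125):
    mode, median, mean = lognormal_summary(0.0, sigma)
    print(f"sigma={sigma}: mode={mode:.4f} median={median:.4f} mean={mean:.4f}")
```

The ordering mode < median < mean holds for every \sigma > 0, which is why the cdf curves all cross 0.5 at x = 1 even though none of them has expected value 1.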

A Typo

There is a typo in the PDF formula: a missing '['.

Erf and normal cdf

There are formulas that use Erf and formulas that use the cdf of the normal distribution. IMHO this is confusing, because those functions are related but not identical. Albmont 15:02, 23 August 2006 (UTC)
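
For readers switching between the two notations, the relation is \Phi(x) = (1 + erf(x/\sqrt{2}))/2, or equivalently erf(x) = 2\Phi(x\sqrt{2}) - 1. A short Python sketch using only the standard library (the function names are made up here):

```python
import math
from statistics import NormalDist

def phi_from_erf(x):
    """Standard normal cdf expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def erf_from_phi(x):
    """Error function expressed through the standard normal cdf."""
    return 2.0 * NormalDist().cdf(x * math.sqrt(2.0)) - 1.0
```

The \sqrt{2} is the only place the two forms differ, so a formula written with one can be converted mechanically to the other.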

Technical

Please remember that Wikipedia articles need to be accessible to people like high school students, or younger, or without any background in math. I consider myself rather knowledgeable in math (had it at college level, and still do) but (taking into account English is not my native language) I found the lead to this article pretty difficult. Please make it more accessible.-- Piotr Konieczny aka Prokonsul Piotrus | talk  22:48, 31 August 2006 (UTC)

To expect all Wikipedia math articles to be accessible to high-school students is unreasonable. Some can be accessible only to mathematicians; perhaps more can be accessible to a broad audience of professionals who use mathematics; others to anyone who's had a couple of years of calculus and no more; others to a broader audience still. Anyone who knows what the normal distribution is, what a random variable is, and what logarithms are, will readily understand the first sentence in this article. Can you be specific about what it is you found difficult about it? Michael Hardy 23:28, 31 August 2006 (UTC)

I removed the "too technical" tag. Feel free to reinsert it, but please leave some more details about what specifically you find difficult to understand. Thanks, Lunch 22:18, 22 October 2006 (UTC)

Skewness formula incorrect?

The formula for the skewness appears to be incorrect: the leading exponent term you have is not present in the definitions given by Mathworld and NIST, see http://www.itl.nist.gov/div898/handbook/eda/section3/eda3669.htm and http://mathworld.wolfram.com/LogNormalDistribution.html.

Many thanks.

X log normal, not normal.

I think the definition of X as normal and Y as lognormal in the beginning of the page should be changed. The rest of the page treats X as the log normal variable. —The preceding unsigned comment was added by 213.115.25.62 (talk) 17:40, 2 February 2007 (UTC).

The skewness is fine but the kurtosis is wrong - the last term in the kurtosis is -3 not -6 —Preceding unsigned comment added by 129.31.242.252 (talk) 02:08, 17 February 2009 (UTC)

Yup I picked up that mistake too and have changed it. The wolfram website also has it wrong, although if you calculate it from their central moments you get -3. I've sent them a message too. Cheers Occa —Preceding unsigned comment added by Occawen (talk • contribs) 21:19, 1 December 2009 (UTC)

Partial expectation

I think that there was a mistake in the formula for the partial expectation: the last term should not be there. Here is a proof: http://faculty.london.edu/ruppal/zenSlides/zCH08%20Black-Scholes.slide.doc See Corollary 2 in Appendix A 2.

I did put my earlier correction back in. Of course, I may be wrong (but, right now, I don't see why). If you change this again, please let me know why I was wrong. Thank you.

Alex —The preceding unsigned comment was added by 72.255.36.161 (talk) 19:39, 27 February 2007 (UTC).

Thanks. I see the problem. You have the correct expression for

g(k)=\int_k^\infty x f(x)\, dx

while what I had there before would be correct if we were trying to find

g_2(k)=\int_k^\infty (x-k) f(x)\, dx

which is (essentially) the B-S formula but is not the partial mean (or partial expectation) by my (or your) definition. (Actually I did find a few sources where the partial expectation is defined as g_2 but this usage seems to be rare. For ex. [1]). The term that you dropped occurs in g_2(k) but not g(k), the correct form of the partial mean. So I will leave the formula as it is now. Encyclops 00:47, 28 February 2007 (UTC)

  • The rest of the page uses \operatorname{erf} rather than \Phi, I suggest that also be used here (in addition to \Phi which is a nice way to put it). I didn't add it myself since with 50 percent probability I'd botch a \sqrt{2}. --Tom3118 (talk) 19:29, 16 June 2009 (UTC)
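
As a rough illustration of the two quantities distinguished in this thread, and of the erf form suggested just above, here is a Python sketch. It assumes the closed form g(k) = exp(\mu + \sigma^2/2) \Phi((\mu + \sigma^2 - \ln k)/\sigma) and the identity g_2(k) = g(k) - k P(X > k); the function names are hypothetical.

```python
import math

def std_normal_cdf(z):
    """Phi(z) written with erf: Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def g(k, mu, sigma):
    """Partial expectation: the integral of x f(x) over [k, infinity)."""
    z = (mu + sigma ** 2 - math.log(k)) / sigma
    return math.exp(mu + sigma ** 2 / 2.0) * std_normal_cdf(z)

def g2(k, mu, sigma):
    """Integral of (x - k) f(x) over [k, infinity), i.e. g(k) - k * P(X > k)."""
    tail = std_normal_cdf((mu - math.log(k)) / sigma)  # P(X > k)
    return g(k, mu, sigma) - k * tail
```

The extra term k * P(X > k) is what separates the Black-Scholes style expression g_2 from the partial expectation g.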

Generalize distribution of product of lognormal variables

About the distribution of a product of independent log-normal variables:

Wouldn't it be possible to generalize it to variables with different averages (mu NOT the same for every variable)?
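
A sketch of the generalization asked about here: if the factors are independent with parameters (\mu_i, \sigma_i), the logarithm of the product is a sum of independent normals, so the product is log-normal with parameters (\sum \mu_i, \sqrt{\sum \sigma_i^2}). The Python function below just encodes that statement; its name is made up, and independence of the factors is an assumption.

```python
import math

def product_params(params):
    """Log-scale (mu, sigma) of a product of independent log-normal factors.

    params: iterable of (mu_i, sigma_i) pairs, one per factor.
    """
    params = list(params)
    mu = sum(m for m, _ in params)         # means of the logs add
    var = sum(s ** 2 for _, s in params)   # variances of the logs add (independence)
    return mu, math.sqrt(var)
```

One consistency check is that exp(mu + sigma^2/2) for the product equals the product of the individual means, as it must for independent factors.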

the name: log vs exponential

The name "log normal" is sometimes a little confusing to me, so here is a short note:

For a variable Y, if X=log(Y) is normal, then Y is log normal; that is, after taking the log, it becomes normal. Similarly, there might be an "exponential normal": a variable Z for which exp(Z) is normal. However, exp(Z) can never be normal, hence the name log normal. Furthermore, if X is normal, then log(X) is undefined.

In other cases, where a variable X has some distribution (XXX), we need a name for the distribution of Y=log(X) (in the case it is defined), so that X=exp(Y). Such a name should be "exponential XXX". For instance, if X is IG, then Y=log(X) is exponential IG. Jackzhp 15:37, 13 July 2007 (UTC)

Mean, \mu and \sigma

The relationship given for \mu in terms of Var(x) and E(x) suggests that \mu is undefined when E(x)\le 0. However, I see no reason why E(x) must be strictly positive. I propose defining the relationship in terms of E^2(x) such that

\mu = \frac{1}{2}\left(\ln\left(E^2(x)\right) - \sigma^2\right)

I am suspicious that this causes \mu to be...well, wrong. It suggests that two different values for E(x) could result in the same \mu, which I find improbable. In any case, if there is a way to calculate \mu when E(x)\le 0 then we should include it; if not, we need to explain this subtlety. In my humble opinion.--Phays 20:35, 6 August 2007 (UTC)

I'm not fully following your comment. I have now made the notation consistent throughout the article: X is the random variable that's log-normally distributed, so E(X) must of course be positive, and μ = E(Y) = E(log(X)).
I don't know what you mean by "E^2". It's as if you're squaring the expectation operator. "E^2(X)" would mean "E(E(X))", but that would be the same thing as E(X), since E(X) is a constant. Michael Hardy 20:56, 6 August 2007 (UTC)
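
For reference, a Python sketch of the conversion under discussion, assuming the standard relations \sigma^2 = \ln(1 + Var(X)/E(X)^2) and \mu = \ln E(X) - \sigma^2/2. The function names are invented for illustration. Because X is positive, E(X) > 0 always holds, so the sketch rejects non-positive means rather than trying to handle them.

```python
import math

def params_from_moments(mean, variance):
    """Recover (mu, sigma) from the mean and variance of a log-normal X."""
    if mean <= 0 or variance <= 0:
        raise ValueError("a log-normal variable has positive mean and variance")
    sigma2 = math.log(1.0 + variance / mean ** 2)
    mu = math.log(mean) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)

def moments_from_params(mu, sigma):
    """Mean and variance of X when log(X) is N(mu, sigma^2)."""
    mean = math.exp(mu + sigma ** 2 / 2.0)
    variance = (math.exp(sigma ** 2) - 1.0) * math.exp(2.0 * mu + sigma ** 2)
    return mean, variance
```

Applying one function after the other should return the starting values, which is a quick way to catch a misplaced factor of 2.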

Maximum Likelihood Estimation

Are there mistakes in the MLE? It looks to me as though the provided method is an MLE for the mean and variance, not for the parameters \mu and \sigma. If that is so it should be changed to the parameters estimated \hat{E}(x) and \hat{Var}(x) and then a redirect to extracting the parameter values from the mean and variance.--Phays 20:40, 6 August 2007 (UTC)

The MLEs given for μ and σ^2 are not for the mean and variance of the log-normal distribution, but for the mean and variance of the distribution of the normally distributed logarithm of the log-normally distributed random variable. They are correct MLEs for μ and σ^2. The "functional invariance" of MLEs generally is being relied on here. Michael Hardy 20:47, 6 August 2007 (UTC)
I'm afraid I still don't fully understand, but it is simple to explain my confusion. Are the parameters being estimated μ and σ^2 from
f(x;\mu,\sigma) = \frac{e^{-(\ln x - \mu)^2/(2\sigma^2)}}{x \sigma \sqrt{2 \pi}}
or are these estimates describing the mean and variance? In other words, if X is N(\mu_{n}, \sigma_{n}) and Y = exp(X) then is E(Y)=\mu\approx \bar{\mu}? It is my understanding that the parameters in the above equation, namely μ and σ, are not the mean and standard deviation of Y. They may be the mean and standard deviation of X.--Phays 01:16, 7 August 2007 (UTC)
The answer to your first question is affirmative. The expected value of Y = exp(X) is not μ; its value is given elsewhere in the article. Michael Hardy 16:10, 10 August 2007 (UTC)
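
The "functional invariance" point can be illustrated with a short Python sketch: the ML estimates of \mu and \sigma^2 are computed from the logs of the data, and the ML estimate of the log-normal mean is then obtained by plugging them into exp(\mu + \sigma^2/2). The function name is hypothetical, and the sample is assumed positive and i.i.d. log-normal.

```python
import math

def mle_of_lognormal_mean(sample):
    """ML estimate of E(Y) for log-normal data, via invariance of the MLE."""
    logs = [math.log(y) for y in sample]
    n = len(logs)
    mu_hat = sum(logs) / n                                  # MLE of mu
    sigma2_hat = sum((v - mu_hat) ** 2 for v in logs) / n   # MLE of sigma^2
    # Plug the parameter estimates into the formula for the mean of Y.
    return math.exp(mu_hat + sigma2_hat / 2.0)
```

In general this value differs from the plain sample mean of the data, which is one way to see that \mu and \sigma describe the logarithm rather than Y itself.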

8/10/2007:

It is my understanding that confidence intervals use standard error of a population in the calculation not standard deviation (sigma).

Therefore I do not understand how the table is using 2sigma etc. for confidence interval calculation as it pertains to the log normal distribution.

Why is it shown as 2*sigma?

Angusmdmclean 12:35, 10 August 2007 (UTC) angusmdmclean


Hi. The formula relating the density of the log normal to that of the normal -- where does the product come from on the r.h.s.? I think this is a typo. It should read: f_L = {1\over x} \times f_N, no?

This page lacks adequate citations!!

Wikipedia policy (see WP:CITE#HOW) suggests citation of specific pages in specific books or peer-reviewed articles to support claims made in Wikipedia. Surely this applies to mathematics articles just as much as it does to articles about history, TV shows, or anything else?

I say this because I was looking for a formula for the partial expectation of a lognormal variable, and I was delighted to discover that this excellent, comprehensive article offers one. But how am I supposed to know if the formula is correct? I trust the competence of the people who wrote this article, but how can I know whether or not some mischievous high schooler reversed a sign somewhere? I tried verifying the expectation formula by calculating the integral myself, but I got lost quickly (sorry! some users of these articles are less technically adept than the authors!) I will soon go to the library to look for the formula (the unconditional expectation appears in some books I own, but not the partial expectation) but that defeats the purpose of turning to Wikipedia in the first place.

Of course, I am thankful that Wikipedia cites one book specifically on the lognormal distribution (Aitchison and Brown 1957). That reference may help me when I get to the library. But I'm not sure if that was the source of the formula in question. My point is more general, of course. Since Wikipedia is inevitably subject to errors and vandalism, math formulas can never be trusted, unless they follow in a highly transparent way from prior mathematical statements in the same article. Pages like this one would be vastly more useful if specific mathematical statements were backed by page-specific citations of (one or preferably more) books or articles where they could be verified. --Rinconsoleao 15:11, 28 September 2007 (UTC)

Normally I do not do this because I think it is rude, but I really should say {{sofixit}} because you are headed to the library and will be able to add good cites. Even if we had a good source for it, the formula could still be incorrect due to vandalism or transcription errors. Such is the reality of Wikipedia. Can you write a program to test it, perhaps? Acct4 15:23, 28 September 2007 (UTC)
I believe Aitchison and Brown does have that formula in it, but since I haven't looked at that book in many years I wouldn't swear by it. I will have to check. I derived the formula myself before adding it to Wikipedia; unfortunately there was a slip-up in my post which was caught by an anonymous user and corrected. FWIW, at this point I have a near 100% confidence in its correctness. And I am watching this page for vandalism or other problems. In general your point is a good one. Encyclops 22:34, 28 September 2007 (UTC)

Why has nobody mentioned whether the mean and standard deviation are calculated from x or y? If y = exp(x), then the mean and stdev are from the x values. Book by Athanasios Papoulis. Siddhartha, here. —Preceding unsigned comment added by 203.199.41.181 (talk) 09:26, 2 February 2008 (UTC)

Derivation of Partial Expectation

As requested by Rinconsoleao and others, here is a derivation of the partial expectation formula. It is tedious, so I do not include it in the article itself.

We want to find

g(k)=\int_k^{\infty} x f(x) dx

where f(x) is the lognormal distribution

f(x)=\frac{1}{x \sigma \sqrt{2 \pi}}\exp\left(-\frac{(\ln x -\mu)^2}{2 \sigma^2}\right)

so we have

g(k)=\int_k^{\infty} \frac{1}{\sigma \sqrt{2 \pi}} \exp\left(-\frac{(\ln x -\mu)^2}{2 \sigma^2}\right)dx

Make a change of variables

y = \frac{\ln x - \mu}{\sigma} and dx=\sigma \exp(\sigma y + \mu) dy giving

\int_{y=(\ln k-\mu)/\sigma}^{\infty} \frac{1}{\sigma \sqrt{2 \pi}}\exp(-\frac{1}{2}y^2)\sigma\exp(\sigma y +\mu)dy

combine the exponentials together

\int \frac{1}{\sqrt{2 \pi}} \exp(-\frac{1}{2}y^2+\sigma y+\mu) dy

fix the quadratic by 'completing the square'

\int \frac{1}{\sqrt{2 \pi}} \exp[-\frac{1}{2}(y-\sigma)^2+(\mu+\frac{1}{2}\sigma^2)] dy

at this point we can pull out some stuff from the integral

g(k) = \exp(\mu+\frac{\sigma^2}{2}) \frac{1}{\sqrt{2 \pi}} \int_{y=(\ln k-\mu)/\sigma}^{\infty} \exp(-\frac{1}{2}(y-\sigma)^2) dy

one more change of variable

v=y-\sigma and dy=dv

gives

g(k)=\exp(\mu+\frac{\sigma^2}{2}) \frac{1}{\sqrt{2 \pi}} \int_{v=(\ln k-\mu)/\sigma-\sigma}^{\infty} \exp(-\frac{1}{2} v^2)dv

We recognize the integral and the fraction in front of it as the complement of the cdf of the std normal rv

g(k)=\exp\left(\mu+\frac{\sigma^2}{2}\right)\left[1-\Phi(\frac{\ln k -\mu-\sigma^2}{\sigma})\right]

using 1-\Phi(x)=\Phi(-x) we finally have

g(k)=\exp\left(\mu+\frac{\sigma^2}{2}\right) \Phi\left(\frac{-\ln k +\mu+\sigma^2}{\sigma}\right)

Regards, Encyclops (talk) 21:49, 29 August 2009 (UTC)
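
In the spirit of the earlier suggestion to write a program to test the formula, here is a Python sketch that compares the closed form above with a direct midpoint-rule integration. It integrates in the variable y = ln x and truncates the upper limit at a point assumed to be far enough into the tail; the function names, step count and truncation rule are illustrative choices, not part of the derivation.

```python
import math
from statistics import NormalDist

def partial_expectation(k, mu, sigma):
    """Closed form g(k) = exp(mu + sigma^2/2) * Phi((mu + sigma^2 - ln k) / sigma)."""
    z = (mu + sigma ** 2 - math.log(k)) / sigma
    return math.exp(mu + sigma ** 2 / 2.0) * NormalDist().cdf(z)

def partial_expectation_numeric(k, mu, sigma, steps=20_000):
    """Midpoint-rule estimate of the integral of x f(x) over [k, infinity).

    With y = ln x the integrand becomes exp(y) times the N(mu, sigma^2) density.
    """
    lo = math.log(k)
    hi = max(lo, mu + sigma ** 2) + 12.0 * sigma  # assumed truncation point
    h = (hi - lo) / steps
    normal = NormalDist(mu, sigma)
    total = 0.0
    for i in range(steps):
        y = lo + (i + 0.5) * h
        total += math.exp(y) * normal.pdf(y)
    return total * h
```

Agreement between the two functions for a few values of k, \mu and \sigma does not prove the formula, but it would catch a reversed sign or a dropped \sigma^2.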

examples for log normal distributions in nature/economy?

Some examples would be nice! —Preceding unsigned comment added by 146.113.42.220 (talk) 16:41, 8 February 2008 (UTC)

One example is neurological reaction time. This distribution has been seen in studies on automobile braking and other responses to stimuli. See also mental chronometry.--IanOsgood (talk) 02:32, 26 February 2008 (UTC)
This is also useful in telecom, in order to compute slow fading effects on a transmitted signal. -- 82.123.94.169 (talk) 14:42, 28 February 2008 (UTC)

I think the Black–Scholes Option model uses a log-normal assumption about the price of a stock. This makes sense, because it's the percentage change in the price that has real meaning, not the price itself. If some external event makes the stock price fall, the amount that it falls is not very important to an investor; it's the percent change that really matters. This suggests a log normal distribution. PAR (talk) 17:13, 28 February 2008 (UTC)

I recall reading on wiki that high IQs are log-normally distributed. Also, incomes (in a given country) are approximately as well. Elithrion (talk) 21:26, 2 November 2009 (UTC)

Parameters boundaries?

If the relationship between the log-normal distribution and the normal distribution is right, then I don't understand why \mu needs to be greater than 0 (since \mu is expected to be a real with no boundary in the normal distribution). At least, it can be null since it's the case with the graphs shown for the pdf and cdf (I've edited the article in consequence). Also, that's not \sigma that needs to be greater than 0, but \sigma^2 (which simply means that \sigma can't be null since it's a real number). -- 82.123.94.169 (talk) 15:04, 28 February 2008 (UTC)

Question: What can possibly be the interpretation of, say, \sigma=-3 as opposed to \sigma=3? By strong convention (and quite widely assumed in derivations) standard deviations are taken to be in the domain [0,\infty), although I suppose in this case algebraically \sigma can be negative... It's confusing to start talking about negative sds, and unless there's a good reason for it, please don't. --128.59.111.72 (talk) 22:59, 10 March 2008 (UTC)

Yes, you're right: \sigma can't be negative or null (it's also obvious reading the PDF formula). I was confused by the Normal Distribution article where only \sigma^2 is expected to be positive (which is also not sufficient there). Thanks for your answer, and sorry for that. I guess \mu can't be negative as well because that would be meaningless if it was (even if it would be mathematically correct). -- 82.123.102.83 (talk) 19:33, 13 March 2008 (UTC)

Logarithm Base

Although yes, any base is OK, the derivations and moments, etc. are all done assuming a natural logarithm. Although the distribution would still be lognormal in another base b, the details would all change by a factor of ln(b). A note should probably be added in this section, that we are using by convention the natural logarithm here. (And possibly re-mention it in the PDF.) --128.59.111.72 (talk) 22:59, 10 March 2008 (UTC)
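
To illustrate the factor of ln(b) mentioned above: if log_b(X) is N(m, s^2), then ln(X) = ln(b) log_b(X) is N(m ln b, (s ln b)^2). A minimal Python sketch, with a made-up function name and the assumption that the base is positive and not equal to 1:

```python
import math

def to_natural_log_params(m, s, base):
    """Convert base-b log parameters (m, s) to natural-log parameters (mu, sigma)."""
    factor = math.log(base)            # ln(b); log_b(x) = ln(x) / ln(b)
    return m * factor, s * abs(factor)  # abs keeps sigma positive when base < 1
```

For example, a variable whose base-10 logarithm has mean 2 has median 10^2 = 100, and the converted \mu satisfies exp(\mu) = 100 as well.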

Product of "any" distributions

I think it should be highlighted in the article that the Log-normal distribution is the analogue of the normal distribution in this way: if we take n independent distributions and add them we "get" the normal distribution (NB: here I am lazy on purpose, the precise idea is the Central Limit Theorem). If we take n positive independent distributions and multiply them, we "get" the log-normal (also lazy). Albmont (talk) 11:58, 5 June 2008 (UTC)

This is to some extent expressed (or at least suggested) where the article says "A variable might be modeled as log-normal if it can be thought of as the multiplicative product of many small independent factors". Perhaps it could be said better, but the idea is there. Encyclops (talk) 14:58, 5 June 2008 (UTC)
So we're talking about the difference between "expressed (or at least suggested)" on the one hand, and on the other hand "highlighted". Michael Hardy (talk) 17:39, 5 June 2008 (UTC)
Yes, the ubiquity of the log-normal in Finance comes from this property, so I think this property is important enough to deserve being stated in the initial paragraphs. Just MHO, of course. Albmont (talk) 20:39, 5 June 2008 (UTC)
The factors need to have small departure from 1 ... I have corrected this, but can someone think of a rephrasing for the bit about "the product of the daily return rates"? Is a "return rate" defined so as to be close to 1 (no profit =1) or close to zero (no profit=0)? Melcombe (talk) 13:49, 11 September 2008 (UTC)
The "return rate" should be the one "close to 1 (no profit == 1)." The author must be talking about discount factors rather than rates of return. Rates of return correspond to specific time periods and are therefore neither additive nor multiplicative. Returns are often thought of as normally distributed in finance, so the discount factor would be lognormally distributed. I'll fix this. Alue (talk) 05:14, 19 February 2009 (UTC)
Moreover, it would be nice to have a reference for this section. 188.97.0.158 (talk) 14:21, 4 September 2012 (UTC)

Why the pdf value would be greater than 1 in the pdf picture?

Why the pdf value would be greater than 1 in the pdf picture? Am I missing something here? I am really puzzled. —Preceding unsigned comment added by 208.13.41.5 (talk) 01:55, 11 September 2008 (UTC)

Why are you puzzled? When probability is concentrated near a point, the value of the pdf is large. That's what happens here. Could it be that you're confusing this situation with that of probability mass functions? Those cannot exceed 1, since their values are probabilities. The values of a pdf, however, are not generally probabilities. Michael Hardy (talk) 02:20, 11 September 2008 (UTC)
Just to put it another way, the area under a pdf is equal to one, not the curve itself. Encyclops (talk) 03:01, 11 September 2008 (UTC)

Now I am unpuzzled. Thanks ;-) —Preceding unsigned comment added by 208.13.41.5 (talk) 16:54, 11 September 2008 (UTC)
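
A small numerical illustration of the two replies above, as a Python sketch: with \mu = 0 and \sigma = 0.25 (values chosen here only as an example) the density at x = 1 exceeds 1, while the area under the curve is still 1. The integration helper is a plain midpoint rule and its name is invented.

```python
import math

def lognormal_pdf(x, mu, sigma):
    """Density of a variable whose natural log is N(mu, sigma^2)."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

def midpoint_integral(f, a, b, steps=50_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))
```

Evaluating `lognormal_pdf(1.0, 0.0, 0.25)` gives a height above 1, and `midpoint_integral(lambda x: lognormal_pdf(x, 0.0, 0.25), 0.0, 10.0)` should come out very close to 1.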


The moment problem

The article should really mention that the Log-normal distribution suffers from the Moment problem (see for example Counterexamples in Probability, Stoyanov). Basically, there exist infinitely many distributions that have the same moments as the LN, but have a different pdf. In fact (I think), there are also discrete distributions which have the same moments as the LN distribution. ColinGillespie (talk) 11:45, 30 October 2008 (UTC)

The moment generating function is defined as
 M_X(t) := E\left(e^{tX}\right), \quad t \in \mathbb{R},
On the whole domain R it does not exist, but for t=0 it certainly exists, and likewise for any t<0. So why don't we try to find the set on which it exists, i.e. the set {t: M_X(t) < infinity}? Jackzhp (talk) 14:39, 21 January 2009 (UTC)
The cumulant/moment generating function g(t) is convex, and 0 belongs to the set {t: g(t) < infinity}. If the interior of the set is not empty, then g(t) is analytic and infinitely differentiable there; on the set, g(t) is strictly convex and g'(t) is strictly increasing. Please edit Cumulant#Some_properties_of_cumulant_generating_function or moment generating function. Jackzhp (talk) 15:09, 21 January 2009 (UTC)

My edit summary got truncated

Here's the whole summary to my last edit:

Two problems: X is more conventional here, and the new edit fails to distinguish between capital for the random variable and lower case for the argument to the density function.

Michael Hardy (talk) 20:15, 18 February 2009 (UTC)

Are the plots accurate?

Something seems a bit odd with the plots. In particular the CDF plot appears to demonstrate that all the curves have a mean at about 1, but if the underlying parameter µ is held fixed, we should see P = 0.5 at around x=3 for sigma = 3/2; and at around 1.35 for sigma=1, and all the way at e^50 for sigma=10. The curves appear to have been plotted with the mean of the lognormal distribution fixed at (µ+σ^2/2)=1? ~confused~

Don't confuse the expected value with the point at which the probability is one-half. The latter is well-defined for the Cauchy distribution, while the former is not; thus although x=1 is the point at which all these distributions have P[x < 1] = 1/2; it's not the expected value. Hurr hurr, let no overtired idiots make this mistake again. (signed, Original Poster) —Preceding unsigned comment added by 140.247.249.76 (talk) 09:26, 29 April 2009 (UTC)
We had this discussion on this page in considerable detail before, a couple of years ago. Yes, they're accurate; they're also somewhat counterintuitive. Michael Hardy (talk) 16:40, 17 June 2009 (UTC)

I think it's worth pointing out that the formula in the code that generates the pdf plots is wrong. The numerator in the exponent is log(x-mu)^2, when it should be (log(x)-mu)^2. It doesn't actually change the plots, because they all use mu=0, but it's an important difference, in case someone else used and modified the code. Sorry if this isn't the place to discuss this - this is my first time discussing on Wikipedia. Crichardsns (talk) 01:25, 19 February 2011 (UTC)
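
The difference described above can be made concrete with a Python sketch. This is not the plotting code itself (whose language and exact form are not shown here); the two functions below only mirror the two exponents as described, with hypothetical names.

```python
import math

def exponent_correct(x, mu, sigma):
    """Exponent of the log-normal density: -(ln x - mu)^2 / (2 sigma^2)."""
    return -((math.log(x) - mu) ** 2) / (2.0 * sigma ** 2)

def exponent_as_reported(x, mu, sigma):
    """The form reported above: -(ln(x - mu))^2 / (2 sigma^2). Needs x > mu."""
    return -(math.log(x - mu) ** 2) / (2.0 * sigma ** 2)
```

With mu = 0 the two functions return the same value for every x > 0, which is why the plots are unaffected; with mu = 1 and x = 3, for instance, they disagree.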

The pdf plot is wrong if for no other reason (unless I've missed something important) because one of the curves exceeds 1 and can't be a proper pdf. —Preceding unsigned comment added by 203.141.92.14 (talk) 05:31, 1 March 2011 (UTC)


Nonsense about confidence intervals

I commented out this table:

Confidence interval bounds    log space         geometric
3σ lower bound                \mu - 3\sigma     \mu_\mathrm{geo} / \sigma_\mathrm{geo}^3
2σ lower bound                \mu - 2\sigma     \mu_\mathrm{geo} / \sigma_\mathrm{geo}^2
1σ lower bound                \mu - \sigma      \mu_\mathrm{geo} / \sigma_\mathrm{geo}
1σ upper bound                \mu + \sigma      \mu_\mathrm{geo} \sigma_\mathrm{geo}
2σ upper bound                \mu + 2\sigma     \mu_\mathrm{geo} \sigma_\mathrm{geo}^2
3σ upper bound                \mu + 3\sigma     \mu_\mathrm{geo} \sigma_\mathrm{geo}^3

The table has nothing to do with confidence intervals as those are normally understood. I'm not sure there's much point in doing confidence intervals for the parameters here as a separate topic from confidence intervals for the normal distribution.

Obviously, you cannot use μ and σ to form confidence intervals. They're the things you'd want confidence intervals for! You can't observe them. If you could observe them, what would be the point of confidence intervals? Michael Hardy (talk) 16:43, 17 June 2009 (UTC)

This edit was a colossal mistake that stood for almost five years!! Whoever wrote it didn't have a clue what confidence intervals are. Michael Hardy (talk) 16:49, 17 June 2009 (UTC)

Characteristic function

Roy Leipnik [1] obtained the following series formula for the characteristic function:

\varphi(t) = \sqrt\frac{\pi}{2\sigma^2} \exp\bigg(-\frac{(\ln t + \mu + \pi i/2)^2}{2\sigma^2}\bigg)\times\sum_{k=0}^{\infty}(-1)^k a_{k+1}(2\sigma^2)^{-k/2} H_k\!\bigg(\frac{\ln t + \mu + \pi i/2}{\sigma\sqrt{2}}\bigg)

where a_k are the coefficients in the Taylor expansion of the reciprocal Gamma function, and H_k are Hermite functions.

  1. ^ Leipnik, R. (1991). On lognormal random variables: I - the characteristic function. J. Austral. Math. Soc. Ser. B, 32, pp. 327-347.


Scaling & inverse

In the relation section, we should mention the scaling & inverse of a log normal variable:

  • If X \sim \operatorname {Log-N} (\mu, \sigma^2) then X + c is called shifted log-normal. E(X+c)=E(X)+c, var(X+c)=var(X)
  • If X \sim \operatorname {Log-N} (\mu, \sigma^2) , then Y=aX is also log normal, F_Y(y)=F_X(\frac{y}{a}) and f_Y(y)=\frac{1}{a}f_X(\frac{y}{a}), E(Y)=aE(X), E(Y^2)=a^2E(X^2)
  • If X \sim \operatorname {Log-N} (\mu, \sigma^2) , then Y=\frac{1}{X} is called inverse log normal,

F_Y(y)=1-F_X(\frac{1}{y}) and f_Y(y)=\frac{1}{y^2}f_X(\frac{1}{y}) EY=?, var(Y)=?

Jackzhp (talk) 12:53, 28 July 2009 (UTC)

If Y=aX then Y\sim\operatorname{logN}(\mu+\ln a,\sigma^2), formulas for ƒ and F are immediate application of the formulas from the beginning of the article. If Y=1/X then Y\sim\operatorname{logN}(-\mu,\sigma^2), and again formulas immediately follow. It's actually much easier to work with this representation because one may want to calculate not only mean+variance, but other quantities as well. ... stpasha » talk » 18:32, 28 July 2009 (UTC)
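
A quick way to check the two parameter rules in the reply above is to compare cdfs, as in this Python sketch. It assumes a > 0, and all function names are invented for illustration. The mean of 1/X, left as "EY=?" earlier, then follows from the usual mean formula applied to the parameters (-\mu, \sigma): exp(-\mu + \sigma^2/2).

```python
import math
from statistics import NormalDist

def cdf(x, mu, sigma):
    """CDF at x > 0 of a variable whose natural log is N(mu, sigma^2)."""
    return NormalDist(mu, sigma).cdf(math.log(x))

def scaled_params(mu, sigma, a):
    """Log-scale parameters of a*X for a > 0: ln(aX) = ln a + ln X."""
    return mu + math.log(a), sigma

def reciprocal_params(mu, sigma):
    """Log-scale parameters of 1/X: ln(1/X) = -ln X."""
    return -mu, sigma

def mean_var(mu, sigma):
    """Mean and variance for log-scale parameters (mu, sigma)."""
    mean = math.exp(mu + sigma ** 2 / 2.0)
    var = (math.exp(sigma ** 2) - 1.0) * math.exp(2.0 * mu + sigma ** 2)
    return mean, var
```

For any y > 0, `cdf(y / a, mu, sigma)` should match `cdf(y, *scaled_params(mu, sigma, a))`, and `1 - cdf(1 / y, mu, sigma)` should match `cdf(y, *reciprocal_params(mu, sigma))`.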

Partial expectation again

As User:Encyclops proves above, the formula in the "partial expectation" section is the quantity \int_k^{\infty} x f(x) dx.

However, a recent edit defined the term "partial expectation" as a synonym for the "conditional expectation" E(x|x>k).

That doesn't seem correct; it seems unlikely that the uncommon term "partial expectation" would be a synonym for the more standard term "conditional expectation". Instead, it makes sense that "partial expectation" would mean part of the expectation, as this definition states.

Anyway, regardless of semantics, \int_k^{\infty} x f(x) dx is not E(x|x>k). Instead, \int_k^{\infty} x f(x) dx is E(x|x>k)prob(x>k).

Therefore, the current "partial expectation" section is incorrect. It is self-consistent if we instead define "partial expectation" as E(x|x>k)prob(x>k). So I will make that change. (unsigned edit by 213.170.45.3 )

Well that is one definition of "partial expectation" that you have found, and I can't find another. If you make the change to be a formal definition of the term then include the citation, otherwise you might change the text to avoid it being a "definition" at all. 08:54, 24 September 2009 (UTC)

Certainly E(X|X>k) must be greater than k, whereas the currently displayed formula for g(k) need not be, in particular when k is large and positive. So certainly something needs putting right in that section. Fathead99 (talk) 15:52, 2 January 2013 (UTC)

I agree that E(X|X>k) must be greater than k, but not g(k). Recall that in the definition of E(X|X>k) you divide the partial expectation term \int_k^{\infty} x f(x) dx by the probability of the event {X>k}. AndreaGerali (talk) 11:47, 15 January 2013 (UTC)

Yes, my comment applied to a version before the recent edits: I'm happy with what's there now. Fathead99 (talk) 10:27, 16 January 2013 (UTC)
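
The distinction discussed in this thread can be illustrated numerically. The sketch below assumes the standard closed form for the log-normal partial expectation, g(k) = exp(mu + sigma^2/2) Φ((mu + sigma^2 - ln k)/sigma), which is not quoted in the thread itself; the function names and the values of mu, sigma and k are illustrative only.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def partial_expectation(k, mu, sigma):
    """g(k): integral of x f(x) from k to infinity, for k > 0."""
    d = (mu + sigma ** 2 - math.log(k)) / sigma
    return math.exp(mu + 0.5 * sigma ** 2) * std_normal_cdf(d)


def tail_probability(k, mu, sigma):
    """P(X > k) for X ~ Log-N(mu, sigma^2), k > 0."""
    return std_normal_cdf((mu - math.log(k)) / sigma)


def conditional_expectation(k, mu, sigma):
    """E(X | X > k): the partial expectation divided by P(X > k)."""
    return partial_expectation(k, mu, sigma) / tail_probability(k, mu, sigma)


# Illustrative values with a fairly large threshold k.
mu, sigma, k = 0.0, 1.0, 10.0
g = partial_expectation(k, mu, sigma)
p = tail_probability(k, mu, sigma)
cond = conditional_expectation(k, mu, sigma)

print("g(k)       =", g)     # can fall below k
print("P(X > k)   =", p)
print("E(X | X>k) =", cond)  # always exceeds k
```

With a large k the partial expectation is smaller than k while the conditional expectation stays above it, which matches the point made by Fathead99 and AndreaGerali.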

Properties?

I would like to start a new section on properties, where one of the properties is that data arising from the log-normal distribution have a symmetric Lorenz curve (see also Lorenz asymmetry coefficient). Any objections?

Christian Damgaard —Preceding unsigned comment added by Christian Damgaard (talk • contribs) 10:36, 13 October 2010 (UTC)

The present section "Characterization" might reasonably be split up, with some of it going into a new section headed "Properties". But "Characterization" doesn't mean here what it usually means, so the rest could be renamed. Adding the info you suggest seems OK. Melcombe (talk) 12:24, 13 October 2010 (UTC)

I have made the section "Properties", but I hope that others will move relevant parts from the characterisation section.

Is the format of the reference OK? —Preceding unsigned comment added by Christian Damgaard (talk • contribs) 13:43, 13 October 2010 (UTC)

Is the median the same?

Is the median of the log-transformed random variable the same as the logarithm of the median of the original distribution? Theoretically, it should be, because the logarithm preserves the ranks of the values from smallest to largest. If so, it should be mentioned in the article. Mikael Häggström (talk) 05:58, 1 March 2011 (UTC)
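
The rank argument can be tried out on a simulated sample. This is only a sketch: the seed, the sample size (kept odd so that the sample median is a single observation) and the parameter values are arbitrary assumptions.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable
mu, sigma = 1.2, 0.7    # illustrative parameters

# Odd sample size: the sample median is then one actual observation.
sample = [rng.lognormvariate(mu, sigma) for _ in range(10001)]

# The median commutes with the logarithm because log is monotone ...
median_of_logs = statistics.median(math.log(x) for x in sample)
log_of_median = math.log(statistics.median(sample))

# ... whereas the mean does not.
mean_of_logs = statistics.mean(math.log(x) for x in sample)
log_of_mean = math.log(statistics.mean(sample))

print("median of logs:", median_of_logs, " log of median:", log_of_median)
print("mean of logs:  ", mean_of_logs, " log of mean:  ", log_of_mean)
```

Both median figures should sit close to mu, consistent with the median of Log-N(mu, sigma^2) being exp(mu), while the log of the mean is pulled upward by roughly sigma^2/2.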

Related distributions

The approximate mean and variance for the sum Y=\textstyle\sum_{j=1}^n X_j, for independent log-normal X_j \sim \operatorname{Log-\mathcal{N}}(\mu_j, \sigma^2), are given incorrectly, I think. The expressions below for \sigma_Z and \mu_Z are meant to be for an approximately normally distributed Z\approx \log(Y), so that


\begin{align}
  & \operatorname{E}[Y] \approx e^{\mu_Z + \tfrac{1}{2}\sigma^2_Z}, \\
  & \operatorname{Var}[Y] \approx e^{2\mu_Z + \sigma^2_Z} (e^{\sigma^2_Z} - 1)  \\
\end{align}

and thus direct substitution (for constant \sigma) gives
\operatorname{Var}[Y] \approx \operatorname{Var}[e^Z] = e^{\sigma^2} (e^{\sigma^2} - 1)  \textstyle\sum_{j=1}^n e^{2\mu_j}
as expected, since the variance of a sum of independent variables equals the sum of the variances of the individual variables.

I therefore suggest changing "approximated by another log-normal distribution Z" to "approximated by a normal distribution Z \approx \log(Y)".

Please let me know if I am wrong. The references to Gao and to Fenton & Wilkinson should also be cited correctly. — Preceding unsigned comment added by Raiontov (talk • contribs) 05:43, 15 November 2011 (UTC)
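
The substitution described above can be reproduced numerically. The expressions for \mu_Z and \sigma^2_Z are not quoted in this thread, so the sketch below assumes the usual moment-matching (Fenton-Wilkinson) form: \sigma^2_Z = \ln(1 + V/M^2) and \mu_Z = \ln M - \sigma^2_Z/2, where M and V are the exact mean and variance of the sum. The function name and the parameter values are illustrative.

```python
import math


def fenton_wilkinson(mus, sigma):
    """Moment-matched (mu_Z, sigma_Z^2) for a sum of independent
    Log-N(mu_j, sigma^2) variables sharing the same sigma (assumed form)."""
    mean_sum = sum(math.exp(m + 0.5 * sigma ** 2) for m in mus)
    var_sum = sum((math.exp(sigma ** 2) - 1.0) * math.exp(2.0 * m + sigma ** 2)
                  for m in mus)
    sigma_z_sq = math.log(1.0 + var_sum / mean_sum ** 2)
    mu_z = math.log(mean_sum) - 0.5 * sigma_z_sq
    return mu_z, sigma_z_sq


# Illustrative parameters.
mus, sigma = [0.0, 0.5, 1.0], 0.6
mu_z, sigma_z_sq = fenton_wilkinson(mus, sigma)

# Moments of exp(Z) with Z ~ N(mu_Z, sigma_Z^2).
approx_mean = math.exp(mu_z + 0.5 * sigma_z_sq)
approx_var = math.exp(2.0 * mu_z + sigma_z_sq) * (math.exp(sigma_z_sq) - 1.0)

# Closed form from the comment above: e^{s^2} (e^{s^2} - 1) * sum_j e^{2 mu_j}.
exact_var = (math.exp(sigma ** 2) * (math.exp(sigma ** 2) - 1.0)
             * sum(math.exp(2.0 * m) for m in mus))

print("approx mean:", approx_mean)
print("approx var: ", approx_var, " sum of variances:", exact_var)
```

Under this assumption the matched variance coincides with the sum of the individual variances, as stated in the comment. Note that this only checks the first two moments and says nothing about how close the distribution of Y is to log-normal.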

Improvement of pictures

The pdf and cdf graphs of the normal distribution are very beautiful. If the log-normal ones were changed in the same way (thicker lines, grid, ...) I think the article would be more legible. Jbbinder (talk) 12:16, 18 June 2012 (UTC)

Confusion about location and shape in the Probability distribution table, row on parameters

In the table on the right it reads

Parameters σ² > 0 — log-scale,
μ ∈ R — shape (real)

Surely, the use of "shape" must be a mistake. The shape of the distribution is determined by σ² while the location is determined by μ. This fact can easily be verified by plotting \frac{1}{x\sqrt{2\pi}\sigma}\ e^{-\frac{\left(\ln x-\mu\right)^2}{2\sigma^2}} normalized by its maximum value as a function of \frac{x}{e^{\mu}}. Curves with varying μ will coincide. Changing σ² on the other hand will change the shape of the pdf.

In my opinion it would be better if the table entry read

Parameters σ² > 0 — log-scale (shape),
μ ∈ R — location (real)

Comments? — Preceding unsigned comment added by 193.11.28.112 (talk) 13:33, 24 October 2013 (UTC)
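
The plotting check proposed above can also be done without a plot. The sketch below normalizes the pdf by its maximum (attained at the mode exp(μ - σ²)) and evaluates it with x rescaled by e^μ, the median of the distribution; the function names and parameter values are illustrative assumptions.

```python
import math


def lognormal_pdf(x, mu, sigma):
    """Density of Log-N(mu, sigma^2) at x > 0."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def normalized_pdf(u, mu, sigma):
    """pdf divided by its maximum, evaluated at x = u * exp(mu)."""
    mode = math.exp(mu - sigma ** 2)  # location of the pdf maximum
    return lognormal_pdf(u * math.exp(mu), mu, sigma) / lognormal_pdf(mode, mu, sigma)


# Same sigma, different mu: the rescaled curves should coincide.
for u in (0.25, 0.5, 1.0, 2.0, 4.0):
    print(u, normalized_pdf(u, 0.0, 0.5), normalized_pdf(u, 3.0, 0.5))

# Different sigma: the rescaled curves differ.
print(normalized_pdf(2.0, 0.0, 0.5), normalized_pdf(2.0, 0.0, 1.0))
```

Algebraically the normalized curve reduces to exp(-(ln u)²/(2σ²) - σ²/2)/u, which contains σ but not μ.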

Wrong formula for parameter μ as a function of the mean and variance

The formula is clearly wrong (although it was correctly copied from the reference given).

The formula given is: \mu=\ln\left(\frac{m^2}{\sqrt{v+m^2}}\right)

But the fraction inside the logarithm is clearly not "dimensionless" and it should be.

I have done the calculation on my own and I arrived at a similar (and dimensionally consistent) result: \mu=\ln\left(\frac{m}{\sqrt{v+m^2}}\right) = \frac{1}{2}\ln\left(\frac{m^2}{v+m^2}\right) G Furtado (talk) 00:34, 1 December 2013 (UTC)
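
Both candidate formulas can be tested with a round trip: start from known (μ, σ), compute the mean m and variance v from the standard expressions m = exp(μ + σ²/2) and v = (exp(σ²) - 1) m², then see which formula returns μ. This is only a sketch; the function names and the parameter values are illustrative.

```python
import math


def moments_from_params(mu, sigma):
    """Mean m and variance v of Log-N(mu, sigma^2)."""
    m = math.exp(mu + 0.5 * sigma ** 2)
    v = (math.exp(sigma ** 2) - 1.0) * m ** 2
    return m, v


def mu_as_in_article(m, v):
    """mu = ln(m^2 / sqrt(v + m^2)), the formula quoted from the article."""
    return math.log(m ** 2 / math.sqrt(v + m ** 2))


def mu_alternative(m, v):
    """mu = ln(m / sqrt(v + m^2)), the alternative proposed above."""
    return math.log(m / math.sqrt(v + m ** 2))


mu, sigma = 1.5, 0.4  # illustrative parameters
m, v = moments_from_params(mu, sigma)

print("true mu:            ", mu)
print("article formula:    ", mu_as_in_article(m, v))
print("alternative formula:", mu_alternative(m, v))
print("-sigma^2 / 2:       ", -0.5 * sigma ** 2)
```

In this check the formula quoted from the article recovers μ, while the alternative returns -σ²/2. Since v + m² = exp(σ²) m², the ratio m²/√(v + m²) equals exp(μ), which has the same units as X itself, so a ratio with the dimension of m seems to be what one would expect here.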

Power laws

Power law distributions are very similar to but not the same as log-normal distributions. This is mentioned in the power-law article. It should also be brought up here. — Preceding unsigned comment added by 211.225.33.104 (talk) 05:08, 11 July 2014 (UTC)