# Talk:Kurtosis

WikiProject Statistics (Rated B-class, High-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

WikiProject Mathematics (Rated B-class, Low-importance)
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Field: Probability and statistics
One of the 500 most frequently viewed mathematics articles.

## Range of Values Kurtosis can take

I think it would be helpful to be clear about what range of values the kurtosis statistic can take. I can infer that there is a lower bound of -2 from where the article discusses the binomial distribution being an extreme case; this took a fair bit of reading. There is nothing about an upper bound; presumably one exists, otherwise you would end up with an improper distribution? — Preceding unsigned comment added by 194.176.105.139 (talk) 09:48, 4 July 2012 (UTC)

No, there is no upper bound. Try three atoms: two atoms at -1 and +1, of probability 0.01 each, and an atom at 0 of probability 0.98. You'll then easily guess how the kurtosis can be made arbitrarily large. Boris Tsirelson (talk) 08:22, 25 March 2015 (UTC)
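A quick numerical check of the three-atom example, as a minimal sketch (the function names `kurtosis` and `three_atoms` are illustrative, not from the article). With probability p on each of -1 and +1, the variance and the fourth central moment are both 2p, so the kurtosis is 1/(2p), which grows without bound as p shrinks.

```python
def kurtosis(atoms):
    """Kurtosis (fourth standardized moment, no -3) of a discrete
    distribution given as (value, probability) pairs."""
    mean = sum(p * x for x, p in atoms)
    var = sum(p * (x - mean) ** 2 for x, p in atoms)
    m4 = sum(p * (x - mean) ** 4 for x, p in atoms)
    return m4 / var ** 2


def three_atoms(p):
    """Atoms at -1 and +1 with probability p each, and at 0 with 1 - 2p."""
    return [(-1.0, p), (0.0, 1.0 - 2.0 * p), (1.0, p)]


for p in (0.25, 0.01, 0.0001):
    k = kurtosis(three_atoms(p))
    print(f"p = {p}: kurtosis = {k:.1f}, excess kurtosis = {k - 3:.1f}")
```

For p = 0.01 (the values in the comment above) this gives a kurtosis of about 50, i.e. an excess kurtosis of about 47.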

## What is Gamma?

The author defines kurtosis in terms of Gamma, but fails to define Gamma? Is it the Gamma Distribution? Then why doesn't it have 2 parameters? Is it the Factorial? Then why does it have a non-integer parameter? Is it acceptable to use ambiguous functions in a definition without disambiguating them?

Hsfrey (talk) 00:46, 17 March 2012 (UTC)

## Wikipedia inconsistency

Hi statistic wikipedia folks. In this page the Kurtosis definition has a "-3" in it (because the normal has a Kurtosis of 3 so this definition "normalises" things so to say). Subtracting this 3 is actually a convention, maybe this should be mentioned.

A more important point is that every single page on distributions I've encountered here does NOT include the -3 in the Kurtosis formula given on the right (correct me if I'm wrong? I didn't recalculate them all manually :)). So while this is only a matter of convention, we should at least get wikipedia consistent with its own definition conventions? The easiest way seems adapting the definition in this page.

Regards

woutersmet

The reason for this (I think!) is that people who have contributed to this page are from an econometrics background, where it's common to assume a conditional normal distribution. Hence the -3. —Preceding unsigned comment added by 62.30.156.106 (talk) 21:45, 14 March 2008 (UTC)

## Standardized moment

If this is the "fourth standardized moment", what are the other 3 and what is a standardized moment anyway? do we need an article on it? -- Tarquin 10:39 Feb 6, 2003 (UTC)

The first three are the mean, standard deviation, and skewness, if I recall correctly.
Actually, the word "standardized" refers to the fact that the fourth moment is divided by the 4th power of the standard deviation. — Miguel 15:53, 2005 Apr 19 (UTC)
Thank you :-) It's nice when wikipedia comes up with answers so quickly! -- Tarquin 11:04 Feb 6, 2003 (UTC)
I think the term "central moments" is also used. See also http://planetmath.org/encyclopedia/Moment.htm
No, central moments are distinct from standardized moments. --MarkSweep (call me collect) 02:14, 6 December 2006 (UTC)

## Peakedness

Kurtosis is a measure of the peakedness ... so what does that mean? If I have a positive kurtosis, is my distribution pointy? Is it flat? -- JohnFouhy, 01:53, 11 Nov 2004

I've tried to put the answer to this in the article: high kurtosis is 'peaked' or 'pointy', low kurtosis is 'rounded'. Kappa 05:15, 9 Nov 2004 (UTC)

It has been pointed out that kurtosis is not synonymous with shape or peakedness, even for symmetric unimodal distributions. Please see: (1) Kaplansky, "A common error concerning kurtosis", Journal of the American Statistical Association, 1945; (2) Balanda and MacGillivray, "Kurtosis: a critical review", The American Statistician, 1988. —Preceding unsigned comment added by Studentt2046 (talk • contribs) 16:27, 10 March 2009 (UTC)

Just backing up that we should not describe Kurtosis as "peakedness", see "Kurtosis as Peakedness, 1905–2014. R.I.P." in The American Statistician 11 Aug 2014 — Preceding unsigned comment added by 130.225.116.170 (talk) 09:33, 2 October 2014 (UTC)

## Mistake

I believe the equation for the sample kurtosis is incorrect (n should be in denominator, not numerator). I fixed it. Neema Sep 7, 2005

## Ratio of cumulants

The statement, "This is because the kurtosis as we have defined it is the ratio of the fourth cumulant and the square of the second cumulant of the probability distribution," does not explain (to me, at least) why it is obvious that subtracting three gives the pretty sample mean result. Isn't it just a result of cranking through the algebra, and if so, should we include this explanation? More concretely, the kurtosis is a ratio of central moments, not cumulants. I don't want to change one false explanation that I don't understand to another, though. Gray 01:30, 15 January 2006 (UTC)

After thinking a little more, I'm just going to remove the sentence. Please explain why if you restore it. Gray 20:58, 15 January 2006 (UTC)

## Mesokurtic

It says: "Distributions with zero kurtosis are called mesokurtic. The most prominent example of a mesokurtic distribution is the normal distribution family, regardless of the values of its parameters." Yet here: http://en.wikipedia.org/wiki/Normal_distribution, we can see that Kurtosis = 3, it's Skewness that = 0 for normal. Agree? Disagree?

Thanks, that's now fixed. There are two ways to define kurtosis (ratio of moments vs. ratio of cumulants), as explained in the article. Wikipedia uses the convention (as do most modern sources) that kurtosis is defined as a ratio of cumulants, which makes the kurtosis of the normal distribution identically zero. --MarkSweep (call me collect) 14:43, 24 July 2006 (UTC)
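For reference, the two conventions are linked by the standard relation between cumulants and central moments. Writing ${\displaystyle \mu _{k}}$ for the k-th central moment and ${\displaystyle \kappa _{k}}$ for the k-th cumulant (notation assumed here for illustration):

${\displaystyle \kappa _{2}=\mu _{2}=\sigma ^{2},\qquad \kappa _{4}=\mu _{4}-3\mu _{2}^{2},\qquad {\frac {\kappa _{4}}{\kappa _{2}^{2}}}={\frac {\mu _{4}}{\sigma ^{4}}}-3.}$

For the normal distribution ${\displaystyle \mu _{4}=3\sigma ^{4}}$, so the ratio of moments is 3 and the ratio of cumulants is 0.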

## Unbiasedness

I have just added a discussion to the skewness page. Similar comments apply here. Unbiasedness of the given kurtosis estimator requires independence of the observations and does not therefore apply to a finite population.

The independent observations version is biased, but the bias is small. This is because, although we can make the numerator and denominator unbiased separately, the ratio will still be biased. Removing this bias can be done only for specific populations. The best we can do is, in the numerator, either:
1. use an unbiased estimate for the fourth moment about the mean, or
2. use an unbiased estimate of the fourth cumulant;

and, in the denominator, either:
3. use an unbiased estimate for the variance, or
4. use an unbiased estimate for the square of the variance.

According to the article, the given formula is 2 and 3 but I have not checked this. User:Terry Moore 11 Jun 2005

## So who's Kurt?

I mean, what is the etymology of the term? -FZ 19:48, 22 Jun 2005 (UTC)

It's obviously a modern term of Greek origin (κυρτωσις, fem.). The OED gives the non-specialized meaning as "a bulging, convexity". The Liddell-Scott-Jones lexicon has "bulging, of blood-vessels", "convexity of the sea's surface" and "being humpbacked". According to the OED (corroborated by "Earliest Known Uses of Some of the Words of Mathematics" and by a search on JSTOR), the first occurrence in print of the modern technical term is in an article by Karl Pearson from June 1905. --MarkSweep 21:05, 22 Jun 2005 (UTC)

## Kurtosis Excess?

I've heard of "excess kurtosis," but not vice-versa. Is "kurtosis excess" a common term? Gray 01:12, 15 January 2006 (UTC)

## Diagram?

A picture would be nice ... (one is needed for skewness as well). I'd whip one up, but final projects have me beat right now. 24.7.106.155 08:27, 19 April 2006 (UTC)

The current picture is nice because it shows real data, but it has some problems:
1. it does not cite any references for the source of the data
2. it is not what we need here: kurtosis comes along when variance is not enough, so for a real case to be interesting one should find a situation where two distributions with the same mean and variance are symmetric or similarly asymmetric (null or same skewness) and yet have a different kurtosis; is this the case here? I am not sure, as no mention of variance is made in the comment to the picture
3. it is not in vector form (SVG) --Pot (talk) 13:21, 18 December 2008 (UTC)

The picture is taken from my doctoral thesis ("A.Meskauskas.Gravitorpic reaction: the role of calcium and phytochrome", defended in 1997, Vilnius State University, Lithuania). I added this note to the picture description in commons. The picture represents my own experimental data, but the dissertation should be considered a published work. Real experimental data cannot be "adjusted" in any "preferred" way, and in reality probably no scientist will ever observe an "absolutely clean" effect that changes only kurtosis and nothing else. Audriusa (talk) 14:41, 19 December 2008 (UTC)

And in fact kurtosis is rarely used in real experimental data. One of the fields where it is used is for big quantities of experimental data that would seem well modelled by a Gaussian process. If the weight of the tails turns out to be important, then a kurtosis estimate can be necessary. It allows one to tell a Gaussian apart from something that resembles it. As I said above, this mostly makes sense when comparing distributions with the same mean, variance and skewness. And, if you want to give an example, I argue that this is indeed necessary. So I think that your example is illustrative only if the two variances are equal. Can you tell us if this is the case? --Pot (talk) 16:50, 19 December 2008 (UTC)
The dispersion changes from 21.780 (control) to 16.597 (far red). The mean, however, does not change much if we take the ± intervals into consideration (from 10.173 ± 0.975 to 8.687 ± 0.831). So, comparing only the means, we would likely conclude that the far red light has no effect in the experiment. But the histograms do look very different. One possible explanation is periodic oscillation around the mean value in time (the experiment gives only a "momentary picture"). Far red light may stop these oscillations, making the output more uniform. Audriusa (talk) 20:38, 20 December 2008 (UTC)
Thank you for clarifying this. However, as I pointed out above, once you have two distributions with the same mean, you start considering higher moments. The first one above the mean is the variance. Only if the variances are equal do you resort to even higher moments; this is not very common, as the higher the moment, the more sensitive it is to noise. And in practice, using moments higher than the variance with few samples is not very meaningful. So, once again, are the variances equal for the two cases you proposed as an illustration? --Pot (talk) 14:27, 21 December 2008 (UTC)
What follows from your comment is that higher moments should only be compared if the lower moments are equal. This is a simple and clear statement. How sure are you about this? Any references? Audriusa (talk) 17:59, 30 December 2008 (UTC)
Google Books may let you browse this. It is the famous handbook "Numerical Recipes in C" (the Fortran version contains the same text). The issue is that higher moments are generally less robust than lower moments, because they involve higher powers of the input data. The advice given in the book is that skewness and kurtosis should be used with caution or, better yet, not at all. More specifically, according to the book, the standard deviation of the sample skewness of a batch of N samples from a normal distribution is about ${\displaystyle {\sqrt {15/N}}.}$ The book goes on to suggest that "in real life it is good practice to believe in skewnesses only when they are several or many times as large as this." Here we are speaking about kurtosis, for which the relevant figure is ${\displaystyle {\sqrt {96/N}}.}$ For the example figure that you added, this means that the difference in sample kurtoses can be considered significant only if ${\displaystyle 0.05+0.20\gg {\sqrt {96/N}}\Rightarrow N\gtrsim 10000.}$ Even if this is the case, resorting to higher moments, which are inherently less robust, will only be justified where lower moments cannot do the job. I think that adding a section in both skewness and kurtosis explaining these concepts is a good idea. Pot (talk) 12:54, 8 January 2009 (UTC)
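The arithmetic behind the N ≳ 10000 figure can be spelled out as a small sketch. It solves diff ≥ factor · sqrt(c / N) for N; the function name `min_samples` and the choice of `factor` (how many times larger than the noise level counts as "several times") are assumptions made here for illustration. The constant c = 96 is the one quoted in the comment above; it is kept as a parameter because other references give the asymptotic variance of the sample kurtosis of normal data as 24/N.

```python
import math


def min_samples(diff, factor, c=96.0):
    """Smallest N such that diff >= factor * sqrt(c / N).

    diff   -- observed difference in sample kurtosis (0.05 + 0.20 above)
    factor -- hypothetical multiple of the noise level required
    c      -- constant in the noise estimate sqrt(c / N); 96 as quoted above
    """
    return math.ceil(c * (factor / diff) ** 2)


for factor in (1, 2.5, 3):
    print(f"factor {factor}: N >= {min_samples(0.25, factor)}")
```

With a factor of 1 the difference merely equals the noise level (N = 1536); requiring it to be 2.5 to 3 times larger gives N between 9600 and 13824, consistent with the order of magnitude stated above.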

## Range?

Is the range -2, +infinity correct? why not -3, +infinity?

Yes, the range is correct. In general all distributions must satisfy ${\displaystyle \gamma _{2}-\gamma _{1}^{2}+2\geq 0}$. The minimum value of ${\displaystyle \gamma _{2}}$ is −2. --MarkSweep (call me collect) 02:26, 6 December 2006 (UTC)
I take that back, will look into it later. --MarkSweep (call me collect) 09:32, 6 December 2006 (UTC)

I corrected the French article, which gave 0, +infinity for the kurtosis (so -3, +infinity for the excess kurtosis). The correct range for the kurtosis is 1, +infinity, and for the excess kurtosis -2, +infinity.

A very simple proof (here ${\displaystyle \gamma _{2}}$ denotes the kurtosis itself, not the excess kurtosis):

${\displaystyle \gamma _{2}={\frac {\mathbb {E} \left[(X-\mu )^{2}(X-\mu )^{2}\right]}{\sigma ^{4}}}}$

We have

${\displaystyle \gamma _{2}={\frac {\mathbb {E} \left[(X-\mu )^{2}\right]\mathbb {E} \left[(X-\mu )^{2}\right]}{\sigma ^{4}}}+{\frac {\operatorname {cov} \left[(X-\mu )^{2},(X-\mu )^{2}\right]}{\sigma ^{4}}},}$

and since

${\displaystyle \mathbb {E} \left[(X-\mu )^{2}\right]=\sigma ^{2},}$

so

${\displaystyle \gamma _{2}=1+{\frac {\operatorname {var} \left[(X-\mu )^{2}\right]}{\sigma ^{4}}},}$

with ${\displaystyle \alpha ^{2}={\frac {\operatorname {var} \left[(X-\mu )^{2}\right]}{\sigma ^{4}}}\geq 0}$, we have

${\displaystyle \gamma _{2}=1+\alpha ^{2}\geq 1}$

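The same result can be obtained from Jensen's inequality (added 10/16/09).

Jensen's inequality, for a convex function f:

${\displaystyle \mathbb {E} \left[f(X)\right]\geq f\left(\mathbb {E} [X]\right).}$

Taking f(t) = t², applying the inequality to X², and assuming X is centered (μ = 0), we have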

${\displaystyle \mathbb {E} \left[X^{4}\right]\geq \left(\mathbb {E} \left[X^{2}\right]\right)^{2},}$

${\displaystyle \mathbb {E} \left[X^{4}\right]\geq \left(\sigma ^{2}\right)^{2}.}$

so

${\displaystyle {\frac {\mathbb {E} \left[X^{4}\right]}{\sigma ^{4}}}=\gamma _{2}\geq 1.}$

Thierry —Preceding unsigned comment added by 132.169.19.128 (talk) 08:22, 4 June 2009 (UTC)
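The bound is attained, which fits the lower limit of −2 for the excess kurtosis asked about at the top of this section. As a worked check (the two-point distribution is chosen here purely for illustration): if X takes the values μ − σ and μ + σ with probability 1/2 each, then (X − μ)² = σ² with probability 1, so

${\displaystyle \operatorname {var} \left[(X-\mu )^{2}\right]=0,\qquad {\frac {\mathbb {E} \left[(X-\mu )^{4}\right]}{\sigma ^{4}}}={\frac {\sigma ^{4}}{\sigma ^{4}}}=1,}$

that is, a kurtosis of 1 and an excess kurtosis of −2.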

## Sample kurtosis

Is the given formula for the sample kurtosis really right? Isn't it supposed to have the -3 in the denominator? --60.12.8.166

In the discussion of the "D" formula, the summation seems to be over i terms, whereas the post lists: "xi - the value of the x'th measurement" I think this should read: "xi - the value of the i'th measurement of x" (or something close) --Twopoint718 19:25, 13 May 2007 (UTC)

## Shape

In terms of shape, a leptokurtic distribution has a more acute "peak" around the mean (that is, a higher probability than a normally distributed variable of values near the mean) and "fat tails" (that is, a higher probability than a normally distributed variable of extreme values)

Is that right? How can a function have both a greater probability near the mean and a greater probability at the tails? Ditto for platykurtic distributions--DocGov 21:49, 18 November 2006 (UTC)

Yes, that's right. One typically has in mind symmetric unimodal distributions, and leptokurtic ones have a higher peak at the mode and fatter tails than the standard normal distribution. For an example have a look at the section on the Pearson type VII family I just added. --MarkSweep (call me collect) 02:29, 6 December 2006 (UTC)
On the other hand, the Cauchy distribution has a lower peak than the standard normal yet fatter tails than any density in the Pearson type VII family. However, its kurtosis and other moments are undefined. --MarkSweep (call me collect) 04:00, 6 December 2006 (UTC)
Another explanation: it's not just peaks and tails, don't forget about the shoulders. Leptokurtic densities with a higher peak and fatter tails have lower shoulders than the normal distribution. Take the density of the Laplace distribution with unit variance:
${\displaystyle f(x)={\frac {1}{\sqrt {2}}}\exp(-{\sqrt {2}}|x|)\!}$
For reference, the standard normal density is
${\displaystyle g(x)={\frac {1}{\sqrt {2\pi }}}\exp(-x^{2}/2)\!}$
Now f and g intersect at four points, whose x values are ${\displaystyle \pm {\sqrt {2}}\pm {\sqrt {2-\ln \pi }}}$. Focus on three intervals (on the positive half-line, the negative case is the same under symmetry):
• Peak ${\displaystyle (0,{\sqrt {2}}-{\sqrt {2-\ln \pi }})\approx (0,0.49)}$ Here the Laplace density is greater than the normal density and so the Laplace probability of this interval (that is, the definite integral of the density) is greater (0.25 vs. 0.19 for the normal density).
• Shoulder ${\displaystyle ({\sqrt {2}}-{\sqrt {2-\ln \pi }},{\sqrt {2}}+{\sqrt {2-\ln \pi }})\approx (0.49,2.34)}$ Here the normal density is greater than the Laplace. The normal probability of this interval is 0.30 vs. 0.23 for the Laplace.
• Tail ${\displaystyle ({\sqrt {2}}+{\sqrt {2-\ln \pi }},\infty )\approx (2.34,\infty )}$ Here the Laplace density is again greater. Laplace probability is 0.02, normal probability is 0.01.
Because we focus on the positive half-line, the probabilities for each distribution sum to 0.5. And even though the Laplace density allocates about twice as much mass to the tail compared with the normal density, in absolute terms the difference is very small. The peak of the Laplace is acute and the region around it is narrow, hence the difference in probability between the two distributions is not very pronounced. The normal distribution compensates by having more mass in the shoulder interval (0.49,2.34). --MarkSweep (call me collect) 08:57, 6 December 2006 (UTC)
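The peak, shoulder and tail probabilities quoted above can be reproduced with a short sketch using only the closed-form CDFs (the function and variable names are illustrative). The crossing points come from equating the two densities, which reduces to x² − 2√2·x + ln π = 0 on the positive half-line.

```python
import math

SQRT2 = math.sqrt(2.0)


def laplace_cdf(x):
    """CDF of the unit-variance Laplace distribution, valid for x >= 0."""
    return 1.0 - 0.5 * math.exp(-SQRT2 * x)


def normal_cdf(x):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / SQRT2))


# Positive crossing points of the two densities: sqrt(2) -/+ sqrt(2 - ln(pi)).
root = math.sqrt(2.0 - math.log(math.pi))
a, b = SQRT2 - root, SQRT2 + root


def interval_probs(cdf):
    """Probability mass on (0, a), (a, b) and (b, infinity)."""
    return {
        "peak": cdf(a) - cdf(0.0),
        "shoulder": cdf(b) - cdf(a),
        "tail": 1.0 - cdf(b),
    }


laplace = interval_probs(laplace_cdf)
normal = interval_probs(normal_cdf)

print(f"crossings at x = {a:.2f} and x = {b:.2f}")
for region in ("peak", "shoulder", "tail"):
    print(f"{region:8s} Laplace {laplace[region]:.2f}   normal {normal[region]:.2f}")
```

Rounded to two decimals, this agrees with the figures in the three bullets: 0.25 vs. 0.19 for the peak, 0.23 vs. 0.30 for the shoulder, and 0.02 vs. 0.01 for the tail.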

Looking at the Pearson Distribution page - isn't the example a Pearson V, not Pearson VII as stated in the title? And, if not, where is more info on Type VII - the Pearson Wikipedia page only goes up to V. 128.152.20.33 19:34, 7 December 2006 (UTC)

Obviously the article on the Pearson distributions is woefully incomplete. As the present article points out, the Pearson type VII distributions are precisely the symmetric type IV distributions. --MarkSweep (call me collect) 05:35, 8 December 2006 (UTC)

## Was someone having us on ? (hoax)

"A distribution whose kurtosis is deemed unaccepatably large or small is said to be kurtoxic. Similarly, if the degree of skew is too great or little, it is said to be skewicked" – two words that had no hits in Google. I think someone was kidding us. DFH 20:33, 9 February 2007 (UTC)

Agree, zero google hits. --Salix alba (talk) 21:24, 9 February 2007 (UTC)

## leptokurtic / platykurtic

I think the definitions of lepto-/platy- kurtic in the article are confusing: the prefixes are reversed. I'm not confident enough in statistics to change this. Could someone who understands the subject check that this is the correct usage?

A distribution with positive kurtosis is called leptokurtic, or leptokurtotic. In terms of shape, a leptokurtic distribution has a more acute "peak" around the mean (that is, a higher probability than a normally distributed variable of values near the mean) and "thin tails" (that is, a lower probability than a normally distributed variable of extreme values). Examples of leptokurtic distributions include the Laplace distribution and the logistic distribution.

A distribution with negative kurtosis is called platykurtic, or platykurtotic. In terms of shape, a platykurtic distribution has a smaller "peak" around the mean (that is, a lower probability than a normally distributed variable of values near the mean) and "heavy tails" (that is, a higher probability than a normally distributed variable of extreme values).

leptokurtic: –adjective Statistics. 1. (of a frequency distribution) being more concentrated about the mean than the corresponding normal distribution. 2. (of a frequency distribution curve) having a high, narrow concentration about the mode. [Origin: 1900–05; lepto- + irreg. transliteration of Gk kyrt(ós) swelling + -ic]

lepto- a combining form meaning "thin," "fine," "slight"

platykurtic: 1. (of a frequency distribution) less concentrated about the mean than the corresponding normal distribution. 2. (of a frequency distribution curve) having a wide, rather flat distribution about the mode. [Origin: 1900–05; platy- + kurt- (irreg. < Gk kyrtós bulging, swelling) + -ic]

platy- a combining form meaning "flat," "broad".

--Blick 19:43, 21 February 2007 (UTC)

The current usage is correct and agrees with other references, e.g. [1][2][3]. DFH 21:39, 21 February 2007 (UTC)
I don't think that the problem is with the words platykurtic and leptokurtic, which is what your references are to. It's the issue that leptokurtic is described as having heavy tails. The more common explanation is that leptokurtic distributions have thin tails and that platykurtic distributions have heavy tails. Phillipkwood (talk)
I'm not sure about the prefixes, but the changes you made earlier today were definitely wrong and did not agree with other outside sources. I wasted a lot of time trying to make sense of it before I noticed your edit, and then I undid it. Tabako (talk) 00:11, 11 November 2008 (UTC)

Well, I think it does agree with outside sources, at least The American Statistician. Maybe, to make it less confusing, it's helpful to talk about length (which is what you're talking about) and thinness. Here's a quote from Kevin P. Balanda and H. L. MacGillivray (The American Statistician, Vol. 42, No. 2 (May, 1988), pp. 111-119), who write: "Dyson (1943) gave two amusing mnemonics attributed to Student for these names: platykurtic curves, like platypuses, are square with short tails whereas leptokurtic curves are high with long tails, like kangaroos, noted for 'lepping'. The terms supposedly refer to the general shape of a distribution, with platykurtic distributions being flat topped compared with the normal, leptokurtic distributions being more sharply peaked than the normal, and mesokurtic distributions having shape comparable to that of the normal." So, yes, "leptokurtic" distributions have long and thin tails; platykurtic distributions have short heavy tails. —Preceding unsigned comment added by 128.206.28.43 (talk) 15:56, 11 November 2008 (UTC)

I'm still not sure about this. I'm suggesting that instead of describing a platykurtic distribution as one with "thin tails", we should say "broad peak". Would you agree? --Blick 07:30, 5 March 2007 (UTC)

Not really. Moments are more sensitive to the tails, because of the way powers work. The squares of 1, 2, 3 etc. are 1, 4, 9 etc., which are successively spaced farther apart. The effect is greater for 4th powers. So, although the names platykurtic and leptokurtic are inspired by the appearance of the centre of the density function, the tails are more important. Also, it is the behaviour of the tails that determines how robust statistical methods will be, and the kurtosis is one diagnostic for that. 203.97.74.238 00:46, 1 September 2007 (UTC) Terry Moore

I agree with Terry, but given the American Statistician terminology, I made a minor edit to the page to reflect this discussion- i.e., that "thin" and "thick" refers to the height of the PDF. Reading the original American Statistician paper, reflects some of the language on this point and this seemed to be the most accurate compromise. I checked the "standard terminology" references above, and nothing is mentioned in those about thickness versus thinness- they're just definitions that all of us seem to agree on. Phillipkwood (talk) 15:00, 8 December 2008 (UTC)
Hm. I looked at it and I suggest that:
• "peak" should be peak → done
• "fat tail", "thin tail" should be fat tail, thin tail → done
• fat tail should be a link → done
• thin tail should be sub Gaussian (not super Gaussian, and without quotes) → done
• fat tail should be super Gaussian (not sub Gaussian, and without quotes) → done
Other than these typographical changes, the terms leptokurtic and mesokurtic should be made consistent in the article and between articles (such as those about fat tail and heavy tail) → done. --Fpoto (talk) 18:37, 8 December 2008 (UTC)

## L-kurtosis

I don't have the time to write about that, but I think the article should mention L-kurtosis, too. --Gaborgulya (talk) 01:13, 22 January 2008 (UTC)

## why 3?

To find out if it's mesokurtic, platykurtic or leptokurtic, why compare it to 3? —Preceding unsigned comment added by Reesete (talk • contribs) 10:18, 5 March 2008 (UTC)

The kurtosis of the normal distribution is 3, so 3 is the value to expect for a sample of IID normal data (see the wiki article on the normal distribution for more). For that reason we tend to refer to the sample kurtosis of a series minus 3 as the excess kurtosis. —Preceding unsigned comment added by 62.30.156.106 (talk) 21:42, 14 March 2008 (UTC)

## Bias?

Perhaps the article should include more explicit notes on bias. In particular, I'm wondering why the formula is using biased estimates of sample moments about the mean; perhaps someone more knowledgeable than I might explain why this is the preferred formula? —Preceding unsigned comment added by 140.247.11.37 (talk) 14:30, 25 June 2008 (UTC)

## Excess kurtosis - confusing phrasing

The way the "modern" definition is phrased in the article makes it look like ${\displaystyle \gamma _{2}}$ could be what they're referring to as excess kurtosis.

Yes, this is correct.

However, I get the impression that "excess kurtosis" is actually the "minus 3" term. Is this correct? kostmo (talk) 05:57, 25 September 2008 (UTC)

It is the expression containing the -3, which is equal to ${\displaystyle \gamma _{2}}$. I think the phrase is correctly stated. --Pot (talk) 22:27, 3 January 2009 (UTC)

## A nice definition

I found this:

 k is best interpreted as a measure of dispersion of the values
of Z^2 around their expected value of 1, where as usual
Z = (X-mu)/sigma


It has been written by Dick Darlington in an old mail thread. It does not account for the -3 used in Wikipedia's article, but it is clear and could be added to the initial definition. --Pot (talk) 10:40, 19 February 2009 (UTC)
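The interpretation quoted above rests on the identity kurtosis = 1 + Var(Z²), which holds because E[Z²] = 1. A minimal sketch that checks it on a small data set (the function name and the sample values are made up for illustration; population-style 1/n moments are used throughout):

```python
def kurtosis_and_z2_variance(xs):
    """Return (kurtosis, variance of Z^2) using 1/n moments,
    where Z = (X - mu) / sigma."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    z2 = [(x - mu) ** 2 / var for x in xs]      # squared standardized values
    kurt = sum(v * v for v in z2) / n           # E[Z^4], no -3 subtracted
    mean_z2 = sum(z2) / n                       # equals 1 by construction
    var_z2 = sum((v - mean_z2) ** 2 for v in z2) / n
    return kurt, var_z2


data = [2, 4, 4, 4, 5, 5, 7, 9]                 # arbitrary illustrative sample
kurt, var_z2 = kurtosis_and_z2_variance(data)
print(f"kurtosis = {kurt}, 1 + Var(Z^2) = {1 + var_z2}")
```

Both printed values coincide, so the dispersion of Z² around 1 is the kurtosis minus 1 (and the excess kurtosis plus 2).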

## Get Better Examples

Okay, I barely understand the statistical part of the article, so why do you have to use an example that involves something only botanists and biologists can understand? I understand that encyclopaedias are supposed to be erudite, but not pedantic. They shouldn't make you keep clicking on newer and newer subjects that you have to read up on just so you can understand the one you originally started with. An example in an encyclopaedia is supposed to be simple and straightforward, something the uninitiated layman can understand, not something having to do with red lights and gravitropic coleoptiles. It's the people's encyclopaedia; you don't have to dumb it down to make it more accessible. My point is, get a better visual example for what kurtosis is. —Preceding unsigned comment added by 70.73.34.109 (talk) 10:26, 30 April 2009 (UTC)

I agree - two clear examples, of very high and very low kurtosis, would make this article much clearer, and much easier to understand at a glance. Use a couple of every-day activities to prove the point. 165.193.168.6 (talk) 12:27, 13 August 2013 (UTC)

## Glaring Error?

I'm no statistician, but the description of leptokurtosis currently says it has a more acute peak and fatter tails, whereas platykurtosis has a flatter peak and thinner tails. A quick mental diagram demonstrates to me that this is impossible, and the author(s) must have confused the thickness of the tails for the two cases. A leptokurtic curve must have thinner tails and a platykurtic curve must have fatter tails. Unless anyone objects, I'll correct this in a moment. —Preceding unsigned comment added by 194.153.106.254 (talk) 10:33, 23 July 2009 (UTC)

No error, the description was correct; I've reverted your change. Just try to read and understand the article: the description is reasonably well done and graphical examples are in place. Next time, please do not change a math description unless you fully understand it. Raising a problem on the discussion page is good, but wait for someone to answer your doubts before editing. --Pot (talk) 14:17, 23 July 2009 (UTC)
I suspect this paragraph has been reversed again. It says "In terms of shape, a leptokurtic distribution has a more acute peak around the mean and thinner tails". Looking at the rest of the article, it should say fatter tails, shouldn't it? However, I don't feel confident enough to change this. -- Pierre —Preceding unsigned comment added by 160.228.203.130 (talk) 13:52, 17 May 2011 (UTC)
Well, since I've noticed the false change was just done today, I've reverted it myself -- Pierre —Preceding unsigned comment added by 160.228.203.130 (talk) 14:00, 17 May 2011 (UTC)

## Alternative to -3 for kurtosis!

On the Latin wiki page for Distributio normalis you can find a recent (2003) scientific paper which rearranges the fourth moment differently to define a quantity called arch in English (fornix in Latin), which ranges from 0 to infinity (and is 1 for the normal distribution), instead of the rather strange [-2, infinity). by Alexor65 — Preceding unsigned comment added by Alexor65 (talk • contribs) 20:24, 29 March 2011 (UTC)

Link "Celebrating 100 years of Kurtosis" does not work because file has changed address, now it is at least in faculty.etsu.edu/seier/doc/Kurtosis100years.doc ----Alexor65 —Preceding unsigned comment added by 151.76.68.54 (talk) 21:02, 2 April 2011 (UTC)

## First sentence and citation

The first sentence states "In probability theory and statistics, kurtosis is a measure of the "peakedness" of the probability distribution of a real-valued random variable, although some sources are insistent that heavy tails, and not peakedness, is what is really being measured by kurtosis.[1]"

The reference given says, "The heaviness of the tails of a distribution affects the behavior of many statistics. Hence it is useful to have a measure of tail heaviness. One such measure is kurtosis...Statistical literature sometimes reports that kurtosis measures the peakedness of a density. However, heavy tails have much more influence on kurtosis than does the shape of the distribution near the mean (Kaplansky 1945; Ali 1974; Johnson, et al. 1980)."

The reference seems to directly contradict the first sentence. — Preceding unsigned comment added by 140.226.46.75 (talk) 21:26, 7 October 2011 (UTC)

Two things need to be distinguished: (i) kurtosis as a general concept or thing that might be measured ... I have 3 stats dictionaries that say that this is essentially "peakedness"; (ii) the specific measure of kurtosis based on centred ordinary fourth (and second) moments ... for which the property that it is more a measure of long-tailedness might well be valid. Overall this article needs to be re-arranged to distinguish these two points, along the lines of what is current in the article Skewness, which includes several different measures of skewness. The use of the SAS reference (alone) is/was unclear, as it can be considered correct as stated, since the SAS reference is an example of a "source" that does claim that "heavy tails have much more influence on kurtosis than does the shape of the distribution near the mean". (Thus a ref for the end of the sentence, not the whole thing.) I will add another ref and rearrange the start. Melcombe (talk) 12:03, 10 October 2011 (UTC)

## Clarity of Introduction

When I first read through the introduction, I did not understand what it was saying. However, I read through it again, and, perhaps because I picked up on some word I missed the first time around, the meaning of the introduction became completely clear. As this seems to suggest that understanding the introduction hinges on (or at least is heavily dependent on) noticing and understanding a very small portion of it, I would suggest that a small, one-sentence "introduction introduction" be added above (or included in) the current introduction, such that it would quickly convey to readers a general "complexity level" at which the article deals with its subject. To clarify, in this context I am using the phrase "complexity level" to refer to a measure of a work's position on a "sliding scale" of sorts that measures the amount that a work is affected by the general tendency of larger words to become more critical to comprehension as the complexity of a work's subject (among other factors) increases. For instance, a college-level thermodynamics textbook is unlikely to spend the same amount of time leading up to a definition of thermal conduction and insulation that an elementary-level science textbook would. As such, a prior understanding of thermal conduction and insulation becomes more necessary to understand the rest of the book in the college textbook than in the elementary school textbook.

Alternately, the "complexity level" could be reduced, for instance, by using more familiar terms than, also for instance, "peakedness", which, while helping the reader to associate the concept with common phrases such as "highly peaked", could perhaps be moved lower in the introduction (or even put into the article itself) and replaced by another term, such as "sharpness", and reducing the repetition of clauses, such as removing the redundant "just as for skewness" in the second sentence.

Aero-Plex (talk) 17:24, 10 November 2011 (UTC)

(Split due to needing to reset the router)

EDIT: Unfortunately, I do not have the necessary time to read and edit the article now, and probably won't for some time, so I cannot edit the article for now. However, from my brief skim through, I did notice that 1. there is a noticeable amount of repetition of terminology, which could be improved, 2. no compact, direct description of a graph with high/low kurtosis is made in the text, and I could only find one by looking in the image descriptions, which may not be noticed by some (I suggest that a sentence along the lines of "High kurtosis causes narrow curves, while low kurtosis causes wide graphs." be added somewhere in the article where it would be noticed), and 3. the "coin toss" example could be better elaborated on, as it seems like it could be very helpful, especially for people who are only coming to this page for a quick summary.

Aero-Plex (talk) 17:41, 10 November 2011 (UTC)

## The usual estimator of the population kurtosis

I just conducted a simulation study which seems to confirm that "The usual estimator of the population kurtosis" is in fact an estimator for excess kurtosis. This seems to make sense given the last part of the formula, -3*X. — Preceding unsigned comment added by 83.89.29.84 (talk) 23:28, 24 January 2012 (UTC)

ALSO: it claims that the estimator is used in Excel; however, the Excel formula seems to use the standard deviation rather than the variance: http://office.microsoft.com/en-us/excel-help/kurt-HP005209150.aspx BUT the wiki article claims it must be the unbiased standard deviation estimator, which I don't believe exists. — Preceding unsigned comment added by 83.89.29.84 (talk) 00:11, 25 January 2012 (UTC)
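
For anyone wanting to repeat that kind of check, here is a minimal sketch of such a simulation. It assumes the estimator in question is the bias-corrected formula documented for Excel's KURT; the function names `sample_excess_kurtosis` and `mean_estimate` are made up for this illustration. Note that the fourth power of the sample standard deviation equals the square of the unbiased variance estimator, so the "standard deviation" and "variance" forms of the formula agree.

```python
import random


def sample_excess_kurtosis(xs):
    """Bias-corrected sample excess kurtosis (formula assumed to match Excel's KURT)."""
    n = len(xs)
    if n < 4:
        raise ValueError("need at least 4 observations")
    mean = sum(xs) / n
    dev2 = sum((x - mean) ** 2 for x in xs)
    dev4 = sum((x - mean) ** 4 for x in xs)
    s2 = dev2 / (n - 1)  # unbiased variance estimator; s**4 is the same as s2**2
    lead = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return lead * dev4 / s2 ** 2 - correction


def mean_estimate(n_samples, sample_size, seed=0):
    """Average the estimator over many standard normal samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        xs = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        total += sample_excess_kurtosis(xs)
    return total / n_samples


if __name__ == "__main__":
    # A normal distribution has kurtosis 3 and excess kurtosis 0.
    print(mean_estimate(2000, 50))
```

If the average comes out near 0 rather than near 3 for normal data, that is consistent with the formula estimating excess kurtosis, since the normal distribution has kurtosis 3.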

## wrong wrong wrong about peakedness

As has been pointed out above, kurtosis is NOT an accurate measure of peakedness. This should be obvious by looking at a graph of the Student's t-distribution with degrees of freedom above 4 and trying to see if you can see anything approaching sharp peakedness as the d.o.f. drops down to 4 and the kurtosis shoots up to infinity. Similarly, look at the gamma distribution graph and try to notice any correlation at all between the sharpness and softness of the peak when k > 1 and the smallness of k (higher kurtosis). The point is that kurtosis measures ONLY heaviness of the tails, and contrary to the former text, there's no difference in this respect between Pearson's kurtosis and excess kurtosis. (Nor can there be, since the two are identical save for being shifted by 3). In fact, it should be obvious that heavy tails and sharp peaks CANNOT in general be correlated, i.e. one could radically change the shape of the peak in the middle of the graph by a strictly local rearrangement of the nearby area while leaving the tails entirely untouched. It's rather sad that a basic article like this had such basic errors for such a long time, but at least they are fixed now. Benwing (talk) 07:10, 23 March 2012 (UTC)
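
To put numbers on the two examples above, here is a small sketch using the standard closed forms: excess kurtosis 6/(ν − 4) for Student's t with ν > 4 degrees of freedom, and 6/k for a gamma distribution with shape k. The function names are made up for this illustration.

```python
def t_excess_kurtosis(dof):
    """Excess kurtosis of Student's t-distribution; finite only for dof > 4."""
    if dof <= 4:
        raise ValueError("excess kurtosis is finite only for dof > 4")
    return 6.0 / (dof - 4.0)


def gamma_excess_kurtosis(shape):
    """Excess kurtosis of a gamma distribution with shape parameter k > 0."""
    if shape <= 0:
        raise ValueError("shape must be positive")
    return 6.0 / shape


if __name__ == "__main__":
    # Kurtosis grows without bound as dof approaches 4 from above.
    for dof in (30, 10, 5, 4.5, 4.1, 4.01):
        print("t, dof =", dof, "excess kurtosis =", t_excess_kurtosis(dof))
    # Smaller shape k means higher kurtosis.
    for k in (10, 5, 2, 1.5, 1.1):
        print("gamma, k =", k, "excess kurtosis =", gamma_excess_kurtosis(k))
```

Plotting the corresponding densities next to these values is one way to judge whether the peak visibly sharpens as the kurtosis grows.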

OK, it's more complicated than I thought. In fact, Balanda and MacGillivray claim that kurtosis isn't necessarily an accurate measure of tail weight, either, and propose a vague definition of moving probability mass off the shoulders onto the peak and tail, which IMO isn't very helpful intuitively. I still think the most intuitive interpretation should basically specify tail-heaviness. I will rewrite the remainder of the article (post-intro) to be more careful about this. Benwing (talk) 08:09, 23 March 2012 (UTC)

I found the following article to be very helpful in explaining the common misconceptions about kurtosis, specifically related to its use as a measure of "peakedness" and "tail weight." (http://www.columbia.edu/~ld208/psymeth97.pdf) Basically, it explains that kurtosis is a movement of mass not explained by the variance. Thus, when we see heavier tails, this means that data points are spread further out, which should lead to an increase in the variance. BUT if there is also an increase in the number of data points near the mean, this leads to a decrease in the variance; kurtosis is able to explain the change in shape of a distribution when these two effects exactly offset each other.

## Expand Applications

Section Kurtosis#Applications is tagged with {{Expand section}}; here are some suggestions: Special:WhatLinksHere/Kurtosis. Fgnievinski (talk) 04:34, 19 January 2013 (UTC)

## Add sources from IDRE/UCLA for the various definitions of kurtosis, including citations and which versions SAS, SPSS and STATA use

Here are some sources from IDRE/UCLA for the various definitions of kurtosis, including citations and which versions SAS, SPSS and STATA use [4], [5]. Regards, Anameofmyveryown (talk) 18:53, 11 March 2013 (UTC)

## Edit protected erroneous reference ?

Hi. Ref. 4 (Pearson, Biometrika 1929), about the lower bound of the kurtosis, is edit protected and it is erroneous, including the doi: anybody can check. Even if the ref to Pearson can be corrected, the previous reference (paper of 2013), which was reverted on 5 February 2014, remains eligible because (a) it applies to a wider class of distributions than the finite discrete ones (because the proof of 2013 uses mathematical expectations), and (b) it follows from a more general inequality applied to d-variate distributions, established in 2004 (ref cited in the 2013 paper). I can send the pdfs of the 2004 and of the 2013 paper to the administrator (please just tell me how to do that) and to interested people. At least please correct the erroneous reference about Pearson, if it is relevant. If it is not possible, please undo the change of 5 February 2014, and replace ref. 4 by the previous ref 4, which is: [1] Thank you. Michel.

• Requested further details follow.

About the current ref 4: Pearson, K. (1929). "Editorial note to ‘Inequalities for moments of frequency functions and for various statistical constants’". Biometrika 21 (1–4): 370–375. doi:10.1093/biomet/21.1-4.361 (1) The TOC of Biometrika 1929, 21(1-4) is at: http://biomet.oxfordjournals.org/content/21/1-4.toc I failed to find this paper of Pearson in this TOC, and I failed to find it with ZMATH. The doi redirects to the paper of Joanes and Gill, "The Statistician" 1998, vol 47, part 1, pp. 183-189. Indeed it deals with skewness and kurtosis, but it does not cite Pearson and it does not give a general proof of the inequality valid for any random variable distribution. Anyway there is disagreement between the doi and the ref to Pearson. (2) The 2013 paper is publicly available on http://petitjeanmichel.free.fr/itoweb.petitjean.skewness.html (see ref. 2: "download pdf paper"): see the result at the top of p.3 and eq. 6. The proof of the more general inequality for random vectors is in my paper: "From Shape Similarity to Shape Complementarity: Toward a Docking Theory." J. Math. Chem. 2004,35[3],147-158. (DOI 10.1023/B:JOMC.0000033252.59423.6b), see eq. A10 in the appendix. I cannot put it on the web due to the copyright. Only one assumption: the moments of order 4 must exist (so, it is not restricted to samples). I do not claim to have discovered the sharp lower bound of the kurtosis, even in its more general form, and I do not care if my 2013 paper is not cited. However I was the first to mention the inequality on the Wikipedia page, and at first glance my own proof seems to be original. I just say that the reader should be directed to a proof valid in all cases, e.g. via a valid source. If the ref works only for samples, the text should be updated accordingly.
To conclude, I give you the hint for the full proof for random variables (for vectors, see the 2004 paper), available to anybody familiar with mathematical expectations: X1 and X2 are random variables, translate X2, calculate the translation minimizing the variance of the squared difference of the random variables, and look at the expression of the minimized variance: it should be a non-negative quantity, hence the desired inequality. Mailto: petitjean.chiral@gmail.com (preferred) or michel.petitjean@univ-paris-diderot.fr 81.194.29.18 (talk) 13:55, 10 December 2014 (UTC)
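
For readers following this thread, here is a rough numerical sanity check, not a proof and not the argument sketched above. It assumes the inequality under discussion is the usual lower bound kurtosis ≥ skewness² + 1 (non-excess kurtosis), and it only tries finite discrete distributions; all function names are made up for this illustration.

```python
import random


def skewness_and_kurtosis(values, probs):
    """Return (skewness, non-excess kurtosis) of a finite discrete distribution."""
    mean = sum(p * v for v, p in zip(values, probs))
    var = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))
    if var <= 0.0:
        raise ValueError("distribution must not be degenerate")
    m3 = sum(p * (v - mean) ** 3 for v, p in zip(values, probs))
    m4 = sum(p * (v - mean) ** 4 for v, p in zip(values, probs))
    return m3 / var ** 1.5, m4 / var ** 2


def bound_gap(values, probs):
    """kurtosis - (skewness**2 + 1); the assumed bound says this is >= 0."""
    skew, kurt = skewness_and_kurtosis(values, probs)
    return kurt - (skew ** 2 + 1.0)


def smallest_random_gap(trials, atoms=5, seed=0):
    """Smallest gap seen over randomly generated discrete distributions."""
    rng = random.Random(seed)
    smallest = float("inf")
    for _ in range(trials):
        values = [rng.uniform(-1.0, 1.0) for _ in range(atoms)]
        weights = [rng.uniform(0.1, 1.0) for _ in range(atoms)]
        total = sum(weights)
        probs = [w / total for w in weights]
        smallest = min(smallest, bound_gap(values, probs))
    return smallest


if __name__ == "__main__":
    print(bound_gap([0.0, 1.0], [0.7, 0.3]))  # two-point case: gap is zero up to rounding
    print(smallest_random_gap(1000))
```

Two-point distributions sit exactly on the bound, which is why the fair coin toss has the smallest possible kurtosis of 1 (excess kurtosis −2).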

— Preceding unsigned comment added by 81.194.29.18 (talk) 18:54, 8 December 2014 (UTC)


References

1. ^ Petitjean M. (2013), "The Chiral Index: Applications to Multivariate Distributions and to 3D molecular graphs", Proceedings of 12th International Symposium on Operational Research in Slovenia SOR’13, pp. 11-16, L. Zadnik Stirn, J. Zerovnik, J. Povh, S. Drobne, A. Lisec, Eds., Slovenian Society INFORMATIKA (SDI), Section for Operations Research (SOR), ISBN 978-961-6165-40-2
Not done: According to the page's protection level and your user rights, you should be able to edit the page yourself. If you seem to be unable to, please reopen the request with further details. Anupmehra -Let's talk! 13:19, 9 December 2014 (UTC)

I pasted the wrong doi for ref. 4 (the mouse caught the doi of the line above). In fact the Editorial is appended to the paper of Shohat. I cancel my request. Please accept my apologies for the inconvenience caused. Thanks for your patience. 81.194.29.18 (talk) 14:33, 10 December 2014 (UTC)