Talk:Quantile

From Wikipedia, the free encyclopedia
WikiProject Statistics (Rated C-class, Mid-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

This article has been rated as C-Class on the quality scale.
This article has been rated as Mid-importance on the importance scale.
 

Objection to current definition

In my [Mood, Graybill, Boes] the quantile is defined in a completely different way:

The q-th quantile of a random variable X or of the corresponding distribution is defined as the smallest number ξ that satisfies F_X(ξ) <= q.

I find that definition simpler to explain and understand, and more general. Quantiles are continuous, not necessarily chopped in 4, 5, 10 or 100 or whatever. Else it would not be possible to express an irrational quantile (e.g. 1/sqrt(2)). And let's face it, what would the world be without irrational quantiles? :-)

Also this article, as it stands, does not clearly recognise that quantiles are properties of the distribution (be it continuous or discrete), and not of a sample. From a sample, we can only estimate (and not calculate, by the way) these attributes.--PizzaMargherita 20:18, 4 October 2005 (UTC)
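As a rough illustration of the definition quoted above, here is a minimal Python sketch of a quantile computed directly from a discrete distribution. It assumes the inequality is read as F_X(ξ) ≥ q, which is how this kind of definition is usually stated (with ≤, a smallest ξ would not generally exist). The function name `quantile` and the fair-die example are hypothetical and are not taken from the article or the textbook.

```python
import math
from bisect import bisect_left
from fractions import Fraction
from itertools import accumulate


def quantile(values, probs, q):
    """Return the smallest support point x with F(x) >= q.

    Assumption: the quantile is the generalized inverse of the CDF,
    and q is any real number in (0, 1], not just k/4, k/10 or k/100.
    """
    if not 0 < q <= 1:
        raise ValueError("q must lie in (0, 1]")
    pairs = sorted(zip(values, probs))
    # Cumulative probabilities F(x) at each support point, in increasing x.
    cdf = list(accumulate(p for _, p in pairs))
    # First index whose cumulative probability reaches q.
    idx = bisect_left(cdf, q)
    # Guard against rounding when the probabilities sum to slightly under 1.
    idx = min(idx, len(pairs) - 1)
    return pairs[idx][0]


# Hypothetical example: a fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6
median = quantile(faces, probs, Fraction(1, 2))      # 3
irrational = quantile(faces, probs, 1 / math.sqrt(2))  # 5
```

The die's CDF is a step function with no ordinary inverse, yet the function still returns a value for every q in (0, 1], including an irrational level such as 1/sqrt(2).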

While that definition may be more general, it is not completely different. However, quantiles are discrete, not continuous (see "with k an integer satisfying 0 < k < q"). A continuous quantile would essentially be an inverse CDF, where quantile 1/sqrt(2) of a 4-quantile (read: quartile) would be equal to CDF^{-1}(\frac{1}{4\cdot\sqrt{2}}).
Moreover, I must object to that definition via reductio ad absurdum: the 4th quantile of a 100-quantile would result in F_X(ξ) <= 4, but assuming that F_X(ξ) is the CDF of a random variable, this would be satisfied by all values of the random variable, seeing as F_X cannot go beyond 1. I say reductio ad absurdum because 4 is an invalid argument to the inverse CDF, unless you're using complex values in your calculations, as in \sin^{-1}(\sqrt{2})=\frac{\pi}{2} - i\frac{\ln(2\sqrt{2} + 3)}{2} (I got this from a TI-89 calculator). I think it is safe to assume that it is absurd, but this all depends on my assumption that F_X(ξ) is the CDF of a random variable. NormDor 02:08, 20 February 2006 (UTC)
"quantiles are discrete not continuous" - Precisely my objection to the current definition.
"A continuous quantile would essentially be an inverse CDF." - No, it wouldn't. Not all CDFs are invertible.
"quantile 1/sqrt(2) of a 4-quantile" - This is not allowed by the current definition, which mandates k to be an integer.
I'm not sure I understand your "reductio ad absurdum" argument against MGB's definition, but I accept that it allows q-th quantiles to be defined for q<0 and q>1. However, that's easy enough to work around:
The q-th quantile (with 0 ≤ q ≤ 1) of a random variable X or of the corresponding distribution is defined as the smallest number ξ that satisfies F_X(ξ) ≤ q.
I don't understand what complex numbers have to do with any of this. I'm not aware of any extensions in measure theory that allow the inverse CDF to be defined for complex arguments, or indeed for arguments outside [0,1], unlike the well-known extension for logarithms and therefore inverse trigonometric functions.
Finally, your recent "Error Correction" to the article is wrong. "P" is not the CDF. I think the article needs a thorough review. I'll do it when I find some time. PizzaMargherita 23:28, 20 February 2006 (UTC)
I've just never thought of the q in "q-th quantile" as lying in [0, 1] (when I see k'th something I take k to be an integer, because I just don't see ".83752'th quantile" working in a sentence), and thus my main argument is incorrect in the first place. However, I don't see how a continuous CDF cannot be invertible.
""P" is not the CDF." Oops... my bad.
"I'm not aware of any extensions" Neither am I; I was just using it as an example in my flawed argument.
Personally I disagree with such usages but I have found other usages similar to what you have described. I guess you could call it a 1-quantile without the restriction of k being an integer. NormDor 06:02, 21 February 2006 (UTC)
I just checked the edit history and you can blame this entire fiasco on me. Sorry. NormDor 06:20, 21 February 2006 (UTC)
Hey, no worries. Take it easy. PizzaMargherita 07:07, 21 February 2006 (UTC)
All CDFs of discrete random variables are invertible when you interpolate between every discrete element, as is allowed with quantiles (weighted average). The interpolation would result in a CDF that is a strictly increasing continuous function, satisfying the "ONTO" and "ONE-TO-ONE" properties required for the existence of an inverse function. Regarding the complex values, they were the result of mistaking (0,1)-bounded continuous values for discrete values, resulting in CDF(X)>1, which is impossible (ergo the absurdity in a strict, unclarified interpretation of the given equation) unless you consider complex values. --ANONYMOUS COWARD0xC0DE 04:21, 2 April 2007 (UTC)
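The interpolation idea in the comment above can be sketched as follows. This is only an illustration under stated assumptions: linear interpolation between the points (x_i, F(x_i)) of a discrete CDF, with every x_i having positive probability so that the curve is strictly increasing. The helper name `interpolated_inverse_cdf` is hypothetical. Note that the interpolated curve is not the CDF of the original discrete variable, and its inverse only exists for levels between F(x_1) and 1.

```python
from fractions import Fraction


def interpolated_inverse_cdf(xs, cdf, q):
    """Invert the piecewise-linear curve through the points (xs[i], cdf[i]).

    Assumptions: xs is sorted, has at least two points, and cdf is
    strictly increasing, so each segment has a nonzero rise.
    """
    if len(xs) < 2 or len(xs) != len(cdf):
        raise ValueError("need at least two matching (x, F(x)) points")
    if not cdf[0] <= q <= cdf[-1]:
        raise ValueError("q is outside the range of the interpolated curve")
    for i in range(len(xs) - 1):
        x0, x1 = xs[i], xs[i + 1]
        f0, f1 = cdf[i], cdf[i + 1]
        if f0 <= q <= f1:
            # Solve f0 + (x - x0) * (f1 - f0) / (x1 - x0) = q for x.
            return x0 + (q - f0) * (x1 - x0) / (f1 - f0)


# Hypothetical example: a fair die, F(k) = k/6 for k = 1..6.
die_faces = [1, 2, 3, 4, 5, 6]
die_cdf = [Fraction(k, 6) for k in die_faces]
halfway = interpolated_inverse_cdf(die_faces, die_cdf, Fraction(7, 12))  # 7/2
```

Levels below F(x_1), here 1/6, raise an error rather than return a value, which shows that the interpolated inverse does not cover all of (0, 1).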

It would be nice to have some non-mathematical discussion of these terms - you guys make it too hard!

Equivalent Characterisation

I think there might be a slight mistake in the equivalent characterization of the p- and q-quantile. Should the second line not be P(X>x)\leq p or something similar?

No, I do not believe that the definition should be P(X>x)\leq p, because in my opinion it is almost always given as P(X<x)\leq p. I believe that the author added the second equivalent to account for the actual method used in estimating the quantiles, where sometimes you round up and sometimes you round down. Along those lines, I believe that the two equivalent functions could just as easily be expressed as P(X\le x)\cong p\mbox{ and }P(X\ge x)\cong 1-p. I hope I didn't go on and on as much as I did last time (⇩⇩⇩See below⇩⇩⇩!!!). Sorry about that; I don't know what I was thinking. NormDor 13:38, 22 June 2006 (UTC)
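To make the two-sided characterisation concrete, here is a small Python sketch that checks it on a discrete distribution. It assumes the pair of conditions under discussion is the usual one, P(X < x) ≤ p and P(X > x) ≤ 1 − p (equivalently P(X ≤ x) ≥ p); the article's wording at the time may have differed. The function name `is_quantile` and the fair-die example are hypothetical.

```python
from fractions import Fraction


def is_quantile(x, p, dist):
    """Check whether x is a p-quantile of the discrete distribution dist.

    dist maps each value to its probability. Assumed characterisation:
    P(X < x) <= p  and  P(X > x) <= 1 - p.
    """
    below = sum(pr for v, pr in dist.items() if v < x)
    above = sum(pr for v, pr in dist.items() if v > x)
    return below <= p and above <= 1 - p


# Hypothetical example: a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}
medians = [x for x in die if is_quantile(x, Fraction(1, 2), die)]         # [3, 4]
first_quartiles = [x for x in die if is_quantile(x, Fraction(1, 4), die)]  # [2]
```

Both 3 and 4 satisfy the two conditions at p = 1/2, which is one reason estimation methods have to choose between rounding down, rounding up, or averaging.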

Two images

Two images from Yrithinnd's list of Commons images might be of interest here?

--DO11.10 00:19, 5 May 2007 (UTC)

Tertile

I came across the word tertile, but found no article for it, so I made a stub. The word is infrequent in use, but I ask that someone with a better command of the statistical lingo than mine refine the article. --Shingra 08:29, 23 July 2007 (UTC)

The tertile stub has since been rewritten and transwikified: tertile. --Shingra (talk) 11:24, 23 November 2007 (UTC)

Dr.Math

I came across this http://mathforum.org/library/drmath/view/60969.html which I believe has a much simpler explanation.

Should we put it in the external links at least? —Preceding unsigned comment added by 189.33.225.219 (talk) 05:41, 25 December 2007 (UTC)

Estimating the quantiles of a population

I had added the "Quantiles of a sample" section, distinct from the "Quantiles of a population" section, because I have used the ideas of the former for several years, and had assumed that they were commonplace. However, since the time of adding that section, I have not seen any published works that are similar to my approach. Thus, I must conclude that my approach is actually original research — and I have removed it. Quantling (talk) 20:39, 10 August 2009 (UTC)

Quantiles of a sample, revisited: Can we really compute a 1 percentile or 99 percentile if we have, say, only N=2 points drawn from some distribution? The expected (average) percentile of the smaller of two points is 33 1/3 and the expected percentile of the larger of two points is 66 2/3. (See order statistics.) I have no problem computing percentiles between 33 1/3 and 66 2/3 by interpolating the two sampled values, but I am tempted to say that, for percentiles that are below 33 1/3 or above 66 2/3, we don't have enough information to estimate the value. Does that make sense? If so, is it appropriate to touch on these ideas in the article? Quantling (talk) 18:39, 2 February 2010 (UTC)

The distinction manifests itself in interpolation as well. If I had to choose one of N=3 points to represent the 35 percentile, I would choose the smallest value (with an expected percentile of 25, which is only 10 away from 35) rather than the middle value (50 percentile). However, the article tells me that the smallest value represents the percentile range up to at most 33 1/3 and that, among the three points, the middle value is the appropriate choice. Quantling (talk) 18:53, 2 February 2010 (UTC)
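The figures in the two comments above are consistent with a standard result for samples from a continuous distribution: the k-th smallest of N points has expected percentile 100·k/(N+1). A minimal Python sketch of that arithmetic follows; the function names are hypothetical, and the tie-breaking rule (lower rank wins) is an assumption made only for illustration.

```python
from fractions import Fraction


def expected_percentiles(n):
    """Expected percentile of each order statistic in a sample of size n.

    Assumes a continuous distribution, where the k-th smallest point
    has expected percentile 100 * k / (n + 1).
    """
    return [Fraction(100 * k, n + 1) for k in range(1, n + 1)]


def nearest_order_statistic(p, n):
    """1-based rank whose expected percentile is closest to percentile p.

    Ties go to the lower rank (an arbitrary choice for this sketch).
    """
    ranks = range(1, n + 1)
    return min(ranks, key=lambda k: abs(Fraction(100 * k, n + 1) - p))


two_points = expected_percentiles(2)     # [100/3, 200/3]
three_points = expected_percentiles(3)   # [25, 50, 75]
rank_for_35 = nearest_order_statistic(35, 3)  # 1, the smallest value
```

Under this view the smallest of three points is the closest representative of the 35th percentile, matching the choice argued for in the comment above.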

The current "Estimating the quantiles of a population" section now addresses these issues, by listing all R and SAS approaches for estimating quantiles. Quantling (talk) 15:28, 22 March 2010 (UTC)
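For readers comparing the listed approaches, here is a hedged Python sketch of three of the interpolating estimators, using the position rules commonly tabulated for them: h = Np + 1/2 for R-5, h = (N+1)p for R-6 and h = (N−1)p + 1 for R-7. The function name `sample_quantile` is hypothetical, and clamping h to the range [1, N] is an assumption about how out-of-range positions are handled.

```python
import math

# 1-based position rules h(N, p) for three interpolating methods.
_POSITION = {
    "R-5": lambda n, p: n * p + 0.5,
    "R-6": lambda n, p: (n + 1) * p,
    "R-7": lambda n, p: (n - 1) * p + 1,
}


def sample_quantile(data, p, method="R-7"):
    """Estimate the p-quantile of a sample by linear interpolation."""
    if not data:
        raise ValueError("data must not be empty")
    if not 0 <= p <= 1:
        raise ValueError("p must lie in [0, 1]")
    xs = sorted(data)
    n = len(xs)
    h = _POSITION[method](n, p)
    # Assumption: positions outside [1, n] are clamped to the extremes.
    h = min(max(h, 1), n)
    lo = math.floor(h)
    hi = math.ceil(h)
    # Interpolate between the order statistics on either side of h.
    return xs[lo - 1] + (h - lo) * (xs[hi - 1] - xs[lo - 1])


sample = [1, 2, 3, 4, 5]
q25 = {m: sample_quantile(sample, 0.25, m) for m in ("R-5", "R-6", "R-7")}
# {"R-5": 1.75, "R-6": 1.5, "R-7": 2.0}
```

With R-6 and this clamping, a two-point sample maps every level below 1/3 to the smaller point, which lines up with the 33 1/3 figure in the N=2 comment above.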

Preferred method

I have removed the following

  • Monte Carlo simulations show that method R-5 is the preferred method for continuous data. <ref name="schoonjans">Schoonjans F, De Bacquer D, Schmid P (2011). "Estimation of population percentiles". Epidemiology 22 (5): 750–751. doi:10.1097/EDE.0b013e318225c1de.</ref>

as this is a letter in a medical journal and it looks as if the writers are expressing personal opinions, having looked at a particular case, rather than a widely held view among statisticians.--Rumping (talk) 22:59, 13 December 2011 (UTC)

Quartiles

I've added Quartiles to the list of Specialized quantiles. Is there a reason it wasn't there in the first place, while less common quantiles like duo-deciles are present? --Adam Matan (talk) 11:35, 8 September 2013 (UTC)