In descriptive statistics, the interquartile range (IQR), also called the midspread, the middle 50%, or, technically, the H-spread, is a measure of statistical dispersion equal to the difference between the 75th and 25th percentiles, that is, between the upper and lower quartiles: IQR = Q3 − Q1. In other words, the IQR is the first quartile subtracted from the third quartile; these quartiles can be clearly seen on a box plot of the data. It is a trimmed estimator, defined as the 25% trimmed range, and is a commonly used robust measure of scale.
The IQR is a measure of variability based on dividing a data set into quartiles. Quartiles divide a rank-ordered data set into four equal parts. The values that separate the parts are called the first, second, and third quartiles, and they are denoted by Q1, Q2, and Q3, respectively.
Data set in a table
|i||x[i]||Quartile|
|1||7|| |
|2||7|| |
|3||21|| |
|4||31||Q1|
|5||47|| |
|6||75|| |
|7||87||Q2|
|8||115|| |
|9||116|| |
|10||119||Q3|
|11||119|| |
|12||155|| |
|13||177|| |
For the data in this table the interquartile range is IQR = Q3 − Q1 = 119 − 31 = 88.
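As a rough check of the arithmetic above, the following Python sketch recomputes the quartiles of the table data. It assumes the `inclusive` method of the standard library's `statistics.quantiles`, which for this 13-point data set picks out the 4th, 7th, and 10th ordered values, matching the table. Other quartile conventions (including that function's default `exclusive` method) can give different values of Q1 and Q3, so this is one possible convention, not the only one.

```python
from statistics import quantiles

# Values x[i] from the table above
data = [7, 7, 21, 31, 47, 75, 87, 115, 116, 119, 119, 155, 177]

# n=4 asks for the three cut points that split the data into quartiles.
# method="inclusive" is an assumption chosen to match the table.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print("Q1 =", q1, "Q2 =", q2, "Q3 =", q3, "IQR =", iqr)
```

Under this convention the sketch reproduces Q1 = 31, Q3 = 119, and IQR = 88.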
Data set in a plain-text box plot
                             +-----+-+
   o           *     |-------|     | |---|
                             +-----+-+
 +---+---+---+---+---+---+---+---+---+---+---+---+   number line
 0   1   2   3   4   5   6   7   8   9   10  11  12
For the data set in this box plot:
- lower (first) quartile Q1 = 7
- median (second quartile) Q2 = 8.5
- upper (third) quartile Q3 = 9
- interquartile range, IQR = Q3 − Q1 = 2
Interquartile range of distributions
The interquartile range of a continuous distribution can be calculated by integrating the probability density function (PDF), which yields the cumulative distribution function (CDF); any other means of calculating the CDF will also work. The lower quartile, Q1, is the number such that the integral of the PDF from −∞ to Q1 equals 0.25, while the upper quartile, Q3, is the number such that the integral from −∞ to Q3 equals 0.75. In terms of the CDF, the quartiles can be defined as follows:
Q1 = CDF⁻¹(0.25), Q3 = CDF⁻¹(0.75),
where CDF⁻¹ is the quantile function.
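When the quantile function has no convenient closed form, the quartiles can be approximated by inverting the CDF numerically. The sketch below does this by bisection for a standard normal distribution, whose CDF is written with `math.erf`. The helper names `normal_cdf` and `quantile`, the bracket [−10, 10], and the tolerance are illustrative assumptions, not part of any standard definition.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, expressed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def quantile(cdf, p, lo, hi, tol=1e-10):
    """Approximate CDF^-1(p) by bisection.

    Assumes cdf is continuous and increasing and that
    cdf(lo) < p < cdf(hi); lo and hi are chosen by the caller.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid   # the quantile lies to the right of mid
        else:
            hi = mid   # the quantile lies at or to the left of mid
    return 0.5 * (lo + hi)

q1 = quantile(normal_cdf, 0.25, -10.0, 10.0)
q3 = quantile(normal_cdf, 0.75, -10.0, 10.0)
iqr = q3 - q1

print("Q1 =", q1, "Q3 =", q3, "IQR =", iqr)
```

Bisection is slow but only needs the CDF to be monotone, which is why it is used here in place of a faster root finder.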
The interquartile range and median of some common distributions are shown below:
|Distribution||Median||IQR|
|Normal||μ||2 Φ⁻¹(0.75)σ ≈ 1.349σ ≈ (27/20)σ|
|Laplace||μ||2b ln(2) ≈ 1.386b|
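The two IQR entries can be checked numerically. The sketch below uses `statistics.NormalDist.inv_cdf` for Φ⁻¹ and the closed-form quantile function of the Laplace distribution; the function names `normal_iqr`, `laplace_quantile`, and `laplace_iqr` are made up for this illustration.

```python
import math
from statistics import NormalDist

def normal_iqr(sigma):
    # 2 * Phi^-1(0.75) * sigma; the location mu cancels in Q3 - Q1
    return 2.0 * NormalDist().inv_cdf(0.75) * sigma

def laplace_quantile(p, mu, b):
    # Quantile function of a Laplace distribution with location mu, scale b
    if p < 0.5:
        return mu + b * math.log(2.0 * p)
    return mu - b * math.log(2.0 * (1.0 - p))

def laplace_iqr(mu, b):
    return laplace_quantile(0.75, mu, b) - laplace_quantile(0.25, mu, b)

print("normal IQR / sigma:", normal_iqr(1.0))
print("Laplace IQR / b:  ", laplace_iqr(0.0, 1.0))
```

For σ = 1 and b = 1 the printed values should agree with the coefficients 1.349 and 1.386 to the precision shown in the table.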
Interquartile range test for normality of distribution
The IQR, mean, and standard deviation of a population P can be used in a simple test of whether or not P is normally distributed, or Gaussian. If P is normally distributed, then the standard score of the first quartile, z1, is −0.67, and the standard score of the third quartile, z3, is +0.67. Given mean = X and standard deviation = σ for P, if P is normally distributed, the first quartile is
Q1 = (σ z1) + X
and the third quartile is
Q3 = (σ z3) + X.
If the actual values of the first or third quartiles differ substantially from the calculated values, P is not normally distributed. However, a normal distribution can be trivially perturbed so that the standard scores of Q1 and Q3 stay at −0.67 and +0.67 while the distribution is no longer normal, in which case the above test would produce a false positive. A better test of normality, such as a Q-Q plot, would be indicated here.
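The comparison described above can be sketched in Python as follows. The function name `quartile_normality_check` is hypothetical, the `inclusive` quartile convention is an assumption, and the sketch deliberately returns the two pairs of quartiles without deciding what counts as a substantial difference, since the text does not fix a threshold.

```python
from statistics import NormalDist, fmean, pstdev, quantiles

def quartile_normality_check(data):
    """Return quartiles predicted by a normal model and the observed ones."""
    mean = fmean(data)
    sigma = pstdev(data)               # population standard deviation
    z3 = NormalDist().inv_cdf(0.75)    # about +0.67; z1 = -z3 by symmetry
    expected_q1 = mean - z3 * sigma
    expected_q3 = mean + z3 * sigma
    observed_q1, _, observed_q3 = quantiles(data, n=4, method="inclusive")
    return {
        "expected": (expected_q1, expected_q3),
        "observed": (observed_q1, observed_q3),
    }

# Example input: the 13 values from the table earlier in the article
data = [7, 7, 21, 31, 47, 75, 87, 115, 116, 119, 119, 155, 177]
result = quartile_normality_check(data)
print("expected (Q1, Q3):", result["expected"])
print("observed (Q1, Q3):", result["observed"])
```

A large gap between the expected and observed pairs suggests non-normality; a small gap does not establish normality.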
Interquartile range and outliers
The interquartile range is often used to find outliers in data. Under this rule, outliers are observations that fall below Q1 − 1.5(IQR) or above Q3 + 1.5(IQR). In a box plot, the lowest and highest values occurring within these limits are drawn as the ends of the whiskers, and the outliers are drawn as individual points.
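A minimal sketch of this rule in Python is given below. The function name `iqr_outliers` and its parameter `k` are illustrative, the quartiles are computed with the `inclusive` method of `statistics.quantiles` (an assumption; other conventions shift the limits slightly), and the value 500 is a made-up observation added only to trigger the rule.

```python
from statistics import quantiles

def iqr_outliers(data, k=1.5):
    """Return (lower_limit, upper_limit, outliers) for the k * IQR rule."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    outliers = [x for x in data if x < lower or x > upper]
    return lower, upper, outliers

# The 13 values from the table earlier in the article
table_data = [7, 7, 21, 31, 47, 75, 87, 115, 116, 119, 119, 155, 177]

print(iqr_outliers(table_data))          # no value falls outside the limits
print(iqr_outliers(table_data + [500]))  # the made-up value 500 is flagged
```

With Q1 = 31 and Q3 = 119 the limits for the table data are 31 − 132 = −101 and 119 + 132 = 251, so none of its values is an outlier under this rule.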