In descriptive statistics, the interquartile range (IQR), also called the midspread or middle fifty, is a measure of statistical dispersion equal to the difference between the upper and lower quartiles: IQR = Q3 − Q1. In other words, the IQR is the third quartile minus the first quartile; both quartiles can be read directly off a box plot of the data. It is a trimmed estimator, defined as the 25% trimmed range, and is the most significant basic robust measure of scale.
The IQR is used to build box plots, simple graphical representations of a probability distribution.
Data set in a table
|i||x[i]||Quartile|
|1||102|| |
|2||104|| |
|3||105||Q1|
|4||107|| |
|5||108|| |
|6||109||Q2|
|7||110|| |
|8||112|| |
|9||115||Q3|
|10||116|| |
|11||118|| |
For the data in this table the interquartile range is IQR = 115 − 105 = 10.
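The table's quartiles can be reproduced with a short sketch. It assumes the median-of-halves convention (the median is left out of both halves when the number of values is odd), which matches the positions marked in the table; other quartile conventions can give slightly different values. The function name `quartiles` is only illustrative.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]              # values below the median position
    upper = data[half + n % 2:]      # values above it (median skipped when n is odd)
    return median(lower), median(data), median(upper)


data = [102, 104, 105, 107, 108, 109, 110, 112, 115, 116, 118]
q1, q2, q3 = quartiles(data)
iqr = q3 - q1
print(q1, q2, q3, iqr)  # 105 109 115 10
```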
Data set in a plain-text box plot
                             +-----+-+
   o           *     |-------|     | |---|
                             +-----+-+

 +---+---+---+---+---+---+---+---+---+---+---+---+   number line
 0   1   2   3   4   5   6   7   8   9   10  11  12
For the data set in this box plot:
- lower (first) quartile Q1 = 7
- median (second quartile) Q2 = 8.5
- upper (third) quartile Q3 = 9
- interquartile range, IQR = Q3 − Q1 = 2
Interquartile range of distributions
The interquartile range of a continuous distribution can be calculated by integrating the probability density function (PDF), which yields the cumulative distribution function (CDF); any other means of calculating the CDF will also work. The lower quartile, Q1, is the number such that the integral of the PDF from −∞ to Q1 equals 0.25, while the upper quartile, Q3, is the number such that the integral from −∞ to Q3 equals 0.75. In terms of the CDF, the quartiles can be defined as follows:
$$Q_1 = \mathrm{CDF}^{-1}(0.25), \qquad Q_3 = \mathrm{CDF}^{-1}(0.75)$$
where $\mathrm{CDF}^{-1}$ is the quantile function.
The interquartile range and median of some common distributions are shown below:
|Distribution||Median||IQR|
|Normal||μ||2 Φ⁻¹(0.75)σ ≈ 1.349σ ≈ (27/20)σ|
|Laplace||μ||2b ln(2) ≈ 1.386b|
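The two IQR entries can be checked numerically by evaluating each quantile function at 0.25 and 0.75. The sketch below is illustrative only: it uses `statistics.NormalDist` from the Python standard library for the normal case and a hand-written Laplace quantile function, and the helper names `normal_iqr`, `laplace_quantile` and `laplace_iqr` are assumptions, not part of any library.

```python
import math
from statistics import NormalDist


def normal_iqr(mu=0.0, sigma=1.0):
    """IQR of a normal distribution: CDF^-1(0.75) - CDF^-1(0.25)."""
    dist = NormalDist(mu, sigma)
    return dist.inv_cdf(0.75) - dist.inv_cdf(0.25)


def laplace_quantile(p, mu=0.0, b=1.0):
    """Quantile function of a Laplace distribution with location mu and scale b."""
    if p < 0.5:
        return mu + b * math.log(2 * p)
    return mu - b * math.log(2 * (1 - p))


def laplace_iqr(mu=0.0, b=1.0):
    """IQR of a Laplace distribution, which should equal 2 * b * ln(2)."""
    return laplace_quantile(0.75, mu, b) - laplace_quantile(0.25, mu, b)


print(round(normal_iqr(), 3))   # 1.349
print(round(laplace_iqr(), 3))  # 1.386
```

Both results scale linearly with the scale parameter (σ or b) and do not depend on the location μ.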
Interquartile range test for normality of distribution
The IQR, mean, and standard deviation of a population P can be used in a simple test of whether or not P is normally distributed (Gaussian). If P is normally distributed, then the standard score of the first quartile, z1, is approximately −0.67, and the standard score of the third quartile, z3, is approximately +0.67. Given mean = X and standard deviation = σ for P, if P is normally distributed, the first quartile is
$$Q_1 = (\sigma \, z_1) + X$$
and the third quartile is
$$Q_3 = (\sigma \, z_3) + X$$
If the actual values of the first or third quartiles differ substantially from the calculated values, P is not normally distributed.
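A minimal sketch of this comparison follows. It computes the quartiles a normal population would have for a given mean and standard deviation, then reports how far the observed quartiles lie from them, in units of σ. The text does not say how large a gap counts as substantial, so the sketch only reports the gaps and applies no threshold. The function names are hypothetical, and the observed quartiles come from `statistics.quantiles` with its default method, which is one of several quartile conventions.

```python
from statistics import NormalDist, mean, pstdev, quantiles

# Standard score of the third quartile of a normal distribution (about +0.6745);
# the first quartile has the same score with the opposite sign.
Z3 = NormalDist().inv_cdf(0.75)


def expected_quartiles(mu, sigma):
    """Quartiles a normal population with this mean and sd would have."""
    return mu - Z3 * sigma, mu + Z3 * sigma


def quartile_gaps(values):
    """Observed minus expected (Q1, Q3), expressed in units of sigma."""
    mu, sigma = mean(values), pstdev(values)
    q1_obs, _, q3_obs = quantiles(values, n=4)
    q1_exp, q3_exp = expected_quartiles(mu, sigma)
    return (q1_obs - q1_exp) / sigma, (q3_obs - q3_exp) / sigma


sample = [102, 104, 105, 107, 108, 109, 110, 112, 115, 116, 118]
print(quartile_gaps(sample))
```

Gaps close to zero are consistent with normality; large gaps in either quartile suggest that the population is not normal.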
Interquartile range and outliers
The interquartile range is often used to find outliers in data. Outliers are observations that fall below Q1 − 1.5(IQR) or above Q3 + 1.5(IQR). In a box plot, the highest and lowest values occurring within these limits are drawn as the ends of the whiskers, and the outliers are drawn as individual points.
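The rule can be sketched in a few lines. The example takes the quartiles as inputs rather than computing them, so it does not depend on any particular quartile convention. The function names and the sample values are hypothetical, chosen only to resemble a data set with Q1 = 7 and Q3 = 9.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return the (lower, upper) limits Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, q1, q3, k=1.5):
    """Return the observations that fall outside the two limits."""
    low, high = iqr_fences(q1, q3, k)
    return [v for v in values if v < low or v > high]


# Hypothetical observations, with the quartiles supplied directly.
values = [0.5, 3.5, 5, 7, 8.5, 9, 10]
print(iqr_fences(7, 9))             # (4.0, 12.0)
print(find_outliers(values, 7, 9))  # [0.5, 3.5]
```

With these inputs the whiskers would end at 5 and 10, the most extreme values that still lie within the limits.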