Normal probability plot
The data are plotted against a theoretical normal distribution in such a way that the points should form an approximate straight line. Departures from this straight line indicate departures from normality.
The normal probability plot is formed by:
- Vertical axis: Ordered response values
- Horizontal axis: Normal order statistic medians or means; see rankit
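These are calculated as follows. For each ordered data value $x_{(i)}$, $i = 1, \dots, n$, find $z_i$ such that:
$$P(Z \le z_i) = p_i, \quad \text{that is,} \quad z_i = G(p_i),$$
where $Z$ is a standard normal random variable, $G$ is the normal quantile function, and $p_i$ is the plotting position of the $i$-th order statistic, for example $p_i = (i - a)/(n + 1 - 2a)$ with a constant $a$ between 0 and 1 (commonly $a = 3/8$).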
That is, the observations are plotted as a function of the corresponding normal order statistic medians. Another way to think about this is that the sample values are plotted against the values we would expect to see if the sample were strictly consistent with the normal distribution.
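The construction described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `normal_plot_points` is hypothetical, and the plotting position $(i - a)/(n + 1 - 2a)$ with $a = 3/8$ is an assumed, commonly used approximation to the order statistic positions. Only the standard library is used.

```python
from statistics import NormalDist

def normal_plot_points(data, a=0.375):
    """Return (z, x) pairs: theoretical normal quantile and ordered data value.

    The plotting position (i - a) / (n + 1 - 2a) is an assumption of this
    sketch; other conventions exist.
    """
    n = len(data)
    ordered = sorted(data)          # vertical axis: ordered response values
    std_normal = NormalDist()       # mean 0, standard deviation 1
    points = []
    for i, x in enumerate(ordered, start=1):
        p = (i - a) / (n + 1 - 2 * a)        # plotting position in (0, 1)
        z = std_normal.inv_cdf(p)            # horizontal axis: normal quantile
        points.append((z, x))
    return points

# Illustrative data, not taken from the article.
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.6]
for z, x in normal_plot_points(sample):
    print(f"{z:+.3f}  {x}")
```

Plotting the second column against the first gives the normal probability plot for the sample.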
If the data are consistent with a sample from a normal distribution, the points should lie close to a straight line. As a reference, a straight line can be fitted to the points. The further the points deviate from this line, the stronger the indication of departure from normality. If the sample has mean 0 and standard deviation 1, a line through the origin with slope 1 can be used. How closely the points follow the line depends on the sample size. For a large sample (more than 100 observations), the points should lie very close to the reference line. Smaller samples show much larger variation but may still be consistent with a normal distribution.
Probability plots for distributions other than the normal are computed in exactly the same way. The normal quantile function G is simply replaced by the quantile function of the desired distribution. That is, a probability plot can easily be generated for any distribution for which one has the quantile function.
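As an illustration of swapping in a different quantile function, the sketch below builds probability plot points for the exponential distribution with rate 1, whose quantile function is $-\ln(1 - p)$. The names `probability_plot_points` and `exponential_quantile` are hypothetical, and the plotting position $(i - a)/(n + 1 - 2a)$ with $a = 3/8$ is an assumed convention.

```python
import math

def probability_plot_points(data, quantile, a=0.375):
    """Pair each ordered data value with a theoretical quantile.

    `quantile` is the quantile function of the candidate distribution.
    The plotting position formula is an assumption of this sketch.
    """
    n = len(data)
    ordered = sorted(data)
    return [(quantile((i - a) / (n + 1 - 2 * a)), x)
            for i, x in enumerate(ordered, start=1)]

def exponential_quantile(p):
    """Quantile function of the exponential distribution with rate 1."""
    return -math.log(1.0 - p)

# Illustrative data, not taken from the article.
waiting_times = [0.2, 1.7, 0.5, 3.1, 0.9, 0.1, 1.2]
for q, x in probability_plot_points(waiting_times, exponential_quantile):
    print(f"{q:.3f}  {x}")
```

Passing a different `quantile` callable is the only change needed to target another distribution.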
One advantage of this method of computing probability plots is that the intercept and slope estimates of the fitted line are in fact estimates for the location and scale parameters of the distribution. Although this is not too important for the normal distribution since the location and scale are estimated by the mean and standard deviation, respectively, it can be useful for many other distributions.
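The following sketch shows how the fitted line yields location and scale estimates in the normal case. It fits an ordinary least-squares line to the plot points; the helper names `normal_quantiles` and `fit_reference_line`, the choice of least squares, and the plotting position with $a = 3/8$ are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean

def normal_quantiles(n, a=0.375):
    """Theoretical normal quantiles for n points (assumed plotting positions)."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - a) / (n + 1 - 2 * a))
            for i in range(1, n + 1)]

def fit_reference_line(data):
    """Least-squares line through the probability plot points.

    Returns (intercept, slope); for normal data these estimate the
    location (mean) and scale (standard deviation). Needs at least 2 points.
    """
    x = sorted(data)
    z = normal_quantiles(len(x))
    z_bar, x_bar = mean(z), mean(x)
    slope = (sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x))
             / sum((zi - z_bar) ** 2 for zi in z))
    intercept = x_bar - slope * z_bar
    return intercept, slope

# Simulated sample of size 50 with mean 10 and standard deviation 2.
rng = random.Random(0)
sample = [rng.gauss(10, 2) for _ in range(50)]
intercept, slope = fit_reference_line(sample)
print(f"location estimate: {intercept:.2f}, scale estimate: {slope:.2f}")
```

For a sample that is close to normal, the two printed values should be near the true mean and standard deviation, though not equal to them.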
The correlation coefficient of the points on the normal probability plot can be compared to a table of critical values to provide a formal test of the hypothesis that the data come from a normal distribution.
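A minimal sketch of the correlation statistic is given below. It computes the Pearson correlation between the ordered data and the theoretical normal quantiles. The function names and the plotting position with $a = 3/8$ are assumptions, and no table of critical values is included, so the code yields the statistic only and not a test decision.

```python
import math
import random
from statistics import NormalDist, mean

def normal_quantiles(n, a=0.375):
    """Theoretical normal quantiles for n points (assumed plotting positions)."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - a) / (n + 1 - 2 * a))
            for i in range(1, n + 1)]

def probability_plot_correlation(data):
    """Pearson correlation of the normal probability plot points."""
    x = sorted(data)
    z = normal_quantiles(len(x))
    z_bar, x_bar = mean(z), mean(x)
    sxz = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x))
    szz = sum((zi - z_bar) ** 2 for zi in z)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxz / math.sqrt(szz * sxx)

# Simulated samples of size 50: one normal, one right-skewed (exponential).
rng = random.Random(1)
normal_sample = [rng.gauss(0, 1) for _ in range(50)]
skewed_sample = [rng.expovariate(1.0) for _ in range(50)]
print("normal sample:", round(probability_plot_correlation(normal_sample), 4))
print("skewed sample:", round(probability_plot_correlation(skewed_sample), 4))
```

Values close to 1 indicate points that lie near a straight line; the resulting value would then be compared with a published critical value for the given sample size.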
This is a sample of size 50 from a normal distribution, plotted as both a histogram and a normal probability plot.
This is a sample of size 50 from a right-skewed distribution, plotted as both a histogram and a normal probability plot.
This is a sample of size 50 from a uniform distribution, plotted as both a histogram and a normal probability plot.