Parametric statistics is a branch of statistics which assumes that sample data come from a population that follows a probability distribution based on a fixed set of parameters. Most well-known elementary statistical methods are parametric. Conversely, a non-parametric model differs precisely in that the parameter set (or feature set in machine learning) is not fixed: it can grow, or even shrink, as new relevant information is collected.
Since a parametric model relies on a fixed parameter set, it assumes more about a given population than non-parametric methods do. When the assumptions are correct, parametric methods produce more accurate and precise estimates than non-parametric methods, and tests based on them have more statistical power. However, because parametric methods assume more, they are more likely to fail when the assumptions are not correct, and for this reason they are not robust statistical methods. On the other hand, parametric formulae are often simpler to write down and faster to compute. Their simplicity can make up for their lack of robustness, especially if care is taken to examine diagnostic statistics.
The distributions in the normal family all have the same general shape and are parameterized by mean and standard deviation. That means that if the mean and standard deviation are known and if the distribution is normal, the probability of any future observation lying in a given range is known. Suppose we have a sample of 99 test scores with a sample mean of 100 and a sample standard deviation of 1. If we assume all 99 test scores are random observations from a normal distribution, we predict there is a 1% chance that the 100th test score will be higher than 102.365 (that is, the mean plus 2.365 standard deviations), assuming that the 100th test score comes from the same distribution as the others. The multiplier 2.365 is the 99th percentile of Student's t-distribution with 98 degrees of freedom; it is used in place of the normal quantile, about 2.326, because the mean and standard deviation are estimated from the sample rather than known exactly. Parametric statistical methods are used to compute the 2.365 value above, given 99 independent observations from the same normal distribution.
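The multipliers in the test-score example can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `t_pdf`, `t_cdf` and `t_quantile` are illustrative and not part of any library, and the t-quantile is obtained by simple numerical integration and bisection rather than by a production-quality routine. The last line additionally applies the factor sqrt(1 + 1/n) that a full prediction interval for one new observation uses, which the example in the text leaves out.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """CDF for x >= 0, using composite Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_quantile(p, df, lo=0.0, hi=10.0):
    """Quantile for p > 0.5, found by bisection on t_cdf."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n, mean, sd = 99, 100.0, 1.0             # values from the example

z = NormalDist().inv_cdf(0.99)           # multiplier if parameters were known
t = t_quantile(0.99, n - 1)              # multiplier with estimated parameters

threshold_known = mean + z * sd          # normal quantile, parameters known
threshold_text = mean + t * sd           # the figure quoted in the text
threshold_pred = mean + t * sd * math.sqrt(1 + 1 / n)  # with sqrt(1 + 1/n)

print(round(z, 3), round(t, 3))
print(round(threshold_known, 3), round(threshold_text, 3), round(threshold_pred, 3))
```

All three thresholds depend on the normality assumption; they differ only in how they account for the uncertainty of estimating the mean and standard deviation from 99 observations.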
A non-parametric estimate of the same threshold is the maximum of the first 99 scores. We do not need to assume anything about the shape of the distribution of test scores to reason that, before the test was given, each of the first 100 scores was equally likely to be the highest, provided the scores are independent draws from the same distribution and ties are ignored. Thus there is a 1% chance that the 100th score is higher than any of the 99 that preceded it.
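The distribution-free nature of this argument can be illustrated by simulation. The sketch below is only an illustration under stated assumptions: the function name `exceed_rate`, the number of trials, the seed and the two example distributions (a normal and a strongly skewed exponential) are arbitrary choices, not part of the original example. In both cases the proportion of trials in which the 100th draw exceeds the maximum of the first 99 should come out close to 0.01.

```python
import random

def exceed_rate(draw, trials=20000, n=99, seed=0):
    """Fraction of trials in which one new draw exceeds the max of n earlier draws."""
    rng = random.Random(seed)              # fixed seed for reproducibility
    hits = 0
    for _ in range(trials):
        best = max(draw(rng) for _ in range(n))
        if draw(rng) > best:               # the (n + 1)-th observation
            hits += 1
    return hits / trials

# Two very different populations; the argument does not depend on the shape.
rate_normal = exceed_rate(lambda rng: rng.gauss(100, 1))
rate_skewed = exceed_rate(lambda rng: rng.expovariate(1.0))

print(rate_normal, rate_skewed)
```

The simulated rates vary from run to run with the seed and number of trials, so they only approximate the exact value of 1/100.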
Parametric functions were mentioned by R. A. Fisher in his 1925 work Statistical Methods for Research Workers, which created the foundation for modern statistics.