Parametric statistics is a branch of statistics that assumes the data come from a probability distribution of a known type and makes inferences about the parameters of that distribution. Most well-known elementary statistical methods are parametric. The difference between a parametric model and a non-parametric model is that the former has a fixed number of parameters, while in the latter the number of parameters grows with the amount of training data.
Generally speaking, parametric methods make more assumptions than non-parametric methods. If those extra assumptions are correct, parametric methods can produce more accurate and precise estimates; they are said to have more statistical power. However, if the assumptions are incorrect, parametric methods can be very misleading, and for that reason they are often not considered robust. On the other hand, parametric formulae are often simpler to write down and faster to compute. In some, but definitely not all, cases their simplicity makes up for their non-robustness, especially if care is taken to examine diagnostic statistics.
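The risk of a wrong assumption can be illustrated with a small sketch. The "sample" here is hypothetical: it is built from evenly spaced quantiles of an exponential distribution so that the result is deterministic. The code then wrongly assumes normality to estimate the score exceeded 1% of the time. The variable names are illustrative only.

```python
import math
from statistics import NormalDist, mean, stdev

# Hypothetical "sample": 1000 evenly spaced quantiles of an exponential
# distribution with rate 1, chosen so the example needs no random numbers.
n = 1000
data = [-math.log(1 - (i + 0.5) / n) for i in range(n)]

# Parametric step: (wrongly) assume normality and plug in the sample moments.
m, s = mean(data), stdev(data)
normal_cutoff = m + NormalDist().inv_cdf(0.99) * s

# Check against the data: share of values that actually exceed the cutoff.
share_above = sum(x > normal_cutoff for x in data) / n

# True 99th percentile of the exponential distribution, for comparison.
true_cutoff = math.log(100)

print(f"normal-based 99% cutoff: {normal_cutoff:.3f}")
print(f"true 99% cutoff:         {true_cutoff:.3f}")
print(f"share of data above the normal-based cutoff: {share_above:.1%}")
```

Under these assumptions the normal-based cutoff falls well below the true one, so several times more than the nominal 1% of the data lies above it.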
Suppose we have a sample of 99 test scores with a mean of 100 and a standard deviation of 1. If we assume that all 99 test scores are random samples from a normal distribution, we predict that there is a 1% chance that the 100th test score will be higher than 102.365 (that is, the mean plus 2.365 standard deviations), assuming that the 100th test score comes from the same distribution as the others. All members of the normal family of distributions have the same shape and are parameterized by mean and standard deviation. That means that if you know the mean and standard deviation, and that the distribution is normal, you know the probability of any future observation falling in a given range. Parametric statistical methods are used to compute the 2.365 value above, given 99 independent observations from the same normal distribution; it is approximately the 99th percentile of Student's t-distribution with 98 degrees of freedom, which is slightly larger than the corresponding normal value of about 2.326 because the mean and standard deviation are estimated from the sample rather than known.
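One way to arrive at a multiplier of about 2.365 is sketched below, on the assumption that it is the 99th percentile of Student's t-distribution with 98 degrees of freedom. The helper functions `t_pdf`, `t_cdf` and `t_quantile` are written here for illustration using only the Python standard library; they are not taken from any statistics package. The last line uses the usual textbook prediction bound, which carries an extra factor of sqrt(1 + 1/n) that the figure quoted above appears to leave out.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """CDF for x >= 0: Simpson's rule on [0, x], plus 0.5 by symmetry."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_quantile(p, df, lo=0.0, hi=10.0):
    """Quantile for 0.5 <= p < 1, found by bisection on the CDF."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n, sample_mean, sample_sd = 99, 100.0, 1.0

z = NormalDist().inv_cdf(0.99)   # multiplier if mean and sd were known exactly
t = t_quantile(0.99, n - 1)      # multiplier when both are estimated

cutoff_simple = sample_mean + t * sample_sd
cutoff_pred = sample_mean + t * sample_sd * math.sqrt(1 + 1 / n)

print(f"normal multiplier: {z:.3f}")
print(f"t multiplier (98 df): {t:.3f}")
print(f"mean + t * sd: {cutoff_simple:.3f}")
print(f"prediction bound with sqrt(1 + 1/n): {cutoff_pred:.3f}")
```

The two multipliers should come out near 2.326 and 2.365 respectively; the prediction bound is slightly higher than 102.365 because it also allows for uncertainty in the estimated mean.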
A non-parametric estimate of the same thing is the maximum of the first 99 scores. We don't need to assume anything about the shape of the distribution of test scores, beyond the scores being independent draws from the same continuous distribution (so that ties can be ignored), to reason that before we gave the test it was equally likely that the highest score would be any one of the 100. Thus there is a 1% chance that the 100th score is higher than all of the 99 that preceded it.
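The distribution-free nature of this argument can be checked with a rough simulation. The sketch below assumes independent draws from a continuous distribution and uses a hypothetical helper, `share_of_new_records`, to estimate how often the 100th draw beats the maximum of the first 99, once for a normal distribution and once for a strongly skewed exponential one.

```python
import random

def share_of_new_records(draw, n_previous=99, trials=10000, seed=0):
    """Estimate P(next draw > max of n_previous earlier draws) by simulation.

    `draw` is any function that takes a random.Random instance and
    returns one observation.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        previous_max = max(draw(rng) for _ in range(n_previous))
        if draw(rng) > previous_max:
            hits += 1
    return hits / trials

# Same experiment under two very different distributions.
normal_share = share_of_new_records(lambda rng: rng.gauss(100, 1))
skewed_share = share_of_new_records(lambda rng: rng.expovariate(1.0))

print(f"normal scores:      {normal_share:.4f}")
print(f"exponential scores: {skewed_share:.4f}")
```

Both estimates should land close to 0.01, up to simulation noise, even though the two distributions have very different shapes.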
"Most of these developments have this feature in common, that the distribution functions of the various stochastic variables which enter into their problems are assumed to be of known functional form, and the theories of estimation and of testing hypotheses are theories of estimation of and of testing hypotheses about, one or more parameters. . ., the knowledge of which would completely determine the various distribution functions involved. We shall refer to this situation. . .as the parametric case, and denote the opposite case, where the functional forms of the distributions are unknown, as the non-parametric case."
- Geisser, S.; Johnson, W.M. (2006) Modes of Parametric Statistical Inference, John Wiley & Sons, ISBN 978-0-471-66726-1
- Cox, D.R. (2006) Principles of Statistical Inference, Cambridge University Press, ISBN 978-0-521-68567-2
- Murphy, K. (2012) Machine Learning: A Probabilistic Perspective, MIT Press, p. 16, ISBN 978-0-262-01802-9
- Corder; Foreman (2009) Nonparametric Statistics for Non-Statisticians: A Step-by-Step Approach, John Wiley & Sons, ISBN 978-0-470-45461-9
- Freedman, D. (2000) Statistical Models: Theory and Practice, Cambridge University Press, ISBN 978-0-521-67105-7
- Wolfowitz, J. (1942) Annals of Mathematical Statistics, XIII, p. 264