In statistical analysis, Freedman's paradox, named after David Freedman, is a problem in model selection whereby predictor variables with no relationship to the dependent variable can pass tests of significance – both individually via a t-test, and jointly via an F-test for the significance of the regression. Freedman demonstrated (through simulation and asymptotic calculation) that this is a common occurrence when the number of variables is similar to the number of data points.
Specifically, if the dependent variable and k regressors are independent normal variables, and there are n observations, then as k and n jointly go to infinity with the ratio k/n = ρ held fixed, (1) the R² goes to ρ, (2) the F-statistic for the overall regression goes to 1.0, and (3) the expected number of spuriously significant regressors is approximately αk, where α is the chosen critical probability (the probability of a Type I error for a single regressor). This third result is intuitive: the expected number of Type I errors equals the probability of a Type I error on an individual parameter times the number of parameters tested for significance.
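The three limiting results can be checked numerically. The following is a minimal sketch rather than Freedman's own procedure: it assumes NumPy is available, fits the regression without an intercept (a simplifying assumption, so that R² is computed from uncentered sums of squares), and uses a normal approximation to the t critical value. The function names `simulate` and `average_over_replications` and the default settings n = 100, k = 50, α = 0.25 are illustrative choices.

```python
import numpy as np
from statistics import NormalDist


def simulate(n=100, k=50, alpha=0.25, seed=0):
    """Regress pure noise on pure noise; return (R^2, F, number significant)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, k))   # k regressors unrelated to y
    y = rng.standard_normal(n)        # dependent variable, independent of X

    # Ordinary least squares without an intercept (assumption of this sketch).
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    rss = resid @ resid               # residual sum of squares
    tss = y @ y                       # uncentered total sum of squares
    df = n - k                        # residual degrees of freedom

    r2 = 1.0 - rss / tss
    f_stat = ((tss - rss) / k) / (rss / df)

    # t-statistics for the individual coefficients.
    sigma2 = rss / df
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    t_stats = beta / se

    # Two-sided critical value; the normal quantile approximates the t quantile.
    crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    n_sig = int(np.sum(np.abs(t_stats) > crit))
    return r2, f_stat, n_sig


def average_over_replications(reps=200, n=100, k=50, alpha=0.25):
    """Average the three quantities over independent simulated data sets."""
    results = np.array([simulate(n, k, alpha, seed=s) for s in range(reps)])
    return tuple(results.mean(axis=0))


if __name__ == "__main__":
    mean_r2, mean_f, mean_sig = average_over_replications()
    print(f"mean R^2 = {mean_r2:.3f} (rho = 0.5)")
    print(f"mean F = {mean_f:.3f} (limit 1.0)")
    print(f"mean number significant = {mean_sig:.1f} (alpha * k = 12.5)")
```

With these settings the averages should lie close to ρ = 0.5, 1.0 and αk = 12.5 respectively, although at finite n and k they will not match the limits exactly, and the normal approximation slightly overstates the number of significant coefficients.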
More recently, new information-theoretic estimators have been developed in an attempt to reduce this problem, as well as the accompanying issue of model selection bias, in which the estimated effects of predictor variables that have only a weak relationship with the response variable are biased.
- Freedman, David A. (1983). "A Note on Screening Regression Equations". The American Statistician. 37 (2): 152–155. doi:10.1080/00031305.1983.10482729. ISSN 0003-1305.
- Freedman, Laurence S.; Pee, David (November 1989). "Return to a Note on Screening Regression Equations". The American Statistician. 43 (4): 279–282. doi:10.2307/2685389. JSTOR 2685389.
- Lukacs, P. M.; Burnham, K. P.; Anderson, D. R. (2010). "Model selection bias and Freedman's paradox". Annals of the Institute of Statistical Mathematics. 62 (1): 117–125. doi:10.1007/s10463-009-0234-4.
- Burnham, K. P.; Anderson, D. R. (2002). Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach (2nd ed.). Springer-Verlag.