List of unsolved problems in statistics

Mathematics has many longstanding problems for which no solution has yet been found. The notable unsolved problems in statistics are generally of a different flavor; according to John Tukey,^[1] "difficulties in identifying problems have delayed statistics far more than difficulties in solving problems." A list of "one or two open problems" (in fact 22 of them) was given by David Cox.^[2]

Inference and testing

How to detect and correct for systematic errors, especially in sciences where random errors are large (a situation Tukey termed uncomfortable science).
The Graybill–Deal estimator is often used to estimate the common mean of two normal populations with unknown and possibly unequal variances. Though this estimator is generally unbiased, its admissibility remains to be shown.^[3]
Meta-analysis: Though independent p-values can be combined using Fisher's method, techniques are still being developed to handle the case of dependent p-values.
Behrens–Fisher problem: Yuri Linnik showed in 1966 that there is no uniformly most powerful test for the difference of two means when the variances are unknown and possibly unequal. That is, there is no exact test (one that, when the means are in fact equal, rejects the null hypothesis with probability exactly α) that is also the most powerful for all values of the variances (which are thus nuisance parameters). Though there are many approximate solutions (such as Welch's t-test), the problem continues to attract attention^[4] as one of the classic problems in statistics.
Multiple comparisons: There are various ways to adjust p-values to compensate for the simultaneous or sequential testing of hypotheses. Of particular interest is how to simultaneously control the overall error rate, preserve statistical power, and incorporate the dependence between tests into the adjustment. These issues are especially relevant when the number of simultaneous tests can be very large, as is increasingly the case in the analysis of data from DNA microarrays.
Bayesian statistics: A list of open problems in Bayesian statistics has been proposed.^[5]
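
Several of the procedures named above are short enough to sketch directly. The following Python is an illustrative sketch only, using just the standard library; the function names are hypothetical and not taken from any cited source. It shows the Graybill–Deal weighting for two samples, Fisher's method for independent p-values, and the statistic and approximate degrees of freedom used by Welch's t-test. The Holm step-down adjustment is included as one example of a multiple-comparison adjustment (it is a choice made here, not one named in the list), and it does not exploit dependence between tests, which is the open issue.

```python
import math
from statistics import mean, variance


def graybill_deal(sample_a, sample_b):
    """Common-mean estimate: sample means weighted by n_i / s_i^2."""
    samples = (sample_a, sample_b)
    weights = [len(s) / variance(s) for s in samples]
    means = [mean(s) for s in samples]
    return sum(w * m for w, m in zip(weights, means)) / sum(weights)


def fisher_combined_pvalue(p_values):
    """Fisher's method; assumes independent p-values, each in (0, 1].

    X = -2 * sum(ln p_i) is chi-square with 2k degrees of freedom under the
    null; for even degrees of freedom the upper tail has a closed form.
    """
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)
    tail_terms = sum(half_x ** i / math.factorial(i) for i in range(k))
    return math.exp(-half_x) * tail_terms


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(sample_a) / len(sample_a)
    vb = variance(sample_b) / len(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (
        va ** 2 / (len(sample_a) - 1) + vb ** 2 / (len(sample_b) - 1)
    )
    return t, df


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Multiply the (rank+1)-th smallest p-value by (m - rank), cap at 1,
        # and enforce monotonicity with a running maximum.
        running_max = max(running_max, min(1.0, (m - rank) * p_values[i]))
        adjusted[i] = running_max
    return adjusted
```

Welch's procedure remains an approximation: the degrees of freedom are estimated from the data, so the resulting test is not exact in the sense described for the Behrens–Fisher problem.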

Experimental design

Because the theory of Latin squares is a cornerstone of the design of experiments, solving open problems about Latin squares could be immediately applicable to experimental design.

Problems of a more philosophical nature

Sampling of species problem: How is a probability updated when there is unanticipated new data?^[6]
Doomsday argument: How valid is the probabilistic argument that claims to predict the future lifetime of the human race given only an estimate of the total number of humans born so far?
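
The arithmetic behind one common formulation of the Doomsday argument is simple; the dispute is over its premise. The sketch below assumes that an observer's birth rank, as a fraction of all humans who will ever be born, is uniformly distributed on (0, 1). Under that assumption the total number of births N satisfies N < n / (1 − c) with probability c, where n is the number born so far. The function name and the figure of 60 billion births are hypothetical, chosen only for illustration.

```python
def doomsday_upper_bound(births_so_far, confidence=0.95):
    """Upper bound on total births N, assuming births_so_far / N ~ Uniform(0, 1).

    With probability `confidence`, the observed fraction exceeds
    1 - confidence, which rearranges to N < births_so_far / (1 - confidence).
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    return births_so_far / (1.0 - confidence)


# Hypothetical input: 60 billion births so far, 95% confidence.
bound = doomsday_upper_bound(60e9, 0.95)
print(f"Total births below about {bound:.3g} with 95% confidence")
```

At 95% confidence the bound is 20 times the number of births so far. The code says nothing about whether the uniformity assumption is justified, which is the question the problem poses.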

Notes

^ Tukey, John W. (1954). "Unsolved Problems of Experimental Statistics". Journal of the American Statistical Association. 49 (268): 706–731. doi:10.2307/2281535. JSTOR 2281535.
^ Cox, D. R. (1984). "Present Position and Potential Developments: Some Personal Views: Design of Experiments and Regression". Journal of the Royal Statistical Society. Series A (General). 147 (2): 306–315. doi:10.2307/2981685. JSTOR 2981685.
^ Pal, Nabendu; Lim, Wooi K. (1997). "A note on second-order admissibility of the Graybill-Deal estimator of a common mean of several normal populations". Journal of Statistical Planning and Inference. 63: 71–78. doi:10.1016/S0378-3758(96)00202-9.
^ Fraser, D.A.S.; Rousseau, J. (2008). "Studentization and deriving accurate p-values" (PDF). Biometrika. 95: 1–16. doi:10.1093/biomet/asm093.
^ Jordan, M. I. (2011). "What are the open problems in Bayesian statistics?" (PDF). The ISBA Bulletin. 18 (1): 1–5.
^ Zabell, S. L. (1992). "Predicting the unpredictable". Synthese. 90 (2): 205. doi:10.1007/bf00485351. S2CID 9416747.

References

Linnik, Jurii (1968). Statistical Problems with Nuisance Parameters. American Mathematical Society. ISBN 0-8218-1570-9.
Sawilowsky, Shlomo S. (2002). "Fermat, Schubert, Einstein, and Behrens–Fisher: The Probable Difference Between Two Means When σ₁ ≠ σ₂". Journal of Modern Applied Statistical Methods. 1 (2). doi:10.22237/jmasm/1036109940.
