In statistics, Barnard's test is an exact test used in the analysis of contingency tables. The test was first published by George Alfred Barnard (1945, 1947), who claimed that it is a more powerful alternative to Fisher's exact test for 2×2 contingency tables. A previous barrier to the widespread use of Barnard's test was likely the computational difficulty of calculating the p-value; nowadays, computers can implement Barnard's test in a few seconds, even for large sample sizes.
Purpose and scope
Barnard's test is used to test the independence of rows and columns in a contingency table. The test assumes each response is independent. Under independence, there are three types of study designs that yield a 2×2 table. To distinguish the different types of designs, suppose a researcher is interested in testing whether a treatment quickly heals an infection. One possible study design would be to sample 100 infected subjects, randomly give them the treatment or the placebo, and see if the infection is still present after a set time. This type of design is common in cross-sectional studies. Another possible study design would be to give 50 infected subjects the treatment, 50 infected subjects the placebo, and see if the infection is still present after a set time. This type of design is common in case-control studies. The final possible study design would be to give 50 infected subjects the treatment, 50 infected subjects the placebo, and stop the experiment once a set number of subjects has healed from the infection. This type of design is uncommon, but has the same structure as the lady tasting tea study that led R. A. Fisher to create Fisher's exact test. The probability of a 2×2 table under the first study design is given by the multinomial distribution; under the second study design, by the product of two independent binomial distributions; and under the third design, by the hypergeometric distribution.
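The three sampling models can be written out for a table with cell counts a, b (first row) and c, d (second row). The following Python sketch is illustrative only: the function names, the cell layout, and the parameters row_p, col_p and p are assumptions introduced here, and the multinomial and binomial versions are evaluated under the null hypothesis of independence.

```python
from math import comb, factorial


def multinomial_prob(a, b, c, d, row_p, col_p):
    """Design 1: only the grand total is fixed.

    Under independence, each cell probability is a product of a row
    probability and a column probability (row_p and col_p are the
    probabilities of the first row and the first column).
    """
    n = a + b + c + d
    coef = factorial(n) // (factorial(a) * factorial(b) * factorial(c) * factorial(d))
    p11 = row_p * col_p
    p12 = row_p * (1 - col_p)
    p21 = (1 - row_p) * col_p
    p22 = (1 - row_p) * (1 - col_p)
    return coef * p11**a * p12**b * p21**c * p22**d


def product_binomial_prob(a, b, c, d, p):
    """Design 2: both row totals are fixed.

    Under the null hypothesis the two rows share one success
    probability p, which is the nuisance parameter.
    """
    return comb(a + b, a) * comb(c + d, c) * p ** (a + c) * (1 - p) ** (b + d)


def hypergeometric_prob(a, b, c, d):
    """Design 3: row and column totals are all fixed.

    No unknown parameter remains once both margins are fixed.
    """
    return comb(a + b, a) * comb(c + d, c) / comb(a + b + c + d, a + c)


if __name__ == "__main__":
    # Hypothetical table: 7 of 10 treated and 2 of 10 placebo subjects healed.
    print(product_binomial_prob(7, 3, 2, 8, p=0.45))
    print(hypergeometric_prob(7, 3, 2, 8))
```

Only the hypergeometric probability is free of unknown parameters; the other two depend on quantities that must be estimated, conditioned away, or maximized over.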
The difference between Barnard's exact test and Fisher's exact test is how they handle the nuisance parameter(s) of the common success probability when calculating the p-value. Fisher's test avoids estimating the nuisance parameter(s) by conditioning on both margins, an approximately ancillary statistic. Barnard's test considers all possible values of the nuisance parameter(s) and chooses the value(s) that maximize the p-value. Both tests have sizes less than or equal to the nominal significance level. However, Barnard's test can be more powerful than Fisher's test because it considers more 'as or more extreme' tables by not conditioning on both margins. In fact, one variant of Barnard's test, called Boschloo's test, is uniformly more powerful than Fisher's exact test.
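The contrast between maximizing over the nuisance parameter and conditioning on the margins can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes the two-binomial design with fixed row totals n1 and n2, uses a pooled Z statistic as the ordering of tables (one of several possible choices), and approximates the maximum over the common success probability with a simple grid search. All function names are hypothetical.

```python
from math import comb, sqrt


def pooled_z(x1, n1, x2, n2):
    """Difference in proportions scaled by the pooled standard error."""
    p_hat = (x1 + x2) / (n1 + n2)
    var = p_hat * (1 - p_hat) * (1 / n1 + 1 / n2)
    if var == 0:
        return 0.0  # all successes or all failures: no evidence either way
    return (x1 / n1 - x2 / n2) / sqrt(var)


def barnard_pvalue(x1, n1, x2, n2, grid=1000):
    """Two-sided unconditional p-value, maximized over a grid of p values."""
    t_obs = abs(pooled_z(x1, n1, x2, n2))
    # Binomial coefficients of every table at least as extreme as the observed one.
    extreme = {
        (i, j): comb(n1, i) * comb(n2, j)
        for i in range(n1 + 1)
        for j in range(n2 + 1)
        if abs(pooled_z(i, n1, j, n2)) >= t_obs - 1e-12
    }
    best = 0.0
    for k in range(1, grid):
        p = k / grid  # candidate value of the nuisance parameter
        total = sum(
            coef * p ** (i + j) * (1 - p) ** (n1 + n2 - i - j)
            for (i, j), coef in extreme.items()
        )
        best = max(best, total)
    return best


def fisher_pvalue(x1, n1, x2, n2):
    """Two-sided conditional p-value: sum of hypergeometric probabilities
    no larger than that of the observed table, with both margins fixed."""
    m, n = x1 + x2, n1 + n2

    def h(i):
        return comb(n1, i) * comb(n2, m - i) / comb(n, m)

    p_obs = h(x1)
    lo, hi = max(0, m - n2), min(n1, m)
    return sum(h(i) for i in range(lo, hi + 1) if h(i) <= p_obs * (1 + 1e-9))


if __name__ == "__main__":
    print(barnard_pvalue(3, 3, 0, 3))  # 0.03125
    print(fisher_pvalue(3, 3, 0, 3))   # 0.1
```

For the small hypothetical table with 3 of 3 successes in one group and 0 of 3 in the other, only the two most lopsided tables are as extreme as the observed one. The unconditional p-value is 2p³(1 − p)³, maximized at p = 0.5 to give 0.03125, whereas conditioning on both margins gives 0.1. A finer grid or a numerical optimizer would be needed wherever the maximum does not fall on a grid point.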
While Barnard retracted his test in a published paper (1949), most researchers prefer using Barnard's exact test over Fisher's exact test for analyzing 2×2 contingency tables. The only exception is when the true sampling distribution of the table is hypergeometric. Barnard's test can be applied to larger tables, but the computation time increases and the power advantage quickly decreases. It remains unclear which test statistic is preferred when implementing Barnard's test; however, most test statistics yield uniformly more powerful tests than Fisher's exact test.
- Barnard, G.A. (1945). "A New Test for 2×2 Tables". Nature 156 (3954): 177. doi:10.1038/156177a0.
- Barnard, G.A. (1947). "Significance Tests for 2×2 Tables". Biometrika 34 (1/2): 123–138. doi:10.1093/biomet/34.1-2.123. JSTOR 2332517.
- Barnard, G.A. (1949). "Statistical Inference". Journal of the Royal Statistical Society, Series B 11 (2/2): 115–149. JSTOR 2984075.
- Berger, R.L. (1996). "More Powerful Tests from Confidence Interval p Values". The American Statistician 50: 314–318. doi:10.1080/00031305.1996.10473559.
- Mehta, Cyrus R.; Senchaudhuri, Pralay (2003). "Conditional versus Unconditional Exact Tests for Comparing Two Binomials". Retrieved 20 November 2009.
- Mehta, C.R.; Hilton, J.F. (1993). "Exact Power of Conditional and Unconditional Tests: Going Beyond the 2×2 Contingency Table". The American Statistician 47: 91–98. doi:10.1080/00031305.1993.10475946.