# Rule of three (statistics)

Figure: Comparison of the rule of three with the exact one-sided binomial confidence interval when there are no positive samples.
This article is about one meaning of "rule of three" in statistics. For other meanings in mathematics and beyond, see Rule of three.

In statistical analysis, the rule of three states that if a certain event did not occur in a sample of n subjects ($\hat{p}=0$), the interval from 0 to 3/n is a 95% confidence interval for the rate of occurrence in the population. When n is greater than 30, this is a good approximation to results from more sensitive tests. For example, a pain-relief drug is tested on 1500 human subjects, and no adverse event is recorded. From the rule of three, it can be concluded with 95% confidence that fewer than 1 person in 500 (that is, 3/1500) will experience an adverse event. By symmetry, when only successes are observed ($\hat{p}=1$), the 95% confidence interval is [1 − 3/n, 1].
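As a rough numerical illustration, the sketch below compares the rule-of-three bound 3/n with the exact one-sided binomial bound obtained by solving $(1-p)^n = 0.05$ for p. The function names `rule_of_three_upper` and `exact_upper` are hypothetical helpers introduced here for illustration, and the sample sizes are arbitrary choices apart from the 1500-subject example above.

```python
def rule_of_three_upper(n: int) -> float:
    """Approximate 95% upper bound on the event rate after n event-free trials."""
    return 3.0 / n


def exact_upper(n: int, confidence: float = 0.95) -> float:
    """Exact one-sided upper bound: the p that solves (1 - p)**n = 1 - confidence."""
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n)


# Compare the two bounds for a few sample sizes (hypothetical choices).
for n in (10, 30, 100, 1500):
    approx = rule_of_three_upper(n)
    exact = exact_upper(n)
    print(f"n={n:5d}  3/n={approx:.5f}  exact={exact:.5f}")
```

Under these assumptions the approximate bound is always slightly larger than the exact one, and the gap shrinks as n grows.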

The rule is useful in the interpretation of clinical trials generally, particularly in phase II and phase III, where duration or statistical power is often limited. The rule of three applies well beyond medical research, to any trial done n times. If 300 parachutes are randomly tested and all open successfully, then it can be concluded with 95% confidence that fewer than 1 in 100 parachutes with the same characteristics (3/300) will fail.[1]
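The parachute example can be checked directly. This is a minimal sketch, assuming independent trials with a constant failure probability; `prob_zero_events` is a hypothetical helper name. If the true failure rate were exactly 1 in 100, the chance of seeing no failures in 300 tests would be just under 5%, which is why larger failure rates are excluded at the 95% level.

```python
def prob_zero_events(p: float, n: int) -> float:
    """Probability of observing no events in n independent trials with event rate p."""
    return (1.0 - p) ** n


# Chance that all 300 parachutes open if the true failure rate were 3/300 = 0.01.
print(f"{prob_zero_events(0.01, 300):.4f}")
```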

## Outline of derivation

A 95% confidence interval is sought for the probability p of an event occurring for any randomly selected single individual in a population, given that it has not been observed to occur in n Bernoulli trials. Denoting the number of events by X, we therefore wish to find the values of the parameter p of a binomial distribution that give Pr(X = 0) ≥ 0.05. The rule can then be derived[2] either from the Poisson approximation to the binomial distribution, or from the formula $(1-p)^n$ for the probability of zero events in the binomial distribution by taking logarithms and keeping only the first term of a series expansion of the natural logarithm. In either case, the factor of three arises from −ln(0.05) = ln(20) = 2.9957 ≈ 3.
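The steps outlined above can be written out as follows. This is a sketch of the binomial route, assuming p is small enough that the first term of the logarithm series dominates:

```latex
\begin{aligned}
\Pr(X = 0) = (1-p)^n &\ge 0.05 \\
n \ln(1-p) &\ge \ln(0.05) \\
\ln(1-p) = -p - \tfrac{p^2}{2} - \cdots &\approx -p \\
-np &\ge \ln(0.05) \\
p &\le \frac{-\ln(0.05)}{n} = \frac{\ln(20)}{n} \approx \frac{3}{n}
\end{aligned}
```

The Poisson route reaches the same bound: with X approximately Poisson with mean np, $\Pr(X = 0) = e^{-np} \ge 0.05$ again gives $p \le \ln(20)/n$.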

## Notes

1. ^ There are other meanings of the term "rule of three" in mathematics, and a further distinct meaning within statistics:

A century and a half ago Charles Darwin said he had "no Faith in anything short of actual measurement and the Rule of Three," by which he appeared to mean the peak of arithmetical accomplishment in a nineteenth-century gentleman, solving for x in "6 is to 3 as 9 is to x." Some decades later, in the early 1900s, Karl Pearson shifted the meaning of the rule of three – "take 3σ [three standard deviations] as definitely significant" – and claimed it for his new journal of significance testing, Biometrika. Even Darwin late in life seems to have fallen into the confusion. (Ziliak and McCloskey, 2008, p. 26; parenthetic gloss in original)

2. ^ "Professor Mean" (2010) "Confidence interval with zero events", The Children's Mercy Hospital. Retrieved 2013-01-01.

## References

• Ziliak, S. T.; D. N. McCloskey (2008). The cult of statistical significance: How the standard error costs us jobs, justice, and lives. University of Michigan Press. ISBN 0472050079