# Bonferroni correction

In statistics, the Bonferroni correction is a method used to counteract the problem of multiple comparisons. It is named after Italian mathematician Carlo Emilio Bonferroni for the use of Bonferroni inequalities,[1] but modern usage is often credited to Olive Jean Dunn, who described the procedure in a pair of articles written in 1959 and 1961.[2][3]

## Informal introduction

Statistical hypothesis testing is based on rejecting the null hypothesis when the likelihood of the observed data under the null hypothesis is low. The problem of multiplicity arises because, as the number of hypotheses being tested increases, so does the probability of observing a rare event, and therefore the probability of incorrectly rejecting at least one null hypothesis (i.e., making a Type I error).
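The growth of this error probability can be illustrated with a small calculation. The sketch below assumes, purely for illustration, that all $m$ null hypotheses are true, that the tests are independent, and that each test has a Type I error probability of exactly $\alpha$; the function name `fwer_independent` is hypothetical.

```python
def fwer_independent(alpha: float, m: int) -> float:
    """Probability of at least one false rejection among m independent
    tests of true null hypotheses, each performed at level alpha."""
    # P(no false rejection) = (1 - alpha)^m under the independence assumption.
    return 1.0 - (1.0 - alpha) ** m


for m in (1, 5, 20, 100):
    print(m, round(fwer_independent(0.05, m), 3))
# 1 0.05
# 5 0.226
# 20 0.642
# 100 0.994
```

Under these assumptions, testing 20 hypotheses at an uncorrected level of 0.05 gives a probability of roughly 64% of at least one false rejection.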

The Bonferroni correction is based on the idea that if an experimenter is testing $m$ hypotheses, then one way of maintaining the familywise error rate (FWER) is to test each individual hypothesis at a statistical significance level of $1/m$ times what it would be if only one hypothesis were tested.

So, if the desired significance level for the whole family of tests is $\alpha$, then the Bonferroni correction would test each individual hypothesis at a significance level of $\alpha/m$. For example, if a trial is testing $m = 8$ hypotheses with a desired $\alpha = 0.05$, then the Bonferroni correction would test each individual hypothesis at $\alpha = 0.05/8 = 0.00625$.
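The rule can be sketched in a few lines of Python. The p-values below are hypothetical and chosen only to show the effect of the threshold $0.05/8 = 0.00625$; the function name `bonferroni_reject` is an assumption for this example.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Return one boolean per hypothesis: True if it is rejected
    at the Bonferroni-corrected level alpha / m."""
    m = len(p_values)
    if m == 0:
        raise ValueError("at least one p-value is required")
    threshold = alpha / m  # per-test significance level
    return [p <= threshold for p in p_values]


# Hypothetical p-values for m = 8 tests.
p_values = [0.001, 0.004, 0.0062, 0.0063, 0.02, 0.04, 0.3, 0.8]
print(bonferroni_reject(p_values, alpha=0.05))
# [True, True, True, False, False, False, False, False]
```

Six of these p-values fall below the uncorrected level 0.05, but only three fall below the corrected level 0.00625.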

Statistically significant simply means that a given result is unlikely to occur by chance if the null hypothesis is true (i.e., no difference among groups, no effect of treatment, no relation among variables).

## Definition

Let $H_{1},\ldots,H_{m}$ be a family of hypotheses and $p_{1},\ldots,p_{m}$ their corresponding p-values for a given data set. Let $I_0$ be the index set of $m_{0}$ hypotheses taken from this family, and let the null hypothesis $H_0$ be their intersection,

$H_0 = \bigcap_{i\in I_0} H_i.$

$H_0$ is true if all of the $H_i$ with $i \in I_0$ are true, and false if any one of them is false. Let the decision strategy be such that $H_0$ is rejected if any one of the $H_i$ with $i \in I_0$ is rejected.

The familywise error rate is the probability of rejecting at least one of the $H_i$ with $i \in I_0$ on the basis of their p-values when all of them are true; that is, the probability of making one or more Type I errors when testing the individual $H_i$. The Bonferroni correction states that rejecting every $H_i$ with $p_{i}\leq\frac{\alpha}{m}$ controls the familywise error rate at $\mathit{FWER}\leq\alpha$. The proof follows from Boole's inequality, together with the fact that $\mathit{Pr}\left(p_{i}\leq\frac{\alpha}{m}\right)\leq\frac{\alpha}{m}$ for each true $H_i$:

$\mathit{FWER}=\mathit{Pr}\left\{ \bigcup_{i \in I_{0}}\left(p_{i}\leq\frac{\alpha}{m}\right)\right\} \leq\sum_{i \in I_{0}}\left\{\mathit{Pr}\left(p_{i}\leq\frac{\alpha}{m}\right)\right\}\leq m_{0}\frac{\alpha}{m}\leq m\frac{\alpha}{m}=\alpha$

This result does not require that the tests be independent.
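As an illustrative check rather than a proof, the following sketch estimates the familywise error rate by simulation in two extreme cases: fully independent tests and perfectly dependent tests that all share one p-value. It assumes that all $m$ null hypotheses are true and that each p-value is uniformly distributed on $[0, 1]$ under its null hypothesis; the function `simulate_fwer` and its parameters are hypothetical.

```python
import random


def simulate_fwer(m, alpha, trials, dependent, seed=0):
    """Monte Carlo estimate of the FWER of the Bonferroni procedure
    when all m null hypotheses are true (assumed uniform p-values)."""
    rng = random.Random(seed)
    threshold = alpha / m
    trials_with_error = 0
    for _ in range(trials):
        if dependent:
            # Perfect dependence: every test reports the same p-value.
            p_values = [rng.random()] * m
        else:
            # Independence: each test draws its own p-value.
            p_values = [rng.random() for _ in range(m)]
        # A familywise error occurs if any true null is rejected.
        if min(p_values) <= threshold:
            trials_with_error += 1
    return trials_with_error / trials


print(simulate_fwer(m=10, alpha=0.05, trials=20000, dependent=False))
print(simulate_fwer(m=10, alpha=0.05, trials=20000, dependent=True))
```

In both cases the estimate should stay at or below roughly $\alpha = 0.05$, up to simulation noise. In the perfectly dependent case it should be close to $\alpha/m$, which shows how conservative the bound can become.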

## Modifications

### Generalization

The derivation above uses the fact that $\sum_{i=1}^{m}\frac{\alpha}{m}=\alpha$, but the correction can be generalized: each hypothesis $H_i$ may be tested at its own level $a_i$, for any choice satisfying $\sum_{i=1}^{m}a_{i}=\alpha$, as long as the weights are defined prior to the test.
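A minimal sketch of this weighted variant is shown below. The per-hypothesis levels and p-values are hypothetical, and the function name `weighted_bonferroni_reject` is an assumption for this example; the only requirement checked is that the levels sum to at most $\alpha$.

```python
def weighted_bonferroni_reject(p_values, levels, alpha=0.05, tol=1e-12):
    """Reject hypothesis i when p_values[i] <= levels[i], where the
    levels were fixed in advance and sum to at most alpha."""
    if len(p_values) != len(levels):
        raise ValueError("p_values and levels must have the same length")
    if sum(levels) > alpha + tol:
        raise ValueError("per-hypothesis levels must sum to at most alpha")
    return [p <= a for p, a in zip(p_values, levels)]


# Hypothetical example: most of alpha is allocated to the first hypothesis.
p_values = [0.03, 0.004, 0.02]
levels = [0.04, 0.005, 0.005]  # sums to 0.05
print(weighted_bonferroni_reject(p_values, levels, alpha=0.05))
# [True, True, False]
```

With the equal split $\alpha/3 \approx 0.0167$, the first hypothesis in this example would not be rejected, so the choice of weights can change the outcome. This is why the weights have to be fixed before the tests are carried out.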

### Confidence intervals

The Bonferroni correction can also be used to adjust confidence intervals. If $m$ confidence intervals are formed and an overall confidence level of $1-\alpha$ is desired, then each individual interval is constructed at the level $1-\frac{\alpha}{m}$, which is the analogous correction for confidence intervals.
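As a sketch of how this might look in practice, the example below builds two-sided normal-theory intervals, assuming each estimate is approximately normally distributed with a known standard error. The estimates, standard errors, and the function name `bonferroni_z_intervals` are hypothetical.

```python
from statistics import NormalDist


def bonferroni_z_intervals(estimates, std_errors, alpha=0.05):
    """Two-sided z-intervals, each at level 1 - alpha/m, so that the
    simultaneous coverage is at least 1 - alpha (assumes normality)."""
    m = len(estimates)
    # Each interval leaves alpha / (2m) probability in each tail.
    z = NormalDist().inv_cdf(1.0 - alpha / (2.0 * m))
    return [(est - z * se, est + z * se)
            for est, se in zip(estimates, std_errors)]


# Hypothetical estimates and standard errors for m = 4 parameters.
estimates = [1.2, -0.4, 0.0, 2.5]
std_errors = [0.5, 0.3, 1.0, 0.8]
for low, high in bonferroni_z_intervals(estimates, std_errors):
    print(round(low, 3), round(high, 3))
```

For $m = 4$ and $\alpha = 0.05$ the critical value is about 2.50 instead of the usual 1.96, so every interval is wider than its unadjusted counterpart.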

## Alternatives

There are alternative ways to control the familywise error rate. For example, the Holm–Bonferroni method and the Šidák correction are universally more powerful procedures than the Bonferroni correction, meaning that they are always at least as powerful.
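The comparison can be made concrete with a short sketch. It uses the standard textbook definitions of the two alternatives, which are not spelled out in this article: the Šidák per-test level $1-(1-\alpha)^{1/m}$, and Holm's step-down rule, which compares the $k$-th smallest p-value with $\alpha/(m-k+1)$ and stops at the first failure. The p-values and function names are hypothetical.

```python
def sidak_level(alpha, m):
    """Per-test significance level of the Sidak correction."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


def holm_reject(p_values, alpha=0.05):
    """Holm-Bonferroni step-down procedure; returns one boolean per
    hypothesis, in the original order of p_values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is compared
        # with alpha / (m - k + 1) = alpha / (m - rank).
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # stop at the first non-rejection
    return reject


p_values = [0.01, 0.04, 0.02, 0.005]  # hypothetical
alpha, m = 0.05, len(p_values)

print([p <= alpha / m for p in p_values])  # Bonferroni
# [True, False, False, True]
print(holm_reject(p_values, alpha))        # Holm-Bonferroni
# [True, True, True, True]
print(sidak_level(alpha, m) > alpha / m)   # Sidak level is slightly larger
# True
```

In this example Holm's procedure rejects every hypothesis that Bonferroni rejects, plus two more. The Šidák level is only marginally larger than $\alpha/m$.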

## Criticisms

With respect to FWER control, the Bonferroni correction can be conservative if there are a large number of tests and/or the test statistics are positively correlated. The correction also comes at the cost of increasing the probability of producing false negatives, i.e., reducing statistical power.

Another criticism concerns the concept of a family of hypotheses. There is no definitive consensus on how to define a family in all cases. Because there is no standard definition, test results can change dramatically merely by changing how the hypotheses are grouped into families.

All of these criticisms, however, apply to adjustments for multiple comparisons in general, and are not specific to the Bonferroni correction.