# Šidák correction

In statistics, the Šidák correction, or Dunn–Šidák correction, is a method used to counteract the problem of multiple comparisons. It is a simple method to control the familywise error rate. When all null hypotheses are true, the method provides familywise error control that is exact for tests that are stochastically independent, conservative for tests that are positively dependent, and liberal for tests that are negatively dependent. It is credited to a 1967 paper [1] by the statistician and probabilist Zbyněk Šidák.[2] The Šidák method can be used to determine statistical significance and to compute adjusted p-values and confidence intervals.

## Usage

• Given m different null hypotheses and a familywise alpha level of ${\displaystyle \alpha }$, each null hypothesis with a p-value lower than ${\displaystyle \alpha _{SID}=1-(1-\alpha )^{\frac {1}{m}}}$ is rejected.
• This test produces a familywise Type I error rate of exactly ${\displaystyle \alpha }$ when the tests are independent of each other and all null hypotheses are true. It is less stringent than the Bonferroni correction, but only slightly. For example, for ${\displaystyle \alpha }$ = 0.05 and m = 10, the Bonferroni-adjusted level is 0.005 and the Šidák-adjusted level is approximately 0.005116.
• One can also compute confidence intervals matching the test decision using the Šidák correction by using ${\displaystyle 100(1-\alpha )^{1/m}\%}$ confidence intervals.
• For continuous problems, one can employ Bayesian logic to compute ${\displaystyle m}$ from the prior-to-posterior volume ratio.[3]
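
The quantities in the list above can be reproduced with a few lines of Python. This is a minimal sketch rather than a reference implementation: the function names are hypothetical, and the adjusted p-value formula ${\displaystyle 1-(1-p)^{m}}$ is the standard inversion of the threshold formula, so that rejecting when the adjusted p-value is below ${\displaystyle \alpha }$ is equivalent to rejecting when the raw p-value is below ${\displaystyle \alpha _{SID}}$.

```python
def sidak_threshold(alpha: float, m: int) -> float:
    """Per-test significance level 1 - (1 - alpha)**(1/m)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if m < 1:
        raise ValueError("m must be a positive integer")
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


def bonferroni_threshold(alpha: float, m: int) -> float:
    """Per-test significance level alpha / m, shown for comparison."""
    return alpha / m


def sidak_adjusted_p(p: float, m: int) -> float:
    """Adjusted p-value 1 - (1 - p)**m, to be compared against alpha."""
    return 1.0 - (1.0 - p) ** m


def sidak_confidence_level(alpha: float, m: int) -> float:
    """Per-interval confidence level (1 - alpha)**(1/m), as a fraction."""
    return (1.0 - alpha) ** (1.0 / m)


if __name__ == "__main__":
    alpha, m = 0.05, 10
    print(f"Bonferroni level: {bonferroni_threshold(alpha, m):.6f}")
    print(f"Sidak level:      {sidak_threshold(alpha, m):.6f}")
    print(f"Confidence level: {100 * sidak_confidence_level(alpha, m):.4f} %")
```

For ${\displaystyle \alpha }$ = 0.05 and m = 10, the Šidák level printed here should agree with the value of approximately 0.005116 quoted above.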

When the number of hypotheses is large or when the hypotheses are correlated, corrections such as Bonferroni and Šidák tend to give quite conservative results, which motivates the use of other approaches.

## Proof

The Šidák correction is derived by assuming that the individual tests are independent and that all null hypotheses are true. Let the significance threshold for each test be ${\displaystyle \alpha _{1}}$; then the probability that at least one of the tests is significant under this threshold is 1 minus the probability that none of them is significant. Since the tests are assumed to be independent, the probability that none of them is significant is the product of the probabilities that each of them is not significant, namely ${\displaystyle (1-\alpha _{1})^{m}}$, so the probability that at least one is significant is ${\displaystyle 1-(1-\alpha _{1})^{m}}$. Our intention is for this probability to equal ${\displaystyle \alpha }$, the significance threshold for the entire series of tests. By solving for ${\displaystyle \alpha _{1}}$, we obtain ${\displaystyle \alpha _{1}=1-(1-\alpha )^{1/m}.}$ This shows that, in order to reach a given ${\displaystyle \alpha }$ level, we need to adapt the ${\displaystyle \alpha _{1}}$ values used for each test.[4]
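
The argument can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original proof: it assumes that all m null hypotheses are true and that the tests are independent, so each p-value is drawn independently from the uniform distribution on [0, 1]. The function names, the trial count, and the seed are arbitrary choices made for the example.

```python
import random


def analytic_fwer(alpha_1: float, m: int) -> float:
    """Probability of at least one rejection: 1 - (1 - alpha_1)**m."""
    return 1.0 - (1.0 - alpha_1) ** m


def simulate_fwer(alpha: float, m: int, n_trials: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of the familywise error rate under the global null."""
    rng = random.Random(seed)
    # Per-test threshold from the derivation above.
    alpha_1 = 1.0 - (1.0 - alpha) ** (1.0 / m)
    false_alarms = 0
    for _ in range(n_trials):
        # Under a true null, a valid continuous p-value is Uniform(0, 1).
        p_values = [rng.random() for _ in range(m)]
        # A familywise error occurs if any single test rejects.
        if min(p_values) < alpha_1:
            false_alarms += 1
    return false_alarms / n_trials


if __name__ == "__main__":
    alpha, m = 0.05, 10
    alpha_1 = 1.0 - (1.0 - alpha) ** (1.0 / m)
    print(f"analytic FWER:  {analytic_fwer(alpha_1, m):.6f}")
    print(f"simulated FWER: {simulate_fwer(alpha, m):.4f}")
```

The analytic value equals ${\displaystyle \alpha }$ up to floating-point rounding, and the simulated estimate should be close to it, with sampling noise that shrinks as the number of trials grows.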