False discovery rate


False discovery rate (FDR) control is a statistical method used in multiple hypothesis testing to correct for multiple comparisons. It controls the expected proportion of incorrectly rejected null hypotheses (type I errors) among all rejected hypotheses^[1]. It is a less conservative procedure with greater power than familywise error rate^[2] (FWER) control, at the cost of an increased likelihood of type I errors.

The q-value is defined to be the FDR analogue of the p-value. The q-value of an individual hypothesis test is the minimum FDR at which the test may be called significant. One approach is to estimate q-values directly rather than fixing a level at which to control the FDR.
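As a rough illustration of the "minimum FDR at which the test may be called significant", the sketch below computes q-values in the conservative special case where the proportion of true nulls is assumed to be 1. This is a simplification, not the estimator of Storey (2002), which also estimates that proportion from the data. The function name `q_values_pi0_one` and the example p-values are hypothetical.

```python
def q_values_pi0_one(p_values):
    """q-values under the simplifying assumption that all nulls may be true.

    For the i-th smallest p-value, the q-value is the minimum over j >= i
    of m * p_(j) / j, where m is the number of tests.
    """
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, keeping a running minimum.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, m * p_values[idx] / rank)
        q[idx] = running_min
    return q


example_p = [0.001, 0.039, 0.035, 0.008, 0.5]  # made-up p-values
example_q = q_values_pi0_one(example_p)
```

A test is then called significant at FDR level $\alpha$ when its q-value is at most $\alpha$.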

Classification of m hypothesis tests

	# declared non-significant	# declared significant	Total
# true null hypotheses	$U$	$V$	$m_{0}$
# false null hypotheses	$T$	$S$	$m-m_{0}$
Total	$m-R$	$R$	$m$

$m_{0}$ is the number of true null hypotheses
$m-m_{0}$ is the number of false null hypotheses
$U$ is the number of true negatives
$V$ is the number of false positives
$T$ is the number of false negatives
$S$ is the number of true positives
$R=V+S$ is the number of rejected null hypotheses
$H_{1}\ldots H_{m}$ are the null hypotheses being tested
In $m$ hypothesis tests of which $m_{0}$ are true null hypotheses, $R$ is an observable random variable, and $S$, $T$, $U$, and $V$ are all unobservable random variables.

The false discovery rate is given by $E\left[{\frac {V}{V+S}}\right]=E\left[{\frac {V}{R}}\right]$, where ${\frac {V}{R}}$ is taken to be 0 when $R=0$, and one wants to keep this value below a threshold $\alpha$.
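To make the quantities in the table concrete, the following sketch counts $U$, $V$, $T$, $S$ and $R$ for a toy set of tests and computes the realized proportion $V/R$, whose expectation is the FDR. It assumes the truth of each null hypothesis is known, which is only possible in a simulation, and it takes the proportion to be 0 when nothing is rejected. The function name `classify_tests` and the example data are hypothetical.

```python
def classify_tests(is_true_null, rejected):
    """Count the cells of the classification table for m tests.

    is_true_null[i] is True when H_i is a true null (unknown in practice);
    rejected[i] is True when H_i was declared significant.
    """
    U = V = T = S = 0
    for null, rej in zip(is_true_null, rejected):
        if null and not rej:
            U += 1  # true null, not rejected
        elif null and rej:
            V += 1  # true null, rejected: false positive
        elif not null and not rej:
            T += 1  # false null, not rejected: false negative
        else:
            S += 1  # false null, rejected: true positive
    R = V + S
    # Assumed convention: the proportion is 0 when no hypothesis is rejected.
    fdp = V / R if R > 0 else 0.0
    return {"U": U, "V": V, "T": T, "S": S, "R": R, "fdp": fdp}


counts = classify_tests(
    is_true_null=[True, True, True, False, False],
    rejected=[False, True, False, True, True],
)
```

Here one of the three rejections is a true null, so the realized proportion is 1/3. The FDR is the average of this proportion over repeated experiments.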

Controlling procedures

Independent tests

The Simes procedure, as applied by Benjamini and Hochberg (1995), ensures that the false discovery rate $E\left[{\frac {V}{V+S}}\right]$ is at most a given $\alpha$. This guarantee holds when the $m$ tests are independent. Let $H_{1}\ldots H_{m}$ be the null hypotheses and $P_{1}\ldots P_{m}$ their corresponding p-values. Sort the p-values in increasing order and denote them by $P_{(1)}\ldots P_{(m)}$. For a given $\alpha$, find the largest $k$ such that

$P_{(k)}\leq {\frac {k}{m}}\alpha .$

Then reject (i.e. declare positive) all $H_{(i)}$ for $i=1,\ldots ,k$ .
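The procedure can be sketched in a few lines of Python. This is a minimal illustration of the steps described here, not a reference implementation; the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha):
    """Return a list of booleans: True where the hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with P_(k) <= (k / m) * alpha.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k = rank
    # Reject the hypotheses with the k smallest p-values.
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


bh_example_p = [0.001, 0.039, 0.035, 0.008, 0.5]  # made-up p-values
bh_rejected = benjamini_hochberg(bh_example_p, alpha=0.05)
```

In this example the third smallest p-value (0.035) exceeds its own threshold of 0.03, but it is still rejected because the fourth smallest (0.039) falls below its threshold of 0.04. The search is for the largest qualifying $k$, not the first failure.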

Dependent tests

The Benjamini and Yekutieli procedure controls the false discovery rate under dependence assumptions. This refinement modifies the threshold and finds the largest $k$ such that:

$P_{(k)}\leq {\frac {k}{m\cdot c(m)}}\alpha$

If the tests are independent: $c(m)=1$ (same as above)
If the tests are positively correlated: $c(m)=1$
If the tests are negatively correlated (or, more generally, under arbitrary dependence): $c(m)=\sum _{i=1}^{m}{\frac {1}{i}}$
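A minimal Python sketch of the modified threshold, using the harmonic-sum form of $c(m)$, might look as follows. The function name `benjamini_yekutieli` and the example p-values are hypothetical and chosen only for illustration.

```python
def benjamini_yekutieli(p_values, alpha):
    """Return a list of booleans: True where the hypothesis is rejected.

    Uses c(m) = 1 + 1/2 + ... + 1/m, the harmonic-sum correction.
    """
    m = len(p_values)
    c_m = sum(1.0 / i for i in range(1, m + 1))
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with P_(k) <= k / (m * c(m)) * alpha.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / (m * c_m) * alpha:
            k = rank
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


by_example_p = [0.001, 0.039, 0.035, 0.008, 0.5]  # made-up p-values
by_rejected = benjamini_yekutieli(by_example_p, alpha=0.05)
```

With $m=5$ the correction is $c(m)\approx 2.28$, so every threshold shrinks by that factor and only the two smallest p-values (0.001 and 0.008) are rejected at $\alpha =0.05$.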

In the case of negative correlation, $c(m)$ can be approximated using the Euler–Mascheroni constant $\gamma \approx 0.5772$:

$\sum _{i=1}^{m}{\frac {1}{i}}\approx \ln(m)+\gamma .$
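The quality of this approximation can be checked numerically. The short sketch below compares the exact sum with $\ln(m)+\gamma$ for a few values of $m$; the helper names `c_exact` and `c_approx` are hypothetical, and the constant is hard-coded because the Python standard library does not provide it.

```python
import math

# Euler-Mascheroni constant, hard-coded to double precision.
EULER_GAMMA = 0.5772156649015329


def c_exact(m):
    """Exact harmonic sum 1 + 1/2 + ... + 1/m."""
    return sum(1.0 / i for i in range(1, m + 1))


def c_approx(m):
    """Approximation ln(m) + gamma."""
    return math.log(m) + EULER_GAMMA


for m in (10, 100, 1000):
    print(m, c_exact(m), c_approx(m), c_exact(m) - c_approx(m))
```

The approximation slightly underestimates the exact sum, with a gap of roughly $1/(2m)$, so it is already close for the numbers of tests typical in multiple-testing applications.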

References

Benjamini, Y., and Hochberg, Y. (1995). "Controlling the false discovery rate: a practical and powerful approach to multiple testing". Journal of the Royal Statistical Society, Series B (Methodological) 57 (1), 289–300.

Benjamini, Y., and Yekutieli, D. (2001). "The control of the false discovery rate in multiple testing under dependency". The Annals of Statistics 29 (4), 1165–1188.

Storey, J. D. (2002). "A direct approach to false discovery rates". Journal of the Royal Statistical Society, Series B 64, 479–498.

Storey, J. D. (2003). "The positive false discovery rate: a Bayesian interpretation and the q-value". The Annals of Statistics 31, 2013–2035.

[1] Benjamini, Y., and Hochberg, Y. (1995). "Controlling the false discovery rate: a practical and powerful approach to multiple testing". Journal of the Royal Statistical Society, Series B (Methodological) 57 (1), 289–300.
[2] Shaffer, J. P. (1995). "Multiple hypothesis testing". Annual Review of Psychology 46, 561–584. http://dx.doi.org/10.1146/annurev.ps.46.020195.003021