
False discovery rate



False discovery rate (FDR) control is a statistical method used in multiple hypothesis testing to correct for multiple comparisons. It controls the expected proportion of incorrectly rejected null hypotheses (type I errors) in a list of rejected hypotheses.[1] It is a less conservative procedure than familywise error rate (FWER) control,[2] with greater power at the cost of an increased likelihood of obtaining type I errors.

The q-value is defined to be the FDR analogue of the p-value: the q-value of an individual hypothesis test is the minimum FDR at which the test may be called significant. One approach is to estimate q-values directly rather than fixing a level at which to control the FDR.
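
As a rough illustration (not part of the original article), the sketch below computes Benjamini–Hochberg adjusted p-values, which play the role of q-values when the step-up procedure described later is used to control the FDR; the p-values and the helper name bh_adjusted_pvalues are invented for the example.

    # Illustrative sketch: BH-adjusted p-values as an FDR analogue of p-values.
    # Each adjusted value is the smallest FDR level at which that test would be
    # rejected by the Benjamini-Hochberg step-up procedure. Inputs are made up.
    def bh_adjusted_pvalues(pvals):
        m = len(pvals)
        order = sorted(range(m), key=lambda i: pvals[i])
        adjusted = [0.0] * m
        running_min = 1.0
        # Walk from the largest p-value down, enforcing monotonicity of the adjustment.
        for rank in range(m, 0, -1):
            i = order[rank - 1]
            running_min = min(running_min, pvals[i] * m / rank)
            adjusted[i] = running_min
        return adjusted

    print(bh_adjusted_pvalues([0.001, 0.009, 0.039, 0.041, 0.6]))
    # approximately [0.005, 0.0225, 0.05125, 0.05125, 0.6]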

Classification of m hypothesis tests

                              # declared non-significant   # declared significant   Total
  # true null hypotheses      U                            V                        m_0
  # non-true null hypotheses  T                            S                        m − m_0
  Total                       m − R                        R                        m

  • m_0 is the number of true null hypotheses
  • m − m_0 is the number of false null hypotheses
  • V is the number of false positives (type I errors)
  • T is the number of false negatives (type II errors)
  • H_1, H_2, …, H_m are the null hypotheses being tested
  • In m hypothesis tests of which m_0 are true null hypotheses, R is an observable random variable, and S, T, U, and V are all unobservable random variables.

The false discovery rate is given by

    \mathrm{FDR} = \mathrm{E}\left[\frac{V}{V+S}\right] = \mathrm{E}\left[\frac{V}{R}\right],

where V/R is defined to be 0 when R = 0, and one wants to keep this value below a threshold α.
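
To make E[V/R] concrete, here is a small simulation sketch (not from the article; the sample sizes, the cutoff, the alternative distribution, and the helper name simulate_fdp are all invented) that rejects at a naive fixed p-value cutoff and averages the realized proportion V/R over repetitions:

    # Illustrative simulation of E[V/R] for a naive fixed p-value cutoff.
    # All settings (m, m0, cutoff, the alternative distribution) are made up.
    import random

    def simulate_fdp(m=1000, m0=900, cutoff=0.05, reps=200, seed=0):
        rng = random.Random(seed)
        total = 0.0
        for _ in range(reps):
            # p-values: uniform under the m0 true nulls, pushed toward 0 under false nulls.
            pvals = [rng.random() for _ in range(m0)] + \
                    [rng.random() ** 4 for _ in range(m - m0)]
            V = sum(1 for p in pvals[:m0] if p <= cutoff)  # false positives
            R = sum(1 for p in pvals if p <= cutoff)       # total rejections
            total += V / R if R > 0 else 0.0               # V/R taken as 0 when R = 0
        return total / reps

    print(simulate_fdp())  # close to 0.5 here, far above a typical target such as 0.05

The controlling procedures below choose the rejection threshold adaptively from the observed p-values so that this expectation stays below α.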

Controlling procedures

Independent tests

The Simes procedure ensures that the expected value E[V/R] is less than a given α (Benjamini and Hochberg 1995). This procedure is valid only when the tests are independent. Let H_1, …, H_m be the null hypotheses and P_1, …, P_m their corresponding p-values. Order these values in increasing order and denote them by P_{(1)}, …, P_{(m)}. For a given α, find the largest k such that

    P_{(k)} \le \frac{k}{m}\alpha.

Then reject (i.e. declare positive) all H_{(i)} for i = 1, …, k.
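
A minimal sketch of this step-up search in Python (the function name benjamini_hochberg, the example p-values, and α = 0.05 are invented for illustration):

    # Sketch of the step-up procedure: find the largest k with P_(k) <= (k/m)*alpha
    # and reject the k hypotheses with the smallest p-values.
    def benjamini_hochberg(pvals, alpha=0.05):
        m = len(pvals)
        order = sorted(range(m), key=lambda i: pvals[i])
        k = 0
        for rank, i in enumerate(order, start=1):
            if pvals[i] <= rank * alpha / m:
                k = rank
        # Indices of the rejected null hypotheses.
        return {order[j] for j in range(k)}

    # With these made-up p-values, the two smallest are rejected (k = 2).
    print(sorted(benjamini_hochberg([0.001, 0.009, 0.039, 0.041, 0.6], alpha=0.05)))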

Dependent tests

The Benjamini and Yekutieli procedure controls the false discovery rate under dependence assumptions. This refinement modifies the threshold and finds the largest k such that:

  • If the tests are independent: P_{(k)} \le \frac{k}{m}\alpha (same as above)
  • If the tests are positively correlated: P_{(k)} \le \frac{k}{m}\alpha
  • If the tests are negatively correlated: P_{(k)} \le \frac{k}{m \sum_{i=1}^{m} 1/i}\alpha

In the case of negative correlation, the sum can be approximated by using the Euler–Mascheroni constant:

    \sum_{i=1}^{m} \frac{1}{i} \approx \ln(m) + \gamma.
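
As with the independent case, a sketch of this modified step-up search (the function name benjamini_yekutieli and the example values are invented; either the exact harmonic sum or the ln(m) + γ approximation can be used for the correction factor):

    # Sketch of the Benjamini-Yekutieli step-up search: alpha is divided by
    # c(m) = sum_{i=1}^{m} 1/i, optionally approximated by ln(m) + gamma.
    import math

    EULER_MASCHERONI = 0.5772156649015329

    def benjamini_yekutieli(pvals, alpha=0.05, approximate=False):
        m = len(pvals)
        if approximate:
            c_m = math.log(m) + EULER_MASCHERONI
        else:
            c_m = sum(1.0 / i for i in range(1, m + 1))
        order = sorted(range(m), key=lambda i: pvals[i])
        k = 0
        for rank, i in enumerate(order, start=1):
            if pvals[i] <= rank * alpha / (m * c_m):
                k = rank
        return {order[j] for j in range(k)}

    # With the same made-up p-values as above, only the smallest p-value survives
    # the stricter threshold, whereas the plain step-up procedure rejected two.
    print(sorted(benjamini_yekutieli([0.001, 0.009, 0.039, 0.041, 0.6])))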

References

  • Benjamini, Y., and Hochberg, Y. (1995). "Controlling the false discovery rate: a practical and powerful approach to multiple testing". Journal of the Royal Statistical Society, Series B (Methodological) 57 (1), 289–300.
  • Benjamini, Y., and Yekutieli, D. (2001). "The control of the false discovery rate in multiple testing under dependency". The Annals of Statistics 29 (4), 1165–1188.
  • Storey, J. D. (2002). "A direct approach to false discovery rates". Journal of the Royal Statistical Society, Series B 64, 479–498.
  • Storey, J. D. (2003). "The positive false discovery rate: a Bayesian interpretation and the q-value". The Annals of Statistics 31, 2013–2035.
  1. ^ Benjamini, Y., and Hochberg, Y. (1995). "Controlling the false discovery rate: a practical and powerful approach to multiple testing". Journal of the Royal Statistical Society, Series B (Methodological) 57 (1), 289–300.
  2. ^ Shaffer, J. P. (1995). "Multiple hypothesis testing". Annual Review of Psychology 46, 561–584. http://dx.doi.org/10.1146/annurev.ps.46.020195.003021