False positive paradox
The false positive paradox is a statistical result in which false positive test results are more probable than true positive results, occurring when the overall population has a low incidence of a condition, so that the incidence is lower than the test's false positive rate. The probability of a positive test result is determined not only by the accuracy of the test but also by the characteristics of the sampled population.[1] When the incidence, the proportion of those who have a given condition, is lower than the test's false positive rate, even tests that have a very low chance of giving a false positive in an individual case will give more false than true positives overall.[2] So, in a society with proportionately fewer infected people than the rate at which the test gives false positives, there will actually be more people who test positive incorrectly and do not have the disease than people who test positive accurately and do. The result surprises many, hence the term paradox.[3]
It is especially counter-intuitive when interpreting a positive result in a test on a low-incidence population after having dealt with positive results drawn from a high-incidence population.[2] If the false positive rate of the test is higher than the proportion of the new population with the condition, then a test administrator whose experience has been drawn from testing in a high-incidence population may conclude from experience that a positive test result usually indicates a positive subject, when in fact a false positive is far more likely to have occurred.
Not adjusting to the scarcity of the condition in the new population, and concluding that a positive test result probably indicates a positive subject even though population incidence is below the false positive rate, is an instance of the base rate fallacy.
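In Bayesian terms, the quantity being misjudged is the positive predictive value of the test: the posterior probability of infection given a positive result. The following is a minimal Python sketch of that calculation, assuming a test with perfect sensitivity as in the example below (the function and parameter names are illustrative, not drawn from the cited sources):

```python
# A minimal sketch (not from the cited sources): the posterior probability that
# a positive result is a true positive, via Bayes' theorem.
def positive_predictive_value(prevalence, false_positive_rate, sensitivity=1.0):
    """Fraction of positive test results that are true positives."""
    true_positives = sensitivity * prevalence                  # P(infected and positive)
    false_positives = false_positive_rate * (1 - prevalence)   # P(uninfected and positive)
    return true_positives / (true_positives + false_positives)
```

When the prevalence is well below the false positive rate, the returned value falls below one half, so a positive result is more likely to be false than true.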
Example

High-incidence population
| Number of people | Infected | Uninfected | Total |
|---|---|---|---|
| Test positive | 400 (true positive) | 30 (false positive) | 430 |
| Test negative | 0 (false negative) | 570 (true negative) | 570 |
| Total | 400 | 600 | 1000 |
Imagine running an HIV test on population A, consisting of 1000 persons, of whom 40% are infected. The test has a false positive rate of 5% (0.05) and no false negatives. The expected outcome of the 1000 tests on population A would be:
- Infected and test indicates disease (true positive)
- 1000 × 40/100 = 400 people would receive a true positive
- Uninfected and test indicates disease (false positive)
- 1000 × 100 – 40/100 × 0.05 = 30 people would receive a false positive
- The remaining 570 tests are correctly negative.
So, in population A, a person receiving a positive test could be over 93% confident (400/(30 + 400)) that it correctly indicates infection.
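These expected counts can be checked with a few lines of Python; the following is a minimal sketch (variable names are illustrative, not from the cited sources):

```python
# Illustrative arithmetic for population A: 1000 people, 40% infected,
# 5% false positive rate, no false negatives.
population, prevalence, false_positive_rate = 1000, 0.40, 0.05

true_positives = population * prevalence                                # 400
false_positives = population * (1 - prevalence) * false_positive_rate  # 30
confidence = true_positives / (true_positives + false_positives)       # 400/430

print(round(true_positives), round(false_positives), round(confidence, 3))  # 400 30 0.93
```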
Low-incidence population
| Number of people | Infected | Uninfected | Total |
|---|---|---|---|
| Test positive | 20 (true positive) | 49 (false positive) | 69 |
| Test negative | 0 (false negative) | 931 (true negative) | 931 |
| Total | 20 | 980 | 1000 |
Now consider the same test applied to population B, in which only 2% are infected. The expected outcome of 1000 tests on population B would be:
- Infected and test indicates disease (true positive)
- 1000 × 2/100 = 20 people would receive a true positive
- Uninfected and test indicates disease (false positive)
- 1000 × 100 – 2/100 × 0.05 = 49 people would receive a false positive
- The remaining 931 tests are correctly negative.
In population B, only 20 of the 69 total people with a positive test result are actually infected. So, the probability of actually being infected after one is told that one is infected is only 29% (20/(20 + 49)) for a test that otherwise appears to be "95% accurate".
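Repeating the sketch above with the 2% incidence of population B reproduces the reversal (again, names are illustrative):

```python
# Illustrative arithmetic for population B: only 2% infected,
# same 5% false positive rate.
population, prevalence, false_positive_rate = 1000, 0.02, 0.05

true_positives = population * prevalence                                # 20
false_positives = population * (1 - prevalence) * false_positive_rate  # 49
posterior = true_positives / (true_positives + false_positives)        # 20/69

print(round(true_positives), round(false_positives), round(posterior, 3))  # 20 49 0.29
```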
A tester with experience of group A might find it a paradox that in group B, a result that had usually correctly indicated infection is now usually a false positive. The confusion of the posterior probability of infection with the prior probability of receiving a false positive is a natural error after receiving a life-threatening test result.
See also
- Bayes' theorem
- List of paradoxes
- Prosecutor's fallacy, a mistake in reasoning that involves ignoring a low prior probability
- Simpson's paradox, another error in statistical reasoning dealing with comparing groups
- Prevention paradox
References

- ^ Rheinfurth, M. H.; Howell, L. W. (March 1998). Probability and Statistics in Aerospace Engineering (PDF). NASA. p. 16. "False positive tests are more probable than true positive tests when the overall population has a low incidence of the disease. This is called the false-positive paradox."
- ^ a b Vacher, H. L. (May 2003). "Quantitative literacy - drug testing, cancer screening, and the identification of igneous rocks". Journal of Geoscience Education: 2. "At first glance, this seems perverse: the less the students as a whole use steroids, the more likely a student identified as a user will be a non-user. This has been called the False Positive Paradox." Citing Gonick, L.; Smith, W. (1993). The Cartoon Guide to Statistics. New York: Harper Collins. p. 49.
- ^ Madison, B. L. (August 2007). "Mathematical Proficiency for Citizenship". In Schoenfeld, A. H. Assessing Mathematical Proficiency. Mathematical Sciences Research Institute Publications (New ed.). Cambridge University Press. p. 122. ISBN 978-0-521-69766-8. "The correct [probability estimate...] is surprising to many; hence, the term paradox."