False positive paradox
The false positive paradox is a statistical result in which false positive tests are more probable than true positive tests. It occurs when the overall population has a low incidence of a condition and the incidence rate is lower than the false positive rate. The probability of a positive test result is determined not only by the accuracy of the test but also by the characteristics of the sampled population.[1] When the incidence, the proportion of those who have a given condition, is lower than the test's false positive rate, even tests that have a very low chance of giving a false positive in an individual case will give more false than true positives overall.[2] So, in a society with very few infected people (proportionately fewer than the test gives false positives), more people will test positive for a disease incorrectly and not have it than will test positive accurately and have it. The paradox has surprised many.[3]
It is especially counter-intuitive when interpreting a positive result in a test on a low-incidence population after having dealt with positive results drawn from a high-incidence population.[2] If the false positive rate of the test is higher than the proportion of the new population with the condition, then a test administrator who has previously tested only a high-incidence population may conclude from that experience that a positive test result usually indicates a positive subject, when in fact a false positive is far more likely to have occurred.
Failing to adjust to the scarcity of the condition in the new population, and concluding that a positive test result probably indicates a positive subject even though population incidence is below the false positive rate, is a "base rate fallacy".
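The condition described above can be made precise with Bayes' theorem. The following is a sketch using symbols that do not appear in the original text and are assumptions made here for illustration: p for the incidence, f for the false positive rate, s for the sensitivity (one minus the false negative rate), D for "has the condition" and + for "tests positive".

```latex
P(D \mid +) = \frac{s\,p}{s\,p + f\,(1 - p)}

% False positives outnumber true positives when P(D \mid +) < 1/2, that is:
s\,p < f\,(1 - p)
\quad\Longleftrightarrow\quad
\frac{p}{1 - p} < \frac{f}{s}

% With a perfect sensitivity (s = 1) and a small incidence, p/(1-p) \approx p,
% so the condition reduces approximately to p < f.
```

Under these assumptions, the rule of thumb "incidence below the false positive rate" is an approximation of the exact condition on the odds p/(1 - p), and it is very close when the incidence is small.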
Example
High-incidence population
Imagine running an HIV test on population A of 1,000,000 persons, in which 200 out of every 10,000 people (2%) are infected. The test has a false positive rate of 0.0004 (0.04%) and a false negative rate of zero. The expected outcome of a million tests on population A would be:
- Infected and test indicates disease (true positive): 1,000,000 × (200/10,000) = 20,000 people would receive a true positive
- Healthy and test indicates disease (false positive): 1,000,000 × (9,800/10,000) × 0.0004 = 392 people would receive a false positive
- The remaining 979,608 tests are correctly negative.
So, in population A, a person receiving a positive test could be over 98% confident (20,000/20,392) that it correctly indicates infection.
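The arithmetic above can be checked with a short script. This is a minimal sketch, not part of the cited sources: the function name `expected_outcomes` and its parameter names are hypothetical, and exact fractions are used so that the counts come out as whole numbers rather than rounded floats.

```python
from fractions import Fraction


def expected_outcomes(population, incidence, false_positive_rate,
                      false_negative_rate=Fraction(0)):
    """Return the expected count of each test outcome as exact fractions.

    Hypothetical helper for illustration; all rates are proportions in [0, 1].
    """
    infected = population * incidence
    healthy = population - infected
    true_pos = infected * (1 - false_negative_rate)
    false_neg = infected * false_negative_rate
    false_pos = healthy * false_positive_rate
    true_neg = healthy - false_pos
    return {
        "true_positive": true_pos,
        "false_positive": false_pos,
        "true_negative": true_neg,
        "false_negative": false_neg,
    }


# Population A from the text: 200 infected per 10,000, false positive
# rate 0.0004, and no false negatives.
counts_a = expected_outcomes(
    population=1_000_000,
    incidence=Fraction(200, 10_000),
    false_positive_rate=Fraction(4, 10_000),
)
positives_a = counts_a["true_positive"] + counts_a["false_positive"]
ppv_a = counts_a["true_positive"] / positives_a  # share of positives that are true

print(int(counts_a["true_positive"]))   # 20000
print(int(counts_a["false_positive"]))  # 392
print(int(counts_a["true_negative"]))   # 979608
print(f"{float(ppv_a):.4f}")            # 0.9808
```

The last figure is the ratio 20,000/20,392, which is the "over 98%" confidence quoted above.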
Low-incidence population
Now consider the same test applied to population B, in which only 1 person in 10,000 (0.01%) is infected. The expected outcome of a million tests on population B would be:
- Infected and test indicates disease (true positive): 1,000,000 × (1/10,000) = 100 people would receive a true positive
- Healthy and test indicates disease (false positive): 1,000,000 × (9,999/10,000) × 0.0004 ≈ 400 people would receive a false positive
- The remaining 999,500 tests are correctly negative.
In population B, only 100 of the 500 total people with a positive test result are actually infected. So, the probability of actually being infected after receiving a positive result is only 20% (100/500), for a test that otherwise appears to be "over 99.95% accurate".
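The contrast between the two populations can be reproduced by computing the probability of infection given a positive result directly from the incidence. The sketch below is illustrative only: the function name `positive_predictive_value` and the `sensitivity` parameter are assumptions introduced here, and the break-even formula assumes a test with no false negatives, as in the example.

```python
def positive_predictive_value(incidence, false_positive_rate, sensitivity=1.0):
    """Probability of infection given a positive test, via Bayes' theorem.

    Hypothetical helper for illustration; all arguments are proportions.
    """
    true_pos = sensitivity * incidence
    false_pos = false_positive_rate * (1.0 - incidence)
    return true_pos / (true_pos + false_pos)


FALSE_POSITIVE_RATE = 0.0004  # the test from the example

# Incidences of populations A (2%) and B (0.01%) from the text.
for label, incidence in [("A", 0.02), ("B", 0.0001)]:
    ppv = positive_predictive_value(incidence, FALSE_POSITIVE_RATE)
    print(f"population {label}: incidence={incidence:.2%}, PPV={ppv:.1%}")

# Break-even incidence: with no false negatives, true and false positives are
# equally likely when incidence / (1 - incidence) equals the false positive rate.
break_even = FALSE_POSITIVE_RATE / (1.0 + FALSE_POSITIVE_RATE)
print(f"break-even incidence: {break_even:.6%}")
```

Under these assumptions, any population whose incidence falls below the break-even value (just under 0.04% for this test) will see more false positives than true positives.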
A tester with experience of group A might find it a paradox that in group B, a result that had almost always indicated infection is now usually a false positive. The confusion of the posterior probability of infection with the prior probability of receiving a false positive is a natural error after receiving a life-threatening test result.
See also
- Prosecutor's fallacy, a mistake in reasoning that involves ignoring a low prior probability
- Simpson's paradox, another error in statistical reasoning dealing with comparing groups
- Bayes' theorem
References
- Rheinfurth, M. H.; Howell, L. W. (March 1998). Probability and statistics in aerospace engineering (pdf). NASA. p. 16. "MESSAGE: False positive tests are more probable than true positive tests when the overall population has a low incidence of the disease. This is called the false-positive paradox."
- Vacher, H. L. (May 2003). "Quantitative literacy - drug testing, cancer screening, and the identification of igneous rocks". Journal of Geoscience Education: 2. "At first glance, this seems perverse: the less the students as a whole use steroids, the more likely a student identified as a user will be a non-user. This has been called the False Positive Paradox" - Citing: Smith, W. (1993). The cartoon guide to statistics. New York: Harper Collins. p. 49.
- Madison, B. L. (August 2007). "Mathematical Proficiency for Citizenship". In Schoenfeld, A. H. Assessing Mathematical Proficiency. Mathematical Sciences Research Institute Publications (New ed.). Cambridge University Press. p. 122. ISBN 978-0-521-69766-8. "The correct [probability estimate...] is surprising to many; hence, the term paradox."