Base rate

From Wikipedia, the free encyclopedia

In probability and statistics, the base rate (also known as prior probabilities) is the class of probabilities unconditional on "featural evidence" (likelihoods).

It is the proportion of individuals in a population who have a certain characteristic or trait. For example, if 1% of the population were medical professionals, and remaining 99% were not medical professionals, then the base rate of medical professionals is 1%. The method for integrating base rates and featural evidence is given by Bayes' rule.

In the sciences, including medicine, the base rate is critical for comparison.[1] In medicine a treatment's effectiveness is clear when the base rate is available. For example, if the control group, using no treatment at all, had their own base rate of 1/20 recoveries within 1 day (meaning 1 out of every 20 people recover in 1 day) and a treatment had a 1/100 base rate of recovery within 1 day, we see that the treatment actively decreases the recovery in the first day for the winter cold.

The base rate is an important concept in statistical inference, particularly in Bayesian statistics.[2] In Bayesian analysis, the base rate is combined with the observed data to update our belief about the probability of the characteristic or trait of interest. The updated probability is known as the posterior probability and is denoted as P(A|B), where B represents the observed data. For example, suppose we are interested in estimating the prevalence of a disease in a population. The base rate would be the proportion of individuals in the population who have the disease. If we observe a positive test result for a particular individual, we can use Bayesian analysis to update our belief about the probability that the individual has the disease. The updated probability would be a combination of the base rate and the likelihood of the test result given the disease status.

The base rate is also important in decision-making, particularly in situations where the cost of false positives and false negatives are different.[3] For example, in medical testing, a false negative (failing to diagnose a disease) could be much more costly than a false positive (incorrectly diagnosing a disease). In such cases, the base rate can help inform decisions about the appropriate threshold for a positive test result.

Base rate fallacy[edit]

Many psychological studies have examined a phenomenon called base-rate neglect or base rate fallacy, in which category base rates are not integrated with presented evidence in a normative manner,[4] although not all evidence is consistent regarding how common this fallacy is.[5] Mathematician Keith Devlin illustrates the risks as a hypothetical type of cancer that afflicts 1% of all people. Suppose a doctor then says there is a test for said cancer that is approximately 80% reliable, and that the test provides a positive result for 100% of people who have cancer, but it also results in a 'false positive' for 20% of people - who do not have cancer. Testing positive may therefore lead people to believe that it is 80% likely that they have cancer. Devlin explains that the odds are instead less than 5%. What is missing from these statistics is the relevant base rate information. The doctor should be asked, "Out of the number of people who test positive (base rate group), how many have cancer?"[6] In assessing the probability that a given individual is a member of a particular class, information other than the base rate needs to be accounted for, especially featural evidence. For example, when a person wearing a white doctor's coat and stethoscope is seen prescribing medication, there is evidence that allows for the conclusion that the probability of this particular individual being a medical professional is considerably more significant than the category base rate of 1%.[citation needed]

See also[edit]


  1. ^ Stephanie (2015-08-18). "Base Rates and the Base Rate Fallacy: Definition, Examples". Statistics How To. Retrieved 2022-10-07.
  2. ^ Birnbaum, Michael H. (Spring 1983). "Base Rates in Bayesian Inference: Signal Detection Analysis of the Cab Problem". The American Journal of Psychology. 96 (1): 85–94. doi:10.2307/1422211. JSTOR 1422211.
  3. ^ Darling, John A.; Jerde, Christopher L.; Sepulveda, Adam J. (September 2021). "What do you mean by false positive?". Environmental DNA. 3 (5): 879–883. doi:10.1002/edn3.194. ISSN 2637-4943. PMC 8941663. PMID 35330629.
  4. ^ Bar-Hillel, Maya (1980). "The base-rate fallacy in probability judgments". Acta Psychologica. 44 (3): 211–233. doi:10.1016/0001-6918(80)90046-3.
  5. ^ Koehler, Jonathan J. (1996). "The base rate fallacy reconsidered: Descriptive, normative, and methodological challenges". Behavioral and Brain Sciences. 19 (1): 1–17. doi:10.1017/S0140525X00041157. ISSN 0140-525X. S2CID 53343238.
  6. ^ "". Retrieved 2021-03-22.