Berkson's paradox, also known as Berkson's bias or Berkson's fallacy, is a result in conditional probability and statistics that some people find counterintuitive, and hence a veridical paradox. It is a complicating factor arising in statistical tests of proportions. Specifically, it arises when there is an ascertainment bias inherent in a study design. The effect is related to the explaining away phenomenon in Bayesian networks.
- if 0 < P(A) < 1 and 0 < P(B) < 1,
- and P(A|B) = P(A), i.e. they are independent,
- then P(A|B,C) < P(A|C) where C = A∪B (i.e. A or B).
In words, given two independent events, if you only consider outcomes where at least one occurs, then they become negatively dependent.
The cause is that the conditional probability of event A occurring, given that it or B occurs, is inflated: it is higher than the unconditional probability, because we have excluded cases where neither occurs.
- P(A|A∪B) > P(A)
- conditional probability inflated relative to unconditional
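The two inequalities can be checked with exact arithmetic. The following is a minimal sketch, assuming only that A and B are independent with probabilities strictly between 0 and 1; the function name `berkson_probabilities` is illustrative and not taken from any library.

```python
from fractions import Fraction

def berkson_probabilities(p_a, p_b):
    """Return (P(A | A∪B), P(A | B, A∪B)) for independent events A and B."""
    p_a, p_b = Fraction(p_a), Fraction(p_b)
    p_both = p_a * p_b                    # independence: P(A∩B) = P(A) P(B)
    p_union = p_a + p_b - p_both          # P(A∪B)
    p_a_given_union = p_a / p_union       # A is contained in A∪B
    p_a_given_b_and_union = p_both / p_b  # B is contained in A∪B, so this is P(A | B)
    return p_a_given_union, p_a_given_b_and_union

given_union, given_b_and_union = berkson_probabilities(Fraction(1, 2), Fraction(1, 2))
print(given_union, given_b_and_union)  # 2/3 1/2
```

Because P(A∪B) < 1 whenever both probabilities lie strictly between 0 and 1, the first value always exceeds P(A), while the second always equals P(A).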
One can see this in tabular form as follows: every cell except ~A & ~B is an outcome where at least one event occurs (and ~A means "not A").
| ||A||~A|
|B||A & B||~A & B|
|~B||A & ~B||~A & ~B|
For instance, if one has a sample of 100, and both A and B occur independently half the time (so P(A) = P(B) = 1/2), one obtains:
| ||A||~A|
|B||25||25|
|~B||25||25|
So in 75 outcomes, either A or B occurs, of which 50 have A occurring, so
- P(A|A∪B) = 50/75 = 2/3 > 1/2 = 50/100 = P(A).
Thus the probability of A is higher in the subset (of outcomes where it or B occurs), 2/3, than in the overall population, 1/2.
Berkson's paradox arises because, within this subset, the conditional probability of A given B equals its value in the overall population, whereas the unconditional probability of A within the subset is inflated relative to the overall population. Hence, within the subset, the presence of B decreases the probability of A (back to its overall unconditional probability):
- P(A|B, A∪B) = P(A|B) = P(A)
- P(A|A∪B) > P(A).
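The arithmetic of the 100-outcome example can be reproduced by counting cells directly. This is a small sketch, assuming the four cells each hold 25 outcomes (which follows from independence with P(A) = P(B) = 1/2); the helper names are illustrative.

```python
from fractions import Fraction

# Counts per (A occurs, B occurs) cell for 100 outcomes with P(A) = P(B) = 1/2
# and A, B independent: each cell holds 25 outcomes.
counts = {
    (True, True): 25,
    (True, False): 25,
    (False, True): 25,
    (False, False): 25,
}

def cond_prob(event, given):
    """Return P(event | given) as an exact fraction from the counts table."""
    kept = {cell: n for cell, n in counts.items() if given(*cell)}
    hits = sum(n for cell, n in kept.items() if event(*cell))
    return Fraction(hits, sum(kept.values()))

def a_occurs(a, b):
    return a

def anything(a, b):
    return True

def a_or_b(a, b):
    return a or b

def b_within_union(a, b):
    return b and (a or b)

p_a = cond_prob(a_occurs, anything)                      # 50/100
p_a_given_union = cond_prob(a_occurs, a_or_b)            # 50/75
p_a_given_b_union = cond_prob(a_occurs, b_within_union)  # 25/50
print(p_a, p_a_given_union, p_a_given_b_union)  # 1/2 2/3 1/2
```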
Berkson's original illustration involves a retrospective study examining a risk factor for a disease in a statistical sample from a hospital in-patient population. If a control group is also ascertained from the in-patient population, a difference in hospital admission rates for the control sample and case sample can result in a spurious negative association between the disease and the risk factor. For example, a hospital patient without diabetes is more likely to have cholecystitis than a member of the general population, since the patient must have had some non-diabetes reason to enter the hospital in the first place.
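The hospital effect can be illustrated with a toy calculation. The sketch below is not Berkson's own analysis: it assumes that diabetes and cholecystitis are independent in the general population, that each condition (and an unspecified "other" cause) independently triggers admission, and every prevalence and admission probability in it is a hypothetical number chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical prevalences and admission probabilities (illustrative only).
P_DIABETES = Fraction(1, 10)
P_CHOLECYSTITIS = Fraction(1, 20)
ADMIT_DIABETES = Fraction(1, 5)
ADMIT_CHOLECYSTITIS = Fraction(1, 2)
ADMIT_OTHER = Fraction(1, 50)

def p_admitted(diabetes, cholecystitis):
    """Chance of admission when each possible cause acts independently."""
    p_stay_out = 1 - ADMIT_OTHER
    if diabetes:
        p_stay_out *= 1 - ADMIT_DIABETES
    if cholecystitis:
        p_stay_out *= 1 - ADMIT_CHOLECYSTITIS
    return 1 - p_stay_out

# Probability of each (diabetes, cholecystitis) cell among admitted patients,
# up to a common normalising constant.
in_hospital = {}
for d, c in product([True, False], repeat=2):
    p_d = P_DIABETES if d else 1 - P_DIABETES
    p_c = P_CHOLECYSTITIS if c else 1 - P_CHOLECYSTITIS
    in_hospital[(d, c)] = p_d * p_c * p_admitted(d, c)

def p_chol_given(diabetes):
    """P(cholecystitis | diabetes status, admitted)."""
    with_chol = in_hospital[(diabetes, True)]
    return with_chol / (with_chol + in_hospital[(diabetes, False)])

# Odds ratio between the two conditions among in-patients (1 in the population).
odds_ratio = (in_hospital[(True, True)] * in_hospital[(False, False)]) / (
    in_hospital[(True, False)] * in_hospital[(False, True)]
)
print(float(p_chol_given(False)), float(p_chol_given(True)), float(odds_ratio))
```

Under these assumptions the odds ratio among in-patients falls below 1 even though the two conditions are independent in the population, and in-patients without diabetes show a higher rate of cholecystitis than in-patients with diabetes.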
An example presented by Jordan Ellenberg: Suppose you will only date a man if his niceness plus his handsomeness exceeds some threshold. Then nicer men do not have to be as handsome in order to qualify for your dating pool. So, among the men that you date, you may observe that the nicer ones are less handsome on average (and vice versa), even if these traits are uncorrelated in the general population.
Note that this does not mean that men in your dating pool compare unfavorably with men in the population. On the contrary, your selection criterion means that you have high standards. The average nice man that you date is actually more handsome than the average man in the population (since even among nice men, you skip the ugliest portion of the population). Berkson's negative correlation is an effect that arises within your dating pool: the rude men that you date must have been even more handsome to qualify.
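A short simulation makes the dating-pool effect concrete. This is a sketch under stated assumptions: niceness and handsomeness are drawn as independent standard normal scores, the threshold of 1.0 is arbitrary, and "nice" is taken to mean a niceness score above zero; none of these choices come from Ellenberg's example.

```python
import random
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
# Each man is a (niceness, handsomeness) pair of independent scores.
population = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20_000)]

THRESHOLD = 1.0  # hypothetical cutoff on niceness + handsomeness
pool = [(n, h) for n, h in population if n + h > THRESHOLD]

r_population = pearson(*zip(*population))  # close to 0
r_pool = pearson(*zip(*pool))              # clearly negative

mean_h_population = mean(h for _, h in population)
mean_h_nice_pool = mean(h for n, h in pool if n > 0)
mean_h_rude_pool = mean(h for n, h in pool if n <= 0)

print(f"correlation in population: {r_population:.3f}")
print(f"correlation in dating pool: {r_pool:.3f}")
print(f"mean handsomeness: population {mean_h_population:.2f}, "
      f"nice in pool {mean_h_nice_pool:.2f}, rude in pool {mean_h_rude_pool:.2f}")
```

In such a run the correlation is near zero in the population and negative in the pool, while the nice men in the pool are still more handsome on average than the population, and the rude men in the pool more handsome still.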
As a quantitative example, suppose a collector has 1000 postage stamps, of which 300 are pretty and 100 are rare, with 30 being both pretty and rare. 10% of all her stamps are rare and 10% of her pretty stamps are rare, so prettiness tells nothing about rarity. She puts the 370 stamps which are pretty or rare on display. Just over 27% of the stamps on display are rare, but still only 10% of the pretty stamps are rare (and 100% of the 70 not-pretty stamps on display are rare). If an observer only considers stamps on display, he will observe a spurious negative relationship between prettiness and rarity as a result of the selection bias (that is, not-prettiness strongly indicates rarity in the display, but not in the total collection).
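The stamp figures can be verified with exact fractions. The sketch below uses only the counts given in the example; the variable names are illustrative.

```python
from fractions import Fraction

total, pretty, rare, both = 1000, 300, 100, 30

rare_not_pretty = rare - both        # 70 stamps are rare but not pretty
displayed = pretty + rare - both     # 370 stamps are pretty or rare

# Whole collection: prettiness carries no information about rarity.
p_rare = Fraction(rare, total)                # 1/10
p_rare_given_pretty = Fraction(both, pretty)  # 1/10

# Display only: rarity is over-represented, and not-pretty implies rare.
p_rare_on_display = Fraction(rare, displayed)                       # 10/37
p_rare_given_pretty_on_display = Fraction(both, pretty)             # 1/10
p_rare_given_not_pretty_on_display = Fraction(rare_not_pretty,
                                              displayed - pretty)   # 1

print(displayed, float(p_rare_on_display))
print(p_rare_given_pretty_on_display, p_rare_given_not_pretty_on_display)
```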
- Berkson, Joseph (June 1946). "Limitations of the Application of Fourfold Table Analysis to Hospital Data". Biometrics Bulletin 2 (3): 47–53. doi:10.2307/3002000. JSTOR 3002000. (The paper is frequently miscited as Berkson, J. (1949) Biological Bulletin 2, 47–53.)
- Jordan Ellenberg, "Why are handsome men such jerks?"