Base rate fallacy

From Wikipedia, the free encyclopedia
{{More footnotes|date=May 2010}}
The '''base rate fallacy''', also called '''base rate neglect''' or '''base rate bias''', is an error that occurs when the [[conditional probability]] of some hypothesis H given some evidence E is assessed without taking into account the [[prior probability]] ("[[base rate]]") of H and the total probability of evidence E.<ref>{{cite web|url=http://www.fallacyfiles.org/baserate.html |title=Logical Fallacy: The Base Rate Fallacy |publisher=Fallacyfiles.org |date= |accessdate=2013-06-15}}</ref> The conditional probability can be expressed as P(H&nbsp;|&nbsp;E), the probability of H given E. The fallacy is committed when the values of [[sensitivity and specificity]], which depend only on the test itself, are used in place of [[positive predictive value]] and [[negative predictive value]], which depend on both the test and the baseline prevalence of the event.
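The distinction can be made concrete with a short calculation. The following is a minimal Python sketch (an illustration for this article's examples, not code from any cited source) that derives the positive predictive value from a test's sensitivity, specificity, and base rate using [[Bayes' theorem]]; note how the result changes with the prevalence even though the test's characteristics stay fixed:

<syntaxhighlight lang="python">
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(condition | positive test), by Bayes' theorem."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)

# The same test (99% sensitivity, 99% specificity) at two base rates:
print(positive_predictive_value(0.99, 0.99, 0.5))     # ~0.99
print(positive_predictive_value(0.99, 0.99, 0.0001))  # ~0.0098, about 1%
</syntaxhighlight>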


==Example==

===Breathalyzer===
A group of police officers have [[breathalyzer]]s that display false drunkenness in 5% of the cases in which the driver is sober, but they never fail to detect a truly drunk person. One in a thousand drivers is driving drunk. Suppose the police officers stop a driver at random and force him or her to take a breathalyzer test, and it indicates drunkenness. Assuming nothing else is known about the driver, how high is the probability that he or she really is driving drunk?

Many would answer as high as 0.95, but the correct probability is about 0.02: the 5% false positive rate applied to the 999 sober drivers in every 1,000 produces roughly 50 false alarms for every drunk driver correctly detected.

===Terrorist identification===
In a city of 1 million inhabitants let there be 100 terrorists and 999,900 non-terrorists. To simplify the example, it is assumed that all people present in the city are inhabitants. Thus, the [[base rate]] probability of a randomly selected inhabitant of the city being a terrorist is 0.0001, and the [[base rate]] probability of that same inhabitant being a non-terrorist is 0.9999. In an attempt to catch the terrorists, the city installs an alarm system with a surveillance camera and automatic [[facial recognition system|facial recognition software]]. The software has two failure rates of 1%:

# The false negative rate: If the camera scans a terrorist, a bell will ring 99% of the time, and it will fail to ring 1% of the time.
# The false positive rate: If the camera scans a non-terrorist, a bell will not ring 99% of the time, but it will ring 1% of the time.

Suppose now that an inhabitant triggers the alarm. What is the chance that the person is a terrorist? In other words, what is P(''T''&nbsp;|&nbsp;''B''), the probability that a terrorist has been detected given the ringing of the bell? Someone committing the base rate fallacy would infer that there is a 99% chance that the detected person is a terrorist. Although the inference seems to make sense, it is actually bad reasoning, and a [[Base rate fallacy#Mathematical formalism|calculation below]] will show that the chance the person is a terrorist is actually near 1%, not near 99%.

The fallacy arises from confusing two different failure rates. The 'number of non-bells per 100 terrorists' and the 'number of non-terrorists per 100 bells' are unrelated quantities; one does not necessarily equal the other, and they need not even be approximately equal. To show this, consider what happens if an identical alarm system were set up in a second city with no terrorists at all. As in the first city, the alarm sounds for 1 out of every 100 non-terrorist inhabitants detected, but unlike in the first city, the alarm never sounds for a terrorist. Therefore 100% of all occasions of the alarm sounding are for non-terrorists, and a false negative rate cannot even be calculated. The 'number of non-terrorists per 100 bells' in that city is 100, yet P(''T''&nbsp;|&nbsp;''B'') = 0%. There is zero chance that a terrorist has been detected given the ringing of the bell.

Imagine that the city's entire population of one million people pass in front of the camera. About 99 of the 100 terrorists will trigger the alarm, and so will about 9,999 of the 999,900 non-terrorists. Therefore, about 10,098 people will trigger the alarm, among whom about 99 will be terrorists. So the probability that a person triggering the alarm actually is a terrorist is only about 99 in 10,098, which is less than 1% and far below the intuitive guess of 99%.
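This head count can be verified directly. The following sketch (again purely illustrative) tallies the expected numbers from the figures above:

<syntaxhighlight lang="python">
population = 1_000_000
terrorists = 100
non_terrorists = population - terrorists            # 999,900

alarms_from_terrorists = 0.99 * terrorists          # ~99 true alarms
alarms_from_non_terrorists = 0.01 * non_terrorists  # ~9,999 false alarms
total_alarms = alarms_from_terrorists + alarms_from_non_terrorists  # ~10,098

# Probability that a person who triggered the alarm is a terrorist:
print(alarms_from_terrorists / total_alarms)        # ~0.0098, about 1%
</syntaxhighlight>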

The base rate fallacy is so misleading in this example because there are many more non-terrorists than terrorists. If, instead, the city had about as many terrorists as non-terrorists, and the false-positive rate and the false-negative rate were nearly equal, then the probability of misidentification would be about the same as the false-positive rate of the device. These special conditions do hold sometimes: for instance, about half the women undergoing a pregnancy test are actually pregnant, and some pregnancy tests give about the same rates of false positives and of false negatives. In this case, the rate of false positives per positive test will be nearly equal to the rate of false positives per non-pregnant woman. This is why it is very easy to fall into this fallacy: by coincidence it gives the correct answer in many common situations.
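To illustrate with numbers chosen purely for this sketch, suppose the base rate of pregnancy among tested women is 0.5 and the test has false-positive and false-negative rates of 0.05 each. Bayes' theorem then gives

: <math>P(\mathrm{pregnant}\mid\mathrm{positive}) = \frac{0.95 \times 0.5}{0.95 \times 0.5 + 0.05 \times 0.5} = 0.95,</math>

so the error rate per positive test equals the device's 5% false-positive rate, and the fallacious shortcut happens to give the right answer.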

In many real-world situations, though, particularly problems like detecting criminals in a largely law-abiding population, the small proportion of targets in the large population makes the base rate fallacy very applicable. Even a very low false-positive rate will result in so many false alarms as to make such a system useless in practice.

==Mathematical formalism==
In the above example, where P(''T''&nbsp;|&nbsp;''B'') means the probability of ''T'' given ''B'', the base rate fallacy is committed by assuming that P(terrorist&nbsp;|&nbsp;bell) equals P(bell&nbsp;|&nbsp;terrorist) and then adding the premise that P(bell&nbsp;|&nbsp;terrorist)=99%. Now, is it true that P(terrorist&nbsp;|&nbsp;bell) equals P(bell&nbsp;|&nbsp;terrorist)?

: <math>P(\mathrm{terrorist}\mid\mathrm{bell}) \,\overset{?}{=}\, P(\mathrm{bell}\mid\mathrm{terrorist}).</math>

That is not true. Instead, the correct calculation uses [[Bayes' theorem]] to take into account the prior probability of any randomly selected inhabitant in the city being a terrorist and the total probability of the bell ringing:

: <math>
\begin{align}
& {}\quad P(\mathrm{terrorist}\mid\mathrm{bell}) \\[10pt]
&= \frac{P(\mathrm{bell} \mid \mathrm{terrorist}) P(\mathrm{terrorist})}
{P(\mathrm{bell})} \\[10pt]
&= \frac{P(\mathrm{bell} \mid \mathrm{terrorist}) \times P(\mathrm{terrorist})}
{ P(\mathrm{bell} \mid \mathrm{terrorist}) \times P(\mathrm{terrorist}) + P(\mathrm{bell} \mid \mathrm{nonterrorist}) \times P(\mathrm{nonterrorist})} \\[10pt]
&= \frac{0.99 \times (100/1,000,000)}
{\frac{0.99 \times 100}{1,000,000} + \frac{0.01 \times 999,900}{1,000,000}} \\[10pt]
&= 1/102 \approx 1\%
\end{align}
</math>

Thus, in the example the probability was overestimated by more than 100 times, due to the failure to take into account that there are about 10,000 times more non-terrorists than terrorists (in other words, the failure to take into account the prior probability of being a terrorist).


==Findings in psychology==
In experiments, people have been found to prefer individuating information over general information when the former is available.<ref>{{cite journal |last=Bar-Hillel |first=Maya |year=1980 |title=The base-rate fallacy in probability judgments |journal=Acta Psychologica |volume=44 |pages=211–233}}</ref><ref name="kahneman1973">{{cite journal |last=Kahneman |first=Daniel |author2=Tversky, Amos |year=1973 |title=On the psychology of prediction |journal=Psychological Review |volume=80 |pages=237–251 |doi=10.1037/h0034747}}</ref><ref>{{cite book |last=Kahneman |first=Daniel |author2=Tversky, Amos |year=1985 |chapter=Evidential impact of base rates |editor1=Daniel Kahneman |editor2=Paul Slovic |editor3=Amos Tversky |title=Judgment under uncertainty: Heuristics and biases |pages=153–160}}</ref>

In some experiments, students were asked to estimate the [[grade point average]]s (GPAs) of hypothetical students. When given relevant statistics about GPA distribution, students tended to ignore them if given descriptive information about the particular student, even if the new descriptive information was obviously of little or no relevance to school performance.<ref name="kahneman1973" /> This finding has been used to argue that interviews are an unnecessary part of the [[college admissions]] process because interviewers are unable to pick successful candidates better than basic statistics.

Psychologists [[Daniel Kahneman]] and [[Amos Tversky]] attempted to explain this finding in terms of a simple rule or "heuristic" called [[representativeness heuristic|representativeness]]. They argued that many judgements relating to likelihood, or to cause and effect, are based on how representative one thing is of another, or of a category.<ref name="kahneman1973" /> [[Richard E. Nisbett]] has argued that some attributional biases like the [[fundamental attribution error]] are instances of the base rate fallacy: people underutilize "consensus information" (the "base rate") about how others behaved in similar situations and instead prefer simpler [[dispositional attribution]]s.<ref>{{cite book |last=Nisbett |first=Richard E. |year=1976 |chapter=Popular induction: Information is not always informative |editor1=J. S. Carroll |editor2=J. W. Payne |title=Cognition and social behavior |volume=2 |pages=227–236}}</ref>

Kahneman considers base rate neglect to be a specific form of [[extension neglect]].<ref>{{cite book |last=Kahneman |first=Daniel |year=2000 |chapter=Evaluation by moments, past and future |editor1=Daniel Kahneman |editor2=Amos Tversky |title=Choices, Values and Frames}}</ref>

==See also==
* [[False positive paradox]]
* [[Inductive argument]]
* [[List of cognitive biases]]
* [[Misleading vividness]]
* [[Prosecutor's fallacy]]


==References==
{{Reflist}}


==External links==
