Correlation does not imply causation: Difference between revisions

From Wikipedia, the free encyclopedia
Revision as of 06:40, 28 August 2006

Correlation implies causation, also known as cum hoc ergo propter hoc (Latin for "with this, therefore because of this") and false cause, is a logical fallacy by which two events that occur together are prematurely claimed to have a cause-and-effect relationship.

General pattern

In this type of logical fallacy, one draws a premature conclusion about causality after observing only a correlation between two or more factors. Generally, if one factor (A) is observed to be correlated with another factor (B), it is sometimes taken for granted that A causes B, even when no evidence supports a cause-and-effect relationship. The inference is fallacious because there are at least four other possibilities:

  1. B may be the cause of A, or
  2. some unknown third factor is actually the cause of the relationship between A and B, or
  3. the "relationship" is so complex that it can be labelled coincidental (i.e., two events occurring at the same time that have no simple relationship to each other besides the fact that they are occurring at the same time), or
  4. B may be the cause of A at the same time as A is the cause of B (contradicting the claim that the only relationship between A and B is that A causes B).

In other words, (1.) no conclusion can be drawn about the direction of a possible cause-and-effect relationship from the observation that A is correlated with B alone, and (2.) in a correlation between two factors there is always the possibility that a third, unknown factor is the cause of the relationship between A and B. Determining whether there is an actual cause-and-effect relationship requires further scientific investigation, even in cases in which the relationship between A and B is determined to be a "perfect" statistical correlation (i.e., a correlation coefficient of +1 or -1, which means that one factor can be predicted exactly from the other). Possibility (4.) describes a system that is self-reinforcing.

Examples

Scientific research finds that people who use cannabis (A) have a higher risk of developing a psychiatric disorder (B).

This correlation is sometimes used to support the theory that the use of cannabis causes a psychiatric disorder (A is the cause of B). Although this may be possible, we cannot automatically discern a cause-and-effect relationship from research that has only determined that people who use cannabis are more likely to develop a psychiatric disorder. From the same research, it may also be the case that (1.) having the predisposition for a psychiatric disorder causes these individuals to use cannabis (B causes A), or (2.) some unknown third factor (e.g., poverty) is the actual reason that the study found a higher number of people (compared to the general public) who both use cannabis and have been diagnosed with a psychiatric disorder. To assume that A causes B is tempting, but further scientific investigation of the type that can isolate extraneous variables is needed when the current research has only determined a statistical correlation.


Another example:

Ice-cream sales are strongly correlated with crime rates.
Therefore, higher ice-cream sales cause crime.

The above argument commits the cum hoc ergo propter hoc fallacy, because it prematurely concludes ice cream sales cause crime. A more plausible explanation is that high temperatures increase ice-cream sales but also increase crime rates -- perhaps by making people irritable or restless, or by increasing the number of people outside at night.
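The role of a common cause such as temperature can be illustrated with a small simulation. The following Python sketch is purely illustrative: the function names, the coefficients (2.0 and 1.5), and the noise levels are assumptions chosen for the example and are not drawn from any real data on ice-cream sales or crime. In the simulated data, temperature drives both quantities and neither affects the other, yet the two are strongly correlated; the correlation largely disappears once temperature is held fixed (the partial correlation).

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    r_xy, r_xz, r_yz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate(seed=0, n=5000):
    """Return (raw correlation, partial correlation given temperature)."""
    rng = random.Random(seed)
    # Hypothetical model: temperature is the only common cause.
    temperature = [rng.gauss(20.0, 8.0) for _ in range(n)]
    ice_cream = [2.0 * t + rng.gauss(0.0, 5.0) for t in temperature]
    crime = [1.5 * t + rng.gauss(0.0, 5.0) for t in temperature]
    raw = pearson(ice_cream, crime)
    controlled = partial_corr(ice_cream, crime, temperature)
    return raw, controlled


raw, controlled = simulate()
print(f"ice cream vs crime:              {raw:.2f}")
print(f"same, holding temperature fixed: {controlled:.2f}")
```

Under these assumptions the raw correlation is large while the partial correlation is close to zero, which is the pattern expected when a third factor accounts for the whole relationship.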

A recent scientific example:

Young children who sleep with the light on are much more likely to develop myopia in later life.

This result of a study at the University of Pennsylvania Medical Center was published in the May 13, 1999, issue of Nature and received much coverage at the time in the popular press [1]. However, a later study at Ohio State University did not find any link between infants sleeping with the light on and developing myopia. It did find a strong link between parental myopia and the development of child myopia, and it also noted that myopic parents were more likely to leave a light on in their children's bedroom [2].

Another example:

Not eating causes anorexia nervosa.

This could, however, be an example of case (4.): it is correct that not eating does cause anorexia nervosa, but it can also be claimed that having developed anorexia nervosa causes one not to eat. Like all cases of (4.), it describes a phenomenon that is self-reinforcing.

Determining causation

The Enlightenment philosopher David Hume argued that causality cannot be perceived (and therefore cannot be known or proven), and that instead we can only perceive correlation. However, he argued that we can use the scientific method to rule out false causes.

In modern science, causation is defined by a counterfactual. Suppose that a student performed poorly on a test and guesses that the cause was not studying. To prove this, we think of the counterfactual - the same student writing the same test under the same circumstances, but having studied the night before. This counterfactual is certainly possible, but it is not what happened and so the counterfactual test score cannot be observed. If we could rewind history, and change only one small thing (making the student study for the exam), then causation could be observed (by comparing version 1 to version 2). Because we cannot rewind history and replay events after making small controlled changes, causation can only be inferred, never exactly known. This is referred to as the Fundamental Problem of Causal Inference - it is impossible to directly observe causal effects.
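The missing counterfactual can be pictured as a record with two potential test scores, of which only one is ever filled in. The Python sketch below is illustrative only: the `Student` class, the `observe` function, and the scores 85 and 60 are assumptions made for the example and do not come from any study.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Student:
    """A student with both potential outcomes (known only to the simulation)."""
    score_if_studied: float
    score_if_not_studied: float
    studied: bool  # what actually happened

    def true_effect(self) -> float:
        # The individual causal effect needs BOTH scores, so it can be
        # computed here but never measured for a real student.
        return self.score_if_studied - self.score_if_not_studied


def observe(student: Student) -> dict[str, Optional[float]]:
    """Return what an observer can actually record: one score, one gap."""
    return {
        "score_if_studied": student.score_if_studied if student.studied else None,
        "score_if_not_studied": None if student.studied else student.score_if_not_studied,
    }


# Hypothetical student who did not study.
student = Student(score_if_studied=85.0, score_if_not_studied=60.0, studied=False)
print(observe(student))       # the studied score is missing (None)
print(student.true_effect())  # available only because the data are simulated
```

The `None` entry stands for the counterfactual score: whichever choice the student makes, exactly one of the two fields is unobservable.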

The central goal of scientific experiments and statistical methods is to approximate the counterfactual state of the world as closely as possible. For example, one could run an experiment on identical twins who were known to consistently get the same grades on their tests. One twin is sent to study for six hours while the other is sent to the amusement park. If their test scores suddenly diverged by a large degree, this would be strong evidence that studying (or going to the amusement park) had a causal effect on test scores. In this case, correlation between studying and test scores would almost certainly indicate causation.

While few experiments or statistical studies are as compelling as this, it is certainly possible to find powerful evidence of causation. While causation cannot be directly observed, science can still learn a great deal about causation.
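One common way to approximate the counterfactual across a whole group, rather than for a single person, is to assign the treatment at random. The following Python simulation is a sketch under stated assumptions: the hidden `ability` variable, the true effect of 10 points, and all other numbers are hypothetical. It compares the difference in average scores when students choose for themselves whether to study (with abler students more likely to do so) against the difference when studying is assigned by a coin flip.

```python
import random
from statistics import mean

TRUE_EFFECT = 10.0  # hypothetical gain in test score caused by studying


def score(ability, studied, rng):
    """Test score: baseline from ability, plus the effect of studying, plus noise."""
    return 50.0 + 10.0 * ability + (TRUE_EFFECT if studied else 0.0) + rng.gauss(0.0, 5.0)


def difference_in_means(abilities, assign, rng):
    """Average score of those who studied minus average score of those who did not."""
    studied_scores, other_scores = [], []
    for ability in abilities:
        studied = assign(ability)
        target = studied_scores if studied else other_scores
        target.append(score(ability, studied, rng))
    return mean(studied_scores) - mean(other_scores)


def run(seed=0, n=20000):
    """Return (self-selected estimate, randomized estimate) of the effect."""
    rng = random.Random(seed)
    abilities = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Self-selection: abler students are more likely to study (confounding).
    self_selected = difference_in_means(
        abilities, lambda a: a + rng.gauss(0.0, 1.0) > 0.0, rng)
    # Randomization: a coin flip decides, so ability cannot influence assignment.
    randomized = difference_in_means(
        abilities, lambda a: rng.random() < 0.5, rng)
    return self_selected, randomized


self_selected, randomized = run()
print(f"self-selected estimate: {self_selected:.1f}")
print(f"randomized estimate:    {randomized:.1f}")
```

In this simulation the self-selected comparison overstates the effect, because it mixes the effect of studying with the effect of ability, while the randomized comparison comes out close to the assumed true effect of 10 points.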

An entertaining demonstration of this fallacy once appeared in an episode of The Simpsons (Season 7, "Much Apu About Nothing"). The city had just spent millions of dollars creating a highly sophisticated "Bear Patrol" in response to the sighting of a single bear the week before.

Homer: Not a bear in sight. The "Bear Patrol" is working like a charm!
Lisa: That's specious reasoning, Dad.
Homer: [uncomprehendingly] Thanks, honey.
Lisa: By your logic, I could claim that this rock keeps tigers away.
Homer: Hmm. How does it work?
Lisa: It doesn't work. (pause) It's just a stupid rock!
Homer: Uh-huh.
Lisa: But I don't see any tigers around, do you?
Homer: (pause) Lisa, I want to buy your rock.

Another example is the witch-hunting scene from Monty Python and the Holy Grail:

Sir Bedevere: Tell me, what do you do with witches?
Mr. Newt: Burn them!
Sir Bedevere: And what do you burn apart from witches?
Peasant #1: More witches! [Peasant gets slapped]
Peasant #2: Wood!
Sir Bedevere: So, why do witches burn?
Peasant #3: .......... 'Cause they're made of... wood?
Sir Bedevere: Good! So how do we tell whether she is made of wood?
Peasant #1: Build a bridge out of her!
Sir Bedevere: Ahh, but can you not also make bridges out of stone?
Peasant #1: Oh ya.
Sir Bedevere: Tell me, does wood sink in water?
Peasant #1: No, no, it floats. Throw her into the pond!
Sir Bedevere: No, no. What also floats in water?
Peasants yell various answers: (Bread!) (Apples!) (Very small rocks!) (Cider!) (Gravy!) (Cherries!) (Mud!) (Churches!) (Lead! Lead!)
King Arthur: A duck!
Sir Bedevere: Exactly! So, logically.....
Peasant: If she weighs the same as a duck, she's made of wood.
Sir Bedevere: And therefore?
Peasant: A Witch!

See also