# Causal decision theory

Causal decision theory is a mathematical theory intended to determine the set of rational choices in a given situation. Informally, it holds that the rational choice is the one with the best expected causal consequences. The theory is often contrasted with evidential decision theory, which recommends the action that, conditional on being chosen, provides the best evidence that desirable outcomes will occur.

## Informal description

Causal decision theory advises decision makers to choose the option with the best expected causal consequences. The basic idea is simple: if eating an apple will cause you to be happy and eating an orange will cause you to be sad, then it is rational to eat the apple. The complication lies in the notion of expected causal consequences. Suppose that eating a good apple will make you happy and eating a bad apple will make you sad, but you are not sure whether the apple is good or bad. You then do not know the causal effects of eating the apple, so you work from its expected causal effects, which depend on three things: (1) how likely you think the apple is to be good and how likely you think it is to be bad; (2) how happy eating a good apple makes you; and (3) how sad eating a bad apple makes you. The theory recommends the decision whose expected causal effects, weighted in this way, are best.
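The three ingredients of the apple example can be combined in a few lines of Python. This is only a minimal sketch: the probability, the happiness and sadness scores, and the value of not eating are hypothetical numbers chosen for illustration, and the function name `expected_causal_effect` is not from any source.

```python
def expected_causal_effect(p_good, u_good, u_bad):
    """Weight the effect of each kind of apple by how likely it is."""
    return p_good * u_good + (1.0 - p_good) * u_bad


# Hypothetical inputs (assumptions, not taken from the article):
p_good = 0.75      # (1) credence that the apple is good
u_good = 4.0       # (2) how happy a good apple makes you
u_bad = -8.0       # (3) how sad a bad apple makes you
u_abstain = 0.0    # value of not eating anything

eu_eat = expected_causal_effect(p_good, u_good, u_bad)
best = "eat" if eu_eat > u_abstain else "abstain"
print(eu_eat, best)  # 1.0 eat
```

With these numbers, eating the apple is the better option even though a bad apple is twice as unpleasant as a good one is pleasant, because the good apple is three times as likely.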

## Formal description

In a 1981 article, Allan Gibbard and William Harper explained causal decision theory as maximization of the expected utility $U$ of an action $A$ "calculated from probabilities of counterfactuals":[1]

$U(A)=\sum\limits_{j} P(A > O_j) D(O_j),$

where $D(O_j)$ is the desirability of outcome $O_j$ and $P(A > O_j)$ is the counterfactual probability that, if $A$ were done, then $O_j$ would hold.
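The formula translates directly into code once the counterfactual probabilities $P(A > O_j)$ are given as inputs. The sketch below assumes they are supplied as a dictionary per action. The function names, the umbrella scenario, and all numbers are hypothetical, and the sketch says nothing about how such probabilities should be obtained.

```python
def causal_expected_utility(counterfactual_prob, desirability):
    """Return U(A) = sum_j P(A > O_j) * D(O_j) for one action A."""
    for outcome, p in counterfactual_prob.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"P(A > {outcome}) must lie in [0, 1], got {p}")
    return sum(p * desirability[outcome]
               for outcome, p in counterfactual_prob.items())


def rational_choices(actions, desirability):
    """Return (set of utility-maximizing actions, utility of every action)."""
    utilities = {name: causal_expected_utility(probs, desirability)
                 for name, probs in actions.items()}
    best = max(utilities.values())
    return {name for name, u in utilities.items() if u == best}, utilities


# Hypothetical inputs: D(O_j) and, for each action A, P(A > O_j).
desirability = {"dry": 2.0, "wet": -6.0}
actions = {
    "take_umbrella": {"dry": 1.0, "wet": 0.0},
    "leave_umbrella": {"dry": 0.5, "wet": 0.5},
}
choices, utilities = rational_choices(actions, desirability)
print(choices, utilities)
```

Returning a set rather than a single action reflects the fact that several actions may tie for the maximum, in which case all of them count as rational choices.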

## Difference from evidential decision theory

David Lewis proved[2] that the probability of a conditional $P(A > O_j)$ does not always equal the conditional probability $P(O_j | A)$.[3] If the two were always equal, causal decision theory would be equivalent to evidential decision theory, which uses conditional probabilities.

Gibbard and Harper showed that if we accept two axioms (one related to the controversial principle of the conditional excluded middle[4]), then the statistical independence of $A$ and $A > O_j$ suffices to guarantee that $P(A > O_j) = P(O_j | A)$. However, there are cases in which actions and conditionals are not independent. Gibbard and Harper give an example in which King David wants Bathsheba but fears that summoning her would provoke a revolt.

> Further, David has studied works on psychology and political science which teach him the following: Kings have two personality types, charismatic and uncharismatic. A king's degree of charisma depends on his genetic make-up and early childhood experiences, and cannot be changed in adulthood. Now, charismatic kings tend to act justly and uncharismatic kings unjustly. Successful revolts against charismatic kings are rare, whereas successful revolts against uncharismatic kings are frequent. Unjust acts themselves, though, do not cause successful revolts; the reason uncharismatic kings are prone to successful revolts is that they have a sneaky, ignoble bearing. David does not know whether or not he is charismatic; he does know that it is unjust to send for another man's wife. (p. 164)

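In this case, evidential decision theory recommends that David refrain from sending for Bathsheba, because sending for her would be evidence that he is uncharismatic and therefore likely to face a successful revolt. Causal decision theory, noting that David's charisma cannot be changed and that the unjust act does not itself cause a revolt, recommends sending for her.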
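The divergence in the King David example can be reproduced numerically. The sketch below is an illustration under stated assumptions, not Gibbard and Harper's own calculation: every probability and desirability is a hypothetical number. The counterfactual probability of revolt is computed by weighting each personality type by its prior, on the assumption that the act has no causal influence on the type.

```python
# Hypothetical numbers (assumptions, not from Gibbard and Harper).
PRIOR = {"charismatic": 0.5, "uncharismatic": 0.5}
P_SEND_GIVEN = {"charismatic": 0.2, "uncharismatic": 0.8}    # unjust act
P_REVOLT_GIVEN = {"charismatic": 0.1, "uncharismatic": 0.9}  # act-independent
GAIN_BATHSHEBA = 1.0
LOSS_REVOLT = -10.0
ACTS = ("send", "abstain")


def p_act_given_type(act, king_type):
    p_send = P_SEND_GIVEN[king_type]
    return p_send if act == "send" else 1.0 - p_send


def posterior_type(act):
    """P(type | act) by Bayes' rule: the act is evidence about the type."""
    joint = {t: PRIOR[t] * p_act_given_type(act, t) for t in PRIOR}
    total = sum(joint.values())
    return {t: j / total for t, j in joint.items()}


def utility(act, type_dist):
    """Expected desirability of act when types are distributed as type_dist."""
    p_revolt = sum(type_dist[t] * P_REVOLT_GIVEN[t] for t in type_dist)
    gain = GAIN_BATHSHEBA if act == "send" else 0.0
    return gain + p_revolt * LOSS_REVOLT


def evidential_value(act):
    # Uses conditional probabilities P(O | A).
    return utility(act, posterior_type(act))


def causal_value(act):
    # Uses P(A > O): the act cannot change the type, so the prior is kept.
    return utility(act, PRIOR)


edt_choice = max(ACTS, key=evidential_value)
cdt_choice = max(ACTS, key=causal_value)
print("EDT:", edt_choice, "CDT:", cdt_choice)
```

Under these numbers the two theories disagree only because `evidential_value` updates the distribution over personality types on the act, while `causal_value` holds it fixed. If the act carried no information about the type, the two functions would return the same values.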

## Criticism

### Counterexamples

Newcomb's paradox is a classic example illustrating the potential conflict between causal and evidential decision theory: because your choice of one or two boxes cannot causally affect the Predictor's guess, causal decision theory recommends the two-boxing strategy.[1] However, when the Predictor guesses correctly, this results in getting only \$1,000, not \$1,000,000. Similar concerns arise in problems like the prisoner's dilemma[5] and various other thought experiments.[6]
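The two-boxing recommendation rests on a dominance argument that is easy to check numerically. The sketch below assumes the standard payoffs (\$1,000 in the transparent box, \$1,000,000 in the opaque box if one-boxing was predicted) and a hypothetical Predictor accuracy of 0.99. The function names are illustrative only.

```python
SMALL = 1_000       # always in the transparent box
BIG = 1_000_000     # in the opaque box only if one-boxing was predicted


def expected_payoff(act, p_full):
    """Expected payoff when the opaque box is full with probability p_full."""
    bonus = SMALL if act == "two-box" else 0
    return p_full * BIG + bonus


def causal_advantage_of_two_boxing(p_full):
    # The act cannot change the box contents, so the same p_full
    # is used for both acts.
    return expected_payoff("two-box", p_full) - expected_payoff("one-box", p_full)


def evidential_value(act, accuracy):
    # The act is evidence about the prediction, so p_full depends on the act.
    p_full = accuracy if act == "one-box" else 1.0 - accuracy
    return expected_payoff(act, p_full)


ACCURACY = 0.99  # hypothetical Predictor accuracy (an assumption)
for p_full in (0.0, 0.25, 0.5, 1.0):
    print(p_full, causal_advantage_of_two_boxing(p_full))
print(evidential_value("one-box", ACCURACY), evidential_value("two-box", ACCURACY))
```

Whatever probability is assigned to the opaque box being full, two-boxing comes out \$1,000 ahead on the causal calculation, while the evidential calculation favors one-boxing whenever the Predictor is sufficiently accurate.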

### Probabilities of conditionals

As Michael John Shaffer points out,[4] there are difficulties with assigning probabilities to counterfactuals. One proposal is the "imaging" technique suggested by Lewis:[7] To evaluate $P(A > O_j)$, move probability mass from each possible world $w$ to the closest possible world $w_A$ in which $A$ holds, assuming $A$ is possible. However, this procedure requires that we know what we would believe if we were certain of $A$; this is itself a conditional to which we might assign probability less than 1, leading to regress.[4]
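The imaging procedure itself is mechanical once a closest-world assignment is available. The sketch below assumes a finite set of worlds and a unique closest $A$-world for every world. The four worlds, their probabilities, and the `closest` mapping are hypothetical. Supplying `closest` by hand is exactly the step the regress objection targets, so the code illustrates the bookkeeping rather than resolving the difficulty.

```python
def image(prior, a_worlds, closest):
    """Image `prior` on A: move each world's mass to its closest A-world."""
    if not a_worlds:
        raise ValueError("A must be possible (hold in at least one world)")
    imaged = {w: 0.0 for w in a_worlds}
    for w, p in prior.items():
        target = w if w in a_worlds else closest[w]  # A-worlds keep their mass
        if target not in a_worlds:
            raise ValueError(f"closest world for {w} is not an A-world")
        imaged[target] += p
    return imaged


def prob(dist, event):
    """Probability of an event (a set of worlds) under dist."""
    return sum(p for w, p in dist.items() if w in event)


# Hypothetical model: four worlds, A holds in w1 and w3, O in w1 and w2.
prior = {"w1": 0.125, "w2": 0.375, "w3": 0.375, "w4": 0.125}
A = {"w1", "w3"}
O = {"w1", "w2"}
closest = {"w2": "w1", "w4": "w3"}  # assumed similarity ordering

p_counterfactual = prob(image(prior, A, closest), O)  # P(A > O)
p_conditional = prob(prior, A & O) / prob(prior, A)   # P(O | A)
print(p_counterfactual, p_conditional)  # 0.5 0.25
```

In this toy model the imaged probability of $O$ is 0.5 while the conditional probability is 0.25, a small instance of the two quantities coming apart.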