# Evidential decision theory

Evidential decision theory (EDT) is a school of thought within decision theory which states that a rational agent confronted with a set of possible actions should select the action with the highest news value, that is, the action that would indicate the best outcome in expectation if the agent received the "news" that it had been taken. In other words, it recommends that one "do what you most want to learn that you will do."[1]: 7

EDT contrasts with causal decision theory (CDT), which prescribes taking the action that will causally produce the best outcome. While these two theories agree in many cases, they give different verdicts in certain philosophical thought experiments. For example, EDT prescribes taking only one box in Newcomb's paradox, while CDT recommends taking both boxes.[1]: 22–26

## Formal description

In a 1976 paper, Allan Gibbard and William Harper distinguished between two kinds of expected utility maximization. EDT proposes to maximize the expected utility of actions computed using conditional probabilities, namely

${\displaystyle V(A)=\sum \limits _{j}P(O_{j}|A)D(O_{j}),}$

where ${\displaystyle D(O_{j})}$ is the desirability of outcome ${\displaystyle O_{j}}$ and ${\displaystyle P(O_{j}|A)}$ is the conditional probability of ${\displaystyle O_{j}}$ given that action ${\displaystyle A}$ occurs.[2] This is in contrast to the counterfactual formulation of expected utility used by causal decision theory

${\displaystyle U(A)=\sum \limits _{j}P(A\mathrel {\Box {\rightarrow }} O_{j})D(O_{j}),}$

where the expression ${\displaystyle P(A\mathrel {\Box {\rightarrow }} O_{j})}$ indicates the probability of outcome ${\displaystyle O_{j}}$ in the counterfactual situation in which action ${\displaystyle A}$ is performed. Since ${\displaystyle P(A\mathrel {\Box {\rightarrow }} O_{j})}$ and ${\displaystyle P(O_{j}|A)}$ are not always equal, these formulations of expected utility are not equivalent,[2] leading to differences in actions prescribed by EDT and CDT.
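The two formulas above have the same shape and differ only in which probability distribution over outcomes is plugged in. The following minimal Python sketch illustrates this. The function name `expected_value`, the outcome labels, and all numbers are hypothetical choices made for illustration; they are not taken from Gibbard and Harper's paper.

```python
from typing import Mapping


def expected_value(prob: Mapping[str, float], desirability: Mapping[str, float]) -> float:
    """Return sum_j prob[O_j] * D(O_j) over all outcomes O_j."""
    if abs(sum(prob.values()) - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * desirability[outcome] for outcome, p in prob.items())


# Hypothetical desirabilities D(O_j) for two outcomes.
desirability = {"good": 10.0, "bad": 0.0}

# Hypothetical P(O_j | A): the act A is strong evidence for the good outcome.
p_conditional = {"good": 0.9, "bad": 0.1}

# Hypothetical P(A []-> O_j): performing A does not itself make the good
# outcome any more likely.
p_counterfactual = {"good": 0.5, "bad": 0.5}

v_a = expected_value(p_conditional, desirability)     # evidential V(A)
u_a = expected_value(p_counterfactual, desirability)  # causal U(A)

print(f"V(A) = {v_a:.2f}, U(A) = {u_a:.2f}")
```

With these assumed numbers the two quantities come apart (roughly 9 versus 5), which is the kind of gap that lets EDT and CDT rank actions differently.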

## Thought experiments

Decision theories are often compared by the actions they recommend in thought experiments.

### Newcomb's paradox

In Newcomb's paradox, there is a predictor, a player, and two boxes designated A and B. The predictor can reliably predict the player's choices, say with 99% accuracy. The player is given a choice between taking only box B, or taking both boxes A and B. The player knows the following:[3]

- Box A is transparent and always contains a visible $1,000.
- Box B is opaque, and its content has already been set by the predictor:
  - If the predictor has predicted the player will take both boxes A and B, then box B contains nothing.
  - If the predictor has predicted that the player will take only box B, then box B contains $1,000,000.

The player does not know what the predictor predicted or what box B contains while making the choice. Should the player take both boxes, or only box B?

Evidential decision theory recommends taking only box B in this scenario, because taking only box B is strong evidence that the predictor anticipated that the player would take only box B, and therefore it is very likely that box B contains $1,000,000. Conversely, choosing to take both boxes is strong evidence that the predictor knew that the player would take both boxes; therefore we should expect that box B contains nothing.[1]: 22

In contrast, causal decision theory (CDT) recommends that the player take both boxes, because by the time of the choice the predictor has already made a prediction, so the player's action cannot affect the contents of box B.

Formally, the expected utilities in EDT are

${\displaystyle {\begin{aligned}V({\text{take only B}})&=P({\text{1M in box B}}|{\text{take only B}})\times \$1,000,000+P({\text{nothing in box B}}|{\text{take only B}})\times \$0\\&=0.99\times \$1,000,000+0.01\times \$0=\$990,000\\V({\text{take both boxes}})&=P({\text{1M in box B}}|{\text{take both boxes}})\times \$1,001,000+P({\text{nothing in box B}}|{\text{take both boxes}})\times \$1,000\\&=0.01\times \$1,001,000+0.99\times \$1,000=\$11,000\end{aligned}}}$

Since ${\displaystyle V({\text{take only B}})>V({\text{take both boxes}})}$, EDT recommends taking only box B.

### Twin prisoner's dilemma

In this variation on the prisoner's dilemma thought experiment, an agent must choose whether to cooperate or defect against her psychological twin, whose reasoning processes are exactly analogous to her own.

Aomame and her psychological twin are put in separate rooms and cannot communicate. If they both cooperate, they each get $5. If they both defect, they each get $1. If one cooperates and the other defects, then the defector gets $10 and the cooperator gets $0. Assuming Aomame only cares about her individual payout, what should she do?[4]

Evidential decision theory recommends cooperating in this situation, because Aomame's decision to cooperate is strong evidence that her psychological twin will also cooperate, meaning that her expected payoff is $5. On the other hand, if Aomame defects, this would be strong evidence that her twin will also defect, resulting in an expected payoff of $1. Formally, the expected utilities are

${\displaystyle {\begin{aligned}V({\text{Aomame cooperates}})&=P({\text{twin cooperates}}|{\text{Aomame cooperates}})\times \$5+P({\text{twin defects}}|{\text{Aomame cooperates}})\times \$0\\&=1\times \$5+0\times \$0=\$5\\V({\text{Aomame defects}})&=P({\text{twin cooperates}}|{\text{Aomame defects}})\times \$10+P({\text{twin defects}}|{\text{Aomame defects}})\times \$1\\&=0\times \$10+1\times \$1=\$1.\end{aligned}}}$

Since ${\displaystyle V({\text{Aomame cooperates}})>V({\text{Aomame defects}})}$, EDT recommends cooperating.
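The evidential calculations for both thought experiments can be reproduced with a short Python sketch. The payoffs and the 99% accuracy come from the text; the data layout and the names `expected_value`, `edt_values`, and `cdt_newcomb_values` are illustrative assumptions. The CDT helper additionally assumes that the probability `q_full` of box B being full is the same whichever action is taken, which is one simple way to model the claim that the player's action cannot affect the box's contents.

```python
def expected_value(cond_prob, payoff):
    """V(A) = sum_j P(O_j | A) * D(O_j) for a single action A."""
    return sum(cond_prob[outcome] * payoff[outcome] for outcome in cond_prob)


def edt_values(problem):
    """Map each action to its evidential expected utility."""
    return {action: expected_value(p, d) for action, (p, d) in problem.items()}


ACCURACY = 0.99  # predictor accuracy used in the Newcomb example

# Each action maps to (P(outcome | action), payoff in dollars per outcome).
newcomb = {
    "take only B": ({"full": ACCURACY, "empty": 1 - ACCURACY},
                    {"full": 1_000_000, "empty": 0}),
    "take both boxes": ({"full": 1 - ACCURACY, "empty": ACCURACY},
                        {"full": 1_001_000, "empty": 1_000}),
}

twin_dilemma = {
    "cooperate": ({"twin cooperates": 1.0, "twin defects": 0.0},
                  {"twin cooperates": 5, "twin defects": 0}),
    "defect": ({"twin cooperates": 0.0, "twin defects": 1.0},
               {"twin cooperates": 10, "twin defects": 1}),
}

newcomb_v = edt_values(newcomb)
twin_v = edt_values(twin_dilemma)
newcomb_choice = max(newcomb_v, key=newcomb_v.get)
twin_choice = max(twin_v, key=twin_v.get)


def cdt_newcomb_values(q_full):
    """Causal values in Newcomb's paradox, ASSUMING box B is full with the
    same probability q_full regardless of which action is taken."""
    return {
        "take only B": q_full * 1_000_000,
        "take both boxes": q_full * 1_001_000 + (1 - q_full) * 1_000,
    }


print("EDT, Newcomb:", newcomb_choice, newcomb_v)
print("EDT, twin dilemma:", twin_choice, twin_v)
print("CDT, Newcomb (q_full = 0.5):", cdt_newcomb_values(0.5))
```

Under the act-independence assumption, taking both boxes comes out exactly $1,000 ahead for every value of `q_full`, which matches the two-boxing verdict attributed to CDT above.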

## Other supporting arguments

Even if one places relatively little credence in evidential decision theory, it may be reasonable to act as if EDT were true. Because EDT can take into account the actions of many correlated decision-makers, the stakes under EDT may be higher than under causal decision theory, and so EDT's recommendations may take priority.[5]

## Criticism

David Lewis has characterized evidential decision theory as promoting "an irrational policy of managing the news".[6] James M. Joyce asserted, "Rational agents choose acts on the basis of their causal efficacy, not their auspiciousness; they act to bring about good results even when doing so might betoken bad news."[7]