# User:Tnorsen/Sandbox/Bell's Theorem

Bell's Theorem is a mathematical theorem first demonstrated by J.S. Bell in his 1964 paper "On the Einstein-Podolsky-Rosen paradox". The theorem establishes that any physical theory respecting a precisely-formulated locality (or "local causality") condition will make predictions, for a certain class of experiments, that are constrained by a so-called Bell Inequality. Since the particular theory called Quantum Mechanics makes predictions which violate this constraint, one can also think of Bell's Theorem as a proof that there is an inconsistency between (i) local causality and (ii) a certain set of QM's empirical predictions.

Actual experiments, most famously that of Alain Aspect in 1982, strongly suggest that the relevant empirical predictions of QM are correct, and so seem to establish that the locality condition, from which the Inequality is derived, is false. But since the locality condition is strongly motivated by Special Relativity, Bell's Theorem and the associated experiments suggest, as Bell himself put it, an "apparently essential conflict between any sharp formulation [of quantum theory] and fundamental relativity. That is to say, we have an apparent incompatibility, at the deepest level, between the two fundamental pillars of contemporary theory..." [1]

Bell's own interpretation of his theorem, however, is not widely accepted among physicists in general. Some physicists who have studied Bell's Theorem carefully point to alleged flaws or hidden assumptions in Bell's formulation of local causality and/or his derivation of (what Bell called) the Locality Inequality therefrom; such claims are controversial and will be addressed below. But most physicists fail to agree with Bell's statement above not because they think there is some flaw in the reasoning leading to it, but rather because what they have learned about Bell's Theorem (from textbooks and other sources) radically distorts the subject.

## Non-Technical Overview

There are many variants of Bell's Theorem, each one stating a slightly different constraint implied by the assumption of locality, or bringing out an inconsistency between locality and a slightly different sub-class of quantum mechanical predictions. To give the flavor of Bell's Theorem, we begin here by presenting a minimally technical variant of the argument, in terms of an analogy to a certain game played by human characters Mary, Alice, John, and Bob.

In the game, Alice and Bob are stationed in two separate (and, let us say, perfectly shielded) rooms. Mary and John are experimental subjects who, after conferring with each other to agree on a strategy, are sent (respectively) to the two rooms where they are each asked a yes/no question that is randomly selected (after the doors have been shut tightly) from a pre-arranged list of three possible questions. Thus, crucially, Mary has no way of knowing what question John is being asked, and vice versa. Alice and Bob record which questions they asked and what answers were given, and the whole process is then repeated many times. At the end of the experiment, Alice and Bob can get together, compare notes, and examine the statistical correlations between Mary's and John's answers.

There are also some "rules of the game" which express statistical requirements on how Mary's and John's answers should correlate in order for them to win the game. Those rules are the following:

(a) whenever Alice and Bob happen to ask the same question, Mary and John should answer the same way (either both "yes" or both "no")

(b) whenever Alice and Bob ask a particular pair of different questions, Mary and John should answer the same way (either both "yes" or both "no") about 25% of the time

The theorem, then, is that Mary and John cannot win the game, no matter what strategy they employ. That is, there is a contradiction between the setup (in particular the idea that Mary is ignorant of which question is asked of John, and vice versa) and the requirements expressed in Rules (a) and (b). Here is the proof.

First, observe that, in order to ensure that their answers respect Rule (a), the strategy adopted by Mary and John must be such that in each run of the procedure they agree, during the time when they can communicate, on answers for all three possible questions -- say, yes for question #1, no for question #2, and yes for question #3. (Let us denote this set of answers by YNY.) Otherwise it will (with extremely high probability!) eventually happen that Alice and Bob both ask precisely the question for which Mary and John have agreed to disagree, in which case Rule (a) will be violated. The fact that Mary and John have no information about what questions are going to be asked during the time they are allowed to communicate is crucial here: for instance, they cannot choose to agree on an answer for question #2 only for those runs when Alice and Bob are actually going to ask them question number 2, since such information is not available during the time they can communicate.

Thus, in order to comply with rule (a), Mary and John must, in each run, agree on a triple of answers: YYY, YYN, YNY, YNN, NYY, NYN, NNY, or NNN.

But now, compliance with Rule (b) will require that, in 25% of the runs, Mary and John choose a triple in which the first letter is equal to the second (i.e., one of YYY, YYN, NNY, and NNN). It might seem like all that is required is that Mary and John choose one of these four triples in 25% of those runs in which Alice and Bob ask questions #1 and #2 -- or questions #2 and #1 -- respectively. But again here it is crucial to remember that Mary and John don't know, when they are agreeing on their strategy, which questions they will be asked. So to ensure that their answers give the right statistics when Alice and Bob do ask respectively questions #1 and #2 (or vice versa), they have to pick one of these four triples in 25% of all the runs.

Similarly, Mary and John must, in 25% of all the runs, choose a triple such that the first letter equals the third (YYY, YNY, NNN, or NYN); and they must also, in 25% of all the runs, choose a triple such that the second letter equals the third (YYY, NYY, NNN, or YNN).

So, in 25% of the runs, Mary and John's triple must lie in the first group; in 25% of the runs, it must lie in the second group; and in 25% of the runs, it must lie in the third group. Since some triples (such as YYY) belong to more than one group, the fraction of runs in which their triple lies in at least one of the three groups can be at most 25% + 25% + 25% = 75%. That is, in at least 25% of the runs, they must select a triple that lies in none of the three groups.

The difficulty is that all of the triples are in at least one of the three groups. So there simply aren't any available triples left for the "at least 25% of the time" when they must select a triple that isn't in any of the three groups. So they can't possibly comply with rules (a) and (b), i.e., they can't win the game.
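The counting argument above can be verified mechanically. Here is a minimal sketch (an illustration added here, not part of Bell's own presentation) that enumerates all eight possible answer triples and checks that every one of them satisfies at least one of the three pairwise-agreement conditions, so that the three agreement probabilities must sum to at least 1 and therefore cannot each equal 25%:

```python
from itertools import product

# All eight possible answer triples for questions #1, #2, #3.
triples = list(product("YN", repeat=3))

# For each triple, count how many of the three pairwise-agreement
# events (1st=2nd, 1st=3rd, 2nd=3rd) it satisfies.
agreements = {t: (t[0] == t[1]) + (t[0] == t[2]) + (t[1] == t[2])
              for t in triples}

# Every triple satisfies at least one agreement event, so for ANY
# strategy (i.e., any probability distribution over triples) the
# three agreement probabilities sum to at least 1.  Rule (b) demands
# that each be 1/4, for a total of 3/4 < 1 -- a contradiction.
assert len(triples) == 8
assert min(agreements.values()) >= 1
```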

If an experiment like this were actually performed, therefore, you'd know in advance that the actual statistics would have to violate at least one of the rules. Or, conversely, if the statistics respected both of the rules, you would know that the setup hadn't been implemented as described above -- in particular, you would know that Mary and John had been somehow secretly communicating after going into their respective rooms and finding out which questions they were asked.

This whole scenario is of course an analogy to a real physics experiment, so let us now briefly indicate what kind of experiment that is and how the analogy is supposed to work. The experiments involve a central source that emits pairs of particles (usually photons) in opposite directions; the particles are like Mary and John in the game. There are detectors on either side which measure one of several possible properties of the particles (usually the polarization along one of several axes); these are like Alice and Bob in the game. The decision of which property to measure on a given particle is made locally and randomly, after the particles are already in flight; it is then impossible, with influences propagating at or below the speed of light, for the decision on one side (about which property to measure) to affect the outcome of the measurement on the other, distant particle; this is analogous to the closed doors and shielding in the game, which prevents Mary from knowing which question John is being asked and vice versa. Finally, the "rules of the game" correspond to the statistical predictions of quantum mechanics for the physics experiment described here.

We now turn to a more systematic presentation of Bell's Theorem, starting with the historical context of its discovery in 1964.

## History

Bell's interest in local causality and its relation to the various formulations of quantum theory was triggered by two things: his dissatisfaction with the "unprofessionally vague and ambiguous" [2] character of orthodox quantum mechanics, and his learning about the alternative de Broglie - Bohm "pilot-wave" formulation of quantum theory (aka "Bohmian Mechanics"). Bell wrote: "Bohm's 1952 papers on quantum mechanics were for me a revelation. The elimination of indeterminism was very striking. But more important, it seemed to me, was the elimination of any need for a vague division of the world into 'system' on the one hand, and 'apparatus' or 'observer' on the other." [3]

In particular, learning about Bohm's theory helped Bell realize that the various "no hidden variables" theorems (of von Neumann and others), which had been taken almost universally by physicists as conclusively establishing something like the Copenhagen formulation of quantum theory, were all bogus. Bohm's theory was a clean counter-example, i.e., a proof-by-example that the theorems didn't rule out what they had been taken to rule out.

This led Bell to carefully scrutinize those theorems. The result of this work was his paper "On the problem of hidden variables in quantum mechanics". [4] This paper was written prior to the 1964 paper in which Bell's Theorem was first presented, but (due to an editorial accident) remained unpublished until 1966. The 1966 paper shows that the "no hidden variables" theorems of von Neumann and others all made an unwarranted and unacknowledged assumption, which today is usually called "non-contextuality". The existence of Bohm's theory (in light of the alleged "impossibility proofs") could thus be understood on the grounds that it violated this assumption, i.e., manifested the relevant sort of "contextuality".

Despite its virtues, however, Bohm's theory did raise one important question. Bell described this as follows:

"Even now the de Broglie - Bohm picture is generally ignored, and not taught to students. I think this is a great loss. For that picture exercises the mind in a very salutary way. The de Broglie - Bohm picture disposes of the necessity to divide the world somehow into system and apparatus. But another problem is brought into focus. This picture ... has a very surprising feature: the consequences of events at one place propagate to other places faster than light. This happens in a way that we cannot use for signaling. Nevertheless it is a gross violation of relativistic causality." [5]

One can therefore understand why Bell's 1966 paper closes by wondering if it might be possible to construct a theory which shared the virtues of Bohm's theory, but which failed to display the surprising nonlocal causation:

"...it must be stressed that, to the present writer's knowledge, there is no proof that any hidden variable account of quantum mechanics must have this extraordinary character. It would therefore be interesting, perhaps, to pursue some further 'impossibility proofs', replacing the arbitrary axioms objected to above by some condition of locality, or of separability of distant systems." [6]

Because of the editorial accident mentioned above, Bell had answered his own question before the paper in which it appeared was even published. The answer is contained in Bell's Theorem, which is precisely a "proof that any ... account of quantum mechanics must have this extraordinary character", i.e., must violate a local causality constraint that is motivated by relativity.

## EPR

Bell's theorem is based on an experimental setup that had been proposed in 1935 by Einstein, Podolsky, and Rosen (EPR) and tweaked into its modern form by Bohm in 1951. [7] The setup involves a central "source" which emits pairs of specially-prepared particles in opposite directions, toward two spatially separated observers who, in recent years, have typically been called Alice and Bob. For historical and technical reasons, the central particle source is often described as emitting pairs of spin-1/2 particles in an entangled spin "singlet" state, though equivalent formulations (which correspond more closely to the setups of the actual experiments relevant to Bell's theorem) in terms of polarization-entangled photons are also possible.

Alice and Bob then use Stern-Gerlach devices (or, if the incident particles are photons, polarizers) to measure the spin- (or polarization-) components of their respective particles along some directions, which they may in principle separately and freely select (say, from among a pre-decided set of possible directions). The outcomes of the two measurements can then be recorded and compared.

Understanding this EPR-Bohm setup is necessary to understand Bell's theorem for two reasons. First, the theorem assumes and uses this setup. The second point is not as widely appreciated, but is crucial: Bell's reasoning from local causality to the celebrated inequality begins by recapitulating (what amounts to) the same argument that EPR presented in 1935. This widely-discussed argument was intended by the authors to prove the existence of "hidden variables" and hence to establish a negative answer to the question raised in the EPR paper's title: "Can Quantum-Mechanical Description of Physical Reality Be Considered Complete?"

What's really essential to the EPR paper, however, and this is something Einstein repeatedly stressed in his later discussions of the issue, is not the conclusion per se, but the argument for it. [8] The argument begins by noting that, when Alice and Bob orient their Stern-Gerlach devices along the same direction, the two outcomes will be perfectly correlated -- either Alice's particle will be "spin up" along that direction and Bob's will be "spin down", or vice versa. This perfect correlation is a prediction of quantum mechanics and is also a well-confirmed experimental fact.

EPR, however, objected to the orthodox quantum mechanical account of these perfect correlations, since it seemed to involve a kind of nonlocality associated with the collapse of the wave function. In particular, according to orthodox quantum mechanics, neither particle has a definite value ("up" or "down") for its spin-component along this (or any) direction while on its way to the measuring device. When one of the particles arrives at the measuring device, however, the wave function for the whole system collapses, producing a random but definite outcome for that measurement -- and simultaneously putting the distant, as-yet-unmeasured particle in a new state with the appropriate, correlated spin-component along that same axis. That is, in the orthodox quantum mechanical account, the measurement on one particle instantaneously influences not only the particle being measured, but the distant one as well.

EPR thus considered that orthodox quantum mechanics was a nonlocal theory. They suggested, however, that a local account for the perfect correlations could be given if one regarded the quantum mechanical description of the particle pair (prior to any measurement) as being incomplete -- in particular, if one posited the existence of "hidden variables" (i.e., facts about the state of the particles which supplement the exclusively wave-function description of orthodox quantum mechanics) which determine, in advance, the results of any and all possible spin-component measurements on either particle.

Here is one of Bell's memorable re-tellings of the EPR argument:

"For [EPR] these correlations simply showed that the quantum theorists had been hasty in dismissing the reality of the microscopic world. In particular [they] had been wrong in supposing that nothing was real or fixed in that world before observation. For after observing only one particle the result of subsequently observing the other (possibly at a very remote place) is immediately predictable. Could it be that the first observation somehow fixes what was unfixed, or makes real what was unreal, not only for the near particle but also for the remote one? For EPR that would be an unthinkable 'spooky action at a distance'. To avoid such action at a distance they have to attribute, to the space-time regions in question, real properties in advance of observation, correlated properties, which predetermine the outcomes of these particular observations. Since these real properties, fixed in advance of observation, are not contained in quantum formalism, that formalism for EPR is incomplete. It may be correct, as far as it goes, but the usual quantum formalism cannot be the whole story.

"It is important to note that to the limited degree to which determinism plays a role in the EPR argument, it is not assumed but inferred. What is held sacred is the principle of 'local causality' -- or 'no action at a distance'. Of course, mere correlation between distant events does not by itself imply action at a distance, but only correlation between the signals reaching the two places. These signals, in the idealized example of Bohm, must be sufficient to determine whether the particles go up or down. For any residual undeterminism could only spoil the perfect correlation.

"It is remarkably difficult to get this point across, that determinism is not a presupposition of the analysis. There is a widespread and erroneous conviction that for Einstein determinism was always the sacred principle. The quotability of his famous 'God does not play dice' has not helped in this respect. .... Einstein had no difficulty accepting that affairs in different places could be correlated. What he could not accept was that an intervention at one place could influence, immediately, affairs at the other." [9]

To summarize, then, the EPR argument (which is also the first step in Bell's theorem) is an "argument from locality to deterministic hidden variables." [10] As claimed in the extended quote from Bell, the EPR paper has been widely misunderstood. Consequently, many physicists continue to believe, for example, that EPR did not make an argument, but simply expressed the authors' philosophical preference for determinism and/or hidden variables. Many others accept that there was some kind of argument in the EPR paper, but believe that the argument was refuted in the response written by Niels Bohr. [11] Here it will suffice to pass over these debates [12] and simply register that Bell regarded the EPR argument not only as entirely valid, but as the crucial first step in the reasoning behind his theorem.

Note also that a somewhat anachronistic, but mathematically rigorous, derivation of EPR's deterministic hidden variables from Bell's local causality condition is possible. [13] In particular, from Bell's local causality condition and the empirically-grounded assumption of perfect correlations (when Alice and Bob happen to measure along the same axis) we may infer the existence of functions ${\displaystyle A({\hat {a}},\lambda )}$ and ${\displaystyle B({\hat {b}},\lambda )}$ where ${\displaystyle A,B\in \{+1,-1\}}$ are the outcomes of Alice's and Bob's measurements along directions ${\displaystyle {\hat {a}}}$ and ${\displaystyle {\hat {b}}}$, respectively, when the particle pair is in state ${\displaystyle \lambda }$. The variable ${\displaystyle \lambda }$ is here regarded as the complete physical description of the pre-measurement state of the particle pair. It may include the quantum mechanical wave function, but must also include supplementary ("hidden") variables sufficient to determine the outcomes and hence make the functions ${\displaystyle A({\hat {a}},\lambda )}$ and ${\displaystyle B({\hat {b}},\lambda )}$ well-defined. (The "must" in the previous sentence is not an assumption, but the conclusion of the rigorously re-formulated EPR argument.)
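A toy model makes the existence of such outcome functions concrete. (This sketch is purely illustrative; the axis set and the specific assignment below are our own, not EPR's or Bell's.) The hidden state ${\displaystyle \lambda }$ predetermines an outcome for every possible axis, and Bob's function is the negative of Alice's, reproducing the perfect (anti-)correlations whenever the two sides measure along the same axis. Bell's theorem is precisely the statement that no model of this general type can also reproduce the quantum predictions for all pairs of different axes:

```python
import random

AXES = ["x", "y", "z"]  # an illustrative pre-decided set of directions

def make_lambda():
    # The hidden state predetermines an outcome (+1 or -1) per axis.
    return {axis: random.choice([+1, -1]) for axis in AXES}

def A(a, lam):
    # Alice's outcome depends only on her setting and the hidden state.
    return lam[a]

def B(b, lam):
    # Bob's outcome is the opposite predetermined value.
    return -lam[b]

# Same-axis measurements are perfectly anti-correlated for every
# hidden state, with no communication between the two wings.
for _ in range(1000):
    lam = make_lambda()
    for axis in AXES:
        assert A(axis, lam) * B(axis, lam) == -1
```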

## Bell's Inequality

There are several ways to derive a Bell-type inequality from the existence of the outcome functions ${\displaystyle A({\hat {a}},\lambda )}$ and ${\displaystyle B({\hat {b}},\lambda )}$. We present here the standard argument for the 2x2-setting (CHSH-type) experiment, in which Alice chooses between two directions ${\displaystyle {\hat {a}}}$ and ${\displaystyle {\hat {a}}'}$ and Bob between ${\displaystyle {\hat {b}}}$ and ${\displaystyle {\hat {b}}'}$.

Consider the following algebraic combination of outcome-predictions (all of which must be simultaneously well-defined for any locally causal theory which successfully predicts perfect correlations under the appropriate circumstances, as discussed in the EPR section):

${\displaystyle S=A({\hat {a}},\lambda )B({\hat {b}},\lambda )-A({\hat {a}},\lambda )B({\hat {b}}',\lambda )+A({\hat {a}}',\lambda )B({\hat {b}},\lambda )+A({\hat {a}}',\lambda )B({\hat {b}}',\lambda )}$

${\displaystyle \;\;=A({\hat {a}},\lambda )[B({\hat {b}},\lambda )-B({\hat {b}}',\lambda )]+A({\hat {a}}',\lambda )[B({\hat {b}},\lambda )+B({\hat {b}}',\lambda )]}$

Since ${\displaystyle B({\hat {b}},\lambda )}$ and ${\displaystyle B({\hat {b}}',\lambda )}$ each equal ${\displaystyle \pm 1}$, one of the two terms in square brackets is zero while the other is ${\displaystyle \pm 2}$; and since ${\displaystyle A}$ is also ${\displaystyle \pm 1}$,

${\displaystyle S=\pm 2}$

and so the absolute value of ${\displaystyle S}$, averaged over many runs (with some distribution ${\displaystyle \rho (\lambda )}$ of the hidden states), will be less than or equal to 2. Although ${\displaystyle S}$ itself cannot be measured for a given particle pair, one of its four terms can be; the ensemble average of ${\displaystyle S}$ can therefore be constructed by adding up the separately measured averages of the four terms. The corresponding sum of empirically measurable correlation coefficients is thus constrained by 2:

${\displaystyle |C(a,b)-C(a,b')+C(a',b)+C(a',b')|\leq 2}$

Note that this derivation assumes a "no conspiracy" hypothesis, namely that there is no correlation between the state ${\displaystyle \lambda }$ of the pairs and which of the four possible experiments happens to be performed on a given pair.
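The algebra behind the bound, and its quantum-mechanical violation, can both be checked in a few lines. The sketch below is illustrative: the angles are the standard choice that maximizes the violation, and ${\displaystyle C(a,b)=-\cos(a-b)}$ is the well-known quantum prediction for the singlet state (neither detail is stated in the text above). It verifies that ${\displaystyle S=\pm 2}$ for every assignment of predetermined outcomes, while the quantum correlations give ${\displaystyle |S|=2{\sqrt {2}}>2}$:

```python
import math
from itertools import product

# For predetermined outcomes A, A', B, B' in {-1,+1}, the CHSH
# combination S is always exactly +2 or -2, hence |<S>| <= 2 for
# any distribution over hidden states.
for A, Ap, B, Bp in product([-1, +1], repeat=4):
    S = A * B - A * Bp + Ap * B + Ap * Bp
    assert S in (-2, +2)

# Quantum mechanics predicts C(a,b) = -cos(a-b) for the singlet state.
C = lambda x, y: -math.cos(x - y)
a, ap, b, bp = 0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4
S_qm = C(a, b) - C(a, bp) + C(ap, b) + C(ap, bp)
assert abs(abs(S_qm) - 2 * math.sqrt(2)) < 1e-12  # |S| = 2*sqrt(2) > 2
```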

It is worth noting that Bell himself called the inequality "the locality inequality" rather than "Bell's inequality" -- terminology that usefully emphasizes that locality is the assumption from which the inequality is derived.

## Experiments

Bell's theorem shows that any locally causal theory must predict correlations (for the EPR-Bell experimental setup described above) which are consistent with the locality inequality (aka Bell's inequality). In order to know whether or not the class of locally causal theories is empirically viable, we must perform an actual experiment to measure the appropriate correlations and see whether, in fact, the inequality is respected or violated by the data. The first such experiment was performed by Freedman and Clauser in 1972, with several different versions occurring later in the decade. But every experiment prior to Gregor Weihs' landmark 1998 version suffered from an important drawback.

To understand this, we must recall how the assumption of local causality actually manifests in Bell's derivation. It is essentially this: we assume that the setting (i.e., the choice of directions along which to measure the spin-component of a given particle) on one side does not influence the outcome of the experiment performed on the other side. Specifically, we assume that Alice's outcome ${\displaystyle A}$ is independent of Bob's setting ${\displaystyle {\hat {b}}}$

${\displaystyle A({\hat {a}},{\hat {b}},\lambda )=A({\hat {a}},\lambda )}$

and that Bob's outcome ${\displaystyle B}$ is independent of Alice's setting ${\displaystyle {\hat {a}}}$

${\displaystyle B({\hat {a}},{\hat {b}},\lambda )=B({\hat {b}},\lambda ).}$

This assumption, however, is plausibly motivated by relativity's alleged prohibition on superluminal causation only if Alice's setting is "freely" made at spacelike separation from Bob's experiment, and vice versa. Thus, in order for an experimental violation of Bell's inequality to rigorously support the existence of superluminal causation, the final orientations of the measuring devices (or whatever exactly plays this role in the experiments) must be somehow randomly selected "at the last possible moment". This was pointed out already in Bell's 1964 paper:

"Conceivably [the quantum mechanical predictions] might apply only to experiments in which the settings of the instruments are made sufficiently in advance to allow them to reach some mutual rapport by exchange of signals with velocity less than or equal to that of light. In that connection, experiments ... in which the settings are changed during the flight of the particles, are crucial." [14]

Alain Aspect's 1982 experiment was the first to implement such "delayed choice". In Aspect's experiment, each particle was shunted to one of two possible (differently-oriented) measuring devices by a clever mechanism involving periodic standing density waves in water. However, not only was the alternating shunting periodic (still allowing in principle the kind of "mutual rapport" discussed by Bell), but the period ${\displaystyle T}$ was not significantly smaller than the distance between the two measurements divided by the speed of light, ${\displaystyle L/c}$.

The 1998 experiment of Weihs et al. greatly increased the spatial separation ${\displaystyle L}$ between the two wings of the experiment, and replaced Aspect's periodic shunting with independent quantum random number generators, one on each side, determining to which of two possible (differently oriented) measuring devices a given particle would be shunted. In the view of most physicists, this conclusively rules out the possibility that a locally causal (i.e., slower-than-light) mechanism could allow the setting on one side to causally influence the outcome on the other. (See, however, the notes about "superdeterminism" in a later section.)

## Bell's Theorem Without Inequalities

There are also variants of Bell's theorem which derive a direct contradiction between local causality and the quantum predictions without the use of inequalities, most famously the Greenberger-Horne-Zeilinger (GHZ) argument.

## Summary

Here is Bell's summary of the overall argument from local causality to the empirically tested (and refuted) inequality:

"The [EPR-Bohm] correlations are such that the result of the experiment on one side immediately foretells that on the other, whenever the analyzers happen to be parallel. If we do not accept the intervention on one side as a causal influence on the other, we seem obliged to admit that the results on both sides are determined in advance anyway, independently of the intervention on the other side, by signals from the source and by the local magnet setting. But this has implications for the non-parallel settings which conflict with those of quantum mechanics. So we cannot dismiss intervention on one side as a causal influence on the other. ... Einstein could no longer write so easily, speaking of local causality '... I still cannot find any fact anywhere which would make it appear likely that that requirement will have to be abandoned'." [15]

The type of experimental setup to which Bell's theorem applies consists of a source and two wings. The source is usually described as emitting a pair of specially prepared (*entangled*) particles; one particle goes to the wing where the experimenter Alice is stationed and the other to the wing where the experimenter Bob is stationed. (In the game metaphor, the particles were replaced by the human characters Mary and John.) Although it is usual to describe the source as emitting a pair of particles, it should be noted that Bell's argument makes no assumption about the existence of particles and indeed assumes no particular picture of what is going on at the microscopic level. (It is not even necessary to assume that the source really emits something -- although, of course, if the source emits nothing, it is obviously impossible to account locally for the correlation between the results obtained by Alice and Bob.) The experiment (and the argument) can be completely described in terms of (macroscopic) preparation procedures and the behavior of knobs and pointers on the apparatuses.

Each experimenter uses an apparatus containing a knob; we denote by ${\displaystyle a}$ (resp., ${\displaystyle b}$) the setting of Alice's (resp., Bob's) knob. After setting the knob, each experimenter observes an outcome (such as the position of a pointer); we denote by ${\displaystyle A}$ (resp., ${\displaystyle B}$) the outcome of Alice's (resp., Bob's) experiment. In the metaphor of the game, the settings ${\displaystyle a}$, ${\displaystyle b}$ correspond to the questions asked by Alice and Bob, and the outcomes ${\displaystyle A}$, ${\displaystyle B}$ correspond to the answers given by Mary and John. We make an assumption that we call *locality*: there is no interaction between the two wings of the experiment. In the metaphor of the game, this corresponds to the assumption that Mary and John cannot communicate after they receive the questions.

We also assume that the random procedure that chooses the settings ${\displaystyle a}$, ${\displaystyle b}$ is independent of whatever physical processes go on at the source. This assumption is sometimes referred to as *no conspiracy*; in the metaphor of the game, it corresponds to the assumption that, during the time they can communicate, Mary and John have no information about which questions Alice and Bob are going to ask. (The physical processes at the source correspond, in the metaphor, to the conversation between Mary and John before they receive the questions.) The name "no conspiracy" comes from the idea that correlations between physical processes (such as the emission of a particle pair) and subsequent, apparently independent, choices made by human experimenters or (pseudo-)random number generators would amount to a conspiracy of Nature against the investigators; no scientific investigation is really possible without some sort of no-conspiracy assumption.

Let us assume that the settings ${\displaystyle a}$, ${\displaystyle b}$ take values in the three-element set ${\displaystyle \{1,2,3\}}$ and that the outcomes ${\displaystyle A}$, ${\displaystyle B}$ take values in the two-element set ${\displaystyle \{-1,1\}}$. For each pair ${\displaystyle (a,b)\in \{1,2,3\}\times \{1,2,3\}}$ the experiment allows one to observe a probability distribution ${\displaystyle P_{ab}}$ for the outcomes ${\displaystyle (A,B)\in \{-1,1\}\times \{-1,1\}}$. In this notation, rules (a) and (b) of the game become:

(a) if ${\displaystyle a=b}$ then ${\displaystyle P_{ab}(A=B)=1}$;

(b) if ${\displaystyle a\neq b}$ then ${\displaystyle P_{ab}(A=B)=1/4}$.

Bell's theorem states that, under the no conspiracy assumption,

${\displaystyle {\text{(a) and (b) and locality}}\Longrightarrow {\text{contradiction}},}$

so that (a) and (b) imply a violation of locality. The proof of the theorem is very simple and goes as follows. First, from (a) and the locality assumption one concludes that the outcomes ${\displaystyle A}$, ${\displaystyle B}$ are predetermined: there exist random variables ${\displaystyle Z_{1}}$, ${\displaystyle Z_{2}}$, ${\displaystyle Z_{3}}$ taking values in ${\displaystyle \{-1,1\}}$ such that ${\displaystyle A=Z_{a}}$ and ${\displaystyle B=Z_{b}}$. Assumption (b) then implies

${\displaystyle P(Z_{1}=Z_{2})=1/4,\quad P(Z_{1}=Z_{3})=1/4,\quad P(Z_{2}=Z_{3})=1/4.}$

A contradiction is now obtained from the following result.

**Proposition.** Given random variables ${\displaystyle Z_{1}}$, ${\displaystyle Z_{2}}$, ${\displaystyle Z_{3}}$ taking values in a two-element set, ${\displaystyle P(Z_{1}=Z_{2})+P(Z_{1}=Z_{3})+P(Z_{2}=Z_{3})\geq 1.}$

**Proof.** Since the random variables take values in a two-element set, in every outcome at least two of the three values must coincide; the union of the events ${\displaystyle [Z_{1}=Z_{2}]}$, ${\displaystyle [Z_{1}=Z_{3}]}$, ${\displaystyle [Z_{2}=Z_{3}]}$ is therefore the entire sample space, so the sum of the probabilities of these events must be greater than or equal to 1. But the three probabilities displayed above sum to only 3/4, yielding the desired contradiction.

The following scheme summarizes the proof of Bell's theorem \eqref{eq:Bell}; first, it is argued that:
$$\label{eq:EPR} \text{(a) and locality}\Longrightarrow\text{predetermined values }Z_i$$
and then it is argued that:
$$\label{eq:Bellsidea} \text{(b) and predetermined values }Z_i\Longrightarrow\text{contradiction},$$
so that \eqref{eq:Bell} is proven. Argument \eqref{eq:EPR} is actually a version of the celebrated Einstein--Podolsky--Rosen (EPR) argument. Bell's idea was to add the new argument \eqref{eq:Bellsidea} to the old EPR argument, showing that the locality assumption is incompatible with (a) and (b).

\section*{A sharper formulation of the locality assumption and the Clauser--Horne--Shimony--Holt inequality}

It is possible to give a sharper formulation of the locality assumption in terms of conditional probabilities. This has several advantages. It allows a better appreciation of the argument \eqref{eq:EPR} and of the role of the no conspiracy assumption in the proof of Bell's theorem. It also allows one to prove an inequality, the {\em Clauser--Horne--Shimony--Holt inequality\/} (CHSH inequality), directly from the assumption of locality, so that the mere violation of that inequality (under the no conspiracy assumption) implies a violation of locality. This yields a second variant of Bell's theorem.

As before, we denote by $a$, $b$ the setting of the knobs. Let $(\Lambda,P)$ be a probability space and for each $\lambda\in\Lambda$ and each $a$, $b$, let $P_{ab}(\cdot|\lambda)$ be a probability measure on $\{-1,1\}\times\{-1,1\}$. We denote elements of the product $\Lambda\times\{-1,1\}\times\{-1,1\}$ by $(\lambda,A,B)$. For each $a$, $b$, we obtain a probability measure $P_{ab}$ on the product $\Lambda\times\{-1,1\}\times\{-1,1\}$ by requiring the marginal distribution of $\lambda$ to be $P$ and the conditional distribution of $(A,B)$ given $\lambda$ to be $P_{ab}(\cdot|\lambda)$.

To understand the intent of this scheme it is useful to think in terms of the metaphor of the game: during the preliminary meeting that takes place before the questions are asked, the players John and Mary are allowed to share some set of data, which we denote by $\lambda$. Such $\lambda$ is an element of some set $\Lambda$ and, since $\lambda$ is allowed to vary from run to run, we have some probability measure $P$ on $\Lambda$ which encodes the relative frequencies of the various values assumed by $\lambda$. For each $\lambda\in\Lambda$, the answers $A$, $B$ are chosen using the probability measure $P_{ab}(\cdot|\lambda)$. Notice that the relative frequencies for the values of $A$, $B$ observed by Alice and Bob are not given by the conditional distribution $P_{ab}(\cdot|\lambda)$ of $(A,B)$ given $\lambda$, but by the marginal of $(A,B)$, which is obtained from $P_{ab}(\cdot|\lambda)$ by averaging over $\lambda\in\Lambda$ with $P$: $P_{ab}(A=A_0,B=B_0)=\int_\Lambda P_{ab}(A=A_0,B=B_0|\lambda)\,\dd P(\lambda).$ The locality assumption means that when Mary chooses her answer $A$ she is allowed to use the question $a$ and the value of $\lambda$ agreed upon in the previous conversation with John, but not the question $b$ or John's answer $B$ (similarly, when John chooses his answer $B$ he is allowed to use $b$ and $\lambda$, but not $a$ or $A$). Thus, a {\em local strategy\/} for John and Mary is one that satisfies the following conditions:
\begin{itemize}
\item[(i)] the conditional distributions of $A$, $B$ given $\lambda$ are independent, i.e.: $P_{ab}(A=A_0,B=B_0|\lambda)=P_{ab}(A=A_0|\lambda)P_{ab}(B=B_0|\lambda);$
\item[(ii)] $P_{ab}(A=A_0|\lambda)$ does not depend on $b$ and $P_{ab}(B=B_0|\lambda)$ does not depend on $a$.
\end{itemize}
The assumption of {\em locality\/} is the assumption of the existence of the probability space $(\Lambda,P)$ and of the conditional distributions $P_{ab}(\cdot|\lambda)$ satisfying (i) and (ii).
The no conspiracy assumption is the assumption that the probability measure $P$ on $\Lambda$ does not depend on $a$ and $b$ (in terms of the metaphor of the game, this means that John and Mary have no information about $a$ and $b$ when they choose $\lambda$).
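As a toy illustration of this framework (an example constructed here, not taken from the text), one can let $\Lambda$ be the set of predetermined answer triples $(Z_1,Z_2,Z_3)$ with the uniform measure and let each wing answer deterministically. This local strategy reproduces the perfect correlations (a), but for $a\ne b$ it gives $P_{ab}(A=B)=1/2$ rather than the quantum value $1/4$, in line with what Bell's theorem requires.

```python
# Toy local strategy in the (Lambda, P) framework: lambda encodes a triple
# of predetermined answers (Z1, Z2, Z3), chosen uniformly; on question a,
# Mary answers Z_a and, on question b, John answers Z_b (deterministically).
lambdas = [(z1, z2, z3) for z1 in (-1, 1) for z2 in (-1, 1) for z3 in (-1, 1)]
weight = 1.0 / len(lambdas)  # uniform probability measure P on Lambda

def p_equal(a, b):
    # Observable marginal P_ab(A = B), obtained by averaging over lambda
    # with P (settings a, b are indices 0, 1, 2).
    return sum(weight for lam in lambdas if lam[a] == lam[b])

# Perfect correlations (a) hold, but for a != b this local model yields
# P(A = B) = 1/2, not the quantum-mechanical 1/4.
assert all(p_equal(i, i) == 1.0 for i in range(3))
assert all(abs(p_equal(a, b) - 0.5) < 1e-12
           for a in range(3) for b in range(3) if a != b)
```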

A deeper analysis of this sharp mathematical formulation of the locality assumption is given by Bell\footnote{% More precisely, Bell gives a definition of what he calls a {\em locally causal\/} theory. Our locality condition is a consequence of Bell's local causality condition presented in \cite{Bellcuisine}.} in \cite{Bellcuisine} and it is beyond the scope of this article.

\subsection*{The EPR argument again} The EPR argument \eqref{eq:EPR} can be justified in more detail using the sharper formulation of the locality assumption. One only needs the following elementary result from probability theory:

\begin{lem} Let $X$, $Y$ be independent random variables. If $P(X=Y)=1$ then there exists a constant $k$ such that $P(X=k)=1$ and $P(Y=k)=1$.\qed \end{lem}

Apply the Lemma to the random variables $A$, $B$ conditioned on $\lambda$. By part (i) of the locality assumption, these random variables are independent; from assumption (a), we have $P_{aa}(A=B)=1$, so that also $P_{aa}(A=B|\lambda)=1$ for $P$-almost every $\lambda$, and the Lemma implies that the random variables $A$, $B$ given $\lambda$ are equal with probability $1$ to a constant $k=k(a,\lambda)$ that might depend on $a$ and $\lambda$. It follows from part (ii) of the locality assumption that: $P_{ab}\big(A=k(a,\lambda)\big)=1,\quad P_{ab}\big(B=k(b,\lambda)\big)=1.$ The random variables $Z_1$, $Z_2$, $Z_3$ are then defined by: $Z_i=k(i,\lambda),\quad i=1,2,3.$

\subsection*{CHSH inequality}
Given $a$ and $b$, we denote by: $C(a,b)=E_{ab}(AB)$ the expected value of the product $AB$ under the probability measure $P_{ab}$. Assuming, as before, that $A$ and $B$ take values in $\{-1,1\}$, we have the following:
\begin{prop}[CHSH inequality]
Under the assumption of locality ((i) and (ii)), the following inequality holds: $\vert C(a,b)+C(a,b')\vert+\vert C(a',b)-C(a',b')\vert\le2,$ for all $a$, $b$, $a'$, $b'$.
\end{prop}
\begin{proof}
We have: $C(a,b)=\int_\Lambda E_{ab}(AB|\lambda)\,\dd P(\lambda).$ Since the variables $A$, $B$, conditioned on $\lambda$, are independent, we have: $E_{ab}(AB|\lambda)=E_{ab}(A|\lambda)E_{ab}(B|\lambda).$ Since the conditional distribution of $A$ given $\lambda$ does not depend on $b$ and the conditional distribution of $B$ given $\lambda$ does not depend on $a$, we write: $E_{ab}(A|\lambda)=f(a,\lambda),\quad E_{ab}(B|\lambda)=g(b,\lambda).$ Thus: $C(a,b)=\int_\Lambda f(a,\lambda)g(b,\lambda)\,\dd P(\lambda),$ and:
\begin{multline*}
\vert C(a,b)+C(a,b')\vert+\vert C(a',b)-C(a',b')\vert\le\\
\int_\Lambda\Big[\vert f(a,\lambda)\vert\vert g(b,\lambda)+g(b',\lambda)\vert+
\vert f(a',\lambda)\vert\vert g(b,\lambda)-g(b',\lambda)\vert\Big]\,\dd P(\lambda)\le\\
\int_\Lambda\Big[\vert g(b,\lambda)+g(b',\lambda)\vert+\vert g(b,\lambda)-g(b',\lambda)\vert\Big]\,\dd P(\lambda),
\end{multline*}
since $A$, $B$ take values in $\{-1,1\}$, so that $\vert f(a,\lambda)\vert\le1$ and $\vert f(a',\lambda)\vert\le1$. An elementary argument shows that: $\vert\alpha+\beta\vert+\vert\alpha-\beta\vert\le2,$ for all $\alpha,\beta\in[-1,1]$. Setting $\alpha=g(b,\lambda)\in[-1,1]$, $\beta=g(b',\lambda)\in[-1,1]$, the conclusion follows.
\end{proof}
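The bound can also be probed numerically. The following sketch (an illustration under the stated assumptions: a small finite hidden-variable space $\Lambda$ and randomly sampled response functions $f$, $g$ with values in $[-1,1]$) computes the CHSH expression for many random local strategies and never finds a value exceeding $2$, as the Proposition guarantees.

```python
import random

random.seed(0)
N_LAMBDA = 5  # size of a finite hidden-variable space Lambda (arbitrary choice)

def chsh(f, g, weights):
    # C(x, y) = sum over lambda of P(lambda) * f(x, lambda) * g(y, lambda);
    # settings a, a' are rows 0, 1 of f; settings b, b' are rows 0, 1 of g.
    def C(x, y):
        return sum(w * f[x][l] * g[y][l] for l, w in enumerate(weights))
    return abs(C(0, 0) + C(0, 1)) + abs(C(1, 0) - C(1, 1))

max_seen = 0.0
for _ in range(2000):
    raw = [random.random() for _ in range(N_LAMBDA)]
    weights = [r / sum(raw) for r in raw]  # a random probability measure P
    f = [[random.uniform(-1, 1) for _ in range(N_LAMBDA)] for _ in range(2)]
    g = [[random.uniform(-1, 1) for _ in range(N_LAMBDA)] for _ in range(2)]
    max_seen = max(max_seen, chsh(f, g, weights))

assert max_seen <= 2.0  # the CHSH bound for local strategies
```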

\section*{Quantum Theory violates Bell's constraint}

Let $H$ denote a two-dimensional state space (i.e., a two-dimensional complex Hilbert space) and let $e_1$, $e_2$ be an orthonormal basis of $H$. Let $\sigma_x$, $\sigma_y$, $\sigma_z$ denote the observables which are represented with respect to the basis $e_1$, $e_2$ by the Pauli matrices: $\sigma_x=\begin{pmatrix}0&1\\1&0\end{pmatrix},\quad \sigma_y=\begin{pmatrix}0&-i\\i&0\end{pmatrix},\quad \sigma_z=\begin{pmatrix}1&0\\0&-1\end{pmatrix},$ and, for each unit vector $\boldn=(n_x,n_y,n_z)\in\R^3$, set: $\boldn\cdot\boldsigma=n_x\sigma_x+n_y\sigma_y+n_z\sigma_z.$ The state space $H$ could be associated to the spin degrees of freedom of a spin $\frac12$ particle (in which case the observables $\boldn\cdot\boldsigma$ can be measured using Stern--Gerlach magnets) or to the polarization degrees of freedom of a photon (in which case the observables $\boldn\cdot\boldsigma$ can be measured using polarizers). The spectrum of $\boldn\cdot\boldsigma$ is the two element set $\{-1,1\}$.
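These definitions are easy to check numerically; the following sketch builds $\boldn\cdot\boldsigma$ with NumPy for a randomly chosen unit vector $\boldn$ and confirms that its spectrum is the two element set $\{-1,1\}$.

```python
import numpy as np

# Pauli matrices with respect to the basis e1, e2.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def n_dot_sigma(n):
    # n . sigma = nx*sx + ny*sy + nz*sz for n = (nx, ny, nz)
    nx, ny, nz = n
    return nx * sx + ny * sy + nz * sz

rng = np.random.default_rng(0)
v = rng.normal(size=3)
n = v / np.linalg.norm(v)  # a random unit vector in R^3

# n . sigma is Hermitian; its eigenvalues are -1 and 1 for any unit n.
eig = np.linalg.eigvalsh(n_dot_sigma(n))
assert np.allclose(np.sort(eig), [-1.0, 1.0])
```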

Now consider the composite system with state space given by the tensor product $H\otimes H$. We set: $\boldn\cdot\boldsigma^1=(\boldn\cdot\boldsigma)\otimes\I,\quad \boldn\cdot\boldsigma^2=\I\otimes(\boldn\cdot\boldsigma),$ where $\I$ denotes the identity operator on $H$. Given unit vectors $\boldn^1$, $\boldn^2$ in $\R^3$, the observables $\boldn^1\cdot\boldsigma^1$ and $\boldn^2\cdot\boldsigma^2$ can be measured using spacelike separated experiments. Assume that the state of our composite system has been prepared at the source in the {\em singlet state}: $\psi=\frac{e_1\otimes e_2-e_2\otimes e_1}{\sqrt2}.$ A straightforward computation shows that the expected value of the observable $(\boldn^1\cdot\boldsigma^1)(\boldn^2\cdot\boldsigma^2)=(\boldn^1\cdot\boldsigma)\otimes(\boldn^2\cdot\boldsigma)$ at the state $\psi$ equals: $\langle(\boldn^1\cdot\boldsigma^1)(\boldn^2\cdot\boldsigma^2)\psi,\psi\rangle=-\boldn^1\cdot\boldn^2.$ The situation in which both wings of the experiment get the same outcome corresponds to the outcome $1$ for the measurement of $(\boldn^1\cdot\boldsigma^1)(\boldn^2\cdot\boldsigma^2)$, and the situation in which the wings of the experiment get opposite outcomes corresponds to the outcome $-1$ for the measurement of $(\boldn^1\cdot\boldsigma^1)(\boldn^2\cdot\boldsigma^2)$. Denoting by $p(\boldn^1,\boldn^2)$ the probability that both wings of the experiment get the same outcome, we have: $-\boldn^1\cdot\boldn^2=p(\boldn^1,\boldn^2)+(-1)\big(1-p(\boldn^1,\boldn^2)\big),$ so that: $p(\boldn^1,\boldn^2)=\frac{1-\boldn^1\cdot\boldn^2}2.$ Let $\boldn^1(1)$, $\boldn^1(2)$, $\boldn^1(3)$ be three unit vectors in a plane of $\R^3$ such that any two of them make an angle of $120$ degrees with each other and set $\boldn^2(1)=-\boldn^1(1)$, $\boldn^2(2)=-\boldn^1(2)$, $\boldn^2(3)=-\boldn^1(3)$.
Given $a,b\in\{1,2,3\}$, the probability: $P_{ab}(A=B)=p\big(\boldn^1(a),\boldn^2(b)\big)=\frac{1-\boldn^1(a)\cdot\boldn^2(b)}2=\frac{1+\boldn^1(a)\cdot\boldn^1(b)}2$ equals $1$ if $a=b$ and equals $1/4$ if $a\ne b$. These are exactly the conditions (a) and (b) that appear in the first version of Bell's theorem presented in this article. Alternatively, such conditions can be obtained using the state: $\phi=\frac{e_1\otimes e_1+e_2\otimes e_2}{\sqrt2}.$ The expected value of $(\boldn^1\cdot\boldsigma^1)(\boldn^2\cdot\boldsigma^2)$ with respect to $\phi$ is: $\langle(\boldn^1\cdot\boldsigma^1)(\boldn^2\cdot\boldsigma^2)\phi,\phi\rangle=\boldn^1\cdot R(\boldn^2),$ where $R(x,y,z)=(x,-y,z)$ denotes reflection with respect to the $xz$ plane. The probability $p(\boldn^1,\boldn^2)$ now becomes: $p(\boldn^1,\boldn^2)=\frac{1+\boldn^1\cdot R(\boldn^2)}2.$ Conditions (a) and (b) are now obtained by letting $\boldn^1(1)$, $\boldn^1(2)$, $\boldn^1(3)$ be three unit vectors in the $xz$ plane making angles of $120$ degrees with each other and by setting $\boldn^2(1)=\boldn^1(1)$, $\boldn^2(2)=\boldn^1(2)$ and $\boldn^2(3)=\boldn^1(3)$.
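The singlet-state construction can be verified numerically; this sketch uses the formula $p(\boldn^1,\boldn^2)=(1-\boldn^1\cdot\boldn^2)/2$ derived above, with three unit vectors in the $xz$ plane at $120$ degrees to each other on the first wing and their negatives on the second wing.

```python
import numpy as np

# Unit vectors in the xz-plane at 120 degrees to one another (first wing),
# and their negatives on the second wing, as described in the text.
angles = np.deg2rad([0.0, 120.0, 240.0])
n1 = [np.array([np.sin(t), 0.0, np.cos(t)]) for t in angles]
n2 = [-v for v in n1]

def p_same(u, v):
    # Singlet-state probability that both wings get the same outcome.
    return (1 - np.dot(u, v)) / 2

# Condition (a): probability 1 when a = b; condition (b): 1/4 when a != b.
for a in range(3):
    for b in range(3):
        expected = 1.0 if a == b else 0.25
        assert np.isclose(p_same(n1[a], n2[b]), expected)
```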

\subsection*{Violating the CHSH inequality} Using the state $\psi$, if the settings $a$, $b$ correspond to unit vectors $\boldn^1$, $\boldn^2$ then $C(a,b)=-\boldn^1\cdot\boldn^2$. Now let the settings $a$, $b$, $a'$, $b'$ correspond respectively to the unit vectors: $(1,0,0),\quad\tfrac1{\sqrt2}(1,0,1),\quad(0,0,1),\quad\tfrac1{\sqrt2}(1,0,-1).$ Then: $\vert C(a,b)+C(a,b')\vert+\vert C(a',b)-C(a',b')\vert=2\sqrt2>2.$ The same conclusion holds (using the same unit vectors) if we use the state $\phi$ instead of $\psi$.
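The arithmetic of this violation is quickly confirmed; the following sketch evaluates the CHSH expression with the singlet-state correlation $C(a,b)=-\boldn^1\cdot\boldn^2$ and the four unit vectors listed above.

```python
import numpy as np

# The four measurement directions from the text, for settings a, b, a', b'.
a_vec  = np.array([1.0, 0.0, 0.0])
b_vec  = np.array([1.0, 0.0, 1.0]) / np.sqrt(2)
ap_vec = np.array([0.0, 0.0, 1.0])
bp_vec = np.array([1.0, 0.0, -1.0]) / np.sqrt(2)

def C(u, v):
    # Singlet-state correlation: C(a, b) = -n1 . n2.
    return -np.dot(u, v)

value = abs(C(a_vec, b_vec) + C(a_vec, bp_vec)) \
    + abs(C(ap_vec, b_vec) - C(ap_vec, bp_vec))

# The quantum value 2*sqrt(2) exceeds the local bound 2.
assert np.isclose(value, 2 * np.sqrt(2))
assert value > 2
```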

\begin{thebibliography}{99}

\bibitem{Bellcuisine} J. S. Bell, {\em La nouvelle cuisine}, Between Science and Technology, edited by A. Sarlemijn and P. Kroes, Elsevier Science Publishers (1990), republished in \cite{BellSpeakable}.

\bibitem{BellSpeakable} J. S. Bell, {\em Speakable and unspeakable in quantum mechanics}, Cambridge University Press, 2nd ed.\ (2004).

\end{thebibliography}

## Possible Responses

For those who accept Bell's own views about the meaning and implications of Bell's theorem and the associated experiments, there are several possible options for resolving the apparent conflict between "relativistic" local causality and experiment:

### Abandon Fundamental Relativity

One possibility is to reject "fundamental relativity" (i.e., the claim that spacetime structure is fully and finally captured by the metric of relativity theory) and adopt instead a view in which there exists some (appropriately hidden) additional spacetime structure. Bell suggested this possibility here:

"...I would say that the cheapest resolution is something like going back to relativity as it was before Einstein, when people like Lorentz and Poincare thought that there was an aether -- a preferred frame of reference -- but that our measuring instruments were distorted by motion in such a way that we could not detect motion through the aether. Now, in that way you can imagine that there is a preferred frame of reference, and in this preferred frame of reference things do go faster than light. .... Behind the apparent Lorentz invariance of the phenomena, there is a deeper level which is not Lorentz invariant." [16]

Bell also devoted an entire paper -- "How to teach special relativity" -- to clarifying and defending the reasonableness of Lorentz's approach. [17]

### Unify nonlocal causation with fundamental Lorentz invariance

Bell's local causality condition is strongly motivated by relativity theory (and has been taken as an uncontroversial implication of relativity by most physicists for almost a century), but strictly speaking one could conceivably produce an empirically viable theory which violated Bell's local causality condition (as required by the theorem) but which was nevertheless formulated in a fundamentally Lorentz invariant (i.e., relativistic) way. Bell hinted that the Ghirardi-Rimini-Weber (GRW) theory might lead to such a theory [18], and some crucial further steps in this direction have been taken recently by Roderich Tumulka.[19]

### Retro-causation

Bell's local causality condition is essentially the requirement that the causes of a given event in spacetime should be confined exclusively to the past light cone of that event. Some people advocate understanding this condition as the conjunction of two conditions: "locality" (understood as the requirement that the causes of a given event lie in the past and/or future light cones of the event) and "causality" (understood as the requirement that causes must always temporally precede their effects). This motivates the search for empirically viable theories which violate Bell's local causality condition (as required by the theorem), but which preserve "locality" by including explicit backwards-in-time (but sub-luminal) causal influences.


There is some overlap between such retro-causal models and the proposals, mentioned above, that would unify nonlocal causation with fundamental Lorentz invariance (at least for GRW-flash-type models).

### Blame experiment

A final possibility is to deny that the experiments (purporting to demonstrate that Bell's inequalities are violated) really prove what they purport to prove. Such objections fall into two camps -- those who think that a better future experiment (say, with better statistics) will reveal Bell's inequality to in fact be respected; and those who think that all Bell test experiments are necessarily and inevitably biased.

Bell acknowledged and responded to the first camp as follows:

"It is often said then that experiment has decided against the locality inequality. Strictly speaking that is not so. The actual experiments depart too far from the ideal, and only after the various deficiencies are 'corrected' by theoretical extrapolation do the actual experiments become critical. There is a school of thought which stresses this fact, and advocates the idea that better experiments may contradict quantum mechanics and vindicate locality. I do not myself entertain that hope. I am too impressed by the quantitative success of quantum mechanics, for the experiments already done, to hope that it will fail for more nearly ideal ones." [20]

The second possibility pertains to what Bell called "super-determinism" -- the idea here being that even the outputs of random number generators (and, say, "free" choices by humans) are determined by events in the past, which would allow in principle a locally causal theory to account for the experiments. But this would necessarily be of a highly conspiratorial character, with the necessary correlations (between the particle pair states $\lambda$ and the "random" settings $\hat{a}$ and $\hat{b}$ for a given run of the experiment) being somehow encoded in the physical state of the remote past. (And note that the correlations would have to persist no matter by what mechanism the settings are "randomly" chosen -- hence the conspiratorial character.)

Bell described the idea as follows:

"An essential element in the reasoning here is that [$\hat{a}$] and [$\hat{b}$] are free variables. One can envisage theories in which there just are no free variables for the polarizer angles to be coupled to. In such 'superdeterministic' theories the apparent free will of experimenters, and any other apparent randomness, would be illusory. Perhaps such a theory could be both locally causal and in agreement with quantum mechanical predictions. However I do not expect to see a serious theory of this kind. I would expect a serious theory to permit 'deterministic chaos' or 'pseudorandomness', for complicated subsystems (e.g. computers) which would provide variables sufficiently free for the purpose at hand. But I do not have a theorem about that." [21]

And also:

"Of course, it might be that these reasonable ideas about physical randomizers are just wrong -- for the purpose at hand. A theory may appear in which such conspiracies inevitably occur, and these conspiracies may then seem more digestible than the nonlocalities of other theories. When that theory is announced I will not refuse to listen, either on methodological or other grounds. But I will not myself try to make such a theory." [22]

## Objections to Bell's understanding of the theorem

Many physicists believe that Bell's theorem says and means something very different from what Bell himself thought it says and means. Unfortunately, most such physicists don't recognize that their views differ from Bell's (because they have never read Bell's papers, but only secondary sources such as textbooks). So it is not quite precise to label such alternative views as "objections" to Bell's arguments. Nevertheless, we group them together here with other views that do explicitly challenge Bell's arguments.

### The theorem assumes X in addition to local causality

One frequently hears that, in addition to the local causality premise, Bell's derivation of the inequality also assumes something else. The usual suspects here include "hidden variables", "determinism", "realism", and "counter-factual definiteness" (and several other ideas, which are highly overlapping in terms of their meanings). For example, H.P. Stapp has written that Bell's theorem "shows only that if certain predictions of quantum theory are correct, and if a certain hidden variable assumption is valid, then a locality condition must fail." [23] N.D. Mermin wrote that "[t]o those for whom nonlocality is anathema, Bell's Theorem finally spells the death of the hidden-variables program." [24]

Such objections, however, are typically based on confusions about Bell's reasoning. Bell addressed this particular sort of confusion in a footnote of his paper "Bertlmann's socks and the nature of reality" (which footnote originates from the section, quoted above, in which Bell describes the "widespread and erroneous conviction that for Einstein determinism was always the sacred principle"). Bell writes: "And his followers [meaning Bell himself]. My own first paper on this subject ... starts with a summary of the EPR argument from locality to deterministic hidden variables. But the commentators have almost universally reported that it begins with deterministic hidden variables." [25]

### Bell's local causality condition is too strong

Some critics accept that the empirically testable Bell inequality can be deduced from Bell's local causality condition, but argue that the local causality condition itself is too strong, i.e., that it smuggles in subsidiary conditions going beyond what is minimally required to capture relativity's (alleged) prohibition on superluminal causation. The most prominent and influential such critic is Jon Jarrett, who argued, in his 1984 paper and subsequently, that Bell's local causality condition can and should be decomposed into two sub-conditions: one (Jarrett's "locality") captures relativity's prohibition on superluminal causation, while the other (Jarrett's "completeness") is an extraneous requirement for which there is no basis in relativity. Jarrett thus acknowledged "that [Bell's local causality] cannot be satisfied by any empirically adequate theory" but argued that

"Since locality is contravened only on pain of a serious conflict with relativity theory (which is extraordinarily well-confirmed independently), it is appropriate to assign the blame to the completeness condition." [26]

Replies to Jarrett's arguments have been made by Butterfield, [27] Maudlin, [28] and Norsen. [29]

### Many worlds

This is not so much an objection to Bell's reasoning as a claim that the experiments have been misinterpreted. One can, however, regard it as a tacit logical assumption of the derivation that experiments have single, definite outcomes, an assumption the many-worlds interpretation denies.

### Non-Commuting Numbers

Bell's theorem assumes that the outcomes of experiments can be represented by ordinary, commuting numbers. Joy Christian has recently argued that this assumption is too restrictive, claiming to exhibit an allegedly locally causal model, in which the outcomes of experiments are represented by non-commuting numbers, that violates Bell's inequality. These claims remain highly controversial.

## Further Reading

The following are intended for general audiences.

• Amir D. Aczel, Entanglement: The greatest mystery in physics (Four Walls Eight Windows, New York, 2001).
• A. Afriat and F. Selleri, The Einstein, Podolsky and Rosen Paradox (Plenum Press, New York and London, 1999)
• J. Baggott, The Meaning of Quantum Theory (Oxford University Press, 1992)
• N. David Mermin, "Is the moon there when nobody looks? Reality and the quantum theory", in Physics Today, April 1985, pp. 38–47.
• Louisa Gilder, The Age of Entanglement: When Quantum Physics Was Reborn (New York: Alfred A. Knopf, 2008)
• Brian Greene, The Fabric of the Cosmos (Vintage, 2004, ISBN 0-375-72720-5)
• Nick Herbert, Quantum Reality: Beyond the New Physics (Anchor, 1987, ISBN 0-385-23569-0)
• D. Wick, The infamous boundary: seven decades of controversy in quantum physics (Birkhauser, Boston 1995)
• R. Anton Wilson, Prometheus Rising (New Falcon Publications, 1997, ISBN 1-56184-056-4)
• Gary Zukav "The Dancing Wu Li Masters" (Perennial Classics, 2001, ISBN 0-06-095968-1)

## Notes

1. ^ Bell, Speakable and Unspeakable in Quantum Mechanics, p. 172
2. ^ Bell, op cit., p. 173
3. ^ Bell, op cit., p. 173
4. ^ Bell, pp. 1-13
5. ^ Bell, op cit., p. 171
6. ^ Bell, op cit., p. 11
7. ^ David Bohm, Quantum Theory, 1951
8. ^ Arthur Fine, The Shaky Game, ....
9. ^ Bell, op cit., p. 143
10. ^ Bell, op cit., p. 157
11. ^ Niels Bohr, Can Quantum Mechanical Description of Reality Be Considered Complete, ...
12. ^ A. Fine, "The Einstein-Podolsky-Rosen Argument in Quantum Theory", The Stanford Encyclopedia of Philosophy (plato.stanford.edu)
13. ^ Travis Norsen, Bell Locality and the Nonlocal Character of Nature, Foundations of Physics Letters, 19(7), pp. 633-655 (2006).
14. ^ Bell, p. 20
15. ^ Bell, p. 149-50
16. ^ PCW Davies and JR Brown, eds., The Ghost in the Atom, Chapter 3: interview with J.S. Bell, Cambridge University Press, 1986
17. ^ Bell, pp. 67-80
18. ^ Bell, "Are there quantum jumps?", pp. 201-212
19. ^ which paper to cite??
20. ^ Bell, p. 245
21. ^ Bell, p. 244
22. ^ Bell, p. 102
23. ^ "Bell's theorem without hidden variables", quant-ph/0010047
24. ^ N.D. Mermin, "Hidden Variables and the Two Theorems of John Bell", Rev. Mod. Phys., 65, 803-815 (1993)
25. ^ Bell, p. 157
26. ^ Jon Jarrett, "On the physical significance of the locality conditions in the Bell arguments", Nous, 18 (1984) 569-589
27. ^ Jeremy Butterfield, "Bell's Theorem: What it Takes", Brit. J. Phil. Sci., 43 (1992), 41-83
28. ^ Tim Maudlin, Quantum Non-Locality and Relativity, ...
29. ^ Travis Norsen, "Local Causality and Completeness: Bell vs. Jarrett", Found. Phys., 39 (2009), p. 273

## References

• A. Aspect et al., Experimental Tests of Realistic Local Theories via Bell's Theorem, Phys. Rev. Lett. 47, 460 (1981)
• A. Aspect et al., Experimental Realization of Einstein-Podolsky-Rosen-Bohm Gedankenexperiment: A New Violation of Bell's Inequalities, Phys. Rev. Lett. 49, 91 (1982).
• A. Aspect et al., Experimental Test of Bell's Inequalities Using Time-Varying Analyzers, Phys. Rev. Lett. 49, 1804 (1982).
• A. Aspect and P. Grangier, About resonant scattering and other hypothetical effects in the Orsay atomic-cascade experiment tests of Bell inequalities: a discussion and some new experimental data, Lettere al Nuovo Cimento 43, 345 (1985)
• B. D'Espagnat, The Quantum Theory and Reality, Scientific American, 241, 158 (1979)
• J. S. Bell, On the problem of hidden variables in quantum mechanics, Rev. Mod. Phys. 38, 447 (1966)
• J. S. Bell, Introduction to the hidden variable question, Proceedings of the International School of Physics 'Enrico Fermi', Course IL, Foundations of Quantum Mechanics (1971) 171–81
• J. S. Bell, Bertlmann’s socks and the nature of reality, Journal de Physique, Colloque C2, suppl. au numero 3, Tome 42 (1981) pp C2 41–61
• J. S. Bell, Speakable and Unspeakable in Quantum Mechanics (Cambridge University Press 1987) [A collection of Bell's papers, including all of the above.]
• J. F. Clauser and A. Shimony, Bell's theorem: experimental tests and implications, Reports on Progress in Physics 41, 1881 (1978)
• J. F. Clauser and M. A. Horne, Phys. Rev D 10, 526–535 (1974)
• E. S. Fry, T. Walther and S. Li, Proposal for a loophole-free test of the Bell inequalities, Phys. Rev. A 52, 4381 (1995)
• E. S. Fry, and T. Walther, Atom based tests of the Bell Inequalities — the legacy of John Bell continues, pp 103–117 of Quantum [Un]speakables, R.A. Bertlmann and A. Zeilinger (eds.) (Springer, Berlin-Heidelberg-New York, 2002)
• R. B. Griffiths, Consistent Quantum Theory, Cambridge University Press (2002).
• L. Hardy, Nonlocality for 2 particles without inequalities for almost all entangled states. Physical Review Letters 71 (11) 1665–1668 (1993)
• M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information, Cambridge University Press (2000)
• P. Pearle, Hidden-Variable Example Based upon Data Rejection, Physical Review D 2, 1418–25 (1970)
• A. Peres, Quantum Theory: Concepts and Methods, Kluwer, Dordrecht, 1993.
• P. Pluch, Theory of Quantum Probability, PhD Thesis, University of Klagenfurt, 2006.
• B. C. van Fraassen, Quantum Mechanics, Clarendon Press, 1991.
• M.A. Rowe, D. Kielpinski, V. Meyer, C.A. Sackett, W.M. Itano, C. Monroe, and D.J. Wineland, Experimental violation of Bell's inequalities with efficient detection,(Nature, 409, 791–794, 2001).
• S. Sulcs, The Nature of Light and Twentieth Century Experimental Physics, Foundations of Science 8, 365–391 (2003)
• S. Gröblacher et al., An experimental test of non-local realism,(Nature, 446, 871–875, 2007).
• D. N. Matsukevich, P. Maunz, D. L. Moehring, S. Olmschenk, and C. Monroe, Bell Inequality Violation with Two Remote Atomic Qubits, Phys. Rev. Lett. 100, 150404 (2008).