Fair coin

In probability theory and statistics, a sequence of independent Bernoulli trials with probability 1/2 of success on each trial is metaphorically called a fair coin. One for which the probability is not 1/2 is called a biased or unfair coin. In theoretical studies, the assumption that a coin is fair is often made by referring to an ideal coin.

John Edmund Kerrich performed experiments in coin flipping and found that a coin made from a wooden disk about the size of a crown and coated on one side with lead landed heads (wooden side up) 679 times out of 1000.[1] In this experiment the coin was tossed by balancing it on the forefinger, flipping it using the thumb so that it spun through the air for about a foot before landing on a flat cloth spread over a table. Edwin Thompson Jaynes claimed that when a coin is caught in the hand, instead of being allowed to bounce, the physical bias in the coin is insignificant compared to the method of the toss, where with sufficient practice a coin can be made to land heads 100% of the time.[2] Exploring the problem of checking whether a coin is fair is a well-established pedagogical tool in teaching statistics.
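As a rough illustration of checking whether a coin is fair, the following Python sketch applies an exact two-sided binomial test to the reported count of 679 heads in 1000 tosses. The function names are hypothetical, and the choice of test is an assumption made for illustration; it is not a method attributed to Kerrich or Jaynes.

```python
from math import comb, sqrt


def two_sided_binomial_p_value(heads: int, tosses: int) -> float:
    """Exact two-sided p-value for the hypothesis P(heads) = 1/2."""
    # Under a fair coin the distribution is symmetric about tosses / 2, so
    # count every outcome at least as far from the centre as the observation.
    deviation = abs(heads - tosses / 2)
    extreme = sum(
        comb(tosses, k)
        for k in range(tosses + 1)
        if abs(k - tosses / 2) >= deviation
    )
    return extreme / 2 ** tosses


def z_score(heads: int, tosses: int) -> float:
    """Distance of the observed count from tosses / 2, in standard deviations."""
    return (heads - tosses / 2) / sqrt(tosses / 4)


if __name__ == "__main__":
    print("z-score:", z_score(679, 1000))
    print("p-value:", two_sided_binomial_p_value(679, 1000))
```

A count this far from 500 is extremely unlikely under a fair coin, which is consistent with the lead-coated disk being strongly biased.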

Probability space definition

In probability theory, a fair coin is defined as a probability space ${\displaystyle (\Omega ,{\mathcal {F}},P)}$, which is in turn defined by the sample space, event space, and probability measure. Using ${\displaystyle H}$ for heads and ${\displaystyle T}$ for tails, the sample space of a coin is defined as:

${\displaystyle \Omega =\{H,T\}}$

The event space for a coin includes all sets of outcomes from the sample space which can be assigned a probability, which is the full power set ${\displaystyle 2^{\Omega }}$. Thus, the event space is defined as:

${\displaystyle {\mathcal {F}}=\{\{\},\{H\},\{T\},\{H,T\}\}}$

${\displaystyle \{\}}$ is the event where neither outcome happens (which is impossible and is therefore assigned probability 0), and ${\displaystyle \{H,T\}}$ is the event where either outcome happens (which is guaranteed and is therefore assigned probability 1). Because the coin is fair, each single outcome has probability 1/2. The probability measure is then defined by the function:

${\displaystyle {\begin{array}{c|c}x&P(x)\\\hline \{\}&0\\\{H\}&0.5\\\{T\}&0.5\\\{H,T\}&1\end{array}}}$

So the full probability space which defines a fair coin is the triplet ${\displaystyle (\Omega ,{\mathcal {F}},P)}$ as defined above. Note that this is not a random variable, because heads and tails do not have inherent numerical values, unlike, for example, the faces of a fair two-valued die. A random variable adds the structure of assigning a numerical value to each outcome. Common choices are ${\displaystyle (H,T)\to (1,0)}$ or ${\displaystyle (H,T)\to (1,-1)}$.
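The construction above can be sketched in Python. This is a minimal illustration, not a standard API: the names `OMEGA`, `EVENTS`, `P`, `power_set` and `expectation` are hypothetical, and exact fractions are used only to avoid floating-point rounding.

```python
from fractions import Fraction
from itertools import chain, combinations

# Sample space: the two possible outcomes of one toss.
OMEGA = frozenset({"H", "T"})


def power_set(outcomes):
    """Return every subset of `outcomes` as a frozenset."""
    items = sorted(outcomes)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


# Event space: the full power set {}, {H}, {T}, {H, T}.
EVENTS = power_set(OMEGA)


def P(event):
    """Fair-coin measure: each outcome in the event contributes 1/2."""
    return Fraction(len(event), len(OMEGA))


def expectation(random_variable):
    """Expected value of a mapping from outcomes to numbers."""
    return sum(random_variable[w] * P(frozenset({w})) for w in OMEGA)


if __name__ == "__main__":
    for event in EVENTS:
        print(sorted(event), P(event))
    print(expectation({"H": 1, "T": 0}))   # the (H, T) -> (1, 0) choice
    print(expectation({"H": 1, "T": -1}))  # the (H, T) -> (1, -1) choice
```

The probability space itself involves only `OMEGA`, `EVENTS` and `P`; a random variable appears only once a dictionary assigning numbers to the outcomes is supplied.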

Role in statistical teaching and theory

The probabilistic and statistical properties of coin-tossing games are often used as examples in both introductory and advanced textbooks, and these are mainly based on assuming that a coin is fair or "ideal". For example, Feller uses this basis both to introduce the idea of random walks and to develop tests for homogeneity within a sequence of observations by looking at the properties of the runs of identical values within a sequence.[3] The latter leads on to a runs test. A time series consisting of the results of tossing a fair coin is called a Bernoulli process.

Fair results from a biased coin

If a cheat has altered a coin to prefer one side over another (a biased coin), the coin can still be used for fair results by changing the game slightly. John von Neumann gave the following procedure:[4]

1. Toss the coin twice.
2. If the results match, start over, forgetting both results.
3. If the results differ, use the first result, forgetting the second.

The reason this process produces a fair result is that the probability of getting heads and then tails must be the same as the probability of getting tails and then heads, as the coin is not changing its bias between flips and the two flips are independent. This works only if getting one result on a trial does not change the bias on subsequent trials, which is the case for most non-malleable coins (but not for processes such as the Pólya urn). By discarding the outcomes of two heads and two tails and repeating the procedure, the coin flipper is left with only two possible outcomes, which have equal probability. This procedure only works if the tosses are paired properly; if part of a pair is reused in another pair, the fairness may be ruined. Also, the coin must not be so biased that one side has a probability of zero.
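The three-step procedure can be sketched in Python as follows. The helper `make_biased_coin` and its `p_heads` and `seed` parameters are hypothetical; they exist only to simulate a biased coin for the demonstration.

```python
import random
from typing import Callable


def von_neumann_flip(biased_flip: Callable[[], str]) -> str:
    """Return one fair result from a coin with an unknown, fixed bias.

    `biased_flip` must return "H" or "T" on every call, with independent
    tosses and a bias that does not change between tosses.
    """
    while True:
        # Step 1: toss the coin twice.
        first, second = biased_flip(), biased_flip()
        # Step 3: if the results differ, use the first one.
        if first != second:
            return first
        # Step 2: otherwise discard both results and start over.


def make_biased_coin(p_heads: float, seed: int = 0) -> Callable[[], str]:
    """Hypothetical simulated coin that lands heads with probability p_heads."""
    rng = random.Random(seed)
    return lambda: "H" if rng.random() < p_heads else "T"


if __name__ == "__main__":
    coin = make_biased_coin(0.8)
    results = [von_neumann_flip(coin) for _ in range(20_000)]
    print("fraction of heads:", results.count("H") / len(results))
```

Each call consumes a fresh pair of tosses, so no toss is shared between pairs. If the simulated coin had `p_heads` equal to 0 or 1, the loop would never terminate, which mirrors the requirement that neither side has probability zero.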

This method may be extended by also considering sequences of four tosses. That is, if the coin is flipped twice and the results match, and the coin is flipped twice again and the results match but show the opposite side, then the result of the first pair can be used. This is because HHTT and TTHH are equally likely: each has probability ${\displaystyle P(H)^{2}P(T)^{2}}$. This can be extended to any multiple of 2.

The expected number of flips can be calculated as follows. Let ${\displaystyle F_{n}}$ denote the number of flips still needed when the n-th game (the n-th pair of tosses) begins. If the pair comes up ${\displaystyle HT}$ or ${\displaystyle TH}$ (step 3), the procedure ends after exactly two flips, so ${\displaystyle E(F_{n}|HT,TH)=2}$. If the pair comes up ${\displaystyle TT}$ or ${\displaystyle HH}$ (step 2), two flips have been spent and the procedure starts over, so ${\displaystyle E(F_{n}|TT,HH)=2+E(F_{n+1})}$. Because each restart is an independent repetition of the same procedure with the same coin, the expected number of remaining flips does not depend on n, so ${\displaystyle E(F)=E(F_{n})=E(F_{n+1})}$. By the law of total expectation:

${\displaystyle {\begin{aligned}E(F)&=E(F_{n})\\&=E(F_{n}|TT,HH)P(TT,HH)+E(F_{n}|HT,TH)P(HT,TH)\\&=(2+E(F_{n+1}))P(TT,HH)+2P(HT,TH)\\&=(2+E(F))P(TT,HH)+2P(HT,TH)\\&=(2+E(F))(P(TT)+P(HH))+2(P(HT)+P(TH))\\&=(2+E(F))(P(T)^{2}+P(H)^{2})+4P(H)P(T)\\&=(2+E(F))(1-2P(H)P(T))+4P(H)P(T)\\&=2+E(F)-2P(H)P(T)E(F)\end{aligned}}}$

${\displaystyle \therefore E(F)=2+E(F)-2P(H)P(T)E(F)\Rightarrow E(F)={\frac {1}{P(H)P(T)}}={\frac {1}{P(H)(1-P(H))}}}$

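The more biased the coin is, the larger the expected number of flips needed to obtain a fair result. The expected number is smallest for a coin that is already fair, where ${\displaystyle P(H)=1/2}$ gives ${\displaystyle E(F)=4}$, and it grows without bound as ${\displaystyle P(H)}$ approaches 0 or 1.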
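As a numerical cross-check of the closed form, the following Python sketch compares ${\displaystyle 1/(P(H)P(T))}$ with a direct summation over the number of rounds. The function names and the `max_rounds` cutoff are hypothetical choices made for this illustration.

```python
def expected_flips_closed_form(p_heads: float) -> float:
    """E(F) = 1 / (P(H) * P(T)) from the derivation above."""
    return 1.0 / (p_heads * (1.0 - p_heads))


def expected_flips_by_series(p_heads: float, max_rounds: int = 100_000) -> float:
    """Sum 2k * P(procedure ends in round k) over k = 1 .. max_rounds."""
    accept = 2.0 * p_heads * (1.0 - p_heads)  # P(HT or TH) in one round
    total = 0.0
    rejected_so_far = 1.0  # probability that every earlier round was HH or TT
    for k in range(1, max_rounds + 1):
        # Ending in round k costs 2k flips.
        total += 2 * k * rejected_so_far * accept
        rejected_so_far *= 1.0 - accept
    return total


if __name__ == "__main__":
    for p in (0.5, 0.6, 0.9, 0.99):
        print(p, expected_flips_closed_form(p), expected_flips_by_series(p))
```

The truncated series agrees with the closed form to within floating-point error for the biases shown; for biases much closer to 0 or 1, `max_rounds` would need to be increased.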

References

1. ^ Kerrich, John Edmund (1946). An experimental introduction to the theory of probability. E. Munksgaard.
2. ^ Jaynes, E.T. (2003). Probability Theory: The Logic of Science. Cambridge, UK: Cambridge University Press. p. 318. ISBN 9780521592710. "anyone familiar with the law of conservation of angular momentum can, after some practice, cheat at the usual coin-toss game and call his shots with 100 per cent accuracy. You can obtain any frequency of heads you want; and the bias of the coin has no influence at all on the results!"
3. ^ Feller, W (1968). An Introduction to Probability Theory and Its Applications. Wiley. ISBN 978-0-471-25708-0.
4. ^ von Neumann, John (1951). "Various techniques used in connection with random digits". National Bureau of Standards Applied Math Series. 12: 36.