Law of total probability

In probability theory, the law (or formula) of total probability is a fundamental rule relating marginal probabilities to conditional probabilities. It expresses the total probability of an outcome which can be realized via several distinct events, hence the name.

Statement

In its discrete form, the law of total probability^[1] states that if $\left\{{B_{n}:n=1,2,3,\ldots }\right\}$ is a finite or countably infinite set of mutually exclusive and collectively exhaustive events, then for any event $A$ :

$P(A)=\sum _{n}P(A\cap B_{n})$

or, alternatively,^[1]

$P(A)=\sum _{n}P(A\mid B_{n})P(B_{n}),$

where any term with $P(B_{n})=0$ is simply omitted from the summation: $P(A\mid B_{n})$ is then undefined, but the corresponding term $P(A\cap B_{n})$ of the first sum is zero, so omitting it does not change the total.
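Both forms follow from countable additivity. A short derivation, written with the symbols used above and with $\Omega$ denoting the sample space (a symbol assumed here, not introduced in this section), might run as follows:

```latex
\begin{aligned}
A &= A\cap \Omega = A\cap \bigcup_{n} B_{n} = \bigcup_{n}(A\cap B_{n})
  && \text{(the $B_{n}$ are collectively exhaustive)}\\
P(A) &= \sum_{n} P(A\cap B_{n})
  && \text{(the sets $A\cap B_{n}$ are mutually exclusive)}\\
&= \sum_{n:\,P(B_{n})>0} P(A\mid B_{n})\,P(B_{n})
  && \text{(since $P(A\cap B_{n})\leq P(B_{n})=0$ otherwise)}
\end{aligned}
```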

The summation can be interpreted as a weighted average, and consequently the marginal probability, $P(A)$ , is sometimes called "average probability";^[2] "overall probability" is sometimes used in less formal writings.^[3]
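The weighted-average reading of the second formula can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `total_probability`, the dictionary layout, and the three-event partition with its numbers are all hypothetical, and exact fractions are used so that the check that the $P(B_{n})$ sum to 1 is not disturbed by rounding.

```python
from fractions import Fraction

def total_probability(prior, likelihood):
    """Return P(A) = sum_n P(A | B_n) * P(B_n) over a partition {B_n}.

    prior:      dict mapping each label n to P(B_n); values must sum to 1.
    likelihood: dict mapping each label n to P(A | B_n); labels whose prior
                is zero may be missing, because those terms are skipped.
    """
    if sum(prior.values()) != 1:
        raise ValueError("P(B_n) must sum to 1 for a partition")
    # Terms with P(B_n) = 0 are left out of the sum.
    return sum(likelihood[n] * p for n, p in prior.items() if p != 0)

# Hypothetical partition; B3 has probability zero, so P(A | B3) is not needed.
prior = {"B1": Fraction(1, 2), "B2": Fraction(1, 2), "B3": Fraction(0)}
likelihood = {"B1": Fraction(1, 4), "B2": Fraction(3, 4)}

p_a = total_probability(prior, likelihood)
print(p_a)  # 1/2
```

The result, 1/2, lies between the two conditional probabilities 1/4 and 3/4, as any weighted average with weights $P(B_{n})$ must.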

The law of total probability can also be stated for conditional probabilities:

$P(A\mid C)={\frac {P(A,C)}{P(C)}}={\frac {\sum _{n}P(A,B_{n},C)}{P(C)}}={\frac {\sum _{n}P(A\mid B_{n},C)P(B_{n}\mid C)P(C)}{P(C)}}=\sum _{n}P(A\mid B_{n},C)P(B_{n}\mid C)$

With the $B_{n}$ as above, if $C$ is additionally independent of each $B_{n}$, then $P(B_{n}\mid C)=P(B_{n})$ and the formula simplifies to:

$P(A\mid C)=\sum _{n}P(A\mid B_{n},C)P(B_{n})$
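The independent case can be checked numerically on a small joint distribution. The sketch below is a hypothetical model, not taken from the cited sources: the probabilities in `p_b`, `p_c`, and `p_a_given` are invented, and $C$ is made independent of the $B_{n}$ by construction. Both sides of the identity are then computed from the joint table alone.

```python
from fractions import Fraction
from itertools import product

# Assumed inputs: P(B = n), P(C), and P(A | B = n, C = c).
p_b = {0: Fraction(1, 5), 1: Fraction(3, 10), 2: Fraction(1, 2)}
p_c = {True: Fraction(2, 5), False: Fraction(3, 5)}
p_a_given = {
    (0, True): Fraction(1, 2), (0, False): Fraction(1, 10),
    (1, True): Fraction(1, 4), (1, False): Fraction(3, 5),
    (2, True): Fraction(9, 10), (2, False): Fraction(1, 3),
}

# Joint table over outcomes (a, b, c); B and C are independent by construction.
joint = {}
for a, b, c in product((True, False), p_b, (True, False)):
    p_a = p_a_given[(b, c)] if a else 1 - p_a_given[(b, c)]
    joint[(a, b, c)] = p_a * p_b[b] * p_c[c]

def prob(event):
    """Sum the joint table over outcomes (a, b, c) that satisfy the predicate."""
    return sum(p for outcome, p in joint.items() if event(*outcome))

# Left-hand side: P(A | C) = P(A, C) / P(C).
lhs = prob(lambda a, b, c: a and c) / prob(lambda a, b, c: c)

# Right-hand side: sum over n of P(A | B_n, C) * P(B_n).
rhs = sum(
    prob(lambda a, b, c, n=n: a and b == n and c)
    / prob(lambda a, b, c, n=n: b == n and c)
    * prob(lambda a, b, c, n=n: b == n)
    for n in p_b
)

print(lhs, rhs)  # 5/8 5/8
```

If $C$ were not independent of the $B_{n}$, the weights $P(B_{n})$ on the right-hand side would have to be replaced by $P(B_{n}\mid C)$ for the two sides to agree.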

Continuous case

The law of total probability extends to the case of conditioning on events generated by continuous random variables. Let $(\Omega ,{\mathcal {F}},P)$ be a probability space. Suppose $X$ is a random variable with distribution function $F_{X}$ , and $A$ an event on $(\Omega ,{\mathcal {F}},P)$ . Then the law of total probability states

$P(A)=\int _{-\infty }^{\infty }P(A|X=x)dF_{X}(x).$

If $X$ admits a density function $f_{X}$ , then the result is

$P(A)=\int _{-\infty }^{\infty }P(A|X=x)f_{X}(x)dx.$

Moreover, in the specific case where $A=\{Y\in B\}$ for a random variable $Y$ and a Borel set $B$ , this yields

$P(Y\in B)=\int _{-\infty }^{\infty }P(Y\in B|X=x)f_{X}(x)dx.$
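The density form can be approximated numerically. The following is a minimal sketch rather than a general-purpose integrator: the helper name `total_probability_density`, the midpoint rule, and the example model (with $X$ uniform on $[0,1]$ and $P(A\mid X=x)=x^{2}$) are all assumptions chosen for illustration. For this model the exact answer is $\int _{0}^{1}x^{2}\,dx=1/3$.

```python
def total_probability_density(cond_prob, density, lower, upper, steps=100_000):
    """Approximate the integral of P(A | X = x) * f_X(x) over [lower, upper].

    Uses the midpoint rule; the density is assumed to vanish outside the
    interval, so the integral over the whole real line reduces to this range.
    """
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width  # midpoint of the i-th subinterval
        total += cond_prob(x) * density(x)
    return total * width

# Hypothetical model: X uniform on [0, 1], and P(A | X = x) = x**2.
p_a = total_probability_density(
    cond_prob=lambda x: x ** 2,
    density=lambda x: 1.0,
    lower=0.0,
    upper=1.0,
)
print(p_a)  # close to 1/3
```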

Example

Suppose that two factories supply light bulbs to the market. Factory X's bulbs work for over 5000 hours in 99% of cases, whereas factory Y's bulbs work for over 5000 hours in 95% of cases. It is known that factory X supplies 60% of the total bulbs available and Y supplies 40% of the total bulbs available. What is the chance that a purchased bulb will work for longer than 5000 hours?

Applying the law of total probability, with $A$ denoting the event that the purchased bulb works for over 5000 hours, we have:

$\begin{aligned}P(A)&=P(A\mid B_{X})\cdot P(B_{X})+P(A\mid B_{Y})\cdot P(B_{Y})\\[4pt]&={99 \over 100}\cdot {6 \over 10}+{95 \over 100}\cdot {4 \over 10}={{594+380} \over 1000}={974 \over 1000}\end{aligned}$

where

$P(B_{X})={6 \over 10}$ is the probability that the purchased bulb was manufactured by factory X;
$P(B_{Y})={4 \over 10}$ is the probability that the purchased bulb was manufactured by factory Y;
$P(A\mid B_{X})={99 \over 100}$ is the probability that a bulb manufactured by X will work for over 5000 hours;
$P(A\mid B_{Y})={95 \over 100}$ is the probability that a bulb manufactured by Y will work for over 5000 hours.

Thus each purchased light bulb has a 97.4% chance of working for more than 5000 hours.
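The arithmetic of this example can be reproduced with exact fractions. The variable names below are illustrative choices; the numbers are those given in the example.

```python
from fractions import Fraction

# P(B_X) and P(B_Y): share of bulbs supplied by each factory.
p_factory = {"X": Fraction(6, 10), "Y": Fraction(4, 10)}
# P(A | B_X) and P(A | B_Y): chance that a bulb from each factory lasts over 5000 hours.
p_long_life = {"X": Fraction(99, 100), "Y": Fraction(95, 100)}

# Law of total probability: weight each conditional probability by the factory's share.
p_a = sum(p_long_life[f] * p_factory[f] for f in p_factory)
print(p_a, float(p_a))  # 487/500 0.974
```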

Other names

The term law of total probability is sometimes taken to mean the law of alternatives, which is a special case of the law of total probability applying to discrete random variables. One author uses the terminology of the "Rule of Average Conditional Probabilities",^[4] while another refers to it as the "continuous law of alternatives" in the continuous case.^[5] This result is given by Grimmett and Welsh^[6] as the partition theorem, a name that they also give to the related law of total expectation.

Notes

[1] D. Zwillinger, S. Kokoska (2000). CRC Standard Probability and Statistics Tables and Formulae. CRC Press. p. 31. ISBN 1-58488-059-7.
[2] Paul E. Pfeiffer (1978). Concepts of probability theory. Courier Dover Publications. pp. 47–48. ISBN 978-0-486-63677-1.
[3] Deborah Rumsey (2006). Probability for dummies. For Dummies. p. 58. ISBN 978-0-471-75141-0.
[4] Jim Pitman (1993). Probability. Springer. p. 41. ISBN 0-387-97974-3.
[5] Kenneth Baclawski (2008). Introduction to probability with R. CRC Press. p. 179. ISBN 978-1-4200-6521-3.
[6] Geoffrey Grimmett, Dominic Welsh (1986). Probability: An Introduction. Oxford Science Publications. Theorem 1B.

