Boy or girl paradox

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by ASmartKid (talk | contribs) at 22:53, 29 June 2010 (External links: great summary). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

The Boy or Girl paradox surrounds a well-known set of questions in probability theory, which are also known as the Two Child Problem,[1] Mr. Smith's Children[2] and the Mrs. Smith Problem. The initial formulation of the question dates back to at least 1959, when Martin Gardner published one of the earliest variants of the paradox in Scientific American. Under the title The Two Children Problem, he phrased the paradox as follows:

  • Mr. Jones has two children. The older child is a girl. What is the probability that both children are girls?
  • Mr. Smith has two children. At least one of them is a boy. What is the probability that both children are boys?

Gardner initially gave the answers 1/2 and 1/3, respectively, but later acknowledged that the second question was ambiguous.[1] Its answer could be 1/2, depending on how it was learned that one child was a boy. The ambiguity, which depends on the exact wording and possible assumptions, was confirmed by Bar-Hillel and Falk,[3] and Nickerson.[4]

Other variants of this question, with varying degrees of ambiguity, have recently been popularized by Ask Marilyn in Parade Magazine,[5] John Tierney of The New York Times,[6] and Leonard Mlodinow in Drunkard's Walk,[7] as well as numerous online publications.[8][9][10] One scientific study[2] showed that when identical information was conveyed, but with different partially ambiguous wordings that emphasized different points, the percentage of MBA students who answered 1/2 changed from 85% to 39%.

The paradox has frequently stimulated a great deal of controversy.[4] Many people, including professors of mathematics, argued strongly for both sides with a great deal of confidence, sometimes showing disdain for those who took the opposing view. The paradox stems from whether the problem setup is similar for the two questions[2][7][9]. The intuitive answer is 1/2.[2] This answer is intuitive if the question leads the reader to believe that there are two equally likely possibilities for the gender of the second child (i.e., boy and girl)[2][11], and that the probability of these outcomes is absolute, not conditional.[12]

Common assumptions

The two possible answers share a number of assumptions. First, it is assumed that the space of all possible events can be easily enumerated, providing an extensional definition of outcomes: {BB, BG, GB, GG}.[13] This notation indicates that there are four possible combinations of children, labeling boys B and girls G, and using the first letter to represent the older child. Second, it is assumed that these outcomes are equally probable.[13] This implies the following:

  1. That each child is either male or female.
  2. That the sex of each child is independent of the sex of the other.
  3. That each child has the same chance of being male as of being female.

These assumptions have been shown empirically to be false[13]. It is worth noting that these conditions form an incomplete model. By following these rules, we ignore the possibilities that a child is intersex, the ratio of boys to girls is not exactly 50:50, and (amongst other factors) the possibility of identical twins means that sex determination is not entirely independent. However, this problem is about probability and not about obstetrics or demography. The problem would be the same if it were phrased using a gold coin and a silver coin.

First question

  • Mr. Jones has two children. The older child is a girl. What is the probability that both children are girls?

In this problem, a random family is selected. In this sample space, there are four equally probable events:

Older child     Younger child
Girl            Girl
Girl            Boy
Boy             Girl
Boy             Boy

Only two of these possible events meet the criteria specified in the question (i.e., GB and GG). Since both of the possibilities in the new sample space {GB, GG} are equally likely, and only one of the two, GG, includes two girls, the probability that the younger child is also a girl is 1/2.
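
The enumeration above can be checked mechanically. The following Python sketch builds the four-outcome sample space under the article's assumptions (independent, equally likely sexes) and conditions on the older child being a girl. The helper name `conditional_probability` is illustrative and not taken from any cited source.

```python
from fractions import Fraction
from itertools import product

# Sample space: (older, younger), each "B" or "G", all four equally likely.
SAMPLE_SPACE = list(product("BG", repeat=2))

def conditional_probability(event, given):
    """P(event | given) over equally likely outcomes, by counting."""
    kept = [o for o in SAMPLE_SPACE if given(o)]
    hits = [o for o in kept if event(o)]
    return Fraction(len(hits), len(kept))

def older_is_girl(outcome):
    return outcome[0] == "G"

def both_girls(outcome):
    return outcome == ("G", "G")

answer_q1 = conditional_probability(both_girls, older_is_girl)
print(answer_q1)  # 1/2
```

Counting is valid here only because the outcomes are assumed equally probable; with unequal outcome weights the counts would have to be replaced by sums of probabilities.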

Second question

  • Mr. Smith has two children. At least one of them is a boy. What is the probability that both children are boys?

This question is similar to question one, except that instead of specifying the sex of the older child, it is specified that at least one of the children is a boy. If it is assumed that this information was obtained by considering both children,[14] then there are four equally probable events for a two-child family, as seen in the sample space above. Three of these families meet the necessary and sufficient condition of having at least one boy. The set of possibilities (possible combinations of children that meet the given criteria) is:

Older child     Younger child
Girl            Girl            (excluded: no boy)
Girl            Boy
Boy             Girl
Boy             Boy

Thus, if it is assumed that both children were considered, the answer to question 2 is 1/3. The critical assumption here is how Mr. Smith's family was selected and how the statement was formed. One possibility is that families with two girls were excluded, in which case the answer is 1/3. The other possibility is that the family was selected at random and a true statement was then made about it (with a mixed family equally likely to be described either way); had there been two girls in the Smith family, the statement would have been "at least one is a girl". If the Smith family was selected in this second way, the answer to question 2 is 1/2.

However, if it is assumed that the information was obtained by considering only one child, then the problem is isomorphic to question one, and the answer is 1/2.[1][3][14]
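
The two readings can be compared by exact enumeration. In this Python sketch, the first function conditions on the family having at least one boy (both children considered). The second models an assumed procedure in which one child is picked uniformly at random and turns out to be a boy. The function names and the uniform-pick assumption are illustrative.

```python
from fractions import Fraction
from itertools import product

FAMILIES = list(product("BG", repeat=2))  # (older, younger), equally likely

def both_boys_given_at_least_one_boy():
    """Reading 1: families with two girls are excluded up front."""
    kept = [f for f in FAMILIES if "B" in f]
    return Fraction(sum(f == ("B", "B") for f in kept), len(kept))

def both_boys_given_random_child_is_boy():
    """Reading 2 (assumed procedure): one child is picked uniformly at
    random and turns out to be a boy."""
    weight_boy_seen = Fraction(0)
    weight_boy_seen_and_bb = Fraction(0)
    for family in FAMILIES:
        for child in family:  # each child is picked with probability 1/2
            w = Fraction(1, 4) * Fraction(1, 2)
            if child == "B":
                weight_boy_seen += w
                if family == ("B", "B"):
                    weight_boy_seen_and_bb += w
    return weight_boy_seen_and_bb / weight_boy_seen

print(both_boys_given_at_least_one_boy())     # 1/3
print(both_boys_given_random_child_is_boy())  # 1/2
```

The difference arises because a two-boy family is twice as likely as a mixed family to produce a boy when a single child is picked, which offsets the fact that mixed families are twice as common.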

Third question

  • A (random) family has two children, and one of the two children is a boy named Jacob. What is the probability that the other child is a girl?
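
The possible combinations are listed below, where "Boy" denotes a boy not named Jacob; the two combinations that contain no Jacob are excluded:

Older child     Younger child
Girl            Girl            (excluded: no Jacob)
Boy             Boy             (excluded: no Jacob)
Girl            Jacob
Jacob           Girl
Jacob           Boy
Boy             Jacob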

Or, the set {GJ, JG, JB, BJ}, in which two out of the four possibilities include a girl.

Therefore we might think that the probability returns to 1/2. But this is wrong if we again assume that the information was obtained by looking at both children, because it doesn't take into account the different frequencies of each of these outcomes. The likelihood of a boy being named Jacob and the likelihood of a boy not being named Jacob are not equal. Thus, we must replace our classical interpretation of probability with either a frequentist or a Bayesian interpretation. (Note that in real life child names are not independent of each other. In particular, people usually do not give the same name to two children. Thus, this discussion is purely theoretical.)

Frequentist approach

Consider 10,000 families that have two children. Assume that the gender and name of each child are independent, within families and between families. Assume that the probability of each individual child being a girl is 0.5; otherwise the child is a boy. Assume that the probability of a child having the name Jacob is 0.01, and that all children with the name Jacob are boys.

In the table above, we have a list of all possible unique outcomes. But these outcomes do not have the same frequency. If we start with the assumption that the family has two children, we get the following frequency table:

Older child     Younger child   Frequency
Girl            Girl            2500
Girl            Boy             2500
Boy             Girl            2500
Boy             Boy             2500

With the additional bit of information that the family has a boy named Jacob, we can break every instance of "Boy" into two: "Jacob" and "Boy not Jacob". For every 50 Boys, 1 will fall into the "Jacob" bin and 49 into the "Boy not Jacob" bin. Thus, we have the following table:

Older child     Younger child   Frequency
Girl            Girl            2500
Girl            Jacob           50
Girl            Boy not Jacob   2450
Jacob           Girl            50
Boy not Jacob   Girl            2450
Jacob           Jacob           1
Boy not Jacob   Jacob           49
Jacob           Boy not Jacob   49
Boy not Jacob   Boy not Jacob   2401

If we eliminate all instances that do not meet our given criteria ({Girl, Girl} {Girl, Boy not Jacob} {Boy not Jacob, Girl} {Boy not Jacob, Boy not Jacob}), then we eliminate 9801 of our events, leaving 199 possible events. Of those, the successful events are {Girl, Jacob} and {Jacob, Girl}, or 100 cases.
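
The frequency table and the counts of 199 and 100 can be reproduced with a short Python sketch. It assumes, as stated above, 10,000 two-child families and per-child probabilities of 0.5 for a girl and 0.01 for a boy named Jacob (so 0.49 for any other boy). The variable names are illustrative.

```python
from itertools import product

FAMILIES = 10_000
# Per-child counts out of 100 children, from the stated assumptions:
# 50 girls, 1 Jacob (always a boy), 49 boys not named Jacob.
CHILD_WEIGHTS = {"Girl": 50, "Jacob": 1, "Boy not Jacob": 49}

# Expected number of families for each (older, younger) pair.
table = {
    (older, younger):
        FAMILIES * CHILD_WEIGHTS[older] * CHILD_WEIGHTS[younger] // (100 * 100)
    for older, younger in product(CHILD_WEIGHTS, repeat=2)
}

# Keep only the families that contain at least one Jacob.
with_jacob = {pair: n for pair, n in table.items() if "Jacob" in pair}
total_with_jacob = sum(with_jacob.values())
with_jacob_and_girl = sum(n for pair, n in with_jacob.items() if "Girl" in pair)

print(total_with_jacob, with_jacob_and_girl)  # 199 100
```

Here `"Jacob" in pair` tests membership in the two-element tuple, so "Boy not Jacob" does not count as a Jacob.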

So if the probability of a boy being named Jacob is 1 in 50, then the probability that the family has a girl is 100/199, or roughly 50%. But this value will change depending on the popularity of the name. At the extreme, if all boys were given the same name, then being named Jacob would provide no more information than being a boy, and thus the probability would still be 2/3 that the family has a girl. As the likelihood of the name decreases, the likelihood of the two-Jacob case also decreases, and the probability of the family having a girl approaches the limit of 50%.
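
The dependence on the popularity of the name can be made explicit. The Python sketch below takes q as the probability that a boy is named Jacob (a symbol introduced here for illustration) and computes the probability that the family has a girl, given that at least one child is a boy named Jacob, under the same independence assumptions as above.

```python
from fractions import Fraction

def prob_girl_given_a_jacob(q):
    """P(family has a girl | at least one child is a boy named Jacob),
    where q is the probability that a boy is named Jacob."""
    girl = Fraction(1, 2)
    jacob = Fraction(1, 2) * q
    other_boy = Fraction(1, 2) * (1 - q)
    # Families containing a Jacob: all families minus those with none.
    has_jacob = 1 - (girl + other_boy) ** 2
    # A girl and a Jacob, in either birth order.
    jacob_and_girl = 2 * girl * jacob
    return jacob_and_girl / has_jacob

for q in (Fraction(1), Fraction(1, 50), Fraction(1, 10_000)):
    print(q, prob_girl_given_a_jacob(q))
```

With q = 1 the function returns 2/3, with q = 1/50 it returns 100/199, and as q shrinks the value approaches 1/2, matching the limits described above.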

If we further assume that parents never give two children the same name, we can eliminate {Jacob, Jacob}, leaving 198 possible events; thus it would appear that the probability of the family having a girl is 100/198, or 50/99. However, assuming the older child is named first, a younger brother of a Jacob can no longer be named Jacob, so there are now 50 occurrences of {Jacob, Boy not Jacob} rather than 49. This again gives 199 events, making the probability of a girl 100/199, just as before.

Bayesian approach

The four cases with one boy named Jacob are: Jacob and Boy not Jacob, Boy not Jacob and Jacob, Jacob and Girl, Girl and Jacob, with probabilities $\tfrac{1}{4}p(1-p)$, $\tfrac{1}{4}p(1-p)$, $\tfrac{1}{4}p$, $\tfrac{1}{4}p$, respectively, where $p$ is the probability that a boy is called Jacob. Using Bayes' theorem, we know the other child is a girl with probability:

$$\frac{\tfrac{1}{4}p + \tfrac{1}{4}p}{\tfrac{1}{4}p(1-p) + \tfrac{1}{4}p(1-p) + \tfrac{1}{4}p + \tfrac{1}{4}p} = \frac{1}{2-p}.$$

When $p$ is quite small, as in general cases, we get a result close to 1/2. However, if we change the condition of having the name Jacob to something less informative, such as the day of birth being an even number, $p$ is now very close to 1/2, and the probability that the other child is a girl goes to 2/3.
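
As a sketch of this calculation, write p for the probability that a boy is called Jacob and weight each of the four cases by the product of per-child probabilities: 1/2 for a girl, p/2 for a Jacob, and (1 - p)/2 for any other boy. These weights are an assumption consistent with the frequentist section, and the function name is illustrative. Normalising over the four cases gives the probability that the other child is a girl.

```python
from fractions import Fraction

def other_child_is_girl(p):
    """Posterior probability of a girl, given exactly one child named Jacob.

    p is the (assumed) probability that a boy is called Jacob.
    """
    half = Fraction(1, 2)
    cases = {
        ("Jacob", "Boy not Jacob"): (half * p) * (half * (1 - p)),
        ("Boy not Jacob", "Jacob"): (half * (1 - p)) * (half * p),
        ("Jacob", "Girl"): (half * p) * half,
        ("Girl", "Jacob"): half * (half * p),
    }
    total = sum(cases.values())
    girl = sum(w for pair, w in cases.items() if "Girl" in pair)
    return girl / total

print(other_child_is_girl(Fraction(1, 50)))  # 50/99
print(other_child_is_girl(Fraction(1, 2)))   # 2/3
```

Because the two-Jacob family is not among the four cases, p = 1/50 gives 50/99 here rather than the 100/199 obtained when that family is counted.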

Variants of the question

The Boy or Girl paradox has appeared in many forms. One of the earliest formulations of the question was posed by Martin Gardner in Scientific American in 1959:

  • Mr. Smith has two children. At least one of them is a boy. What is the probability that both children are boys?
  • Mr. Jones has two children. The older child is a girl. What is the probability that both children are girls?

In 1991, Marilyn vos Savant responded to a reader who asked her to answer a variant of the Boy or Girl paradox that included beagles.[5] In 1996, she published the question again in a different form. The 1991 and 1996 questions, respectively, were phrased:

  • A shopkeeper says she has two new baby beagles to show you, but she doesn't know whether they're male, female, or a pair. You tell her that you want only a male, and she telephones the fellow who's giving them a bath. "Is at least one a male?" she asks him. "Yes!" she informs you with a smile. What is the probability that the other one is a male?
  • Say that a woman and a man (who are unrelated) each has two children. We know that at least one of the woman's children is a boy and that the man's oldest child is a boy. Can you explain why the chances that the woman has two boys do not equal the chances that the man has two boys? My algebra teacher insists that the probability is greater that the man has two boys, but I think the chances may be the same. What do you think?

In a 2004 study, Fox & Levav posed the following questions to MBA students with recent schooling in probability:

  • Mr. Smith says: 'I have two children and at least one of them is a boy.' Given this information, what is the probability that the other child is a boy?
  • Mr. Smith says: 'I have two children and it is not the case that they are both girls.' Given this information, what is the probability that both children are boys?

Ambiguous problem statements

The second question is often posed in a way that leaves multiple interpretations open. In response to reader criticism of the question posed in 1959, Gardner agreed that a precise formulation of the question is critical to getting different answers for questions 1 and 2. Specifically, Gardner argued that a "failure to specify the randomizing procedure" could lead readers to interpret the question in two distinct ways:

  • From all families with two children, at least one of whom is a boy, a family is chosen at random. This would yield the answer of 1/3.
  • From all families with two children, one child is selected at random, and the gender of that child is specified. This would yield an answer of 1/2, and many experts agree.[3][4]
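
The two randomizing procedures can also be simulated. The Python sketch below is a Monte Carlo illustration, not an exact calculation: it generates random two-child families under the article's assumptions and estimates the probability of two boys under each procedure. The function name, trial count and seed are arbitrary choices made for the example.

```python
import random

def simulate(trials=200_000, seed=0):
    """Estimate P(both boys) under the two randomizing procedures."""
    rng = random.Random(seed)
    chosen_family = [0, 0]  # [families with at least one boy, of which two boys]
    chosen_child = [0, 0]   # [times the selected child was a boy, of which two boys]
    for _ in range(trials):
        family = (rng.choice("BG"), rng.choice("BG"))
        both_boys = family == ("B", "B")
        # Procedure 1: keep the family only if it has at least one boy.
        if "B" in family:
            chosen_family[0] += 1
            chosen_family[1] += both_boys
        # Procedure 2: select one child at random and report its sex.
        if rng.choice(family) == "B":
            chosen_child[0] += 1
            chosen_child[1] += both_boys
    return chosen_family[1] / chosen_family[0], chosen_child[1] / chosen_child[0]

p_family, p_child = simulate()
print(round(p_family, 3), round(p_child, 3))
```

The first estimate should land near 1/3 and the second near 1/2, up to sampling noise.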

Grinstead and Snell argue that the question is ambiguous in much the same way Gardner did.[14] Similarly, Nickerson argues that it is easy to construct scenarios in which the answer is 1/2 by making assumptions about whether Mr. Smith is more likely to be met in public with a son or a daughter.[4] Central to the debate of ambiguity, Nickerson says:

Bar-Hillel and Falk (1982) point out that the conclusion [that the probability is 1/3] is justified only if another unstated assumption is made, namely that the family not only is a member of the subset of two-child families that have at least one boy but that it is a randomly selected member of that subset, which is tantamount to assuming that all members of this subset [that is, the three members BB, BG, and GB] are equally likely to be represented on the street by a father and son. But this assumption would be reasonable only in a land where fathers who had a son and a daughter would walk only with the son.

— Raymond S. Nickerson

Scientific investigation

A 2005 article in The American Statistician presents a mathematician's solution to the Boy or Girl paradox.[13] The authors consider the version of the question posed by Marilyn vos Savant in Parade Magazine in 1997, and conclude that her answer is correct from a mathematical perspective, given the assumptions that the likelihood of a child being a boy or girl is equal, and that the gender of the second child is independent of the first.[13] This is in conflict with others' conclusion that a similarly worded problem is ambiguous.[1][3][4]

On empirical grounds, however, these authors call the solution into question. They provide data that demonstrate that male children are actually more likely than female children, and that the gender of the second child is not independent of the gender of the first. The authors conclude that, although the assumptions of the question run counter to observations, the paradox still has pedagogical value, since it "illustrates one of the more intriguing applications of conditional probability."[13] Of course, the actual probability values do not matter; the purpose of the paradox is to demonstrate seemingly contradictory logic, not actual birth rates.

The Boy or Girl paradox is of interest to psychological researchers who seek to understand how humans estimate probability. For instance, Fox & Levav (2004) used the problem (called the Mr. Smith problem, credited to Gardner, but not worded exactly the same as Gardner's self-admitted ambiguous version) to test theories of how people estimate conditional probabilities. However, their question was still ambiguous, since it didn't address why Mr. Smith would only mention boys.[2] In this study, the paradox was posed to participants in two ways:

  • "Mr. Smith says: 'I have two children and at least one of them is a boy.' Given this information, what is the probability that the other child is a boy?"
  • "Mr. Smith says: 'I have two children and it is not the case that they are both girls.' Given this information, what is the probability that both children are boys?"

The authors argue that the first formulation gives the reader the mistaken impression that there are two possible outcomes for the "other child",[2] whereas the second formulation gives the reader the impression that there are four possible outcomes, of which one has been rejected (resulting in 1/3 being the probability of both children being boys, as there are 3 remaining possible outcomes, only one of which is that both of the children are boys). The study found that 85% of participants answered 1/2 for the first formulation, while only 39% responded that way to the second formulation. The authors argued that the reason people respond differently to this question (along with other similar problems, such as the Monty Hall problem and Bertrand's box paradox) is the use of naive heuristics that fail to properly define the number of possible outcomes.[2]

References

  1. ^ a b c d Martin Gardner (1954). The Second Scientific American Book of Mathematical Puzzles and Diversions. Simon & Schuster. ISBN 978-0226282534.
  2. ^ a b c d e f g h Craig R. Fox & Jonathan Levav (2004). "Partition–Edit–Count: Naive Extensional Reasoning in Judgment of Conditional Probability". Journal of Experimental Psychology. 133 (4): 626–642. doi:10.1037/0096-3445.133.4.626. PMID 15584810.
  3. ^ a b c d Maya Bar-Hillel and Ruma Falk (1982). "Some teasers concerning conditional probabilities". Cognition. 11: 109–122.
  4. ^ a b c d e Raymond S. Nickerson (2004). Cognition and Chance: The Psychology of Probabilistic Reasoning. Psychology Press. ISBN 0805848991.
  5. ^ a b "Ask Marilyn". Parade Magazine. October 13, 1991; January 5, 1992; May 26, 1996; December 1, 1996; March 30, 1997; July 27, 1997; October 19, 1997.
  6. ^ Tierney, John (2008-04-10). "The psychology of getting suckered". The New York Times. Retrieved 24 February 2009.
  7. ^ a b Leonard Mlodinow (2008). Pantheon. ISBN 0375424040.
  8. ^ "The Boy or Girl Paradox". BBC. Retrieved 15 February 2008.
  9. ^ a b "Finishing The Game". Jeff Atwood. Retrieved 15 February 2009.
  10. ^ Debra Ingram. "Mathematical Paradoxes". www.csm.astate.edu/~dingram/MAA/Paradoxes.RPSmith.ppt. Retrieved 15 February 2009.
  11. ^ Nikunj C. Oza (1993). "On The Confusion in Some Popular Probability Problems". Retrieved 25 February 2009.
  12. ^ P.J. Laird; et al. (1999). "Naive Probability: A Mental Model Theory of Extensional Reasoning". Psychological Review.
  13. ^ a b c d e f Matthew A. Carlton and William D. Stansfield (2005). "Making Babies by the Flip of a Coin?". The American Statistician.
  14. ^ a b c Charles M. Grinstead and J. Laurie Snell. "Grinstead and Snell's Introduction to Probability" (PDF). The CHANCE Project.