Talk:Boy or girl paradox/Archive 2

From Wikipedia, the free encyclopedia

Perception

I think some people fail to understand (as even I do sometimes) that the probability is a factor of PERCEPTION and what your point of view is and what information you have as an observer.

Take this example (and please correct me if I'm interpreting the whole paradox wrong):

  • I go to the Smith house and I see two children playing. One is facing me (a boy) and one is facing away (I can't tell the gender).
  • I go home and tell you "The Smiths have two kids; At least one of them is a boy"

In this situation there are two probabilities in play. From the information I have, I know that "the one facing me" is a boy. Therefore the probability that the child facing away is also a boy (and thus both children are boys) is 1/2. However, I give you different information, and you must factor in all the possibilities for both children, leading you to conclude that there is a 1/3 probability that the Smiths have two boys. In reality, there is no "probability". A 2nd child already exists who is either a boy or a girl (not half or a third of a boy). The question is simply a statistical one, and that's why two people with different information on the same situation can come up with different probabilities in theoretical calculation.... right?

I have to say I understand how to apply it, and I understand the math, but I still have a hard time understanding the logic in real-world terms that if you know the gender of a specific kid, that changes the probability from if you simply know the gender of an unidentified ONE of the kids (ie: wherein "at least one" is a boy). TheHYPO (talk) 09:42, 4 December 2009 (UTC)

Well, the example is spot on, but your explanation is not quite right. The example I like to use for what you are describing involves drawing a card. I look at it and see the Queen of Hearts. I tell Ann nothing about it, Bob that it is red, Carl that it is a heart, and Dana that it is a queen. When asked for the probability that the card is the Queen of Hearts, we say (starting with me) 1/1, 1/52, 1/26, 1/13, and 1/4. And everybody is probably right. The reason we can all be right with different answers has nothing to do with statistics. It is conditional probability. We are not asking for the probability that the specific card I am holding is the Queen of Hearts - as you point out, it either is or it isn't. From each individual, we are asking for a different conditional probability. For Ann, it is the unconditional probability that a random card is the Queen of Hearts. For Bob, it is the conditional probability that a random card is the Queen of Hearts given that it is red. For Carl, it is the conditional probability that a random card is the Queen of Hearts given that it is a heart. And for Dana, it is the conditional probability that a random card is the Queen of Hearts given that it is a queen. So Bob is only answering a question about the 50% of the cards that are red. If I tell him a card is black, it doesn't count as a success or a failure in his probability experiment.
But I said "probably right." You can calculate a conditional probability by this formula: P(A|B)=P(A and B)/P(B). In order to do that, you need to first examine every possible outcome of your process (here, the 52 different cards) and decide whether it belongs to B or to ~B. For Dana's probability, clearly every card that belongs to B is a queen. But it is possible that I could draw the Queen of Spades and tell him "it is a spade" (i.e., I could rotate the four kinds of information I tell my four friends based upon the suit; I tell Ann about the face value of clubs, Bob the face value of diamonds, etc.). So a queen of spades would belong to ~B. This is the problem with describing a probability puzzle by using an example: it only provides a necessary condition for the event B in my conditional probability formula, not a sufficient condition.
And that is exactly the difference you described in your example. That a family has at least one boy is a necessary condition to be counted in B, but it is not sufficient. If the Smiths had a boy and a girl, you could have seen just the girl. You could not then tell me "They have at least one boy," so this family with "at least one boy" belongs in ~B. People can give different answers to such problems when the difference between necessary and sufficient conditions is ambiguous. JeffJor (talk) 20:07, 4 December 2009 (UTC)
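
The four answers in the card example can be checked by direct enumeration. What follows is only a minimal Python sketch, not something taken from the discussion; the helper name `conditional` and the deck encoding are illustrative assumptions. It also assumes that each hint is always given whenever it is true, which is exactly the "sufficient condition" assumption questioned above.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["spades", "hearts", "diamonds", "clubs"]
DECK = list(product(RANKS, SUITS))  # 52 equally likely cards


def conditional(target, given):
    """Return P(target | given) for one card drawn uniformly from DECK."""
    in_b = [card for card in DECK if given(card)]
    in_a_and_b = [card for card in in_b if target(card)]
    return Fraction(len(in_a_and_b), len(in_b))


def is_queen_of_hearts(card):
    return card == ("Q", "hearts")


# Assumption: the hint is stated for every card it applies to.
answers = {
    "Ann (told nothing)": conditional(is_queen_of_hearts, lambda c: True),
    "Bob (told red)": conditional(is_queen_of_hearts, lambda c: c[1] in ("hearts", "diamonds")),
    "Carl (told heart)": conditional(is_queen_of_hearts, lambda c: c[1] == "hearts"),
    "Dana (told queen)": conditional(is_queen_of_hearts, lambda c: c[0] == "Q"),
}

for who, probability in answers.items():
    print(who, probability)
```

If the hint were chosen by some other rule (for example, rotating the kind of hint by suit), the sets passed as `given` would change and so would the answers.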

Gardner

So first off, Gardner is not the origin of the paradox, nor is he the first person to mention it in print. I found the Scientific American issues from May and October 1959. They are reprinted, verbatim, in "More Mathematical Puzzles and Diversions", 1963 by Gardner.

He gives the answer to the problem in 1963 as "1/3." Later in the book, the discussion of ambiguity is also copied, verbatim, from the Scientific American issue. I suppose he decided not to go back and change the answer to the problem. That's a little inconsistent, but it doesn't argue against including a section on ambiguity. He mentions that the problem was published in a math textbook. There are two referenced textbooks; I'll see if I can find them. Finding the first publication of this problem would be nice.

On an aside, I wanted to comment on his discussion of ambiguity. His original question is precisely:

"Mr. Smith has two children. At least one of them is a boy. What is the probability that both children are boys? Mr. Jones has two children. The older child is a girl. What is the probability that both children are girls?"

According to Gardner's discussion, the statement (1) "At least one of them is a boy" is ambiguous, because it can lead to two possible predicates: (2) "It is factually true that at least one of the two children is a boy." (3) "We have learned about one of them. That one is a boy."

Interpretation (2) treats (1) as an unconditional, factual statement. The truth value of the predicate for this family is given. Interpretation (3) treats (1) as a statement of imperfect knowledge; the truth value depends on perspective, and our perspective is not provided.

Thus, if you treat (1) as a simple statement of fact, it is unambiguous. For all possible occurrences, a truth value for the predicate is assigned, and the ratio of frequencies determines the probability to be 1/3.

Alternatively, if you treat (1) as a statement about experiential, fallible and incomplete knowledge gained through unspecified means, then it is ambiguous. I think this is an unreasonable interpretation. Imagine Gardner had a list of all possibly existing families, and Gardner and I were sitting down and tabulating the frequency of families with two children and the frequency of families with two children and "At least one of them is a boy." Let's say we came across a family, the Bakers, who have two children, Mary and Jacob. According to his method, Gardner flips a coin and it comes up heads, so he states "At least one of them is a girl. It does not meet the criteria. It is not included in the sample space." Sitting next to him, I say "Wait. But one of them is a boy! It does meet the criteria." He responds, "Yeah, but we don't know that! We only looked at the girl." My response is clearly, "But it's true! How can you say you don't know it? Where in the problem is it specified that we should ignore information? How can you interpret 'At least one of them is a boy' in such a way as to ignore 33% of the observations for which it is true that 'At least one of them is a boy'?"

My argument is a reductio not against your (JeffJor's) formulation, but against Gardner's: if we interpret (1) as (3), we're left in the silly situation where it is simultaneously true and false that "At least one of the children is a boy." That is a logical contradiction that undermines that interpretation of the problem. --Thesoxlost (talk) 20:56, 25 February 2009 (UTC)

The main point of referencing Gardner is that he acknowledges the ambiguity; Whether you agree or disagree is not relevant. Neither is whether Gardner was the first to print a "Boy or Girl" problem. And neither is the fact that he first answered (the May column) only one way. That just goes to prove that the ambiguity is frequently overlooked, as he admitted in the October column. Also, the two columns were reprinted verbatim because it was in a collection of reprinted columns, not because he was reneging on the claim of ambiguity.
But you are wrong when you disagree with him. You asked "Where in the problem is it specified that we should ignore information"? It isn't. But I ask you, "Where in the problem is it specified that you have the information you think is being ignored?" That isn't specified, either. NOTHING IS BEING IGNORED. Gardner's point is that we can say "At least one is a boy" WITHOUT ACQUIRING information about both children; not a statement about conditions where you can/must/would/should/could have that information.
The people who answer his problem "1/3" are assuming they had access to complete information, but that it SOMEHOW BECAME INCOMPLETE. That is the point of asking the probability question, after all - some information is missing. Those who answer "1/2" are not assuming any information; instead, they are assuming that any additional information needed to solve the problem is distributed without prejudice towards gender, age, hair color, biology grades, or anything else. It is appropriate to assume non-prejudiced information this way; in fact, it is usually required. Do coin-flip questions need to state that the odds are 50% each way? No; because any other possibility is prejudiced.
Here's a more complex example: Say I sort a deck of cards into four piles of 13 cards each, shuffle them, and place them face-down. I tell you that there is an ace in each, and tell you which has the Ace of Spades. In other words, "This pile includes at least the Ace of Spades but no other Aces." If you choose a card at random from that pile, what are the odds it is the King of Spades? You could assume that each pile has all of the cards in a single suit. That is consistent with what I described, but not included in what I described. But to assume it would be prejudicial. That doesn't mean I didn't do that, or that I didn't use any of a vast number of other strategies that would affect the odds of the King of Spades. But you can't assume any one of them. You have to assume that the unmentioned 48 cards are distributed without prejudice in the remaining 48 possible positions. The answer is 1/52: The odds of picking a non-Ace are 12/13, and the odds that a non-Ace is the King of Spades are 1/48. 12/13*1/48=1/52. That is how you get to the answer "1/2" for Gardner's problem. You don't assume any more than what is said: that you know the gender of one child. You then assume that anything else that requires a distribution is distributed without prejudice.
Your analysis of the ambiguity is flawed, in a number of ways. It is not simultaneously true and false that "At least one of the children is a boy." Not saying "at least one boy" does not mean "at least one boy" is false. And regardless, you added an assumption to Gardner's argument, that the problem must have an unambiguous solution. All you would have proven, if your reductio was valid, was that (at least) one of the assumptions is incorrect. One already is: the one you added.
It really doesn't matter which answer to Gardner's question is "more" correct or even if one is "totally" correct. The Boy or Girl Paradox is that different people see different problems in many of these ambiguous problem statements. Trying to disprove Gardner is not being productive. The Encyclopedia article should only associate a single answer with a problem wording that is unequivocally unambiguous. It should point out what the possible ambiguities are in any other wording, whether or not one can argue (as you did) that one interpretation might be better. And your attempts to do that are obfuscating the real issue. Can we please get back to it? I suggested an outline for the article consistent with references, but all you have done is try to disprove the references.

JeffJor (talk) 21:07, 26 February 2009 (UTC)
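
The 1/52 figure in the pile example can be reproduced two ways. The snippet below is a hedged sketch in Python, assuming (as the comment does) that the 48 non-ace cards are spread uniformly at random over the 48 non-ace positions; the variable names are illustrative only.

```python
from fractions import Fraction
from math import comb

# Assumption: the pile holds the Ace of Spades plus 12 cards drawn
# uniformly at random from the 48 non-ace cards.
NON_ACES = 48
OTHER_SLOTS = 12
PILE_SIZE = 13

# P(King of Spades is one of the 12 non-ace cards in this pile)
p_king_in_pile = Fraction(
    comb(NON_ACES - 1, OTHER_SLOTS - 1),  # piles that contain the king
    comb(NON_ACES, OTHER_SLOTS),          # all possible sets of 12 non-aces
)

# P(drawing the king) = P(king in pile) * P(picking that card out of 13)
p_draw_king = p_king_in_pile * Fraction(1, PILE_SIZE)

# The route used in the comment: P(non-ace drawn) * P(that non-ace is the king)
p_comment_route = Fraction(12, 13) * Fraction(1, 48)

print(p_draw_king, p_comment_route)
```

Any other dealing strategy (such as one suit per pile) would change `p_king_in_pile`, which is the point being made about unstated assumptions.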

:Your analysis of the ambiguity is flawed, in a number of ways. It is not simultaneously true and false that "At least one of the children is a boy." Not saying "at least one boy" does not mean "at least one boy" is false. And regardless, you added an assumption to Gardner's argument, that the problem must have an unambiguous solution. All you would have proven, if your reductio was valid, was that (at least) one of the assumptions is incorrect. One already is: the one you added.

— JeffJor
Logically, that's true, but when you compute probability you need to know the frequency of occurrences. You need to apply a boolean true/false to a predicate for every occurrence. The predicate here is "At least one of the children is a boy." Every occurrence needs to be assigned a truth value in order to calculate the relevant frequencies. If you interpret the predicate "At least one of the children [in family X] is a boy" as "One randomly selected child [in family X] is known to be a boy", then many occurrences will be assigned the value of true for "At least one of the children [in fact] is a boy," and false for "One randomly selected child [in family X] is known to be a boy."
Now you are saying "Not knowing predicate X" is not the same as saying "predicate X is false." Of course, this is correct: many statements will be "true" for the "it is true that" and false for "we know that." And that's the point: If you equate "At least one of the children is a boy" (which makes a simple, categorical statement of fact) with "We know that at least one of the children is a boy," then you will always run into problems because no predicate(X) is ever equivalent to we_know(predicate(X)). In order to "interpret" "At least one child is a boy" as "We know that ...", you have to assume that the two are logically equivalent. And if they are not logically equivalent, then you are changing the question. And, as you pointed out, clearly the two are not logically equivalent.
Now, I'm not proposing adding any of my criticism of Gardner's discussion of ambiguity to the article. I cited an article last night that argues that the reason people make the mistake is not because of ambiguity--this is clear; many people do not see the question as ambiguous in the way you describe, and get the answer wrong--but rather because the heuristics they use fail because the question intentionally gives them the false impression that there are only two possible outcome states. "The Boy or Girl Paradox is that different people see different problems..." is not true. Even when the question is unambiguous, the paradox still exists. Ambiguity is not central to this issue. This is why Mlodinow, Vos Savant, Bar-Hillel, Fox, many textbooks, and many sites online never discuss the ambiguity. The only source for ambiguity is Gardner, and all the anti-Marilyn people who picked it up as dogma because they want to prove someone with an incredibly high IQ to be wrong. And even Gardner never changed the question or the answer to address the supposed ambiguity.
My purpose for explaining why Gardner's analysis is (at the very least) suspect is to try to put it in perspective: it is not the consensus view, and it is not central to the discussion. Reasonable, smart people with valid, well-thought out positions can fully understand your position and disagree with it; most people who have published something on this topic have disagreed with you. This issue is a relevant WP:RS publication that may justify a section, but to emphasize it would put undue weight on an opinion that is dubious and only expressed by a small subset of the authors who discuss the issue. To allow this issue to color the discussion of the paradox elsewhere would be inappropriate.--Thesoxlost (talk) 15:32, 27 February 2009 (UTC)

Gardner's analysis is not, in the least, suspect. Your attack on it is. Nobody in any of your references both (1) addresses the ambiguity and (2) disagrees with me. Any that seem to disagree with me are either using a different statement of the problem that is (almost) unambiguous, or one that is ambiguous but they are treating it as though it is not. And it is not me who is originating that thought, it is Gardner, Bar-Hillel and Falk, and Grinstead and Snell. There are others as well. What the study you cited concluded is that even for the (almost) unambiguous statement, many readers still see the other interpretation as though it was ambiguous. So yes, it is important to the article. JeffJor (talk) 16:09, 27 February 2009 (UTC)

If you choose to believe that "X is Y" is ambiguous because "How can we really know anything about X, anyhow?" then I don't think we will ever agree. Feel free to edit the article as you see fit, keeping consensus in mind. I've made a number of edits recently, including a more detailed section on ambiguity. I hope it reflects the current consensus following our discussion. As is, I think we can agree the article has improved?--Thesoxlost (talk) 03:38, 1 March 2009 (UTC)

Nobody is "choosing to believe 'X is Y' is ambiguous because 'How can we really know anything about X, anyhow?'" Sometimes, you only need to look at one child to make the statement "At least one is a boy." If that is the case, the answer is 1/2. If the problem statement doesn't say that two children were observed, and why you always would say "at least one is a boy" when at least one is also a girl, then it is ambiguous.

The article is far from improved, because it keeps mixing the references up by trying to avoid making such a statement. And it is wrong in some places. The current question 2, "A (random) family has two children, and one of the two children is a boy. What is the probability that the younger child is a girl?" IS STILL AMBIGUOUS. How was "at least one" determined? Many of the references (John Tierney's, the BBC's) are ambiguous. How was "at least one" determined? Some (Mlodinow's) are wrong: it says that you only used one child to determine "at least one." Fox & Levav avoided most of the ambiguity by using "Mr. Smith says: 'I have two children and at least one of them is a boy.' Given this information, what is the probability that the other child is a boy?," but there still is some ambiguity (What would Mr. Smith say if he had two girls?).

But I'm to the point of giving up, because you will never let the article say what it needs to: that the only problem is that people confuse whether one or two children's genders were used to determine "at least one." I'll leave the article to sit for a while, to see what others do to it, then come back to see what needs to change. JeffJor (talk) 21:23, 1 March 2009 (UTC)


I think Thesoxlost is right on this issue. Surely when a question states something as a fact, this is not an invitation to the reader to come up with scenarios whereby the fact could have been discovered in a manner that leads to extra conditions/information?

Would this be a suitable analogy?: Two cards have been removed from a 'normal' deck of cards. At least one of them is red. What is the probability that they are both red?

I think the answer to my analogy is unambiguously 25/77. (With reasonable assumptions and no pedantry, that is.) Am I right?
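
For what it is worth, the 25/77 figure can be confirmed by counting pairs. This is a small Python sketch under the same "reasonable assumptions": every pair of cards is equally likely to have been removed, and "at least one is red" is stated whenever it is true. The list encoding of the deck is just an illustrative choice.

```python
from fractions import Fraction
from itertools import combinations

# 26 red and 26 black cards; only the colour matters here.
deck = ["red"] * 26 + ["black"] * 26

# All unordered pairs of distinct cards, identified by position in the deck.
pairs = list(combinations(range(52), 2))

at_least_one_red = [p for p in pairs if "red" in (deck[p[0]], deck[p[1]])]
both_red = [p for p in at_least_one_red if deck[p[0]] == deck[p[1]] == "red"]

p_both_red = Fraction(len(both_red), len(at_least_one_red))
print(len(both_red), len(at_least_one_red), p_both_red)
```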

And returning to the Boy/Girl paradox, to me it seems that the answer to the wording of the Mr. Smith question used by Gardner is unambiguously 1/3.

The article should not state as a fact that the question is ambiguous. Whether it should say something like "some experts consider the question ambiguous" should perhaps depend on how many experts do consider that. (Is it only Gardner himself and JeffJor?)

Disclaimer: I Am Not A Mathematician.

The ambiguity is in determining what a 'normal' deck of families would be from which this particular family is drawn. Therefore, in your analogy it has already been removed. Rp (talk) 17:49, 6 January 2010 (UTC)

Open4D (talk) 15:06, 6 January 2010 (UTC)

With all due respect, Thesoxlost is not correct. It isn't whether any one individual feels the problem can be correctly interpreted in only one way that makes a problem unambiguous, it is whether everybody agrees. Since not all do here, that makes it ambiguous. But let me provide a more concrete example of the ambiguity. To do that, I have to expand the range of the variable from two values (boy/girl or heads/tails), because somehow people feel that since "girl" is the same as "not boy," you don't need to concern yourself with possibilities surrounding the event "girl."
Say I perform this probability experiment many times: I will roll two fair, N-sided dice behind a screen. I will then tell you "At least one of the dice landed on an X," where X is a number between 1 and N. I then ask you "What is the probability I rolled doubles?" If you use the solution method used in the article, the answer is 1/(2N-1) (of N^2 possible rolls, (2N-1) contain an X, and one of those is double X's). But with fair dice, I will roll doubles 1/N of the time. There are only three ways to account for this discrepancy: (1) Thesoxlost's solution is definitely wrong, (2) Thesoxlost's solution is definitely correct, but the mathematics of probability fails to predict the frequency of occurrence for well-behaved random events, or (3) There is an ambiguity in the way the problem is phrased. I choose to accept the third alternative. Thesoxlost will probably even try to tell you why the way I set it up is incorrect somehow - but that is the ambiguity.
Again, the probability we are asked for in the Boy or Girl paradox is not about Mrs. Smith's (or whomever's) family. It is about the set of possible families for whom the conditions of the problem statement are satisfied, and Mrs. Smith is just an example of that set. Those conditions are that we know that some facts apply to a randomly selected family from that set. It does not say that the family was selected because those facts apply, it only says that they do. In other words, they are necessary conditions on the selection process, but not sufficient conditions. The facts do apply, but they might also apply to a family we couldn't have selected for some reason. Do we include Chinese families? Adopted? Does it matter? This distinction very seldom matters to such problems, because the reasons you could not have selected a family this way are independent of the genders involved. For example, I could claim that "named Smith" was an additional fact required in the condition; but the distribution of boys named Smith should be the same as boys with any other name, so it doesn't matter. That doesn't mean the conditions are sufficient, just that it doesn't change the answer.
But technically, such facts need to be included. The conditions we must apply here are that "you learn the family has ALOB," where ALOB means "at least one boy." A 100% correct solution, including this normally-too-pedantic term in parametric form, is:
P(two boys|family has ALOB AND you learn ALOB)
= P(two boys AND family has ALOB AND you learn ALOB)/(P(you learn ALOB|Family has ALOB)*P(Family has ALOB))
= (1/4) / (??? * 3/4)
In Thesoxlost's solution, the missing "???" (the probability that you learn ALOB, given that the family has ALOB), is 1. So the answer is (1/4) / (3/4) = 1/3. But if the method by which you learn ALOB is that you meet exactly one child, then the probability you will learn ALOB when the family has ALOB is 2/3 (all of the times when there are two boys, and 1/2 of the times when there is one of each). So the solution is (1/4) / (2/3 * 3/4) = (1/4) / (2/4) = 1/2. Thesoxlost feels the assumption that this probability is 1 is justified because he has never had to consider such facts in a problem like this. He feels that a fact being a given in a problem makes it both a necessary and sufficient condition. Under that assumption, the correct answer is 1/3. But that means that the answer must be 1/(2N-1) in my dice problem, because all I "gave" you was "at least one X."
And please don't think I am calling Thesoxlost wrong. "Not correct" is not the same thing, when the problem is ambiguous. But those who say "1/2" are just as "not wrong" (and "not correct") as those who say "1/3," unless they can justify the assumptions they make somehow. JeffJor (talk) 17:43, 4 March 2010 (UTC)
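
The dependence on the missing "???" term, and the dice version of the same point, can be worked out exactly. The following Python sketch is only an illustration under stated assumptions: the four family types are equally likely, a two-boy family always leads to learning ALOB, and the two reporting protocols for the dice ("always" and "one_die") are hypothetical names for the two readings discussed above.

```python
from fractions import Fraction
from itertools import product


def p_two_boys(q):
    """P(two boys | you learn ALOB).

    q is P(you learn ALOB | the family has exactly one boy); it is the
    assumption-dependent part of the '???' term in the formula above.
    """
    p_bb = Fraction(1, 4)       # two boys: ALOB is always learned
    p_one_boy = Fraction(2, 4)  # BG or GB: ALOB is learned with probability q
    return p_bb / (p_bb + q * p_one_boy)


def p_doubles(n, x, protocol):
    """P(doubles | told 'at least one die shows x') for two fair n-sided dice.

    'always':  the statement is made whenever x appears on either die.
    'one_die': one die is picked at random and its value is what gets announced.
    """
    numerator = Fraction(0)
    denominator = Fraction(0)
    for a, b in product(range(1, n + 1), repeat=2):
        if protocol == "always":
            p_report = Fraction(int(x in (a, b)))
        else:
            p_report = Fraction((a == x) + (b == x), 2)
        denominator += p_report
        if a == b:
            numerator += p_report
    return numerator / denominator


print(p_two_boys(1), p_two_boys(Fraction(1, 2)))
print(p_doubles(6, 3, "always"), p_doubles(6, 3, "one_die"))
```

With `q = 1` the function reproduces 1/3, with `q = 1/2` it reproduces 1/2, and the dice function gives 1/(2N-1) or 1/N depending on the protocol chosen.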

Jacob

The recent addition of a "third question" puzzles me. "A random two-child family with at least one boy who's name is Jacob is chosen. What is the probability that it has a girl? Does the additional bit of information that the boy's name is Jacob change anything?" The text as it currently stands claims that the probability is thereby changed from 2/3 to 1/2. How can this be? What is the difference between calling the boy "Jacob" and calling him "a boy"? I can see the logic of the table, but I am not convinced! Who can convince me? SNALWIBMA ( talk - contribs ) 17:18, 20 November 2008 (UTC)

Or can I convince myself? In fact, saying that the child is named Jacob is the equivalent of saying "This is Jacob. He has a sibling. What are the odds that his sibling is a girl?" It is in fact the exact equivalent of saying "Here is a boy. He is the elder of two siblings. What are the odds that his sibling is a girl?" (i.e., Question 1). OK, I think I'm convinced. Would it be useful to express it something like that in the article? SNALWIBMA ( talk - contribs ) 17:23, 20 November 2008 (UTC)
By all means, as a wiki editor, be bold in making edits. But keep in mind that the fundamental problem here is that our intuitions aren't very good. That's why people get #2 wrong so often. An argument that you find intuitive for question 3 may not be correct. I don't think "This is Jacob. He has a sibling" is equivalent (logically) to "Here is a boy. He is the elder of two siblings." Just be sure that the intuition that you feel justifies #3 is correct and doesn't justify the wrong answer for #1 and #2. --Thesoxlost (talk) 18:09, 20 November 2008 (UTC)
Thanks! I am wary of making edits to this article, however, as I really am just an amateur in this field. "He is called Jacob" and "He is the elder of the two" are clearly different, but they are equivalent in the sense that in the context of the question they serve only to identify the boy. Neither age nor name is in fact important. What matters is that they identify this boy, leaving you to speculate only about the sex of the other child. Both Q1 and Q3 can in fact be reduced to "Here is a boy. He has a sibling. What are the odds that his sibling is a girl?" In Q1/Q3 a specific boy is paraded in front of you, and you are asked to work out only the sex of the sibling. And that, in effect, is the same as working out the odds for a randomly selected child - i.e. about 0.5. In Q2, on the other hand, no specific child is already identified, and you have to deal with a more complex set of possibilities. To put it another way, I reckon Q3 is really the same as Q1. And a question which began "A random two-child family with at least one boy, who has blue eyes..." would be the same. Wouldn't it? SNALWIBMA ( talk - contribs ) 19:49, 20 November 2008 (UTC)


Ah, both name and age are important! That's the trick. We have the intuition they don't, but they do. We can use a frequentist definition of probability here: the probability of the other child being a girl is the ratio of the frequency of successful events (i.e., events in which one child is a girl) to the frequency of all possible events. The "events" in this case each specify a pair of children; by specifying a name, or an eye color, you don't identify an individual, you restrict the space of successful and possible cases. That is, you are changing both the denominator and the numerator of this probability equation. And because the likelihood of a girl being named Florida (or anything) is very small, it doesn't change the numerator and the denominator equally. You're right that any bit of information about the boy would change the probability of having a girl: eye color; height; anything.

Think about it like this: Say I told you that a family with two children has a girl with an extremely unlikely name. How many girls do they have? What would be your best guess? If they had two girls, they would have twice the likelihood of having a girl with an extremely unlikely name. So wouldn't you be wise to guess two? That's the effect that comes into play here. That's why name actually does matter. --Thesoxlost (talk) 22:58, 24 November 2008 (UTC)
You are correct that neither age nor name is important. But the identification of "this boy" isn't important either (whatever that might mean). What is important is whether it's an event (such as an observation or memory from an observation) that reveals that one of the children is a boy or girl, or whether the information that at least one child is a boy (or girl) is simply given to you as an initial fact. iNic (talk) 02:16, 23 November 2008 (UTC)
The third question sounds like nonsense to me. You can replace the letter B with any letter of the alphabet and it would not make any difference to the question. The only thing that (in my eyes) can make a difference is knowing or not knowing the order the siblings were born in. Any statements about the probabilities of children having certain names are well outside the scope of these puzzles (as is assuming the probability of having a boy or a girl is anything other than 50:50). Someone with more knowledge than me should either explain or delete the third question section. 09:30, 15 December 2009
The third question does make sense. The question is controversial because it seems counterintuitive that supplying the name (or weekday of birth, or other seemingly irrelevant information) of one child can have an impact on the probability of the "other" child being a certain gender. In fact, much of the controversy is caused by ambiguous wording of the question, and supplying the name only impacts the probability if the question is understood in a certain way. But the idea is this: In question two, we eliminate from consideration all girl-girl pairs, and when looking at the remaining pairs, which are equally boy-girl, girl-boy or boy-boy, 1/3 will be boy-boy. In question three, however, we do not only eliminate all girl-girl pairs, we also eliminate ALMOST all boy-girl and girl-boy pairs, but we keep approximately twice as large a portion of the boy-boy pairs (because if you have two boys, the probability that at least one is named Jacob is almost twice the probability it would be if you only had one). When looking at the remaining pairs, there will be a boy-boy group that is approximately as big as the girl-boy and boy-girl groups combined, thus you end up with a 50-50 distribution. Again, the whole point is that this is counterintuitive and requires a specific interpretation of the question. Anyway, there are so many webpages out there that explain this in detail, the most intuitive I found being this: http://www.decisionsciencenews.com/2010/05/28/tuesdays-child-is-full-of-probability-puzzles/. 86.52.86.238 (talk) 16:18, 12 June 2010 (UTC)
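
The elimination argument above can be turned into an exact calculation. This Python sketch is only an illustration under explicit assumptions that are not stated in the question itself: each child is a girl with probability 1/2, each boy is independently named Jacob with a hypothetical probability `p`, two brothers are allowed to share the name, and the family is counted whenever it has at least one boy named Jacob.

```python
from fractions import Fraction
from itertools import product


def p_girl_given_jacob(p):
    """P(family has a girl | at least one boy named Jacob).

    p is the assumed probability that any given boy is named Jacob.
    """
    # Each child is independently a girl, a boy named Jacob, or another boy.
    child_types = {
        "girl": Fraction(1, 2),
        "jacob": p * Fraction(1, 2),
        "other_boy": (1 - p) * Fraction(1, 2),
    }
    has_jacob = Fraction(0)
    has_jacob_and_girl = Fraction(0)
    for first, second in product(child_types, repeat=2):
        weight = child_types[first] * child_types[second]
        if "jacob" in (first, second):
            has_jacob += weight
            if "girl" in (first, second):
                has_jacob_and_girl += weight
    return has_jacob_and_girl / has_jacob


print(p_girl_given_jacob(Fraction(1, 100)))  # rare name
print(p_girl_given_jacob(1))                 # every boy is "Jacob": back to question two
```

Under these assumptions the result is 2/(4 - p): it approaches 1/2 as the name becomes rare and returns to 2/3 when the name carries no information.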

Source of paradox

There are two children A and B. We don't need to know which one is older.
Please if possible emphasize the difference between the following statements:
There exists a child, which is either A or B, and is a boy. What is the probability that both children are boys?
Child A is a boy. What is the probability that both children are boys?
123unoduetre (talk) 20:15, 19 September 2010 (UTC)

Mu. With only two children, it doesn't make sense to speak of probability. Rp (talk) 19:15, 27 September 2010 (UTC)

Mistake in "Bayesian approach"?

The probabilities in the "Bayesian approach" are currently: , , ,

And the end result is:

Shouldn't the probabilities be: , , ,

And the end result:

I am not fully sure that I am correct, which is why I don't edit the page directly. If you know probability theory, it should be straightforward to verify. Intuitively, the reason that the current probabilities are wrong is that if the first brother is named Jacob, the probability that the second brother is named something other than Jacob is 1, not (1-p). This is because no more than one brother is allowed to be called Jacob. The explanation is similar for the case where the second brother is called Jacob. 83.233.152.179 (talk) 15:39, 1 January 2010 (UTC)

Where are you getting your probabilities? Do you know how to use Bayes' Theorem? Here we go:
P(A | B) = ( P(B | A) * P(A) ) / P(B)
P(A | B) = probability of having two boys if at least one child is a boy
P(B | A) = probability of having at least one child be a boy if I have two boys
P(A) = probability of having two boys
P(B) = probability of having at least one boy
Given the combinations { {B, B}, {B, G}, {G, B}, {G, G} }:
P(A) = 0.25
P(B) = 0.75
P(B | A) = 1
Thus: P(A | B) = (1 * 0.25) / 0.75 = 0.25 / 0.75 = 0.333 = 1/3 Dpru (talk) 18:22, 15 October 2010 (UTC)
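
For anyone who wants to check the arithmetic above mechanically, here is a minimal Python sketch. It assumes, as the comment does, that the four ordered combinations are equally likely; the helper name prob is only illustrative.

```python
from fractions import Fraction
from itertools import product

# The four equally likely (older, younger) combinations listed above.
families = list(product("BG", repeat=2))

def prob(event):
    """Probability of an event (a predicate on a family) under the uniform model."""
    return Fraction(sum(1 for f in families if event(f)), len(families))

p_a = prob(lambda f: f == ("B", "B"))   # P(A): two boys
p_b = prob(lambda f: "B" in f)          # P(B): at least one boy
p_b_given_a = Fraction(1)               # two boys always include at least one boy

# Bayes' theorem: P(A | B) = P(B | A) * P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b
print(p_a, p_b, p_a_given_b)
```

Using exact fractions avoids the rounding in 0.333 and gives 1/3 directly. Note that this only covers the "at least one boy" reading; it says nothing about how the information was obtained.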

Computer printout example

Let me see if I, a non-mathematician, have got some of this right. If I were a new teacher at a large co-educational high school, one which had a computerized student database, I could ask the computer to give me a student report. Suppose I asked it to generate a printout of names of families who have two and only two children, with at least one being a boy. I look at one of those family names on the printout and think about the odds of one of their children being a girl. I can now see, I think, why that would be a counter-intuitive 2/3. It is simply because the computer would firstly have selected all the families who have two and only two children, and then deleted from that list all those where both kids are girls (the G-G possibility). The odds of a two-child family being all girls is 1/4 we can all agree. So, having deleted the G-G option, that leaves the printout with the B-B, the B-G, and G-B possibilities intact. That would mean that 2/3 of the families on that printout had a girl in their family, and only 1/3 had two boys.

OTOH, if I questioned a boy student and he said that he had only one sibling, then there would be 50% chance that it would be a girl. The two results are different because the way we set up the information gathering procedures are different. Myles325a (talk) 05:29, 21 June 2010 (UTC)
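
The two procedures described above can be compared by exact enumeration. This is only a sketch under stated assumptions: all four two-child families are equally likely, and in the second scenario the student being questioned is equally likely to be either child of the family (that sampling rule is an assumption added here, not something the comment spells out).

```python
from fractions import Fraction
from itertools import product

families = list(product("BG", repeat=2))  # (older, younger), equally likely

# Scenario 1: the printout lists every two-child family with at least one boy.
printout = [f for f in families if "B" in f]
p_girl_on_printout = Fraction(sum("G" in f for f in printout), len(printout))

# Scenario 2: a student is picked; index i says which child of the family it is.
meetings = [(f, i) for f in families for i in (0, 1)]
met_boy = [(f, i) for f, i in meetings if f[i] == "B"]
p_sibling_is_girl = Fraction(sum(f[1 - i] == "G" for f, i in met_boy), len(met_boy))

print(p_girl_on_printout, p_sibling_is_girl)
```

In the second scenario a two-boy family appears twice among the possible meetings (once per boy), which is what brings the answer back to 1/2.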

Sounds correct to me. And the ambiguity is about which of these two scenarios is implied by "Mr. Smith has two children. At least one of them is a boy. What is the probability that both children are boys?". It is impossible for me to see how the original question would suggest the second interpretation while clearly suggesting the first interpretation as if "somebody with universal knowledge" was simply stating the fact that "At least one of them is a boy.". The second interpretation requires that you inject additional information into the problem; information that isn't there about how we arrived in the situation of the problem. I have always been a big fan of Martin Gardner, but I think he dropped the ball here by apparently bowing to the pressure of admitting that another interpretation of the problem is possible, or rather, logically defendable.
For comparison, if the problem would be "Mr Smith flips a coin and bets that the outcome will be heads. What is the probability that he wins the bet?", then I think nobody would suggest that the answer is "More than 0.5". Yet it is easy to defend this answer with the "interpretation" that Mr Smith probably wouldn't be flipping coins and placing bets on them unless he had something to gain by it. (What would you think if you met such a Mr Smith in the real world, trying to make a bet with you. Would you really think your chances were 50-50?) But the second "interpretation" of the boy-girl problem uses exactly this sort of "logic": it injects additional assumptions into the context of the problem.
And yes, I know it doesn't really matter for the article what I think and that I am basically just repeating the arguments of Thesoxlost. But I think they are worth repeating in different words. Too bad that for many articles the most valuable information is on the discussion page. AlexFekken (talk) 13:16, 16 September 2010 (UTC)
Most people who actually consider the interpretation objectively feel the opposite is true. Almost universally, sources that accept interpretation 1 do not address the possibility of ambiguity in their work, and sources that do address it (many were listed in the article besides Gardner, his was significant only because it was among the earliest widely-published sources) seem to prefer interpretation 2 even if they finagle a reason to say the answer is really 1/3 (see the May 2010 edition of Devlin's Angle for an example of finagling after admitting the ambiguity). And when people apply intuition to the meaning and not the solution, it's nearly 100% for the second interpretation (and I'll provide backing for that claim below).
In their book (the link in reference 14), Grinstead and Snell say "It is not so easy to think of reasonable scenarios that would lead to the classical 1/3 answer." The short explanation for why, is that the first interpretation requires the information to essentially be the answer to a question like "Does Mr. Smith have at least one boy?" Because whoever provides the information has to (1) have already decided to tell you "at least one boy" in all cases where it is true, (2) have knowledge of both genders (which is the better term because it can be based on outward appearance, and not physical examination) so that it can know the fact in all cases it is true, but then (3) purposely withholds this additional knowledge it has. Many versions (Mlodinow's "You recall that one is... but you don't recall whether both are..." comes to mind) actually don't satisfy all of these criteria, and should be answered "unambiguously 1/2."
The second interpretation, on the other hand, places no requirements on the bearer of the information other than to make an observation. You are misinterpreting the fact that, in order to get an answer, we have to do more work in the second case. You are taking that to mean it "injects more information," when the truth is the extra work is needed to make up for there being less information: it places none of those three requirements, which streamline the solution.
But let's get straight what the ambiguity is, which I feel Thesoxlost misinterprets. In order to calculate a conditional probability, you have to define the event that corresponds to what you know about the outcome. But an event is not an outcome itself, it is a set of possible outcomes. Rolling two dice, an outcome might be (2,5) which is an element of the event "Seven," but getting the event "Seven" does not mean a (2,5) was rolled. In a similar way, "at least one boy" has to be a property of every outcome we put into our event, but we don't know if every outcome with that property should be put into it. Most problem statements are unclear on that point, and so are ambiguous.
And we can test your intuitive reaction to this difference in interpretation. There is a new version of this puzzle that has had a lot published about it recently. It probably should get included (something similar, but not as well defined, was recently removed because it didn't have enough written about it). Suppose, instead of "one is a boy," Mr. Smith said "One is a boy born on a Tuesday." Does that change the answer? Under the first interpretation, it does, because in a two-boy family you are almost twice as likely to find a boy born on Tuesday, than in a one-boy family. The answer changes to 13/27, which is almost 1/2. Now, if you find yourself objecting on the grounds that the boy's birth date can't affect his sibling's gender, and I have not run across anybody who has said their initial reaction was not that, then you are intuitively applying the second interpretation to the new information. Many will even try to use the first interpretation for "boy" and the second for "born on Tuesday," so that they can still get 1/3 as the answer. JeffJor (talk) 20:08, 23 September 2010 (UTC)
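
The 13/27 figure under the first interpretation can be checked by counting. A minimal sketch, assuming sexes and weekdays are uniform and independent; the number 2 used for Tuesday is an arbitrary label.

```python
from fractions import Fraction
from itertools import product

TUESDAY = 2                                    # arbitrary label for one of 7 weekdays
children = list(product("BG", range(7)))       # 14 equally likely (sex, weekday) kinds
families = list(product(children, repeat=2))   # 196 equally likely ordered families

# First interpretation: keep every family with at least one Tuesday-born boy.
has_tuesday_boy = [f for f in families if ("B", TUESDAY) in f]
both_boys = [f for f in has_tuesday_boy if f[0][0] == "B" and f[1][0] == "B"]

p_two_boys = Fraction(len(both_boys), len(has_tuesday_boy))
print(len(families), len(has_tuesday_boy), len(both_boys), p_two_boys)
```

The count is 27 rather than 28 because the family in which both children are Tuesday-born boys would otherwise be counted twice.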
Let's first focus on your final paragraph as you seem to think it would convince me I am wrong. But I totally agree with the answer of 13/27 and I would arrive at it in exactly the same way as I arrive at the answer of 1/3: by considering the "universe" of all possible scenarios and assigning each of them the same probability (out of 196 equally likely scenarios in our "universe" 27 satisfy the data given and 13 of those lead to a boy vs 4 -> 3 -> 1). It doesn't matter that the "discrepancy" between the two answers is counter-intuitive because we humans aren't very good at intuition anyway. The reason that the probability changes is precisely the injection of additional information, because having a boy born on a Tuesday is a much rarer a priori event, which drastically changes the probability of the second child being a boy relative to that event.
To make 1/2 the "correct" answer we have to ask ourselves "how did we get to know the information that was given" and come up with a "reasonable scenario" as Grinstead and Snell apparently call it. The argument of Grinstead and Snell then boils down to saying that the most reasonable scenarios that lead to the information given are consistent with interpretation 2 and not 1. That may be the case but that wasn't what was asked for. So they are the ones injecting additional information by saying that there has to be a reasonable scenario. Perhaps many people try to answer a question such as this "intuitively" by imagining a reasonable scenario but in doing so they are creating a different problem.
Interpretation 1 on the other hand uses just the information given and doesn't care how it was obtained. That it may be difficult or even implausible to get this information is totally irrelevant and no excuse to change the interpretation by making up a "reasonable scenario" and injecting that into the problem. And again: if you do that then you should also admit that my Mr Smith is more likely to win his coin flipping games because that is the most reasonable scenario to explain why Mr Smith is flipping coins for money in the first place.
So if you want to argue that human intuition is very flawed then I totally agree; that's nothing new. But you haven't convinced me in any way that interpretation 2 is logically defendable. AlexFekken (talk) 09:10, 25 September 2010 (UTC)
The point is that neither interpretation is defendable. "How it is obtained" is critical to getting the answer, as demonstrated by the Monty Hall Problem. If you try to hide behind the veil of "considering the 'universe' of all possible scenarios [that can produce the observed result] and assigning each of them the same probability," then the answer to that problem is "each door has probability 1/2." That's wrong. If the problem doesn't say how the information is obtained (as in the MHP, where it implies the host chooses randomly between two goats when he can), it is ambiguous.
But ultimately, it doesn't matter what you or I think. Every published source that considers the question says it is ambiguous. So for the purposes of the article, it is ambiguous. JeffJor (talk) 22:42, 25 September 2010 (UTC)
I do agree with you on most points and thanks for helping me put my finger on the issue. Starting with a universe that has 4 equally likely outcomes (B-B, B-G, G-B, G-G) is not a veil behind which I am hiding, but it is still the correct starting point. The real issue is whether the additional information that "one is a boy" was more likely (i.e. twice as likely) to be the result of a B-B situation than either of the B-G and G-B situations. And that depends on how the information is obtained, like you said. My idea that the information could simply "be there" is correct from a set theoretical point of view, but for determining the impact on the statistics more is needed to be unambiguous. I am with you now on that.
And I know it doesn't affect the article but that's OK because I get a lot of additional value out of the discussion pages. More people should at least read them!
Back to Myles: in the computer printout situation B-B wasn't more likely because all possibilities were listed, but when you actually meet a boy it is more likely because you could have met either the older or younger boy if there are two. 220.239.72.7 (talk) 01:05, 26 September 2010 (UTC)
It most definitely is such a veil, because you use it to ignore the possibilities where you would learn there is at least one girl, and that is part of the ambiguity. You have to account for all possibilities that are implied in the problem statement, even those that contradict what it said happened. Monty Hall's "full universe" includes cases where the car is behind Door #1 and the host opens Door #2, even though the problem statement says he opened Door #3. And it is that fact that reduces the probability the car is behind Door #1 when he opens Door #3 from 1/2 (the wrong answer) to 1/3 (the right one). The Three Prisoner's "full universe" includes cases where A will be pardoned and the warden tells A that C will not, even though in the problem statement the warden says B will not. And it is that fact that reduces the probability A is pardoned when the warden mentions B from 1/2 to 1/3. In Bridge, the Principle of Restricted Choice (which is what all these problems have in common) says that even though the probability a specific opponent has the King of Hearts starts at 1/2, it becomes 1/3 when that opponent plays the Queen of Hearts because the King is an equivalent card in the same hand. The "full universe" includes cases where either gets played in that situation, even though the problem says the Queen was.
Because the older child can be a boy or a girl, you include both possibilities in your "full universe." Because the younger child can be a boy or a girl, you include both of them as well. You even included GG in that universe, which you know did not happen. You eliminated it because you know it did not happen, not because it could not. Well, the statement "At least one is a ****" can be made the same two ways, and must also be included. There are eight (or six, if you eliminate inconsistent ones) elements in your "full universe," not four. We know BG_ALOG did not happen, but you can only eliminate it if it could not, and the problem statement does not say that. That is the ambiguity, and it occurs because the problem statement only describes one outcome, not the process, and it can't describe it both ways.
And to illustrate that ambiguity, is there some reason the statement "You know this family has at least one boy" could not be what was said about either example you explained to Myles? I can't find one. Certainly, both could be better explained: "Selected from all families that have at least one boy" and "all we know is that one child was observed to be a boy" does it, but which is adding more information than in the original problem statement? I'd say the first is, and that's why the second is the better assumption. JeffJor (talk) 16:15, 27 September 2010 (UTC)
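
The Monty Hall "full universe" argument in the comment above can be written out as a small weighted enumeration. This is a sketch under assumptions not all stated in the comment: the player has picked Door #1, the car is equally likely to be behind each door, and the host picks at random when both remaining doors hide goats.

```python
from fractions import Fraction

def full_universe():
    """Return {(car_door, opened_door): probability} for a player who picked door 1."""
    outcomes = {}
    for car in (1, 2, 3):
        # The host may open door 2 or 3, but never the door hiding the car.
        options = [d for d in (2, 3) if d != car]
        for opened in options:
            outcomes[(car, opened)] = Fraction(1, 3) / len(options)
    return outcomes

universe = full_universe()
opened_3 = {k: p for k, p in universe.items() if k[1] == 3}

# Weighted by the full universe, including the (car=1, opened=2) case that did not happen.
p_car_behind_1 = opened_3[(1, 3)] / sum(opened_3.values())

# Naive count: treat the outcomes consistent with "door 3 was opened" as equally likely.
naive = Fraction(1, len(opened_3))
print(p_car_behind_1, naive)
```

The outcome (car behind Door #1, host opens Door #2) never appears among the cases consistent with what happened, yet it takes half of the weight away from (car behind Door #1, host opens Door #3), which is the restricted-choice effect described above.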
I don't understand what you are arguing about now. My 'Universe' does include all scenarios that could happen, that is why it includes G-G. It is the starting point and common ground for both scenarios. This starting point needs to be clear because it already contains a number of tacit assumptions that (I would think) are not under discussion though they may in fact be incorrect (that boys and girls are equally likely to be born, that there is no dependency between the number of children and their sexes). So rather than a veil it is essential to the problem definition. The ambiguity is introduced later on by various ways of interpreting what it means that we also know that 'one is a boy'. I have already told you I am with you on that now.
If you are trying to say that the information that there are two children cannot be treated independently from the information that 'one is a boy' then strictly speaking you are correct because in reality there will be a dependency between the number of children and their sexes. But if we do go there, then neither the answer of 1/2 nor 1/3 will likely be correct. I was merely pointing out (with my 'Universe') that I didn't want to go there. AlexFekken (talk) 02:29, 17 October 2010 (UTC)

OP myles325a back here. Thanks for the interesting discussion. I have to think it over before I post a more considered response. But I will say that I do NOT see the relevance of whether the boy I meet is the "older or younger boy" of the two. This is just a red herring. We should note at the outset that this is a purely mathematical problem, not a sociological or psychological one. It would probably be best if the scenario was changed to a completely mechanical one.

I suggest something like this: A machine selects two coins randomly from a bowl full of them and places them next to each other on a table, and covers the one on the left. A person comes into the room and sees that the visible coin on the right is Heads. Would this person be rational to assume that the chance that the other coin is Heads too is 50%? If not, would it be rational to make bets on the result because the odds are NOT 50/50? Would you be willing to make such bets? I don't think so, and I think that the whole history of the world would be different if this were the case. For example, we could make all kinds of deductions about the otherwise unknown relatives of people based on simply knowing the sex of the ones we do know about. I can't recollect the FBI or any other body ever contemplating such procedures. Is this an oversight on their parts? Myles325a (talk) 09:51, 18 October 2010 (UTC)

Alex, first let me say that there are at least three different kinds of ambiguity: one where a single person can't tell if meaning A or meaning B is correct; one where two people have different opinions about A vs. B, but one is wrong; and one where they disagree but neither is wrong. The Boy or Girl Paradox has several ambiguities, of all three kinds, and some are similar enough that it gets confusing trying to identify them.
Nobody cares about ambiguities concerning other-sized families, that more boys are born than girls, whether there are dependencies, etc., etc. We always assume the universe consists of only families of two children evenly divided, because to assume anything else complicates the puzzle without addressing what makes it interesting. An ambiguity of the third kind is "Shouldn't 'one is a boy' mean 'exactly one is a boy?'" No, we take it to mean "at least one is a boy" because it is not a probability problem otherwise. A similar, but very subtle ambiguity of the second kind, is that some people think "At least one is a boy" means "One specific child of the two is a boy, and the other is not mentioned." These are the people who argue "the probability that the other child is a boy is 1/2, so the probability of 2 boys must also be 1/2." It's wrong, because you do not need to specify a boy from a two-boy family to make the statement. (Imagine I ask a father of two if he has at least one boy. Since he has two sons, he says yes. If I then ask "is the other child your older child?", how can he answer?)
The important ambiguity is that your universe included only four cases, but it needs to include at least six: BB_ALOB, BG_ALOB, BG_ALOG, GB_ALOB, GB_ALOG, and GG_ALOG. The "ALOX" represents families where you know that the gender of at least one child is X. All we know about this set initially is that P(BB_ALOB)=P(BG_ALOB)+P(BG_ALOG)=P(GB_ALOB)+P(GB_ALOG)=P(GG_ALOG)=1/4. The ambiguity is that we don’t know how to divide the 1/4 probability for P(BG) between P(BG_ALOB) and P(BG_ALOG). You are convinced P(BG_ALOB)=1/4 and P(BG_ALOG)=0, while I am convinced that both are 1/8. And nothing in the problem statement proves that either of us is wrong.
You may think that because the problem statement only mentioned boys, that BG_ALOG is impossible. That the information represents the answer to "Is at least one a boy?", and the cases in our universe should be BB_ALOB, BG_ALOB, GB_ALOB, and GG_NOT_ALOB. That's a logical fallacy called "affirming the consequent." You would be assuming that because this breakdown is possible, that it must be what happened. Or, you may think that I am affirming a different consequent, by assuming a specific child is meant. That will cause BG_ALOG to happen in half of the BG cases, but it isn’t the only way. The question I just mentioned is also possible, but then so is the question "Is at least one a girl?" Then BB has to split similarly between BB_ALOB and BB_NOT_ALOG, and we still get BB_ALOB happening as often as BG_ALOB and GB_ALOB put together. But I'm not assuming any method - I'm saying that I have no idea why we only know one gender between two children, but that in a full universe P(BG_ALOB) has to equal P(BG_ALOG). So, only half of the BG cases represent the cases where we learn ALOB. And I'll point out that this is the exact reasoning that leads to the "correct" answer in the Three Prisoners Problem or the Monty Hall Problem.
Myles: it is not relevant whether the boy you meet is the older or the younger. That is just a way that you can distinguish one specific child from the other. There are many other ways, including picking the taller, the one whose name is first alphabetically, or the first one you meet. If the way you identify him is independent of gender (so "taller" is out), there is an "other" child who has a 1/2 chance to be a boy. So, the probability is 1/2 that both are boys. There are other ways to learn "one is a boy" that don’t specify a child, like seeing one boy's bicycle parked in the driveway or meeting only the parent at a Boy Scout meeting. Then the probability of two boys is 1/3. Your coin example identifies one coin, so the answer is 1/2 there. But if your mother dropped two cookies (with icing) off of the counter, and all you know about what happened is that she asked you to get a mop to clean icing off of the floor, then there is a 1/3 chance that both cookies landed icing-side down.
Now, we really do need to stop discussing the problem; this page is for discussing the article. It correctly identifies the ambiguities, and the appropriate answers. JeffJor (talk) 17:21, 22 October 2010 (UTC)
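
The six-case universe in the comment above reduces to one free parameter: how the 1/4 probability of each mixed family is split between ALOB and ALOG. The sketch below calls that split q, meaning the probability that a mixed family leads to "at least one boy" being what we learn; the name q and the function are illustrative, not from the discussion.

```python
from fractions import Fraction

def p_two_boys_given_alob(q):
    """P(BB | we learn 'at least one boy'), with q = P(ALOB | BG) = P(ALOB | GB)."""
    p_bb_alob = Fraction(1, 4)              # a two-boy family can only yield ALOB
    p_mixed_alob = 2 * Fraction(1, 4) * q   # BG_ALOB plus GB_ALOB
    return p_bb_alob / (p_bb_alob + p_mixed_alob)

print(p_two_boys_given_alob(Fraction(1)))     # P(BG_ALOB) = 1/4, P(BG_ALOG) = 0
print(p_two_boys_given_alob(Fraction(1, 2)))  # P(BG_ALOB) = P(BG_ALOG) = 1/8
```

The general value is 1/(1 + 2q), so every answer between 1/3 and 1 is reachable depending on q. The two positions in the thread correspond to q = 1 and q = 1/2.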

Another way to think about it

Think about it in coin terms. "You have a gold coin and a silver coin. You flip both: The gold coin came up heads. What's the probability the silver coin is also heads?" If you take it as given that the gold coin has come up heads (you can basically just put it heads-up on the table: this is a given, and the gold coin's result does not affect the silver coin's result), the silver coin still has a 1/2 chance of being heads. You can basically ignore the gold coin because its result is predeclared.

However, if I then ask you "You have a gold coin and a silver coin. You flip both: At least one came up heads. What's the probability that both did?" Now you have two coins in front of you. Look at it three optional ways:

  • You're going to have to flip both. You don't know which one is the "at least one" I'm talking about, so you have to flip both. Sometimes you'll get Gold heads only, sometimes you'll get Silver heads only, sometimes you'll get both heads, and sometimes neither. All 4 of these situations can and will happen (assuming you do it enough times) with equal probability. In 3/4 of them, I can say "at least one is heads" truthfully, but in only 1/3 of those situations is it true that both are heads. This is different from the first question because one of the three situations (silver-heads only) is eliminated there, since the gold coin does not come up heads in that case.
  • Think of it from the other direction: At least one came up heads. That means both question-one situations apply with 1/2 chance of either occurring, but now the situation is such that the silver might be the one that came up heads but NOT the gold. This is a third equally likely situation and thus each situation is 1/3 probable.
  • From a third point of view: Some people think the probability is still 1/2 because of the flawed logic that we can just double question one's logic for the silver coin too: Well, if at least one is heads, then there are two situations to look at: If I flip the gold coin and it's heads, we can have (with equal probability) silver heads or silver tails. If I flip the Silver and it's heads, we can have (with equal probability) gold heads or gold tails. There are four situations with two "both heads", so that's a 1/2 chance. This is flawed because when you double the analysis, you cover "both heads" twice, implying there are two ways to get both heads with two coins, but there aren't. Two coins can only come up both heads in one way... by both being heads. However, they can come up heads/tails in two ways: Gold heads only or Silver heads only. One of the potential flaws in this logic is that for this basic probability you can only identify the two coins in one way (in my example, I do it by metal). This flawed logic would suggest that the coins are identified both by metal and by the order of flipping. Order of flipping can be used (replace gold with "first" and silver with "second") but if you use both metal and order of flipping in your logic, you must do a more complex probability study: in that case you have the following possibilities for at least one heads:

GTST (gold tails first, silver tails second ...) GTSH* GHST* GHSH* STGT STGH* SHGT* SHGH*

In this case we have 8 possible results, only 6 of which have at least one head (so we ignore the other two that aren't marked), and 2 of which have both heads (2/6 = 1/3 probability) - but remember again that if I say "GOLD heads", instead of "at least one heads", there are four possible results (GHST, GHSH, STGH, SHGH) two of which are both heads (so still 1/2 probability). This means that you can add another variable for which one coin must take one value and the other coin the other (eg: one coin is gold and the other is silver, one is flipped first and the other second, one flipped by me and the other by you, one flipped onto the table and the other the floor, etc... ). As long as each coin is equally likely to be in either situation (gold coin flipped first by me onto the table... equally likely as gold coin flipped first by you onto the table... equally likely as the gold coin flipped second by you onto the floor.. etc), the probabilities for a question like this that ignores all of those variables remain the same. You just have way more cases if you write them all out. TheHYPO (talk) 20:47, 4 December 2009 (UTC)
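
The eight-case table above (metal combined with order of flipping) can be enumerated directly. A minimal sketch, assuming fair coins and that either coin is equally likely to be flipped first; the tuple layout is an illustrative choice.

```python
from fractions import Fraction
from itertools import product

# An outcome is (coin flipped first, gold result, silver result): 8 equally likely cases.
outcomes = list(product(("gold", "silver"), "HT", "HT"))

def conditional(event, given):
    """P(event | given) by counting equally likely outcomes."""
    kept = [o for o in outcomes if given(o)]
    return Fraction(sum(1 for o in kept if event(o)), len(kept))

both_heads = lambda o: o[1] == "H" and o[2] == "H"
at_least_one_heads = lambda o: "H" in o[1:]
gold_heads = lambda o: o[1] == "H"

print(conditional(both_heads, at_least_one_heads))  # "at least one heads"
print(conditional(both_heads, gold_heads))          # "GOLD heads"
```

Adding the order-of-flipping variable doubles every count (6 and 2 instead of 3 and 1, or 4 and 2 instead of 2 and 1), so both ratios are unchanged, which is the point made above.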

There are different approaches to decoding what you want to say when you say "One of the coins is heads":
  • "I flipped the coins until the first time that I was able to make the statement that One of the coins is heads. Now what is the probability of both being heads?" - 1/3
  • "I flipped the coins only once. One of the coins is heads. I looked at the one that fell nearer to me to get that information. What is the probability of both being heads" - 1/2
  • "I flipped the coins only once. One of the coins is heads. I looked at both, and I always mention heads when I can, but if I had had to, I would have said One of the coins is tails. What is the probability of both being heads" - 1/3
  • "I flipped the coins only once. One of the coins is heads. I looked at both. Whenever I see a tails and a heads, then I just mention one of them with 50-50% probability. What is the probability of both being heads?" - 1/2
  • "I flipped the coins once. One of the coins is heads. I looked at both, and I always mention tails when I can. What is the probability of both being heads?" - 100% (it cannot be tails, because then he would have said One of them is tails, as he says he mentions tails whenever he can)
We don't know which one the guy means, and cannot even guess it. If we know that he only flipped once, then we can reason as follows: Assuming no prior knowledge about him preferring heads or tails, heads-preference and tails-preference are equally likely. But we now know what he said. Our confidence in him being a heads-preferrer has risen (according to Bayes' theorem) to 3/4. Our confidence in him being a tails-preferrer has sunk to 1/4. If we take a weighted average of the 1/3 and the 100% with these weights (3/4 and 1/4) then we arrive at the nice 1/2 probability. We can also consider cases like "he prefers to say heads 73% of the time when he can": we can calculate the resulting probability, and then weight that and the symmetric one with tails (as they had equal prior probability). That will also give 1/2 as the end result. So, if we assume his plan was to only flip once: 1/2; if we assume he flipped until his earlier fixed statement became true: 1/3. We may also weight these with our perceived probability of him behaving this or that way. Qorilla (talk) 22:24, 7 January 2011 (UTC)
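
The weighting argument in the comment above can be checked with exact fractions. This is a sketch under the comment's assumptions (two fair coins, flipped once, equal prior on a preference and its mirror image); the parameter p, the probability that the speaker names heads when the coins show one head and one tail, is an illustrative name.

```python
from fractions import Fraction

def p_says_heads(p):
    """P(speaker says 'one is heads'): always on HH, with probability p on HT or TH."""
    return Fraction(1, 4) + Fraction(1, 2) * p

def p_both_given_says_heads(p):
    """P(both heads | he said 'one is heads') for a speaker with preference p."""
    return Fraction(1, 4) / p_says_heads(p)

def averaged_answer(p):
    """Equal prior on preference p and its mirror 1 - p, updated by Bayes' theorem."""
    like_p, like_mirror = p_says_heads(p), p_says_heads(1 - p)
    posterior_p = like_p / (like_p + like_mirror)
    return (posterior_p * p_both_given_says_heads(p)
            + (1 - posterior_p) * p_both_given_says_heads(1 - p))

print(averaged_answer(Fraction(1)))         # always-heads vs always-tails speaker
print(averaged_answer(Fraction(73, 100)))   # the 73% example
```

For p = 1 the posterior weights come out as 3/4 and 1/4 and the conditional answers as 1/3 and 1, matching the figures above. The average is 1/2 for any p because the two likelihoods always sum to 1.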

Girl then boy in 1st and 2nd questions - why?

Why is a girl the example in the "First Question" part of the article, and a boy in the "Second Question" when the author explicitly states that there is no difference between the two examples. I'm confused...Myles325a (talk) 06:25, 25 January 2011 (UTC)

There is no mathematical difference between the two questions, except that one specifies the relative age. The superficial difference, asking about a boy rather than a girl, makes it easier to distinguish the two problems when comparing them. Saying "the probability for the boy is..." is just an easier way to identify that you are talking about the second question. JeffJor (talk) 16:43, 25 January 2011 (UTC)

I think you have missed my point, which is odd as you are the one who understands this more than I do. If you interpolate a difference between two questions, for the sake of "elegant variation", or to conform to non-sexist writing dictates, or for any other reason OTHER than for strictly necessary mathematical and logical dictates, then you will inevitably confuse readers who of course do not know that THIS difference is only a stylistic one, and not germane to the argument.

I'm used to seeing this type of confusion arising in technical material where the author has been taught to use different terms for the sake of elegance. Thus he writes for example "When the axis is made secure in the B position, the bearing can rotate. In the A position it cannot move, but it can spin in the C position." Here, the author intends that rotate, spin, turn et al. are meant to be exact synonyms. But of course, the novice reader is not au fait with this, and can imagine that in certain positions the bearing can "spin", that is turn rapidly, while in the other positions, it can only turn slowly. Elegant variation in technical material can be a bad trap. Myles325a (talk) 10:25, 30 January 2011 (UTC)

And I think you over-emphasize the possibility of confusion here. For 50 years, people have read this quote from Martin Gardner, and not been confused like you describe. In fact, this practice is quite common in mathematics textbooks, and serves a purpose other than to appear "elegant." When you make a slight change to a problem, you make another insignificant change (names, genders, etc.) that distinguishes the problems from each other in a transparent manner. Gardner did both, and if we quote him and refer to his treatment of the problem, we have to leave it that way. JeffJor (talk) 17:22, 4 February 2011 (UTC)

What makes you think that people are not confused by what Gardner writes? Let me put my hand up among an ocean of others and say I have been. And his material was for a science magazine, whereas here we are writing for a people's encyclopedia and a lay public. You might have noticed that there are ample archival pages of discussion on this article. Has it ever, for a second, occurred to you that they might be a consequence of you not EXPLAINING the subject adequately?

There are numerous scenarios in scientific and philosophical exposition in which characters are invoked. Alice and Bob often feature in articles on cryptograms, and so on. The difference between subjects like astronauts, food consumption, diabetes, and what have you is that the subject matter is entirely divorced from the sex of the participants. In the boy / girl paradox, the sex of the children is not just a part of the whole problem being evaluated; it is pretty much the WHOLE BLOODY THING! One should standardise the question to its most elemental form, not form symmetrical variants which simply confuse the other symmetrical variants with which they deal. Martin Gardner's stuff is not written on tablets of granite. The paradox stands by itself, and reference to Gardner need only be marginal.

In any case, I am not going to change the text again, but I am still going through this stuff and trying to make more sense of it. Thanks for the work you have put into it. I don’t mean to sound ungrateful and rude, but I do think that this can be improved. And I think that my idea that the paradox should be reduced to ABSOLUTELY primitive elements, like people carrying black and white marbles in their coat pockets should have been adopted before. Then we would have none of these constant red herrings as to whether someone would “walk his son down a London Street like that” or why a man would say what he did, or whether there are more boys than girls and so on. But I’ll have a think about it. Why don’t you do the same? Myles325a (talk) 07:15, 8 February 2011 (UTC)

What makes me think that people are not confused by it? The fact that in the extensive volume of replies to the problem, much of which I have read, and that argue the issues from every conceivable angle pro and con, I have seen this particular complaint exactly once. So, what makes you think others share it? And I don't mean similar complaints to other texts, I mean this one. Which, by the way, was from a magazine about science but for lay people. And maybe you should look up what encyclopedias are supposed to do. There should be no "original research" included, only what can be compiled from other, reputable sources. Martin Gardner is one, especially on mathematical curiosities, so quotes are appropriate and must remain quotes whether or not you would have edited them in the original.
I agree that work could be done on the article, but unfortunately too many people hold too-strong, and differing, opinions on what the actual answers are. So instead of being informative, much of the article is a battle to include the references that support the editor's opinions. Any attempts I've made to simplify that have resulted in edit wars that make it worse, so I stopped. JeffJor (talk) 14:42, 12 February 2011 (UTC)

second question

The other possibility is that the family was selected randomly and then a true statement was made about the family and if there had been two girls in the Smith family, the statement would have been made that "at least one is a girl". If the Smith family were selected as in the latter case, the answer to question 2 is 1/2.

Can anyone walk me through this? I don't see how this is true. It should be 1/3, I think.

Here's what we know in that variant, in order of knowledge:

  • 1. That each child is either male or female.
  • 2. That the sex of each child is independent of the sex of the other.
  • 3. That each child has the same chance of being male as of being female.
  • 4. A family of 2 children is selected at random.
  • 5. A true statement will be made about the family. If the family has 2 girls then it will be "one is a girl", otherwise "one is a boy".
case 1: "one is a girl"
  • girl, girl --> implies probability 1.

case 2: "one is a boy"

  • boy, girl
  • girl, boy
  • boy, boy --> 1/3

--- or we can change the 5th statement (above) so that we don't know if it is the two boys or the two girls that will elicit an automatic response (ie, we don't know for sure that it is a girl,girl selection that produces the exceptional true phrase "one is a girl", or if it is a boy,boy selection that produces the exceptional true phrase "one is a boy"). Now, with this new step, the cases become regular:

case 1: "one is a girl"
  • girl, girl
  • girl, boy
  • boy, girl

case 2: "one is a boy"

  • boy, boy
  • boy, girl
  • girl, boy

In both variants, the probability is 1/3. The article says 1/2. --— robbiemuffin page talk 13:10, 27 March 2011 (UTC)

Robbiemuffin: You may have been confused because the explanation was worded extremely poorly. It seems to have been written in an attempt to deny the validity of arguments that support 1/2 as the answer, since it uses phrases like "necessary and sufficient conditions" (I'll call this n&s) that such arguments use, but it uses them incorrectly or as if they are implicitly satisfied. I've reworded it, fixed the incorrect n&s references, and attached the citation (to Grinstead and Snell) to something it actually says. If you still have difficulty, look up Bertrand's Box Paradox first, and then ask again if you need to. The lesson it teaches is very important, but is seldom applied completely.
That lesson is that it is never correct to merely "count cases." You need to add the probabilities that each case would produce the result stated in the problem. Grinstead and Snell do this by using a probability tree. You don't "ignore" the cases that can't produce it, as the article implies by striking them out in the table, you assign a probability of zero to them and add that. I know it sounds like the same thing, but it helps conceptually to include them.
This method works out the same as counting cases if every case has either a probability of 0 or 1, but not if some are somewhere in between. That is, if some cases that could produce the result don't always do so. And that's where n&s comes in. A sufficient condition always produces the result if it can. For this condition to be n&s, it must have (1) considered both children, and (2) been specifically looking for boys, and not girls. If either criterion is not met, and the Second Question does not express or imply either, you can't conclude the probabilities for BG or GB are 1, and 1/2 is a better answer. JeffJor (talk) 20:54, 6 April 2011 (UTC)
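The "add the probabilities" method described above can be sketched as follows. This is only an illustration: the dictionaries `always` and `random_mention` are hypothetical models of how the statement "one is a boy" comes to be made, not something taken from Grinstead and Snell.

```python
from fractions import Fraction

# Four equally likely families, written (older, younger).
FAMILIES = ["BB", "BG", "GB", "GG"]
PRIOR = Fraction(1, 4)

def prob_two_boys(p_says_boy):
    """P(BB | statement "one is a boy"), given P(statement | family) per family."""
    # Sum over ALL cases, including those with probability 0.
    total = sum(PRIOR * p_says_boy[f] for f in FAMILIES)
    return PRIOR * p_says_boy["BB"] / total

# Pure case counting: every family with a boy always produces the statement.
always = {"BB": 1, "BG": 1, "GB": 1, "GG": 0}
# Assumed alternative: a mixed family mentions the boy only half the time.
random_mention = {"BB": 1, "BG": Fraction(1, 2), "GB": Fraction(1, 2), "GG": 0}

print(prob_two_boys(always))          # 1/3
print(prob_two_boys(random_mention))  # 1/2
```

When every weight is 0 or 1 the result matches simple case counting; the answers diverge only when some case produces the statement with a probability strictly between 0 and 1.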

Bad grammar

Here is the beagles version, but the same grammatical problem occurs in other forms of the puzzle.

"A shopkeeper says she has two new baby beagles to show you, but she doesn't know whether they're male, female, or a pair. You tell her that you want only a male, and she telephones the fellow who's giving them a bath. "Is at least one a male?" she asks him. "Yes!" she informs you with a smile. What is the probability that the other one is a male?"

Here, "the other one" makes no sense, because no beagle has been specified. Yes, it is assumed that to answer the question, the fellow who is bathing them looked at one of them, and then "the other one" would be the one he did not look at. But, of course, he might well have looked at both of them (and probably did, at some point). In that case, of course, there is no "other one".

Yes, it is a common mistake in the literature about this problem, brought about because the author does not understand the correct way to solve it. But not as bad as you make it out to be. You can pick one male beagle at random from the one, or two, in the litter. It is "the" male, and its sibling is "the other." Regardless, it doesn't change the answer for this unambiguous version. The answer is 1/3 because (1) you sought information of the form "at least one male" and not "name one gender," and (2) you sought it from someone who knew both.
The incorrect way to solve it that I mention is to count the cases that exist, like the article currently does. What you need to do is sum the probabilities that each existing case would produce the observed result, which would emphasize that you aren't talking about one specific boy in a two-boy family. I may see if I can do some editing, based on (among others) a paper by Marks and Smith from the economics department at Pomona College, economics-files.pomona.edu/GarySmith/Two-Child%20Paradox.pdf . JeffJor (talk) 17:35, 29 July 2011 (UTC)

Bayesian view

I have to say, hats off to whoever had the idea of starting the section off with the injunction that "we consider a large Urn containing two children". The deadpan literalness of this surreal instruction made me break out laughing out loud. Respect.

I think what's in the section is quite good, though the tone of the writing is a bit more over-familiar than it should be. But I wonder if it would work better to consider the variant problem first, viz that we look out of the window, we see one child, and it is a girl.

Then it's straightforward to apply Bayes' theorem, starting with the prior probability that the probabilities

 BB  |  BG
-----------
 GB  |  GG

are all equal, but the conditional probabilities are not, because

P(Girl seen | GG) = 1

but

P(Girl seen | BG) = P(Girl seen | GB) = 1/2 and P(Girl seen | BB) = 0

so, applying Bayes theorem, we find that P(GG | Girl seen) = P((BG or GB) | Girl seen) = 1/2; which is in line with our natural instinct that if we know the gender of one particular child, that the gender of the other is an independent variable, so still has probability 1/2. That's very clear, and uncontroversial, so seems to me a good place to start.
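For reference, the Bayes step can be written out in full. This is a sketch under the assumption (implicit in the window scenario) that the child seen is equally likely to be either of the two children, so a mixed family shows a girl with probability 1/2:

```latex
\begin{align*}
P(GG \mid \text{Girl seen})
  &= \frac{P(\text{Girl seen} \mid GG)\,P(GG)}
          {\sum_{X \in \{BB,\,BG,\,GB,\,GG\}} P(\text{Girl seen} \mid X)\,P(X)} \\
  &= \frac{1 \cdot \tfrac14}
          {0 \cdot \tfrac14 + \tfrac12 \cdot \tfrac14 + \tfrac12 \cdot \tfrac14 + 1 \cdot \tfrac14}
   = \frac{1/4}{1/2} = \frac12 .
\end{align*}
```

The same denominator gives P((BG or GB) | Girl seen) = (1/8 + 1/8) / (1/2) = 1/2.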

The question then is what sort of information I might lead to the assignment P(GG | I) = 1/3 ?

Certainly, a friend might tell you that he knows for a fact that one of the children is a girl, but then the question is: how does he know? Or perhaps: how do we think he knows?

If it is just because he has seen one of the children out of the window, then his assessment has to be the same as ours previously, that P(GG | Girl seen) = 1/2. It is only if he has been told, eg by one of the children's parents, who would know in a different way, that not both of the children were boys, that we can get to P(GG | I) = 1/3, as the 'paradox' envisages.

Anyhow, that's how it seems to me; and how I think the 'Bayes View' section could be usefully rewritten. Jheald (talk) 22:32, 15 October 2012 (UTC)

New variants

In the last few years there have been a lot of interesting new variants to the paradox. For instance, suppose I tell you that one of the children is a girl and is born on a Tuesday, what then is the probability that the other child is a boy? Here's one reference to start with: [1]. As well as popular publications there are also academic publications on this variant, such as this [2] by Ruma Falk (very well known maths-educationalist). Strange they didn't make it into the article yet. Richard Gill (talk) 16:36, 3 November 2012 (UTC)

I have added a new subsection on this variant. Still to do: give proper literature reference (I just put in the internet link), cite some further references. Richard Gill (talk) 07:41, 4 November 2012 (UTC)

Richard: they have, and it was removed. See comments above about a boy named Jacob. The problem is, by far most of the sources give what I call the "pre-determined" answer. That's the one based on selecting the family from the set of all families that satisfy the condition, rather than observing a fact after selection. And they treat it as though it is the only possible answer no matter how the problem is worded. I hadn't seen the Falk one, and am glad to see it.
Here's another reference by a couple of economists. I can even give you a brief history of what I believe to be a linked trail, although I have absolutely no direct evidence. In 1988, John Allen Paulos (Temple University) published a book called "Innumeracy: Mathematical Illiteracy and Its Consequences," in which he tried to state the usual Two Child Problem. But to make an easier read, I suppose, he gave the known child a name: a girl named Myrtle. He still gave the usual answer of 1/3, which was wrong. J. L. Snell (Dartmouth) and R. Vanderbei (Princeton) pointed out his error in a 1995 article titled "Three Bewitching paradoxes", and gave the "close to 1/2" answer based on "pre-determined" logic that turns out to be wrong for more than just using "pre-determined" logic. Leonard Mlodinow (Stanford) changed the name to Florida in his 2008 book "The Drunkard's Walk," again making the extra error. Which is that it is never reasonable to assume there could be two children with the same name in a family; nor is its effect negligible if the name is rare. The correct pre-determined answer is that the chances are greater than 1/2, and get bigger as the name becomes rarer. I have even seen two treatments that tried to address that point, and got opposite (both wrong) answers.
I think the chance gets closer to 1/2 as the name/birthday delimitation/... gets rarer.
That's a nice remark, that two children in one family don't (usually!) have the same first name. So "born on Tuesday" needs a different solution from "first name Myrtle".
The moral is that we have to model (a) the selection process and (b) the gaining-of-information process and (c) exactly how these two processes are related. The exact sequence of selections / questions is everything. Richard Gill (talk) 08:51, 15 November 2012 (UTC)
It gets further from 1/2. The name of the girl in question is not the only name that must be treated this way. The probability for any name given to a second girl in a family changes, based on a weighted average of all possible names. Common names get rarer (because they are more likely to be used up) while rare ones become more common. If C is the probability for the name in the middle that doesn't change this way, and F is the (first-daughter) probability for the name Florida, then the probability Florida has a sister is (2+C-F)/(4+C-F) in Falk's "F-Scenario." Compare this to (2-F)/(4-F) when you don't prevent duplication, or the problem is about birthdays. JeffJor (talk) 22:43, 19 November 2012 (UTC)
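A quick numeric check of the two formulas quoted above, as a sketch only. The function name `p_sister` and the value chosen for C are hypothetical; only the formulas (2-F)/(4-F) and (2+C-F)/(4+C-F) come from the comment.

```python
from fractions import Fraction

def p_sister(f, c=None):
    """Evaluate (2-F)/(4-F), or (2+C-F)/(4+C-F) when C is supplied."""
    f = Fraction(f)
    if c is None:
        return (2 - f) / (4 - f)
    c = Fraction(c)
    return (2 + c - f) / (4 + c - f)

# Birthdays, F = 1/7: recovers the familiar 13/27.
print(p_sister(Fraction(1, 7)))  # 13/27

# Rarer F: without the no-duplicate rule the value climbs toward 1/2 from below;
# with an assumed C = 1/50 (and F < C) it sits above 1/2.
for f in (Fraction(1, 100), Fraction(1, 10000)):
    print(f, float(p_sister(f)), float(p_sister(f, c=Fraction(1, 50))))
```

The F = 1/7 case is a useful sanity check, because it reproduces the 13/27 figure from the "born on a Tuesday" variant.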
Gary Foshee eliminated the extra error by changing the added condition to "born on a Tuesday" at a 2010 conference dedicated to Martin Gardner. He also gave the pre-determined answer, which is quite ironic because Gardner himself withdrew the pre-determined solution to the original question!
The problem now is that Wikipedia is more interested in answers with sources to back them, than in correct answers. Just look at Monty Hall for an example of what I believe will happen here if we include the variants. Which is also ironic, because the same people who will support the "pre-determined" answer here will insist it is wrong for Monty Hall (it is the 1/2 answer there). JeffJor (talk) 21:46, 12 November 2012 (UTC)
Fascinating story! Somebody should publish it... Yes: Wikipedia can't arbitrate on right or wrong, it can only survey the literature. No problem. There is a growing literature on this variant (and a long history) and the paper by Ruma Falk must be considered "notable" as well as "reliable". Richard Gill (talk) 10:52, 14 November 2012 (UTC)

A call for an expert and some source material

Somebody google and find somebody who plainly explains this.
It should look something like this:

X has two kids, what are the odds they are boys, girls, or a mix?
Bb, Gg, Bg, Gb
1/4, 1/4, 1/4, 1/4
3/4 of time at least one boy
3/4 of time at least one girl
1/4 just boys
1/4 just girls

X has two kids, one is a boy, what are the odds they are both boys?
Bb, bB, Bg, Gb
1/4, 1/4, 1/4, 1/4
1/2 of the time, they are both boys
1/2 of the time, there is a girl

— Preceding unsigned comment added by 72.81.141.184 (talk) 04:58, 3 December 2012 (UTC)

You don't explain your notation and assumptions but the way I interpret your post, Bb and bB shouldn't both be there. I guess you use a notation where the older child is a capital letter and the younger a small letter. Bb and bB are the same case. PrimeHunter (talk) 05:13, 3 December 2012 (UTC)

Doesn't make sense

"Mr. Smith has two children. At least one of them is a boy. What is the probability that both children are boys?" There are only three options: boy-girl, boy-boy, girl-girl. There seems to be an assumption to factor age into the problem, but there is no mention of age. "Two children. One boy. Probability both boys?" Really this roughly asks: what is the probability the other is a boy? The options are {boy} or {girl}, which means a one in two, 1:2, 50-50 chance. The article uses the age-based table again and creates four possibilities again. But age isn't mentioned! — Preceding unsigned comment added by 204.29.66.88 (talk) 19:44, 24 April 2013 (UTC)

Age is just the chosen method to distinguish the children when the solution is explained. Height, weight or something else could also have been used. The article correctly says about an interpretation of the question: "From all families with two children, at least one of whom is a boy, a family is chosen at random. This would yield the answer of 1/3." If you don't believe this then try flipping two coins many times. Among those cases where at least one is heads, both will be heads approximately 1/3 of the time. If you flip one coin first and then the other (corresponding to older and younger child) then there are four equally likely possibilities: hh, ht, th, tt. Those with at least one heads are hh, ht, th, so if we know there is at least one h then hh has 1/3 chance. This probability doesn't magically change to 1/2 if the coins are flipped simultaneously. PrimeHunter (talk) 00:28, 25 April 2013 (UTC)
The basic problem with this question is that, phrased in this way, it just doesn't make any sense. Probability just doesn't apply to any specific Mr. Smith. It can only apply to generic questions, about possible Mr. Smiths among a certain larger set of people. Even if we take the liberty of interpreting the question as being not about a particular Mr. Smith but about anyone like him, the way the question is phrased doesn't tell us what that larger set of people is. Rp (talk) 21:02, 25 April 2013 (UTC)
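The two-coin experiment suggested above is easy to try in code. This is a minimal sketch; the function names and the seed are arbitrary choices, and it models only the "keep every trial with at least one head" interpretation.

```python
import random
from fractions import Fraction
from itertools import product

def exact_both_heads():
    """Enumerate hh, ht, th, tt and condition on at least one head."""
    outcomes = list(product("ht", repeat=2))
    with_head = [o for o in outcomes if "h" in o]
    return Fraction(with_head.count(("h", "h")), len(with_head))

def simulate(trials, seed=0):
    """Flip two fair coins `trials` times; keep only trials with a head."""
    rng = random.Random(seed)
    kept = both = 0
    for _ in range(trials):
        a, b = rng.choice("ht"), rng.choice("ht")
        if "h" in (a, b):
            kept += 1
            both += (a == b == "h")
    return both / kept

print(exact_both_heads())  # 1/3
print(simulate(100_000))   # close to 0.333; the exact digits depend on the seed
```

Nothing in the simulation depends on which coin is flipped first; the ordering is only a labelling device, as with the children's ages.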
Indeed, the problem is an ill-posed problem. You can't answer the question without knowing how it came to be posed. That's exactly the point made by the authorities who write the problem. In fact the problem was invented in order to get people to think about these things. Richard Gill (talk) 07:12, 10 May 2013 (UTC)
Yes, the problem is usually not well formed - but not like you think. And the arguments about it are almost always fueled by the desire to get a specific answer, not to address the ambiguities in a consistent way. For example, "a family has two children" can mean it has any number greater than or equal to two, since a family with five also has two. So it is ambiguous. But we know, without it being said, that the description is supposed to apply to the entire family, not to a part of it. But if we apply the same reasoning to "(at least) one is a boy," it would mean that it is a description that applies to one child, a specific child. That makes the answer the same as the "older boy" version. Most people who naively answer 1/2 won't be able to express this difference, but they are still using the meaning I described to formulate their solution. Those who want 1/3 to be correct will insist they are wrong. It may not be what was intended, but it is still a valid interpretation, and isn't wrong.
There are ways to "fill in" ambiguous information in a puzzle like this, and usually they are applied without even realizing. One is called the Principle of Indifference. Few such puzzles will state the probabilities needed to solve them, especially when the cases are equivalent except in name. Like coin flips, die rolls, placement of prizes behind doors on game shows, etc. It is implicitly clear that we are supposed to treat them as equally likely and independent - even when, as with children's genders, it is known that they aren't in real life. The point isn't to reflect real-world situations, where dice might be loaded or game show hosts might be (illegally, I'll point out) biased, it is to get the same answer if the puzzle is reworded in an equivalent way. Assumptions must sometimes be made, but there are acceptable ways to do it. They just have to preserve this equivalence. Since the answer changes if we assume Door #1 gets the prize more often than Door #2, such an assumption is invalid, but assuming they get it equally is valid. We usually apply this without even considering it.
So, if we know that a family of (exactly) two children includes a boy (see how this describes the family, and not a child?), then the answer depends on how we know that fact. We could always learn about boys unless there weren't any, we could always learn about girls unless there weren't any, or we could learn randomly from among the options that apply. But if we are to get the same answer when we reverse the question, and ask about girls, then only the "learn randomly" assumption is valid. So this isn't a point of ambiguity, even though it is usually pointed out as such. JeffJor (talk) 17:27, 10 May 2013 (UTC)

"Information about the child" section is wrong

If the question stated is "at least one child is a boy born on a Tuesday" then the probability both are boys is 13/27.

Simply put we have four possibilities: (1) Older boy born on a Tuesday, younger boy born on an arbitrary day (seven possibilities), (2) Older boy born on an arbitrary day (seven possibilities), younger boy born on a Tuesday, (3) Older girl born on an arbitrary day (seven possibilities), younger boy born on a Tuesday, (4) Older boy born on a Tuesday, younger girl born on an arbitrary day (seven possibilities).

This suggests there are 4 times 7 = 28 possibilities of which 14 satisfy our result.

However there is double counting occurring in (1) and (2).

One possibility in (1) is Older boy born on a Tuesday and younger boy born on a Tuesday. This is also in (2). Thus one of the 28 is void and only in 13 out of 27 possibilities are both children boys. — Preceding unsigned comment added by 80.111.20.214 (talk) 17:32, 24 June 2013 (UTC)

The section isn't worded very well, so you may have missed the fact that it addresses the opposite probability question: "what are the chances of a boy and a girl?" It correctly gets 14/27 (if "Tuesday Boy" is a requirement for selection) and 1/2 (if just one combination of day+gender is observed), which agrees with your calculations. The answer based on requirements changes from 1/3 because a two-boy family is nearly twice as likely to meet the requirement as a one-boy family would be. The answer based on observation stays the same because, as intuition suggests, the additional information is irrelevant to an observation. 74.107.123.96 (talk) 22:00, 25 July 2013 (UTC)
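The 13/27 and 14/27 figures can be checked by brute-force enumeration. This is a sketch assuming the "requirement for selection" reading, with sexes and weekdays equally likely and independent; the day numbering is an arbitrary choice.

```python
from fractions import Fraction
from itertools import product

DAYS = range(7)   # 0..6; which index stands for Tuesday is arbitrary
TUESDAY = 1

children = list(product("BG", DAYS))          # 14 equally likely (sex, day) pairs
families = list(product(children, repeat=2))  # 196 ordered (older, younger) pairs

# Keep the families that satisfy "at least one boy born on a Tuesday".
selected = [f for f in families if ("B", TUESDAY) in f]
both_boys = [f for f in selected if f[0][0] == "B" and f[1][0] == "B"]
mixed = [f for f in selected if f[0][0] != f[1][0]]

print(len(selected))                            # 27
print(Fraction(len(both_boys), len(selected)))  # 13/27
print(Fraction(len(mixed), len(selected)))      # 14/27
```

The count is 27 rather than 28 because the family with two Tuesday boys appears only once in the enumeration, which is the double-counting correction described above.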

Is the ambiguity an issue of polling vs random events?

It seems to me that the answer to this paradox is contingent upon how the information is obtained... and this wiki article does not seem to highlight this properly. It is filled with ambiguous examples that do not clarify how the information is obtained.

For instance, if we poll the entire population of two child families and ask "is at least one of your children a boy?", then for those that answer yes the odds of the second child being a boy are indisputably 1/3. If we poll a random individual, and ask them "is at least one of your children a boy", then a yes answer would also lead to the 1/3 probability of the second child being a boy.

However, if information is volunteered or shown to us, then this is not the case. Parents with two girls (GG) would always say that one of their children is a girl, and parents with two boys (BB) would always say that one of their children is a boy. Parents with mixed gender children (BG or GB), however, would NOT always tell you their child is a boy. In fact, excluding gender bias, you'd have a 50/50 chance as to whether they mention that their child is a boy or a girl. Half the time they would say "one of my children is a boy" and the other half of the time they would say "one of my children is a girl".

Families with mixed gender (GB and BG) children make up 2/3 of the families with boys and have a 50% chance of volunteering information about their sons instead of their daughters. This causes them to have a reduced representation in random events. In other words, even if a family has a boy, they will not always state that they have a son - instead they might state they have a daughter. The families with two boys (BB) only make up 1/3 of the families with boys, but they are fully represented. The two groups of BB and BG+GB end up with equal representation in random events.

A good way to visualize this is the following scenario, involving a group of parents with two children:

BB 25% (250 families)
BG 25% (250 families)
GB 25% (250 families)
GG 25% (250 families)

Let's say we have 1000 families with two children. On Monday night, all the mothers in the town take one of their children to the school for a Parent/child night. If the child is picked at random, without gender bias, then only 50% of the BG or GB mothers will bring a boy with them. The distributions would be as follows:

BB 100% boy participation (250 mothers w/ boys)
BG 50% boy participation (125 mothers w/ boys)
GB 50% boy participation (125 mothers w/ boys)
GG 0% boy participation (0 mothers w/ boys)

From the stats, you can see that 250 of the mothers w/ a boy have another son. The other 250 mothers w/ a boy (125 from BG and 125 from GB) have a daughter. Randomly running into a mother & son pair would yield a 50/50 chance that the mother will have another boy. So, contrary to many of the examples in the article, a mother who has "at least one boy" has a 1/2 chance of having another son.
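The arithmetic of the Parent/child night scenario, next to the polling case, can be reproduced in a few lines. This is a sketch using the 1000-family figures above; the variable names are illustrative only.

```python
from fractions import Fraction

families = {"BB": 250, "BG": 250, "GB": 250, "GG": 250}

# Chance that the one child brought along (picked at random) is a boy.
p_boy_brought = {"BB": Fraction(1), "BG": Fraction(1, 2),
                 "GB": Fraction(1, 2), "GG": Fraction(0)}

# Random event: mothers who turn up with a boy.
mothers_with_boy = {k: families[k] * p_boy_brought[k] for k in families}
total_with_boy = sum(mothers_with_boy.values())
print(total_with_boy)                           # 500
print(mothers_with_boy["BB"] / total_with_boy)  # 1/2

# Polling: every family with at least one boy answers "yes".
polled_yes = sum(n for k, n in families.items() if "B" in k)
print(polled_yes)                               # 750
print(Fraction(families["BB"], polled_yes))     # 1/3
```

The two-boy families number 250 in both calculations; only the size of the group they are compared against changes, from 500 to 750.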

It's obvious that there is a big difference between polling for the information and a random event. I suggest we make the following distinctions in the article:

1) If the question is asked "do you have at least one boy?", and answered yes, then the odds the other child is a boy is 1/3.

2) If the information is volunteered or shown to us, without any prior directions or statements on our part, the odds of the second child's gender are always 50/50. Such scenarios include:

a) A father of two randomly states that he has a child that is a boy.
b) A mother of two randomly shows up in the park with a son.
c) Your wife casually mentions to you that the two child family across the 
   street has a son, whom she saw (at random) playing in the street.

I propose that the article be adjusted to clearly define these major statistical differences and clean up the ambiguity in this article. While there is an attempt in certain sections to make this distinction, there are still plenty of ambiguous examples and several incorrect examples.

You are correct, of course. But it needs to be pulled from a reliable reference to be included in a Wikipedia article, not just put in your words. The article does mention several such references, and how they point out that the question is ambiguous. But if you want to do it, I suggest looking for an article titled "The Two-Child Paradox Reborn?" by Stephen Marks and Gary Smith out of Pomona College. JeffJor (talk) 21:21, 17 December 2013 (UTC)

First question vs Second question sections conflict

The "First question" and "Second question" sections conflict. Actually, it is impossible "to specify the randomizing procedure" in both cases, but only the "Second question" section has this information. No criteria are specified for choosing the "possible events" in either case, and that is the source of the paradox. Those criteria are significant. E.g. if Mr. Jones-Smith is the same person, then the answer is 0% for both problems.

ThermIt (talk) 07:31, 1 December 2014 (UTC)

A request to compress the article

This article takes 13 pages of shit and only the last one, the psychology, explains the issue and resolution. That section says that people mistakenly rule out the girl-boy possibility, in addition to girl-girl, which makes chances of boy-boy 1/2 rather than 1/3. That is basically all you need to know about the paradox. Can we leave only this section or lift it to the very top, above all discussions of secondary importance? --Javalenok (talk) 12:38, 30 June 2015 (UTC)

Tuesday's child variant

Has anyone found much reliably sourced discussion of the fact some people seem to make a (IMO) mistake with the Tuesday's child variant in arguing that Tuesday is irrelevant while accepting that boy is relevant even though most common wordings suggest it's either one or the other? Under most common wordings including Gary Foshee's original, it seems to me either boy and Tuesday are irrelevant and the answer is 1/2, or they are both relevant and the answer is 13/27. There are some rare wordings/scenarios, e.g. the BBC have one, where you can come up with a 1/3 answer for the Tuesday's child variant but some people seem to give the 1/3 answer even when it doesn't seem to make sense (as User:JeffJor sort of mentioned above). Nil Einne (talk) 08:45, 22 July 2015 (UTC)

Argument that "Tuesday" problem is meaningless.

The main argument of the "Tuesday is meaningful" people seems to boil down to: "The seemingly irrelevant information is relevant, because when I take it into account, it changes the outcome". Yes, of course it does, but you have not proven or disproven anything, and you have not demonstrated whether the information is in fact relevant or not.

Putting things in algebraic form and doing some math doesn't prove or disprove anything, you can do this with complete gibberish, and "prove" things. One needs to be careful that the premise and data are actually meaningful.

See the section below for further amusement. If you don't have time, how about this:

Does it matter if we pick a different day in the examples people posted? Monday? Friday? No, the "calculation" people use remains the same, so it seems not to matter what day the child was born on, just that you know it was born on "a day", but you already know that, even if we don't mention the day, so we have an odd logical contradiction here. Basically this joins up with BrianEButle's tautology argument.


In other words:

The outcome, according to the logic and calculations used is the same in all these cases:

Suppose we were told not only that Mr. Smith has two children, and one of them is a boy, but also that the boy was born on a Monday. =
Suppose we were told not only that Mr. Smith has two children, and one of them is a boy, but also that the boy was born on a Tuesday. =
Suppose we were told not only that Mr. Smith has two children, and one of them is a boy, but also that the boy was born on a Wednesday. =
Suppose we were told not only that Mr. Smith has two children, and one of them is a boy, but also that the boy was born on a Thursday. =
Suppose we were told not only that Mr. Smith has two children, and one of them is a boy, but also that the boy was born on a Friday. =
Suppose we were told not only that Mr. Smith has two children, and one of them is a boy, but also that the boy was born on a Saturday. =
Suppose we were told not only that Mr. Smith has two children, and one of them is a boy, but also that the boy was born on a Sunday.

They are all logically and statistically equivalent. And are thus equal to:

Suppose we were told not only that Mr. Smith has two children, and one of them is a boy, but also that the boy was born on a day.

Which you also know if we don't tell you, because the child must have been born on a day. The information does not add anything. Which means this is equal to:

Suppose we were told that Mr. Smith has two children, and one of them is a boy.

Which is a contradiction, because according to people's logic/statistics this is different, but I've just shown it's not, thereby proving that the "Tuesday" information is meaningless.

Done.

VxvUy7HoDiHDUISzEwYHQY2wYt7Tty3k (talk) 23:39, 17 April 2016 (UTC)

Let's ignore the formulation "the boy" which can give the impression there is only one boy. The article proves that "born on a Tuesday" does matter if you assume Mr. Smith is taken randomly from all people who have exactly two children of which at least one is a boy born on a Tuesday. If you make other assumptions then the result can change. You don't state your assumptions so it's not clear why you claim "And are thus equal to". It may be counter intuitive that it can matter and I get the impression you just trust your intuition and don't understand the maths that show it matters. But as said earlier, we follow reliable sources. I don't have to convince you it matters. You have to give reliable sources claiming it doesn't matter. PrimeHunter (talk) 05:16, 18 April 2016 (UTC)
You're absolutely right in your last couple of sentences, especially in light of Wikipedia guidelines. However, just for a moment, for the sake of argument, try to follow what I say; you can go back to believing whatever you want afterwards, no harm done. Just for your amusement.
THEIR math "proves" that Tuesday matters. But in THEIR math I can substitute Tuesday with Monday, Wednesday, etc., and it will not make a difference in THEIR math. So it doesn't matter what day I pick, the math is the same. So the information that the child whose sex we know was born on "a day" matters vs not being told that it was born on "a day". Ok. But it is a given that the child was born on "a day" (any day). So THEIR math gives a logical self-contradiction. A very fun one, don't you think? This is not vague intuition, it's a good/fun point, as was the tautology point made by the other guy. I haven't seen you make a point. The only counterarguments you make are telling people they don't understand the math (without further explanation), telling them they are in the minority for thinking something (a non-argument), and the "reliable sources say so" argument, without mentioning or examining the sources. No offence, but can't you see your replies (not just to me) are pretty hollow?
Again, you can plug any kind of data into equations and they will spit something out. And sure, it will be algebraically correct, because you've followed the rules. But that doesn't mean you've proven anything. This can be easily demonstrated. You can take something nonsensical, and run off with it mathematically, and everything you do on paper will look good, and will be according to the rules. But you're doing nonsense math. This is exactly what all these people are doing. Do their equations work out? Sure. But is what they are doing with these equations meaningful? Not many people seem to ask that question.
Here is why it doesn't make sense: The distribution already exists, before the moment of choice. Say I have flipped 100 coins, and I ask you some questions about them. Are my questions, or your answers, going to change the distribution of the coins? In other words, say I have (unknown to you) 47 tails and 53 heads, is that distribution going to change? Is the order in which they were flipped changed? No, and no.
Mr. Smith's children have already been born. If I take 1000 families with two children, they have all been born, and they will all have a certain distribution. The distribution at birth is what determines the (statistical) population. There will be roughly as many boys as girls. Some families will have 2 boys, some 2 girls, some a boy and a girl. This distribution doesn't change, it is set at birth. You asking questions about it is not going to retroactively change the distribution. You sampling the distribution in different ways is not going to change it either (although this might change/distort your view of it).
With some puzzles the information does matter. For example in the Monty Hall problem. Why? Because there it is about YOUR guess, YOUR odds, so what information YOU have matters. In that way it might impact your perspective of the information. But giving you more information is not going to change the fact that 1 in 3 doors has a prize; what changes is the chance you pick the correct one. It's not suddenly 1 in 2 doors, or no doors. The "1 in 3 doors has a prize" distribution is "unchangeable" because that distribution is a given before the question. Likewise the distribution of children is a given before the question. You are not creating that distribution while you solve the puzzle, but that's exactly what all these people are doing. Inception!
Now, I can see how one could take a sort of quantum-mechanical approach (instead of a naive logical one). Where the observations matter, where possibly two people could observe different distributions, and both be right (even empirically) from their point of view. But then I'd like to see some more quantum mechanical maths thrown in, and a general page where this phenomenon (a generalization of it) is better set out, or in fact set out at all. With good solid sources.
I mean, have you looked at the page? Googled some more things? There is wild disagreement on this, and it is very inconsistent. And not just this side vs that side, no, there are 20 different sides, they all prove that their maths is correct, no it's 2/3, no it's 0.45672, no it's 0.5431, no it's this, no it's that, bla bla bla. Doesn't that tell you something? It's not professional, I've never seen that in professional mathematics. It reeks of hobby-ism.
And I'm not here to pick a fight, get my way, or whatever. There are pretty solid, valid criticisms made against the information on the page, and these criticisms are just brushed off. But you're absolutely right in your last couple of sentences. So you've successfully killed the (basically any) discussion one can have on this, or any, topic. Mission accomplished.
VxvUy7HoDiHDUISzEwYHQY2wYt7Tty3k (talk) 16:02, 18 April 2016 (UTC)

@VxvUy7HoDiHDUISzEwYHQY2wYt7Tty3k: There are basically three reasons people can get different answers:

  1. They work on different formulations which sound similar but aren't mathematically equivalent.
  2. They work on the same formulation but interpret it differently.
  3. They make wrong calculations.

This "paradox" has several formulations, several of those formulations are open to interpretation, and it's a famous but seemingly simple problem many hobby mathematicians feel qualified to write about but probably shouldn't have. This all contributes to the apparent confusion between the answers. But if a reasonably competent mathematician clearly states which formulation he is working with and how he interprets it then making a correct calculation is not so hard and doesn't involve plugging numbers into arbitrary equations and getting nonsensical results.

Let me try to explain one version with a minimum of probability theory and formula complexity by giving actual counts and using expected cases instead of probability formulas.

Let this be the formulation we work with: "Mr. Smith is taken randomly from all people who have exactly two children of which at least one is a boy born on a Tuesday."

Suppose there are 196 Mr. Smiths with exactly two children before gender is considered. Two children must always be born at different times, even for twins. Assume each birth has 50% chance of being a boy (the type of standard assumption made even though it doesn't match real birth statistics). Also assume boys and girls have the same chance to survive until the puzzle (doesn't match reality either). Let BG, GB, GG and BB be the four possibilities of the gender of the younger and older child. We expect 196/4 = 49 of each of the four possibilities. The 49 cases with GG are discarded so 49+49+49 = 147 cases remain with at least one boy. If we only consider cases where at least one boy is born on a Tuesday then we expect 1/7 of the 49 BG cases to remain. That's 7 BG cases. Same for 7 GB cases. But now consider the 49 BB cases. We expect 7 cases where the first boy is born Tuesday and 7 cases where the second boy is born Tuesday. The case where both boys were born Tuesday was counted twice so the expected number of BB cases with at least one boy born on a Tuesday is 7+7-1 = 13.

We now have 7 + 7 + 13 = 27 cases left when BG, GB and BB are considered together and at least one boy is born on a Tuesday. If we pick one of these 27 cases at random then each one has the same probability and the chance of getting a boy and a girl is (7+7)/27 = 14/27 = 0.5185185..., the number given in Boy or Girl paradox#Information about the child.
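
The count above can be checked by brute force. The following is only a sketch under the same assumptions as the paragraph (sexes and weekdays independent and equally likely); the names `has_tuesday_boy` and `is_mixed` are made up for this illustration, and which index stands for Tuesday is arbitrary.

```python
from fractions import Fraction
from itertools import product

SEXES = ("B", "G")
DAYS = range(7)      # seven weekdays, labelled 0..6
TUESDAY = 1          # arbitrary label; any fixed day gives the same counts

# A child is a (sex, day) pair; a family is (younger child, older child).
children = list(product(SEXES, DAYS))          # 14 equally likely children
families = list(product(children, repeat=2))   # 196 equally likely families

def has_tuesday_boy(family):
    """True if at least one child is a boy born on the chosen day."""
    return any(child == ("B", TUESDAY) for child in family)

def is_mixed(family):
    """True if the family has one boy and one girl."""
    return {sex for sex, _ in family} == {"B", "G"}

pool = [f for f in families if has_tuesday_boy(f)]   # reduce the pool first
mixed = [f for f in pool if is_mixed(f)]             # then count boy+girl cases

print(len(families), len(pool), len(mixed))   # 196 27 14
print(Fraction(len(mixed), len(pool)))        # 14/27
```

The 196 families here play the role of the 196 Mr. Smiths in the worked example, so the counts can be compared directly.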

The essential thing here is that before picking a random Mr. Smith, we have already reduced the pool of Mr. Smiths to those where there is at least one boy who is born on a Tuesday.

If we had only reduced the pool to those with at least one boy and then picked a random case, the chance of a boy and a girl would be (BG + GB)/(BG + GB + BB) = (49 + 49)/147 = 2/3.
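
To compare the two pools side by side, here is a small sketch (same equal-likelihood assumptions as above; `p_mixed`, `per_day` and `any_boy` are names invented for the illustration). It builds one pool per weekday and the single "at least one boy" pool, and shows that the per-day pools overlap, which is why they are not simply seven equal slices of the larger pool.

```python
from fractions import Fraction
from itertools import product

children = list(product("BG", range(7)))       # (sex, weekday)
families = list(product(children, repeat=2))   # 196 equally likely families

def p_mixed(pool):
    """Fraction of families in the pool with one boy and one girl."""
    mixed = [f for f in pool if {sex for sex, _ in f} == {"B", "G"}]
    return Fraction(len(mixed), len(pool))

# One pool per weekday d: at least one boy born on day d.
per_day = {d: [f for f in families if ("B", d) in f] for d in range(7)}
# Pool with no day restriction: at least one boy.
any_boy = [f for f in families if any(sex == "B" for sex, _ in f)]

for d, pool in per_day.items():
    print(d, len(pool), p_mixed(pool))          # every day: 27 families, 14/27
print(len(any_boy), p_mixed(any_boy))           # 147 2/3
print(sum(len(p) for p in per_day.values()))    # 189, more than 147
```

The seven per-day pools add up to 189 rather than 147 because a two-boy family whose boys were born on different days belongs to two of them.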

Consider this formulation similar to your own: "Mr. Smith is taken randomly from all people who have exactly two children of which at least one is a boy born on a day."

"born on a day" doesn't limit the pool further before picking a random Mr. Smith so the chance of a boy and girl is still (49+49)/147 = 2/3. The reason the earlier Tuesday case gives a different answer is that there the original BB cases are more likely to remain in the pool than each of the BG and GB cases, since the BB cases have two chances of a boy who was born on a Tuesday.

If we had reduced the pool to those where the oldest child is a boy and then picked a random case, the chance of a boy and a girl would be GB/(GB + BB) = 49/(49+49) = 1/2. Only one of BG and GB is allowed here before picking Mr. Smith so the probabilities change. Many interpretations of different formulations correspond to this case or something equivalent, for example "Assume we see a random one of Mr. Smith's children and it is a boy".
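
Both readings mentioned in this paragraph can be enumerated in a few lines. This is a sketch only: the second pool assumes that the child we happen to see is chosen uniformly at random from the two, which is one possible reading of "we see a random one of Mr. Smith's children".

```python
from fractions import Fraction
from itertools import product

families = list(product("BG", repeat=2))   # (younger, older), each with probability 1/4

def is_mixed(family):
    return set(family) == {"B", "G"}

# Pool 1: the older child is a boy.
older_boy = [f for f in families if f[1] == "B"]
p_older = Fraction(sum(is_mixed(f) for f in older_boy), len(older_boy))

# Pool 2: one child is seen at random and turns out to be a boy.
# Every (family, seen child) pair is equally likely: 4 families x 2 children.
seen_boy = [(f, i) for f in families for i in (0, 1) if f[i] == "B"]
p_seen = Fraction(sum(is_mixed(f) for f, _ in seen_boy), len(seen_boy))

print(p_older, p_seen)   # 1/2 1/2
```

In the second pool the two-boy family appears twice (once per child that could have been seen), which is what brings the answer back to 1/2.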

I hope this makes sense. I'm not spending that long again. PrimeHunter (talk) 03:55, 19 April 2016 (UTC)

I know it's not a forum, but +200 wiki points for taking the time.
"Reducing the pool" was/is clear. I guess in the "Tuesday" problem it somehow seemed more arbitrary, and unclear, possibly because of all the murkiness surrounding this puzzle, with apparently many explanations both correct (depending on interpretation and execution) and incorrect. It is also a shame no one makes a nice generalization, let alone a well explained one, that would show what's going on fundamentally (not just with this particular paradox). Your interpretation and explanation are clear, and not that different from similar ones (formal or not), but the last two paragraphs make it better than the ones I've seen, and demonstrate the flaw in my "born on a day" argument. One can say "a day", as long as that "a day" is the same for all boys selected, and to do that one could argue it must be specified; in that case, there is no contradiction. Embarrassingly obvious in hindsight. I'd almost say delete this section as irrelevant, but then, why not leave it as a monument to embarrassment. VxvUy7HoDiHDUISzEwYHQY2wYt7Tty3k (talk) 19:09, 19 April 2016 (UTC)

Misuse of additional assumptions in the explanation of the paradox

The article currently states that [t]he paradox occurs when it is not known how the statement "at least one is a boy" was generated. But this is not true. Instead, the paradox occurs when the question of how the conditional came to be the conditional is considered. It is correctly pointed out that different assumptions lead to different results and that the paradox arises because one might mistakenly assume the different assumptions to be equivalent, but it is not pointed out that the assumptions themselves are completely unnecessary, because the derivation of the correct answer one third in the section "Bayesian analysis" actually makes no assumptions whatsoever about the origin of the information. As a result, the false impression is given that one of the assumptions has to be made at all. 178.200.19.178 (talk) 12:47, 7 January 2016 (UTC)

The Bayesian analysis you refer to assumes that the events "there is at least one boy" and "we know there is at least one boy" represent the same set of outcomes. So it is implicitly assuming that we learned this fact by asking about a boy. That is, it takes one of those assumptions that you said were unnecessary as a given.
The point is that "information" and "events" are not the same thing. Probability is based on events, and to convert one to the other you need to know how the information was gained. This was first demonstrated in 1889 by Joseph Bertrand in his Box Problem. It is sometimes called the Box Paradox, but to Bertrand the word "paradox" referred to how he showed that an analysis like that Bayesian analysis produces contradictions. For example, say there is an ink blob in your textbook, and the problem you see is "Mr. Smith has two children. The gender of at least one is ****. What is the probability that Mr. Smith has a boy and a girl?" The Bayesian analysis says the answer is 2/3 whether the word obliterated by the ink blob is "boy" or "girl." But if you don't need to see the word to get the answer, then the chances have to be 2/3 that any man with two children has a boy and a girl. Paradox. Or, in the Monty Hall Problem: if the same Bayesian Analysis is applied to the information "Door #3 has a goat," then Door #1 and Door #2 each have a 50% chance to have the car. JeffJor (talk) 17:37, 28 July 2016 (UTC)
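The ink-blob argument can be made concrete with the law of total probability. The sketch below is an illustration, not part of the article's analysis: it assumes one particular way the statement could be produced (the hidden word is the sex of a child chosen uniformly at random), and the names `joint`, `p_word` and `p_mixed_given` are invented for it.

```python
from fractions import Fraction
from itertools import product

families = list(product("BG", repeat=2))   # BB, BG, GB, GG, each with probability 1/4

def mixed(family):
    return set(family) == {"B", "G"}

# Unconditional chance of a boy and a girl.
prior_mixed = Fraction(sum(mixed(f) for f in families), len(families))

# Assumed protocol: the hidden word is the sex of a randomly chosen child.
# joint[(family, word)] is the probability of that family AND that word.
joint = {}
for f in families:
    for child in f:
        key = (f, child)
        joint[key] = joint.get(key, Fraction(0)) + Fraction(1, 8)

def p_word(word):
    return sum(p for (f, w), p in joint.items() if w == word)

def p_mixed_given(word):
    return sum(p for (f, w), p in joint.items() if w == word and mixed(f)) / p_word(word)

# Averaging the conditional answers over the hidden word must give back the prior.
naive_total = sum(p_word(w) * Fraction(2, 3) for w in "BG")
protocol_total = sum(p_word(w) * p_mixed_given(w) for w in "BG")

print(prior_mixed)                              # 1/2
print(p_mixed_given("B"), p_mixed_given("G"))   # 1/2 1/2
print(naive_total, protocol_total)              # 2/3 1/2
```

Answering 2/3 for both possible words averages to 2/3, which disagrees with the unconditional 1/2; under the assumed protocol the conditional answers are 1/2 each and the average is consistent.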

As a further note, I would actually suggest that a lot of the article should be rewritten. The calculations were written down without LaTeX, notation is used in ways that are both inconsistent and contradictory, the "Second question" section abandons the methodology employed for the first question and immediately claims that the assumption of "how children were considered" is necessary, it is redundant with the section "Bayesian analysis" which features no references, the "Martingale analysis" section has no purpose or references, the "Psychological investigation" section (incorrectly) claims that there is no correct answer without and the important subtlety of the paradox arising because one might be tempted to label the children as "the child I know to be a boy" and "the other child" is buried there. This article is a misleading mess. 178.200.19.178 (talk) 16:15, 7 January 2016 (UTC)

The two problems are equivalent

Let's consider the premises for the second problem. (1) Mr Smith has two children. (2) At least one child is a boy. What can you conclude about the sex of "the other child"? Let us add a third premise: (3) Either the boy is the oldest child or the boy is the youngest child. This is a tautology so we are not changing the information we are given. However, if we consider the two possibilities raised in premise(3), each reduces the problem to the first version and, thus, the answer is .5. Any claims of ambiguity must be dismissed. — Preceding unsigned comment added by BrianEButler (talkcontribs) 17:48, 26 June 2015 (UTC)

This is what cognitive scientists call a "cognitive illusion". Other examples are sentences such as "No head injury is too trivial to ignore" and logic problems such as "Either you have a King and an Ace in your hand or else you have a Queen and an Ace in your hand. You have a King. Do you have an Ace?" — Preceding unsigned comment added by BrianEButler (talkcontribs) 17:52, 26 June 2015 (UTC)

Reliable sources disagree with you (and so do I). Wikipedia's content is based on reliable sources so the article should continue to talk about ambiguity. Also, you changed the formulation by talking about "the other child" and "the boy". The second problem doesn't say "the" and I think it's unclear what "the boy" would mean if there are actually two boys but we don't have knowledge that a specific one of the children is a boy. PrimeHunter (talk) 01:43, 27 June 2015 (UTC)
Someone makes a pretty solid point, which doesn't get addressed. Instead we get a "reliable sources disagree with you" nonsense reply, with some meaningless nit-picking, completely (and I suspect deliberately, but perhaps not consciously) missing the point.
Here's the problem with the content on the page, and with the attempts people have made to tackle the problem. You can take some data, then plug it into some algebraic (your favorite statistical) equation, and something will come out on the other end. And because you've used the rules of algebra, it will be "mathematically" correct. But there is one thing people forget here: the machinery of equations doesn't check whether the input is meaningful, or makes sense.
The main argument seems to boil down to: "The seemingly irrelevant information is relevant, because when I take it in to account, it changes the outcome"'. Yes, of course it does, but you have not proven or disproven anything, and you have not demonstrated if the information is in fact relevant or not.
In this case there are several very solid arguments to be made that what people are trying to do here doesn't make sense. Let me just shake a couple out of my sleeve, without any effort:
1. Does it matter if we pick a different day in the examples people posted? Monday? Friday? No, the "calculation" people use remains the same, so it seems not to matter what day the child was born on, just that you know it was born on "a day", but you already know that, even if we don't mention the day, so we have an odd logical contradiction here. Basically this joins up with BrianEButler's tautology argument.
2. In many of the problems that are considered similar, like the Monty Hall problem, the chance refers to something about the person making the guess (and not about the distribution about which the guess is made). Obviously, if you give such a person more/different information, that is going to affect the statistic of the guess. But in this case we want to know what the chances are that "the other child" is a boy or a girl; in other words, it's a question directly about an (external) distribution. The statistical distribution of the population doesn't change now that we suddenly know it's Tuesday. That knowledge doesn't affect the distribution.
3. Why stop at days? The boy is born on a Tuesday, and wears a red hat, bought in a store that sells 20 colours, the other child also wears a hat. Let's plug the 20 colours into the equations! After all, colours retroactively change the biological distribution in nature of male/female. The insanity is endless, and should hopefully be clear. You can change the statistics at whim. Not a good sign your solution is meaningful.
4. I have tossed two coins, one is heads, what are the chances the other one is tails? Does this change if I tell you I tossed one of the coins on a Tuesday?
Don't tell me that using the word "coin" instead of the word "boy" is not logically equivalent, bla-bla. It's about the logic of what some people try to do here, think about what you're doing. Is it meaningful? Instead of just running away with it, plugging it into an equation, and tadaa "proof".
I am aware that the wording of some problems matters, I'm also aware that many times it doesn't. When one thinks it does, one should provide a good argument, or preferably logical proof that it does.
Here's another interesting thing. Many similar problems, like (again) Monty Hall, can be verified empirically. I know, because I've done this, with several problems, to demonstrate them to people (from different angles), and I can't think of a meaningful way to incorporate the Tuesday information experimentally (because it isn't). Also, the outcome of these problems remains the same if tackled from different angles, both theoretically and empirically. Not the case with the Tuesday problem, as you can clearly see with all the "ambiguity", which means all sorts of alarms should be ringing in your head right now.
- A very reliable source (me).VxvUy7HoDiHDUISzEwYHQY2wYt7Tty3k (talk) 22:44, 17 April 2016 (UTC)
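For readers who want to try the empirical check mentioned above, one possible experiment is sketched here. It is an assumption-laden illustration, not a claim about how the puzzle must be read: families are generated at random, only those with at least one boy carrying a fixed attribute value (weekday 0 of 7, or hat colour 0 of 20) are kept, and the share of boy-and-girl families among the kept ones is reported. The function name and parameters are hypothetical.

```python
import random

def simulate_mixed_fraction(n_values=7, trials=200_000, seed=0):
    """Estimate P(boy and girl) among families kept by the filter.

    n_values is the number of equally likely attribute values (7 weekdays,
    20 hat colours, ...). A family is kept only if at least one child is a
    boy with attribute value 0.
    """
    rng = random.Random(seed)
    kept = mixed = 0
    for _ in range(trials):
        family = [(rng.choice("BG"), rng.randrange(n_values)) for _ in range(2)]
        if ("B", 0) in family:                 # the pool is reduced here
            kept += 1
            if family[0][0] != family[1][0]:   # one boy, one girl
                mixed += 1
    return kept, mixed / kept

for n in (7, 20):
    kept, estimate = simulate_mixed_fraction(n_values=n)
    exact = 2 * n / (4 * n - 1)                # 14/27 for n = 7, 40/79 for n = 20
    print(n, kept, round(estimate, 4), round(exact, 4))
```

The population of generated families is never altered; only the subset being counted changes. The closed form 2n/(4n - 1) follows from the same counting as the 27-case argument, with n in place of 7.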
I'm not sure which formulation of the question in the boy or girl paradox you are arguing about or what you think is the correct answer to that formulation. However, there are certainly formulations of the question where reliable sources say there is ambiguity and different answers are possible depending on how you view the question. It's a core principle of Wikipedia that article content should be based on published reliable sources and not opinions or original research by the editors. See Wikipedia:Verifiability. PrimeHunter (talk) 23:50, 17 April 2016 (UTC)
At a glance the original problem and its analysis are fine. I'm talking about the "Tuesday" addition (and at a glance similar additions). This is also the problem that seems to be responsible for most of the "ambiguity". I've added a section (24 I believe) to the talk page, showing a simple argument why the entire "Tuesday" problem is meaningless (and all of its different solutions). I'm sorry if the sources are potentially wrong. My apologies.
Sure, maybe you can't use my argument (in section 24 of the talk page) to update the page because it is original content, and not a source from outside Wikipedia. But don't we have some responsibility to check whether or not information is correct? Or do we just cite for the sake of it, just because some sources say so? Those sources could be wrong, or unreliable. I don't know any of the sources, I don't care. I don't care if the majority thinks this or that, I don't care if someone who has accidentally passed a math exam wrote a book and said something stupid. Publishing a paper also doesn't mean a thing anymore these days, as any good academic can tell you. I don't mean to be unkind, but that's just the reality of things at the moment (and I mean this in a general sense, not just in the examples mentioned in this paragraph).
If a page is questionable, and it is being discussed on the talk page, let's discuss it. Let's not just dismiss people with some "majority" argument. Lots of people look for information on Wikipedia when they want more information on such problems, and they just accept things without thinking, we both know that. Let us at least make sure they are fed information of a higher quality, or mark the page as highly unreliable, or whatever.
But alright, I can see how this might not be the right place to make any argument of this sort on the contents of the page (although people seem to do just exactly that on this talk page). You could argue that I should take it somewhere else, and then it will find its way to Wikipedia. Nevertheless, my argument stands, feel free to hit people around the head with it (for your own amusement), and see what happens. I'll just go back into the shadows from which I came.
VxvUy7HoDiHDUISzEwYHQY2wYt7Tty3k (talk) 01:43, 18 April 2016 (UTC)
I'm starting to believe that some people just were not satisfied with 1/3 because it's not what they were expecting or what their calculations yielded. So, they started introducing several ifs and thens to change the problem to suit their expected result, or simply to discredit the original solution, even going as far as to call the other people too stupid to understand the reality, therefore resulting in the formation of two camps: the 1/3ers and the 1/2ers. Then, probably many of the 1/3ers thought they might have overlooked something or it might be too complicated, so they decided, 'let the 1/2ers say what they think, cause there are some reputable people among them', thus helping to promote a wrong way to look at the problem, which is now even considered a valid solution. Or, I am simply too stupid to see the logic of the 1/2ers' approach. OR (and this is my preferred approach atm), people just use their brains in such different ways that either way can be the correct way to look at it, and statistics is not as unambiguous as many (including me) are used to believing. I now see it as a bit similar to the white-gold/black-blue dress situation. --Treysis (talk) 18:10, 12 November 2017 (UTC)