# User talk:Gill110951/Archive 3

## Everything wikipedia (and later citizendium) taught me on MHP

... got written up in a number of papers, some of survey nature, others more about my personal (professional) opinion. The intended readership is teachers and students of probability and statistics. The articles are also intended to supply a resource of Indisputable Mathematical Truths, including relationships between different proofs, and giving special care to show what mathematical assumptions are needed to get which results. This work started life as working notes on my home page, useful since I often include MHP in talks to lawyers and doctors and students about statistics and society. Later I was asked to contribute to the Springer international encyclopedia of statistical science and its open source companion and it was a welcome opportunity to write the article about MHP, which *I* would have liked to read, when first I learnt about the problem.

All are peer-reviewed and published now (except for the second on the list - extended version of the top one). Comments are welcome.

The Monty Hall Problem, Article in StatProb.com, the statistical societies' internet encyclopedia. (2011)
The Monty Hall Problem, extended version (includes technical appendices to previous). (written in 2011)
MHP is not a probability puzzle: it's a challenge in mathematical modelling, Statistica Neerlandica (written in 2010)
The three doors problems, Springer International Encyclopaedia of Statistical Science (written in 2009)

Editing in an area in which you have professional or academic expertise is not, in itself, a conflict of interest. Using material you yourself have written or published is allowed within reason, but only if it is relevant and conforms to the content policies. Excessive self-citation is strongly discouraged. When in doubt, defer to the community's opinion. In any case, citations should be in the third person and should not place undue emphasis on your work, giving proper due to the work of others, as in a review article.

How can one do otherwise than defer to the community's opinion? That's what wikipedia is about. Richard Gill (talk) 15:41, 21 February 2011 (UTC)

## Timeline for evidence in Monty Hall case

Please see Wikipedia talk:Arbitration/Requests/Case/Monty Hall problem/Evidence#Timeline for Evidence, Proposed Decision. On behalf of the Arbitration Committee, Dougweller (talk) 16:43, 21 February 2011 (UTC)

Thanks. Richard Gill (talk) 16:48, 21 February 2011 (UTC)

Re your question: I gladly affirm my permission for you to publish my little bit of humor elsewhere. Please note that it was submitted under the terms of the CC-BY-SA 3.0 and GFDL licenses that Wikipedia uses, so I don't think you need my explicit permission so long as you abide by the terms of either of those licenses when you republish it. If it's not too inconvenient, I'd be grateful if you provided me a link to the republished material if/when it appears elsewhere. (A simple note on my talk page will be great.) Best wishes, alanyst /talk/ 20:43, 21 February 2011 (UTC)

Will do. In order to rapidly increase the "reliable source" status of my recent papers, nowadays at the end of my mathematical or statistical conference talks on Lucia de Berk or quantum statistics I add some remarks on MHP and wikipedia, and urge my mathematical colleagues to write a note in their next paper, whether on algebraic topology or statistical learning: "On a totally unrelated note, Gill (2011) has written a crappy paper on the Monty Hall Problem". You see, I aim to rapidly raise its citation count above that of the infamous Morgan et al. (1991), source of all our wikipedia MHP woes. Richard Gill (talk) 00:54, 22 February 2011 (UTC)
Will it improve my Erdős number? That is above all else my chief concern. :-) alanyst /talk/ 01:51, 22 February 2011 (UTC)
Mine is 3. I wrote a paper with Mike Keane who has 2. Richard Gill (talk) 01:54, 22 February 2011 (UTC)

## de Berk scandal illustrates Fisher < Peirce

Hi Richard!

I mentioned your not knowing me. Olle Häggström told me about your advocacy during the dinner at Allan Gut's retirement conference. Both of them have caught the popular-science bug, and have written books and made many public appearances discussing statistics (in Sweden). Either would make a great Wikipedia editor, if asked by a respected colleague.

The de Berk scandal illustrates the problems of groupthink and bureaucratic power in Northern European societies: I have wondered whether these problems stem ultimately from Viking conformism, or Luther and Protestantism, or Kant, or Hegel, or random drift, or .... Such "worship of the state" (failure to protect citizens and civil society) is the "speck in our neighbor's eye", to which you alluded earlier.

The de Berk scandal also displays the "mote in the eye of statistics", fallacies of neo-Fisherian inference using "statistical models".

1. Fisher's "method" of assuming independent samples from an infinite population.
2. So-called "hypothesis testing" using LRTs on data generating the hypothesis.

These "methods" are typically defended with arguments (not about approximation bounds but) about asymptotics, which are irrelevant to statistical practice.

The de Berk scandal is a disaster caused by the neo-Fisherian "methods". We must choose between Peirce and Fisher and even, alas, between Cox and Freedman!

Seriously & sincerely,  Kiefer.Wolfowitz  (Discussion) 02:25, 22 February 2011 (UTC)

Absolutely! (groupthink and bureaucratic power in Northern European societies). Certainly the peculiarities of miscarriages of justice in the Netherlands have a lot to do with the Dutch character: more efficient and self-organising than the Germans; Calvinism; the so-called pillared society (Dutch tolerance is actually indifference. Everyone minds their own business and gets on with earning money). The longer I have lived in NL the deeper I see the differences with UK. National cultures are very like people: their strengths tell you about their weaknesses, and vice versa.
I think there's a place for both Cox and Freedman. The difficulty is how to communicate to the people you do statistics for what you can do for them: what Cox will give them, and what Freedman will give them. Richard Gill (talk) 02:38, 22 February 2011 (UTC)
George Orwell's "England Your England" is a joy to read, with its celebration of gardening, tolerance and even encouragement of eccentrics, and its reminder of how anti-militarist British popular culture has been, so that military personnel don't wear uniforms when they go to pubs, etc.
Granted, one would be overjoyed to have either assisting on any practical problem. However, I don't remember Cox's recent Principles discussing the difference between writing in the study protocol a likelihood analysis of a future experiment/study and doing a retrospective analysis (using AIC, etc.). This is an important distinction, as the de Berk scandal shows.
Maybe I should have counterposed Freedman and your student van der Laan and your coauthor Robins, and not mentioned Cox, who may be the world's greatest statistician.  Kiefer.Wolfowitz  (Discussion) 02:48, 22 February 2011 (UTC)

### Quotes from Peirce

P.S. A quote from the end of the article on likelihood function:

"In many writings by Charles Sanders Peirce, model-based inference is distinguished from statistical procedures based on objective randomization. Peirce's preference for randomization-based inference is discussed in "Illustrations of the Logic of Science" (1877–1878) and "A Theory of Probable Inference" (1883)".[citation needed]

"probabilities that are strictly objective and at the same time very great, although they can never be absolutely conclusive, ought nevertheless to influence our preference for one hypothesis over another; but slight probabilities, even if objective, are not worth consideration; and merely subjective likelihoods should be disregarded altogether. For they are merely expressions of our preconceived notions" (7.227 in his Collected Papers[citation needed]).

"But experience must be our chart in economical navigation; and experience shows that likelihoods are treacherous guides. Nothing has caused so much waste of time and means, in all sorts of researchers, as inquirers' becoming so wedded to certain likelihoods as to forget all the other factors of the economy of research; so that, unless it be very solidly grounded, likelihood is far better disregarded, or nearly so; and even when it seems solidly grounded, it should be proceeded upon with a cautious tread, with an eye to other considerations, and recollection of the disasters caused." (Essential Peirce[citation needed], volume 2, pages 108–109)"

The neo-Fisherian "method" of "testing hypotheses" on the data generating them was labeled the most dangerous fallacy of induction by Peirce. (Maximum-likelihood estimation was the most popular fallacy!)[1]

1. ^ Pages 194-196 in Reasoning and the Logic of Things (RLT) (The 1898 Lectures in Cambridge, MA)

## My role in the case of Lucia de Berk

This material is moved from the talk page for the article about the living person Richard D. Gill, where there was a discussion about my role in the Lucia case. Here is a summary of what I did and how. One of many many people. But certainly statistics played an enormous role in the affair, and statisticians played a crucial role in getting Lucia a fair trial, at which she was completely exonerated. The minister of justice kneeled in the dust, the Dutch taxpayer forked out millions in compensation, the lawyers and scientists learnt a lot about how to not let these things happen in future, and the medical community looks the other way as if they never had anything to do with it all. Richard Gill (talk) 08:37, 22 February 2011 (UTC)

In the very first trial, 2001-2003, Lucia was sent to jail by bad statistics. At her appeal, 2004, the statistics was disguised as medical evidence. E.G. medical doctors gave evidence that normally they would have thought that certain deaths were natural, but because Lucia was so often present, they thought them suspicious. This was in fact the *sole* proof that two of the 7 deaths Lucia was supposed to have caused, were in fact unnatural deaths: *all* other medical experts (half a dozen, for each death) thought they were natural. But the judges could always find one medical expert to support the verdict which they had become committed to. Everybody, including the media and the statistics community, believed that Lucia had been convicted purely on medical evidence. Everybody forgot about the case.
There was a tricky legal obstruction to re-opening the case, because from the legal point of view there was no new evidence - there was only "new" interpretation of old evidence. (And even the "new" interpretation wasn't new). A scientist's interpretation of evidence is not evidence itself, it is just an opinion. And the judges are the ones who must choose which opinion to believe in, if any.
My petition was signed by numerous reputable scientists (including a Dutch Nobel prize winning physicist whose wife and daughters are medical doctors) in the Netherlands and abroad, as well as by nurses, doctors, lawyers, artists, musicians, ordinary decent folk (several thousand) and was delivered by me and others to the Minister for Justice. This also appeared on TV of course. An article got written in the Guardian and in Nature. Again, through the international connections of the statistical community (and via several other persons than me). The Dutch wikipedia article on Lucia, whose talk page was for a long time essentially a discussion forum on the case, got translated into English. All this international publicity, bad for the international business interests of Netherlands Ltd, imbued the authorities with some sense that something had to be done about it. At last the supreme court managed to get their act together and figured out a legal way that the trial could be reopened under the rules which they have written themselves. They've been constantly narrowing the legal definition of "new fact" for the last 80 years; it is now very far from the spirit of the 1926 law. Unfortunately parliament and government are uninterested. "Trias Politica" is all parliamentarians ever told us about the case. Parliament is the organ which should tell the Supreme Court how parliament's own laws should be interpreted, but the wilful ignorance there of "life on the ground" was appalling.
The new trial was open and unbiased. Medical experts were involved who for the first time had access to all medical dossiers and who were not personally connected to the original hospital authorities. Their conclusions were damning. So Lucia at last got a fair trial and it turned out that there were a lot of medical errors (diagnosis, medication) made by specialists at the hospital; while the nurses' professional behaviour was exemplary. The possibility of medical errors had never been investigated by the police or the prosecution. Yet many of the medical witnesses from the hospital must have been aware of them.
I'm now advocating some kind of enquiry into the origins of the case in order to learn from it for the future. It could so easily happen again; I believe it exposes a number of system failings which need to be brought into the open. However the minister of justice has publicly apologized and Lucia has received millions of euros of taxpayers' money in compensation. The legal and scientific professions have learnt from the debacle. The medical world is silent. The hospital is suing me for writing critical comments, but including some personal details about key figures in the case, on internet blogs where the case is still discussed. Richard Gill (talk) 15:21, 21 February 2011 (UTC)

## Compensation for Lucia

Hi Richard, Could you try to find a reference that Lucia de Berk did receive (undisclosed) compensation, please? Best regards,  Kiefer.Wolfowitz  (Discussion) 11:43, 14 March 2011 (UTC)

I guess in the Dutch newspapers, somewhere. I could also ask her lawyers. I'll do that. Richard Gill (talk) 16:16, 16 March 2011 (UTC)
Here are two references to the official fact of the amount of compensation having been agreed between Lucia and the Public Ministry. [1], [2]. I guess these media were reproducing a press release of the Ministry of Justice, which was distributed by ANP, the Dutch equivalent of Reuters. Richard Gill (talk) 09:30, 18 March 2011 (UTC).
Here is another good reference [3]. The news was somehow leaked to a TV station, and later the ministry of justice confirmed it to ANP. But the whole thing has been done in secret and there is no official announcement of the compensation, let alone of the amount. No doubt this is part of the deal whereby Lucia's lawyers got the best for her which is possible. Also, all court documents in possession of experts or other outsiders are being returned, at the request of Lucia's lawyers, to be destroyed. No doubt this is also part of the deal.
Meanwhile I continue my fight to reveal the true story: hospital staff had actually corrected several dossiers of babies, who had died because of medical errors, so as to avoid complaints by the parents. Later the medical specialists treating those babies gave evidence implicating Lucia as murderer. The suspicion that she was a serial killer was a wonderful opportunity to clean the slate, so to speak. The public prosecution service was infiltrated by close friends of the top medical authorities at the hospital who brought charges. Similarly, the top of the (ruling) socialist party, and the top editorship of the quality newspaper NRC - the last newspaper which so to speak "changed sides".
Unfortunately I cannot cite a reliable and published source for these stories, but I do have very good reason indeed to believe that it is all true. Call it the intermediate results of ongoing research.
Obviously there are strong incentives throughout the top of Dutch society to keep quiet about what really happened. This will surely have benefitted Lucia's lawyers in their mission to get her the best deal possible. Richard Gill (talk) 09:56, 18 March 2011 (UTC)

## Simple solution

Richard, why are you so keen on stating that the simple solution is correct, be it together with symmetry arguments, when it is just the simple solution without any further arguments that we, the other editors, are fighting, and that is not correct, as you yourself have said, as a solution to the full MHP? You seem to spread a smoke screen, as if you want people to accept the simple solution (in its simple form) because your authority guarantees it may be extended to be correct. It never will be, because, as you have also agreed, the simple solution does not set out to consider the needed conditional probability. I should not have to ask you this. And I hope you remember that calculating (or determining) the conditional probability by symmetry arguments is one of the solving methods I showed, and one I would also like to promote in the article. But why do you want to refer to this as "simple solution", while you definitely know it is something completely different from THE simple solution? WHY? Have you yourself in the past accepted and defended the simple solution (the simple one) as a solution to the full MHP? And are you trying to justify this error??? Nijdam (talk) 10:13, 22 February 2011 (UTC)

Of course the simple solution alone does not solve the "full problem". But I do not agree that you *have* to solve MHP with the "full problem". Simple solution plus symmetry, and symmetry plus simple solution, do solve the full problem. Actually, symmetry first tells you that you do not have to solve the full problem: you need not condition on stuff which is independent.
From the mathematical point of view the full conditional solution is just one way of showing that the simple solution is optimal (in the sense of achieving the highest possible overall success-rate), as well as good. Mathematics does not have moral or legal authority. Mathematics can't ever tell you that you *must* act in a certain way. It can only tell you that it is wise to act in a certain way. The applied mathematician must explain to his client why it is wise. In the real world there are many other issues, and maybe it is wise not to be wise in some respects. Richard Gill (talk) 11:27, 22 February 2011 (UTC)
Perhaps you could respond to these questions I asked you yesterday, Richard? I think they are relevant to this discussion. Thank you. Glkanter (talk) 11:37, 22 February 2011 (UTC)
After lunch.
Perhaps you could respond to these questions I asked you on Monday, Richard? I think they are relevant to this discussion. Thank you. Glkanter (talk) 12:17, 23 February 2011 (UTC)
@Nijdam, continued. I do not agree that you have to compute a conditional probability. Symmetry can be used in advance. I think it is used intuitively by most ordinary people exactly in this way.
I also think that the optimality of 2/3 (overall success rate) is so obvious intuitively that there is hardly any reason to look at the conditional probability.
My "own research" consists in locating the mathematics which is already in reliable sources, even already in reliable sources on MHP, which express the common sense of a common sense guy like Glkanter or Martin Hogbin. My "authority" is irrelevant. The mathematical proof which you find in a standard statistics textbook in the chapter on Bayes Theorem, which is used to illustrate Bayes theorem, does not give any useful insight at all into MHP. Proofs by symmetry (before or after) and proofs by Bayes' rule do give insight into MHP. And they illustrate mathematical and statistical and scientific principles which are much more important for the general reader, and which don't require specific training in the probability calculus.
The probability calculus was invented for man, not man for the probability calculus. Laplace was the first who wrote out the formal calculus much as we use it today. He grounded it all in plain logic, in symmetry, in the prior concept of "equally likely". MHP is a brain-teaser and a paradox for everyone, also for your grandma. Give it back to the people. It is not the exclusive property of some cult of algebra-wielding formalists.
I would want our own maths and statistics students to learn the ideas which give insight and understanding. I don't teach my students to be formula manipulators. We have computers for that. Richard Gill (talk) 11:41, 22 February 2011 (UTC)
Richard, try to be clear. If you solve the "full" MHP, then say so. If you want to solve some other form of the MHP, make clear which form you mean. If you want to distinguish, then don't just speak of THE MHP, but tell us what you mean. And please, it would be much less confusing if you didn't call the conditional solution using symmetry a simple solution. You may think it is sufficient to KNOW that symmetry shows that the conditional probabilities are all the same as the unconditional one; for a correct explanation it is not. You have to say so. And that's what I want. Give the people, and your and my grannies, what they are entitled to. Nijdam (talk) 20:58, 22 February 2011 (UTC)
I try to be clear. That means using different terminology in different contexts. I hope to use words such that my intended audience understands what I mean. So what I mean when I say MHP depends on who I'm talking to. Richard Gill (talk) 22:24, 22 February 2011 (UTC)
You wrote: "I do not agree that you have to compute a conditional probability. Symmetry can be used in advance." In advance of what, and with what purpose would you like to use symmetry? Nijdam (talk) 12:22, 23 February 2011 (UTC)
Symmetry shows that the door numbers are statistically independent of the door roles. No need to condition on anything which is irrelevant. Richard Gill (talk) 09:09, 24 February 2011 (UTC)
Is it your morning temper that you write such s... What the hell does it mean that door numbers are independent of their roles? When is something irrelevant? Nijdam (talk) 12:42, 24 February 2011 (UTC)
Please don't use foul language on my talk page. Please try reading what I write, and please try thinking about it. Richard Gill (talk) 13:24, 24 February 2011 (UTC)
You know what? Here are the numbers: 1, 2 and 3. Or do you prefer other numbers, like 367, 987 and 90112? Be my guest. Here are their roles, C, X and H; you know their meaning. Show me what it means that they are statistically independent. I tried very hard to understand, but I only got a headache. Nijdam (talk) 22:52, 24 February 2011 (UTC)
[4] Richard Gill (talk) 06:20, 25 February 2011 (UTC)
Richard, you wrote on "Workshop": The word "condition" (conditions, conditional, etc.) . . . YES, but the *reasoning* why it is YES in the second case is just a tiny bit longer (add the words "by symmetry, it doesn't make a difference which door the host opens, so I would still bet at the same odds"). Is that of relevance also for the "host's two *combined* doors" aspect? Gerhardvalentin (talk) 10:06, 24 February 2011 (UTC)
I don't know what you mean. Richard Gill (talk) 13:24, 24 February 2011 (UTC)
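The independence claim discussed in this thread can be checked by exact enumeration. What follows is only a minimal sketch under assumptions that are not spelled out above: the car is hidden uniformly at random, the player's initial choice is uniform and independent of the car, and the host, when he has two goat doors to choose from, opens the lower-numbered one with probability `q` (a hypothetical parameter introduced for illustration; `q = 1/2` is the symmetric host). The function names are likewise illustrative.

```python
from fractions import Fraction
from itertools import product

DOORS = (1, 2, 3)

def joint_distribution(q=Fraction(1, 2)):
    """Exact joint law of (chosen door, opened door, switching wins).

    Assumptions: car and initial choice are independent and uniform;
    q is the chance the host opens the lower-numbered goat door
    when he has a choice between two.
    """
    dist = {}
    for car, choice in product(DOORS, DOORS):
        p = Fraction(1, 9)
        goats = [d for d in DOORS if d not in (car, choice)]
        if len(goats) == 1:
            weights = {goats[0]: Fraction(1)}
        else:
            weights = {goats[0]: q, goats[1]: 1 - q}
        for opened, w in weights.items():
            key = (choice, opened, car != choice)
            dist[key] = dist.get(key, Fraction(0)) + p * w
    return dist

def is_independent(dist):
    """True if the door numbers (choice, opened) are independent of
    the role outcome (whether the unopened, unchosen door hides the car)."""
    numbers, role = {}, {}
    for (choice, opened, switch_wins), p in dist.items():
        numbers[(choice, opened)] = numbers.get((choice, opened), Fraction(0)) + p
        role[switch_wins] = role.get(switch_wins, Fraction(0)) + p
    return all(
        dist.get((c, h, w), Fraction(0)) == numbers[(c, h)] * role[w]
        for (c, h) in numbers
        for w in role
    )

print(is_independent(joint_distribution()))             # symmetric host: True
print(is_independent(joint_distribution(Fraction(1))))  # biased host: False
```

With the symmetric host every pair of door numbers has probability 1/6 and switching wins with probability 2/3 whatever the numbers are, so the joint law factorizes. With a biased host it does not, which is the case where conditioning on the door numbers can change the answer.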

## I Continue To Disagree With The Summary Of The Truth That You Just Posted On The Evidence Page

1. Would it be a fair bet at odds 2:1 that the car is not behind the door you first chose?

2. Would it be a fair bet at odds 2:1 that the car is not behind the door you first chose (door 1), after you saw the host open another door?

3. Would it be a fair bet at odds 2:1 that the car is not behind the door you first chose (door 1), after you saw the host open door 3?

1. Unconditional - can be solved by the tables of all possible outcomes as per Selvin & vos Savant

2. Simple Conditional - can be solved by logic, as depicted in the decision tree I have ascribed to Carlton

3. Formal (Complicated) Conditional - can be solved with Bayes or full blown decision tree conditioned on door #s

Posted by Glkanter (talk) 10:42, 25 February 2011 (UTC)

Fine by me to consider three different kinds of problems rather than two. What I've been saying, @Glkanter, is that problem 3 does not need to be solved by a lot of algebra. It can be solved by arithmetic based on the probabilities in a full decision tree. It can also be solved by adding a symmetry argument before or after a solution to problem 1 or 2. I added a small section to the MHP page with three different such solutions, all of them reliably sourced. Richard Gill (talk) 17:28, 26 February 2011 (UTC)
My words on the Evidence page "that's all there is" referred to the conflict between conditionalists and simplists. The quarrel is about whether some extra words need to be said about the (ir)relevance of the door numbers. William Bell (1992) said that the equivalence of the two solutions, under symmetry, is almost too obvious for words. He's a respected scientist at the US Census Bureau who has widely published in the scientific literature. That's a reliable source if ever you want one. Nijdam is making a mountain out of a molehill, because he only knows how to solve MHP by doing tedious computations, going back to formal definitions. Those definitions were invented by Laplace in order to be able to formalize the intuitive logic of the primitive concept "equally likely" (a primitive concept is one not defined in terms of other theoretical concepts, but taken for granted).
Fortunately there are a whole load of different ways, all already known in the scientific literature, to give a complete solution to MHP which only make use of basic notions of symmetry, and all of which give insight into MHP. So there need not be any conflict at all. On the other hand, the long-winded way that MHP is customarily solved in statistics textbooks is intended to help students get used to the formalism of classical probability calculus. The approach is not intended to give insight into MHP. Richard Gill (talk) 13:52, 28 February 2011 (UTC)
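The "arithmetic based on the probabilities in a full decision tree" mentioned above can be sketched for the three bets listed at the top of this section. This is an illustration only, assuming the player picked door 1, the car is equally likely to be behind each door, and the host chooses uniformly when both other doors hide goats; the helper names are made up for the example.

```python
from fractions import Fraction

def outcomes():
    """Yield (car, opened, probability) for a player who picked door 1.

    Assumptions: car uniform over doors 1-3; the host never opens
    door 1 or the car's door, and picks uniformly if he has a choice.
    """
    for car in (1, 2, 3):
        goats = [d for d in (2, 3) if d != car]
        for opened in goats:
            yield car, opened, Fraction(1, 3) / len(goats)

def prob_not_behind_door1(condition=lambda car, opened: True):
    """P(car is not behind door 1 | condition), by summing tree leaves."""
    total = sum(p for car, opened, p in outcomes() if condition(car, opened))
    hit = sum(p for car, opened, p in outcomes()
              if condition(car, opened) and car != 1)
    return hit / total

# Bet 1: no information about the host's action.
bet1 = prob_not_behind_door1()
# Bet 2: the host opened another door, but we do not know which.
bet2 = prob_not_behind_door1(lambda car, opened: opened in (2, 3))
# Bet 3: the host opened door 3.
bet3 = prob_not_behind_door1(lambda car, opened: opened == 3)

print(bet1, bet2, bet3)  # each is 2/3
```

Under these assumptions all three probabilities equal 2/3, so a bet at odds 2:1 is fair in each of the three formulations. Bet 2 coincides with bet 1 because the host is certain to open some other door.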
I hardly believe what I see. Is this really your opinion Glkanter? Nijdam (talk) 19:16, 25 February 2011 (UTC)
You'll need to be much more specific if you really expect a thoughtful response. Glkanter (talk)
Well, if this is what you think, I do not understand your fight against the criticism that Rick and I would like to be mentioned. Nijdam (talk) 19:27, 25 February 2011 (UTC)
Which of the 6 statements above is problematic? Glkanter (talk) 19:32, 25 February 2011 (UTC)
The simple conditional problem is still problematic, because how can you see(!) the host opening a door without knowing which one? I.e., there is no difference between problem 2 and 3. I only hope that when you say "can be solved", you also mean "is not solved by the other solutions". Nijdam (talk) 22:47, 26 February 2011 (UTC)
Maybe it's better to say "after you heard the host opening another door" in problem number 2. And problem 3 doesn't have to be solved using Bayes theorem. There are at least two dramatically different ways to prove it using symmetry (indifference); and Bayes rule is more informative than Bayes theorem. Richard Gill (talk) 13:27, 28 February 2011 (UTC)
Well, Richard, considering that you previously actively denied the existence of #2 at all, I'm thrilled with just the way it is in the evidence section. You know, the way the reliable sources published it. I'm only giving examples above. I would *never* dare to speak to Probability. Why did I have to post 5 times in this section before you would acknowledge that I was correct in my reading of the reliable sources? That's a big reason people think I'm uncivil and disruptive. Care to put together a scorecard on how our 'disputes' have been resolved over the last year or so? Glkanter (talk) 13:36, 28 February 2011 (UTC)
I think you misinterpret me, @Glkanter. Richard Gill (talk) 15:30, 28 February 2011 (UTC)
I can count to 5. Glkanter (talk) 15:56, 28 February 2011 (UTC)
• Nijdam, my disagreement with Richard is in regard to whether he has summarized the solutions put forth by the reliable sources in a thorough manner. I do not think he has. Matters of my, your, or Richard's opinions of "correctness" are of no interest to Wikipedia. Too bad Rick Block never explained that to you. Glkanter (talk) 23:17, 26 February 2011 (UTC)

You know, Richard, just between us two egotistical arrogant guys, in all the time we've argued, I haven't been swayed to your way of thinking a single time. But it seems that the stuff you say these days has been influenced by me. A lot. You've said I jump to conclusions, that I don't understand probability, that I don't understand the terminology, that I'm closed-minded, and not well read. But you've never proven me 'wrong' about *anything*. And sometimes you've had to (or should have, at least) admit I was right all along.

Well, that's how it is with my solutions terminology, that you make a huge deal about being different than everybody else's. Tell me that I have the above 3 scenarios wrong, Richard. If you would have worked with me to develop what I have been saying into terms acceptable to you (and Nijdam), rather than deflecting my valid arguments with BS semantics like the other day, you would be helping the arbitration, the article, and me a heck of a lot more than you do by posting your erroneous version of "what it's all about".

That's really bad when I have been "hammering away" about that simple decision tree for the better part of a year now, and you continue to completely forget/contradict yourself/overlook/ignore/discredit (pick one or more, please) it. I find that inexplicable. And very, very, frustrating, as I continue to expect better from you, for reasons I can no longer explain. Glkanter (talk) 11:08, 25 February 2011 (UTC)

And by 'right', I only mean 'thoroughly summarizing the solutions as per reliable sources' to the single MHP problem statement. Glkanter (talk) 11:15, 25 February 2011 (UTC)

• Would it be a fair bet at odds 2:1 that the car is not behind the door you first chose (door 1), after you saw the host open another door?

right between the other two:

• Would it be a fair bet at odds 2:1 that the car is not behind the door you first chose?

and

• Would it be a fair bet at odds 2:1 that the car is not behind the door you first chose (door 1), after you saw the host open door 3?

Otherwise, it does not represent the writings of a significant majority of the reliable sources. Why would you intentionally want to do that? Glkanter (talk) 20:20, 26 February 2011 (UTC)

I added the intermediate line. Like Martin, I agree with your contentions concerning Rick and Nijdam. I am not playing word games when I say that the word "conditional" when used in the context "conditional probability" has a well-defined and somewhat technical meaning, which is not carried by the common usage of the word. To explain what it means we first have to settle on a common understanding of the word "probability". EG, frequentist or subjectivist. Tell me what you mean by "probability" and I'll try to tell you what all those authors of statistics textbooks mean by "conditional probability". Richard Gill (talk) 10:56, 27 February 2011 (UTC)

No, you added some typically vague and horribly ambiguous 'Richard' version of what the sources actually say, that nobody has ever seen written or discussed before. You leave out that a door has been opened. Thanks, for trying. I'm starting to see that you just can't do things any other way. Forget that I asked about 'probability'. We've been down that path way too many times. I hope you're happy with the outcomes of the arbitration, you will have had a great deal to do with it. Best wishes, Richard. Glkanter (talk) 11:08, 27 February 2011 (UTC)

"After the host has opened a door" is ambiguous. At the middle step it is important that the player hasn't seen which door was opened. He knows a door has been opened, but not which. I added that the player knows the host has opened another door but doesn't yet know which. Richard Gill (talk) 14:50, 27 February 2011 (UTC)

Vos Savant wrote:

"...opens another door..."

And the indifference is the whole point, which is not clear from your interpretation. You might as well delete it, because no source published that nonsense. Glkanter (talk) 15:03, 27 February 2011 (UTC)

I see you made the requested change. It's been like this all along. I have to beg you, fight through semantics and jargon, and then you finally (sometimes) do exactly as I requested at the onset. The whole important, relevant point is long obscured. Then Rick Block uses these exchanges to support his incivility accusations. Glkanter (talk)

It is quite unfair what you say about my opinion. As you well know I only say that the full MHP is solved by considering a conditional probability. How we get to know its value is unimportant, as I have indicated several times. We can formally use Bayes' theorem, or as I wrote already 2 years ago, we can use a symmetry argument to ease its calculation. No big deal. Although you seem to make a mountain out of this mole hill. And, as you agreed yourself, the full MHP is not solved by the simple solution. Nijdam (talk) 16:13, 28 February 2011 (UTC)
If you define "the full MHP" to be the problem to find the conditional probability, then of course the computation of the unconditional probability is not enough. You also have to say why the desired conditional probability is equal to the computed unconditional probability. But I do not agree that you *have* to solve MHP by finding a conditional probability. (And of course Nijdam, I do know that you know that there are mathematically many different routes to find the conditional probability).
>Open door. Which MHP do you have in mind? Let us try to be specific about the versions we consider. Full MHP = (like) K&W version; simple MHP: the decision is asked from the audience (us) before the player made her initial choice. Okay? Nijdam (talk) 11:40, 1 March 2011 (UTC)
For me, MHP = Vos Savant's words. The player will reveal his decision after the host has shown the goat but he can make it earlier if he wishes. Richard Gill (talk) 20:24, 1 March 2011 (UTC)
But it is also possible in many ways to show that the overall win-chance-by-switching of 2/3 cannot be improved, hence that the value of the conditional probability is irrelevant. One way to do that is as follows. Pretend that the player's initial choice is random, and that by chance he happened to pick door number 1. Then by symmetry of the player's initial beliefs concerning location of the car and the goat-door-opening of the host, whether or not the player's initial choice hides a car is statistically independent of the three numbers written on the door chosen, the door opened, the door left closed. The numbers on these doors are independent of their roles (whether or not they hide a car). And by symmetry, the triple of door numbers is a random permutation of (1,2,3).
>I take it that with "irrelevant" you mean they all are equal to the unconditional. Fine with me, but nevertheless we calculate (derive) the conditional probabilities. And why should I pretend the player's pick is random? No need. Nijdam (talk) 11:40, 1 March 2011 (UTC)
Why condition on information which is known in advance to be irrelevant? And why not pretend the player's pick is random? Without loss of generality, it might as well have been random, and just happened to result in the choice Door 1. I find the full symmetry of ${\displaystyle S_{3}}$, the set of permutations of (1,2,3), more intuitively appealing than the symmetry of ${\displaystyle S_{2}}$, the set of permutations of (2,3). Independence is a more basic concept than conditional probability. Laplace's calculus of probability was based on the intuitive concepts "equally likely" and "independent". Conditional probability was a derived concept. It was defined in order to make the chain-rule hold. If MHP can be solved using probability calculus where probabilities are defined by symmetry (indifference), then MHP can be solved without probability calculus but using primitive concepts of "equally likely" and "independence" only, all following from the symmetries of ignorance. It can be solved with common sense logic only, so that the great unwashed and your grandma also can see the full solution. No need for high school probability calculus or for arithmetic. Just intuitive, correct, logic. Richard Gill (talk) 20:24, 1 March 2011 (UTC)
This argument says: because the player is a priori indifferent to the door numbers, he is a posteriori indifferent to the door numbers too. The only thing that is relevant to him is the roles of the doors, more precisely, the relation between the roles defined by the observed actions of himself and the host, and the roles of interest to him - whether they hide goats or cars.
> Also here some more reasoning than in the simple solution, is necessary. Nijdam (talk) 11:40, 1 March 2011 (UTC)
Yes, some more reasoning, but some more reasoning which is so intuitive and obvious that many find it unnecessary to write down explicitly. Including professional mathematicians and statisticians, writers of reliable sources. Richard Gill (talk) 20:24, 1 March 2011 (UTC)
This is how I interpret Marilyn Vos Savant's insistence that there was nothing wrong with her solution. Note that Craig Whitaker didn't specify any door numbers in his original question and Marilyn only added "say, Door 1" and "say, Door 3" as parenthetical side remarks, to help her readers visualise the situation. She didn't intend the specific door numbers to play any part in the solution and she had perfectly good reason for this. Richard Gill (talk) 09:54, 1 March 2011 (UTC)
> I interpret MvS insistence as a cover up for her omission. Nijdam (talk) 11:40, 1 March 2011 (UTC)
You would. That's an opinion, not a fact. Richard Gill (talk) 20:24, 1 March 2011 (UTC)
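The symmetry claim in this thread (door numbers carry no information once the car location and the host's choice are both treated as uniform) can be checked by brute force. What follows is only a minimal sketch under those assumptions, with the player's initial pick also taken as uniform, as suggested above; the function names are illustrative and come from none of the sources discussed here.

```python
from fractions import Fraction
from itertools import product

def joint_distribution():
    """Exact joint law of (car, chosen, opened) under full symmetry (assumed)."""
    third = Fraction(1, 3)
    dist = {}
    for car, chosen in product((1, 2, 3), repeat=2):
        # Doors the host may open: not the player's door, not the car.
        options = [d for d in (1, 2, 3) if d != chosen and d != car]
        for opened in options:
            # Host picks uniformly among his options (assumption).
            dist[(car, chosen, opened)] = third * third / len(options)
    return dist

def switch_win_given(dist):
    """P(switching wins | chosen, opened) for every observable pair of door numbers."""
    total, wins = {}, {}
    for (car, chosen, opened), p in dist.items():
        key = (chosen, opened)
        total[key] = total.get(key, 0) + p
        if car != chosen:  # switching wins exactly when the first pick is wrong
            wins[key] = wins.get(key, 0) + p
    return {key: wins.get(key, 0) / total[key] for key in total}

conditional = switch_win_given(joint_distribution())
for key in sorted(conditional):
    print(key, conditional[key])
```

Every one of the six (chosen, opened) pairs gives the same conditional value, which is what independence of the door numbers from the win/lose outcome amounts to in this model.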

Please, Richard, you are preaching to the choir. And you're not addressing any of the points I *actually* made, are you?

Your new comment on the workshop page refers to "...the biased point of view which has previously infected the article in question." I believe that would more accurately and meaningfully read, "...which currently (and has for over 2 years) infects..."

Yes, my wording was ambiguous. I have improved it. Richard Gill (talk) 09:54, 1 March 2011 (UTC)

## MHP: Rosenthal's arguments that the simple solution is "shaky" are ... shaky

I've read your two papers on MHP and [5], and here's my take on Rosenthal's [non-]counterexamples:

### Monty Crawl

This is actually a subset of MHP as defined by you, meaning it's just a specified/fixed host strategy. Rosenthal doesn't actually finish his calculations, but finishing them does not disprove Proposition 1 (or the first part of 4) from Gill 2011, on the contrary.

Consider the solution to Monty Crawl where the player chose door 2. The switch/stay chances of winning are:

• If goat is revealed at door 1, then it's 50/50
• If goat is revealed at door 3, then it's 100/0

Switching is therefore still advantageous (won't ever lower the player's chances of winning below half, but will increase them to certainty in a sub-case). This much can be already concluded from the expectiminimax sub-tree "pictured" above. Now, to compute a global/conditional win probability for switching (as a probabilities textbook would ask, but really unnecessary for answering the decision problem), you also need to make the standard assumption of uniform distribution of car/goats, which makes those two sub-branches equally likely, so the overall/conditional win probability is 3/4 for switching (better than 2/3) for this initial choice. (the overall is still 2/3, see correction below) This is of course no surprise once you know that randomization is the best strategy for the host (that plays fair). The other two cases (player choosing door 1 or 3 initially) result in the same probability.

Update: I have made an error above assuming the two bulleted cases are equally likely. This is in fact not so. There's no internal symmetry in Monty's choice. The overall chance of him revealing door 3 is only 1/3. (Proof of the general "Monty Small" case to be provided here). So, the chance of winning by switching is 1/3 * 1 + 2/3 * 1/2 = 2/3 (exactly as in MHP).
This result obviously raises the interesting question: is there any strategy Monty can do to help the player? There's actually a simple proof that no help is possible, i.e. the upper bound for probability of the player winning is 2/3 regardless of Monty's strategy. Consider Monty's choice of which goat to reveal as a covert channel. If the initial car placement and the player's choice are independent random variables (thus the player guesses with probability 1/3 on his 1st pick), then with probability 2/3 the covert channel has 0 bits, because Monty's move is forced. In the remaining 1/3 of the cases, assume Monty can signal the position of the car with probability 1. But this gives a 1/3 unconditional probability of the covert channel indicating the winning door when it's the "stay" choice, no better than what happens by no collusion. Tijfo098 (talk) 05:43, 14 March 2011 (UTC)
Nice! You see, MHP is fascinating for many different academic communities, and it's so important to bring their insights together. The problem with the Wikipedia page is that the number of sources per category bears no relation at all to their quality or novelty. Some boring analysis gets duplicated in every standard textbook as a dull example of something else, and thereby becomes the majority weight in Wikipedia terms. You can't write a good encyclopedia article without making content-based value judgements! Writers on MHP can't be mere clerks. They have to be intellectually on their toes, too. And use taste and judgement. Richard Gill (talk) 08:32, 14 March 2011 (UTC)
And two decades late! Chun proved this in 1999: [6]. This is in fact the most comprehensive paper on MHP from a game theory perspective. It was a bit hard to find, not least because Chun dresses it up in "information economics" terms (??), but it's pure game theory. Tijfo098 (talk) 10:01, 14 March 2011 (UTC)
Splendid! This is how it so often goes, editing Maths and science articles on wikipedia. People write out the explanation they give to their colleagues or students, people discuss it, some people check in the literature to see if it's already there in an easily quotable location. I never experienced before what I met with on the MHP page, talking about elementary probability, that you say something which is both common sense and a mathematical Truth, and people aggressively attack you for doing OR. No attempt whatever to understand what you are saying, try to say it in different words, locate it in the literature if possible. Now any mathematics more advanced than simple integer arithmetic is labelled "own research". That might be a "last resort" rule to apply when all else fails, but it seems to me to doom Wikipedia, as far as Maths is concerned, to become a mere ten years lagging behind copy of Maths wikipedias "owned" by mathematics communities. And similarly all over science. Is that what we want? Was that the intention of The Founder? Maybe.. Richard Gill (talk) 11:26, 14 March 2011 (UTC)
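The corrected Monty Crawl arithmetic (1/3 * 1 + 2/3 * 1/2 = 2/3) can be reproduced by exact enumeration. This is a minimal sketch, assuming a uniformly placed car and a host who always opens the lowest-numbered door that is neither the player's door nor the car; `monty_crawl` is a hypothetical helper name, not something from Rosenthal or Gill.

```python
from fractions import Fraction

def monty_crawl(chosen=2):
    """Exact Monty Crawl results for one initial choice, car uniform over doors 1-3."""
    p_open = {}      # P(host opens door d)
    p_open_win = {}  # P(host opens door d and switching wins)
    for car in (1, 2, 3):
        # Crawling host: lowest-numbered door that is neither chosen nor the car.
        opened = min(d for d in (1, 2, 3) if d != chosen and d != car)
        p_open[opened] = p_open.get(opened, 0) + Fraction(1, 3)
        if car != chosen:
            p_open_win[opened] = p_open_win.get(opened, 0) + Fraction(1, 3)
    conditional = {d: p_open_win.get(d, 0) / p_open[d] for d in p_open}
    overall = sum(p_open_win.values())
    return p_open, conditional, overall

p_open, conditional, overall = monty_crawl(chosen=2)
print("P(door opened):", p_open)
print("P(switch wins | door opened):", conditional)
print("overall P(switch wins):", overall)
```

With the player on door 2, door 3 is opened with probability 1/3 (switching then wins for sure) and door 1 with probability 2/3 (switching then wins half the time), so the two bulleted cases are indeed not equally likely.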

### Monty Fall

This is in fact a different problem, not a subset of MHP (as defined by you), although this may not be immediately apparent to some. What's really stated in the Fall version is that a priori the host can open any door, including the one selected by the player or one that contains the car. The ending for the game is unspecified in these two cases, so the only meaningful question is to ask whether switching is advantageous over the subset of cases where the game advances to the point that the switching decision is possible/required.

Clearly, in the cases where the MFP game advances to the switching decision point, the probability of winning is unaffected by switching (50/50). However, this does not convince (as R implies) that Monty being "equally likely to open either non-selected door" (emphasis mine) is necessary in some strongly ("callously") worded sense, for a nebulously stated purpose (presumably that the overall probability of winning by switching is greater than half). What the Fall problem illustrates is that Monty opening a non-player-selected and goat-containing door is the source of additional information which makes the switching decision advantageous in the MHP. (Surely if Monty's choice is non-random but still subject to non-player & goat-contain restrictions, then switching is even more advantageous for the player, as proven by you and exemplified by MCP above.)
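The 50/50 figure for Monty Fall can be checked the same way. The sketch below assumes the reading given above: the car is placed uniformly, the host opens any of the three doors uniformly at random, and only games where he happens to open a non-selected goat door reach the switching decision. The helper name `monty_fall` is illustrative.

```python
from fractions import Fraction

def monty_fall(chosen=1):
    """Return (P(game reaches the switch decision), P(switch wins | it does))."""
    reach = Fraction(0)
    reach_and_win = Fraction(0)
    for car in (1, 2, 3):
        for opened in (1, 2, 3):
            p = Fraction(1, 9)  # uniform car times uniform accidental door (assumption)
            if opened == chosen or opened == car:
                continue  # ending unspecified: player's door or the car was exposed
            reach += p
            if car != chosen:  # switching wins when the first pick was wrong
                reach_and_win += p
    return reach, reach_and_win / reach

reach, p_switch = monty_fall()
print("P(decision point reached):", reach)
print("P(switch wins | decision point reached):", p_switch)
```

Under these assumptions only 4/9 of games reach the decision point, and among those the car is behind the player's door half the time, which is where the 50/50 comes from.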

### Wikishite

Given Wonpooton and others active in this area, I won't post this WP:OR to the article's talk, but it's sure amusing to see Rosenthal conclude from his incomplete analysis of these [non-]counterexamples that the simple proof is "shaky". But, hey, he's a WP:RS, and unlike you (COI, nah, nah) he doesn't argue for his points here, although he has some pseudonymous "true believer"s wiki-PhDs doing it for him, whether he knows it or not, and that's unimpeachable in wikiland. (In this matter, I for one disclaim any qualification stemming from academic titles for myself.) Tijfo098 (talk) 05:45, 12 March 2011 (UTC)

Agreed. But I think Rosenthal was writing in a way which would be understood by mathematicians, but not by plain folk. Mathematicians know that if an allegedly logical argument gives the right answer for one problem but the wrong answer for another problem, then there is something wrong with the logic. Indeed his argument "applies" to the usual (unbiased host or ignorant contestant) MHP, and to Monty Fall, giving the same answer 2/3 in both cases. That means there is something wrong with the argument. I would say that these two examples show that we do need to be interested in the conditional probability, not the unconditional probability. Then you can go ahead and compute the conditional probabilities for the two questions, and now you get two different answers.
Ordinary folk are usually not very impressed by logic. For instance, I know really smart physicists who just can't grasp how you can prove something is true by starting off by assuming it is not true ("reductio ad absurdum"). For a physicist, Nature is the ultimate arbiter. If you get the same answer as nature or experiment provides, then people will buy your argument, even if it is patently illogical. Richard Gill (talk) 15:10, 12 March 2011 (UTC)
Richard, you say, 'Mathematicians know that if an allegedly logical argument gives the right answer for one problem but the wrong answer for another problem, then there is something wrong with the logic', but I think that this is too strong. All arguments and calculations have a sphere of validity and generally the wider this sphere the better, but that does not make an argument that covers only a limited range of cases wrong. For example the cosine rule applies to all triangles but Pythagoras' theorem applies only to the special case of right-angled triangles; that does not make it wrong, just more limited in its uses. Martin Hogbin (talk) 16:03, 12 March 2011 (UTC)
I am talking about the argument, not about the result. If someone uses Pythagoras to compute the length of the side of a triangle, but doesn't give an argument for the triangle being right-angled, his argument is wrong. Rosenthal sees it this way (sorry Glkanter for my e.s.p.). Computing the marginal probability, instead of the conditional, coincidentally works, because of independence; just like Pythagoras works if coincidentally the triangle is right-angled. Rosenthal believes the logical argument is incorrect because he believes that you *must* find the conditional probability or alternatively, show why independence is true, so that the conditioning is irrelevant. (Pity, just like Nijdam, he doesn't say why he has this opinion). Richard Gill (talk) 16:09, 12 March 2011 (UTC)
It is the 'coincidentally' that I disagree with. It works because of an obvious and intuitive symmetry (for the symmetrical formulation or understanding of the problem). I would accept that the simple solutions (without a symmetry argument) are less rigorous than the same solutions with a symmetry argument but this is only a matter of degree. As usual there is not much to argue about, there is a whole spectrum of mathematical proof ranging from 'plausibility and hand-waving arguments', through 'shaky', through 'acceptable' to 'completely rigorous' and maybe we disagree a little as to exactly where on that line the simple solutions lie. I would add that the 'conditional' solutions given in the article are only marginally better than the simple ones in that they fail to address the possible door that the player might have initially chosen. Martin Hogbin (talk) 16:47, 12 March 2011 (UTC)
"Coincidentally" is a bit of a negative value-judgement. I agree that it has unfortunate connotations. From the point of view of mathematical logic and from the point of view that you must compute the conditional probability given everything that is relevant, one could charitably say that the argument is incomplete. My own opinion is that there are plenty of other good reasons to find the simple solutions perfectly fine. For instance, they make less assumptions! So they are more widely valid. And I agree that the extra mileage to an ordinary person in "completing" the proof is not worth the nuisance. After all, checking that the conditional probability is also 2/3 (or more precisely, at least 1/2) is mathematically equivalent to proving that the marginal success probability of 2/3 cannot be improved. But who in their right mind could imagine that one can do better still than 2/3?
I am still looking for a very short and mathematically completely rigorous argument why 2/3 (unconditional) cannot be beaten. Once we have that the simple solution together with the logic of why 2/3 can't be beaten when the car is hidden uniform randomly (or our knowledge of its location is equivalent to that) is a complete solution. Once I have discovered that I will publish it, and then wikipedia editors of the monty hall page can cite it. Or not, just as they like. Richard Gill (talk) 10:20, 13 March 2011 (UTC)
Good luck, I will be happy to add any new rigorous and convincing solutions to the MHP article. Martin Hogbin (talk) 10:29, 13 March 2011 (UTC)
Thanks! I plan to continue in the discussions on the talk page of MHP, where I will certainly feel free to draw attention to any new developments by myself or others which I think might be useful. Richard Gill (talk) 10:34, 13 March 2011 (UTC)
@Martin: We Did It! More precisely: Lambiam did it, and/but I think the game theorists already knew this. Richard Gill (talk) 16:21, 16 March 2011 (UTC)
I don't want to start an argument but does it not need to be published somewhere before it can be added? Martin Hogbin (talk) 18:36, 16 March 2011 (UTC)
No problem. If I write out the argument on my university home page you have your reliable source, if you want to use it. Richard Gill (talk) 15:38, 17 March 2011 (UTC)
Your wish is my command. See my homepage in Leiden, or if you prefer Appendix 3 of [7] (version of 17 March). Richard Gill (talk) 17:37, 17 March 2011 (UTC)
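The claim that 2/3 cannot be beaten can at least be sanity-checked numerically. The sketch below is not the argument by Lambiam or the one in Appendix 3 of [7]; it is only a brute-force check over a grid of host biases and the four deterministic player rules, assuming the car is hidden uniformly and the player starts on door 1. The parameter `q` (probability that the host opens door 3 when the car is behind door 1) and the function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def win_probability(q, action_if_2, action_if_3):
    """Win chance for a player on door 1 who acts on which door the host opened."""
    third = Fraction(1, 3)
    # (car, opened, probability) for every way the game can unfold
    outcomes = [
        (1, 3, third * q),        # host has a choice and opens door 3
        (1, 2, third * (1 - q)),  # host has a choice and opens door 2
        (2, 3, third),            # forced move
        (3, 2, third),            # forced move
    ]
    action = {2: action_if_2, 3: action_if_3}
    total = Fraction(0)
    for car, opened, p in outcomes:
        # Door numbers sum to 6, so the remaining closed door is 6 - 1 - opened.
        final = 1 if action[opened] == "stay" else 6 - 1 - opened
        if final == car:
            total += p
    return total

def best_achievable(q):
    """Maximum over the four deterministic decision rules for a given host bias."""
    rules = product(("stay", "switch"), repeat=2)
    return max(win_probability(q, a2, a3) for a2, a3 in rules)

grid = [Fraction(k, 10) for k in range(11)]
print({str(q): str(best_achievable(q)) for q in grid})
```

Because the win probability is linear in any randomisation over these four rules, checking the deterministic ones is enough for a given `q`; the grid over `q` is of course only a spot check, not a proof.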

## "Unconditional" might be a poor choice of language?

I was trying to figure out why some editors are so miffed by some of the solutions (more of a challenge than the problem itself), and I think the issue is with the language used in some publications, regrettably, including in Gill 2011. The troubling word appears to be "unconditional" used to refer to the probability from Proposition 1 in Gill 2011 (but you don't seem to be the one to have invented the terminology for this problem). This probability is only independent of Monty's (the host/quizzer's) strategy, as long as Monty is not allowed to cheat (by [re]moving the car as in Three-card Monte or by forcing an unspecified ending as in MFP by opening the car or player's door). It obviously depends on the initial probability of the player/quizee guessing and the assumption that a non-player goat door is opened by Monty. To make this even more obvious: If the player is forced to use a (probabilistic) oracle to place the initial bet with probability greater than 1/2 on the winning door, then switching is not advantageous for him under MHP with an optimal/randomizing host. A slightly more insightful trick is to rephrase it as an optimization problem: you're allowed to choose both pd the initial probability of picking the car, and sp the player's choice for staying or switching, what are optimal values for (pd, sp) that maximize the chance of winning the game? Clearly (1, stay) wins with probability 1, and so does (0, switch) because Monty has to make a forced move revealing the 2nd goat. Tijfo098 (talk) 11:17, 12 March 2011 (UTC)
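The optimization view can be written out in a few lines. This is a minimal sketch, assuming the host always opens a non-selected goat door so that switching wins exactly when the first pick missed the car; `pd` follows the notation above, while the function names are illustrative.

```python
from fractions import Fraction

def game_win_probability(pd, switch):
    """Chance of winning when the first pick hits the car with probability pd."""
    pd = Fraction(pd)
    # Staying wins iff the first pick was right; switching wins iff it was wrong.
    return 1 - pd if switch else pd

def best_policy(pd):
    """Return ('switch' or 'stay', win probability) maximising the win chance."""
    stay = game_win_probability(pd, switch=False)
    swap = game_win_probability(pd, switch=True)
    return ("switch", swap) if swap > stay else ("stay", stay)

for pd in (Fraction(0), Fraction(1, 3), Fraction(1, 2), Fraction(3, 4), Fraction(1)):
    print(pd, best_policy(pd))
```

The two extreme points (1, stay) and (0, switch) both win with certainty, the ordinary game sits at pd = 1/3, and switching stops being advantageous once pd exceeds 1/2.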

The other issue that I suspect is at play is the "exact probability calculation" mentality from probabilities 101 courses/textbooks. To compute that in the MHP, you have to know the host's strategy. But it's of course not needed to answer the decision problem, finding the bounds for the probability of winning by switching suffices. I suppose my prior exposure to randomized algorithms (see BPP) predisposed me for thinking about it in terms of bounds for the probability. Tijfo098 (talk) 08:42, 12 March 2011 (UTC)
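The "bounds instead of exact value" point can be made concrete with the standard biased-host calculation. This is a sketch under the usual assumptions (car uniform, player on door 1, host opens door 3); `q`, the probability that the host opens door 3 when the car is behind door 1, is an illustrative parameter name.

```python
from fractions import Fraction

def switch_win_given_door3(q):
    """P(car behind door 2 | player chose door 1, host opened door 3)."""
    q = Fraction(q)
    third = Fraction(1, 3)
    p_car2_and_open3 = third      # car behind 2: the host is forced to open 3
    p_car1_and_open3 = third * q  # car behind 1: the host opens 3 with probability q
    # Car behind 3 contributes nothing: the host never reveals the car.
    return p_car2_and_open3 / (p_car2_and_open3 + p_car1_and_open3)

values = [switch_win_given_door3(Fraction(k, 10)) for k in range(11)]
print(min(values), max(values))
```

The value is 1/(1+q), so it stays between 1/2 and 1 whatever the host's bias; knowing only that lower bound of 1/2 is already enough to answer the decision question in favour of switching.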

Thanks for your insightful comments. The adjectives "conditional", "unconditional" belong to the world of introductory courses on probability and statistics. Where people naturally associate them with conditional and unconditional probabilities, according to the formal definitions of these notions. And actually, one of the reasons why people tend intuitively to think that switching is not beneficial, is because common intuitions of conditional probability are not that good. I agree that MHP can also be very fruitfully viewed from the points of view of optimization, game theory, mathematical economics.
The words also played a big role in the quarrel between wikipedia editors, and go back to the articles (in the probability and statistics literature) criticizing the "unconditional (probability)" solution of Marilyn Vos Savant and other popularizers of MHP. People who are not at home in this field (including some of the editors) find the words confusing. Richard Gill (talk) 15:01, 12 March 2011 (UTC)
Ok, I think I understand this issue now. First, this wiki-solution (which is found in various books) is horribly obfuscating because all probabilities there are conditional... However, in more clear sources [8], still using the canonical sample space {1,2,3}, part of the MHP problem assumption (host picks goat, host doesn't pick player's door) is an event, say A, so the question is to calculate the conditional probability of winning by switching, knowing that A happened, ergo the probability asked for is conditional on A, even though A happens with probability 1. However, I think there are alternative encodings, which account for symmetry, that don't have an explicit event encoding part of the problem. For example, the one in my sandbox. It seems unnatural to call the final probability there conditional. (Its probability space is constructible as a multi-stage experiment, I think; the resulting sample space is bool x bool). Am I making any sense? Tijfo098 (talk) 11:39, 13 March 2011 (UTC)
I think you are making sense but I don't have enough time to respond in detail just at the moment.
What do you think of the approach of [9]? And: the same, but with a technical appendix for all those Statistics 101 students out there: [10]. Richard Gill (talk) 13:07, 13 March 2011 (UTC)
PS: this wiki-solution is obfuscating, not because it is using conditional probabilities, but because it is using the dumbest possible way to get the answer within that approach. An approach which might be good for a computer which only knows the axioms of probability calculus, but none of the theorems. But which gives absolutely zero insight - instead it just mechanically goes through lines of formula manipulations. Every time I or any one of the people trying to move the article out of its bogged down, extremist state, propose to delete these lines, we get howled down. Even when we say that all this stuff can also be found on the wikipedia page on Bayes's theorem. There are at least three beautiful and insightful (and each one giving a different insight into MHP) routes to getting the conditional probability answer. They are so obvious and beautiful that they are never given in elementary statistics textbooks, where the teacher is busy trying to teach the students formula manipulation according to the formal rules. The teacher is not interested in MHP. It is just used as a vehicle to show how blindly following the rules does produce the right results. As of course it must do, since the probability calculus is consistent with the rest of mathematics and based on logic and common sense.
Woonpton got very angry with me when I made this point. I think she has never heard of Laplace. Yet Laplace (1814) wrote down the calculus rules which she uses in her daily work. And Laplace did that to distill common sense into as small a collection of logical rules as possible. Richard Gill (talk) 13:29, 13 March 2011 (UTC)

## Subjectivist interpretation

By the way, the same source [11] has probably the most clear discussion of subjectivist vs. frequentist debate on MHP. A frequentist assumes that A (from the previous section) happens with probability 1 (repeatable experiment), but a subjectivist does not. I raised this latter point on the article's talk. I suspect this is actually draped in the jargon "conditional problem" and "unconditional problem"; is this some sort of standard MHP or just Wikipedia-MHP jargon? Do various editors even mean the same thing by these terms? My impression is that they don't. The few sources I've looked at don't use them. Tijfo098 (talk) 12:55, 13 March 2011 (UTC)

Frequentally I have said that you can't understand different sources' approach to MHP without realising that they can be using different concepts of probability. Most sources make good sense in just one of the two main ways (frequentist or subjectivist); and a few make no sense at all. Martin Hogbin memorably said "anyone who thinks seriously about MHP is forced to ponder on the meaning of probability", or something like that. I agree. Rosenhouse has a whole chapter on the topic. Whenever I brought this up on the MHP talk pages, I was howled down by the wikilawyers. Hence I wrote some sections on this aspect of the whole MHP story in my Statistica Neerlandica paper.
I also believe that the very strong opinions of some of the wikipedia editors correspond to their instinctive choice of just one of the two main schools of thought concerning what is probability. Whenever I tell them that I think I understand their point of view, and I think they are subjectivist or frequentist at heart, they get angry with me for "false" claims to be able to read their minds. Similarly when I try to explain various reliable sources' writings by figuring out what they really think about probability. References to the wikipedia pages on exactly this issue fail to impress. The problem with MHP is that everyone "knows" it can be completely solved by common sense, yet they don't realise that not everyone shares the same common sense.
The book you refer to looks good. I shall take a better look at it in the near future, thanks. Richard Gill (talk) 13:22, 13 March 2011 (UTC)

## Has That 'Frequentist' Contestant Had The K & W Premises Explained To Him?

This http://statprob.com/encyclopedia/MontyHallProblem2.html site explained those terms you use very well. I'll still avoid using them, though.

In a nutshell, unless the K & W (Selvin) premises (maybe excluding the host bias premise) are provided to the contestant before his door selection, it's understood the contestant won't know if he's facing K & W Monty (2/3 & 1/3) or Monty Fall (1/2 & 1/2, random), or biased host (≥ .5) or deceitful Monty (1 & 0, only offers when it loses), or just what. So that contestant can't conclude the 2/3 & 1/3 outcomes.

So, for the '2/3 & 1/3 vs 1/2 & 1/2' paradox to exist for a puzzle which begins, "Suppose you're on a game show...", the contestant must have been told the K & W premises before he selects a door. And he figures out the 2/3 & 1/3 likelihoods.

And if he were to play again, those same K & W premises are still in force.

So, just like one ignores the results of the fair throw of a fair die, and always, and correctly, relies on each of the 6 numbers being equally likely, no matter what numbers have come up, the MHP contestant will ignore the result of multiple plays of the game. Because he knows the K & W rules are still in force, he pays no attention to the outcomes of this infinitesimal sample he's seen. Just like every player in a casino does, or should do. Glkanter (talk) 14:43, 13 March 2011 (UTC)

Let's agree that we are not in the deceitful Monty case. As Vos Savant explained, she meant you to understand that the host always will open a different door and reveal a goat, and that he can do this because he does know where the car is hidden.
From now on, if we use subjectivist probability, I am with you all the way. There is just one game. You know what we have been told. No more, no less. Therefore, a priori, for you, it is equally likely that the car is behind any of the three doors. Therefore, a priori, for you, if you happened to choose the door hiding the car, Monty would equally likely open either of the other doors. NOT because K&W rules are in force. It's the other way around. K&W "rules" are the logical expression of your lack of any other information. See Laplace (1814) - a very great mathematician, a very great scientist, and the first scientist to fully work out a rigorous and logical subjectivist probability calculus.
However it's clear to me that Morgan et al. were frequentists at heart, not subjectivists. That fits well in their time and place and academic context. Most people with their kind of job at that kind of place in that kind of time were frequentists. Read about Probability interpretations. Richard Gill (talk) 14:57, 13 March 2011 (UTC)
PS, a frequentist cannot solve MHP unless you give him the opportunity to choose his own door by his own randomization. And he still cannot compute the conditional probability, but he doesn't want to or need to. Alternatively, he must be told in advance (and believe this 100% to be true) that the K&W rules apply to the way the car is hidden and the way Monty chooses his door, when he has a choice. Very implausible, I'd say. Of very restricted academic interest only. Richard Gill (talk) 14:59, 13 March 2011 (UTC)

I'm a 'casino-ist'. And so is every 'frequentist' or 'subjectivist' contestant on that game show. There's nothing 'implausible' about it, and frankly, I don't give a darn. Was there anything incorrect in my explanations? Or were you just throwing irrelevant concepts at me, again?

And don't lecture me on Laplace. It was my arguments that made you go and find him in the first place. I'm beginning to think you do this crap on purpose. Glkanter (talk) 15:02, 13 March 2011 (UTC)

If you don't give a darn you shouldn't be active on wikipedia. If you're paranoid you'll have a hard time making friends there. Please don't use foul language on my talk page. Try reading and thinking and thinking again. Remember the WP principle of "good faith". Or quit. Richard Gill (talk) 15:11, 13 March 2011 (UTC)

Was there something wrong with my explanation? Or were you just showing off, as usual, probably trying to save face? Glkanter (talk) 15:15, 13 March 2011 (UTC)

Yes, I think there is something wrong with your explanations. And no, I wasn't showing off. I never do that, I just try to explain, but I am hampered by only knowing long words and only being able to write long sentences.
As I said, I don't believe that the player has been told how the show chooses the location of the car and how the host chooses a door to open when he has to. I believe the player does know in advance that the host can open a door in advance and will do so. In other words, I assume what Vos Savant has told us, no more, no less. No K&W BS. Just MVS and logical deduction using subjectivist probability, as it is called (and has been called for about 200 years. Read about it on wikipedia). Richard Gill (talk) 15:18, 13 March 2011 (UTC)

Right. You're making an issue about something irrelevant to the points I'm making. As usual. I may even have stated the same things previously, but it's of no import to this discussion. Glkanter (talk) 15:23, 13 March 2011 (UTC)

Selvin provided all the same premises as K & W. I guess, like Morgan (until 2010, that is), you've decided Selvin is inconsequential. Glkanter (talk) 15:26, 13 March 2011 (UTC)

And the random car placement and the random host bias come from "Suppose you're on a game show..." Whether stated explicitly or not. Glkanter (talk) 15:31, 13 March 2011 (UTC)

And regardless of what they include, every contestant is given the exact same instructions, regardless of how that contestant defines 'probability'. You're just spouting that stuff out instinctively, whether it's germane to the problem statement, or more importantly, the Wikipedia reader, or not. Going off about 'contestant randomization'? For the article? Nonsense. You're no better than the ones who insist 'it must be conditional'. No thought given, just catechism. Glkanter (talk) 16:06, 13 March 2011 (UTC)

Marilyn Vos Savant made MHP famous and her words are quoted by everyone. It's no longer Selvin's problem. I agree that everyone who instinctively approaches Vos Savant's problem will take all doors equally likely to hide the car and the host equally likely to open either door. They are doing that, using probability to express their own state of knowledge. They do *not* assume that the car is hidden using a real and unbiased random generator. Nor that the host makes his choice, if he has one, in that way. I don't insist that you or anyone should think about MHP in the same way. I have given MHP a great deal of thought, but I don't ask you or anyone else to think about it in the way that I like best. But you, dear friend Glkanter, seem to have a problem with accepting that anyone could think differently from you. And as you proudly asserted, you never ever changed your mind and never ever learnt anything of any use from all these discussions. We are talking about MHP on my talk page. I am not talking about how the article ought to be. Richard Gill (talk) 18:41, 13 March 2011 (UTC)
PS There's Glkanter's Great Unwashed solution to MHP, and there's Laplace's solution. They're the same. That's my point. Glkanter would have better luck convincing his opponents on the MHP page of his point of view, if he would invest a little time reading up on the basic sources. And the point is, it is not difficult, as Rick Block said, it's not rocket science. It's just a language which takes a little while to get used to. But it's important since about 5% of the wikipedia readers of the page on MHP are students of statistics and probability and math courses given by folk such as Nijdam and Kmhkmh and me and Boris Tsirelson. And unfortunately, most of the "academic" sources are texts written for Statistics 101 teachers, some of whom are not as gifted as others, and most of whom have no interest whatsoever in Monty Hall, he's just a vain attempt by them to gain their students' attention in order to ram some Bayes theorem formalism down their throats. Richard Gill (talk) 18:51, 13 March 2011 (UTC)

## MHPP (Monty Hall Prize Problem, or: The Holy Grail of MHP studies)

Suppose the car is hidden uniformly at random. The contestant chooses Door 1. Monty Hall, for reasons best known to himself, opens Door 3 revealing a goat. We know that whatever probability mechanism is used by Monty for this purpose, the conditional probability that switching will give the car is at least 1/2. We know that the unconditional probability (ie not conditioning on the door chosen by the contestant, nor the door opened by Monty) is 2/3.

Always switching gives the car with unconditional probability 2/3, always staying gives it with probability 1/3. Nobody in their right mind could imagine that there could exist some mixed strategy (sometimes staying, sometimes switching, perhaps with the help of some randomization device, and all depending on which doors were chosen and opened) which would give you a better overall (ie unconditional) chance than 2/3 of getting the car.

This is true, of course. In fact, from the law of total sausage meat, proving the optimality of (unconditional) 2/3 by always switching is equivalent to proving that all the six conditional probabilities of winning by switching, given door chosen and door opened, are at least 1/2. We can prove the latter using the Bayes' sausage machine, or, better I think, using Bayes' rule in a smart way. However both these proofs require some sophistication. Richard Gill (talk) 09:05, 15 March 2011 (UTC)

### The Holy Grail of MHP studies

Tell me a short, intuitive, but completely and easily formalisable proof, that 2/3 overall success chance can't be beaten. It has to be a proof which you can explain to your wise grandmother, i.e., it uses only common language and plain logical thinking. The same proof must be acceptable to a pedantic mathematics teacher. Prize: a bottle of good wine, and undying fame in the Annals of Monty Hall Studies.

The only mathematical assumption you can use is the complete randomness of the location of the car. Player's choice, host's choice are not necessarily random, let alone uniformly random. Richard Gill (talk) 09:05, 15 March 2011 (UTC)

I hope you agree we need not consider random strategies; after all, people who might think of that possibility will surely realize that no random strategy can beat the optimal deterministic one. Then here we go, not as short as one might wish, but hopefully intuitive, and without scary probability formulas:
When the contestant has to make the decision whether to switch or to stay, the information they have about the state consists solely of two items: DIC, which door they have initially chosen, and DHO, which of the remaining doors the host has opened. So any (deterministic) strategy can be given as a strategy look-up table: a mapping from pairs (DIC, DHO) to the decision set {SWITCH, STAY}. Now imagine a partially prescient contestant, not sufficiently prescient to know where the car is hidden, but just prescient enough to know: if I choose this door, then the host will react by opening that door. In other words, given DIC, they already know DHO in advance. Clearly, this uncanny prescience will not diminish their chances provided they follow the optimal strategy for non-prescient contestants. They will then do just as well. But for such prescient contestants, the strategy lookup-table can be simplified: since DHO is functionally dependent on DIC, we can condense it to a mapping from DIC values to the decision set. But then we know they can't do better than 2/3, which is obtained if the table tells them in all cases to SWITCH; given the random placement of the car, any STAY reduces their chance of winning.
(My grandmother might object, though, that there is no such thing as prescience.) Now find where this already has been published.  --Lambiam 17:30, 15 March 2011 (UTC)
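For comparison, the "usual enumeration of strategies" that the prescience argument is meant to bypass can be checked by brute force. The following is only a minimal sketch under stated assumptions: the car is placed uniformly at random, the contestant's initial door is treated as part of the strategy, and the host's bias is modelled by a hypothetical parameter `q[d]` (the chance he opens the lower-numbered remaining door when the contestant picked door `d` and it hides the car). The function names are illustrative and do not come from the discussion.

```python
from fractions import Fraction
from itertools import product

DOORS = (1, 2, 3)
THIRD = Fraction(1, 3)  # car placed uniformly at random


def host_distribution(chosen, car, q):
    """Return {door_opened: probability} for the host's move.

    q[chosen] is the assumed probability that the host opens the
    lower-numbered remaining door when he has a free choice.
    """
    others = [d for d in DOORS if d != chosen]
    if car == chosen:
        return {others[0]: q[chosen], others[1]: 1 - q[chosen]}
    # Forced move: open the remaining door that hides a goat.
    forced = next(d for d in others if d != car)
    return {forced: Fraction(1)}


def win_probability(chosen, table, q):
    """Overall chance of the car for one lookup table (DHO -> decision)."""
    total = Fraction(0)
    for car in DOORS:
        for opened, p in host_distribution(chosen, car, q).items():
            if table[opened] == "STAY":
                final = chosen
            else:
                final = next(d for d in DOORS if d not in (chosen, opened))
            if final == car:
                total += THIRD * p
    return total


def best_overall(q):
    """Best win chance over all deterministic (DIC, DHO) lookup tables."""
    best = Fraction(0)
    for chosen in DOORS:
        others = [d for d in DOORS if d != chosen]
        for decisions in product(("SWITCH", "STAY"), repeat=2):
            table = dict(zip(others, decisions))
            best = max(best, win_probability(chosen, table, q))
    return best
```

Calling `best_overall` with any bias values between 0 and 1 should return 2/3 in this model: a table that stays on one opened door and switches on the other wins with chance (1 + q)/3 or (2 - q)/3, never more than 2/3.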
Lambiam, you presented the "combined doors solution"?  –  Showing the "overall" probability to win by switching is 2/3? This value will always be valid, for any single game, as long as you do not get "additional information" on the actual location of the car, for the actual game. But even given that you get the "closest possible" additional information for any game, probability to win by switching will REMAIN within the range of at least 1/2 (but never less!) in 2/3 of games to max. "1" in 1/3 of games (the two still closed doors suddenly are shown to be made of opaque glass, then). So this just characterizes the actual location of the car a little bit closer, anyway confirming that the average probability to win by switching will remain 2/3, forever.
I'm citing Carlton 2005:
Imagine you plan to play Let’s Make a Deal and employ the “switching strategy.”
As long as you initially pick a goat prize, you can’t lose: Monty Hall must reveal the location of the other goat, and you switch to the remaining door - the car.
In fact, the only way you can lose is if you guessed the car’s location correctly in the first place and then switched away.
Hence, whether the strategy works just depends on whether you initially picked a goat (2 chances out of 3) or the car (1 chance out of 3).
(Matthew A. Carlton, Cal Poly State University, San Luis Obispo, Journal of Statistics Education Volume 13, Number 2 (2005))
What I said regarding "additional info" on the actual location of the car applies here, also. On the long run you can never beat the "overall success rate" of 2/3 (the average of "always switching"). No way to improve it.  Gerhardvalentin (talk) 18:14, 15 March 2011 (UTC)
What I intended the argument to show (and believe it does show) is that no strategy (among all strategies, thus also including strategies that may have the contestant STAY with their initial choice, depending on which door they initially chose and which door was opened) has an overall outcome that is better than 2/3. "No strategy exists such that the overall payoff value exceeds 2/3" is a different statement than "There is a strategy such that the overall payoff value equals 2/3", which is what you appear to be showing here. (Although you write 'On the long run you can never beat the "overall success rate" of 2/3', I don't see how this is a consequence of the argument preceding this claim.) Both are well-known results; the challenge was to give a "short, intuitive" proof of the upper bound (and therefore optimality of the always-SWITCH strategy) instead of the usual enumeration of strategies, not providing insight and boring even when you exploit the symmetries to the fullest. As I have written earlier: "The proof of 2/3 is as before, but the proof that this is optimal requires, as far as I can see now, an elaborate case analysis (but maybe I'm wrong and there is an easy proof for this too)." (Archived here.) The prescience trick allows you to bypass that case analysis.  --Lambiam 00:40, 16 March 2011 (UTC)
So did I win the Monty Hall Prize? Is the bottle of good wine hidden behind one of three doors?  --Lambiam 11:53, 16 March 2011 (UTC)
Could be, @Lambiam! This sounds good. I have to go home and talk with my inner Grandma... BTW you're right that we can neglect randomized decisions on the part of the player. They can only produce win-chances intermediate between what can be done with deterministic decisions. You certainly have brought up a new smart idea into the game. You're saying: suppose Monty Hall tossed his three coins in advance, and revealed the three answers which he would give in each of the three situations which might arise. That certainly is extra information to the player, so he can't do worse by using this information, as long as he uses it optimally. Now you're saying, with that extra information it is obvious that even with the best you can do, you still run a 1/3 chance to miss the car.
QED.
Certainly does smell a bit to me that you have opened the right door... Richard Gill (talk) 13:38, 16 March 2011 (UTC)
I now have communed with Grandma (in the spirit world). Yes @Lambiam, you did it! Brilliant! We reduce in two steps to the case that both parties use deterministic strategies. In other words, we convert Monty Hall into Monty Crawl! Now we see that we can't do better than 2/3 in general, since even with the extra information, we can't do better than 2/3. Therefore all the conditional probabilities must be at least 1/2 (law of total probability) (since otherwise we would have been able to do even better). We don't need The Right Reverend Thomas Bayes' sausage machine, even in the situation with possible host bias!!! The reason in a nut-shell is what your proof says: already, most favourable to us (the contestant) would be a 100% biased host with 100% known bias. But even in this most favourable situation we can't beat 2/3. So we can't do it at all. So it's a waste of time to worry about conditional probabilities, they don't tell you anything you don't already know: switch! Richard Gill (talk) 13:45, 16 March 2011 (UTC)
In the argument as I presented it the prescient contestant doesn't actually use the extra information; they just blindly follow the optimal strategy for non-prescient contestants by consulting the optimal-strategy lookup table. Then the predetermination of the host's move shows that they might equally well have used a condensed lookup table, as described. (Of course, the argument also shows that attempts to actually use the advance knowledge will do them no good, just as it incidentally shows that 2/3 is not only an upper bound, but also a sharp one.)
The idea of downloading the prescience into three advance coin tosses, as you described above, made me think of a slightly different presentation of the argument: the outcomes might also be revealed to the print-on-demand publisher of the optimal-strategy lookup table, instead of to the contestant, who remains in the dark for now; the publisher could then use the advance knowledge obtained to likewise condense the table and thereby save on the cost of paper – and, seeing that the optimal strategy is constant across initial choices, reduce the lookup key to a null tuple and just print the single word SWITCH.
I think that this very argument, in one disguise or another, underlies many people's strong intuition that going conditional can't make a difference. I hope I won't be blocked or banned now for putting OR on this page.  --Lambiam 14:49, 16 March 2011 (UTC)
I think I may have seen this proof in the game theoretic literature... or something like it...but didn't see the wood for the trees. I saw the formalized version but did not pick up the idea. What we (you, Lambiam) have done here is we have applied transparent logical manipulations to a simple little problem, reducing a more complex situation to a more simple one. The logic is at the level of: A is true, and A implies B is true, so B must be true. Doesn't matter if it is O.R. or not. We can talk about it on talk pages because we want to improve the presentation of the MHP page, so it helps to gain yet more insight into MHP. If someone has done it before we will find the reference and can use it. If no-one seems to have done it before I write it up on my talk-page or in a new version of the statprob.com peer reviewed encyclopedia article on MHP. (Or you Lambiam can do something like that if you are in a position to do so). And then anyone can refer to it who wants to. Maybe nobody likes it. Fine.
Actually, it has now escaped into the wild: "If you do not want your writing to be edited, used, and redistributed at will, then do not submit it here." Richard Gill (talk) 15:17, 16 March 2011 (UTC)
Oh, how I wish I spoke the secret language of the High Priests. Then I too, and many others, could have attained the Holy Grail, countless months ago. Glkanter (talk) 14:27, 16 March 2011 (UTC)
I am eternally grateful to you, Glkanter, for stimulating this discovery! You may consider yourself the Godfather of all good things to have come out of this silly quarrelling. Richard Gill (talk) 14:31, 16 March 2011 (UTC)

Richard, am I right in guessing that you asked for the "proof" that the absolute "minimum chance" for every single game could be 1/2 only, but never less, preventing the overall chance of 2/3 forever to be beaten? Gerhardvalentin (talk) 18:27, 15 March 2011 (UTC)

I was looking for a different proof of that, or even an argument to make the proof a waste of time.
Let's talk about just one game. Let's be frequentist: probability is in the real world, in the physical mechanisms by which choices are made. Suppose the car is hidden by a completely fair and (for the contestant) secret randomization. The player chooses a door. The host will either be forced to open a particular door, or he will have a choice. If he has a choice I suppose he uses a biased coin. His biased coin may depend on the door in question. The player knows which coin the host would use in each of the three cases. For instance, player chooses door 1, host opens door 3. The player doesn't know whether the host's choice was forced, or was the result of tossing a coin. But he does know the probabilities of the two faces of the coin which would be used if it were needed.
From Bayes' rule, odds form, we can compute that the conditional odds that the player's initial door hides a goat, given he chose door 1 and the host opened door 3, are 1:q, where q is the probability that Monty would open door 3 had the player chosen door 1 and the car happened to be behind this door. Everybody knows q. (I should write q(1,3) - it could be a different q if the first choice was 2 and the host opened 1, and so on.. then we would be talking about q(2,1)).
Whatever all these six q's might be, every single case is always either favourable to switching or neutral to switching.
Now someone who knows probability theory can do these calculations and someone who knows probability theory also knows the law of total probability - the overall chance of winning by a given strategy is the weighted average over the six cases, of the chance of winning using that strategy in the case at hand. It follows that to get the best overall chance, you should do what is best in each particular case. We know that you can get overall win chance 2/3 by switching in every single case. You get overall win chance 1/3 by staying in every single case. I have just argued for you that 2/3 can't be improved, since in each case separately we are getting the best available.
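Written out, the odds computation for the case "player chose door 1, host opened door 3" might look as follows. The symbols C (the door hiding the car) and H (the door the host opens) are notation introduced here for illustration only; q is the q(1,3) described above.

```latex
\begin{align*}
\frac{P(C=2 \mid H=3)}{P(C=1 \mid H=3)}
  &= \frac{P(C=2)}{P(C=1)} \cdot \frac{P(H=3 \mid C=2)}{P(H=3 \mid C=1)}
   = \frac{1/3}{1/3} \cdot \frac{1}{q} = \frac{1}{q}, \\
P(C=2 \mid H=3) &= \frac{1}{1+q} \;\ge\; \frac{1}{2}
  \quad \text{since } 0 \le q \le 1 .
\end{align*}
```

For q = 0 the odds are infinite and the second line still applies, giving probability 1: the host would never open door 3 by choice, so the car must be behind door 2.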
But I don't think you can explain this proof easily to your wise Grandma. (Or your wise Grandpa).
I'm asking for a short, elegant, completely loophole free, proof of the same thing. I don't want to use conditional probability calculations: please show to me, directly, and totally convincingly, that 2/3 overall chance by always switching can't be (overall) improved. (I know it's true, but I only know a proof -- the one I just told you -- which I can only explain to students of probability). I want to show *directly* that everyone who has the gut feeling that the simple solution is enough - it compares the overall win chance of always switching, with that of always staying - is justified. And I want to do it in a way which also convinces pedantic mathematicians (eg Nijdam).
If we can do this then we have the simple argument that you can get 2/3 by always switching, we would have a simple argument that you can't do better, and then we would not need conditional probability or other advanced mathematical concepts but still have completely solved MHP, in the case that every door initially is equally likely to hide the car. No assumptions about host behaviour (or about our knowledge of host behaviour, if you want to do this with subjectivist probability).
I know a simple argument when we have full symmetry: the frequentist knows that Monty's coins are all three unbiased, the subjectivist knows that he knows absolutely nothing, one way or another. We may as well pretend that the player makes his choice also completely at random. Instinct now says that specific door numbers are completely irrelevant. It's just the relationship between their manifest and their hidden roles. Maths confirms this: it says that the door numbers are independent of (the relationships between) the roles of the doors. Proof: obvious by symmetry. The probability student can convert this into formulas, if they want to. You write out what you want to prove and you apply symmetry to show that it is true. It's an easy exercise. But your Grandma is, rightly, convinced anyway. The conversion from the word "symmetry" to one line of algebra is just a matter of translation from one language (common language) to another language (probability formalism). It doesn't add anything. It doesn't make the result "more" true, or more convincing than it already was.
The grand objective of mathematics is to replace computations by ideas. The art of maths is to make arithmetic superfluous. Replace arithmetic and formula manipulations (which computers are better at doing than people) by ideas, pictures, concepts, and simple logic. That's what people are good at (well - except for the logic bit, I'm afraid). It's how we communicate to one another. This is why the discussion on the arbitration conclusion talk page about O.R. in mathematics is pretty ludicrous, and the mathematicians are rightly worried that the wikilawyers are going to make their work on wikipedia impossible, simply because the wikilawyers have no idea what maths (or more generally science) is all about. Good lawyers are creative and apply the spirit of the law. Successful lawyers are bureaucrats and convince other bureaucratic lawyers by using the letter of the law and false logic. Richard Gill (talk) 09:23, 16 March 2011 (UTC)
The simple answer is to change the script of the problem - i.e. just what is said.
Player: OK I choose door 1
Host: You can keep that door, or you can have both these other doors. I'll even open one of them for you
Player: Ok, that sounds like a good deal... hmmm
Host: Why so pensive?
Player: I kinda want the thrill of revealing the car myself!
Host: OK, I'll open a door that is hiding a goat! <peeps behind prop doors> here we go!
Rich Farmbrough, 15:14, 17 March 2011 (UTC).
This is good, and tells us always switching beats always staying. I often use this story. *Everybody* gets it. Everybody ... except lawyers. At the University of Nijmegen they did an experiment. Posed students MHP. Almost always got the 50-50 answer. Explained why it's 2/3-1/3. Almost everyone becomes convinced you should switch. The exception? The lawyers! Once a lawyer thinks they have got the answer, they are not convinced by any other reasoning you can give, that any other answer is better. Richard Gill (talk) 15:35, 17 March 2011 (UTC)
Maybe that's why the law is such a mess. Rich Farmbrough, 15:43, 17 March 2011 (UTC).
Lawyers don't know anything. They try to do their work by following (and interpreting) rules. Form, not content. Moreover, since they can be easily fooled by false arguments, and since a lawyer only ever has to convince other lawyers (at least, in the Dutch situation, where we do not have a jury), there's no reason for them to learn correct arguments. Richard Gill (talk) 16:32, 17 March 2011 (UTC)

### Why overall 2/3 can't be beaten, in a nutshell

Obviously the player only needs to consider deterministic strategies for himself. Now suppose Monty Hall makes his choice of door to open, when he does it at all, by tossing a possibly biased coin (a possibly different coin for each door). He might just as well toss his three coins in advance and just "look up" the action which is needed, if and when an action is needed. Now suppose the player also gets to see the results of the three coin tosses in advance. He now knows even more, so he cannot do worse (provided he uses all the available information as best as he can).

But now we are effectively in the "Monty crawl" situation. We want to show that for the Monty crawl problem, there still is no strategy with an overall win-chance of more than 2/3. Suppose the coin says that Monty would open door 3, if he had a choice between 2 and 3. Then whether the car is behind door 1 or door 2, Monty is certain to open door 3. His action tells us nothing so we may as well switch. Suppose the coin says that Monty would open door 2, if he had a choice between 2 and 3. Then the fact he opens door 3 shows us that the car must be behind door 2, so we must switch. Either way, we might as well switch. If we switch anyway, our overall win chance is 2/3. So this is the best overall win chance which is available for us for the Monty crawl problem.

Beautiful, beautiful, beautiful. (At least that's what I think, and my Grandma does too).

You can't do better than 2/3 overall because you can't do better than 2/3 in the situation that would be most favourable to you, Monty crawl. And because you can't do better than 2/3 overall, the chance of winning by switching must be at least 1/2 in each separate situation which you can distinguish (reductio ad absurdum).

We are in fact using Bayes in the Bayes' rule form, but only for the situation when the evidence we are given is certain under both hypotheses, and for the situation when it is certain under one, impossible under the other.

We are also using the insight of all game theorists that one can always reduce everything to the extreme (deterministic) case.

We are solving Monty Hall by use of the more simple problem Monty crawl.

Since 2/3 overall is the best you can do, and you can achieve that by always switching, it's a waste of time to look at the specific door numbers and figure out conditional probabilities with Bayes' sausage machine or whatever. Richard Gill (talk) 15:40, 16 March 2011 (UTC)
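The Monty crawl case analysis above can be checked with a few lines of exact arithmetic. This is a minimal sketch, assuming the player picks Door 1, the car is placed uniformly at random, and the host deterministically opens a known `preferred` door (2 or 3) whenever he has a choice; the function name `monty_crawl` is made up for the illustration.

```python
from fractions import Fraction


def monty_crawl(preferred):
    """Player picks Door 1; host opens `preferred` (2 or 3) whenever he may.

    Returns (cond, overall): cond[d] is the chance that switching wins
    given the host opened door d, and overall is the chance that
    always switching wins.
    """
    third = Fraction(1, 3)
    joint = {}  # joint[(opened, car)] = probability
    for car in (1, 2, 3):
        if car == 1:
            opened = preferred   # free choice: he follows his known preference
        else:
            opened = 5 - car     # forced: the goat door among doors 2 and 3
        joint[(opened, car)] = third

    cond = {}
    for opened in (2, 3):
        total = sum(p for (o, c), p in joint.items() if o == opened)
        wins = sum(p for (o, c), p in joint.items() if o == opened and c != 1)
        cond[opened] = wins / total

    overall = sum(p for (o, c), p in joint.items() if c != 1)
    return cond, overall
```

With `preferred=3`, the sketch gives a conditional chance of 1/2 when door 3 is opened (his action tells us nothing) and 1 when door 2 is opened (the car must be behind door 3), which average to 2/3 overall.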

Your conclusion is only true because, while the 2/3 overall is the best and worst overall, the worst conditional scenario is 1/2. This is because the host's choice of doors only has 1 bit of information. If we had a large number of doors the amount of information goes up, and biased choices by the host can reveal more information, for example the scenario where the host chooses door 2 if you have chosen the car, may also imply the car is behind 3 in the 3 door variant. In a four door variant he may choose 2 only if there's a car behind 1, and 3 or 4 otherwise. If you know this, your expectation from not switching (given that he opens 2) is 1. Moreover as the number of doors increases the amount of information that can be passed increases, although even with a cooperating host you can never do better than (n-1)/n overall (host chooses door to the right of the car, if the car is behind door n chooses at random, for example). Rich Farmbrough, 15:08, 17 March 2011 (UTC).
That's right. 2/3 is the best overall win-chance exactly because the worst conditional scenario is 1/2. I was not thinking of the n door variants yet...! Richard Gill (talk) 15:25, 17 March 2011 (UTC)
Yes, even in the 1 million-doors variant (where the host also opens n-2 doors) with the 999'999/1'000'000 overall, the worst conditional scenario is 1/2 and never can be worse: Suppose  you know  that, for some reason or other, the host is determined to leave door no. 777'777 closed, whenever possible, then observing that "special" situation will render that door as likely to hide the prize as door no. 1  (Ruma Falk).  And vice-versa: If  you know  that the host always is determined to open #777'777, whenever possible, but exceptionally leaves it closed, only in that "special" situation the conditional probability will be "1". That's the hoax of the conditional solution: You have to know quite a lot in advance, otherwise it doesn't work. In the 1 million doors variant it works very rarely, only in those "very special" situations. Please note that conditional probability theory does not request that anyone should/could "know" the value of the variable "q". It remains "q". Rather bogus, isn't that? Gerhardvalentin (talk) 17:34, 17 March 2011 (UTC)

### And now in a smaller nutshell too

Here is a short and elementary and complete solution of the MHP, which actually covers the biased host situation just as well as the usual symmetric case. There is no computation of a conditional probability. All we have to do is to consider two kinds of players: a player who in some situations would stay, and a player who in all situations would switch. We show that both kinds of players are going to end up with a goat with probability at least 1/3. In other words, it's not possible to do better than to get the car with probability 2/3. But always switching does give you the car with probability 2/3. Hence always switching achieves the best that you can possibly do.

Suppose all doors are equally likely to hide the car, and you choose Door 1.

If you are planning to stick to Door 1 if offered the choice to switch to Door 2, you'll not get the car if it is behind Door 2. In that case Monty would certainly open Door 3, you'll have the choice between Doors 1 and 2, and you'll keep to Door 1. Chance 1 in 3.

Similarly if you are planning to stick to Door 1 if offered the choice to switch to Door 3, you'll not get the car if it is behind Door 3. Probability 1/3.

If on the other hand you are planning to switch anyway, you'll not get the car if it is behind Door 1. Chance 1 in 3.

Altogether this covers every possible way of playing, and however Monty chooses his door: there's always a chance of at least 1/3 that you'll end up with a goat. This means that there is no way you can do better than getting the car with chance 2/3.

We know that "always switching" guarantees you *exactly* a chance of 2/3 of getting the car. I've just shown you that there is no way this can be improved.

Thanks to Sasha Gnedin for this one. Richard Gill (talk) 13:14, 1 July 2011 (UTC)
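The three cases above can also be checked mechanically, without assigning any probabilities to the host. The sketch below assumes the player has chosen Door 1 and describes a plan by what the player does when offered Door 2 and when offered Door 3; for each plan it lists the car positions where the player loses whatever legal door the host opens. The names `legal_openings`, `final_door` and `sure_losses` are hypothetical helpers, not anything from the sources.

```python
from itertools import product

DOORS = (1, 2, 3)
CHOSEN = 1  # the player's initial pick


def legal_openings(car):
    """Doors the host may open: neither the player's door nor the car."""
    return [d for d in DOORS if d != CHOSEN and d != car]


def final_door(plan, opened):
    """plan[offered] is 'STAY' or 'SWITCH', keyed by the door offered."""
    offered = next(d for d in DOORS if d not in (CHOSEN, opened))
    return CHOSEN if plan[offered] == "STAY" else offered


def sure_losses(plan):
    """Car positions where the plan loses whatever the host does."""
    return [car for car in DOORS
            if all(final_door(plan, o) != car for o in legal_openings(car))]


# Every plan has at least one sure-loss position, each of chance 1/3.
for on2, on3 in product(("STAY", "SWITCH"), repeat=2):
    plan = {2: on2, 3: on3}
    print(plan, "loses for certain if the car is behind", sure_losses(plan))
```

Since each car position has chance 1/3, a non-empty list for every plan is exactly the statement that no way of playing gets the car with chance above 2/3.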

## Street cred

Yo Richard!

I worry that, never having been blocked, I lack "street cred". If you don't get blocked at MHP, perhaps we can stage a "rumble"?

(I think the staged-rumble strategy established the manly bona fides for another Richard, Cunningham of Happy Days!)

Coolly,  Kiefer.Wolfowitz  (Discussion) 11:28, 15 March 2011 (UTC)

Yeah, man! Richard Gill (talk) 13:05, 15 March 2011 (UTC)
Let's do some really cool joint wiki-outlawed-O.R. on wikipedia and get banned together, when all we are doing is writing a really good encyclopaedia article, e.g., about Bayes' sausage machine or about the law of total sausage meat. (I just love that name). Richard Gill (talk) 09:35, 16 March 2011 (UTC)

## The dual nature of MHP

Basically it's both a zero sum extensive-form game and a partially-observable Markov decision process due to its utter simplicity in both strategies and possible moves. Refs:

Now the really interesting math/tcs question is: what is the relationship between EFGs and MDPs in general? They are both subclasses of an ancient (1953) but little studied since bilinear-payoff convex game (CGs -- have arbitrary convex strategy sets instead of explicitly enumerated finite strategy sets, no relation to convex cooperative games). Polyhedral CGs are solvable via linear programming, but convex programming is needed in general. A more computationally efficient superclass of both (EFG, MDP) has been recently discovered: paper thesis--also surveys CGs & well written. Recession must have been really bad if this guy went to Google--post IPO. Tijfo098 (talk) 22:50, 17 March 2011 (UTC)

Those are two fields (Markov decision theory, game theory) in which I'm only an amateur, but I can read the literature and I have good colleagues who are experts and whom I can easily consult. As well as new friends on wikipedia :-) . Are you saying that those two references actually mention MHP as well? If so, it should be mentioned on MHP talk page. (I am trying to fight my wiki-MHPcoholism). Very interesting! Richard Gill (talk) 07:40, 18 March 2011 (UTC)
Yes, the page range I gave is for MHP discussion in each ref. Depending on your luck with google books, you may even be able to read them on-line. The EFG/MDP "equivalence" paper/thesis doesn't discuss Monty--too trivial of a subject at that level I suspect. And thank you for the frank self-evaluation on game theory. I was going to say that your papers are a sub-optimal choice as reference in that respect, but I was afraid to offend you. Tijfo098 (talk) 23:20, 18 March 2011 (UTC)
You'll find it hard to offend me! The fun of being a mathematician and a statistician is learning new stuff all the time. And learning requires criticism. Richard Gill (talk) 06:28, 19 March 2011 (UTC)

I'm not sure you're quite fair to physicists with what you wrote on the MHP decision page. We may be less bothered by niceties that don't seem so relevant to the problem at hand, but it's a bit much to push that to 'if it gets the right answer, who cares about the method'. (Okay, so maybe the old pre-1920s quantum theory falls into that box, and arguably Heisenberg's original working up of matrix mechanics certainly does; but even then people were pretty unhappy about it).

On the other hand there is a right answer, wrong(?) method question that has haunted me ever since I was presented with it as a first year undergraduate, so I'd be interested in your comments/analysis.

It's a one-dimensional random walk problem with sticky ends. If the drunkard takes a step either left or right at random at each tick of the clock, how long does it take for him to get 20 steps away from where he started?

What we were presented with as "the solution" was: "the variance of his distance from the starting point increases by 1 each tick of the clock; a distance of 20 steps corresponds to a variance of 400, so the answer is 400 ticks."

I didn't like that answer then, and I still don't like it now. It's certainly not directly calculating an expected time; it seems to me far from obvious that you can go from calculating an expected squared-distance as a function of time to an expected time as a function of squared distance, never mind an expected first crossing time.

Yet if you do the problem "properly" -- build a transition matrix, establish a recurrence relationship for expected first crossing time, etc -- then 400 is the answer you get.

So it's something that's nagged at me ever since (almost 25 years ago now) -- can the physicist's quick-and-dirty argument be justified by some special aspect of the problem; or, alternatively, can some special features of the problem be identified that explain why in this case (and/or perhaps in wider cases) the physicist's argument happens to work? Can the physicist's answer be logically underpinned? Or is there no logic to explain what features of this particular problem may have made it work?

As I said, I've wondered about that ever since, so I thought why not take this chance to run it past you and see whether you had any thoughts. Jheald (talk) 11:11, 19 March 2011 (UTC)
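The "proper" calculation mentioned above can be sketched in a few lines. This is only an illustrative check, not the method anyone in this thread actually used: the function name `expected_exit_times` and the choice of an exact tridiagonal solve are assumptions made for the example. It solves the recurrence E[x] = 1 + (E[x-1] + E[x+1])/2 with E = 0 at both barriers, using exact fractions.

```python
from fractions import Fraction


def expected_exit_times(barrier):
    """Expected number of steps for a symmetric +/-1 random walk started at x
    to first reach +barrier or -barrier, for every interior start x.

    Solves  -1/2 E[x-1] + E[x] - 1/2 E[x+1] = 1  with E[+/-barrier] = 0
    by the Thomas (tridiagonal) algorithm in exact arithmetic.
    """
    positions = list(range(-barrier + 1, barrier))
    n = len(positions)
    half = Fraction(1, 2)
    diag = [Fraction(1)] * n   # main diagonal
    rhs = [Fraction(1)] * n    # right-hand side

    # Forward elimination (sub- and super-diagonal entries are both -1/2).
    for i in range(1, n):
        factor = -half / diag[i - 1]
        diag[i] -= factor * (-half)
        rhs[i] -= factor * rhs[i - 1]

    # Back substitution.
    sol = [Fraction(0)] * n
    sol[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        sol[i] = (rhs[i] + half * sol[i + 1]) / diag[i]

    return dict(zip(positions, sol))


times = expected_exit_times(20)
print(times[0])  # 400
```

The same solve gives 400 minus the square of the starting position for every interior start, which is the closed form one would guess from the variance argument.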

Moi, a little unfair? Moi? Well, all's fair in love and war, so don't take anything I say too seriously! But seriously, your question is a very good one. I hope to come back to it later this weekend. Richard Gill (talk) 12:17, 19 March 2011 (UTC)
Just recently, as a statistician designing an early stopping time for a randomized clinical trial, I used exactly your rough physicist's method for this same rough calculation, by instinct. There seem to me two possibilities. Either the answer is "exact" or it's only approximate. Possibility 1: the answer is approximately right. Note that the mean value is not the only sensible measure of the "centre" of a probability distribution; actually, it is usually a rather bad measure. The halfway value (the median) and the most likely value (the mode) are two alternatives, and one might prefer either in various circumstances. 20 is big enough that things are getting a bit normally distributed. And the tails of this distribution aren't awfully heavy, so means of approximate distributions and approximate means will be about the same. All this is hand waving to say why I think it would be a reasonable thesis project to figure out if there is something exactly true for Brownian motion or the Wiener process, which then is approximately true for random walk. Now for possibility 2. It's exact in some sense, whether only for Brownian motion, or also for random walk. Well, martingale theory gives us some beautiful identities, some called Wald's identities (and Wald used them for developing sequential quality control methodology), some called Doob's identities, about the expectation value of a martingale stopped by a stopping time. Brownian motion squared, minus time, is a martingale with mean zero. Stopped at the first time the Brownian motion itself equals plus or minus 20, it also has expectation zero. At that moment the squared position is exactly 400, which tells us that the expected time at which the Brownian motion first reaches plus/minus 20 is 400. And actually exactly the same argument works for the symmetric random walk.
So the answer is that there are very good reasons anyway to expect this to be a good rough answer, and actually part of the reason for that is also the reason why it is exactly the right answer, when we are interested in the mean time to first crossing the boundary. The reason for the answer to be exactly right is kind of deep, I would say. So this could be called a lucky coincidence, just like the coincidence which in the symmetric case makes 2/3 also the right answer for the question about the conditional probability in MHP.
Stunningly beautiful simple answers in probability theory are often because of martingale theory. "Cherchez la martingale" is the way to set about solving many problems.
One could say that the Wiener process was actually discovered by Einstein, or by Bachelier - it was born in physics and in finance. Richard Gill (talk) 16:19, 21 March 2011 (UTC)
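For readers who want the martingale argument spelled out for the symmetric random walk, here is a sketch. The symbols ${\displaystyle S_{n}\,}$ (position after ${\displaystyle n\,}$ steps), ${\displaystyle T\,}$ (the first time the walk is 20 steps from its start) and ${\displaystyle M_{n}\,}$ are labels introduced for this illustration, not taken from the discussion above, and the limit step relies on the standard fact that ${\displaystyle T\,}$ is finite with probability one.

${\displaystyle M_{n}=S_{n}^{2}-n,\qquad \operatorname {E} [M_{n+1}\mid S_{1},\ldots ,S_{n}]={\tfrac {1}{2}}(S_{n}+1)^{2}+{\tfrac {1}{2}}(S_{n}-1)^{2}-(n+1)=S_{n}^{2}-n=M_{n}.}$

${\displaystyle \operatorname {E} [S_{T\wedge n}^{2}]=\operatorname {E} [T\wedge n]\quad {\text{for every }}n\ {\text{(optional stopping at the bounded time }}T\wedge n{\text{)}},}$

${\displaystyle \operatorname {E} [T]=\lim _{n\to \infty }\operatorname {E} [T\wedge n]=\lim _{n\to \infty }\operatorname {E} [S_{T\wedge n}^{2}]=\operatorname {E} [S_{T}^{2}]=20^{2}=400.}$

The first limit is monotone convergence; the second is bounded convergence, since the squared position never exceeds 400 up to time ${\displaystyle T\,}$. The quick argument "variance grows by 1 per tick, so set it equal to 400" reads off the same identity without checking that it survives stopping at a random time.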
Thank you very much for this. I had heard of the awed respect in the community for the ability of martingales to produce seemingly stunning results with extraordinary brevity and neatness, but had never taken the time out to find out how.
I see that this very case is even presented as an introductory example at Optional stopping theorem. I must now spend a few hours with a towel wrapped round my head, to translate the proof explicitly into the context of this case and see whether I should have been able to spot it with the tools I had back then.
Thank you again, for finally clearing this up for me, and for finally getting me girded up to look into what it is that is the magic of martingales.
Final question. Martingales are all about things that equal zero. Are there connections between them and the algebra and group theory of symmetry, since a signpost for symmetry can often be something equalling zero (and/or zero change in something which is constant) ? Are there mapped-out connections between the two areas? Jheald (talk) 09:39, 22 March 2011 (UTC)
Um.... I guess if you would start looking at martingales taking values which are not real numbers but points on differentiable manifolds, and then considered differentiable manifolds with symmetries, which I think brings us to Lie groups, ... well then you could start to see applications of martingales using symmetry to verify the martingale property. This is getting rather far from my usual home ground so I may well be writing nonsense now (there is always a positive probability of that, but further from home it increases). Richard Gill (talk) 15:58, 22 March 2011 (UTC)

## Probability notation

Further to your exception to "the probability P(C)" and "Bayes theorem with P(H|S,C)", I wonder if it would be worth your writing a WP:ESSAY to review the various different notations that are used, who uses them (eg what do particular style guides for particular journals prescribe), and what claimed advantages/disadvantages each form has (including why you object so much to the above, even though it's in common enough use).

Notation in probability and statistics says very little on this score. That may be appropriate, considering it's very introductory. But perhaps it is an area that WP:WPSTATS maybe ought to give more of a steer towards best practice on, and a personal essay surveying the territory might be a good start to get a discussion rolling. Jheald (talk) 11:11, 19 March 2011 (UTC)

That's a very nice suggestion. Maybe I'll do that. Richard Gill (talk) 12:18, 19 March 2011 (UTC)
OK: I started something. Please take a look at my first draft essay on probability notation, you can talk about it at essay-talk Richard Gill (talk) 18:16, 21 March 2011 (UTC)
Looks like it's going to be interesting. I like the "gentle guided walk through probability-land" tone, rather than focussed polemic; I'm learning quite a lot even before you go back and ramp up the focus on notation.
I expect I'm bringing coals to Newcastle, but this page has a couple of interesting short paragraphs on who introduced some of the different basic notations. Jheald (talk) 10:19, 22 March 2011 (UTC)
Wow, there were quite a few surprises for me on that page. Mostly, surprises that so many notations which are nowadays - in mathematics and in statistics - pretty standard, are in fact incredibly modern. (Since I was born in 1951 I consider the 30's and 40's to belong to the modern age!).
I've now added stuff leading up to Bayes and Monty Hall! Richard Gill (talk) 16:05, 22 March 2011 (UTC)

## Glkanter

Do you think you can persuade him to stop ranting. He's likely to stay blocked unless he indicates that he's going to stop, and find some other articles to edit.--Elen of the Roads (talk) 23:28, 19 March 2011 (UTC)

Message received. I'll do what I can to mediate. On his behalf I want to say: Glkanter's a good guy. Has a strong sense of fairness. Judging behaviour independent of content is dangerous. Especially when people jump on bandwagons and start throwing dirt around. Then it sticks onto the outsider, someone who stands out being clearly different from all the others. That doesn't necessarily mean they are the bad guy. It just means they are an easy victim. This is what I learnt from the Lucia de Berk case, the Kevin Sweeney case, and many others. Richard Gill (talk) 08:11, 20 March 2011 (UTC)

## Virtual bottle of good wine

The excessive excise exacted here for non-locally produced alcoholic drinks, including wine, requires virtualization of the award – and, after all, virtue is its own reward! So why don't you donate an appropriate amount of money to some worthy goal, such as a local Innocence Project or equivalent, or a fund for nooddruftige slachtoffers van waarschijnlijkheidsleermisbruik.  --Lambiam 23:20, 20 March 2011 (UTC)

OK! Richard Gill (talk) 05:07, 21 March 2011 (UTC)
@Lambiam, I'm going to support Medecins sans Frontieres. So, Monty Hall will save lives in war-torn and natural-disaster shattered regions of the world. Richard Gill (talk) 10:11, 5 April 2011 (UTC)

## User pages

Please take more care with your user pages. At present User:Gill110951/User BS, User:Gill110951/User IMS, User:Gill110951/User VVS-OR all appear in Category:Statistics and user pages should not do so. You can hide categories so that other users are not affected by them, by placing a ":" before "Category". This is the very thing that the text on these pages warns against. Also "Statistics" is not the right category for templates, if these are made public eventually ... you might try Category:WikiProject user templates and Category:Statistics templates. Melcombe (talk) 10:09, 22 March 2011 (UTC)

Thanks! (I have a lot to learn about these things...) Richard Gill (talk) 15:42, 22 March 2011 (UTC)

## Wikipedia:Arbitration/Requests/Case/Monty Hall problem closed

An arbitration case regarding Monty Hall problem has now closed and the final decision is viewable at the link above. The following is a summary of the sanctions that were enacted:

For the Arbitration Committee, NW (Talk) 00:47, 25 March 2011 (UTC)


## MHP - sources

Richard, yesterday I added the Uni Leiden source to the article, and today I tried to add "Monty Hall Problem" (version 5), StatProb: The Encyclopedia, as a source. I'm no expert in adding references, so I asked on the article talk page if someone could check whether I did it right. And as I don't know whether you agree with my edits, I ask you to have a look there also. Thank you. Gerhardvalentin (talk) 22:06, 3 April 2011 (UTC)

Thanks, I'll check, but surely somebody else will check too. The important thing is that there is consensus. So editors who are interested in this matter should check the source and see if they agree. Also the editors have to decide whether or not these alternative proofs are useful or interesting. Richard Gill (talk) 06:33, 4 April 2011 (UTC)

## MHP notation for Bayes' theorem proof

This is for @Glopk. I'm still trying to control the typography better. If fine points of notation involving use of small letters or capitals, subscripts and superscripts are involved, all mathematics should be in the same typeface. No unnecessary changes of font or size or style. Right now *everything* mathematical is in math display mode (png images generated from latex code). This looks really bad in most browsers. But the alternative is very different mathematical typography in big displayed formulas and in-line in the text. This seems to me to be another reason why this material is out-of-place in the Monty Hall article.

We now derive this result (conditional 2/3) by running through the standard and elementary proof of Bayes' theorem, which involves nothing more than a couple of applications of the chain rule (the definition of conditional probability) and the law of total probability (splitting a probability up according to constituent events). The only difference from the usual textbook derivation is that every single probability is actually conditioned on the door selected by the player. Bayes' theorem is a theorem about probabilities in general, and a conditional probability measure satisfies the axioms of probability, so the theorem applies to it too.

I notice that the notion of randomness in this proof is frequentistic. Randomness is seen in the procedure for hiding the car and the procedure of the host for opening a door - not in the complete lack of information of the player concerning these operations. Biased POV, corresponding to the likely POV of the original academic sources.

Consider the discrete random variables, all taking values in the set of door numbers ${\displaystyle \{1,2,3\}}$:

${\displaystyle C\,}$: the number of the door hiding the Car,
${\displaystyle S\,}$: the number of the door Selected by the player, and
${\displaystyle H\,}$: the number of the door opened by the Host.

Use the symbols ${\displaystyle c\,}$, ${\displaystyle s\,}$ and ${\displaystyle h\,}$ to denote possible values of these random variables, i.e., door numbers in ${\displaystyle \{1,2,3\}\,}$. Use a ${\displaystyle p\,}$ to denote a generic probability mass function (discrete density function). "Officially" we should indicate whose probability mass function we are talking about, by attaching the name of the random variable concerned, as a subscript. However we can safely employ the common "abuse of notation" where the subscript is omitted if it is clear from the context. Thus ${\displaystyle \Pr(C=c)\,}$, as a function of ${\displaystyle c=1,2,3\,}$, is officially denoted by ${\displaystyle p_{C}(c)\,}$ but often abbreviated just to ${\displaystyle p(c)\,}$. The same conventions will be used for conditional probability distributions, more precisely, for the conditional probability mass functions of various conditional probability distributions.

As the host's placement of the car is random, all values ${\displaystyle c\,}$ in ${\displaystyle \{1,2,3\}\,}$ of ${\displaystyle C\,}$ are equally likely. The (unconditional) probability distribution of ${\displaystyle C\,}$ has probability mass function

${\displaystyle p(c)={\tfrac {1}{3}}\,}$, for all three values of ${\displaystyle c\,}$.

Further, as the initial choice of the player is independent of the placement of the car, the random variables ${\displaystyle C\,}$ and ${\displaystyle S\,}$ are independent. Hence the conditional probability mass function of ${\displaystyle C\,}$ given ${\displaystyle S=s\,}$ is

${\displaystyle p(c|s)=p(c)\,}$, for every value of ${\displaystyle c\,}$ and ${\displaystyle s\,}$.

The host's behavior is reflected by the values of the conditional probability distribution of ${\displaystyle H\,}$ given ${\displaystyle C=c\,}$ and ${\displaystyle S=s\,}$, with conditional probability mass function ${\displaystyle p(h|c,s)\,}$ given by

${\displaystyle p(h|c,s)\ =\ {\begin{cases}\ 0&{\text{if }}h=s{\text{ (the host cannot open the door picked by the player),}}\\\ 0&{\text{if }}h=c{\text{ (the host cannot open a door with a car behind it),}}\\\ {\tfrac {1}{2}}&{\text{if }}h\neq s{\text{ and }}s=c{\text{ (the two doors with no car are equally likely to be opened),}}\\\ 1&{\text{if }}h\neq c{\text{ and }}h\neq s{\text{ and }}s\neq c{\text{ (there is only one door available to open).}}\end{cases}}}$

The player can then run through the proof of Bayes' theorem to compute the probability of finding the car behind any door, given the initial selection and given the door opened by the host. This is the conditional probability of ${\displaystyle C=c\,}$ given ${\displaystyle H=h\,}$ and ${\displaystyle S=s\,}$. In terms of the (conditional) probability mass functions of these random variables, by twice applying the definition of conditional probability (the chain rule),

${\displaystyle p(c|h,s)={\frac {p(c,h|s)}{p(h|s)}}={\frac {p(h|c,s)p(c|s)}{p(h|s)}}.\,}$

The denominator can be expanded using the law of total probability and again the definition of conditional probability as the marginal probability mass function

${\displaystyle p(h|s)=\sum _{c=1}^{3}p(h,c|s)=\sum _{c=1}^{3}p(h|c,s)p(c|s)\,}$.

Thus, if the player initially selects Door 1, and the host opens Door 3, the probability of winning by switching is

${\displaystyle p_{C|H,S}(2|3,1)\,}$
${\displaystyle \ =\ {\tfrac {p_{H|C,S}(3|2,1)p_{C|S}(2|1)}{p_{H|C,S}(3|1,1)p_{C|S}(1|1)+p_{H|C,S}(3|2,1)p_{C|S}(2|1)+p_{H|C,S}(3|3,1)p_{C|S}(3|1)}}}$
${\displaystyle \ =\ {\tfrac {1\times {\frac {1}{3}}}{{\frac {1}{2}}\times {\frac {1}{3}}+1\times {\frac {1}{3}}+0\times {\frac {1}{3}}}}\ =\ {\tfrac {2}{3}}.}$

Richard Gill (talk) 07:57, 5 April 2011 (UTC)
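The derivation above can be checked mechanically. The following is a minimal sketch, not part of the proposed article text: the function names `p_car`, `p_host` and `posterior` are made up for the illustration, and they simply encode ${\displaystyle p(c|s)\,}$, ${\displaystyle p(h|c,s)\,}$ and ${\displaystyle p(c|h,s)\,}$ as defined above, using exact fractions.

```python
from fractions import Fraction

DOORS = (1, 2, 3)


def p_car(c, s):
    """p(c|s): the car's location is uniform and independent of the player's choice."""
    return Fraction(1, 3)


def p_host(h, c, s):
    """p(h|c,s): the host's behaviour, case by case as in the text."""
    if h == s or h == c:
        return Fraction(0)      # never the player's door, never the car
    if s == c:
        return Fraction(1, 2)   # two goat doors available, chosen evenly
    return Fraction(1)          # only one door left to open


def posterior(c, h, s):
    """p(c|h,s) = p(h|c,s) p(c|s) / sum over c' of p(h|c',s) p(c'|s)."""
    numerator = p_host(h, c, s) * p_car(c, s)
    denominator = sum(p_host(h, k, s) * p_car(k, s) for k in DOORS)
    return numerator / denominator


print(posterior(2, 3, 1))  # 2/3
```

Changing the `Fraction(1, 2)` in `p_host` to some other value is a quick way to see how a biased host alters the conditional answer while leaving the structure of the proof untouched.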

## MHP: it all comes down to what you probably mean by probability

An editor wrote to me about the host bias question. I replied with the following "essay" on concepts of probability. You see, my correspondent touched unknowingly (I think) on a major issue at the heart of many quarrels about the Monty Hall problem. In my opinion there are different concepts of "probability" out there, and many endless discussions could be short-circuited if people would first of all, up front, say what they understand by the concept.

Most ordinary people use probability in its so-called subjective or Bayesian variant. Don't take too much notice of the names. This is nothing to do with Bayes, and it doesn't have to be subjective. What it does mean is that probability is in your head, it's a property of the information or lack of information in your head. So for instance, the host might be biased, but since you don't know anything about his possible bias, and you have no idea if it is in one direction or the other, *for you* it is equally likely that he would open door 2 or door 3 if he had the choice. Similarly *for you* in advance the car is equally likely behind any of the three doors.

On the other hand, many scientists and statisticians use probability in its so-called frequentist variant. Actually there is a lot of subjectivity in this notion too. What it does mean is that probability is in the physical world, in properties of things like dice or coins, or rather, in the whole mechanism around tossing a die or a coin. People who use probability in this sense will think of Monty Hall's brain choosing a door as a kind of random number generator. If in your imagination he would make his choice many many times, he would probably behave like a biased coin, choosing one door more often than the other. Similarly, hiding the car behind one of the doors in advance of the show might be done by tossing a real die, or opening a telephone directory at a random page, etc., and if in imagination this is repeated many times, doors 1, 2 and 3 will be selected in different proportions of the times.

By the way, don't think of these repetitions as being made sequentially in time with all the implied possibility of learning and adapting and changing. Think of the repetitions being made in parallel universes. Just before the show, we clone the physical universe a large number of times - not exactly, but so closely that no one can see any difference. We then let all these copies run forwards in time through the show. The car is hidden, the player makes their choice, the host opens a door. Each time because of tiny variations (e.g. a butterfly flapping its wings or not on the other side of the earth) history unrolls just slightly differently, and sometimes the car is hidden behind one door, sometimes behind another.

The art of understanding probability models is to ask yourself: what stays the same, what is different, in each repetition? If you believe ultimately in a deterministic world, you have to allow variation in the initial conditions so as to obtain variation in the outcome. But minuscule differences in the initial state of Monty's brain will lead to his sometimes opening door 2, sometimes door 3, when he has the choice. Just as minuscule differences in the initial state of coin and hand result in a coin sometimes falling heads, sometimes tails, in a totally deterministic process already understood by Newton.

The *subjectivist* who chose door number 1 because it's their lucky number will believe, after the host has opened door 3, that the car is two times more likely behind door 2 than door 1 and he'll switch. This "two times more likely" is just the logical consequence of his initial symmetric knowledge/lack of knowledge about door hiding car and door opened by host.

The *frequentist* realises that he doesn't know any of the probabilities so in advance he'll choose his own initial door by tossing a fair die (divide the outcome - from 1 to 6 - by two, and round up to a whole number 1,2 or 3) and thereafter switch to the other closed door, without taking notice of the numbers of the door chosen by him nor of that opened by the host. He knows that his procedure gives him a 2/3 overall chance of getting the car. He knows this is the best you can do. He does not know what is the chance that he gets the car given that he chose door 1 and the host opened door 3, and he doesn't care either.

I know that my frequentist will go home with the cadillac on two thirds of the times that the game is played. I have no idea how often the subjectivist will get the car because I have no idea how good he is at choosing a door which can't be predicted by the host and I have no idea how the host and his team are hiding cars and opening goat doors.

I have even less idea how often either frequentist or subjectivist will go home with the car on those specific occasions when the player first chose door 1 and the host opened door 3 and the player then switched.

So it's amusing that one person knows the conditional probability is 2/3 because he knows nothing, while another person knows they don't know the conditional probability for the very same reason. Richard Gill (talk) 07:16, 5 April 2011 (UTC)
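The contrast drawn above can be made concrete with a small exact calculation. This is a sketch under stated assumptions, not something taken from the essay: the car's location follows an arbitrary distribution `car_dist`, the host's bias is summarised by a single made-up parameter `q` (the chance that he opens the higher-numbered door when he has a choice), and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

DOORS = (1, 2, 3)


def host_prob(h, c, s, q):
    """P(host opens h | car at c, player chose s).

    q is an assumed bias parameter: the chance that the host opens the
    higher-numbered door when two goat doors are available to him.
    """
    if h == s or h == c:
        return Fraction(0)
    if s != c:
        return Fraction(1)
    options = [d for d in DOORS if d != s]
    return q if h == max(options) else 1 - q


def overall_win_by_switching(car_dist, q):
    """Win probability for a player who picks uniformly at random, then switches."""
    total = Fraction(0)
    for c, s, h in product(DOORS, repeat=3):
        p = car_dist[c] * Fraction(1, 3) * host_prob(h, c, s, q)
        if p == 0:
            continue
        switch_to = next(d for d in DOORS if d not in (s, h))
        if switch_to == c:
            total += p
    return total


def conditional_win_by_switching(car_dist, q, s=1, h=3):
    """P(switching wins | player chose s, host opened h), assuming that event can occur."""
    other = next(d for d in DOORS if d not in (s, h))
    weights = {c: car_dist[c] * host_prob(h, c, s, q) for c in DOORS}
    return weights[other] / sum(weights.values())


uniform = {d: Fraction(1, 3) for d in DOORS}
skewed = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}

print(overall_win_by_switching(skewed, Fraction(9, 10)))
for q in (Fraction(0), Fraction(1, 2), Fraction(1)):
    print(q, conditional_win_by_switching(uniform, q))
```

The overall figure stays at 2/3 whatever `car_dist` and `q` are, because the player's own randomisation does the work. The conditional figure for "chose door 1, host opened door 3" moves between 1/2 and 1 as `q` varies, even with a uniformly hidden car, which is why the frequentist declines to name it.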

## Henze,Georgii, Bayesians and Frequentists

Hi Richard! I posted an answer regarding your question on the MHP discussion page, but since it might get lost in the ever increasing threads there, I'll provide you a copy here:

The 8th edition (2009) of Henze's rather popular 1997 book Stochastik für Einsteiger is partially available at Google Books; the MHP is treated (in a frequentist manner) at p. 52 (unconditional) and p. 104 ff. (conditional). You may also want to take a look at another probability textbook by another German mathematician, posted earlier by a math portal member, which considers the Bayesian angle with an interesting twist (leading to p=1/2 rather than p=2/3 as the "best answer" to ascertain the probability of winning by switching). It can also be found at Google Books pp. 54-56--Kmhkmh (talk) 11:34, 6 April 2011 (UTC)
Thanks! In the meantime also Gerhard helped me with the good pages from Henze. Georgii looks interesting but unfortunately Google Books only shows me pages 54 and 56. Richard Gill (talk) 14:25, 6 April 2011 (UTC)
Odd, I can see all 3 pages of Georgii, but Google Books is known to be tricky. You could also try the German edition of his book pp. 56-58 or simply try at a later time. If this doesn't work either and the book is not in your university library I could email you some screen copies if you are interested.--Kmhkmh (talk) 15:51, 6 April 2011 (UTC)
P.S.: If I understand it correctly Georgii's approach is neither Monty Hall nor Monty Fall (being frequentist and essentially in the line of Morgan's approach), but it takes a closer look at the probabilities in the Bayesian approach and what are reasonable ways to quantify them. One approach leads to the canonical solution (i.e. vos Savant and p=2/3) but the other one, being arguably closer to Monty's real behaviour, leads to p=1/2. The difference lies in the contestant's prior knowledge influencing the probability he assigns. If you treat it as an abstract thought problem somewhat disconnected from real gameshows in general and Monty's show in particular you get vos Savant's solution; if however you include some very basic knowledge about real gameshows (say the contestant has watched Monty or a similar show several times) you get p=1/2.--Kmhkmh (talk) 16:31, 6 April 2011 (UTC)
Thanks for the tip! I got all three German pages. Most interestingly, Georgii takes the doors to be numbered in advance but the numbering unknown to the player. Door number 1 *is* the door hiding the car. The player chooses a door and its number, which he doesn't know, is equally likely to be any of the three. Georgii supposes that the event A has happened that the host opens a different door to the one chosen by the player and that the host's door reveals a goat. He argues why we should interpret Vos Savant's words as implying that A was certain. Therefore conditioning on A does not change the chance that the player's choice is the door hiding the car.
Thus Georgii promotes Glkanter's simple solution.
After this, he considers the alternative interpretation where A was not certain. In particular, he considers the case where the host opens one of the two goat doors, with equal chances 1/2, independently of which door was chosen by the player. Now we condition on the probability 2/3 event that the door opened by the host is different from the door chosen by the player.
Also in this variant, the Helpful Host who picks a goat door at random (not restricted to be different from that of the player) and offers the player the chance to revise his choice, Georgii does not condition on specific door numbers! Richard Gill (talk) 16:16, 6 April 2011 (UTC)
Yes I was reminded of Glkanter too, but it seems though he takes Glkanter's approach (in particular following a real game show with the real Monty) he reaches a different conclusion (i.e. p=1/2). It is also interesting to note that Georgii, as far as the exact modelling and interpretation is concerned (and the community's debate about it), considers the issue as yet "unsettled".--Kmhkmh (talk) 16:31, 6 April 2011 (UTC)
It can never be settled, because Vos Savant's words are ambiguous, not everyone reads her later amplifications of her intention, and anyway, you are not obliged to believe her. The issue is unsettled because it is unsettl-able. But again remember, Georgii is talking about whether or not we are supposed to know in advance that the host is certain to open a goat door different from our door and can and does do this, because he knows where the car is. I would say that there is a clear strong majority opinion on this, which coincides with Vos Savant's own later amplification of her question. So the issue will always come up again and again, so the issue is unsettled, but from the point of view of professional Monty Hall Studies, it is settled. Richard Gill (talk) 11:07, 28 April 2011 (UTC)
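As a check on the p=1/2 figure for the second reading discussed in this thread, here is a short worked calculation. It is a sketch in labels chosen for this note, not Georgii's own notation: "car" and "goat" refer to what is behind the player's first choice, and ${\displaystyle A\,}$ is the event that the host's goat door differs from the player's door, with the host picking either goat door with chance 1/2 regardless of the player's choice.

${\displaystyle \Pr(A)=\Pr(A|{\text{car}})\Pr({\text{car}})+\Pr(A|{\text{goat}})\Pr({\text{goat}})=1\times {\tfrac {1}{3}}+{\tfrac {1}{2}}\times {\tfrac {2}{3}}={\tfrac {2}{3}},}$

${\displaystyle \Pr({\text{car}}|A)={\frac {\Pr(A|{\text{car}})\Pr({\text{car}})}{\Pr(A)}}={\frac {1\times {\tfrac {1}{3}}}{\tfrac {2}{3}}}={\tfrac {1}{2}}.}$

So under this reading staying and switching each win with chance 1/2, whereas when ${\displaystyle A\,}$ is certain in advance, conditioning on it changes nothing and the player's door keeps its chance 1/3.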

Google books randomly blocks some pages for each IP address over a certain time period. (Well it's not entirely random; it's directly proportional to how many pages of the book in question the IP has already accessed recently.) You can have better luck from another IP address and/or if you wait a few days, assuming of course, you can't check it via a library. Tijfo098 (talk) 13:51, 8 April 2011 (UTC)

As Glkanter is no longer able to speak here let me make a comment which I suspect he would like to make and which I agree with. The versions discussed above relate to interesting academic extensions and variants of the problem. It is easy to extend these ad infinitum, as for example, in my own variant in which the host tries to help the player by using different words when the player has originally chosen the car. Unlikely maybe, but certainly possible in the context of an early TV game show.
The core problem, however, is a simple mathematical puzzle, for which there is a set of conventions to treat such problems simply and not add unnecessary complications.
Kmhkmh, I hope you will support, as you have in the past, my proposed structure which allows us to deal first with the simple puzzle which everybody gets wrong, which is the only reason we have an article on this subject at all, and then go into the complications, extension and variants in a comprehensive and scholarly manner. I make no point about which solutions are the 'right' ones, because we all know there is no such thing. As Seymann so aptly says it all depends on the intent of the questioner. Martin Hogbin (talk)
Exactly, Martin. It depends on the perspective, similar to wave–particle duality  :-)
But you could, just at the beginning, place a remark that some controversy exists in perspective, together with a reference to the "Academic controversies"  Gerhardvalentin (talk) 18:59, 8 April 2011 (UTC)
I do not think there is any real controversy except here on the WP talk pages. In any case the lead should reflect the article as a whole and might therefore contain some mention of the more complex solutions. Martin Hogbin (talk) 22:20, 8 April 2011 (UTC)
Not might but should.--Kmhkmh (talk) 22:29, 8 April 2011 (UTC)
As far as chapters/sections are concerned I have no objections to describing the simple solution first and in its own chapter without caveats. I do however have a strong objection to dumbing down the article's lead, meaning the lead needs to mention the ambiguity and the different treatments and solutions; this can be along the lines of Tijfo098's lead suggestion.
I don't agree at all with your notion of why we have an article on MHP in the first place. We would have an article on MHP with or without the vos Savant affair, just like we have articles on Gardner's Three Prisoners problem, Bertrand's box paradox and many more. All we require for a science or math entry in WP (or for a general interest puzzle as well) is simply that there are a (few) reputable publication(s) about it and in addition maybe that it is somewhat known in the community. That's it, nothing more is required there, and MHP has that with or without vos Savant, the media storm or any other individual author/publication.
As far as the work on the article is concerned (ideally) I'd like to see that done by new authors (being established/trusted WP authors though), i.e. none of us (and in particular not you and Rick) should get involved there.--Kmhkmh (talk) 22:29, 8 April 2011 (UTC)
P.S.: All of this is now drifting away from my original posting to Richard and we probably shouldn't spam his talk page with yet another general MHP discussion.--Kmhkmh (talk) 22:29, 8 April 2011 (UTC)
Do feel free to continue, please be my guests! As long as we keep "good faith" and "civility" in mind, the more talk is done, in the more places, the better, I think. I am trying to break my addiction to MHP; and I am trying to be a good reliable source writer in the field, avoiding suggestion of COI with respect to my dual role of wikipedia editor and supporter. And also maintain principles of assuming good faith and acting with civility, focussing on content not persons. New things keep coming up! It's wonderful what a rich problem this is. Richard Gill (talk) 10:26, 9 April 2011 (UTC)

## MHP FAR

I have nominated Monty Hall problem for a featured article review here. Please join the discussion on whether this article meets featured article criteria. Articles are typically reviewed for two weeks. If substantial concerns are not addressed during the review period, the article will be moved to the Featured Article Removal Candidates list for a further period, where editors may declare "Keep" or "Delist" the article's featured status. The instructions for the review process are here. Tijfo098 (talk) 22:57, 7 April 2011 (UTC)

## Delightful!

Hi, Richard! I just read very quickly through your paper, The Monty Hall Problem is not a Probability Puzzle (It’s a challenge in mathematical modelling). I'd need to spend several hours with it, or perhaps several days, to understand it well since it has been so long since I've thought about applied math, but I just wanted to tell you how much I liked what I saw. It really is a very delightful paper: I especially liked, "The battle among wikipedia editors could be described as a battle between intuitionists versus formalists, or to use other words, between simplists versus conditionalists." That made me grin. But then, I've always felt a great affection for Brouwer. :-) Anyway, I just wanted to tell you how much I enjoyed your paper, and to thank you for making it available on the web. Best regards,  – OhioStandard (talk) 09:22, 8 April 2011 (UTC)

Thank you! All fan mail is greatly appreciated. Richard Gill (talk) 07:22, 15 April 2011 (UTC)

## Who gave us Bayes' rule?

I'm offering a prize to whoever can tell me who invented Bayes' rule. What is the origin of the name, the concepts... Richard Gill (talk) 07:22, 15 April 2011 (UTC)

I don't know if he was the first to use the term, but the term was used and the rule was clarified by Antoine Augustin Cournot in his 1843 book Exposition de la théorie des chances et des probabilités. There are at least 12 occurrences of the term in this book, plus one occurrence of "Bayes' principle", which appears to mean the same. The first occurrence is on p. 155:
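"une règle dont le premier énoncé appartient à l'Anglais Bayes, et sur laquelle Condorcet, Laplace et leurs successeurs ont voulu édifier la doctrine des probabilités à posteriori, ..." (in English: "a rule whose first statement belongs to the Englishman Bayes, and on which Condorcet, Laplace and their successors sought to build the doctrine of a posteriori probabilities, ...")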
(The remainder of this sentence is interesting, but it's long and too much work to retype it.) For the full story one should apparently check the writings by Condorcet and Laplace on posterior probabilities – although our article on Bayes' theorem states, without giving a source, that Laplace reproduced and extended Bayes' results, apparently quite unaware of Bayes' work. The second occurrence of the term is on p. 158: "la règle attribuée à Bayes", with a footnote referring to Transactions philosophiques of 1763, p. 370, which must be the issue of the Transactions philosophiques de la Société royale de Londres in which (a special case of) Bayes' theorem was first published (posthumously) in his "Essay Towards Solving a Problem in the Doctrine of Chances". It is interesting in this context that Cournot is often considered a frequentist (see e.g. our article on Frequency probability).  --Lambiam 18:51, 18 April 2011 (UTC)
P.S. According to Google, the name "Bayes" occurs just once in Condorcet's 1785 book Essai sur l’application de l’analyse à la probabilité des décisions rendues à la pluralité des voix, where it is stated on p. lxxxiij that Messrs. Bayes & Price have given a method to find the probability of future events according to the law of past events (whatever that may mean). I did not find an online copy of Condorcet's 1805 book Éléments du calcul des probabilités et son application aux jeux de hasard, à la loterie et aux jugements des hommes, which is considered an expanded and improved rewrite of the Essai sur l'application.  --Lambiam 19:14, 18 April 2011 (UTC)
Thanks! I am trying to get an eBook of the Cournot. You can only get it from Google books if you are in the US. But then it is free. Please note that I really mean Bayes' rule, not Bayes' theorem. I mean the result that posterior odds equals prior odds times likelihood ratio (or Bayes factor). Richard Gill (talk) 07:43, 22 April 2011 (UTC)
I had not realized you were specifically interested in the odds version. As far as I know the use of odds is a mainly Anglo-Saxon oddity; I even think there is no French term for the concept. It appears to me that the insistence that the term "Bayes' rule" is more specifically used for this version may be particular to Wikipedia. All sources I have looked at use "Bayes' rule" as synonymous with "Bayes' theorem", and use a circumlocution such as "the odds-likelihood ratio form of Bayes' rule", or "Bayes' rule in odds form", when they mean this particular version.  --Lambiam 23:09, 23 April 2011 (UTC)
This Anglo-Saxon oddity is why the Anglo-Saxons have better probabilistic intuition than others. We can do it without odds, if you like, but then we must introduce the concept of "is proportional to". There is a mathematical symbol for this ${\displaystyle \propto }$ but I've noticed again that while Anglo-Saxons have learnt this at school, for mainland Europeans it's something alien. If you know the "proportional to" concept then you can express Bayes' theorem/rule as
${\displaystyle \Pr(H|D)\propto \Pr(H)\Pr(D|H)}$
The posterior probability of a hypothesis is proportional to the prior probability of the hypothesis times the likelihood of the hypothesis given the data, where the likelihood of the hypothesis given the data is defined as the chance of the data under the hypothesis, up to proportionality. All proportionalities are here to be read as "over H, for fixed D". Richard Gill (talk) 11:16, 24 April 2011 (UTC)
Posterior is proportional to prior times likelihood. This is how we use Bayes' theorem. We don't want to think about the normalizing constant, the denominator in Bayes' theorem. It's a distraction, a burden. We study the shape of prior times likelihood and fix the normalizing constant when it suits us best, preferably after we have dropped complicated but constant factors (i.e. factors depending only on D, not on H). We can do this because of the theorem that the conditional probabilities of events, given one fixed event, are themselves a probability measure, that is to say, satisfy the usual axioms. In particular conditional probabilities of mutually exclusive and jointly exhaustive events add up to 1. Hence if we know such a collection of conditional probabilities up to proportionality, we know them exactly.
This is where odds come in, implicitly. Odds are ratios of probabilities. Because of the normalization of probability, it is enough to know odds. Odds are a natural concept to gamblers. The odds for and against an event are the fair wagers for and against. If the odds are five to two in favour of something happening, then I'm happy to wager five Euros on its happening against your two Euros that it doesn't. For every five times I win, you'll win two times. When I win five times I receive five times your wager of two Euros. When I lose two times I lose two times my wager of five Euros. The equivalence of odds and fair betting wagers is an application of the commutativity of multiplication. Two times five equals five times two. Richard Gill (talk) 11:19, 24 April 2011 (UTC)
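As a small illustration of "posterior is proportional to prior times likelihood" and of the odds form, here is a minimal Python sketch. The example data (the standard Monty Hall setup, player picks door 1, host opens door 3, host chooses uniformly when he has a choice) and the function name `posterior` are assumptions made for this sketch only.

```python
from fractions import Fraction

def posterior(prior, likelihood):
    """Normalize prior * likelihood over the hypotheses H, for fixed data D."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())  # the normalizing constant, fixed last
    return {h: w / total for h, w in unnormalized.items()}

# Assumed illustration: hypothesis h = "the car is behind door h";
# data D = "host opens door 3" after the player has picked door 1.
prior = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}
likelihood = {1: Fraction(1, 2), 2: Fraction(1), 3: Fraction(0)}
post = posterior(prior, likelihood)

# Odds form: posterior odds = prior odds * likelihood ratio (door 2 against door 1).
prior_odds = prior[2] / prior[1]
likelihood_ratio = likelihood[2] / likelihood[1]
posterior_odds = prior_odds * likelihood_ratio

print(post, posterior_odds)
```

The normalizing constant is only computed once, at the end; the odds calculation never needs it at all.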

## Two envelopes problem

### The problem

The basic setup: Let us say you are given two indistinguishable envelopes, each of which contains a positive sum of money. One envelope contains twice as much as the other. You may pick one envelope and keep whatever amount it contains. You pick one envelope at random but before you open it you are offered the possibility to take the other envelope instead, cf. Falk, "The Unrelenting Exchange Paradox", in Teaching Statistics, Vol. 30:3, 2008.

The switching argument: Now suppose you reason as follows:

1. I denote by X the amount in my selected envelope, by Y the amount in the other
2. The probability that X<Y is 1/2, and the probability that X>Y is also 1/2.
3. The other envelope may contain either 2X or X/2.
4. If X is the smaller amount the other envelope contains 2X.
5. If X is the larger amount the other envelope contains X/2.
6. Thus the other envelope contains 2X with probability 1/2 and X/2 with probability 1/2.
7. So given X=x, the expected value of the money in the other envelope is

${\displaystyle {1 \over 2}2x+{1 \over 2}{x \over 2}={5 \over 4}x}$

8. This is greater than x, so I gain on average by swapping.
9. After the switch, I can reason in exactly the same manner as above, concerning Y.
10. I will conclude that the most rational thing to do is to swap back again.
11. To be rational, I will thus end up swapping envelopes indefinitely.
12. As it seems more rational to open just any envelope than to swap indefinitely, we have a contradiction.

The puzzle: The puzzle is to find the flaw in the very compelling line of reasoning above.

### Informal solution to original problem

A careful analysis shows that the probability assumptions used in the argument for switching are inconsistent with the laws of (subjective) probability. No rational person could ever have the beliefs which are assumed in the outset of the problem statement. To be specific, it could never be the case that the other envelope is equally likely to be double or half the first envelope, whatever the amount in the first envelope. This silly assumption led to a silly conclusion (namely, keep on switching for ever).

Suppose one of the envelopes could contain the amount x. Then apparently, x/2 and 2x are also possible. From this, x/4 and 4x must also be possible amounts of money, and so on. These amounts must not only be considered possible: by the laws of probability, all of them must be exactly equally likely.

This certainly leads to a contradiction with a common sense approach to real world problems. The amount of money in the world is bounded so there definitely is an upper limit to the money which could be in the envelopes. Also, we don't allow for arbitrarily small amounts of money in the real world.

However, one need not invoke pragmatic principles to defuse the paradox. Also in mathematics, it is not possible to have uniform probability distributions over infinite, discrete sets. You just can't divide total probability 1 into an infinite number of pieces which are both positive and equal. So also within the abstract world of mathematics, the paradox is resolved by saying that the 2x or x/2 equally likely assumption, whatever x, can never be true. No assignment of probabilities to all possibilities can have this feature.

### Formal analysis of original problem

We now give the same argument in a more rigorous mathematical form.

In the Bayesian paradigm, a person who is reasoning with uncertainty in a self-consistent way is supposed to express his or her uncertainty about the possible pair of amounts of money in the two envelopes according to a probability distribution over all pairs. Everything is determined by the probability distribution of the smaller amount of money, since this fixes everything else (the other envelope has twice that amount, and the first envelope given to our subjective Bayesian is equally likely to be either). Given this probability distribution, we can compute the conditional probabilities that the other envelope contains 2x or x/2, given that the first envelope contains x. Suppose that these two conditional probabilities are both 1/2, for any value of x which could occur. It follows that if a random envelope contains an amount of money between 1/2 and 1, the other is equally likely to contain the doubled or halved amount, which is between 1 and 2, or between 1/4 and 1/2 respectively.

Now the smaller amount of money has to be between 1 and 2, or between 2 and 4, or between 4 and 8, ... or between 1/2 and 1, or between 1/4 and 1/2, or between 1/8 and 1/4 ... where in each case let's include the lower end-point of the interval and exclude the upper end-point. To say it a different way, we express the smaller amount of money in binary and look at the position of the leading binary digit; we now may as well round off the rest of the binary fraction to zeros. Binary 101.111 is rounded down to binary 100 or decimal 4 (a number at least as large as 4, and strictly smaller than 8). Binary 0.00101 is rounded down to binary 0.001 or decimal fraction 1/8 (a number at least as large as decimal fraction 1/8 and strictly smaller than 1/4).

After this rounding down, the only amounts of money possible in either (and both) envelopes are ..., 1/8, 1/4, 1/2, 1, 2, 4, 8, ... and all amounts are possible.

Now suppose that the smaller envelope (rounded down) contains the amount ${\displaystyle x=2^{n}}$ with probability ${\displaystyle p_{n}}$, where n is any whole number (negative or positive). The probability that an arbitrary envelope contains x is ${\displaystyle (p_{n}+p_{n-1})/2}$; the probability that it contains x and is the smaller of the two is ${\displaystyle p_{n}/2}$. The conditional probability that it is the smaller amount can only be 1/2 if ${\displaystyle p_{n}/(p_{n}+p_{n-1})=1/2}$, and simplifying this equation tells us ${\displaystyle p_{n-1}=p_{n}}$, for all n=...,-2,-1,0,1,2,...

Thus the amount of money in the smaller envelope (rounded down as explained above) is equally likely to be any of an infinite sequence, but there is no probability distribution with this property: we can't divide total probability 1 into an infinite number of both equal and positive probabilities.
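To see the obstruction concretely, one can take any proper distribution for the smaller amount and compute the conditional probability from the formula above. The following Python sketch uses a hypothetical prior, p_n = 2^(-|n|)/3 over all whole numbers n, chosen here purely for illustration; the conditional probability comes out as 1/3 or 2/3, never 1/2.

```python
from fractions import Fraction

def p(n):
    """Hypothetical proper prior on the smaller amount 2**n: p_n = 2**(-|n|) / 3."""
    return Fraction(1, 3) * Fraction(1, 2) ** abs(n)

def prob_smaller_given_amount(n):
    """P(chosen envelope holds the smaller amount | it holds 2**n) = p_n / (p_n + p_{n-1})."""
    return p(n) / (p(n) + p(n - 1))

# Conditional probabilities for amounts 2**-5, ..., 2**5.
values = {n: prob_smaller_given_amount(n) for n in range(-5, 6)}

# The prior is proper: partial sums of p_n approach 1.
partial_sum = sum(p(n) for n in range(-40, 41))

for n, v in values.items():
    print(n, v)
```

Any other proper choice of the p_n behaves the same way in the relevant respect: the ratio p_n/(p_n + p_{n-1}) cannot equal 1/2 for every n, because that would force all the p_n to be equal.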

The resolution of the original paradox is therefore simple: though the assumptions appear reasonable, they are actually inconsistent with one another. No rational person who describes his uncertainty about the world in self-consistent probabilistic terms could ever believe, given that the amount of money in one of the envelopes is x, that the other envelope is equally likely to contain 2x or x/2, whatever x might be.

In the language of Bayesian statistics the uniform distribution over all (rounded down) amounts of money is an improper prior; improper priors are well known to lead (on occasion) to impropriety; in particular to self-contradiction.

The previous analyses showed that the original argument was incorrect. But we can recreate the paradox (and even create a lovely new one) if we generalise the initial problem a little. Let's drop the assumption that the one envelope contains exactly twice the other, since we no longer aim to use in our argument that the second envelope is equally likely to contain half or double the other, given the amount in the first. Now read on ...

### Randomized solutions

Consider the situation when the two envelopes are known to contain two different amounts of money, and one of the envelopes is chosen at random and opened. The player is allowed either to keep this amount, or to switch and take the amount of money in the other envelope, whatever it might be. Is it possible for the player to make his choice in such a way that he goes home with the larger amount of money with probability strictly greater than half?

We are given no information at all about the two amounts of money in the two envelopes, except that they are different, and that neither envelope is empty, in other words, both amounts of money are strictly greater than zero. It might be better to think of each envelope as simply containing a positive number, and the two numbers are different; that's all we know. The job of the player is to end up with the envelope with the larger number.

Counter-intuitive though it might seem, there is a way that the player can decide whether to switch or to stay so that he has a larger chance than 1/2 of finishing with the bigger sum of money. However, it is only possible with a so-called randomized algorithm, that means to say, the player needs himself to be able to generate random numbers. Suppose he is able to think up a random amount of money, let's call it Z, such that the probability that Z is larger than any particular quantity z is exp(-z). Note that exp(-z) starts off equal to 1 at z=0 and decreases strictly and continuously as z increases, tending to zero as z tends to infinity. So the chance is 0 that Z is exactly equal to any particular amount of money, and there is a positive probability that Z lies between any two particular different amounts of money. The player compares his Z with the amount of money in the envelope he is given first. If Z is smaller he keeps the envelope. If Z is larger he switches to the other envelope.

Think of the two amounts of money in the envelopes as fixed (though of course initially both unknown to the player). The only things that are random are which envelope contains the smaller amount of money (that's 50-50), and independently of this, the value of the player's random probe Z. Now if both amounts of money are smaller than the player's Z then his strategy does not help him: he ends up with the second envelope, which is equally likely to be the larger or the smaller of the two. If both amounts of money are larger than Z his strategy does not help him either: he ends up with the first envelope, which again is equally likely to be the larger or the smaller of the two. However if Z happens to be in between the two amounts, then his strategy leads him correctly to keep the first envelope if it is larger, but to switch to the second if the first envelope is smaller. Altogether, this means that he ends up with the envelope with the larger amount of money with probability strictly larger than 1/2. To be precise, the probability that he ends with the "winning envelope" is 1/2 + Prob(Z falls between the two amounts of money)/2.
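The formula 1/2 + Prob(Z falls between the two amounts)/2 can be checked by simulation. The sketch below is only an illustration: the amounts a = 1 and b = 2, the function names and the number of trials are assumptions of this example, and Z is drawn from the standard exponential distribution described above.

```python
import math
import random

def win_probability_exact(a, b):
    """1/2 + P(a < Z < b)/2 for Z standard exponential and fixed amounts 0 < a < b."""
    return 0.5 + 0.5 * (math.exp(-a) - math.exp(-b))

def play_once(a, b, rng):
    """Play one round; return True if the player ends up with the larger amount b."""
    # Which amount lands in the first envelope is a fair coin toss.
    first, second = (a, b) if rng.random() < 0.5 else (b, a)
    z = rng.expovariate(1.0)  # random probe with P(Z > z) = exp(-z)
    # Keep the first envelope if Z is smaller than its contents, otherwise switch.
    final = first if z < first else second
    return final == b

def estimate(a, b, trials=200_000, seed=0):
    """Monte Carlo estimate of the probability of ending with the larger amount."""
    rng = random.Random(seed)
    return sum(play_once(a, b, rng) for _ in range(trials)) / trials

print(win_probability_exact(1.0, 2.0), estimate(1.0, 2.0))
```

The two printed numbers should agree to about two decimal places, and both exceed 1/2; how far above 1/2 depends entirely on how much probability Z puts between the two (unknown) amounts.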

In practice, the number Z we have described could be determined, to any required degree of accuracy, as follows. Toss a fair coin many times, and convert the sequence of heads and tails into a binary fraction: HTHHTH... becomes (binary) 0.101101... Take minus the natural logarithm of this number. Call it Z. Note that we just need to toss the coin long enough that we can see for sure whether Z is smaller or larger than the number in the first envelope. So we only ever need to toss the coin a finite number of times.
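A sketch of this coin-tossing procedure in Python follows. It relies on the observation that Z = -ln(U) is larger than x exactly when U is smaller than exp(-x), so the tossing can stop as soon as the binary fraction built so far is certainly on one side of exp(-x). The function name, the `max_tosses` safety cap and the use of Python's random generator as the "fair coin" are assumptions of this illustration.

```python
import math
import random

def probe_exceeds(x, rng, max_tosses=50):
    """Decide whether Z = -ln(U) > x by tossing a fair coin for the binary digits of U.

    Returns a pair (Z > x, number of tosses used).
    """
    threshold = math.exp(-x)     # Z > x  if and only if  U < exp(-x)
    low, width = 0.0, 1.0        # so far U is known to lie in [low, low + width)
    for tosses in range(1, max_tosses + 1):
        width /= 2
        if rng.random() < 0.5:   # heads: the next binary digit of U is 1
            low += width
        if low + width <= threshold:
            return True, tosses  # U is certainly below exp(-x), so Z > x
        if low >= threshold:
            return False, tosses # U is certainly at least exp(-x), so Z <= x
    raise RuntimeError("still undecided after max_tosses tosses (probability 2**-max_tosses)")

rng = random.Random(1)
results = [probe_exceeds(1.0, rng) for _ in range(100_000)]
frequency = sum(decision for decision, _ in results) / len(results)
mean_tosses = sum(tosses for _, tosses in results) / len(results)
print(frequency, mean_tosses)
```

In this sketch each toss settles the comparison with probability 1/2, so on average only about two tosses are needed, whatever the number in the first envelope.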

But the particular probability law (the so-called standard exponential distribution) used to generate the random number Z in this problem is not crucial. Any continuous probability distribution over the positive real numbers which assigns positive probability to any interval of positive length will do the job.

This problem can be considered from the point of view of game theory, where we make the game a two-person zero-sum game with outcomes win or lose, depending on whether the player ends up with the lower or higher amount of money. One player chooses the joint distribution of the amounts of money in both envelopes, and the other player chooses the distribution of Z. The game does not have a "solution" (or saddle point) in the sense of game theory. This is an infinite game and von Neumann's minimax theorem does not apply. [12]

The argument which leads to infinitely often switching says that because we know nothing about the amount of money in the first envelope, whatever it may be, the other envelope is equally likely to contain half or double that amount. But it turns out that no probability distribution of the smaller amount of money can have the property that given the amount in a randomly chosen envelope, the other envelope is equally likely to have half or double the amount.

Firstly, by a very simple argument, such a probability distribution would have to give probability to arbitrarily large amounts of money. For pragmatists this is enough to disqualify the argument. There must be an upper limit to the amount of money in the envelopes.

Secondly, by a slightly more complicated argument (using the definition of conditional probability and one line of algebra) it follows that the probability distribution of the smaller amount of money must be improper. For many mathematicians this is also enough to disqualify the argument. Whether one is a subjectivist or a frequentist, all probability distributions are proper (have total mass 1).

However this is not satisfactory to all. There are many occasions where the use of improper priors as a way to approximate total ignorance leads to reasonable results in decision theory and in statistics (though there are also dangers and other paradoxes involved). More seriously, a slight modification to the original problem brings us right back to the original paradox without use of an improper prior (though we still need distributions with no finite upper limit).

Let's look for examples and show how easy they are to generate. Let's drop the requirement that one of the envelopes contains twice as much money as the other. Let's suppose we just have two envelopes within which are two pieces of paper on which are written two different, positive, numbers. Call the lower amount A, the larger amount A+B. So A and B are also positive and I think of them as random variables; whether from a subjectivist or a frequentist viewpoint makes no difference to the mathematics.

We choose an envelope at random. Call the number in that envelope X and the other Y. Is it possible that ${\displaystyle E(Y|X=x)>x}$ for all x? Well, if ${\displaystyle E(Y|X)>X}$ it follows that ${\displaystyle E(Y)>E(X)}$ or both are infinite. But by symmetry (the law of ${\displaystyle (X,Y)}$ is equal to the law of ${\displaystyle (Y,X)}$) it must be the case that ${\displaystyle E(Y)=E(X)}$. Hence ${\displaystyle E(Y)=E(X)=\infty }$.

So to get our paradox we do need to have infinite expectation values. Let's first just show that this is easy to arrange - I mean, in combination with ${\displaystyle E(Y|X=x)>x}$ for all x. Let the smaller number A have a continuous distribution with positive density on the whole positive half-line (recall that A and B are positive), and let B, the difference between the larger and the smaller, also have a continuous distribution with positive density on the whole positive half-line. Let's see what happens if we take A and B to be independent, with ${\displaystyle E(B)=\infty }$. A simple calculation shows directly that in this case, ${\displaystyle E(Y|X=x)=\infty >x}$ for all x.
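One way to spell out the "simple calculation" is sketched below. It assumes, as above, that A and B are independent, that both have densities which are positive at every x > 0, and that E(B) is infinite; the symbols f_A, f_{A+B} (the densities of A and of A+B) and π(x) are notation introduced only for this sketch.

```latex
\begin{align*}
\pi(x) &:= \Pr(X = A \mid X = x) = \frac{f_A(x)}{f_A(x) + f_{A+B}(x)} > 0, \\
E(Y \mid X = x) &= \pi(x)\, E(A + B \mid A = x) + \bigl(1 - \pi(x)\bigr)\, E(A \mid A + B = x) \\
&= \pi(x)\,\bigl(x + E(B)\bigr) + \bigl(1 - \pi(x)\bigr)\, E(A \mid A + B = x) = \infty .
\end{align*}
```

The second term lies between 0 and x, so it cannot rescue the sum: the first term is already infinite because π(x) is strictly positive and E(B) is infinite.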

So there exist examples a-plenty once we drop the assumption that the two numbers differ by a factor 2. (There also exist examples which maintain this assumption, but since we have agreed that we can't use step 6 of the original argument, there is not much point in insisting on this particular feature).

How can we resolve the new paradox?

Again some pragmatists will be happy just to see that the paradox requires not only unbounded random variables but even infinite expectation values.

But I would say to the pragmatists that though you might argue that such distributions can't occur exactly in nature, they do occur to a good approximation all over nature - just read Mandelbrot's book on fractals. Moreover, as a matter of mathematical convenience it would be very unpleasant if we were forbidden ever to work with probability distributions on unbounded domains. For instance what about the standard normal distribution? And infinite expectations are not weird at all - what about the mean of the absolute value of the ratio of two independent standard normal variables? Not a particularly exotic thing. The absolute t-distribution on one degree of freedom.

Fortunately there are several ways to show why idealists (non-pragmatists) need not be upset either. In particular, we need not switch envelopes indefinitely!

By symmetry, ${\displaystyle E(X)=E(Y)=\infty }$. So when we switch envelopes we are switching an envelope with a finite amount of money for an envelope with a finite amount of money, whose expectation value is infinite, given the number in the first envelope. But the number in the first envelope also has an infinite expectation value and in fact the two have the same marginal distribution (both with infinite expectation) and we are simply exchanging infinity for infinity. We can do that infinitely many times if we like, nothing changes. Of course any particular finite number is less than infinity, the average of a load of possible finite numbers. It's not surprising and it's not a reason to switch envelopes.

Suppose we actually looked at the number in the first envelope, suppose it was x. Should we still switch? The fact that the conditional expectation of the contents of the second envelope is infinite actually only tells us, that if we were offered many, many pairs of envelopes, and we restricted attention to those occasions on which the first envelope contained the number x, the average of the contents of the second envelope would converge to infinity as we were given more and more pairs of envelopes. We are comparing the actual contents of one envelope with the average of the contents of many. The comparison is not very interesting, if we are only playing the game once. (As Keynes famously said in his Bayesian critique of frequentist statistics, "in the long run we are all dead"). Now, the larger x is, the smaller the chance (since we are not allowing improper distributions) that the contents of the second envelope will exceed it. If we were allowed to take home the average of many second envelopes, the larger the first envelope the larger the number of "second envelopes" we would want to average, till we have a decent chance of doing better on exchanging.

So, on the one hand, infinitely switching is harmless, since each time we switch we are equally well off, if we don't look in the first envelope first. On the other hand, if we do look in the first envelope and see x, and we're interested in the possibility of getting a bigger amount of money in the other, we shouldn't switch if x is too large.

This is where the randomized solution of the variant problem comes in. If we are told nothing at all, and only use a deterministic strategy, there is no way we can decide to switch or stay on seeing the number X=x in the first envelope that guarantees us a bigger chance than 1/2 of ending up with the larger number. For any strategy we can think of, the person who offers us the envelopes can choose the numbers in the two envelopes in such a way that our strategy causes us to have a bigger chance of ending with the smaller number. However if we are allowed to use a randomized strategy then we can get the bigger number with probability bigger than 1/2, by using the random probe method. Choose a random number Z with positive density on the whole real line and compare it to x, i.e., use it to decide if x is "small" or "large". When our probe lies between the two numbers in the two envelopes we'll be led to the good envelope. When it lies above both, or below both, we'll end up either with the second envelope, or the first envelope. But given the two numbers in the two envelopes, it is equally likely that the first is the smaller, as that the second is the smaller.

This is all written down in the literature. There are even a few fairly decent reviews which discuss the many variants of the two envelope problem. Of course many original sources have come up with a new twist, or have a strong personal point of view on how the problem should be solved, so this gives the impression that the problem is not solved. Many authors are most interested in showing to the world their (favourite) solution; many don't even understand the other solutions. Indeed, for people working in different fields, different variants of the problem and different resolutions of the paradox are more interesting. There is no one size fits all. Richard Gill (talk)

## MHP Plan

Hi! I would really like another set of eyes to look at my proposed plan for resolving the MHP content dispute. Would you be so kind as to review it and tell me what you think of the plan? Thanks! Guy Macon (talk) 08:58, 11 May 2011 (UTC)

Thanks Guy, I sure would like to .. but I have a lot of deadlines ahead and then three weeks off-line. I will try to look this weekend. Richard Gill (talk) 17:44, 11 May 2011 (UTC)

## 2 questions

Gill110951 Please can you just explain the two passages here[13]:

2. Replace x by 2x
6. Publish the applications in appropriate hardly refereed electronic journals which provide publishing outlets for those who cannot publish in the serious journals, get yourself a degree and a professorship, if necessary at some only virtual university, get yourself students who write yet more articles about your polynomials

Thanks   — Preceding unsigned comment added by CuccioLia (talkcontribs) 19:53, 12 May 2011 (CEST)
CuccioLia, and you "didn't get the fun"?  Gerhardvalentin (talk) 20:55, 12 May 2011 (UTC)
• This is your explanation? or are you just joking?--CuccioLia (talk) 23:54, 12 May 2011 (UTC)

────────────────────────────────────────────────────────────────────────────────────────────────────

Let me try.

Here is the entire statement, for context:

(start of quote)

Gill polynomials

How to become a famous mathematician

1. Find a famous family of polynomials (e.g. Hermite, Chebyshev, ...)
2. Replace x by 2x
3. Check if you did not hereby hit another already existing family; if so simply repeat (replace x by 2x, again)
4. Rewrite the differential equation, generating equation, etc, etc, accordingly
5. Look for applications of the old family of polynomials. Every single application of the old polynomials, by appropriate substitution, yields an application of your new family
6. Publish the applications in appropriate hardly refereed electronic journals which provide publishing outlets for those who cannot publish in the serious journals, get yourself a degree and a professorship, if necessary at some only virtual university, get yourself students who write yet more articles about your polynomials

Seriously, maybe this trick ought to be explained in the appropriate article on wikipedia. Then the article on Boubaker polynomials can simply refer to that article and to the article on Chebyshev polynomials. Similarly the article on Chebyshev polynomials can refer to this trick-article and mention Boubaker polynomials as an example.

PS the topic is clearly ignotable rather than notable (cf infamous, igNoble, ignominious, ignorable)

(end of quote)

He is clearly expressing a very reasonable opinion about how Wikipedia's rules concerning reliable sources interact with the topic of mathematics. There is nothing wrong with him doing so with some lighthearted humor/parody. Guy Macon (talk) 03:31, 13 May 2011 (UTC)

The context of my indeed humorously intended remark was the discussions about the so-called Boubaker polynomials which are nothing else than Chebyshev polynomials with x replaced by 2x. Invented by Boubaker and "promoted" by Boubaker and his students. One wonders whether Boubaker knew or did not know that he was re-inventing the Chebyshev polynomials. Either way does not do any credit to him. Richard Gill (talk) 05:33, 13 May 2011 (UTC)

## About Boubaker Polynomials and Gill Polynomials, Dear Gill: still not clear!

I see, it was humour!
From the old discussion (which was serious) one can see that things are not so. Polynomials that replace x by 2x in Chebyshev are Dickson polynomials. This is basic and simple information. Were Dickson <and his students> re-inventing the Chebyshev polynomials? Is there something problematic with polynomials in general?
The single remaining problem for me is the content of studies published by some well-known professors with known history and position (Eminent Professor P. Barry Website, Professor A. Yildirim Homepage, Eminent Professor Neil J. A. Sloane Homepage, Prof Dr Syed T. Mohyuin website...). See Chapter 6 (p. 23), (page 40): The Boubaker polynomials; are they the work of someone's students? buzz?

--CuccioLia (talk) 08:00, 13 May 2011 (UTC)

The Boubaker polynomials are a sequence of polynomials, a trivial variant of the Chebyshev polynomials, which due to relentless self-promotion by Boubaker and his colleagues have taken on an alternative name. I'm sorry if I gave misleading information as to *which* trivial variant they are. My little humorous essay (which did not mention Boubaker, and which obviously can be rewritten in many variants, since a clever mathematician can think of lots more ways to generate new families of polynomials from old) can easily be modified. So yes, there is something wrong with polynomials in general, or rather with the checklist of 1) differential equation, 2) recursion equation, 3) applications... . One also needs to check that the family of polynomials is not a simple variant of an existing family. And the existence of many publications is no guide to mathematical notability. Richard Gill (talk) 10:09, 13 May 2011 (UTC)

Professor Gill110951, yes, you are right.
Just two ambiguous points:
Is the work of the authors mentioned below a trick?
Dickson polynomials are a trick also; they are simply: ${\displaystyle D_{n}(2x)=2T_{n}(x)\,}$

Ok, but they are published there in all the Wikis [14], so what is the paradox with the so-called Boubaker polynomials (how are they linked to Chebyshev? also =2T_n(x)??).

PS. In the reference it is demonstrated that these polynomials are linked to Fermat ??! polynomials: ${\displaystyle B_{n}(x)={\frac {1}{({\sqrt {2}})^{n}}}F_{n}\left({\frac {2{\sqrt {2}}x}{3}}\right)+{\frac {3}{({\sqrt {2}})^{n-2}}}F_{n-2}\left({\frac {2{\sqrt {2}}x}{3}}\right),\quad n=0,1,2,\ldots }$
a buzz too?
Thank you for giving help on this, I am doing a study on it.
The wikipedia page on Dickson polynomials explains that they are defined for general fields and only happen to correspond to Chebyshev polynomials when the field is the field of complex numbers. This means that they are defined and studied in their own right in a completely different context to the Chebyshev polynomials, and indeed, in contexts where the ordinary Chebyshev polynomials are meaningless. It is merely a coincidence that they coincide, over the complex numbers.
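
A minimal numerical sketch of the identity quoted above, ${\displaystyle D_{n}(2x)=2T_{n}(x)}$, assuming the first-kind Dickson polynomials with parameter α = 1 and the standard three-term recurrences. The function names are hypothetical, chosen only for illustration, and the check is done in exact rational arithmetic over the rationals (so it says nothing about other fields):

```python
from fractions import Fraction

def dickson(n, x, alpha=1):
    """First-kind Dickson polynomial D_n(x, alpha), using
    D_0 = 2, D_1 = x, D_n = x*D_{n-1} - alpha*D_{n-2}."""
    prev, cur = 2, x
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, x * cur - alpha * prev
    return cur

def chebyshev_t(n, x):
    """Chebyshev polynomial of the first kind, using
    T_0 = 1, T_1 = x, T_n = 2x*T_{n-1} - T_{n-2}."""
    prev, cur = 1, x
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, 2 * x * cur - prev
    return cur

# 21 distinct rational points: more than enough to pin down polynomials of degree <= 11.
points = [Fraction(k, 7) for k in range(-10, 11)]
identity_holds = all(
    dickson(n, 2 * x) == 2 * chebyshev_t(n, x)
    for n in range(12)
    for x in points
)
print(identity_holds)
```

Two polynomials of degree at most 11 that agree at 21 distinct points are identical, so for these n the check is exact rather than approximate.
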
As far as I can see the nth Boubaker polynomial is a simple linear combination of the nth, (n-1)th and (n-2)th Chebyshev polynomials of the second kind. The linear combination is carefully chosen so as to preserve orthogonality. This is a trick which can be applied to any existing family of orthogonal polynomials. And it's a trick which allows one to get the differential equation, the generating function, and so on and so on, easily. The same trick allows one to make any application of Boubaker polynomials into an equally easy (or equally difficult) application of Chebyshev polynomials. So it seems to me that there ought to be an article on Boubaker polynomials on wikipedia and the article ought to be very short, explaining the bridge to Chebyshev. Or, within the article on Chebyshev polynomials there could be a short section on Boubaker and the article on Boubaker could redirect to Chebyshev. It seems to me that there is no notable theory or application of Boubaker polynomials which could not have been done equally easily with Chebyshev. This is a rather different situation from Dickson, or Fermat!
You seem to be an expert in this field (I am not). Do you know if Boubaker himself was aware of Chebyshev polynomials? And are the main "supporters" of Boubaker aware of the relationship? Hence I do not know if the writers on Boubaker polynomials are deliberately using a trick or if they are just unaware of the already existing literature. Richard Gill (talk) 14:50, 14 May 2011 (UTC)
Thank you for the explanation. I too am not an expert, but I am very interested in cultural differences. It is difficult to answer the question about Boubaker himself, as the name seems to refer to a dead person (see here in this encyclopedia [15] and here [16], where it is said that Boubaker polynomials have been named after Boubaker Boubaker (1897-1966)).
From the buzz, I can see that the problem in the French Wiki with that is a cultural one. Have a nice day. --CuccioLia (talk) 20:43, 14 May 2011 (UTC)

## The Holy Grail of MHP

Here is a short and elementary and complete solution of the MHP, which actually covers the biased host situation just as well as the usual symmetric case. There is no computation of a conditional probability. All we have to do is to consider two kinds of players: a player who in some situations would stay, and a player who in all situations would switch. We show that both kinds of players are going to end up with a goat with probability at least 1/3. In other words, it's not possible to do better than to get the car with probability 2/3. But always switching does give you the car with probability 2/3. Hence always switching achieves the best that you can possibly do.

Suppose all doors are equally likely to hide the car, and you choose Door 1.

If you are planning to stick to Door 1 if offered the choice to switch to Door 2, you'll not get the car if it is behind Door 2. In that case Monty would certainly open Door 3, you'll have the choice between Doors 1 and 2, and you'll keep to Door 1. Chance 1 in 3.

Similarly if you are planning to stick to Door 1 if offered the choice to switch to Door 3, you'll not get the car if it is behind Door 3. Probability 1/3.

If on the other hand you are planning to switch anyway, you'll not get the car if it is behind Door 1. Chance 1 in 3.

Altogether this covers every possible way of playing, and however Monty chooses his door: there's always a chance of at least 1/3 that you'll end up with a goat. This means that there is no way you can do better than getting the car with chance 2/3.

We know that "always switching" guarantees you *exactly* a chance of 2/3 of getting the car. I've just shown you that there is no way this can be improved.

Side remark 1: For those who are interested in conditional probabilities, the previous remarks prove that the conditional probabilities of the location of the car (given you chose door x and the host opened door y) will always be in support of switching. Otherwise, we could improve on the 2/3 overall success chance of always switching, by not switching in a situation indicated by the conditional probability of winning by switching being less than 1/2.
Side remark 2: For those who are worried that I did not talk about randomized strategies (e.g. you toss an unbiased coin to decide whether to switch or stay, when you chose Door 1 and the host opened Door 3) it suffices to remark that you could as well have tossed your coin in advance of the host opening a door. Thus this is the same as choosing a deterministic strategy in advance, by randomization. Since any deterministic strategy gives you a goat with probability at least 1/3, the same is true when you choose one such strategy at random.
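
As a worked illustration of Side remark 1 (a sketch only: the symbol q is an assumption introduced here, not notation used above), suppose all doors are equally likely to hide the car, you choose Door 1, and q is the probability that the host opens Door 3 when the car is behind Door 1, with 0 ≤ q ≤ 1. If the car is behind Door 2 the host must open Door 3, and if it is behind Door 3 he must open Door 2. Bayes' theorem then gives

${\displaystyle P({\text{car behind 2}}\mid {\text{host opens 3}})={\frac {{\frac {1}{3}}\cdot 1}{{\frac {1}{3}}\cdot 1+{\frac {1}{3}}\cdot q}}={\frac {1}{1+q}}\geq {\frac {1}{2}},}$

${\displaystyle P({\text{car behind 3}}\mid {\text{host opens 2}})={\frac {{\frac {1}{3}}\cdot 1}{{\frac {1}{3}}\cdot 1+{\frac {1}{3}}\cdot (1-q)}}={\frac {1}{2-q}}\geq {\frac {1}{2}}.}$

Weighting each by the probability of the corresponding host action, (1+q)/3 and (2−q)/3, recovers the overall chance of winning by always switching:

${\displaystyle {\frac {1+q}{3}}\cdot {\frac {1}{1+q}}+{\frac {2-q}{3}}\cdot {\frac {1}{2-q}}={\frac {2}{3}}.}$

So under these assumptions neither conditional probability drops below 1/2, whatever the host's bias q.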

Of course 20 text-books in elementary probability theory do MHP in a different way, while the previous analysis is only implicit in the recent papers of A.V. Gnedin. However as a service to wikipedia editors Richard D. Gill (mathematician) will place this analysis on his university home page so there is at least one reliable source for it. Richard Gill (talk) 13:11, 1 July 2011 (UTC)

Incredible! Now try to formulate this in proper math. Nijdam (talk) 09:54, 2 July 2011 (UTC)
See [17]. Richard Gill (talk) 12:28, 2 July 2011 (UTC)
Who are you trying to fool? Yourself? Nijdam (talk) 16:29, 2 July 2011 (UTC)
Please try to be civil on my talk page, you're a guest here. So you don't follow the argument? So far you're the only one who didn't get it. Check for yourself: for each of the 3x2x2 deterministic strategies of the player (first choice of door; whether to switch or stay if the lower numbered of the other two is opened; idem for the higher numbered) there's a door number such that a car behind that door will not be won. Richard Gill (talk) 17:57, 2 July 2011 (UTC)
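
The "check for yourself" step can be sketched as a brute-force enumeration. This is only an illustration under the standard rules (the host always opens a goat door different from the player's first choice); the names `final_door` and `sure_loss_doors` are hypothetical helpers, not taken from any source:

```python
from itertools import product

DOORS = (1, 2, 3)
ACTIONS = ("stay", "switch")

def final_door(choice, opened, action):
    """Door the player ends up with after the host opens `opened`."""
    if action == "stay":
        return choice
    return next(d for d in DOORS if d not in (choice, opened))

def sure_loss_doors(choice, if_lower_opened, if_higher_opened):
    """Car locations at which this deterministic strategy never wins,
    whichever legal door the host decides to open."""
    others = sorted(d for d in DOORS if d != choice)
    action_for = {others[0]: if_lower_opened, others[1]: if_higher_opened}
    losing = []
    for car in DOORS:
        legal_openings = [d for d in others if d != car]
        if all(final_door(choice, opened, action_for[opened]) != car
               for opened in legal_openings):
            losing.append(car)
    return losing

# All 3 x 2 x 2 deterministic strategies: (first choice, action if the
# lower-numbered other door is opened, action if the higher-numbered one is).
table = {s: sure_loss_doors(*s) for s in product(DOORS, ACTIONS, ACTIONS)}
for strategy, doors in table.items():
    print(strategy, "never wins a car behind door(s)", doors)
```

Every one of the 12 strategies gets a non-empty list, so if the car is equally likely to be behind each door, each strategy ends with a goat with probability at least 1/3.
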

Richard, is this (in)correct:
Maths and learning maths is a fine thing, but the MHP question is not "just maths". Any mathematical formulation (e.g. "conditional probability") that is not merely a lesson in maths or an "example in learning Bayes", but actually claims to give an *answer to the famous MHP question*, is demonstrably false if its result differs from "probability to win by switching = 2/3". Any differing result must inevitably have been based on unproven assumptions, and consequently cannot be taken seriously unless it offers, at the same time, an alternative second result that differs in the other direction. The probability to win by switching, based on known facts, is 2/3 (48/72). So

if a "conditional" result says "47/72", it has to offer a second alternative result at the same time that says "or equally 50/72".
if a "conditional" result says "40/72", it has to offer a second alternative result at the same time that says "or equally 64/72".
if a "conditional" result says "37/72", it has to offer a second alternative result at the same time that says "or equally 70/72".
if a "conditional" result says "36/72", it has to offer a second alternative result at the same time that says "or equally 72/72".

Otherwise any differing result would be "wrong from the outset", because one cannot claim "but what if ..." without also saying "but what if NOT ..." if one wants to be taken seriously.

Gerhard, I find it difficult to understand you. First of all, Vos Savant's question was not "what is the probability?" but "should you switch?". Secondly, what do we want to assume, beyond the words of Marilyn Vos Savant? You can only get an answer by making some assumptions. Thirdly, going back to probability questions, asking "what is the probability" is actually an ambiguous question because there are so many different ways to understand the concept of probability (frequentist, subjective...), and there are different probabilities which can be of interest (conditional, unconditional...).
When laypersons ask "what is the probability?" you can be sure they don't understand their own question. Lucia de Berk went to jail for life because a judge asked a statistician "what is the probability?"
The title of my paper was "Monty Hall Problem is not a probability puzzle: it's a challenge in mathematical modelling". MHP is indeed not just maths (pure maths). The real problem, when faced with Vos Savant's question, is to come up with a decent mathematical formulation. I argue that the mathematician should offer the consumer a menu of different options. There are no free lunches. The more you want to get out, the more you have to put in.
My opinion is written out in my Statistica Neerlandica paper and it's: I hope you were wise enough to choose your door initially completely at random. That's my extra assumption. If you will accept this assumption, then my answer to you is that you should switch. You'll win the car with (unconditional) probability 2/3, because 2/3 of the time you initially pick a goat (because *you* chose a door "at random"). I am not interested in "the" conditional probability of winning given the door you initially chose and given which door was opened by the host. This has many different meanings depending on what you mean by probability, and it is only possible to answer the question under supplementary assumptions not specified by Vos Savant. I also know that you can't get a better result than the 2/3 which is guaranteed by choosing at random and always switching. The host can prevent you from doing better.
If you want to make more assumptions, we can do more. For instance, if you know the car is *hidden* at random, so all doors are equally likely to hide the car, then I can (of course) tell you to switch, I can (of course) tell you that you'll get the car with (unconditional) probability 2/3, but I can also tell you that you cannot do better. This can be proved with the help of Bayes' theorem or with the help of Sasha Gnedin's clever observation: given a strategy of the player there is a door such that a car behind this door will not go to the player, however the host plays. Proof: by inspection. Consequence: however the player plays, the probability he gets the car can't be more than 2/3. So a strategy which guarantees 2/3 is optimal. Always switching guarantees 2/3. No need to learn Bayes' theorem to get this conclusion. Richard Gill (talk) 09:17, 3 July 2011 (UTC)
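
The claim that choosing your own door at random and always switching guarantees 2/3 can be sketched with exact arithmetic. This is an illustration only: the car location is treated as fixed but arbitrary, and the two host rules tried here (open the lowest or the highest available goat door) are assumptions standing in for "however the host plays":

```python
from fractions import Fraction

DOORS = (1, 2, 3)

def switch_win_probability(car, host_rule):
    """Exact win probability for a player who picks a door uniformly
    at random and then always switches, for a fixed car location."""
    total = Fraction(0)
    for choice in DOORS:
        # Doors the host may open: not the player's choice, not the car.
        goats = [d for d in DOORS if d != choice and d != car]
        opened = host_rule(goats)
        final = next(d for d in DOORS if d not in (choice, opened))
        if final == car:
            total += Fraction(1, 3)  # each first choice has probability 1/3
    return total

# Hypothetical host rules: always open the lowest (min) or highest (max) goat door.
results = {
    (car, rule.__name__): switch_win_probability(car, rule)
    for car in DOORS
    for rule in (min, max)
}
for key, prob in results.items():
    print(key, prob)
```

The host's rule only matters when the first choice already hides the car, and in that case switching loses either way, which is why every entry comes out the same.
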

## my edits

Richard, please pardon that my edits had the side effect to erase what you just had written in the article talk page. The actual differences are [here]. I don't edit any more for the moment, just to avoid repetition of the trouble. Please pardon the trouble. Gerhardvalentin (talk) 08:57, 5 July 2011 (UTC)

All seems OK to me! Richard Gill (talk) 10:10, 5 July 2011 (UTC)

## Fairness?

I consider it quite unfair that you wrote: Especially since the editor who insisted that the simple solution was downright wrong (and as a corollary, all sources giving it, are not reliable) got banned for a year. You well know my ban has nothing to do with this view. Is this the way you discuss? Tell me: do you use the MHP in one of your courses, and do you give the incorrect explanations as a solution? That would explain a lot. Nijdam (talk) 21:28, 7 July 2011 (UTC)

Well, I thought your ban did result from the fact that your behaviour contravened wikipedia principles. You insisted that sources could be disqualified because they contradicted your opinion of the mathematical truth. Your behaviour was judged to be disruptive, because of this. I did not make this decision, and I did not support it. But it certainly is the case that the atmosphere among the present editors working on the page is a lot more friendly and constructive than it used to be. The two editors who took the most extreme views were both banned.
How I use MHP in my courses: read my Statistica Neerlandica paper, especially the title and the conclusion. My starting point is Marilyn Vos Savant's written words (her paraphrase of Craig Whitaker's question). I note that she asks for a decision, not for a probability. I take the liberty to start thinking about this decision in advance of appearing on the show, since, as a mathematician, I have a responsibility to give my client (the player) the best possible advice. If a client comes with a stupid question I do not just answer the question (the answer will be stupid too) and send in the bill for a hefty consultation fee - that is the approach which gave Lucia de Berk a life sentence for serial murder. Clients often have a preconceived idea of how they want a mathematician to solve their problems and very often those preconceived ideas are wrong, so I take the liberty to discuss possible reformulations of the client's question with him. This is called the Socratic method and I learnt it from my teacher Jan Hemelrijk who was brilliant both doing and teaching statistical consultation.
From this philosophy I come to my personal favourite solution "choose your door completely at random and switch. Don't worry about conditional probabilities, they are meaningless". Please also read the latest publications by A.V. Gnedin on arXiv.org. He shows how the decision theoretic notion of dominance can be used to powerful effect. It turns out that one can say a great deal about optimal decisions in MHP without using probability at all.
Here is a quote from a paper by Caspar Albers, Barteld Kooi and Willem Schaafsma on the related two envelopes problem. "It often happens that the statistician is asked to use data in order to compute some posterior probability, to make a distributional inference, or to suggest an optimal decision. Some, perhaps many, of these situations are such that the lack of relevant information is so large that it is wise not to try to settle the issue". Richard Gill (talk) 07:25, 8 July 2011 (UTC)
You're clearly missing the point about what I said of my ban. Then about the problem. Sometimes people, just like you now did, come up with the remark that the problem does not ask for a probability, but for a decision. I really cannot see the implication of such a remark. If I give the answer: "Yes, switch", the immediate reaction will be: "Why?". And the answer to this "why" is what it's about. And, you know what? The intriguing aspect of the problem is, that most people initially, seeing the two still closed doors, react by thinking the odds are equal. So, no probability? And, the player, having chosen a door and seeing one opened by the host, won't benefit from game theoretical approaches, nor from the fact that he (better?) could have made his choice at random. I agree with your remarks on statistical consultation. I myself was for 3 years involved as an assistant of Leppink. But I do not see any implication for the MHP. Nijdam (talk) 08:11, 11 July 2011 (UTC)
Oh, you want to solve the problem where the player doesn't know the rules in advance! I think that's a rather silly problem, and definitely not the mainstream version. The player knows in advance that the host is going to reveal a goat and offer a switch. So he can and should think strategically. Yes, everyone initially gets the wrong answer because they don't think strategically. Richard Gill (talk) 08:23, 11 July 2011 (UTC)
But of course, even though the *answer* is switch, the reason for the answer is what we really want to know. My point is that there is not one unique good answer. There are many good answers. Every answer will require making some assumptions, and part of a good answer is good justification of assumptions. A good answer which uses probability must explain what it means by probability. From this perspective, the answer that an always switcher wins 2/3 of the games while an always stayer only wins 1/3, under the assumption only that the initial choice is correct 1/3 of the time, is splendid. Go on to remark that if we don't *know* that the car is hidden by a fair randomization device, we can still go on to ensure the truth of our assumption by choosing our own door uniformly at random. Then for completeness one can refer to the minimax theorem which tells us that there is no way to be sure of a better result. I think that statisticians need to know that their game is not the only game in town, and they need to understand and explain what they mean by probability. Richard Gill (talk) 05:20, 13 July 2011 (UTC)
Let me spell it out for you: the simple solution S0 reads: the car is with probability 1/3 behind door 1. As the opened door 3 does not show the car, it will be with probability 2/3 behind the remaining door 2. Another simple solution S1 reads: the initial choice of door 1 hits the car with probability 1/3. Hence switching gives the car with probability 2/3. My simple (!) question is: do you mention any of these solutions as a solution to the MHP in your courses? Just answer with no or yes, and the number of the solution. Nijdam (talk) 05:25, 18 July 2011 (UTC)
Nijdam, I do understand what you're getting at. I am careful to say things which are correct. So of course I don't give S0 as a solution. I do give S1 as a solution. But on wikipedia the rules are that you can only write what has been published in reliable sources, and the definition of reliable sources has got nothing to do with the truth or falsity of what is written in them. Wikipedia summarizes what people write. Also things which are incorrect. If you would like to correct wikipedia the only option you have is to write reliable sources yourself, and hope that others will write about what you have written. Then maybe in ten years or so wikipedia will cite you, too. Richard Gill (talk) 15:00, 18 July 2011 (UTC)
Quote (see my talk page): If a source makes an obvious error of fact, it is clearly unreliable. Next question to you: Do you make clear to your students that the solution S1 is the solution for the unconditional version and not to the full (conditional) version? Nijdam (talk) 14:24, 21 July 2011 (UTC)
Is your quote a quote from wikipedia policy, or is it your opinion? And what does "obvious" mean? How is "obvious" determined, within wikipedia policies? Regarding your next question: I make it clear to my students which answer is an answer to which question. But I am only interested in the question "should you switch?". I am not interested in the question "what is the probability.." except as a means to answer the decision question. Richard Gill (talk) 18:39, 21 July 2011 (UTC)
Why didn't you just look at my talk page? You would have seen it was an answer to a question of mine about Wiki policy. Strange that you try to question this. Don't you agree that a source that is mistaken cannot be a reliable source? Next again about the question to be answered. You said yourself, the important issue is what the reason is for the answer: yes, you should switch. What do you think most people come up with? What do you think makes the MHP notorious? Finally, if you clearly tell your students about which answer belongs to which version, why are you reluctant to tell this to the Wiki readers?? Nijdam (talk) 21:45, 21 July 2011 (UTC)
OK, so which source which presents a simple solution do you consider is obviously wrong? Is Selvin's first letter obviously wrong? I am not reluctant to tell to wikipedia readers that there is a difference between unconditional and conditional probabilities. I would like the presentation of so-called simple solutions to clearly explain what it is they compute. I think the conditional solutions should be presented in a constructive way, not a destructive way. The general reader is not interested to know that some academic sources think that some solutions are wrong. They are interested to learn about MHP.

This discussion does not interest me very much any more. Richard Gill (talk) 22:16, 21 July 2011 (UTC)

Unreliable are sources like Devlin and Adams, presenting solutions with logical errors in them. I didn't say you were reluctant to tell about the difference between unconditional and conditional PROBABILITY, but about the two main different interpretations of the problem. You said in the past you did not want to make this difference clear from the start, but yet present the simple solution, hence leaving open the possibility that the reader may consider it a solution to the full (conditional) version of the problem. It is this point the main discussion is about. And I'm sure you are aware of it. The general reader is definitely interested to learn about the MHP, and he should learn about it, but in a correct way. Any problem with this? Nijdam (talk) 12:55, 22 July 2011 (UTC)
Nijdam, the simple solution is a simple argument why switching is a good idea. I am not in favour of presenting the simple solution as a solution of the problem "what is the conditional probability...?". I've written up on Citizendium and StatProb how I think MHP should be presented in an encyclopedia. Though now I would also highlight the dominance argument: every strategy is dominated by an "always switching" strategy, so there is simply no point in asking for probabilities, conditional or otherwise.

Please could we drop this discussion now. I am working on other projects. Richard Gill (talk) 15:25, 22 July 2011 (UTC)

Of course, no further discussion. Just two remarks. I never suggested you would use the simple solution as a conditional probability, but why you were reluctant to make a clear distinction between the two main interpretations. Anyway, in Dutch we should say: "je lijkt om de hete brij heen te draaien." Good luck with your new challenges. Nijdam (talk) 22:04, 22 July 2011 (UTC)
I am sure that you have both heard the statement that every statistic is the answer to a question, the problem is, which question? Once you have a clearly-defined, well-posed question the answer is much easier. Whitaker's question is not such a question. Martin Hogbin (talk) 08:38, 8 July 2011 (UTC)
And 80% of statistics are just made up. Seriously, you hit the nail on the head, Martin. And MHP is *both* a popular brain-teaser *and* a popular vehicle for teaching conditional probability in the probability class, *and* a splendid vehicle for teaching game theory, too. It challenges our conceptions of probability and it challenges the "experts" in communicating with ordinary folk. And that's why it is so fascinating. Richard Gill (talk) 19:19, 8 July 2011 (UTC)
And that is why there is a logical separation of the MHP into two parts. The first covering the popular brain teaser, without too much fretting about detail, and the second the vehicle for teaching and learning, in which all the bases are covered. Martin Hogbin (talk) 08:48, 11 July 2011 (UTC)
Yes! Right now I am enjoying the two envelopes problem where we have the same dual nature and hence the same problems (together with the problem of a possessive editor). And, enjoying the latest discoveries of Sasha Gnedin on MHP. Now this really is amazing, that some completely new mathematics could be done on this simple and old problem. It is really brilliant. Richard Gill (talk) 13:48, 11 July 2011 (UTC)
There are strong similarities between the two articles. I am trying to present the subject matter in a way that makes it accessible to most people. Of course, we want the mathematical detail as well.
Many people will only find the TEP interesting because it has some relation to reality. They imagine being perpetually perplexed on being presented with two envelopes. For these people anything that is impractical is purely theoretical. The problem could be presented to mathematicians in a purely abstract way but most people would find such a problem uninteresting. Martin Hogbin (talk) 08:42, 13 July 2011 (UTC)
Most people *should* find TEP uninteresting! It is a conundrum about logic. It is only interesting if you are interested in logic and possibly also in semantics and mathematics. There is no vast popular literature on it. There is only a vast technical literature.

And interestingly, there is almost no *secondary* or *tertiary* literature on TEP. It is almost all purely research articles, each one promoting the author's more or less original point of view, and each one criticising earlier "solutions". And so it goes on. The three papers which cite those two young US philosophy PhDs do so in order to criticize their solution, and to propose an alternative. According to wikipedia guidelines on reliable sources, the article on TEP should be very very brief and just reproduce the comments in a couple of standard (undergraduate) textbooks on TEP, e.g. David Cox's remarks in his book on inference. I have no idea if there is a standard philosophy undergraduate text which mentions TEP. Our friend iNic hasn't mentioned one. We should take a look at other encyclopedia articles on TEP. I think I will write one for StatProb, and then wikipedia editors can use it. Survey papers are admissible secondary sources for wikipedia provided they do not promote the author's own research. They are a primary source for the latter. Richard Gill (talk) 16:14, 13 July 2011 (UTC)

Ordinary people won't be perplexed. They know by symmetry that switching is OK but a waste of time (if you don't open your envelope). They don't really understand probability calculations anyway, so they know there is something wrong with the argument, but don't care what.

Regarding Smullyan's version, they also know the answer (it doesn't matter whether you switch or not) so they know which argument of Smullyan's is correct. As writer after writer has stated, the problem of both original TEP and of TEP without probability is using the same symbol (original) or the same words (Smullyan) to denote two different things. It's a stupid problem and has a simple resolution.

Well, and if we are allowed to look in our envelope, then everything is different. But no longer very interesting for laypersons. It turns out to be a fact of life that there are "theoretical" situations (but I can simulate them on a computer for you, they are not that theoretical!) where according to conditional expectation value you should switch, whatever. OK, and this is just a fact about a random variable with infinite expectation value: if I give you X you'll always be disappointed compared to its expectation. But to get its expectation I would have to give you the average of millions and millions of copies of X. Eventually the average will be so large that you'll prefer the swap whatever the value of the X I gave you first. Then there are all kinds of nice calculations about whether or not you should switch given a prior distribution of X and there are cute things about what to do if you don't want to trust a prior ... then you should randomize. It's all rather technical, isn't it. Only interesting for specialists. By the way this is *not* theoretical since I can approximate a distribution with infinite expectation with one with very very very large finite expectation. I can create approximately the same paradox without actually using infinite values. Syverson does that. It's very technical and hardly interesting for ordinary folk. Richard Gill (talk) 16:26, 13 July 2011 (UTC)
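
A minimal sketch of one such "theoretical" situation, in exact arithmetic. The prior used here is an assumption chosen for illustration (a commonly used textbook-style example, not necessarily the one meant above): the smaller amount is 2^n with probability (1/3)(2/3)^n for n = 0, 1, 2, ..., the other envelope holds twice as much, and you receive either envelope with probability 1/2. The smaller amount then has infinite expectation, and the conditional expectation of the other envelope exceeds the amount you see, whatever that amount is:

```python
from fractions import Fraction

def prior(n):
    """Assumed prior: P(smaller amount = 2**n) = (1/3) * (2/3)**n, n >= 0."""
    return Fraction(1, 3) * Fraction(2, 3) ** n

def expected_other(k):
    """Exact conditional expectation of the other envelope,
    given that the opened envelope contains 2**k."""
    seen = Fraction(2) ** k
    weight_holding_smaller = prior(k)  # pair (2**k, 2**(k+1))
    weight_holding_larger = prior(k - 1) if k >= 1 else Fraction(0)  # pair (2**(k-1), 2**k)
    p_smaller = weight_holding_smaller / (weight_holding_smaller + weight_holding_larger)
    return p_smaller * (2 * seen) + (1 - p_smaller) * (seen / 2)

# Ratio of the expected content of the other envelope to the amount seen.
ratios = [expected_other(k) / Fraction(2) ** k for k in range(6)]
print(ratios)
```

With this prior the ratio is 2 when you see the smallest possible amount and 11/10 for every other amount, so the conditional expectation always favours switching. The partial sums of the expectation grow like (4/3)^n, which is the infinite-expectation effect described above.
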