Talk:Coupon collector's problem

From Wikipedia, the free encyclopedia
Jump to: navigation, search
WikiProject Mathematics (Rated Start-class, Low-priority)
WikiProject Mathematics
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Mathematics rating:
Start Class
Low Priority
 Field: Probability and statistics
WikiProject Statistics (Rated Start-class, Low-importance)
WikiProject icon

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

Start-Class article Start  This article has been rated as Start-Class on the quality scale.
 Low  This article has been rated as Low-importance on the importance scale.


Regarding the numerical example in the first paragraph of this article:

The mathematical analysis of the problem reveals that the expected number of trials needed grows as O(nlog(n)). For example, when n = 50 it takes about 225 samples to collect all 50 coupons.

Although the O-notation is correct as it is, I think it would be better to be consistent with the following article and write n ln n + \gamma n + O(1). Especially since the numerical example is otherwise not reproducible. Unfortunately I do not know how to edit the very first paragraph of a page -- and couldn't find further information on this question. Can you help? (Netzwerkerin (talk) 22:08, 24 May 2009 (UTC))

To edit the very first paragraph of a page, click the "edit" button at the top of the page – the third of the buttons "article • talk • edit • ..." etc. (Or, go to My Preferences -> Gadgets, and in "User interface gadgets", turn on the option to "Add an [edit] link for the lead section of a page".) Shreevatsa (talk) 22:16, 24 May 2009 (UTC)

What does that notation even mean? What math course should I have taken to even know what O(stuff) means? (talk) 20:36, 9 July 2012 (UTC)

See Big_O_notation. It's usually taught in computer science courses on theory or algorithms, not math courses. — Preceding unsigned comment added by Acertain (talkcontribs) 18:21, 14 November 2012 (UTC)

Different probabilities[edit]

There should be a section on the solution to the problem if not all coupons are equally distributed. (talk) 00:49, 4 September 2009 (UTC)

Problem definition[edit]

I think the problem definition could be a bit clearer. It says you are collecting random coupons with replacement. Then you are asked how many coupons you are expected to draw before collecting all n of them.

It's very easy to misunderstand this (as I did): You draw one coupon, then you put it back, which means you now have 0 coupons. You can keep drawing them, but if you keep putting them back, you'll never collect more than 1. A better definition of the problem would be:

Given n coupons, how many coupons do you expect you need to draw with replacement before having drawn each coupon at least once. —Preceding unsigned comment added by (talk) 11:30, 21 January 2010 (UTC)

I agree with this. The problem defn. on the page was not clear to me at all. Magicmike (talk) 19:42, 22 February 2011 (UTC)

Merger Proposal[edit]

Coupon collector's problem (generating function approach) could be integrated under a section Solutions. I just don't understand any of it to be able to advise... Haruth (talk) 22:06, 9 August 2011 (UTC)

I'd prefer it be transwikied to wikiversity or wikibooks or something. Or maybe just deleted. I like having proofs in articles when they are short, to the point, and most importantly illuminating, and I think the parts already in the "Solution" section are good, but to me this one looks more like a long tedious calculation, written in a non-encyclopedic textbook-like style. —David Eppstein (talk) 23:09, 9 August 2011 (UTC)
Actually I'd rather keep it as it is, though style and form could be improved. Since problem is well known under this name, I c no reason not to have an article. Even if you argue the current article('s style) doesn't feel particularly encyclopedic, the content nevertheless is. Hence there is no treason to transfer it to wikiversity or wikibooks.
The integration as an example or application into a more promeinent article is an option but i don't really like that one either. Since we need to keep at least a Redirect anyhow (so that the problem can easily find by its name) and furthermore such an integration would messup the proper matching with interwikis.
Generally I see no issue with having (small) entries for common or well known recreational, educational, historical math problems as long as there's a proper name for it and they are covered in literature, i.e. can be properly sourced. Keep in mind WP is many things to many people and I'd assume this article certainly has its readership.
So all in all style and form improvements - yes, but no reason for an integration or a removal.--Kmhkmh (talk) 01:19, 10 August 2011 (UTC)
P.S. I didn't really look at target article for the merger before. Given that it is just about the coupon collector's problem as well and has no interwikis my comment regarding the integration doesn't really apply. Hence yes both articles should be merged indeed.--Kmhkmh (talk) 09:36, 10 August 2011 (UTC)
I agree with David. This kind of detailed proof is useless. In fact, I would vote for deletion. Mhym (talk) 07:51, 10 August 2011 (UTC)
Again useless to who? But more importantly from useless assessment of the of the proof doesn't follow one for the article as a whole. We still want to provide the information what the coupon collector problem is.--Kmhkmh (talk) 09:21, 10 August 2011 (UTC)
I didn't think anyone was talking about deleting this article, it's notable and fairly reasonably written. The generating function proof article though has no indication of notability and no citation and just looks like an exercise somebody did. I don't see anything there that is worth keeping. So I also agree with David Eppstein, transwiki or delete the generating function approach article. It is quite common to use such an approach to problems like this, if someone bothers to come up with a citation they can stick some basic stuff here about it but any details of the proof should be left to the citation. Without a citation I wouldn't even bother mentioning the approach. Dmcq (talk) 15:39, 11 August 2011 (UTC)
Ah ok, then I've misread this conversation as I was primarily concerned with this article. As far as generating function approach article is concerned I'm fine with a transfer and/or deletion.--Kmhkmh (talk) 16:00, 11 August 2011 (UTC)
For what it's worth, I found the generating function proof quite useful. While I understand that it might not fit with Wikipedia's policies, it should be moved to somewhere more appropriate, not deleted. (talk) 14:46, 31 October 2011 (UTC)

LOL, this issue is very funny. Anyway, we should bring the article to here. --Yodamgod (talk) 15:02, 22 July 2012 (UTC)

I'd concur with a merge - at least of a type. This page talks about approximations a lot. There is a fairly simple explicit exact solution for the distribution function etc. The indicated page talks to it, but isn't very clearly presented. If there is time and effort to present clearly, should be an improvement. Not sure that there is enough interest here to do so. Paul Erickson Oct 10, 2013 — Preceding unsigned comment added by (talk) 16:43, 10 October 2013 (UTC)


I keep getting either 85 if I use (50 x (log_base_10 of 50)), or 196 if I use (50 x (log_natural of 50)), so how do you get 225? (talk) 17:20, 28 April 2012 (UTC)

There are lower order terms that you're leaving out, particularly the γn term. Using the natural log, (50 ln 50) + 50γ + 1/2 ≈ 224.96. —David Eppstein (talk) 18:50, 28 April 2012 (UTC)
Note that γ here (which looks like y or \/ on some displays) is \gamma \approx 0.5772156649, the Euler–Mascheroni constant. —WhackTheWiki (talk) 19:01, 9 February 2014 (UTC)

On a related note, I find the use of "log" in place of "ln" referring to the natural log rather than log_10 to be confusing throughout the article. In my personal opinion it would help a lot if the notation were standardized. (talk) 18:53, 1 April 2013 (UTC)rlrogers

Log is the standard notation, for research mathematics. I realize there's an ISO standard that recommends ln, but it also recommends using lg for something other than its usual meaning of the binary logarithm, so I don't put a lot of credit in it. I agree that we shouldn't inconsistently mix log and ln as we currently do, though. —David Eppstein (talk) 20:18, 1 April 2013 (UTC)

Coupon ?[edit]

Is this BE usage. In ordinary AE understanding you collect coupons by finding discount ads and clipping them. Having more than one per product is only rarely accepted. [1] "Collecting" several of those to make a set isn't a common scheme. You sometimes encounter competitions where collecting various game pieces is required to enter to win. But I haven't seen those referred to as coupons in AE. Is there another word for this? I think my assholes needs to be loosened up. (talk) 03:13, 24 August 2012 (UTC)

Real world examples[edit]

I was directed to this problem today after asking a question on a forum. The question concerned purchasing six packs of Pabst Blue Ribbon bottles, whose caps each represent one of 52 standard playing cards. Turns out the expected number of beers you would have to buy is 235.978 to get all 52, which works out to 39.33 six packs. A quick Monte-Carlo simulation showed that the expected number drops to around 226 beers if you assume that no individual six pack contains duplicates which seems to be the case so far. Just thought this was a fun example, of course there are many other real world examples of this problem as it is defined in real world terms. Perhaps eventually there could be a section on the wiki page of specific examples such as this? I love this problem!

Mark and recapture[edit]

No, this is not a related problem. In fact, it implicitly assumes that there is a large number of birds, while a CCP talks about capturing all coupons. I will rv again. Please let other editors decide. Thank you. Mhym (talk) 22:45, 18 August 2013 (UTC)

Of course it's not the same problem, but the very fact that you can describe it in comparison with CCP makes me think they are sufficiently related to warrant a link for people who are looking for the right formulation of their problem. I believe it is "tangentially related" and deserves a mention there as per MOS:SEEALSO. OK, I won't do it myself, but would encourage someone else to do so. --A3 nm (talk)

Tail estimates[edit]

I'm not familiar with how and/or when to do stuff, so I'll let this be done by someone more familiar with contributing to wikipedia. But I wanted to inform that the book Randomized Algorithms from 1995 would be a good citiation for the section about tail estimates. — Preceding unsigned comment added by 2A00:1398:9:FB00:6236:DDFF:FE0E:B05F (talk) 16:00, 28 January 2014 (UTC)