# Talk:Binomial distribution

WikiProject Mathematics (Rated B-class, Mid-importance)
Field: Probability and statistics
One of the 500 most frequently viewed mathematics articles.
WikiProject Statistics (Rated B-class, Top-importance)


## Clarifications

If you go to previous versions and look at the first one, 02/15/2001 (which is yours?), you will see:

1). q (1-p), maybe a typo?

2). And the formula for the number of ways of picking X items out of N items was: N!/X!/(N-X)!. This is plain wrong. Yes, after requesting a change for a week, I changed it.

3). There were also wording problems. RoseParks.

I see now the problem. (1-p) was intended as a parenthetical definition. I guess N!/X!/(N-X)! worked in my programming codes so I couldn't see the ambiguity. How would you calculate N!/X!/(N-X)!? From right to left? On the other hand, today is 02/20/2001, so I think your "requesting a change for a week" is a bit off. Today is only the 20th by my calendar. In any case, the criticism has led to something better. Dick Beldin

In answer to your question on how you evaluate N!/X!/(N-X)!: this is ambiguous. Here is an easy example.

2/4/12 is ambiguous since

• (2/4)/12= 2/48=1/24 while
• 2/(4/12)= 24/4= 6.

Multiplication is associative over the reals. If you look at division as the inverse operation of multiplication, i.e. 2/4/12 = 2*4^-1*12^-1 = 1/24, you are okay. If you look at division in the ordinary sense, you must specify the order of operations. RoseParks

I agree that an expression with successive divisions appears ambiguous. Most mathematicians I know do indeed consider division as the inverse of multiplication and many programming languages explicitly specify that multiplication and divisions are performed left to right. You are correct, it is not a universal convention. In addition, the vertical placement of numerator and denominator is clearer. Dick Beldin
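For readers following this thread, here is a minimal Python sketch of the left-to-right convention mentioned above. The function name `choose` is made up for this illustration; the point is only that a language which evaluates chained divisions left to right computes N!/X!/(N-X)! as (N!/X!)/(N-X)!, which is the binomial coefficient.

```python
from math import factorial

# Python evaluates chained division left to right, so a / b / c means (a / b) / c.
left_to_right = 2 / 4 / 12      # (2/4)/12, i.e. 1/24
right_to_left = 2 / (4 / 12)    # 2/(4/12), i.e. 6

def choose(n, x):
    """Binomial coefficient written as n!/x!/(n-x)!, evaluated left to right.

    Integer division is exact here: n!/x! is an integer that is
    divisible by (n-x)!.
    """
    return factorial(n) // factorial(x) // factorial(n - x)
```

Under the right-to-left reading the same string of symbols would give a different number, which is why the stacked-fraction notation is the safer choice in the article.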

## Confidence Interval?

I was looking for information about confidence intervals on a binomial distribution, but was surprised not to find it here. I know this case isn't quite as simple as for normal distributions, but it would be nice to have here, if somebody would like to contribute the information.

You mean the CI of p, the success probability, as estimated from the data. If there are 70 successes in 100 trials, then p_est = 0.7, and your question is what the standard deviation of p_est is. It is sqrt(p_est(1-p_est)/n_trials). The 95% confidence interval is approximately +/- 2 standard deviations. My question is what happens if the CI range falls outside the allowed 0 to 1 range for a probability. This can happen if p_est is ~1 or ~0. The CI has to be asymmetric. Any ideas?

In the case where the confidence interval gets close to 0 or 1, the normal approximation of the binomial distribution is not accurate and rules like your "2 standard deviations" that are derived from the normal distribution are not accurate either. Depending on the circumstances, one can use a different approximation (such as the Poisson distribution) or the exact values of the binomial distribution. McKay 06:36, 27 October 2006 (UTC)
Another intuitive understanding of confidence intervals on the binomial distribution is this: say you observe S successes in N trials. We don't know what p really is, but let's make a guess p_guess. You can use the binomial distribution to calculate the probability of seeing [S_observed or more successes] in N trials if p=p_guess. If S~N (p~1), there will be a value for p_guess such that there is only a 5% chance of getting S>=S_observed. At this value for p_guess, you would say you are 95% confident that p>=p_guess. Similarly if S~0. For example, if you did a trial with 45 samples (N=45) and all of them were successful (S=45), then with p_guess=.95 there is only a 0.099 chance of seeing all 45 successful, so you can be 1-0.099 => 90% confident that p>=95%. --The preceding unsigned comment was added by 64.122.234.42 (talk) 17:14, 11 April 2007 (UTC).
Yes, this is the kind of consideration I was shooting for with my original post. While I understand adequately how to create a hypothesis test for a one-sided alternative, I was hoping that someone would come forward with a good methodology for doing a two-sided alternative hypothesis, since this would imply some means of parameterizing the asymmetry of the distribution. I had this come up in a real-world scenario, where the question was whether or not we had a statistically significant result, and how close it might actually be, but the interval was not so critical or well-defined.
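One possible answer to the two-sided question is to apply the tail-probability idea from the comment above on both sides, splitting the error probability equally (this is usually called the exact or Clopper-Pearson interval). The sketch below is an illustration under that assumption, not something proposed in the thread; the function names and the plain bisection search are choices made here.

```python
from math import comb

def upper_tail(s, n, p):
    """P(X >= s) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(s, n + 1))

def lower_tail(s, n, p):
    """P(X <= s) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(0, s + 1))

def bisect_root(f, lo=0.0, hi=1.0, iterations=60):
    """Root of an increasing function f on [lo, hi], found by bisection."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_interval(s, n, alpha=0.05):
    """Two-sided interval for p with alpha/2 assigned to each tail."""
    # Lower limit: the p at which seeing s or more successes has probability alpha/2.
    lower = 0.0 if s == 0 else bisect_root(lambda p: upper_tail(s, n, p) - alpha / 2)
    # Upper limit: the p at which seeing s or fewer successes has probability alpha/2.
    upper = 1.0 if s == n else bisect_root(lambda p: alpha / 2 - lower_tail(s, n, p))
    return lower, upper
```

The resulting interval always stays inside [0, 1] and is asymmetric around S/N when S is near 0 or N, which addresses the concern raised earlier. For instance, `upper_tail(45, 45, 0.95)` reproduces the roughly 0.099 figure quoted above.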

## Simulation?

I was looking for a pointer to quickly simulate a Binomial trial. That is, given a p and an n, I want to randomly select a result with a Binomial distribution. I know I can approximate this with a normal distribution, but I would prefer an exact result if it can be calculated quickly for n < 10,000. I'm sure others have come here looking as well. Thanks.

I added two references to the article which describe binomial random variate generation. A modern C implementation of Kachitvichyanukul and Schmeiser's BTPE algorithm is available as part of the GNU Scientific Library. --MarkSweep 04:08, 8 October 2005 (UTC)
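For small n such as the n < 10,000 mentioned in the question, an exact draw can also be had without any special algorithm by summing n Bernoulli trials. The sketch below is a minimal illustration of that idea, not the BTPE algorithm referenced above; the function name and the optional `rng` argument are invented for this example.

```python
import random

def binomial_variate(n, p, rng=random):
    """Draw one Binomial(n, p) value by counting successes in n Bernoulli(p) trials.

    Cost is O(n) per draw; rng.random() returns a float in [0.0, 1.0).
    """
    return sum(1 for _ in range(n) if rng.random() < p)
```

This is exact in distribution but slow when many draws with large n are needed, which is where a rejection method such as BTPE becomes worthwhile.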

## HIV positive?

Is it me, or should the "A typical example is the following: assume 5% of the population is HIV-positive." part in the second paragraph be changed to something a little less... you know... The HIV part is just not encyclopedia-ish...

That might depend on which population. Michael Hardy 19:42, 22 October 2005 (UTC)
Spot on. I thought exactly the same and immediately looked at the discussion. All political correctness aside, I just don't think anyone would feel harassed if we wrote "assume 5% of the population carry a certain gene" or "are infected with a certain disease", while I am very sure that everyone with an HIV-infection or someone who knows someone closely who is infected will at least feel strange on reading this paragraph. I am all against political correctness for its own sake, but if there's no need whatsoever to use a certain formulation that might be considered inappropriate, why use it?

## Probability mass function?

Okay, maybe this is standard jargon somewhere, but I've never come across it until today. I guess "mass" makes sense by the physical analogy to density. Honestly, I think it's stupid language. Should we also speak of cumulative mass distribution functions? Be consistent! I'm not going to change it, but a mathematician should. At the very least link it to the pmf page.

pmf is fairly standard. It is linked there now. No, cumulative mass distribution function is not a phrase I have heard. --Richard Clegg 08:25, 6 February 2006 (UTC)

## CDF Example Request

The article gives the following example: "A typical example is the following: assume 5% of the population is green-eyed. You pick 500 people randomly. How likely is it that you get 30 or more green-eyed people?".

This is a CDF example. Unfortunately, the expression given for CDF is not very clear to me. How about giving a worked example with the green-eyed people given in the article as a good example, please? --New Thought 15:12, 8 May 2006 (UTC)

I think the given CDF is really merely an introduction of notation. Perhaps there is no simple closed-form expression for the CDF, although there is an obvious algorithm for computing its values (just add up the appropriate values of the mass function). Michael Hardy 18:27, 8 May 2006 (UTC)
aha - that's the answer I was looking for! In that case, why not say something like, "The value can be computed with..."
$cdf(k;n,p) = \sum_{k=1}^n {n\choose k}p^k(1-p)^{n-k}\,$ --New Thought 09:14, 9 May 2006 (UTC)
Actually, in this case the CDF is
$F(k;n,p) = \sum_{j=0}^k {n\choose j}p^j(1-p)^{n-j}$
--MarkSweep (call me collect) 10:43, 9 May 2006 (UTC)
Good correction - I have added this expression to the article! --New Thought 13:04, 9 May 2006 (UTC)
That is correct only when k is an integer, and only when 0 ≤ k ≤ n. Michael Hardy 21:29, 9 May 2006 (UTC)
Why is it necessary to express the CDF's upper bound of summation in terms of the floor function? The binomial distribution support already indicates that the random variable must take on positive integer values, with the exception of zero (0, 1, ... n). Zane Dylanger (talk) 16:52, 5 June 2010 (UTC)
This is the binomial distribution - how can k not be either 0 or a positive integer? --New Thought 08:30, 10 May 2006 (UTC)
In that specific example, we have
$\Pr[X \geq 30] = 1 - \Pr[X \leq 29] = 1 - F(29; 500, 0.05)$
$= 1 - I_{0.95}(471,30) = I_{0.05}(30,471) \approx 17.647\%.$
You can compute this in terms of the incomplete Beta function, as indicated in the article, using your favorite numerical software. For example, in Mathematica this becomes BetaRegularized[0.05, 30, 471]. Direct summation is likely going to be less numerically stable than a carefully designed subroutine for evaluating the incomplete Beta function. --MarkSweep (call me collect) 06:11, 9 May 2006 (UTC)
Thanks very much for your response. I agree with you - and as it happens, I do use Maxima, which has a shed-load of distribution functions (load(distrib); followed by functions; will show them) - but I wanted to write the functions in Javascript for a web page. I went ahead and wrote the web page using the Poisson distribution - but I still think that this article should give expressions that people can use in normal languages and spreadsheets! I feel I've done my bit for Wikipedia maths clarity - in the Lottery_Mathematics article, mostly written by me, I did my best to make it clear exactly how to do each calculation! --New Thought 09:14, 9 May 2006 (UTC)
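As a rough illustration of the direct summation discussed in this thread, here is a short Python sketch that adds up the mass function to get the CDF and then evaluates the green-eyed example. The function names are invented for this example, and the numerical-stability caveat raised above still applies for large n.

```python
from math import comb, floor

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """P(X <= k) by summing the mass function from 0 through floor(k)."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    return sum(binom_pmf(j, n, p) for j in range(floor(k) + 1))

# The example from the article: 5% green-eyed, 500 people, 30 or more.
prob_30_or_more = 1 - binom_cdf(29, 500, 0.05)
```

The value of `prob_30_or_more` should be close to the 17.647% obtained above from the incomplete Beta function. For much larger n the product of a huge coefficient and a tiny power can overflow or underflow in floating point, so the Beta-function route remains the more robust choice.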

## "nmemonic" section

I really dislike the "nmemonic" section. If anyone else agrees, please delete it. McKay 14:55, 11 June 2006 (UTC)

I agree. The mnemonic section is laughable. I'm deleting it. Rjmorris 14:44, 18 June 2006 (UTC)

How about putting it here, then with attention on it someone may come up with better. Tabby 03:44, 31 October 2007 (UTC)

Here is the diff: [1]. But I agree with deleting it, it is unencyclopedic. Sander123 13:58, 31 October 2007 (UTC)

## Relationship to Bezier curves?

The article currently states: The formula for Bézier curves was inspired by the binomial distribution.

Would someone care to source that statement? It seems rather dubious to me, but if it's true it's worthy of a proper explanation and not the vague description of being "inspired by". Certainly the Bernstein polynomials, which constitute the basis functions for Béziers, contain a Binomial coefficient. But binomial coefficients exist all over the place. It doesn't necessarily imply that they have much at all to do with the Binomial distribution.

From reading about Bézier curves I've always had the impression that the decision to use Bernstein polynomials as their parametrization wasn't 'inspired' by anything, but merely chosen from a group of candidates on the merit of their desirable properties. (Being such properties as the fact that the curve is guaranteed to be contained within the convex hull of the control points, that reversing the control points does not change the curve, that the tangents at the endpoints consist of the line between the endpoint and the neighboring control point, etc). --130.237.179.166 14:48, 3 September 2006 (UTC)

I'm deleting this since no justification has been offered. Zillions of things are "inspired" by the binomial distribution anyway and I don't see why this one is important enough to single out even if it is true. McKay 04:31, 28 October 2006 (UTC)

## Better Example

I feel like there could be a better example than picking 500 people out of a population "with replacement" and seeing how many were green-eyed. Perhaps a more sensical and applicable example could be: out of 50 web servers, each of which has a 1% chance of failing by the end of the day, how many failed servers do you have at the end of the day?

—The preceding unsigned comment was added by 18.216.0.100 (talkcontribs) .

I agree. The current example suffers from the need to do sampling with replacement, which will seem unnatural to people unaccustomed to sampling theory. --McKay 05:52, 29 November 2006 (UTC)
I agree with both of you. How about simply the "toss a coin..."? Hackneyed, perhaps, for us, but surely we want the general reader to "get the picture" as easily as possible? Gerald Tros 01:41, 21 May 2007 (UTC)

The example with a die is OK in principle. Most people, I reckon, will have seen and used dice. But why change the well known configuration ( 1 thru 6 dots) with "5 blank and 1 black side"? This now makes a familiar object unfamiliar and thus more difficult to mentally latch onto. Furthermore, a lesser issue, 'blank' and 'black' are two very similar words possibly leading to misreading. Why not just simply use: "Roll a die ten times and count the number of sixes.", thus appealing to a general feeling of wishing to see the highest value side turn up? Gerald Tros 01:41, 21 May 2007 (UTC)

I agree I like the original better as well. Sander123 09:53, 21 May 2007 (UTC)

--Why not start with a coin example? Isn't this the most straight-forward? The one that everyone did in 4th grade??--128.135.96.223 (talk) 20:34, 9 March 2008 (UTC)

## "Kitchen's theorem"

I deleted a new section on "Kitchen's theorem". It began by saying "...we can see by Kitchen's Theorem that..." without having first said what "Kitchen's theorem" is. That is not appropriate. Then, as far as I can tell, the theorem turned out to be a proposition found in many textbooks without the name "Kitchen's theorem". The notation in which it is written includes the use of the same letter for two different random variables in the same equality. Near the bottom it has some notation that is less than correct and that includes some very clumsy language. Then there is a signature---appropriate for a talk page but not for an article. It includes "Dr. William Kitchen PhD (Psychology)", apparently identifying that person as the one who added this material. It looks like an attempt to name after himself a proposition found in innumerable textbooks since before the births of most (or all?) people now living. Michael Hardy 20:11, 23 March 2007 (UTC)

Well Michael, it's nice to see a fellow 'Mathematician' scrutinising my work, labelling it a 'proposition'. Given the fact that my Theorom has undergone rigorous investigation within a university, I fail to see how you can ever have seen it in "innumerable" textbooks. Perhaps you could name a few of them for my reference. And let's not get into a Mathematical jargon slanging match; whoever you are, I would be confident in my own Mathematical standing to stand before anyone and prove my Theorem/lemma. And, if it is indeed in many textbooks, I'd urge you to publish a proof of my statement. I have it on good authority, from highly esteemed Mathematicians, that the Theorem I put online is indeed a new and, may I add, correct proposition. It wasn't a Theorem as such, hence why I referred to it as the Binomial Lemma. I trust you know what a Lemma is! In future, before you make such claims, ensure that the nature of your statements is true. Do that rather than correcting me. And in response to this, if you do indeed give one, I'd appreciate being referred to as Dr. William Kitchen. --Preceding unsigned comment added by 84.66.3.105 (talkcontribs) 18:00, 29 March 2007

From A First Course in Probability, Fourth Edition (1994) by Sheldon Ross, page 181, exercise 26, quoted verbatim:
Let X be a negative binomial random variable with parameters r and p, and let Y be a binomial random variable with parameters n and p. Show that
$P\left\{X > n\right\} = P\left\{Y < r\right\}\,$
If you want to attribute this result to yourself in a Wikipedia article, may I suggest that you cite some published paper that you've written in which you state it? What was the nature of this "rigorous investigation"? Was it simply mathematicians confirming that the result is correct? If so, that's hardly surprising. Was it mathematicians with expertise in probability theory saying the result is new and was unknown before you introduced it? If so, I would find that surprising and I would dispute it. Or was it a professor saying he did not happen to have seen it before? If he's not a probabilist, that's not too surprising and is not the same as saying that it is novel. Michael Hardy 20:14, 29 March 2007 (UTC)
...oh, and since you emphasize that it's your own result, you should not put it in the article unless you also cite some place where you've published it in a journal, since otherwise it would be original research being presented here for the first time. Original research is contrary to Wikipedia policy. Michael Hardy 21:57, 29 March 2007 (UTC)

What you quoted from this textbook isn't even the same as my Theorem. And do not quote Wikipedia policy to me - take me to court, sue me, do whatever you wish. I have this Theorem in a journal, and have had it copyrighted to my name, so that scavengers on internet sites cannot attribute a novel idea to a text book they happen to have read. I had it checked, along with a proof by a university Professor who specialises in the concepts of probability and statistics. It then underwent a stage of 'gaining plausibility', and under further rigorous proof. There was a work through proof, and a proof by induction which clearly shows that the NEW theorem works, for all the possible values it outlines. I think you'll find the quote you have from your book involves a different concept to what I outlined before. I'll tell you what: take a look at it, and as Fermat said before he published his last Theorem "prove me wrong": I've got a mortgage on it saying you can't!! All the best, Dr. William Kitchen --Preceding unsigned comment added by 84.66.3.105 (talkcontribs) 22:59, 29 March 2007

OK, I will go back and look carefully at what you added to the article. But if it is to be included, it should be written clearly, using standard notation (not, for example, using the same letter for two different random variables in the same breath), standard language and spelling (e.g. "theorem", not "theorom" as you wrote above) and following standard Wikipedia conventions (e.g. who wrote what is in the edit history, NOT in the article itself). However, it would be a lot more efficient for you simply to tell me where to find your published article in the library than for you to go on at length about the whole history of your writing the article. (Oh, and I trust when you mention the copyright, you mean copyright on the article you wrote rather than on the theorem itself.) Michael Hardy 23:34, 29 March 2007 (UTC)

Well, I appreciate that. Like all Mathematicians, I like recognition for my work. I had to have it rigorously checked and compared with similar Theorems and Lemmas, to ensure I wasn't putting my name to a piece of work that someone else had previously discovered. Notation is a blunder, I hold my hands up on that front, and I understand the elementary nature of my error. I can provide you with my proof for the Theorem as soon as I finish my textbook which is in finalisation at the moment. All my work is momentarily on hold because of that. I welcome any scrutiny of my work - I feel that Mathematics is best done when under pressure from other esteemed Mathematicians. The workings of Wikipedia, however, are something I am not aware of, and I appreciate any guidelines you offer me to follow. Again, however, as I have already said, I know I can stand before any Mathematician and prove my Theorem. Regards Dr. William Kitchen

Hello Dr. William Kitchen, please try to relax a bit, nobody is trying to discredit your work. But we are talking at cross purposes. What one wants for an encyclopedia article on the binomial distribution is the fact that it is related to the negative binomial distribution. Ideally such a statement should be sourced. If appropriate a proof can be added. There were a number of problems, however, with your contribution, and Michael rightly reverted it. The notation is problematic (using X twice, using r both as an index and a parameter). The proof doesn't add to this article since it doesn't actually prove the theorem; it only gives some basic definitions and a referral. And finally, the theorem quoted is unknown to mathematicians, so it doesn't help one at all.
In my view the statement relating the two distributions can stay in the article. But the proof you supplied should either be replaced by a proper proof or by a reference to a published book or peer-reviewed paper.
As one final point, please do not make legal threats. Also wp:nor is established wikipedia policy, and this is not the place to put it to discussion. Sander123 12:09, 30 March 2007 (UTC)

Dr. Kitchen, could you tell us the title of the paper and the name of the journal and which issue it's in?

That would really be a whole lot more to the point than telling us how confident you are that everything about it is sound. Michael Hardy 20:31, 30 March 2007 (UTC)

## Normal Approximation

This approximation is a huge time-saver (exact calculations with large n are very onerous);

The exact calculations are only onerous if one doesn't have a computer. Considering that virtually all statistics is done over computers these days the above seems unimportant. 128.195.106.28 23:55, 31 March 2007 (UTC)

It is less important than it used to be, but if n is very large the exact computation can still be onerous. Perhaps more important is that the normal approximation means that a great many statistical tests designed for the normal distribution (such as the Student t-test, the F-test) can also be used for the binomial distribution under the right conditions. --McKay 05:38, 1 April 2007 (UTC)

Hmm, just a reader here, but I can't make a modern computer delay visually for any reasonable n (up to 9999999999) when using the exact solution. I advise my students to always use the exact test and that the normal approximation is a relic of a bygone era. However it is interesting and perhaps worth noting why the binomial becomes normal-ish. Also, I thought the ability to use the normal approximation was based on np not n - with a low enough p, even a huge n will be skewed.4.79.81.6 04:45, 1 November 2007 (UTC)

I have found that the normal approximation is indeed a time-saver when, e.g., computing many different binomial distributions. In my case -- using Octave -- computation sped up a lot, especially since I was using quite large n's and always had to sum up about n/2 distributions (for only one point in the plot). So thank you for mentioning it in the article! --129.13.186.1 (talk) 10:13, 18 September 2009 (UTC)

But you might have been better off using the incomplete beta function result that is included. Melcombe (talk) 15:53, 18 September 2009 (UTC)

Your end result for the binomial approximation is incorrect. It should be N(np , (np(1-p))^1/2). You currently have N( np , (np(1-p))). —Preceding unsigned comment added by 24.29.95.138 (talk) 21:59, 24 January 2010 (UTC)

The formula as given in the article is correct. It matches the mean (np) and the variance (np(1−p)) of the binomial and the normal distributions.  … stpasha »  22:42, 24 January 2010 (UTC)
But normal distributions aren't given by mean and variance, they're given by mean and standard deviation.
It can be done both ways. In the Wikipedia article, the normal distribution is defined in terms of the variance, so to be consistent, it's probably best to do it that way here too. PAR (talk) 16:00, 25 January 2010 (UTC)
In words, one can say “a normal distribution with mean xxx and standard deviation yyy”. But when writing a formula, it is always the $\mathcal{N}(\mu, \sigma^2)$, and I've never seen it otherwise.  … stpasha »  09:49, 26 January 2010 (UTC)
Actually, come to think of it, neither have I. But there's no mathematical proof that says that's the way it has to be done; that's what I meant. PAR (talk) 16:01, 26 January 2010 (UTC)
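The variance versus standard deviation question matters in practice because software conventions differ. As a small illustration (a sketch written for this thread, with invented function names), Python's `statistics.NormalDist` takes the standard deviation as its second argument, so the variance np(1−p) from the $\mathcal{N}(\mu, \sigma^2)$ notation has to be square-rooted before it is passed in:

```python
from math import comb, sqrt
from statistics import NormalDist

def exact_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p), by direct summation."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def normal_approx_cdf(k, n, p):
    """Normal approximation to P(X <= k) with a continuity correction of 0.5."""
    mean = n * p
    variance = n * p * (1 - p)
    # NormalDist expects (mu, sigma): pass the standard deviation, not the variance.
    return NormalDist(mean, sqrt(variance)).cdf(k + 0.5)
```

For n = 100 and p = 0.5 the two functions agree closely at k = 55, while passing the variance where the standard deviation is expected would give a visibly wrong answer.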

## Explicit derivations of mean and variance

This section is my first contribution. I sincerely hope it's sensible to have done so and that it is a (potential) boon to readers. I'm honing it, adding links, references, improving text etc. Please give me a couple of days, I'll post it in one single edit. I'd appreciate any advice you have for me regarding content choice, style etc. Thank you. Thanks already to Michael Hardy. Gerald Tros 01:34, 25 April 2007 (UTC). OK, a couple of weeks. It's almost ready :-) Gerald Tros 01:31, 11 May 2007 (UTC) Done. Gerald Tros 01:28, 16 May 2007 (UTC)

Might it not be a lot easier to demonstrate this proof using generating functions? I can easily do it this way, unless anyone can spot a good reason not to (it requires a lot less algebra...but does require some GF results) Wrayal 20:45, 31 May 2007 (UTC)

I can see that it would be easier. But it would require more starting knowledge. I'd guess that anybody who knows about generating functions does not need to look up the derivation of the mean in wikipedia. Therefore I think the derivations should be kept as elementary as possible. Sander123 13:20, 5 June 2007 (UTC)

## Incorrect cdf

The cdf of a discrete distribution must be piecewise constant. ПБХ 15:13, 21 September 2007 (UTC)

## derivations

I hope somebody could help me in finding derivations or how to derive the skewness and kurtosis, even link to other sites will be much appreciated. —Preceding unsigned comment added by Student29 (talkcontribs) 19:29, 16 January 2008 (UTC)

After giving the expectation as np, the article states "This fact is easily proven as follows. Suppose first that we have exactly one Bernoulli trial. We have two possible outcomes, 1 and 0, with the first having probability p and the second having probability 1 − p; the mean for this trial is given by μ = p." This is not a proof. These sentences should really just be removed. —Preceding unsigned comment added by 68.50.194.132 (talk) 19:59, 16 February 2008 (UTC)
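One way the quoted single-trial step could be extended into a complete argument (a standard sketch, not taken from the article; the indicator variables $X_i$ are notation introduced here) is to write the binomial variable as a sum of n independent Bernoulli trials:

$X = X_1 + X_2 + \cdots + X_n, \qquad \operatorname{E}[X_i] = 1\cdot p + 0\cdot(1-p) = p$

$\operatorname{E}[X] = \sum_{i=1}^{n} \operatorname{E}[X_i] = np$

$\operatorname{Var}(X_i) = \operatorname{E}[X_i^2] - \operatorname{E}[X_i]^2 = p - p^2 = p(1-p), \qquad \operatorname{Var}(X) = \sum_{i=1}^{n} \operatorname{Var}(X_i) = np(1-p)$

The first sum uses linearity of expectation, which needs no independence; the variance sum does rely on the trials being independent.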

## Variance

In the section entitled Mean, variance and mode, it isn't clear to me how the expression given follows from "Using the definition of variance, we have..." Should I try to find this in the entry for variance, figure it out from the problem statement, or use the definition of variance given just above? In any case I don't see how it follows.Telliott (talk) 11:43, 14 March 2008 (UTC)

The figures have unlabeled axes, making them pretty much useless. Can someone either introduce new figures, or edit the existing ones to have axis labels? 209.94.128.119 (talk) 01:30, 13 November 2008 (UTC)

## Sampling

The article for Hypergeometric distribution describes how it is used for sampling without replacement and states that Binomial Distribution is used for sampling with replacement. How is Binomial Distribution method used for sampling? Virgil H. Soule (talk) 18:11, 2 July 2009 (UTC)

Removed:

As another example, assume 5% of a very large population to be green-eyed. You pick 100 people randomly. The number of green-eyed people you pick is a random variable X which approximately follows a binomial distribution with n = 100 and p = 0.05 (strictly a hypergeometric distribution).

If it isn't strictly a binomial distribution, then it is a bad example.

In lieu of misusing a hypergeometric distribution as an example of a binomial distribution, perhaps add a section detailing the relationship and how they are similar and yet different? Madkaugh (talk) 00:42, 13 October 2009 (UTC)
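To give a feel for how close the two distributions are in the removed example, here is a small Python comparison. The population size of 100,000 is an assumption made for this sketch (the example only says "very large"), and the function names are invented here.

```python
from math import comb

def hypergeom_pmf(k, population, successes, draws):
    """P(k successes) when drawing without replacement."""
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))

def binom_pmf(k, n, p):
    """P(k successes) in n independent trials, i.e. drawing with replacement."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Assumed figures: 100,000 people, 5% green-eyed, sample of 100.
POPULATION, GREEN_EYED, SAMPLE = 100_000, 5_000, 100

# Largest pointwise difference between the two mass functions.
max_gap = max(
    abs(hypergeom_pmf(k, POPULATION, GREEN_EYED, SAMPLE)
        - binom_pmf(k, SAMPLE, GREEN_EYED / POPULATION))
    for k in range(SAMPLE + 1)
)
```

With these assumed numbers `max_gap` is small, which is the sense in which the binomial "approximately" describes sampling without replacement from a large population; the gap grows as the sample becomes a larger fraction of the population.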

## Mode Expression Incorrect

The expression for the mode is incorrect. Imagine a Binomial distribution with p = 1.0 and n = 2: the expression for the mode will return 3 while the true value is 2. --Preceding unsigned comment added by 86.165.211.190 (talk) 21:52, 1 November 2009 (UTC)

I've changed it to this:
$\text{mode} = \begin{cases}\lfloor (n+1)\,p\rfloor & \text{if }(n+1)p\text{ is 0 or a noninteger}, \\ \lfloor (n+1)\,p\rfloor \text{ and } \lfloor (n+1)\,p\rfloor - 1 &\text{if }(n+1)p\in\{1,\dots,n\}, \\ n & \text{if }(n+1)p = n + 1.\end{cases}$

Michael Hardy (talk) 06:17, 25 November 2009 (UTC)
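The three cases above translate fairly directly into code. The following Python sketch is one possible transcription (the function name is invented here); it converts p to an exact `Fraction` so that the test for whether (n+1)p is an integer is not disturbed by floating-point rounding.

```python
from fractions import Fraction
from math import floor

def binomial_modes(n, p):
    """Modes of Binomial(n, p), following the three-case formula.

    p is converted to a Fraction; pass an int or Fraction for exact results.
    """
    m = (n + 1) * Fraction(p)
    if m == n + 1:                      # only possible when p == 1
        return [n]
    if m == 0 or m.denominator != 1:    # zero or a noninteger: a single mode
        return [floor(m)]
    return [int(m) - 1, int(m)]         # integer in 1..n: two adjacent modes
```

For the case raised at the top of this section, `binomial_modes(2, 1)` gives `[2]` rather than 3.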

## Parameter n

It doesn't make any sense for the parameter n to be 0. Most texts limit n to be a natural number. —Preceding unsigned comment added by 128.187.81.187 (talk) 21:03, 23 November 2009 (UTC)

The case n=0 gives a valid distribution, which is a natural part of the same family and which is required in more complicated manipulations of distributions such as compounding. Melcombe (talk) 10:36, 24 November 2009 (UTC)

## Computing the cumulative distribution function (CDF)

It's common in Wikipedia math articles to discuss algorithms for computing quantities of interest. On this page it would be very helpful to have a discussion of computing the cumulative distribution function (CDF). The article does mention various methods that can be used in various circumstances, but this is an incomplete solution at best. For example, the article doesn't provide any guidance about choosing between (a) a combination of direct summation and the normal approximation and the poisson approximation, (b) a method based on the incomplete beta distribution, and (c) something else. A discussion of computing the CDF would be useful to a lot of people. ATBS 22:28, 30 November 2009 (UTC)ATBS —Preceding unsigned comment added by ATBS (talkcontribs)

## Normal approximations

The second rule of thumb for normal approximations looks suspicious. It can be written in the following form: use normal approximation whenever

n · |skewness| ≥ 3.33

In particular that rule claims normal approximation should not be used for any n when p = ½. So the sign should probably be reversed, and the factor n omitted?  … stpasha »  22:41, 30 November 2009 (UTC)

Fixed. -12.7.202.2 (talk) 18:29, 14 May 2010 (UTC)

There is an error in an example: σ = (p(1 − p)/n)^(1/2). Should be σ = (np(1 − p))^(1/2). Please someone fix it or explain what I do not understand there. --Preceding unsigned comment added by 213.197.179.210 (talk) 10:34, 5 May 2011 (UTC)

## Controlling the variance

Hi all,

I came up with a way to add variance to the binomial distribution, for this purpose I consider the history of success compare to the expected value. Here is my development (I hope it is OK to have a link)

I would really like to know what you think, to me it looks very cool as I use expected value and sum of binomial series and I didn't see anything like it anywhere.

What do you say? Ofermano (talk) 16:31, 28 June 2011 (UTC)

## Accessibility

Would it be possible to write an introductory section that gives just a conceptual description of what the binomial distribution is about, before we enter the maths? Like tossing a coin, or drawing marbles from a box, and replacing the drawn marble each time (and mixing the box up again)? --JN466 02:18, 3 July 2011 (UTC)

Good idea. The lead sort of introduces it, but there should be room for a more detailed overview. Sources shouldn't be too tricky to find. Alzarian16 (talk) 04:20, 3 July 2011 (UTC)

## Error in article

Hi, isn't the standard deviation calculated as : sqrt((p(1 − p) n)) ? In the article it is written as: sqrt((p(1 − p)/n)) — Preceding unsigned comment added by 213.55.184.169 (talk) 06:31, 15 March 2012 (UTC)
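A possible source of the confusion, offered here as an assumption about what the article's example intended rather than a confirmed reading: both expressions are valid standard deviations, but of different quantities. sqrt(np(1−p)) belongs to the count of successes X, while sqrt(p(1−p)/n) belongs to the observed proportion X/n. A short Python sketch with invented function names:

```python
from math import sqrt

def sd_of_count(n, p):
    """Standard deviation of X, the number of successes in n trials."""
    return sqrt(n * p * (1 - p))

def sd_of_proportion(n, p):
    """Standard deviation of X / n, the observed fraction of successes."""
    return sqrt(p * (1 - p) / n)
```

The two differ exactly by a factor of n, so for n = 100 and p = 0.7 the count has a standard deviation of about 4.58 and the proportion about 0.0458. Which formula is right depends on whether the example is talking about the count or the proportion.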

## Cumulative distribution function -- Example

Is it my imagination or are only the first and last probabilities for the biased coin correct? I have run that in SAS

data _null_ ;
p = 0.3 ;
do i = 0 to 6 ;
prob= (p**i) * ((1-p)**(6-i)) ;
put i=  prob= ;
end ;
run ;


and I get

i=0 prob=0.117649
i=1 prob=0.050421
i=2 prob=0.021609
i=3 prob=0.009261
i=4 prob=0.003969
i=5 prob=0.001701
i=6 prob=0.000729


Docsteve.518 (talk) 21:46, 15 April 2013 (UTC)

Yes, it is your imagination. Seriously, you have forgotten to include the combinatorial coefficient. Melcombe (talk) 21:51, 15 April 2013 (UTC)

Oh yes, thanks for that. That's what I get for trying to do it long hand.

The SAS functions exist for a reason!

data _null_ ;
do i = 0 to 6 ;
x = pmf('Binomial',i,.3,6) ;
put i=  x= ;
end ;
run ;


And yes, there's the sequence

i=0 x=0.117649
i=1 x=0.302526
i=2 x=0.324135
i=3 x=0.18522
i=4 x=0.059535
i=5 x=0.010206
i=6 x=0.000729


16:07, 16 April 2013 (UTC) — Preceding unsigned comment added by 72.43.218.26 (talk)
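For readers without SAS, the same check can be sketched in Python. This is an illustrative equivalent written for this thread, not a translation of any SAS internals; the function name is invented, and the key point is the combinatorial coefficient `comb(n, k)` that the first attempt above left out.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), including the combinatorial coefficient."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Biased coin from the article's example: p = 0.3, n = 6.
for i in range(7):
    print(f"i={i} x={binom_pmf(i, 6, 0.3):.6f}")
```

The printed values should match the second SAS listing above up to formatting.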

## Question about Cumulative Distribution Function

Firstly, I am wondering about the definition of the CDF

$F(k;n,p) = \Pr(X \le k) = \sum_{i=0}^{\lfloor k \rfloor} {n\choose i}p^i(1-p)^{n-i}$

From my training (and looking at the graphs on THIS page), we should be defining F on the real numbers and writing < and not ≤, that is:

$F(x) = \begin{cases} 0 & x < 0 \\ \sum\limits_{j=0}^{k-1} {n\choose j}\, p^j (1-p)^{n-j} & k-1 \le x < k,\ k \in \{1,2,\dots,n\} \\ 1 & x \ge n \end{cases}$

or indeed:

$F(x) = \begin{cases} 0 & x < 0 \\ \sum\limits_{j=0}^{k-1} f(j) & k-1 \le x < k,\ k \in \{1,2,\dots,n\} \\ 1 & x \ge n \end{cases}$

(I also switched to j in place of i since so many applications now assume i is the corresponding complex number.)

I had already given this formula in my MK wikipedia page and was wanting to add an iterative "computer graphing formula" and so checked to see if there was one on this page and became confused with the above formula.

BTW: Here is the iterative formula I was getting ready to add. Any suggestions here to make this clearer how to use?

$\begin{array}{*{20}{l}}{{F_0}(x) = 0}&{x < 0}\\{{F_{k + 1}}(x) = {F_k}(x) + f(k)}&{k \le x < k + 1,\,\,k = 0,1,2,...,n - 1}\\ {{F_{n + 1}}(x) = 1}&{x \ge n}\end{array}$

So if you make a sequence of the probabilities values (easy), you can easily make this into a sequence of n+2 points and then draw segments as in the above graphs. Having worked this out a gazillion times for my kiddies, I finally wrote it down.

Lfahlberg (talk) 09:06, 22 November 2013 (UTC)
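The iterative idea above (each step of F adds the next probability f(k)) can be sketched in Python as follows. The function names are invented for this illustration, and `f` is taken to be the binomial mass function.

```python
from math import comb, floor

def binom_pmf(k, n, p):
    """f(k) = P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def cdf_steps(n, p):
    """Heights of the step function: [F(0), F(1), ..., F(n)], built iteratively."""
    heights = []
    running_total = 0.0
    for k in range(n + 1):
        running_total += binom_pmf(k, n, p)   # F on [k, k+1) is the previous height plus f(k)
        heights.append(running_total)
    return heights

def cdf(x, n, p):
    """F(x) for real x: 0 below 0, 1 from n upward, constant on each [k, k+1)."""
    if x < 0:
        return 0.0
    if x >= n:
        return 1.0
    return cdf_steps(n, p)[floor(x)]
```

Plotting horizontal segments at the heights returned by `cdf_steps`, one per interval [k, k+1), gives the staircase graph. Evaluated at any real x, this agrees with the floor-based definition quoted at the top of the section.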

## Graph, n

Please label the axes on the graphs, and state the allowable range for parameter n. Does n include zero? 71.139.165.140 (talk) 19:11, 14 December 2014 (UTC)

## We should add the moment generating function

Here is a reference to use: http://www.le.ac.uk/users/dsgp1/COURSES/MATHSTAT/5binomgf.pdf

Tal Galili (talk) 17:04, 25 March 2015 (UTC)

I took a quick look at other probability distribution pages on Wikipedia, and I'm not seeing any derivations of the mgfs there. I suppose that doesn't necessarily disqualify us for adding the mgf to this page, but considering that this page doesn't even derive the mean or variance of the binomial, then I don't see adding this material to this page. Blahb31 (talk) 21:45, 25 March 2015 (UTC)
Thank you, fair point. Since MGF are very basic for using these objects in various settings, I think this type of information should be available somewhere on Wikipedia. a) would you agree? b) if so - where do you think would it fit?

Cheers, Tal Galili (talk) 22:28, 25 March 2015 (UTC)

I'm going to say no. This is material for a textbook, not an encyclopedia. Blahb31 (talk) 11:54, 26 March 2015 (UTC)