Talk:Normal distribution/Archive 3


Geometric mean and asset returns?

The usual mean is μ, what's the geometric mean? If μ is 1,15 and σ 0,2 the geometric mean seems to be around 1,132 or something? There's a formula for it right? -- JR, 10:29, 8 April 2007 (UTC)

It doesn't make sense to speak of a geometric mean of a random variable that's not always positive. Michael Hardy 20:26, 8 April 2007 (UTC)
I took a second look at the book (John Hull, chapter 13). It's assumed that the realized cumulative (geometric) return is ø(μ - σ^2/2, σ/T) over, for example, 100 periods T. So that distribution cannot be used to test the real cumulative return. It seems to assume that the return per period T is distributed as ln x ~ ø(μ - σ^2/2, σ) (from eq 13.2), which is e ^ ø(8%, 20%) when μ = 10% and σ = 20%. The arithmetic mean of that lognormal distribution is 10.517... % and the geometric mean is 8.3287.... % per period T over a large number of periods. So e ^ ø(8%, 20%) is the return that can be tested for a large number of periods. It will give a compounded return of close to 8.3287.... %. So my question is, why would you call an expected cumulative return of 8.3287....% 10%? I'll see if reading the other chapters will clear things up. -- JR, 11:31, 15 April 2007 (UTC)
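The two percentages quoted above can be reproduced with a few lines of Python. This is only a sketch under the assumption stated in the comment, namely that the log-return per period is normal with mean 8% and standard deviation 20%; the variable names are illustrative.

```python
import math

m, s = 0.08, 0.20  # assumed mean and standard deviation of the log-return per period

# Arithmetic mean of the lognormal gross return is exp(m + s^2/2).
arithmetic_return = math.exp(m + s**2 / 2) - 1
# Geometric mean (long-run compounded growth per period) is exp(m).
geometric_return = math.exp(m) - 1

print(f"arithmetic mean return: {arithmetic_return:.5%}")
print(f"geometric mean return:  {geometric_return:.5%}")
```

The gap between the two figures is the σ^2/2 term: the arithmetic mean is taken over returns, the geometric mean over log-returns.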

error in the cdf?

You need to be more specific about what exactly you think might be wrong. --MarkSweep 00:19, 8 September 2005 (UTC)

Integrating the normal density function

Can anyone tell me what the integral of (2pi)^-0.5*e^(-0.5x^2) is? I tried integrating by parts and other methods but had no luck. Can someone help?

The antiderivative does not have a closed-form expression. The definite integral can be found:
\int_{-\infty}^\infty e^{-x^2/2}\,dx = \sqrt{2\pi\ }.
See Gaussian function for a derivation of this. Michael Hardy 20:40, 22 May 2006 (UTC)
I didn't find an explicit derivation in the Gaussian function article, so I created this page: Integration_of_the_normal_density_function. Would it be appropriate to place this link somewhere in the normal distribution article? Mark.Howison 06:32, 1 February 2007 (UTC)

Sorry---it's not in Gaussian function; it's in Gaussian integral. Michael Hardy 21:31, 1 February 2007 (UTC)
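For readers who want to see the definite integral quoted in this thread confirmed numerically, here is a minimal Python sketch. The function name, the truncation of the range to [-10, 10] and the step count are assumptions made for illustration; the tails beyond that range are negligible.

```python
import math

def integrate_gaussian(limit=10.0, steps=200_000):
    """Midpoint-rule estimate of the integral of exp(-x^2/2) over [-limit, limit]."""
    h = 2 * limit / steps
    return h * sum(
        math.exp(-((-limit + (i + 0.5) * h) ** 2) / 2) for i in range(steps)
    )

estimate = integrate_gaussian()
print(estimate, math.sqrt(2 * math.pi))  # the two values should agree closely
```

This checks the value only; it says nothing about an antiderivative, which has no closed form.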

Gaussian curve estimation

I came to this article looking for a way to approximate the Gaussian curve, and couldn't find it on this page, which is a pity. It would be nice to have a paragraph about the different ways to approximate it. One such way (using polynomials on intervals) is described here: [1] I can write it; any suggestions for where to put this? A top-level paragraph before trivia? --Nicolas1981 15:53, 6 October 2006 (UTC)

I think it would fit there. Michael Hardy 19:52, 6 October 2006 (UTC)
I added it. I felt a bit bold because it is very drafty when compared to the rest of the page, but I hope that many people will bring their knowledge and make it an interesting paragraph :-) Nicolas1981 21:37, 6 October 2006 (UTC)

I just noticed that the French article has a good paragraph about a trivial way to approximate it (with steps). There is also this table on wikisource. I have to go out now, but if anyone wants to translate them, please do :-) Nicolas1981 21:54, 20 October 2006 (UTC)

reference in The Economist

Congratulations, guys - the Economist used Wikipedia as the source for a series of pdf graphs (the normal, power-law, Poisson and one other) in an article on Bayesian logic in the latest edition. Good work! --Cruci 14:58, 8 January 2006 (UTC)

Typesetting conventions

Please notice the difference in (1) sizes of parentheses and (2) the dots at the end:


Michael Hardy 23:54, 8 January 2006 (UTC)

Quick compliment

I've taught intro statistics, and I've seen treatments in many textbooks. This is head and shoulders above any other treatment! Really well done guys! Here is the Britannica article just for a point of comparison (2 paragraphs of math with 2 paragraphs of history) jbolden1517Talk 18:54, 5 May 2006 (UTC) <clapping>

Thank you. (Many people worked on this page; I'm just one of those.) Michael Hardy 22:02, 5 May 2006 (UTC)

Eigenfunction of FFT?

I believe the normal distribution is the eigenfunction of the Fourier transform. Is that correct? If so, should it be added? —Ben FrantzDale 16:57, 26 June 2006 (UTC)

That was quick. According to Gaussian function, all Gaussian functions with c^2=2 are, so the standard normal, with σ=1, is an eigenfunction of the FFT. —Ben FrantzDale 16:57, 26 June 2006 (UTC)


i'm trying to find out what a q-function is, specifically q-orthogonal polynomials. I searched q-function in the search and it came here. I'm guessing this is wrong. —Preceding unsigned comment added by (talk) 05:04, 7 November 2006 (UTC)

Added archives

I added archives. I tried to organize the content so that any comments from 2006 are still on this page. There was one comment from 2006 that I didn't think was worth keeping. It's in the 2005 archive. If you have any questions about how I did the archive, ask me here or on my talk page. — Chris53516 (Talk) 14:51, 7 November 2006 (UTC)

Can you please link the article to the Czech version

Hello, can you please link the article to the Czech version as follows?

cs : Normální rozdělení

I would do it myself but as I see some characters as question marks in the main article I am afraid that I would damage the article by editing it. Thank you. —Dan

Ok, I did it. Check out how it is done, so you can do it yourself in the future. PAR 10:47, 12 November 2006 (UTC)

Standard normal distribution

In the section "Standardizing normal random variables" it's noted that "The standard normal distribution has been tabulated, and the other normal distributions are simple transformations of the standard one." Perhaps these simple transformations should be discussed? —The preceding unsigned comment was added by (talkcontribs) 11:36, 4 December 2006 (UTC).

They are discussed in the article, just above the sentence that you quote. Michael Hardy 21:58, 4 December 2006 (UTC)
I reworded the section slightly to make that clearer. --Coppertwig 04:35, 5 December 2006 (UTC)

Jonnas Mahoney?

Um... This is my first time commenting on anything on Wiki. There seems to be an error in the article, although I'm not certain. Jonnas Mahoney... should really be Johann Carl Friedrich Gauss? Who's Jonnas Mahoney? :S

Edit: lol. fixed. that was absolutely amazing.

—Preceding unsigned comment added by Virux (talkcontribs) 05:38, 10 December 2006 (UTC)

PDF function

I believe there is an error in the pdf function listed, it is missing a -(1/2) in the exponent of the exp!!! —The preceding unsigned comment was added by (talk) 19:02, 11 December 2006 (UTC).

Well, scanning the article I find the first mention of the pdf, and clearly the factor of −1/2 is there, where it belongs:
The probability density function of the normal distribution with mean μ and variance σ^2 (equivalently, standard deviation σ) is a Gaussian function,

\frac{1}{\sigma\sqrt{2\pi}} \, \exp \left( -\frac{(x- \mu)^2}{2\sigma^2} \right) = {1 \over \sigma} \varphi \left( \frac{x - \mu}{\sigma} \right),
where
\varphi(x)=\frac{1}{\sqrt{2\pi\,}} e^{-x^2/2}
is the density function of the "standard" normal distribution, i.e., the normal distribution with μ = 0 and σ = 1.
Similarly, I find the factor of −1/2 in all the other places in the article where the density is given. Did I miss one? If so, please be specific as to where it is found. Michael Hardy 21:26, 11 December 2006 (UTC)

One thing about the PDF: I was for a moment under the mistaken impression that the PDF can't go higher than 1. This mistaken impression was supported by the fact that the graphs have y values < 1. However, I believe it can go arbitrarily high (max \frac{1}{\sigma\sqrt{2\pi}}, where \sigma can be arbitrarily small). I wonder if someone could produce a graph with y values higher than one, just for illustration. dfrankow (talk) 22:36, 1 March 2008 (UTC)
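A quick way to see that the density can exceed 1 is to evaluate it at the mean for a few small values of σ. The Python sketch below uses the standard library's `statistics.NormalDist`; the particular σ values are arbitrary choices for illustration.

```python
import math
from statistics import NormalDist

peaks = {}
for sigma in (1.0, 0.3, 0.1):
    # Density at the mean; analytically this is 1 / (sigma * sqrt(2*pi)).
    peaks[sigma] = NormalDist(mu=0.0, sigma=sigma).pdf(0.0)
    print(f"sigma = {sigma}: peak density = {peaks[sigma]:.4f}")
```

The peak passes 1 as soon as σ drops below 1/\sqrt{2\pi}, roughly 0.4, while the area under the curve stays equal to 1.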

There is a square missing in the pdf function given in the right hand column. Source of image is: —Preceding unsigned comment added by (talk) 03:14, 2 December 2009 (UTC)

Definition of density function

I know I'm probably being somewhat picky, but here goes: In the section "Characterization of the Normal Distribution," we find the sentence:

The most visual is the probability density function (plot at the top), which represents how likely each value of the random variable is.

This statement isn't technically accurate. Since a (real-valued) Gaussian random variable can take on any number on the real line, the probability of any particular number occurring is always zero. Instead, the PDF tells us the probability of the random variable taking on a value inside some region: if we integrate the pdf over the region, we get the probability that the random variable will take on a number in that region. I know that the pdf gives a sort of visual intuition for how likely a particular realization is, so I don't want to just axe the sentence, but maybe we can find a way to be precise about this while avoiding an overly pedantic discussion like the one I've just given? Mateoee 19:46, 12 December 2006 (UTC)

I took a try at it, staying away from calculus. It's still not correct, but it's closer to the truth. PAR 23:50, 12 December 2006 (UTC)
I think I found a way to be precise without getting stuck in details or terminology. What do you think? Mateoee 03:19, 14 December 2006 (UTC)
Well, it's correct, but to a newcomer, I think it's less informative. It's a tough thing to write. PAR 03:41, 14 December 2006 (UTC)

The new version seems a bit vague. But I don't think this article is the right place to explain the nature of PDFs. It should just link to the article about those. Michael Hardy 17:05, 14 December 2006 (UTC)
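The point made in this thread, that probabilities come from integrating the density over a region rather than reading off its height, can be illustrated with a short Python sketch. The interval endpoints and step count are arbitrary assumptions.

```python
from statistics import NormalDist

dist = NormalDist(mu=0.0, sigma=1.0)
a, b = -0.5, 1.5  # an arbitrary interval, chosen for illustration

# Midpoint-rule integral of the density over [a, b].
steps = 10_000
h = (b - a) / steps
area = h * sum(dist.pdf(a + (i + 0.5) * h) for i in range(steps))

# The same probability obtained from the cumulative distribution function.
prob = dist.cdf(b) - dist.cdf(a)
print(area, prob)  # the two values should agree closely
```

Shrinking the interval to a single point drives both numbers to zero, which is the sense in which any individual value has probability zero.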

Summary too high depth

I linked this article for a friend because they didn't know what a normal distribution was. However, the summary lacked a brief plain-English notion of what one is. The summary is confusing for people who haven't had some statistics. If there's not an immense negative reaction to altering the summary, I'll do that tomorrow. i kan reed 23:08, 2 January 2007 (UTC)

weird way of generating gaussian

Does anyone know why the following method works?
Generate n random numbers so that n>=3
Add the results together
Repeat many times
Create a histogram of the sums. The histogram will be a "gaussian" distribution centered at n/2. I put "gaussian" in quotes because clearly the distribution will not go from negative infinity to infinity, but will rather go from 0 to n. It sounds bogus, but it really works! I really wish I knew why though. --uhvpirate 23:04, 16 January 2007 (UTC)

The article titled central limit theorem treats that phenomenon. Michael Hardy 01:34, 17 January 2007 (UTC)
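The recipe described in this thread is easy to try. The Python sketch below assumes the "random numbers" are uniform on [0, 1) (which is what a centre of n/2 implies) and picks n = 12 and a fixed seed purely for illustration.

```python
import random
from statistics import mean, pvariance

random.seed(0)  # fixed seed so the run is repeatable
n, trials = 12, 50_000

# Each trial adds n independent uniform numbers on [0, 1).
sums = [sum(random.random() for _ in range(n)) for _ in range(trials)]

# Theory: the sum has mean n/2 and variance n/12.
print(mean(sums), n / 2)
print(pvariance(sums), n / 12)

# For a normal distribution about 68.27% of the mass lies within one
# standard deviation of the mean; the sums should come close to that.
sd = (n / 12) ** 0.5
within_one_sd = sum(abs(s - n / 2) <= sd for s in sums) / trials
print(within_one_sd)
```

The agreement is approximate, not exact: the sum is bounded between 0 and n, so the tails are necessarily lighter than those of a true normal distribution.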

lattice distribution

Can someone add a link about lattice distribution? Of course, and add an article about lattice distribution. Jackzhp 23:40, 7 February 2007 (UTC)

Open/closed interval notation

In this sentence: "' uniformly distributed on (0, 1], (e.g. the output from a random number generator)" I suspect the user who called this a "typo" and changed it to "[0, 1]" (matching square brackets) didn't understand the notation. "(0, 1]" means an interval that includes 1 but does not include 0. "[0, 1]" includes both 0 and 1. Each of these intervals also includes all the real numbers between 0 and 1. It's a standard mathematical notation. Maybe we need to put a link to a page on mathematical notation? --Coppertwig 13:10, 13 February 2007 (UTC)

sum-of-uniforms approximation

The sum-of-uniforms approximate scheme for generating normal variates cited in the last section of the article is probably fine for small sets (<10,000), but the statement about it being 12th order is misleading. The moments begin to diverge at the 4th order. Also, note that this scheme samples a distribution with compact support (-6,6); so it is ill-advised for any application that depends on accurate estimation of the mass of extreme outcomes. JADodson 18:58, 15 February 2007 (UTC)

Applications that depend on accurate estimation of the mass of extreme outcomes are rare, and they are rarely exactly normal, because the normal distribution is often used as an approximation to some nonnormal distribution, such as a gamma or beta or poisson or binomial or hypergeometric distribution. So an unsophisticated method is called for, such as the sum of uniforms. Bo Jacoby 16:01, 9 April 2007 (UTC).
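The remark that the moments start to disagree at the 4th order can be checked exactly. The sketch below assumes the scheme in question is the sum of 12 independent uniform(0, 1) variates and uses the standard formula for the fourth central moment of a sum of independent, identically distributed terms.

```python
from fractions import Fraction

n = 12
var_u = Fraction(1, 12)  # variance of one uniform(0, 1) variate
mu4_u = Fraction(1, 80)  # fourth central moment of one uniform(0, 1) variate

# Central moments of a sum of n independent, identically distributed terms.
var_sum = n * var_u
mu4_sum = n * mu4_u + 3 * n * (n - 1) * var_u**2

# Fourth central moment of a normal distribution with the same variance.
mu4_normal = 3 * var_sum**2

print(var_sum, mu4_sum, mu4_normal)
```

The variance matches the standard normal, and the odd moments vanish by symmetry, so the fourth moment is the first place where the two distributions differ.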

Complex Gaussian Process

Consider a complex Gaussian random variable,

z = x + i\,y

where x and y are real Gaussian variables, with equal variances \sigma_r^2=\sigma_x^2=\sigma_y^2. The pdf of the joint variables will be,

\frac{1}{2\,\pi\,\sigma_r^2} e^{-\frac{x^2+y^2}{2 \sigma_r ^2}}

since \sigma_z=\sqrt{2}\sigma_r, the resulting PDF for the complex Gaussian variable is,

\frac{1}{\pi\,\sigma_z^2} e^{-\frac{|z|^2}{\sigma_z^2}}.

—Preceding unsigned comment added by Paclopes (talkcontribs) 22:36, 18 February 2007 (UTC)


In the article as it stands, the distribution has parameters \mu and \sigma^2, whereas the distribution function has parameters \mu and \sigma (in addition to its argument, x). I have found sources corroborating this choice, but it seems odd. I am aware that Wikipedia should report on the state of affairs, not try to repair it. But if some sources could be found that use either \sigma in both cases, or \sigma^2 in both cases, we might do the same, and just indicate briefly that other sources do it differently. Any comments?--Niels Ø (noe) 12:06, 30 April 2007 (UTC)

I've changed it: they now all say μ and σ^2 (I think the comment by user: misses the point). Michael Hardy 19:56, 27 August 2007 (UTC)

This is an unnecessary discussion. The two parameters are \mu and \sigma^2. It just so happens that in the pdf the square root of \sigma^2 appears. And since no one would write a non simplified item into a pdf function they wrote it as \sigma. It does not mean there is any discrepancy in the statement of the parameters.

If you need further explanation think of it like this. Whether you use \sigma or \sigma^2 as the parameter will essentially NOT change the pdf at all!

Remember that in a given normal distribution \sigma has some specified decimal value. If you use \sigma then that value will simply remain unchanged in the overall denominator and then squared in the exponent of e; if you use \sigma^2 then that value’s square root will be taken in the denominator and it will remain unchanged in the exponent of e. But in either case when you write the generalized function the denominator will always have \sigma and the exponent of e will always have \sigma^2, regardless of which one you choose to put in the statement of parameters. It does not matter and YOU CANNOT use BOTH at once in the statement of parameters. In addition, x is not a parameter, it is the representation of specific decimal values for the normally distributed random variable X. —Preceding unsigned comment added by (talk) 18:53, August 27, 2007 (UTC)

Error in Standard Deviation section?

Hi, I think there's a slight error in the "Standard Deviation" section of this article. That is, the article says that the area underneath the curve from -n\sigma to n\sigma is:

\operatorname{erf}\left(\frac{n}{\sqrt{2}}\right)\,

However, if \operatorname{erf}(x) is defined as:

\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2} dt,

then
\operatorname{erf}(1/\sqrt{2}) = 0.3829249
\operatorname{erf}(2/\sqrt{2}) = 0.6826895
\operatorname{erf}(3/\sqrt{2}) = 0.8663856

Which is incorrect. However,

\operatorname{erf}(1\sqrt{2}) = 0.6826895
\operatorname{erf}(2\sqrt{2}) = 0.9544997
\operatorname{erf}(3\sqrt{2}) = 0.9973002

is correct. So, I think that the area underneath the curve in the article should be:

\operatorname{erf}\left(n \sqrt{2}\right)\,

Here's the R code that shows this:

> erf <- function(x) 2 * pnorm(x / sqrt(2)) - 1 
> erf(c(1,2,3)/(sqrt(2)))
[1] 0.3829249 0.6826895 0.8663856
> erf(c(1,2,3)*(sqrt(2)))
[1] 0.6826895 0.9544997 0.9973002

Thoughts? -- Joebeone (Talk) 18:19, 18 May 2007 (UTC)

I was wrong. I had the wrong formula written down for the relationship between R's pnorm() and \operatorname{erf}(x).
Here's a quick justification... From the definition of \operatorname{erf}(x) (See: Error function),
\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int^x_0 e^{-t^2} dt
Now, the normal distribution function (pnorm() in R) is
\operatorname{P}(x) = \frac{1}{\sqrt{2\pi}} \int^x_{-\infty} e^{-u^2/2} du
so (\operatorname{\Phi}(x) is the cumulative normal distribution function[2]):
\operatorname{P}(x) - 1/2 = \operatorname{\Phi}(x) = \frac{1}{\sqrt{2\pi}} \int^x_0 e^{-u^2/2} du
Now substitute t = u/\sqrt{2}
\operatorname{P}(x) - 1/2 = \frac{1}{\sqrt{2\pi}} \int^{x/\sqrt{2}}_0 e^{-t^2} dt\sqrt{2}
\operatorname{P}(x) - 1/2 = \frac{1}{2} \cdot \operatorname{erf}(x/\sqrt{2})
\operatorname{erf}(x) = 2 \cdot \operatorname{P}(x\sqrt{2}) - 1
Now, using the definition in the article for the area underneath the normal distribution from -n\sigma to n\sigma:
\operatorname{erf}\left(\frac{n}{\sqrt{2}}\right)
we calculate
\operatorname{erf}(1/\sqrt{2}) = 0.6826895
\operatorname{erf}(2/\sqrt{2}) = 0.9544997
\operatorname{erf}(3/\sqrt{2}) = 0.9973002
using the following R code:
> erf <- function(x) 2 * pnorm(x * sqrt(2)) - 1 
> erf(c(1,2,3)/(sqrt(2)))
[1] 0.6826895 0.9544997 0.9973002

Sorry for so much ink spilled. -- Joebeone (Talk) 01:43, 19 May 2007 (UTC)
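The same cross-check can be made without defining erf by hand. The Python sketch below (Python instead of the R used above, purely as an independent check) compares the standard library's `math.erf` with differences of the standard normal cdf; the variable names are illustrative.

```python
import math
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

areas = []
for n in (1, 2, 3):
    via_erf = math.erf(n / math.sqrt(2))
    via_cdf = std_normal.cdf(n) - std_normal.cdf(-n)
    areas.append(via_erf)
    print(n, round(via_erf, 7), round(via_cdf, 7))
```

Both columns give the area within n standard deviations of the mean, so they should match the corrected figures in the comment above.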

photon counts

Photon counts do not have a Gaussian (normal) distribution. Photon generation is a random process that can be approximated with the Poisson distribution (counting statistics). —Preceding unsigned comment added by (talk)

...and of course the Poisson distribution can be approximated by the normal distribution. Michael Hardy 02:14, 26 July 2007 (UTC)

...which means it isn't a good example of the normal distribution showing up in nature. MisterSheik 07:27, 26 July 2007 (UTC)

...Well, the normal distribution never shows up in nature, does it? But the law of large numbers implies that the normal distribution is a good approximation in many cases, including this one - at least assuming that the count is large. I suppose the argument gets complicated if you take into account dead-time in the counter and what not, but all the same, I think it's a fine example.--Niels Ø (noe) 08:58, 26 July 2007 (UTC)

There are physical effects that are the sum of many small errors, which are normally distributed, e.g., noise. These are better examples. MisterSheik 09:02, 26 July 2007 (UTC)

The probability distribution function of a normal random variable is mathematically scary, and so it seems to be an advanced concept. However, there are easier and better ways to describe random variables. See cumulant. The derivative of the cumulant generating function, g '(t), is a nice description of a random variable. The photon count is described by the Poisson distribution for which g '(t) = μ·e^t = μ+μ·t+μ·t^2/2+... If you truncate the series to just one term you find g '(t) ~ μ, which describes a constant. This approximation is appropriate for bright light where the quantum fluctuation of light intensity is neglected. Include one more term to get g '(t) ~ μ+μ·t. This describes a normal distribution having mean value = μ and variance = μ. This approximation is appropriate for dim light where the fluctuation of light intensity is important, but where the granularity of photons can be neglected. If photons are counted one by one, then these approximations are insufficient and the Poisson distribution is used. So the normal distribution is the two-term approximation of any random variable with well defined variance. Bo Jacoby 09:05, 26 July 2007 (UTC).
Cool. It would be good to expand the section titled "photon counting" so that this is clear. I'm not sure, but it seems that the normal distribution crops up here not because of the central limit theorem, but because, as you said "the normal distribution is the two-term approximation of any random variable with well defined variance." If that's a different reason, then it should be in a different paragraph, I think. Thanks for clarifying this by the way. MisterSheik 09:12, 26 July 2007 (UTC)
Thank you sir. The use of cumulant generating functions is not as common as it deserves, probably for historical reasons. The central limit theorem is sophisticated when expressed in the language of probability distribution functions, but straightforward when expressed in terms of cumulant generating functions. In the article Multiset#Cumulant generating function the central limit theorem is derived based on cumulant generating functions. A finite multiset of real numbers is an important special case of a random variable, and it is much easier to understand than the general case, so I prefer to study finite multisets before I proceed to general random variables. The important concept of a constant, g '(t) ~ μ, is described in Degenerate distribution. It is a random variable even if it is neither random nor variable. Bo Jacoby 13:23, 26 July 2007 (UTC).
I'm not sure I follow everything, but I'll give summarizing a shot: a) because of the central limit theorem, processes that are the sums of a lot of small errors are normally distributed, and b) because of the central limit theorem, processes that have well-defined variances are nearly normally distributed, which includes processes that are better-modeled by other distributions. My wording may be imprecise, but this is the gist, right? I think we should have two paragraphs. Noise is in the first paragraph, and photon counting in the second. MisterSheik 04:34, 27 July 2007 (UTC)
a) Yes. b) No. The random variable of playing heads or tails is represented by the multiset {0,1}. It is the simplest case of the Bernoulli distribution, with p=1/2. It has mean value 1/2 and standard deviation 1/2, and the variance, being the square of the standard deviation, is 1/4. The derivative of the cumulant generating function is g '(t) = 1/2+t/4+ terms of higher order. (Actually g '(t) = (e^{−t} + 1)^{−1}, see Cumulant#Cumulants of particular probability distributions). If you play it with hundredfold stake, the shape of the distribution function is unchanged and the derivative of the cumulant generating function becomes g '(t) = 50+2500·t+ terms of higher order. However, if you rather play the game a hundred times the distribution function becomes bell-shaped, and the derivative of the cumulant generating function becomes g '(t) = 50+25·t+ insignificant terms of higher order. Even if the distribution of heads-or-tails is not at all normal, it acts in the same way as a normal distribution when played many times, because only the low-order terms in the cumulant generating functions matter. Bo Jacoby 11:41, 27 July 2007 (UTC).

It does indeed follow from the central limit theorem that the Poisson distribution is approximately normal when its expected value is large. Michael Hardy 16:33, 27 July 2007 (UTC). Yes, I agree. Bo Jacoby 10:12, 28 July 2007 (UTC).
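How close the normal approximation gets for a large count can be seen by putting the two side by side. In this Python sketch the mean count of 400 is an arbitrary assumption, and `poisson_pmf` is a helper written for this illustration, not a library function.

```python
import math
from statistics import NormalDist

def poisson_pmf(k, mu):
    """Poisson probability of exactly k counts when the mean is mu."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

mu = 400.0  # assumed mean photon count, chosen only for illustration
approx = NormalDist(mu=mu, sigma=math.sqrt(mu))  # same mean, variance = mean

for k in (360, 380, 400, 420, 440):
    print(k, poisson_pmf(k, mu), approx.pdf(k))
```

The match is best near the mean and degrades slowly in the tails, where the skewness of the Poisson distribution shows up; for small mean counts the approximation is much poorer.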


I think there is an error in the section Properties. It claims that if X and Y are independent normal variables, then U=X+Y and V=X-Y are independent. However, though this holds for STANDARD normal X, Y, it does not hold generally.

\operatorname{Cov}(U,V) = \operatorname{Cov}(X+Y,\,X-Y) = \operatorname{Var}(X) - \operatorname{Var}(Y)

and hence if Var(X) differs from Var(Y) then U and V are not independent.

Could you please correct it? Based on this incorrect information, I got inconsistent results in my computations and it took me half a day to find the source of the error. —Preceding unsigned comment added by (talk) 08:18, August 26, 2007 (UTC)
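The dependence is easy to confirm by simulation, since the covariance of U = X + Y and V = X - Y equals Var(X) - Var(Y). The Python sketch below assumes zero means, standard deviations of 1 and 3, and a fixed seed, all chosen only for illustration; `sample_cov` is a small helper defined here.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable
sigma_x, sigma_y = 1.0, 3.0  # deliberately unequal standard deviations
trials = 200_000

def sample_cov(a, b):
    """Sample covariance of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)

xs = [random.gauss(0.0, sigma_x) for _ in range(trials)]
ys = [random.gauss(0.0, sigma_y) for _ in range(trials)]
us = [x + y for x, y in zip(xs, ys)]
vs = [x - y for x, y in zip(xs, ys)]

cov_uv = sample_cov(us, vs)
print(cov_uv, sigma_x**2 - sigma_y**2)  # the two values should be close
```

Setting `sigma_x` equal to `sigma_y` makes the covariance vanish, which for jointly normal U and V is enough for independence.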


There are critiques of the normal curve, not simply Stephen Jay Gould-type critiques (though they might be relevant to consider in terms of the social implications of uncertainty). In fact, mathematicians like Mandelbrot recognized flaws in the assumptions behind the normal curve; but provided no alternatives and believed despite its imperfections, the use of the bell curve could not be sacrificed. Can anyone intelligently comment further and provide discussion of these views on the page? --Kenneth M Burke 01:45, 8 September 2007 (UTC)

history error

According to O'Connor and Robertson (2004) De Moivre's 'The Doctrine of Chance' was published on 13 November 1733, not 1734 as the article says. The date 1733 is confirmed by Ross (2002, p209). Ross goes on to tell us that the curve is so common it was regarded as

"'normal' for a data set to follow this curve....Following the lead of the British Statistician Karl Pearson, people began refering to the curve simply as the normal curve." Ross (2002, p209).

Ross, S. (2002), An Introduction to Probability, 6th edition, Prentice Hall, New Jersey.

O'Connor and Robertson(2004), Abraham de Moivre, University of St Andrews, Available: —Preceding unsigned comment added by Ikenstein (talkcontribs) 02:01, 9 September 2007 (UTC)

The last line of general cdf formula tried to relate the general cdf to standard cdf, which was wrong, removed it. Vijayarya (talk) 13:10, 2 January 2009 (UTC)

Central Limit Theorem

A while ago I edited the first paragraph on the central limit theorem from:

The normal distribution has the very important property that under certain conditions, the distribution of a sum of a large number of independent variables is approximately normal. This is the central limit theorem.

to:

The normal distribution has the very important property that under certain conditions, the distribution of a sum of a large number of identically distributed independent variables is approximately normal. This is the central limit theorem.

I thought I was so clever. But recently I talked to a math grad student friend of mine and he said that it's not necessary that the independent variables be identically distributed so long as other conditions are met. (He didn't go into detail about what those other conditions were, and I must confess, I probably wouldn't have followed if he had.)

Now when I reread the paragraph, I think my addition of identically distributed is generalized by (and therefore made redundant by) under certain conditions, which are probably the very conditions my friend was thinking of.

Thoughts? Expert opinions? —Preceding unsigned comment added by (talk) 17:03, 13 September 2007 (UTC)

I agree with your second take on it. "under certain conditions" is general and covers the iid case, as well as others. I recommend having just that and linking to the CLT article. I should add that even independence is not necessary, although deviations from that cannot be too large. Here is an issue: there is more than one "central limit theorem", although one could argue the iid case is the canonical one. Baccyak4H (Yak!) 17:17, 13 September 2007 (UTC)

There are lots of different versions of central limit theorems. The one most frequently stated assumes the random variables are i.i.d. and have finite variance. Some versions allow them not to be identically distributed, but instead make weaker assumptions. Some get by with weaker assumptions than independence. I think this article can content itself with stating the most usual one, mentioning briefly that there are others, and linking to the main CLT article, which can treat those other versions at greater length. Michael Hardy 20:01, 13 September 2007 (UTC)

I've edited it to read Under certain conditions (such as being independent and identically-distributed), the sum of a large number of random variables is approximately normally distributed — this is the central limit theorem.; I think it's more concise and clear. Thoughts? ⇌Elektron 19:11, 14 September 2007 (UTC)

IQ tests

The paragraph

Sometimes, the difficulty and number of questions on an IQ test is selected in order to yield normal distributed results. Or else, the raw test scores are converted to IQ values by fitting them to the normal distribution. In either case, it is the deliberate result of test construction or score interpretation that leads to IQ scores being normally distributed for the majority of the population.

skillfully evades the question of whether IQ tests that yield normally distributed scores are always deliberately constructed to do so, or if a normal distribution of scores is to be expected for any reasonably broad test. The latter question was answered in the positive in the following paragraph:

Historically, though, intelligence tests were designed without any concern for producing a normal distribution, and scores came out approximately normally distributed anyway. American educational psychologist Arthur Jensen claims that any test that contains "a large number of items," "a wide range of item difficulties," "a variety of content or forms," and "items that have a significant correlation with the sum of all other scores" will inevitably produce a normal distribution.

However, the latter paragraph was commented out. Is it incorrect? AxelBoldt (talk) 02:57, 27 December 2007 (UTC)

I think the statement refers to the IQ tests rather than to the concept of IQ. Any test which is composed of a large number of independent subtests will approximately provide normally distributed results. Historically the IQ tests are important, however. Bo Jacoby (talk) 15:01, 27 December 2007 (UTC).

Carl Friedrich Gauß, not Gauss

Actually his surname is written Gauß (German sharp s), not Gauss. (I'll leave that to you, but most articles also contain the spelling in the mother tongue.) --Albedoshader (talk) 21:18, 30 April 2008 (UTC)

When writing in English, it is usually written "ss" rather than "ß", and sometimes when writing in German it's done that way (especially in Switzerland). Michael Hardy (talk) 00:06, 1 May 2008 (UTC)

Rows of Pascal's triangle?

Hi, I'm only in grade 11, so go easy on my maths, but I couldn't help noticing the other day that if you take a row of Pascal's triangle, and use each of the numbers as the y-value for consecutive points, it looks distinctly like a normal distribution. e.g. for the 6th row (x,y) (0,1) (1,6) (2,15) (3,20) (4,15) (5,6) (6,1) I find that the 40th row is pretty clear. Any comments? Cheers —Preceding unsigned comment added by (talk) 11:11, 9 May 2008 (UTC)

You are correct. This is a well-known result when approached from a slightly different context. The values in Pascal's triangle are the binomial coefficients and the probability masses in a binomial distribution which has parameter p=0.5 are proportional to these. You should find more in the binomial distribution article about how the distribution behaves as the "size" parameter N increases. Melcombe (talk) 12:37, 9 May 2008 (UTC)
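The observation about the 40th row can be made quantitative. This Python sketch scales the row by 2^40 to get the binomial probabilities with p = 0.5 and compares them with a normal density of the same mean and variance; the choice of which entries to print is arbitrary.

```python
import math
from statistics import NormalDist

n = 40
# Row n of Pascal's triangle, divided by 2^n, is the Binomial(n, 1/2) distribution.
binomial = [math.comb(n, k) / 2**n for k in range(n + 1)]
# Normal curve with the same mean n/2 and variance n/4.
normal = NormalDist(mu=n / 2, sigma=math.sqrt(n / 4))

for k in range(10, 31, 5):
    print(k, round(binomial[k], 5), round(normal.pdf(k), 5))

# Largest gap between the scaled row and the normal curve.
max_gap = max(abs(binomial[k] - normal.pdf(k)) for k in range(n + 1))
print(max_gap)
```

Repeating this with a larger row number shrinks the gap further, which is the behaviour the binomial distribution article describes.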

OK, thanks for that. —Preceding unsigned comment added by (talk) 23:29, 10 May 2008 (UTC)


it's worth including that N(-x)=1-N(x). It's implied by 2N(x)-1=N(x)-N(-x) but it would be good to state it explicitly. (talk) —Preceding comment was added at 16:12, 15 May 2008 (UTC)

I presume that by N you mean what is often called Φ, the cumulative distribution function. Michael Hardy (talk) 17:08, 15 May 2008 (UTC)
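Reading N as the standard normal cumulative distribution function Φ, the symmetry Φ(-x) = 1 - Φ(x) can be spot-checked in Python. The test points below are arbitrary.

```python
from statistics import NormalDist

phi = NormalDist().cdf  # standard normal cumulative distribution function

for x in (0.5, 1.0, 1.96):
    # The two printed values should coincide up to rounding error.
    print(x, phi(-x), 1 - phi(x))
```

The identity is just the statement that the density is symmetric about zero, so the mass below -x equals the mass above x.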

Error in the Entropy ?

I found on another website ( that the entropy of the normal distribution is not exactly what you propose : \ln\left(\sigma\sqrt{2\,\pi\,e}\right)

but is equal to :

\mbox{If } p(x) \sim \mathcal{N}(\mu,\sigma^2)

\mbox{Then } H(p) = \mathrm{E}_p[-\ln p] = \mathrm{E}_p \left[-\ln \left[\frac{1}{\sigma \sqrt{2\pi} } \exp \left(-\frac{(x-\mu)^2}{2\sigma ^2} \right)\right] \right] = \frac{1}{2}\left( \ln [2\pi\sigma^2] +1 \right)

Does somebody know how the first one is obtained? Thanks. (talk) 14:56, 7 August 2008 (UTC)

The expressions are equivalent using simple manipulations including \ln\left(\sqrt{e}\right)=\frac{1}{2}. Melcombe (talk) 15:46, 7 August 2008 (UTC)

Thanks! I just did not notice that. (talk) 20:43, 7 August 2008 (UTC)
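For anyone who wants to confirm the equivalence numerically as well as algebraically, here is a small Python sketch; the two function names and the σ values tried are illustrative assumptions.

```python
import math

def entropy_compact(sigma):
    """ln(sigma * sqrt(2*pi*e)), the form given in the article."""
    return math.log(sigma * math.sqrt(2 * math.pi * math.e))

def entropy_expanded(sigma):
    """(1/2) * (ln(2*pi*sigma^2) + 1), the form found on the other website."""
    return 0.5 * (math.log(2 * math.pi * sigma**2) + 1)

for sigma in (0.5, 1.0, 3.0):
    print(sigma, entropy_compact(sigma), entropy_expanded(sigma))
```

Both forms follow from splitting the logarithm of the product and using ln(sqrt(e)) = 1/2.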

What if a random variable's reciprocal is Normal Distribution?

In my research, sometimes a variable is not normally distributed, but its log or reciprocal is. For the log, we have the log-normal distribution, whose expectation and variance can be easily calculated; for example, E[exp(x)]=exp(E[x]+var[x]/2) if x is normally distributed. For the reciprocal, can we still get similarly good results? Can anyone calculate its expectation? Thanks. —Preceding unsigned comment added by Badou517 (talkcontribs) 17:56, 4 September 2008 (UTC)

extremal value?

given n i.i.d standard normal variates, their maximum should follow a Gumbel Distribution ('extremal value type I'). Does anybody know the exact parameters of the Gumbel? (there are 'scale' and 'location' parameters). I cannot seem to find this information on the net, and my statistical books are lacking... Shabbychef (talk) 19:33, 26 September 2008 (UTC)

sorry, it should possibly follow a reverse Weibull ('type III') distribution (?) again, are the parameters known? Shabbychef (talk) 19:55, 26 September 2008 (UTC)

Neither Gumbel nor "a reverse Weibull" is exactly correct. The Gumbel is the limiting distribution as n increases, but you can get a better approximation for any given n using a "reverse Weibull" ... this follows from the penultimate approximation results, which are moderately well-known if you are deep into theoretical extreme value analysis. For the Gumbel approximation there are standard theoretical results giving the asymptotic behaviour of the required standardising parameters, but the resulting approximations are very poor and misleading as to how well a Gumbel distribution would fit. Better approximations can be found by matching the theoretical quantiles at two (Gumbel) or three (reverse Weibull) selected percentage points, given that you know that the cdf of the "maximum of n" is the nth power of the cdf of the original values. Melcombe (talk) 10:59, 29 September 2008 (UTC)

given that ultimately I am looking for a 1-sided test, I think I will take your suggestion and use Bonferroni's method/nth power of the cdf. I guess I am a bit surprised that nature did not provide a nice distribution for the max of a sample of normals. thanks for the help. Shabbychef (talk) 18:24, 29 September 2008 (UTC)
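The "nth power of the cdf" approach and the Bonferroni bound can be compared directly. The sketch below assumes n iid standard normal variates and a one-sided test at level alpha; the function names are invented for this example.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal

def max_cdf(x, n):
    """cdf of the maximum of n iid N(0,1) variates: Phi(x)**n."""
    return STD.cdf(x) ** n

def exact_critical_value(alpha, n):
    """x with P(max > x) = alpha, solved from Phi(x)**n = 1 - alpha."""
    return STD.inv_cdf((1.0 - alpha) ** (1.0 / n))

def bonferroni_critical_value(alpha, n):
    """Conservative value from testing each variate at level alpha/n."""
    return STD.inv_cdf(1.0 - alpha / n)

alpha = 0.05
for n in (2, 10, 100):
    print(n, exact_critical_value(alpha, n), bonferroni_critical_value(alpha, n))
```

For small alpha the two values are close, with the Bonferroni value slightly larger, since (1 - alpha)^(1/n) is at most 1 - alpha/n.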

cdf table

It would be helpful if the article had a cdf table. I just wanted to look up a value and hoped to find it here. Of course there are some tables in the external links but I think a small table is encyclopedic enough to belong in the article. (talk) 19:31, 26 November 2008 (UTC)

See Wikipedia:Articles for deletion/T-table for discussion of similar tables for the Student t distribution. I think the real conclusion was that detailed tables should be elsewhere ("Transwiki to wikibooks"), although that was not what was implemented. Perhaps two small tables could be included here, one in each direction between probability and value, with perhaps six rows in each. Melcombe (talk) 10:04, 27 November 2008 (UTC)
Tables fall into WP:NOT. O18 (talk) 04:24, 28 November 2008 (UTC)
Melcombe's suggestion of a six row table is better than nothing but I'd prefer a bit more precision. That's what's in the article about the t distribution. WP:NOT talks about collections of bare tables, not a table of values of a particular distribution in an article about the distribution. The deletion discussion for T-table resolved in favor of merging the table to the article, which is what I'm suggesting for this article. (talk) 08:33, 29 November 2008 (UTC)
Thanks for pointing that out. Can you give me a link to the deletion discussion? I cannot find it. O18 (talk) 15:30, 29 November 2008 (UTC)
The deletion discussion is the one that Melcombe just linked to, Wikipedia:Articles for deletion/T-table. (talk) 20:51, 29 November 2008 (UTC)
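For anyone who just needs a value, the two six-row tables suggested above (one in each direction) can be generated on demand. This is only a sketch using Python's standard-library `NormalDist`; the particular rows are arbitrary choices, not ones proposed in this discussion.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal

def value_to_probability(values=(0.0, 0.5, 1.0, 1.5, 2.0, 3.0)):
    """Rows (x, Phi(x)) for a small cdf table."""
    return [(x, STD.cdf(x)) for x in values]

def probability_to_value(probs=(0.90, 0.95, 0.975, 0.99, 0.995, 0.999)):
    """Rows (p, x) with Phi(x) = p for a small quantile table."""
    return [(p, STD.inv_cdf(p)) for p in probs]

for x, p in value_to_probability():
    print(f"{x:4.1f}  {p:.5f}")
for p, x in probability_to_value():
    print(f"{p:.3f}  {x:.4f}")
```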

multidimension case

can someone add a section talking about higher dimension case? —Preceding unsigned comment added by (talk) 01:32, 16 December 2008 (UTC)

See Multivariate normal distribution. Is there anything more worth saying in this article? Melcombe (talk) 09:55, 16 December 2008 (UTC)

entropy calculation

In the entropy calculation on the top right table what is the variable "e"?

I cannot see a definition within the main text body. —Preceding unsigned comment added by (talk) 10:20, 27 December 2008 (UTC)

Maximum likelihood estimation of parameters

This section appears wrong to me; it conflicts with "In All Likelihood: Statistical Modeling and Inference Using Likelihood" by Yudi Pawitan, and with a website as well. In particular:

It is conventional to denote the "log-likelihood function", i.e., the logarithm of the likelihood function, by a lower-case \ell, and we have

\ell(\overline{X}_n,\sigma)=\log C-n\log\sigma-{\sum_{i=1}^n(X_i-\overline{X}_n)^2 \over 2\sigma^2},

Should be:

It is conventional to denote the "log-likelihood function", i.e., the logarithm of the likelihood function, by a lower-case \ell, and we have

\ell(\overline{X}_n,\sigma)=-{n \over 2}  \log 2\pi-{n \over 2}\log\sigma^2-{\sum_{i=1}^n(X_i-\overline{X}_n)^2 \over 2\sigma^2},

The constant *is* important when doing Akaike weight comparisons. I'm unsure of this constant, as Yudi neglects this and the only other source is the website mentioned above. (talk) 15:48, 30 April 2009 (UTC)


I just answered my own question, duh!, the relevant term is the same with simple algebraic manipulation. However, the constant is still nice to have. —Preceding unsigned comment added by (talkcontribs) 17:03, 30 April 2009 (UTC)
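The equivalence of the two forms can also be confirmed numerically. The sketch below assumes the article's constant is C = (2π)^(-n/2), which is what makes the two expressions agree; the sample values and function names are invented for illustration.

```python
import math
from statistics import NormalDist, fmean

def sum_sq(xs):
    """Sum of squared deviations from the sample mean."""
    xbar = fmean(xs)
    return sum((x - xbar) ** 2 for x in xs)

def loglik_with_C(xs, sigma):
    """log C - n log(sigma) - SS/(2 sigma^2), assuming C = (2*pi)**(-n/2)."""
    n = len(xs)
    log_C = -0.5 * n * math.log(2.0 * math.pi)
    return log_C - n * math.log(sigma) - sum_sq(xs) / (2.0 * sigma**2)

def loglik_explicit(xs, sigma):
    """-(n/2) log(2*pi) - (n/2) log(sigma^2) - SS/(2 sigma^2)."""
    n = len(xs)
    return (-0.5 * n * math.log(2.0 * math.pi)
            - 0.5 * n * math.log(sigma**2)
            - sum_sq(xs) / (2.0 * sigma**2))

def loglik_from_pdf(xs, sigma):
    """Direct sum of log densities with the mean set to the sample mean."""
    dist = NormalDist(fmean(xs), sigma)
    return sum(math.log(dist.pdf(x)) for x in xs)

xs = [1.2, 0.7, 2.9, 1.8, 2.1]   # made-up sample
sigma = 0.8
print(loglik_with_C(xs, sigma), loglik_explicit(xs, sigma), loglik_from_pdf(xs, sigma))
```

All three agree, which shows that the difference between the two forms is only whether the constant is written out.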

sum of absolute values of normals, and max of normals

Can anyone concisely add something to the article about the sums of absolute values of normal variables? The article covers sums of normal variables (the sum is normally distributed, with appropriate modified mean and standard deviation), and the sums of squared normal variables (the sum is chi-squared distributed). But what if we sum the absolute values?

Also, how is \max_i z_i distributed (for z_i iid normal, or multivariate normal with covariance matrix \Sigma)? And same question for \max_i |z_i| , e.g. \|z\|_\infty? Lavaka (talk) 18:52, 31 August 2009 (UTC)

The absolute value of a standard normal random variable (when μ≠0 the situation gets even trickier) follows the chi distribution with 1 degree of freedom, also known as the half-normal distribution. If you look at the characteristic function of the chi distribution, you’ll see it is expressible in terms of special functions only, which means it is highly unlikely that there is a closed-form expression for a sum of several independent “half-normal” random variables. However, when the number of summands n is large, you can use the central limit theorem to find the approximate limiting distribution:
 \sqrt{n} \left(\frac{1}{n}\sum_{j=1}^n |z_j| - \sqrt{\frac{2}{\pi}} \right)\ \xrightarrow{d}\ \mathcal{N}\Big(0,\ 1-\frac{2}{\pi}\Big)
As for \max_j z_j, the situation is not that hopeless: if the z_j are iid standard normal, then their maximum has pdf
 n\Phi(x)^{n-1}\varphi(x), \,
where \varphi and \Phi are the standard normal pdf and cdf (see the order statistic article). For a multivariate normal with covariance matrix Σ the notion of maximum is not defined. The distribution of \max_j |z_j| is given by exactly the same formula, with the half-normal pdf and cdf in place of \varphi and \Phi:
 nF(x)^{n-1}f(x), \,
f(x)=\sqrt{2\over\pi} e^{-x^2/2} \mathbf{1}_{x\geq0},\ \ F(x) = \int_0^x f(t)dt.
... stpasha » talk » 22:12, 31 August 2009 (UTC)
thanks Stpasha. Lavaka (talk) 22:46, 7 September 2009 (UTC)
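The half-normal quantities used in this thread can be sanity-checked by numerical integration. This sketch verifies that |z| has mean sqrt(2/π) ≈ 0.798 and variance 1 − 2/π, and that the density n F(x)^(n−1) f(x) of max |z_j| integrates to 1. The helper names and the midpoint-rule integrator are illustrative choices, and the half-normal cdf is written as erf(x/√2).

```python
import math

def half_normal_pdf(x):
    """f(x) = sqrt(2/pi) * exp(-x^2/2) for x >= 0, else 0."""
    return math.sqrt(2.0 / math.pi) * math.exp(-x * x / 2.0) if x >= 0 else 0.0

def half_normal_cdf(x):
    """F(x) = integral of f from 0 to x, which equals erf(x / sqrt(2))."""
    return math.erf(x / math.sqrt(2.0)) if x >= 0 else 0.0

def integrate(g, lo=0.0, hi=12.0, steps=20000):
    """Plain midpoint rule; the tail beyond hi is negligible here."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))

def max_abs_pdf(x, n):
    """Density of max |z_j| over n iid standard normals: n F^(n-1) f."""
    return n * half_normal_cdf(x) ** (n - 1) * half_normal_pdf(x)

mean = integrate(lambda x: x * half_normal_pdf(x))
second_moment = integrate(lambda x: x * x * half_normal_pdf(x))
variance = second_moment - mean**2
print(mean, math.sqrt(2.0 / math.pi))        # mean of |z|
print(variance, 1.0 - 2.0 / math.pi)         # variance of |z|
print(integrate(lambda x: max_abs_pdf(x, 5)))  # should be close to 1
```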

error in product distribution ?

Since this is a small portion of the article, I will quote the part of interest in my question:



  • So, my first question would be "what is the definition of z in the probability density definition?" My guess is that \scriptstyle Z = X Y, but it could be explicitly noted.
  • But even then, the link given does not lead to a page within the statistics portal, and we can't actually recover the formula of the pdf, as the link given provides the formula of the Bessel function (modified, 2nd kind) with a non-zero parameter. So, would it be possible to extend this section so that the pdf is actually given? —Preceding unsigned comment added by (talk) 06:15, 7 October 2009 (UTC)
Here z is just the argument of the probability function. It could have been denoted with any letter. The notation p(z) means “density of random variable XY evaluated at point XY=z”.
As for the “actual formula”: this is the actual formula. You can’t go any simpler than \pi^{-1}K_0(|z|). And K_0 here is indeed the “modified Bessel function of the second kind”. Check for example reference at The linked article does provide a reasonable description of what that function is, including its associated differential equation:
 x^2y'' + xy' - x^2y = 0 \,
and its graph too. stpasha » 20:38, 7 October 2009 (UTC)
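The formula can be checked without any special-function library. The sketch below is an illustration under stated assumptions, not the article's derivation: it evaluates K_0 through the integral representation K_0(x) = ∫_0^∞ exp(−x cosh t) dt and compares K_0(|z|)/π with a direct numerical evaluation of the density of XY, namely ∫ φ(x) φ(z/x) / |x| dx, for independent standard normal X and Y. Both function names are hypothetical, and the fixed integration limits assume |z| is not very close to 0.

```python
import math

def k0(x, steps=4000, upper=12.0):
    """K_0(x) = integral_0^inf exp(-x*cosh(t)) dt, trapezoid rule, for x > 0."""
    h = upper / steps
    total = 0.5 * (math.exp(-x) + math.exp(-x * math.cosh(upper)))
    for i in range(1, steps):
        total += math.exp(-x * math.cosh(i * h))
    return total * h

def product_pdf_direct(z, steps=20000, upper=12.0):
    """Density of XY at z != 0 from integral of phi(x)*phi(z/x)/|x| dx.

    By symmetry this is (1/pi) * integral_0^inf exp(-(x^2 + z^2/x^2)/2) / x dx,
    evaluated here with the midpoint rule.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += math.exp(-(x * x + (z / x) ** 2) / 2.0) / x
    return total * h / math.pi

for z in (0.5, 1.0, 2.0):
    print(z, k0(abs(z)) / math.pi, product_pdf_direct(z))
```

The two columns agree closely, and the density has a logarithmic singularity at z = 0, which is why the check avoids that point.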

Etymology question

I heard somewhere long ago that the normal distribution derived its name from some connection with the normal equation, the solution of which is the solution to the least-squares problem. The normal equation in turn derives its name from a "norm" on a linear space. Is that not the case?

The article says, 'The name “normal distribution” was coined independently by Peirce, Galton and Lexis around 1875; the term was derived from the fact that this distribution was seen as typical, common, normal.' However, there is no citation. Jive Dadson (talk) 00:32, 15 October 2009 (UTC)

There is a citation to that claim: Earliest Known Uses of Some of the Words of Mathematics (Normal). That page in turn provides citations to the works where the term “normal” first appeared, and even quotes that “it is fair to say that [Pearson's] consistent and exclusive use of this term in his epoch-making publications led to its adoption throughout the statistical community”. stpasha » 06:34, 15 October 2009 (UTC)