Talk:Variance: Difference between revisions


Revision as of 02:24, 16 March 2010

Mathematics B‑class Top‑priority

This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
B	This article has been rated as B-class on Wikipedia's content assessment scale.
Top	This article has been rated as Top-priority on the project's priority scale.

Statistics B‑class Top‑importance

This article is within the scope of WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
B	This article has been rated as B-class on Wikipedia's content assessment scale.
Top	This article has been rated as Top-importance on the importance scale.

Properties 4.1/4.2 are wrong / Verification of ALL properties required!

"In a finite population or sample, if the variable is extended with a number that is larger than all other numbers of the variable, then the variance will increase." This is not true. Counterexample: given the samples 2, 9, 10 we have a mean of 7 and a variance of 38/3 ≈ 12.7 (dividing by n) or 38/2 = 19 (dividing by n − 1). Now we consider another sample, 11, which is larger than the rest. The new mean (of 2, 9, 10, 11) is 8 and the new variance is 50/4 = 12.5 or 50/3 ≈ 16.7, respectively. The variance has decreased, because the new sample is closer to the mean than the average distance to the mean was before; furthermore, the mean changes. The same applies to a smaller value (e.g. add negative signs to all numbers). This shows that care must be taken when using such informal statements, although I really appreciate them, because they provide quick intuition. However, don't add such a sentence if you cannot prove it.

Kevin 89.53.3.68 21:55, 15 March 2007 (UTC)[reply]
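The counterexample above can be rechecked with exact arithmetic. This is only a minimal sketch; the helper name `variances` is made up for illustration and is not from the article.

```python
from fractions import Fraction

def variances(data):
    """Return (divide-by-n, divide-by-(n-1)) variances as exact fractions."""
    n = len(data)
    mean = Fraction(sum(data), n)
    ss = sum((x - mean) ** 2 for x in data)  # sum of squared deviations
    return ss / n, ss / (n - 1)

before = variances([2, 9, 10])
after = variances([2, 9, 10, 11])  # 11 exceeds every earlier value

print(before)  # (Fraction(38, 3), Fraction(19, 1))
print(after)   # (Fraction(25, 2), Fraction(50, 3))
```

Both versions of the variance go down after adding 11, in line with the figures 12.5 and 16.7 quoted in the comment.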

Well spotted. I've removed the offending points from the article. --Salix alba (talk) 00:21, 16 March 2007 (UTC)[reply]

Fine! I was quite surprised that there is a completely wrong statement in such a fundamental article as Variance (this is NOT a typing error but a wrong sentence, well meant of course, but obviously driven by intuition rather than by mathematical reasoning, as it should be). I have the bad feeling that if there is one error, it is not unlikely that there is another error in the other "informal" properties. Therefore I propose that someone should have a close look at all the informal properties (or better: provide a proof of each, e.g. in the "formal" section). Until each and every property is verified, we should keep the expert tag or a warning for the inexperienced reader. What do you think?

Kevin 89.53.27.38 10:12, 17 March 2007 (UTC)[reply]

I'm removing the tag. I reviewed the list, and every item is either clearly correct (the square of a real number is always positive – do we really need a reference for that?) or else a wiki-link is provided to an article (e.g., Chebyshev's inequality) that is presumably well enough referenced. I did see one computational error in the bit about Fahrenheit vs. Centigrade (extra factor of 10) … I'll fix that, too. Oh – on Kevin's original complaint, the sentence probably could have been fixed by saying "...if the variable is extended with a number that is significantly farther from the mean than all other numbers of the variable, then the variance will increase." Or is that too imprecise? DavidCBryant 14:34, 20 April 2007 (UTC)[reply]

Obviously, you are right and I was wrong. My apologies. I would like some kind of rephrasing like David suggests, but I would avoid the word "significant" because that has a very different meaning in statistics. JulesEllis 00:41, 11 May 2007 (UTC)[reply]

var(x,w) in Matlab

What does var(x,w) mean in MATLAB?

w is an optional argument. If w=1, you get the variance as defined by the second central moment, i.e.

var(x,1)={\frac {1}{N}}\sum _{i=1}^{N}(x_{i}-{\bar {x}})^{2}

but if w=0 you get the sample variance as discussed (somewhere) in the article, i.e.

var(x,0)={\frac {1}{N-1}}\sum _{i=1}^{N}(x_{i}-{\bar {x}})^{2}

In other words, var(1:6,1) gives the correct variance of a six sided die, but for a random sample e.g. A=ceil(6*rand(1,10)), the sample variance is obtained from var(A,0), which is equivalent to var(A).
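For readers without MATLAB, the same two divisors can be tried in Python. This is a rough analogue using the standard `statistics` module, not a statement about how MATLAB computes `var`; the variable names are illustrative.

```python
from fractions import Fraction
from statistics import pvariance, variance

die = [Fraction(k) for k in range(1, 7)]  # faces of a fair six-sided die

# Divide by N: the counterpart of var(x,1) as described above.
population_var = pvariance(die)

# Divide by N-1: the counterpart of var(x,0), i.e. the default var(x).
sample_var = variance(die)

print(population_var)  # 35/12
print(sample_var)      # 7/2
```

The value 35/12 ≈ 2.917 is the variance of a fair die; 7/2 is what the N-1 formula returns when the six faces are treated as a sample.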

Use of [] vs ()

I was wondering what rule was being used to determine whether E[X] or E(X) was used. Here are some rules I have come across:

My cosupervisor recommends using E[X] but Var(X).

"Probability and Statistics for Engineers and Scientists (5th ed)" (Walpole, Myers 1993) appears to use normally E(X), but uses E[f(X)] to distinguish outer brackets from inner brackets.

"Linear Regression Analysis" (Seber, 1984) uses E[X], var[X] and cov[X,Y]. Gmatht 10:02, 22 February 2006 (UTC)[reply]

Simplicity of the article

"looks as if any intelligent undergraduate would be able to follow it without much effort."

The required effort being to look somewhere other than Wikipedia for an entry level introduction to the concept.

And are high schoolers not entitled to a Wikipedia entry they can understand?

How about unintelligent undergraduates?

How about PhD Molecular Biologists (myself)? —The preceding unsigned comment was added by 194.171.7.39 (talk • contribs) 14:51, 9 February 2006 (UTC)

I'm sorry you find it unclear. Did you understand expected value and standard deviation? The page defines variance as "how far from the expected value [a random variable's] values typically are". I can't think of a more simplified explanation. Do you think an opening section, such as homomorphism's, would help? --Mgreenbe 16:07, 9 February 2006 (UTC)[reply]

I'm on a Biomedical undergraduate programme studying descriptive statistics for laboratory work and I have to say I can't make head nor tail of this article. --Iscariot 19:20, 7 November 2006 (UTC)[reply]

Use basic wikipedia then!

I'm an economics undergraduate and I already know what standard deviation is, but this article makes no sense at all.

Wow. This (and Mean, and Standard Deviation) are horrible. Undecipherable. Tons of variables being used without explanation. The grammar is awful. It reads like it was ripped from one of those 100 page math books from the turn of the 19th century--absolutely useless without the lectures. —The preceding unsigned comment was added by 68.100.26.175 (talk • contribs) 19:43, 5 April 2004 (UTC)

I was concerned when I read the words above, since I dislike bad grammar and overly complicated verbiage (see my recent editing of counterexample). But then I looked at the page, and it looks as if any intelligent undergraduate would be able to follow it without much effort. No "variables" are unexplained (and I have often been upset to find Wikipedia articles in which mathematical notation is unexplained; I'm a stickler about such things). But do go ahead and improve it if you can. And this article is quite light on the use of "variables"; I don't understand why you say "tons". (I have not looked at mean and standard deviation today.) Michael Hardy 18:06, 5 Apr 2004 (UTC)

i have to agree with the above complaints (and i understand the subject of the article completely). there are too many symbols meaning the same quantity or concept. just a heads up that i'm gonna read through this and do symbol consolidation (and i'll try to make the symbols compatible with other related articles) and perhaps a little exposition regarding the difference between population variance (divide by N) and sample variance (divide by N-1) and why there are these two slightly different formulae for ostensibly the same quantity. Rbj 02:33, 7 May 2006 (UTC)[reply]

Variance as analogous to moment of inertia!?!?

I removed this aside someone had put at the bottom, as it was just plain silly.

... and I've put it back, since it OBVIOUSLY makes sense. The mathematical analogy between the two is clear. Whether it is in some way fruitful is not clear to me at this moment; maybe someone can add something. Michael Hardy 22:48, 15 Apr 2005 (UTC)

I clarified the sentence.--Patrick 00:19, 16 Apr 2005 (UTC)

Funny, I tried to explain this relation for many years in one of my introductory statistics courses for psychology students, and it was always a futile attempt. That is, if you want to understand what variance is, then this relation helps no one. I have never had any benefit from this analogy. So on the one hand I agree that the analogy is fascinating, but on the other hand I'm pretty sure that it doesn't help anyone. Perhaps it makes sense on a page about moment of inertia, but not here. I don't vote for removal, but it should not receive more attention than it has now (which is little). The point is that the concept of variance is very widely used while moment of inertia is very specific. So if it has to be mentioned here, we can as well add a section about quantum uncertainty or investment risk, which are also operationalized as variance. Hmm, perhaps that would not be a bad idea indeed. JulesEllis 07:32, 14 January 2007 (UTC)[reply]

Investment risk? That's a practical application of variance in a particular circumstance, quite unlike the moment of inertia which has nothing to do with statistics. Fact is that these concepts are linked at a sufficiently fundamental level that statistics actually borrowed the term from mechanics. Which is why variance is also known as the second order central moment, and why we have moment generating functions. --Het 10:03, 18 January 2007 (UTC)[reply]

What is so fundamental about borrowing the term? It seems rather superficial to me. You say that statistics borrowed the term from mechanics, but this is only true for the term moment and not for the term variance. Furthermore, this borrowing pertains only to the word but not to the concept. I just checked the articles of De Moivre and Fisher and I do not see any mentioning of moment of inertia. Gauss called it mean error, so it seems he was not all too concerned about the similarity either. Well, I do not know everything, so perhaps you're right, but then please explain what is so fundamental about the relation with moment of inertia. Do some statisticians frequently use theorems or insights from mechanics when they reason about variance? That must be an area of statistics that totally escaped me. Otherwise I'm inclined to think that this is no more than a footnote to the history of the term moment. A relation that IMHO is fundamental is the similarity between the additivity of variances and the theorem of Pythagoras. JulesEllis 03:35, 10 February 2007 (UTC)[reply]

$\operatorname {var} (s^{2})$

Can anybody provide an estimator for the variance of the variance estimator? If I calculate the population variance from a sample I do not only want to know how good the result is on average (e.g. unbiased), but also how much the estimates may change for different samples.

Certainly, if the population is normally distributed, then the usual unbiased variance estimator S² satisfies

{(n-1)S^{2} \over \sigma ^{2}}\sim \chi _{n-1}^{2},

so the variance of that is the variance of the chi-square distribution with n − 1 degrees of freedom, i.e., it is 2(n − 1). Thus, the variance of S² itself is

{2\sigma ^{4} \over n-1}.

Michael Hardy 23:28, 20 January 2006 (UTC)[reply]
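The last step above uses var(cY) = c² var(Y). Here is a sketch of that step, assuming normally distributed data as in the comment; the symbol Y is introduced here only as shorthand for the chi-square variable.

```latex
Y = \frac{(n-1)S^{2}}{\sigma^{2}} \sim \chi_{n-1}^{2},
\qquad \operatorname{var}(Y) = 2(n-1)

\operatorname{var}(S^{2})
  = \operatorname{var}\!\left(\frac{\sigma^{2}}{n-1}\,Y\right)
  = \frac{\sigma^{4}}{(n-1)^{2}}\cdot 2(n-1)
  = \frac{2\sigma^{4}}{n-1}
```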

Michael, I would also like to have this in the article. I found a formula for this at http://mathworld.wolfram.com/SampleVarianceDistribution.html, with reference to Kenney and Keeping 1951, p. 164; Rose and Smith 2002, p. 264. I suggest that we add it to the article. JulesEllis 00:33, 27 August 2007 (UTC)[reply]

N vs N-1

"Intuitively, computing the variance by dividing by N instead of N − 1 underestimates the population variance. This is because we are using the sample mean as an estimate of the unknown population mean μ, and the raw counts of repeated elements in the sample instead of the unknown true probabilities."

I have several problems with these sentences. First, "intuitively" is an unfortunate word choice--math is not intuitive to many people. Second, it is not at all intuitive why dividing by N would underestimate the population variance. The explanation does not help: Why does it matter that we use the sample mean instead of the population mean? Furthermore, what precisely is meant by "raw counts of repeated elements"? Does this have something to do with sampling with replacement? At the very least, these terms need to be defined or linked, and the explanation should be clarified. If I understood what was being asserted here, I would do it--but I don't.

Danfrog 21:06, 22 March 2006 (UTC)[reply]

see my comment above. i will explain the reason for the difference between dividing by N and N-1. (for an intuitive "taste": think of a sample of 1 from an RV with a positive variance. dividing by N, the numerator will be zero but the denominator will be 1, yielding a calculated variance of 0 without any complaint. we want a 0/0 there to indicate that there is a limit issue and the variance is not likely to be zero. then consider a sample of just 2. now, assuming they're not equal, you can get an idea of what the variance is, and dividing by 1 rather than 2 will get you the right answer. it's not a proof, just an intuitive hint.) Rbj 02:42, 7 May 2006 (UTC)[reply]
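The underestimation discussed in this section can be made concrete by averaging both estimators over every possible sample. The sketch below assumes a small hypothetical population (the six faces of a die) sampled with replacement; the function name `mean_of_estimates` is invented for this illustration.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5, 6]  # hypothetical population, equally likely values
N = len(population)
mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N  # true variance

def mean_of_estimates(n, divisor):
    """Average a variance estimate over all N**n equally likely samples."""
    total = Fraction(0)
    for sample in product(population, repeat=n):
        xbar = Fraction(sum(sample), n)  # sample mean, not the true mean
        ss = sum((x - xbar) ** 2 for x in sample)
        total += ss / divisor
    return total / N ** n

n = 3
biased = mean_of_estimates(n, n)        # divide by n
unbiased = mean_of_estimates(n, n - 1)  # divide by n - 1

print(sigma2, biased, unbiased)  # 35/12 35/18 35/12
```

On average the divide-by-n version comes out at (n-1)/n times the true variance, because deviations are measured from the sample mean, which sits closer to the sample than the population mean does.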

alternate proof of unbiasedness

To me, the current alternate proof is nearly as long as the original. A quicker way is to write $S^{2}$ as

S^{2}={\frac {1}{n(n-1)}}\sum _{i=1}^{n}\sum _{j=i+1}^{n}\left(x_{i}-x_{j}\right)^{2}

which, while not computationally efficient, serves to illustrate that variance is decomposed into exactly ${n} \choose {2}$ pairs of distances. Then the desired result is a direct consequence of

\operatorname {E} \left[\left(x_{i}-x_{j}\right)^{2}\right]=2\sigma ^{2}

for $i\neq j$ and

\sum _{i=1}^{n}\sum _{j=i+1}^{n}1={{n} \choose {2}}

Btyner 19:24, 26 April 2006 (UTC)[reply]

Hmmm, now it occurs to me that independence is required for that scratch shortcut to work, so never mind. Btyner 03:49, 17 May 2006 (UTC)[reply]

I used it in the introductory text that I added. I also added a shorter, more abstract proof. I didn't delete the older proofs because I'm not sure that mine is more readable.JulesEllis 00:33, 15 January 2007 (UTC)[reply]
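For reference, the pairwise argument in this thread can be written out as follows. This sketch assumes the $x_{i}$ are independent with common mean $\mu$ and variance $\sigma ^{2}$ (the independence requirement noted above); $\mu$ is notation introduced here.

```latex
\operatorname{E}\left[(x_{i}-x_{j})^{2}\right]
  = \operatorname{E}\left[\bigl((x_{i}-\mu)-(x_{j}-\mu)\bigr)^{2}\right]
  = \sigma^{2} + \sigma^{2} - 2\operatorname{cov}(x_{i},x_{j})
  = 2\sigma^{2} \qquad (i \neq j)

\operatorname{E}\left[S^{2}\right]
  = \frac{1}{n(n-1)} \sum_{i=1}^{n}\sum_{j=i+1}^{n}
    \operatorname{E}\left[(x_{i}-x_{j})^{2}\right]
  = \frac{1}{n(n-1)} \cdot \frac{n(n-1)}{2} \cdot 2\sigma^{2}
  = \sigma^{2}
```

If the observations are correlated, the covariance term does not vanish and the shortcut no longer gives $\sigma ^{2}$.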

R. A. Fisher developed it?

If so, as suggested by another article, should not it be mentioned in this one? I suggest only a short paragraph like this:

The term was first used by R. A. Fisher in his 1918 paper "The Correlation Between Relatives on the Supposition of Mendelian Inheritance", where he first showed that Mendelian inheritance was compatible with continuous variation of characters, contrary to what had previously seemed to be the case.

--Extremophile 06:10, 1 May 2006 (UTC)[reply]

Alternative formula

There is another formula that is slightly easier to calculate if the data is in a table or if the mean is an awkward number, and that is:

\sigma ^{2}={\frac {\sum _{i=1}^{n}x_{i}^{2}}{n}}-{\bar {x}}^{2}

But I don't know where to put it into the article. Any suggestions? x42bn6 Talk 07:22, 27 May 2006 (UTC)[reply]
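The formula above follows from expanding the square in the divide-by-n definition. A short derivation, using the same symbols:

```latex
\frac{1}{n}\sum_{i=1}^{n}(x_{i}-\bar{x})^{2}
  = \frac{1}{n}\sum_{i=1}^{n}x_{i}^{2}
    - 2\bar{x}\cdot\frac{1}{n}\sum_{i=1}^{n}x_{i}
    + \bar{x}^{2}
  = \frac{\sum_{i=1}^{n}x_{i}^{2}}{n} - 2\bar{x}^{2} + \bar{x}^{2}
  = \frac{\sum_{i=1}^{n}x_{i}^{2}}{n} - \bar{x}^{2}
```

It is convenient for hand calculation, but in floating-point arithmetic it can lose precision when the mean is large relative to the spread, since two nearly equal numbers are subtracted.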

True variance

There is a pretty substantial article at true variance which overlaps with much of this one. I really don't think we need two articles about this, but merging would be quite a task. Btyner 03:21, 31 May 2006 (UTC)[reply]

D'oh, now I see the whole discussion at Talk:True variance. What a shame ... Btyner 03:25, 31 May 2006 (UTC)[reply]

Finite population correction

Am I right in thinking that there is a different formula for an unbiased estimate of the population variance when you are sampling from a finite population? Should this be included here?

Technical template

I'm sorry, but I'm currently taking a Statistics class and most of this page is gobbledygook to me. My book says that "intuition" is right and using n instead of n-1 for the sample variance does indeed underestimate the population variance, and the article's explanation of why this is not so makes no sense to me. Please, let's use real, normal English explanations alongside the technical expositions. I came here trying to understand why n-1 makes the sample variance an unbiased estimator of the population variance; I leave knowing no more than when I came. -- Calion | Talk 03:54, 28 November 2006 (UTC)[reply]

Common sense introduction

I have added a more common-sense introduction to the definition. The definitions in terms of expectations were mathematically correct, but it is obvious to me (a statistics teacher at a university for more than two decades) that they don't make sense to anyone who has not taken a course in mathematical statistics. I think that there is more to statistics than just that, as the concepts were used long before axiomatic probability theory developed. A strictly mathematical definition is fine for such technical concepts as the gamma distribution, which are most likely to be used by specialists, but not for such a basic concept as variance. So I have written a long introduction in an attempt to explain it really well without scaring people away with formulas. At the same time it also explains the n-1 versus n problem. JulesEllis 07:22, 14 January 2007 (UTC)JulesEllis[reply]

I'm what you might call an "amateur mathematician" (I'm into computer science and category theory) and statistics has not been my forte. I just wanted to say that this conversational intro by JulesEllis struck me as a very well-written intuitive introduction to some of the basic notions of variance. As someone who never paid much attention to statistics, I found this intro very clearly written and easy to understand, as well as providing an excellent motivation for the various ways to compute the variance. —Preceding unsigned comment added by 84.57.12.20 (talk) 15:55, 4 October 2007 (UTC)[reply]

I like the idea of what you're doing and I feel bad for deleting much of it. I think you could make the discussion much more concise, however...what you're writing is more appropriate for a textbook or something. The text is still there though...you may want to salvage a lot of it and put it back in the article in some form or another. I think the article is a bit too dry and technical. On the other hand I think we need to keep it short. I think that if you could make your explanation concise it would be very valuable to put under the "Definition" section, and not in a separate heading. Cazort 22:13, 4 October 2007 (UTC)[reply]

Properties

I have also tried to write a more introductory text for the properties. However, I don't know yet how to get square signs in it, and I didn't fill in the scale parameters of the Fahrenheit - Celsius transformation. This should be done. Also, I'm not sure how to make the integration with the more formal part of the text. I think that the introductory texts should pertain mostly to finite populations, even though that is a compromise to mathematical generality. Mathematical generality is desirable in the later parts of the article though.

JulesEllis 07:22, 14 January 2007 (UTC)Jules Ellis[reply]

Done. I also added some more general theory about the variance of sums. I also added the variance decomposition formula, as it is essential for analysis of variance. I consider this specialist information.

JulesEllis 20:17, 14 January 2007 (UTC)[reply]
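For the Fahrenheit/Celsius property mentioned above, the scale factor follows from var(aX + b) = a² var(X) with a = 9/5 and b = 32. A minimal check, assuming some made-up temperature readings:

```python
from fractions import Fraction
from statistics import pvariance

celsius = [Fraction(t) for t in (10, 15, 20, 30)]  # hypothetical readings
fahrenheit = [Fraction(9, 5) * c + 32 for c in celsius]

var_c = pvariance(celsius)
var_f = pvariance(fahrenheit)

# The shift by 32 drops out; only the squared scale factor (9/5)**2 remains.
print(var_f / var_c)  # 81/25
```

So a variance measured in degrees Fahrenheit is 3.24 times the same variance measured in degrees Celsius, and the standard deviation is 1.8 times as large.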

The text reads "The variance of a finite sum of uncorrelated random variables is equal to the sum of their variances", but shouldn't it read "The variance of a finite sum of positive uncorrelated random variables is equal to the sum of their variances"? Otherwise this statement would seem to contradict the law of large numbers. Vectro 18:18, 2 July 2007 (UTC)[reply]
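The quoted statement can be checked exactly on a small case. The sketch below assumes two independent fair dice (a hypothetical example, not taken from the article) and uses a throwaway helper `var`.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

def var(values):
    """Variance of a list of equally likely outcomes (divide by the count)."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((v - mean) ** 2 for v in values) / n

outcomes = list(product(faces, repeat=2))  # 36 equally likely (x, y) pairs

var_x = var([x for x, _ in outcomes])
var_sum = var([x + y for x, y in outcomes])
var_diff = var([x - y for x, y in outcomes])            # takes negative values
var_mean = var([Fraction(x + y, 2) for x, y in outcomes])

print(var_x, var_sum, var_diff, var_mean)  # 35/12 35/6 35/6 35/24
```

The difference X - Y takes negative values and its variance is still the sum of the two variances, so positivity is not needed. The variance of the sum grows with the number of terms, while the variance of the mean shrinks, and it is the mean that the law of large numbers is about.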

Suggestion for the definition section

I want to make these changes, but I'm not sure that others agree. Please comment:

1. Add to the definition section the two definitions of the sample variance (move some stuff of the other section to this place). For a reader who is unfamiliar with the topic it must be confusing that we give one definition and then talk about three different things.

2. Make the notation more consistent, and always use $s^{2}$ for the unbiased estimate and V for the other version. ( ${\hat {\sigma }}^{2}$ would be possible too, but I think that V reads easier as it respects the convention to use Greek only for parameters.)

3. Reserve the name sample variance for $s^{2}$ . I believe that this is the default meaning of the term, and that texts that use the term differently are rare and usually say explicitly that they use the term in another meaning.

4. Reserve another term for V. Any suggestions? Perhaps uncorrected sample variance? I believe that there is no name that is generally agreed upon, so it should be made clear that the term is only used locally in this article for ease of presentation.

5. Add that V is the ML estimator of the variance of a normal distribution.

6. Add that the asymptotic distribution of V and $s^{2}$ is a normal distribution as a consequence of the central limit theorem; specify its variance.

JulesEllis 19:53, 15 January 2007 (UTC)[reply]

Request for citation or clarification

Hi, is there any citation or corroboration for this statement?

, and the standard deviation that is obtained from the unbiased n-1 version of the variance is not unbiased.

71.198.188.12 07:44, 23 February 2007 (UTC)[reply]

I can verify this. However the bias is small, gets relatively smaller as n increases, and there is a constant known as $c_{4}$ in some circles which can be used to construct an unbiased estimator of $\sigma$ when the errors are normal. It's not too hard to prove. Btyner 12:18, 24 March 2007 (UTC)[reply]

This was really beyond the scope of this article so I've made a new one which for now lives at Unbiased estimation of standard deviation. Btyner 18:32, 24 March 2007 (UTC)[reply]

Looks good, much needed. A slight preference for calling unbiased estimation of variance as opposed to sd. --Salix alba (talk) 22:48, 24 March 2007 (UTC)[reply]

Why? The whole point of making it was to show how σ could be unbiasedly estimated. If we moved the stuff about unbiased estimation of σ² to that article, then we could call it "unbiased estimation of variance and standard deviation". Btyner 15:10, 25 March 2007 (UTC)[reply]

A simple explanation is that the standard deviation is a nonlinear function (square root) of the variance, so the property of being unbiased does not carry over. This is because the operations of taking the mean and applying a function do not in general commute unless that function is linear. By definition an estimate is unbiased when its mean equals the quantity being estimated.

About the variance itself, please see also the new introduction to Estimation of covariance matrices that I have just added. Jmath666 22:31, 25 March 2007 (UTC)[reply]
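To give a feel for the size of the bias discussed in this thread, here is a sketch that evaluates the constant $c_{4}$ for a few sample sizes. It assumes the commonly quoted closed form c4(n) = sqrt(2/(n-1)) Γ(n/2) / Γ((n-1)/2) for i.i.d. normal data; that formula is not stated on this page, so treat it as an assumption of the example.

```python
import math

def c4(n):
    """E[s] / sigma for n i.i.d. normal observations (s uses the n-1 divisor)."""
    # lgamma avoids overflow of the gamma function for large n
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2 / (n - 1)) * math.exp(log_ratio)

for n in (2, 5, 10, 100):
    print(n, round(c4(n), 4))
```

Under that assumption s/c4(n) is an unbiased estimator of σ. The values are below 1 and approach 1 as n grows, which matches the remark that the bias is small and shrinks with n.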

squares vs. absolute values

The article now states:

The squaring is done to get the negative signs of some differences away. In principle, you could also do that by taking the absolute values (i.e., just dropping the signs), but squaring is more convenient for mathematicians.

As I understand it, squaring is not an arbitrary choice as the article implies. Variances are additive (à la property #8) but “absolute value variances” are not. Can someone more mathematically knowledgeable confirm this? If so, it should be pointed out in the text. --75.15.152.144 08:31, 24 February 2007 (UTC)[reply]

You are right. That sort of thing is supposed to be covered by "more convenient for mathematicians", but in actuality that is a pretty meaningless phrase. The introduction is in my opinion quite terrible and will not help a non-math-savvy reader to understand any better. There are also serious errors in there: "The multiplication by 0.5 can be justified because if you consider all pairs, then you see each difference twice (namely as number 1 - number 2 and as number 2 - number 1)." This is numerology and wrong. The fact that 2 squares are formed is already compensated by dividing by 2 when the average square is formed. The reason for the 0.5 is that it is what matches the mathematical definition of variance (which could easily be altered by a constant without major harm). Also this way of computing the variance requires you to include the zero differences between a value and itself. This is not at all intuitive and way harder to grasp than measuring the difference between each value and the mean. The later "explanation" of the n-1 factor is also pure numerology and has nothing whatever to do with the real reason the n-1 factor is used. I propose to replace the introduction section entirely but will wait for objections. --McKay 02:51, 26 February 2007 (UTC)[reply]

Of course it is not arbitrary. When I wrote the sentence "because it is more convenient for mathematicians" I meant the additivity property, but I didn't want to become too technical at that stage. Feel free to replace it by something more clear. But if you refer to additivity, please explain why that would be a reason to prefer this definition. I think the reason is that it is mathematically convenient. That additivity is a nice property might be obvious for mathematicians, it is not obvious for other people. Someone else added the phrase about differentiability. That is convenient too, indeed. With respect to McKay's comment: I agree with your point about the factor 0.5. I wrote the original version of that intro, so I feel free to remove it directly after this post. I do not agree with you about n - 1. I know that the reason is the unbiasedness, but most lay persons will not understand that. This is so because they are often unable to imagine much more than the data at hand, let alone a (for them) fairly abstract concept as the sampling distribution of the mean. Understanding unbiasedness requires understanding the sampling distribution, however, because it is basically a statement about the expected value of that distribution. Furthermore, it can be argued that the unbiasedness is actually a poor reason to divide by n-1, because the ensuing standard deviation will still be biased and it will generally not entail the maximum likelihood estimate for the variance. The zeros on the diagonal provide just another way to understand why the division by n-1 instead of n yields a reasonable measure. You use the word "numerology", but in fact it provides an exact definition of the variance in a finite space with equal probabilities: The variance is the mean squared difference of distinct pairs, divided by 2. Where's the numerology in that? 
However, if the article contains a formulation that suggests that this is the only reason to divide by n-1, then I agree that this formulation should be changed. You also say that this way of introducing the variance is harder to grasp than measuring the difference between each value and the mean. I disagree. When you ask people without statistical training to assess the variation in a row of numbers, they will start looking at pairwise differences, and not first compute the mean. JulesEllis 01:18, 11 May 2007 (UTC)[reply]

I agree that pairwise differences have intuitive appeal. I also like the bit about the diagonals as an explanation for the "n - 1", which I always find baffling whenever it pops up in statistics. Instead of "convenient for mathematicians," how about "has some nice mathematical properties" or "has some convenient mathematical properties"? --Coppertwig 19:44, 15 June 2007 (UTC)[reply]

Variance...not arbitrary? I think alas that it is arbitrary. There are some theorems that you can prove about optimality of variance under certain conditions but many of these theorems either use very strong assumptions of total normality (and fail under more reasonable assumptions), or they use circular reasoning, showing that variance makes sense when you are measuring your loss by something like mean squared error. I think that we need to strictly monitor this article so that we make sure that we treat variance as it is--as one way of measuring variation from the mean...a way that has certain enticing mathematical properties, and is widely used...but is not the only way of doing things and does not always have compelling physical, mathematical, or philosophical reasons for its use. Cazort 22:07, 4 October 2007 (UTC)[reply]
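The additivity point raised at the start of this thread can be illustrated on a tiny example. The sketch assumes two independent fair coins (a hypothetical case) and compares the variance with the mean absolute deviation; the helper `spread` is made up for this purpose.

```python
from fractions import Fraction
from itertools import product

def spread(values):
    """Return (variance, mean absolute deviation) of equally likely outcomes."""
    n = len(values)
    mean = Fraction(sum(values), n)
    variance = sum((v - mean) ** 2 for v in values) / n
    mad = sum(abs(v - mean) for v in values) / n
    return variance, mad

coin = [0, 1]  # one fair coin, hypothetical example
pairs = list(product(coin, repeat=2))  # two independent fair coins

var_one, mad_one = spread(coin)
var_two, mad_two = spread([x + y for x, y in pairs])

print(var_one + var_one, var_two)  # 1/2 1/2
print(mad_one + mad_one, mad_two)  # 1 1/2
```

The variances add up exactly, whereas the mean absolute deviation of the sum (1/2) is not the sum of the individual deviations (1). This shows only that the absolute-value measure lacks this one property; it does not settle whether squaring is the "right" choice in general.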

n?

"That is, the variance of the mean decreases with n." - What's n? Fresheneesz 07:18, 21 March 2007 (UTC)[reply]

n means sample size (how many subjects). Herenthere (Talk) 00:44, 25 March 2007 (UTC)[reply]

clumsy

A lot of this looks a bit clumsily written. I'll be back. Michael Hardy 00:43, 25 March 2007 (UTC)[reply]

I agree entirely. The language is all over the place, and generally too conversational in tone. DRE 20:55, 27 July 2007 (UTC)[reply]

I totally agree. Cazort 22:04, 4 October 2007 (UTC)[reply]

Undefined variables

Every variable used in an expression should be defined. Consider the following extract:

If the random variable is discrete, this is the same as:

\sum _{i}(x_{i}-\mu )^{2}p_{i}\,.

So, what's $p_{i}$ ? There are other examples of such "magic variables" pointed out above. EmmetCaulfield 12:38, 9 May 2007 (UTC)[reply]
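In the quoted expression, $p_{i}$ is presumably the probability that the variable takes the value $x_{i}$, and $\mu$ is the mean $\sum _{i}x_{i}p_{i}$. Under that reading, the sum can be computed as in this sketch (the function name and the example distributions are illustrative only):

```python
from fractions import Fraction

def discrete_variance(values, probs):
    """Variance of a discrete variable: sum of (x_i - mu)^2 * p_i."""
    assert sum(probs) == 1, "probabilities must sum to 1"
    mu = sum(x * p for x, p in zip(values, probs))
    return sum((x - mu) ** 2 * p for x, p in zip(values, probs))

# Fair die: every p_i equals 1/6.
fair = discrete_variance(range(1, 7), [Fraction(1, 6)] * 6)

# Hypothetical biased coin: P(X=1) = 1/4, P(X=0) = 3/4.
coin = discrete_variance([0, 1], [Fraction(3, 4), Fraction(1, 4)])

print(fair, coin)  # 35/12 3/16
```

When all n values are equally likely, each $p_{i}$ is 1/n and the expression reduces to the familiar divide-by-n formula.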

Numbering system

The numbering in the Properties, Introduction section is supposed to match up with the numbering in Properties, formal. I think 10 matches but I suspect 9 doesn't. "6 and 8 jointly imply that" is suspicious, since 8 is what is being proven maybe? Actually, I'm suspicious about all the section numbers, because who knows when someone might have inserted a section, changing all the numbers. Besides, 8a 8b and 8c are mentioned but there is only a simple section 8 above. How about naming each section, instead (or in addition to numbering them) to help keep things straight. --Coppertwig 19:49, 15 June 2007 (UTC)[reply]

I think this needs to be changed: "Properties 6 and 8 jointly imply that..." in 8c in the formal section. First of all, I think it would be more specific and clearer to say "6.3 and 8b". Secondly, referring to 8b is confusing since usually the numbers refer to sections in the introduction, while here another section in the formal section needs to be referred to, suggesting the need to revamp the numbering system. Thirdly, I don't think any of the preceding discussion establishes that cov(aX,bY) = ab cov(X,Y), which is needed here. Suggestions on how to change it are welcome. --Coppertwig 19:15, 16 June 2007 (UTC)[reply]
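The identity cov(aX,bY) = ab cov(X,Y) mentioned above follows directly from the definition of covariance and the linearity of expectation. A short derivation for constants a and b:

```latex
\operatorname{cov}(aX,bY)
  = \operatorname{E}\left[(aX-\operatorname{E}[aX])(bY-\operatorname{E}[bY])\right]
  = \operatorname{E}\left[a(X-\operatorname{E}[X])\,b(Y-\operatorname{E}[Y])\right]
  = ab\,\operatorname{E}\left[(X-\operatorname{E}[X])(Y-\operatorname{E}[Y])\right]
  = ab\,\operatorname{cov}(X,Y)
```

Setting Y = X and b = a gives var(aX) = a² var(X) as a special case.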

Over two years later and these invalid references are still there... I'll try to fix some of them. —Keenan Pepper 19:41, 31 January 2010 (UTC)[reply]

Fotiable?

"A more understandable measure is the square root of the variance, called the standard deviation. As its name implies it gives in a standard fotiable for all real numbers..."

There are no definitions of the word fotiable available from Google, and Wikipedia itself doesn't have an article or definition for it. This word should either be defined in the article, or be replaced with a word that can be defined.

Also, the word in does not belong in that sentence. 130.195.5.7 21:23, 18 July 2007 (UTC)[reply]

Wrong/ambiguous formula -- divide by n-1

Shouldn't the variance be the sum of the squares divided by n-1? This page simply defines it as the sum of the squares of the differences. There is a page on Wolfram MathWorld that gives two different formulas, the "population variance", which does not divide by n-1, and the "sample variance", which does divide by n-1. This needs to be clarified in this article because only one method is presented. --Wykypydya 18:53, 29 July 2007 (UTC)[reply]

Sigh.... I don't know why I didn't immediately see a hundred copies of this question above this one on this talk page, since it seems as if we've gone through this that many times.

This page does NOT "simply define it as the sum of squares of the differences"; rather, it multiplies each difference by the corresponding probability. In case the probability distribution is uniform on a set of n points, then that probability is 1/n.

It is only when one is estimating a population variance by using a sample variance that one divides by n − 1. And the reasons for doing that are highly debatable. But that gives an unbiased estimate. Michael Hardy 19:07, 29 July 2007 (UTC)[reply]
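
To make the distinction concrete, here is a minimal sketch in Python. The three data values are made up for illustration; `statistics.pvariance` and `statistics.variance` are the standard-library functions for the two quantities.

```python
import statistics

data = [2, 9, 10]  # illustrative values only

n = len(data)
mean = sum(data) / n
ss = sum((x - mean) ** 2 for x in data)  # sum of squared deviations from the mean

# Variance of the data viewed as a whole population:
# each point carries probability 1/n.
population_var = ss / n

# Unbiased estimate of the variance of a larger population
# from which the data are a sample: divide by n - 1 instead.
sample_var = ss / (n - 1)

# The standard library implements both conventions.
assert abs(population_var - statistics.pvariance(data)) < 1e-12
assert abs(sample_var - statistics.variance(data)) < 1e-12
```

The two numbers differ only in the divisor, so the gap between them shrinks as n grows.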

Unit Variance

Does not explain what Unit Variance actually is. --81.86.122.174 15:45, 2 August 2007 (UTC)[reply]

"Elementary description"

Could somebody please remove this section. I tried but my change was undone. I'm sorry, but the text is simply horrific. The definition of variance is very simple - use it. Here are some examples from that section:

"compute the difference between each possible pair of numbers; square the differences; compute the mean of these squares; divide this by 2. The resulting value is the variance."

I'm not saying this isn't correct, but why should this strange n^2 algorithm be presented as the first way to calculate variance?

"In principle, this can be done by taking the absolute values (i.e., just dropping the signs), but squaring is more convenient for mathematicians, as the squared function is differentiable for all real numbers, and the absolute value is non-differentiable at zero."

Why not to the power of four then, or something like that? Variance is defined as it is - end of story. The way it is defined gives it some interesting properties. For example, look up Chebyshev's inequality.

"So it could be argued that the diagonal should not be counted when computing the mean of the squares. "

AARGH!!!

"is done, then the variance would be 0.5 × (0 + 1 + ... + 1 + 0) /12 = 1.667"

No. The variance is the variance and the unbiased estimator for variance is another thing.

"generalized into a third definition of the variance:"

This isn't the definition of variance.

"The variance according to the definitions 3 or 4 is sometimes called the 'unbiased estimate'."

It is the unbiased estimate, which is not "another definition" of variance. This only confuses the reader.

130.188.8.12 10:45, 17 August 2007 (UTC)[reply]
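
For anyone wanting to check the quoted pairwise recipe, here is a minimal sketch in Python. The data set 1, 2, 3, 4 is an assumption chosen for illustration, and the function names are hypothetical. Counting all ordered pairs (diagonal included) gives the same value as the usual mean-based variance; leaving the diagonal out gives the n - 1 version.

```python
from fractions import Fraction
from itertools import product


def pairwise_variance(values, include_diagonal=True):
    """Half the mean squared difference over ordered pairs of values."""
    n = len(values)
    total = sum(Fraction(a - b) ** 2 for a, b in product(values, repeat=2))
    # The n diagonal pairs (a value paired with itself) contribute 0 to the
    # total, so excluding them only changes the number of pairs divided by.
    pairs = n * n if include_diagonal else n * (n - 1)
    return total / pairs / 2


def mean_based_variance(values, ddof=0):
    """Sum of squared deviations from the mean, divided by n - ddof."""
    n = len(values)
    m = Fraction(sum(values), n)
    return sum((x - m) ** 2 for x in values) / (n - ddof)


data = [1, 2, 3, 4]  # assumed example data

with_diagonal = pairwise_variance(data)
without_diagonal = pairwise_variance(data, include_diagonal=False)

assert with_diagonal == mean_based_variance(data)               # divisor n
assert without_diagonal == mean_based_variance(data, ddof=1)    # divisor n - 1
```

With this data the first value is 5/4 = 1.25 and the second is 5/3, about 1.667, the same figure as in the quoted text.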

After the introduction the article gives the definition that you want. So why should an introduction for lay people be deleted? I would agree with you if the article was written exclusively for mathematicians and statisticians. This isn't the case. The concept of variance is sufficiently important to try to give lay people an intuition of what it is. Note that such lay people might not even understand the simplest math formula, like x + 1 = 2. For previous versions of the article, which were written as you suggest, many people complained that it was unreadable, and the article was considered too technical. JulesEllis 23:35, 26 August 2007 (UTC)[reply]

If one cannot understand that simple equation, how can one understand the following sentences?

"It can be defined in several ways such as the following algorithm: compute the difference between each possible pair of numbers; square the differences; compute the mean of these squares; divide this by 2. The resulting value is the variance."

130.233.243.229 09:33, 6 September 2007 (UTC)[reply]

Because the latter explanation does not contain a formula. 90% of the people stop reading when they see a formula, simply because they expect that they will never understand it. Obviously, mathematicians are not among these 90%. See the many complaints above about the readability of the paper. JulesEllis 04:02, 29 October 2007 (UTC)[reply]

I would like to vote to have this section seriously re-done or removed. I may try my hand at editing it but in my opinion, it would be good to either delete it or move it lower in the article. What do others think? It seems to me that this section isn't defining the "elementary" way of looking at variance but is merely describing one equivalent way of looking at it. Personally, I find it an interesting way of looking at things...but the style of exposition doesn't seem appropriate to the rest of mathematical articles on wikipedia. Cazort 21:37, 4 October 2007 (UTC)[reply]

Before the section "Elementary description" was added, many non-mathematicians complained that they didn't understand a word of this article. See many comments above. Now, the section has been removed by someone and I have no doubt that it is again unreadable for anyone except mathematicians and statisticians. Frankly, I think the person who removed it did a disservice to all who want to know something about variance without having much math education. I would have no problem with this if it was a specialized math topic that most likely will be visited by mathematicians only. However, this is not the case. JulesEllis 03:52, 29 October 2007 (UTC)[reply]

Proof of the effect of a linear transformation on the Variance

Could the following proof perhaps be included in the section on Formal properties, right below "effect of a linear....."

$Var(aX+b)=E([aX+b-aE(X)-b]^{2})$
$=E(a^{2}[X-E(X)]^{2})$
$=a^{2}Var(X)$

I would do this myself, but I am not too confident about the formatting conventions, and don't want to disrupt anything. Thanks! 62.214.253.142 18:35, 7 September 2007 (UTC)[reply]
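
As a sanity check on the identity, here is a small numerical sketch in Python. The data and the constants a and b are arbitrary illustrative choices, and `pvar` is a hypothetical helper; exact fractions are used to avoid rounding issues.

```python
from fractions import Fraction


def pvar(values):
    """Population variance: mean squared deviation from the mean."""
    n = len(values)
    m = sum(values, Fraction(0)) / n
    return sum((x - m) ** 2 for x in values) / n


data = [Fraction(v) for v in (2, 9, 10)]  # illustrative values
a, b = Fraction(-3), Fraction(5)          # arbitrary constants

transformed = [a * x + b for x in data]

# The shift b drops out and the scale a enters squared.
assert pvar(transformed) == a ** 2 * pvar(data)
```

Changing b leaves the left-hand side untouched, which is the step in the proof where b cancels.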

Error in formula for Property 8.b

The first formula for Property 8.b does not match the similar formula on the Covariance page. I believe the one on the Covariance page is the correct one. Specifically, the formula is missing the sum of the variances.

70.251.113.146 17:02, 25 September 2007 (UTC)[reply]

Never mind. I neglected the fact that Cov(X, X) = Var(X), so the sum of the variances is, in fact, already included. —Preceding unsigned comment added by 70.251.113.146 (talk) 17:09, 25 September 2007 (UTC)[reply]
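
A quick numerical illustration of that point, as a sketch in Python with made-up data. It assumes the formula in question is the double sum of Cov(X_i, X_j) over all ordered pairs (i, j); the helper names are hypothetical. The diagonal terms i = j supply the variances.

```python
from fractions import Fraction


def mean(v):
    return sum(v, Fraction(0)) / len(v)


def cov(u, v):
    """Population covariance of two equal-length lists of numbers."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)


# Three made-up variables observed on the same four equally likely outcomes.
xs = [[1, 2, 3, 4], [2, 1, 4, 3], [0, 5, 1, 2]]
total = [sum(obs) for obs in zip(*xs)]  # value of X1 + X2 + X3 per outcome

var_of_sum = cov(total, total)                       # Cov(S, S) = Var(S)
double_sum = sum(cov(u, v) for u in xs for v in xs)  # includes i == j terms
variances = sum(cov(u, u) for u in xs)               # the diagonal only
off_diagonal = sum(
    cov(u, v)
    for i, u in enumerate(xs)
    for j, v in enumerate(xs)
    if i != j
)

assert var_of_sum == double_sum
assert double_sum == variances + off_diagonal
```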

Problem with Style of This Article

I think that the style of this article is inappropriate for an encyclopedia article and is inconsistent with the style of the rest of mathematics articles on wikipedia. I find it ironic saying this because I usually advocate the other way around, but this page is too pedagogical. It reads like a textbook. I think we should delete much of the material, including the proofs. Wikipedia pages generally do not provide proofs of most mathematical results and I think that there is not a huge problem with this--this is what sites like PlanetMath are for. What does everyone else think? Cazort 22:01, 4 October 2007 (UTC)[reply]

By the way, I propose rewriting the "properties" section in mathematical notation, and removing most of the extended information from the "formal" section, removing some of the examples, and merging them into one section, much shorter. Cazort 22:03, 4 October 2007 (UTC)[reply]

I have the opposite opinion. About a year ago the page looked about the same as it is now, and then many people complained that they didn't understand it. This is the reason why the page was more like a textbook and of a different style than other mathematics articles. Most mathematics articles will be consulted only by people with some minimum math abilities, like being able to read an equation. This isn't true for variance. Frankly, I find it a disgrace that there are apparently so many mathematicians with so little respect for lay people's wish to grasp some important concepts at their own level, without having to go through a mathematics course first. E.g. how many non-mathematicians do you think will understand what the expectation operator means? My guess is that this is less than 1%, and for all others the present article will be unreadable. Is that what you want? I strongly urge you to undo the deletion of the "Elementary description" section. 82.93.234.194 03:33, 1 November 2007 (UTC)[reply]

I agree with you that this article needs to be made more accessible, but I think that we should try to make the whole article accessible, rather than having an "elementary description" section and then a separate section on properties, and then yet another section on "formal" aspects of those properties. That was one of the things I was objecting to. As another example, I think the "Characteristic property" is one of the most important properties of variance but the way it's described makes it so arcane that no one could understand it. Lastly, I also don't think that an expanded, chatty tone is necessarily the best way to make wikipedia accessible--wikipedia is a wiki and I think the best way to make it accessible is to make it concise and well cross-referenced: more words does not necessarily make it easier for people to obtain information. Of course, I think a lot of sections right now DO need more words and more explanation. The "Elementary description" wasn't exactly an elementary description so much as an elementary example. How about making an "elementary examples" section, and including some images in addition to some simple examples? Cazort (talk) 23:15, 31 January 2008 (UTC)[reply]

The problem is that there are two totally different potential reader groups for the article. One group consists of mathematicians and statisticians who already know the concept and just want to have an overview of some important facts. The other group consists of people who have no idea whatsoever about statistics, who do not know what a random variable is, do not know what "E" means, do not know what expectation is, and who may not even know what squaring is. You cannot address both groups at the same time. Mathematicians are trained to expect a rigorous, general definition at the outset, and exactly that will confuse and scare away most other people. The present article is totally useless for this last group. They will simply stop reading in the middle of the first formula, because they know that they will never understand anything of it. Adding a graph won't help them, because they won't understand the graph, simply because they never learned how to read such graphs. Regardless of the excellent explanations that you may add to the article, the mere fact that it starts with a formula will make it inaccessible for 90% of the people. Nevertheless, these people could have learned something from the old version of the article, if only you guys could accept that there exist people who do not eat formulas for breakfast. I can understand if you think the article was too conversational, but this could have been changed without replacing the bottom-up approach (from example to general formula) by the present top-down approach (from general formula to example). But have it your way, I'm done with it. It is clear that I am the only person with the opinion that it is important to make elementary concepts like variance understandable for lay people. I was always shocked if people proposed dramatic cuts to the finances of mathematics departments, but right now I understand it and I even agree with it. 
Scientists with this attitude do not deserve a single penny. JulesEllis (talk) 02:53, 7 May 2008 (UTC)[reply]

Also, note that the article is rated as too technical for a general audience. Replacing more text by math will make it worse. Frankly, I believe the present article is totally inadequate for non-mathematicians. This is a shame. The topic is too important - not only for mathematicians. But now I understand why so many are not interested in math. The present article is a showcase of how mathematicians tend to obscure easy concepts rather than clarify them. JulesEllis (talk) 06:46, 20 November 2007 (UTC)[reply]

Anyone have any decent graphs to post here that might explain variance visually? I'm imagining a graph of a sample with high variance vs. one with low variance. —Preceding unsigned comment added by 65.91.102.204 (talk) 19:55, 31 January 2008 (UTC)[reply]

I think that's a great idea! Cazort (talk) 23:15, 31 January 2008 (UTC)[reply]

When finite-population terms matter

"In the course of statistical measurements, sample sizes so small as to warrant the use of the unbiased variance virtually never occur... if the difference between n and n−1 ever matters to you, then you are probably up to no good anyway"

There are (fairly common) scenarios where the difference between n and n-1 is very important - in particular, multistage sampling can create a situation where large n at the first stage makes the sample large enough to be 'reputable' but small n at a later stage means finite-population corrections are important to the results.

Example: an acquaintance of mine is studying water pollution. Each measurement she takes is the sum of true pollutant level, systematic error, and random error. She wants to measure pollution levels at various times and places, but also needs to show that the random error in her measurements was within acceptable limits (i.e. that the population variance in the random-error component is less than some constant). Since the work takes place over an extended period of time, she can't just test against known samples at the start to demonstrate consistency; she needs to show that consistency is maintained throughout the work.

Over time she takes 500 water samples, each with its own level of pollutants, then divides each of these into three subsamples and measures pollutant level. The difference between these three measurements is due to random error, and so their variance is a sample variance for random error, from which we can estimate the population variance for random error. That estimate on its own is very inaccurate - but as long as it's unbiased, we can combine it with the other 499 samples to get a much more accurate estimate of population variance. In this case, using n instead of n-1 would result in dividing by 3 instead of 2, and so underestimating population variance by one-third (the result would on average be only two-thirds of the true value).
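
The size of that bias can be checked exactly for a toy case. The sketch below, in Python, assumes a hypothetical random error that is -1, 0 or +1 with equal probability and enumerates all 27 possible triples of subsample errors; none of these numbers come from the study described.

```python
from fractions import Fraction
from itertools import product

# Hypothetical error distribution: three equally likely values with mean 0.
errors = [Fraction(-1), Fraction(0), Fraction(1)]
true_var = sum(e ** 2 for e in errors) / len(errors)  # E[e^2], since the mean is 0


def sum_sq_dev(triple):
    """Sum of squared deviations of a triple from its own mean."""
    m = sum(triple) / len(triple)
    return sum((x - m) ** 2 for x in triple)


# All 27 equally likely outcomes for the three subsample errors.
triples = list(product(errors, repeat=3))

# Average of each estimator over every possible triple, i.e. its expectation.
mean_div_n = sum(sum_sq_dev(t) / 3 for t in triples) / len(triples)
mean_div_n_minus_1 = sum(sum_sq_dev(t) / 2 for t in triples) / len(triples)

assert mean_div_n_minus_1 == true_var             # dividing by n - 1 is unbiased
assert mean_div_n == Fraction(2, 3) * true_var    # dividing by n falls short by one-third
```

Averaging over many water samples removes the noise in each individual estimate but not this systematic shortfall, which is why the choice of divisor matters here.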

This also applies when we're going the other way, and trying to use knowledge of population variance to estimate the sample variance (and hence, accuracy) of a given experimental design. In social research, for instance, we might easily end up visiting hundreds of households but only selecting a subset of the people in each household, and the variance associated with that selection is important to accuracy of the results. Given the number of people who live in a typical household, the difference between n and n-1 can be pretty important.

I'd edit the article, but frustratingly, I don't have citable sources handy. --144.53.251.2 (talk) 00:12, 6 February 2008 (UTC)[reply]

I find the whole chunk of text leading up to this quotation to be inappropriate. In any particular situation, either n or n-1 is right and the other is wrong. Good practice is to use the correct formula and not use the incorrect formula. What is the point of going on at length about the effect of making a mistake? I suggest this part of the article be reduced to almost nothing. McKay (talk) 04:06, 2 March 2009 (UTC)[reply]

Simple examples needed

I think that adding some simple examples to the article would be good. For example, an important simple example is the variance of an indicator random variable, which would be very good to add. zermalo (talk) 23:14, 1 April 2008 (UTC)[reply]
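
For instance, taking the indicator I of an event with probability p (so I = 1 with probability p and I = 0 otherwise; the symbols I and p are assumed here for illustration), the computation would be roughly:

```latex
\mathrm{E}[I] = p, \qquad
\mathrm{E}[I^{2}] = p \quad (\text{since } I^{2} = I), \qquad
\operatorname{Var}(I) = \mathrm{E}[I^{2}] - (\mathrm{E}[I])^{2} = p - p^{2} = p(1-p).
```

This is largest at p = 1/2, where it equals 1/4, and is zero when p is 0 or 1.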

attention to characteristic property

The subsection "characteristic property" presently has

"The second moment of a random variable attains the minimum value when taken around the mean of the random variable, i.e.

$\mathrm{E}X=\mathrm{argmin}_{a}\,\mathrm{E}(X-a)^{2}$.

This property could be reversed, i.e. if the function $\phi$ satisfies

$\mathrm{E}X=\mathrm{argmin}_{a}\,\mathrm{E}\phi(X-a)$

then it is necessary of the form $\phi=ax^{2}+b$."

I think the conditions for the "reversed" result need to be firmed up. I think the stated condition needs to hold for all distributions of X, not just a single one? Melcombe (talk) 16:11, 14 April 2008 (UTC)[reply]

I've added "for all random variables X", and further tightened up this paragraph a bit. --Lambiam 20:35, 23 April 2008 (UTC)[reply]
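
For the forward direction, the argument is a one-line expansion. A sketch, assuming X has finite variance and writing \mu = E X (notation introduced here, not taken from the article):

```latex
\mathrm{E}(X-a)^{2}
  = \mathrm{E}\big[(X-\mu) + (\mu-a)\big]^{2}
  = \mathrm{E}(X-\mu)^{2} + 2(\mu-a)\,\mathrm{E}(X-\mu) + (\mu-a)^{2}
  = \operatorname{Var}(X) + (\mu-a)^{2}.
```

The middle term vanishes because E(X - \mu) = 0, so the expression is minimized exactly at a = \mu. This covers only the forward direction and says nothing about the converse for a general \phi.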

Maths in Definition

Can someone look at the maths formatting in subsection "Discrete case" under definition? I don't know what notation is actually intended here, but the results look very odd ...the part dealing with probability masses. Melcombe (talk) 12:51, 15 May 2008 (UTC)[reply]

Bienaymé formula

It has no proof, and no separate entry. Furthermore, I can't find many references to it online, let alone a proof. Perhaps a proof/entry could be constructed? —Preceding unsigned comment added by 89.0.150.221 (talk) 19:41, 6 January 2009 (UTC)[reply]

There is another problem too. The formula here is for any finite number of random variables and cites Uncorrelated for the definition of that concept. However Uncorrelated states "Uncorrelatedness is a relation between only two random variables.". I will fix this problem. McKay (talk) 09:53, 29 January 2009 (UTC)[reply]

It looks like this website explains it, but I couldn't understand it. http://sepwww.stanford.edu/sep/prof/pvi/rand/paper_html/node16.html —Preceding unsigned comment added by 190.94.3.118 (talk) 16:48, 13 November 2009 (UTC)[reply]
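
For what it is worth, the usual argument is short. A sketch, assuming X_1, ..., X_n have finite variances and are pairwise uncorrelated, with \mu_i = E X_i (notation introduced here):

```latex
\begin{aligned}
\operatorname{Var}\Big(\sum_{i=1}^{n} X_i\Big)
  &= \mathrm{E}\Big[\Big(\sum_{i=1}^{n} (X_i - \mu_i)\Big)^{2}\Big]
   = \sum_{i=1}^{n}\sum_{j=1}^{n} \mathrm{E}\big[(X_i-\mu_i)(X_j-\mu_j)\big] \\
  &= \sum_{i=1}^{n} \operatorname{Var}(X_i) + \sum_{i \neq j} \operatorname{Cov}(X_i, X_j)
   = \sum_{i=1}^{n} \operatorname{Var}(X_i).
\end{aligned}
```

The last step uses only that Cov(X_i, X_j) = 0 for every pair i ≠ j, which is why pairwise uncorrelatedness is enough and full independence is not needed.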

sample variance

"One common source of confusion is that the term sample variance may refer to either the unbiased estimator s2 of the population variance, or to the variance of the sample viewed as a finite population." -- As opinined above, I question whether the obsolete and rare usage with denominator n should be included here at all under the name "sample variance". My impression (as a mathematician who is not a statistician) is that these days "sample variance" is a standard concept and other uses of the phrase would be regarded as wrong. Is there a modern significant reference that shows I'm wrong? McKay (talk) 09:08, 6 March 2009 (UTC)[reply]

That page you linked has multiple issues, not the least of which is the fact that it only references your publications, whereas the concept of sample variance is clearly not your own invention. However, this page is not the right place to discuss the article on the Ukrainian wiki; the question raised here is whether the concept “sample variance” should be defined with the denominator n, or (n−1), or both. The linked page doesn't provide any reasonable resources to help with this question. … stpasha » 07:41, 3 December 2009 (UTC)[reply]

Expected Deviation

"Unlike expected deviation, variance has different units from the variable" -- expected deviation leads to this page... 66.168.1.178 (talk) 18:27, 2 November 2009 (UTC)[reply]

I've redirected expected deviation to absolute deviation. Michael Hardy (talk) 19:34, 2 November 2009 (UTC)[reply]

Die/Dice

I was about to correct the erroneous use of "dice" instead of "die" (the singular), but I see that (a) this appears to be an on-going point of contention and (b) someone has added the comment 'please do not change back to "die": "die" is historically correct, but "dice" is more comprehensible nowadays'.

It therefore seems useful to make the case for change explicit.

The argument that "dice" should be preferred because it is more comprehensible nowadays is, I believe, fallacious. Firstly, it seems more likely that its author is merely expressing his own opinion in implying that "die" is insufficiently comprehensible, as opposed to relying on some form of evidence. In contrast, there are a number of contributors who not only recognise the error but wish to correct it.

Secondly, it is of course quite true that the meaning, usage and spelling of English words all change over time. However, such evolution often begins in misuse, and it is reasonable to expect of a reference work that it perpetuate correct usage rather than yield to illiteracy.

ScotSez: Hear Hear! I concur. "Dice" is the plural form. In this example a single die is being thrown. I vote for "die", not "dice" —Preceding unsigned comment added by Zirconscot (talk • contribs) 22:53, 6 January 2010 (UTC)[reply]

I too was about to correct dice to die. Then I noticed the revert war and don't want to join in.

How about a note on the first "die" referring to this justification for "die" not "dice"? TrevMrgn (talk) 17:28, 29 January 2010 (UTC)[reply]

Missing Parenthesis

ChristianCHRR says on 16.01.10: the opening parenthesis after "expected absolute deviation 1.5" has no corresponding closing parenthesis. —Preceding unsigned comment added by ChristianCHRR (talk • contribs) 01:06, 17 January 2010 (UTC)[reply]

Can you please simplify the bullshit on the main page. —Preceding unsigned comment added by 194.66.72.76 (talk) 18:39, 24 January 2010 (UTC)[reply]

@@ Line 57: / Line 57: @@
 ::Use basic wikipedia then!
+I'm an economics undergraduate and i already know what standard deviation is, but this article makes no sense at all
 ----