
Talk:Weighted arithmetic mean

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 89.83.73.89 (talk) at 18:21, 10 June 2013 (→‎Biased vs. Unbiased). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

This article is within the scope of WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as C-class on Wikipedia's content assessment scale.
This article has been rated as High-importance on the importance scale.

Convex combinations and weighted means are, on a fundamental level, precisely the same things. The major difference between them seems to be the way they are written. In a weighted sum, a normalizing factor is written outside the sum sign; in a convex combination, it is absorbed into the coefficients, which then sum to 1. Is this notational distinction a big enough deal to warrant separate articles? Melchoir 02:12, 19 April 2006 (UTC)[reply]

While I wait, I'll add these articles to each other's categories. Melchoir 02:13, 19 April 2006 (UTC)[reply]

It sounds to me like a "convex combination" is an application of the weighted mean. If this is true, it should contain a link to the weighted mean article for examples on how to do the calculations, but should remain a separate article (since the average person reading about the weighted mean will have absolutely no interest in the "convex combination" application). StuRat 03:51, 19 April 2006 (UTC)[reply]
Actually, the more I think about it, the more I think they should stay separate. Partly because of the reasons you bring up, and partly because the formalism can actually be important at times. Well, I'll remove the tags in a bit if no one else speaks up. Thanks for commenting! Melchoir 06:14, 19 April 2006 (UTC)[reply]
You're welcome. StuRat 21:15, 19 April 2006 (UTC)[reply]
...done. Melchoir 08:44, 30 April 2006 (UTC)[reply]

Confusing Lead Sentence?

"The weighted mean, or weighted average, of a non-empty list of data x_1, x_2, ..., x_n with corresponding non-negative weights w_1, w_2, ..., w_n, at least one of which is positive, is..."

So at least one of which is positive? I assume it can't be the weights, since they are stated to always be non-negative, is it the data that always has to be positive, and why is that the case? CallipygianSchoolGirl (talk) 15:15, 21 January 2008 (UTC)[reply]

It is the weights. There is a small difference between nonnegative and positive: zero is nonnegative but it is not positive. So what the text says is that not all the weights can be zero. However, it is a bit confusing so I reformulated it. -- Jitse Niesen (talk) 17:13, 21 January 2008 (UTC)[reply]
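Jitse's reading of the definition can be sketched in a few lines of code (my own sketch, names mine): the weights must be non-negative, and "not all zero" is exactly the condition that their sum is positive, so the division is well defined.

```python
# Sketch of the definition discussed above: non-negative weights,
# at least one of which must be positive (i.e. sum(w) > 0).
def weighted_mean(x, w):
    if any(wi < 0 for wi in w):
        raise ValueError("weights must be non-negative")
    total = sum(w)
    if total == 0:
        raise ValueError("at least one weight must be positive")
    return sum(wi * xi for wi, xi in zip(w, x)) / total

print(weighted_mean([80, 90], [20, 30]))  # -> 86.0
```

A zero weight is allowed and simply drops that data point from the average; only the all-zero case is excluded.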

Covariance

The covariance section is confusing to me. It seems like the covariance of the mean of two independent random variables, X and Y, should be

\operatorname{Var}\left(\frac{X+Y}{2}\right) = \frac{\sigma_X^2 + \sigma_Y^2}{4}.

Generalizing this, it seems like the covariance of a mean should be

\sigma^2_{\bar x} = \sum_{i=1}^n w_i^2 \sigma_i^2.

Is this right? It seems different than the result

\sigma^2_{\bar x} = \mathbf{w}^\mathsf{T} \mathbf{C} \mathbf{w}

listed on this page. —Ben FrantzDale (talk) 15:02, 17 September 2008 (UTC)[reply]

That is the variance of the mean. Since cov(X, X) is the same as var(X), we have

\operatorname{var}(\bar x) = \operatorname{cov}(\bar x, \bar x)

and

\mathbf{w}^\mathsf{T} \mathbf{C} \mathbf{w} = \sum_{i=1}^n \sum_{j=1}^n w_i w_j \operatorname{cov}(x_i, x_j).

So what you've shown is the standard result that this reduces to \sum_i w_i^2 \sigma_i^2 when the covariances vanish. Michael Hardy (talk) 18:05, 17 September 2008 (UTC)[reply]
.... oh: you assumed independence, but of course the matrix result clearly covers the case where they're NOT independent. You relied on cov(X, Y) = 0. But obviously cov(X, Y) would be the corresponding entry in the given covariance matrix. Michael Hardy (talk) 18:06, 17 September 2008 (UTC)[reply]
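A quick numeric sketch of Michael Hardy's point (my own example, not from the article): when the covariance matrix is diagonal, i.e. cov(X, Y) = 0, the matrix form and the independence form agree.

```python
# Check: for uncorrelated variables, w^T C w reduces to
# sum(w_i^2 * sigma_i^2). Weights and sigmas here are made up.
import numpy as np

w = np.array([0.25, 0.75])
sigma = np.array([2.0, 3.0])
C = np.diag(sigma**2)  # off-diagonal entries cov(X, Y) are zero

matrix_form = w @ C @ w
independent_form = np.sum(w**2 * sigma**2)
print(matrix_form, independent_form)  # both 5.3125
```

With nonzero off-diagonal entries in C, the two expressions would differ by the cross terms 2 w_i w_j cov(x_i, x_j), which is exactly what Ben's derivation dropped.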

"Dealing with variance" and "Weighted sample variance"

Thank you for these and the preceding very useful sections. I have a few suggestions and a question. First, the starting sentence under "Weighted sample variance" is misleading, as this section is not about the uncertainty or error in the weighted mean, but is giving the formula for the weighted variance of the sample. Second, the formula for the weights at the start of "Dealing with variance" assumes that the weights are normalized already. I suggest instead to make this plainer. Third, the reference at http://pygsl.sourceforge.net/reference/pygsl/node36.html is not formatted correctly -- I suspect it is old and should be http://pygsl.sourceforge.net/reference/pygsl/node52.html . Finally, my question: the last formula in "Dealing with variance" has a correction factor in it. Given the use of the weights w_i elsewhere in the page, I am wondering if that factor should be modified to use w_i in some way? Stgaser45 (talk) 13:36, 21 June 2009 (UTC)[reply]


• I also found the "weighted sample variance" section misleading. The equation:

\hat\sigma^2 = \sum_{i=1}^n w_i (x_i - \mu^*)^2

assumes normalized weights. It should be replaced with:

\hat\sigma^2 = \frac{\sum_{i=1}^n w_i (x_i - \mu^*)^2}{\sum_{i=1}^n w_i}
Drdan14 (talk) 19:18, 30 October 2009 (UTC)[reply]
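Drdan14's point can be sketched in code (my own example, made-up data): dividing by the sum of the weights makes the biased weighted variance invariant to rescaling the weights, so it no longer matters whether they are normalized.

```python
# Biased weighted sample variance, written so it does NOT assume the
# weights sum to 1 (divide by sum(w), per the suggestion above).
def weighted_variance(x, w):
    total = sum(w)
    mean = sum(wi * xi for wi, xi in zip(w, x)) / total
    return sum(wi * (xi - mean) ** 2 for wi, xi in zip(w, x)) / total

x = [1.0, 2.0, 4.0]
# Scaling all weights by a constant must not change the result:
print(weighted_variance(x, [1, 1, 2]))           # -> 1.6875
print(weighted_variance(x, [0.25, 0.25, 0.5]))   # -> 1.6875
```

Without the division by sum(w), the two calls would return different values, which is the defect the comment describes.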

"Correcting for Over/Under Dispersion"

Does anyone have a reference for this section, or a derivation of why the scaling factor used there is correct? —Preceding unsigned comment added by 64.22.160.1 (talk) 13:08, 7 April 2010 (UTC)[reply]

References, literature, further reading

I think this article is brilliant, it addresses an important topic with much detail. However, it does neither supply (citeable) references nor hints to literature. Does anybody know a good book on this topic? (I am specifically interested in dealing with variance etc.) --Hokanomono 17:07, 24 September 2009 (UTC)[reply]

I have revamped the weighted variance section, added some references and added a new section about weighted covariance, you might take a look at it. And feel free to correct it if you find better equations! --89.83.73.89 (talk) 18:19, 10 June 2013 (UTC)[reply]

Exponentially decreasing weights

This is shaping up to be a good article with treatment of a wide range of related problems. However, I can't get my head around this sentence in the last section: "at step (1-w)^{-1}, the weight approximately equals e^{-1}(1-w), the tail area the value e^{-1}, the head area 1-e^{-1}". It seems to simply be missing some verbs, but I don't know enough on the topic to confidently correct it myself. Anyone know what's going on? Tomatoman (talk) 22:56, 30 September 2009 (UTC)[reply]
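For context, here is a minimal sketch of a mean with exponentially decreasing weights (an exponential moving average); `alpha` and the function name are my own choices, not the article's notation. The most recent point gets weight alpha, and a point k steps back gets weight proportional to (1 - alpha)^k.

```python
# Running mean with exponentially decreasing weights: each update
# blends the new point (weight alpha) with the old average.
def exp_moving_average(xs, alpha):
    avg = xs[0]
    for x in xs[1:]:
        avg = alpha * x + (1 - alpha) * avg
    return avg

print(exp_moving_average([1.0, 2.0, 3.0], 0.5))  # -> 2.25
```

The sentence under discussion is describing where the bulk of that geometric weight distribution sits as a function of the decay parameter.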

Statistical properties

Middle of that paragraph: "For uncorrelated observations with standard deviations \sigma_i, the weighted sample mean has standard deviation \sigma(\bar x) = \sqrt{\sum_{i=1}^n w_i^2 \sigma_i^2}"

would imply the stdev of the mean going up with n (simply assume w = \sigma = 1)! Missing a 1/(\sum w_i)^2 under the sqrt? —Preceding unsigned comment added by 87.174.74.56 (talk) 11:58, 23 January 2010 (UTC)[reply]

The weights here wi sum up to 1. Dmcq (talk) 16:02, 23 January 2010 (UTC)[reply]
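Dmcq's point can be checked numerically (my own sketch, names mine): with weights normalized to sum to 1 (here equal weights 1/n) and unit standard deviations, the quoted formula gives 1/sqrt(n), which correctly decreases with n.

```python
# With normalized weights, sigma(xbar) = sqrt(sum(w_i^2 * sigma_i^2))
# shrinks as n grows; w_i = 1 (unnormalized) would make it grow instead.
import math

def stdev_of_mean(weights, sigmas):
    return math.sqrt(sum(w**2 * s**2 for w, s in zip(weights, sigmas)))

for n in (4, 16, 64):
    w = [1.0 / n] * n
    print(n, stdev_of_mean(w, [1.0] * n))  # 0.5, 0.25, 0.125
```

So the IP's objection holds only if one plugs in raw weights w_i = 1; the normalization the formula assumes is equivalent to the 1/(sum w_i)^2 factor they propose.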

Notation for multiplication of specific numbers

As far as I can recall, I have never seen the notation (a)b before for multiplying specific numbers. I think the example at the top of the article would be more accessible to our readers if we instead used the more standard notation a × b, when a and b are specific numbers. (I've used Unicode there, which might not display properly on all browsers, because I don't know LaTeX very well. However, the LaTeX version should work on all browsers.)--greenrd (talk) 08:36, 1 October 2010 (UTC)[reply]

Good point. I've changed the LaTeX accordingly. StuRat (talk) 04:25, 7 June 2012 (UTC)[reply]

Unreadable for the Everyman

If you're a mathematician then I'm sure this Wiki page is great. If you're not, then this Wiki page leaves much to be desired. It gives absolutely no lay answer or examples; ergo, you need a math degree for it to be relevant. How about one of you Good Will Huntings starts off the article by adding a dumbed-down section so that those who aren't math gods can actually understand what "weighted mean" actually means, because this article in its current state doesn't help me, nor, I'm sure, most people who find it. Thanks.  :) — Preceding unsigned comment added by Deepintexas (talkcontribs) 07:24, 27 December 2010 (UTC)[reply]

That's why I added the Weighted_mean#Examples section, which requires no math skills beyond division, and is enough material for the math newbie. Are you saying you can't understand that? (As for the rest, it really does get complex, so probably shouldn't be attempted by non-mathematicians.) StuRat (talk) 03:37, 7 June 2012 (UTC)[reply]

Reverse calculate

So, if you know the original data points, and you know the data points produced by some method of weighted averaging, can you easily discover the method by which the weighted average data points were made?

I'm sure my question doesn't make sense, so I'll give an example. I have the following original data points: 98.2, 97.8, 97.7, 97.1, 97.5, 97.4, 97.6, 97.3, 97.0.

The weighted average data points are as follows: 98.93276, 98.78197, 98.63793, 98.4332, 98.30897, 98.187965, 98.109695, 98.00191, 97.86853.

Is it possible to discover what weight I gave to each original data point to produce the weighted average data point? If so, how? And if it is possible, let's include it in the article.

Am I missing something? Just divide the weighted data points by the unweighted ones to get the weights. StuRat (talk) 03:33, 7 June 2012 (UTC)[reply]
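StuRat's suggestion, sketched on the numbers given above. Note the caveat: elementwise division only recovers a per-point scale factor; it cannot identify a method in which each output mixes several inputs (e.g. a moving average), which may be what the questioner actually has.

```python
# Elementwise division: recovers weights only if each output point is
# a single input point times a weight.
original = [98.2, 97.8, 97.7, 97.1, 97.5, 97.4, 97.6, 97.3, 97.0]
weighted = [98.93276, 98.78197, 98.63793, 98.4332, 98.30897,
            98.187965, 98.109695, 98.00191, 97.86853]

ratios = [w / x for w, x in zip(weighted, original)]
print([round(r, 5) for r in ratios])
```

Here every ratio comes out slightly above 1, which is consistent with each output depending on more than one input; a genuine per-point weighting would be recovered exactly.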

Biased vs. Unbiased

The Weighted Sample Variance section emphasizes that the variance is "biased" without explaining. While the 'unbiased estimator' is linked and briefly mentioned, it seems like the concept of bias should be addressed explicitly.--All Clues Key (talk) 23:41, 6 June 2012 (UTC)[reply]

I had the same feeling as you and thus I revamped the section a bit. You will now find the two unbiased estimators (one for "reliability"-like weights, and one for "repeats"-like weights).