Talk:Bessel's correction

WikiProject Statistics (Rated Start-class, Low-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

This article has been rated as Start-Class on the quality scale.
This article has been rated as Low-importance on the importance scale.

WikiProject Mathematics (Rated Start-class, Low-priority)
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Mathematics rating: Start Class, Low Priority
Field: Probability and statistics

Could someone here explain what the E means in this mathematical equation?


\begin{align}
& = \frac{1}{n-1}\operatorname{E}\left(\sum_{i=1}^n(x_i-\mu)^2 - 2(\overline{x}-\mu)\sum_{i=1}^n(x_i-\mu)  + \sum_{i=1}^n(\overline{x}-\mu)^2\right) \\
\end{align}

thanks — Preceding unsigned comment added by 181.55.69.28 (talk) 16:09, 11 August 2013 (UTC)
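
For what it's worth, the E in that equation is the expected-value operator: the average of the bracketed quantity over every sample that could have been drawn. A minimal Python sketch, assuming a made-up three-value population and samples of size 2 drawn with replacement (neither comes from the article, and the helper names are mine), enumerates all such samples and averages both variance estimates:

```python
from fractions import Fraction
from itertools import product

# Hypothetical population, chosen only for illustration.
population = [0, 2, 4]
n = 2  # sample size (an assumption, not from the article)

def mean(values):
    """Exact arithmetic mean as a Fraction."""
    return Fraction(sum(values), len(values))

def sum_sq_dev(sample):
    """Sum of squared deviations from the sample's own mean."""
    m = mean(sample)
    return sum((x - m) ** 2 for x in sample)

mu = mean(population)
sigma2 = mean([(x - mu) ** 2 for x in population])

# Every equally likely sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

# E(...) is the average over all of those samples.
expected_biased = mean([sum_sq_dev(s) / n for s in samples])
expected_bessel = mean([sum_sq_dev(s) / (n - 1) for s in samples])

print(sigma2, expected_biased, expected_bessel)
```

In this toy setting, dividing by n - 1 recovers the population variance on average, while dividing by n gives (n - 1)/n of it.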

At least one sentence here is wrong

In the middle of the article, it says "The answer is "yes" except when the sample mean happens to be the same as the population mean."

Say I have the population {1,1,1,1,1,3,3,3,3,3,5,5,5,5,6}. My population mean is 3.06667. If I happen to choose the sample {1,6}, I'll have a sample mean of 3.5, which differs from the population mean, yet my sample variance estimated by dividing by N (6.25) is larger than the population variance (about 3.00).

Mglerner (talk) 19:14, 1 September 2011 (UTC)
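
The numbers in this counterexample can be checked directly. A short Python sketch using only the population and sample quoted above (the helper names are mine):

```python
from fractions import Fraction

population = [1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 5, 5, 5, 5, 6]
sample = [1, 6]

def mean(values):
    """Exact arithmetic mean as a Fraction."""
    return Fraction(sum(values), len(values))

def variance_div_n(values):
    """Variance about the values' own mean, dividing by N (no Bessel's correction)."""
    m = mean(values)
    return mean([(x - m) ** 2 for x in values])

population_variance = variance_div_n(population)
sample_variance = variance_div_n(sample)

print(float(mean(population)), float(population_variance))
print(float(mean(sample)), float(sample_variance))
print(sample_variance > population_variance)
```

So the sample mean differs from the population mean, and the uncorrected sample variance still overshoots for this particular sample.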

OK, I'm confused...

How did they get from here:


\begin{align}
& = \frac{1}{n-1}\operatorname{E}\left(\sum_{i=1}^n(x_i-\mu)^2 - 2(\overline{x}-\mu)\sum_{i=1}^n(x_i-\mu)  + \sum_{i=1}^n(\overline{x}-\mu)^2\right) \\
\end{align}

to here:


\begin{align}
& = \frac{1}{n-1}\operatorname{E}\left(\sum_{i=1}^n(x_i-\mu)^2 - 2(\overline{x}-\mu)n (\frac{\sum_{i=1}^nx_i}{n}-\mu)  + n(\overline{x}-\mu)^2\right) \\
\end{align}

??? - 114.76.235.170 (talk) 15:20, 6 August 2010 (UTC)

Two things were done. One was this:
 \sum_{i=1}^n(\overline{x}-\mu)^2 = n(\overline{x}-\mu)^2.
The index i does not appear within the scope of the summation sign, i.e. the term
 (\overline{x} - \mu)^2 \,
does not change as i goes from 1 to n. In other words the sum
 \sum_{i=1}^n (\overline{x} - \mu)^2
is just
 \underbrace{(\overline{x} - \mu)^2 + \cdots + (\overline{x} - \mu)^2}_{n\text{ terms}} \,
hence it is
 n(\overline{x} - \mu)^2. \,
The other thing that was done is as follows:

\begin{align}
\sum_{i=1}^n (x_i - \mu) & = \left(\sum_{i=1}^n x_i\right) - \left(\sum_{i=1}^n \mu\right) \\[10pt]
& = n \left( \frac{\sum_{i=1}^n x_i}{n} \right) - \left(n\mu\right) \\[10pt]
& = n\left( \frac{\sum_{i=1}^n x_i}{n} - \mu \right).
\end{align}
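
Both steps can also be checked numerically. The sketch below is only an illustration in Python: the sample values and the value of mu are made up, and exact fractions are used so the comparisons are not affected by rounding.

```python
from fractions import Fraction

# Hypothetical sample and population mean, chosen only for illustration.
x = [Fraction(v) for v in (2, 3, 7, 8)]
mu = Fraction(4)
n = len(x)
x_bar = sum(x) / n

# First step: a summand that does not depend on i is simply added n times.
lhs_constant = sum((x_bar - mu) ** 2 for _ in range(n))
rhs_constant = n * (x_bar - mu) ** 2

# Second step: the sum of (x_i - mu) equals n * (sum(x_i)/n - mu).
lhs_linear = sum(x_i - mu for x_i in x)
rhs_linear = n * (sum(x) / n - mu)

print(lhs_constant == rhs_constant, lhs_linear == rhs_linear)
```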
I know it's not very Wiki-like, but I have to say: Brilliant Article...! Mmick66 (talk) 07:15, 6 May 2011 (UTC)
And I'm going to have to go even further in non-Wikiness and agree; brilliant! The subtle point about the estimator becoming unbiased for the variance but not for the standard deviation via Jensen's inequality -- absolutely brilliant! Superpronker (talk) 12:45, 2 June 2011 (UTC)

Bessel usage for standard deviation

Why is the formula with Bessel's correction used as a default (e.g. in MATLAB) to estimate the standard deviation, instead of an unbiased version (such as the one described in Unbiased_estimation_of_standard_deviation)? Frankmeulenaar (talk) 06:43, 12 July 2011 (UTC)
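
One way to see the size of the issue: for normally distributed data, the square root of the Bessel-corrected variance underestimates sigma by a known factor, usually written c4(n). The Python sketch below evaluates that factor. It assumes normal data, and the function name is mine. It does not settle why packages default to n - 1; it only shows how large the remaining bias is.

```python
import math

def c4(n):
    """E[s] / sigma for a normal sample of size n, where s uses the n - 1 divisor."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)

for n in (2, 5, 10, 100):
    print(n, round(c4(n), 4))
```

The factor is noticeably below 1 for very small samples and approaches 1 as n grows. It also depends on the distribution, so no single correction is unbiased for the standard deviation in general.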

Isn't n*(a/n) = a ?

In the final formula:


\frac{1}{n-1}\left[\sum_{i=1}^n \sigma^2 - n(\sigma^2/n)\right]

Isn't:


n(\sigma^2/n) = \sigma^2 ?

And therefore the sum is zero?

Or is that factor outside the summation?

Even if it's outside, why not just put \sigma^2 ? — Preceding unsigned comment added by 71.107.55.85 (talk) 08:12, 28 June 2012 (UTC)

The first term reduces to n\sigma^2 while the second is \sigma^2, so the result is not zero. The formula is presumably there because it relates to a formula above, with corresponding terms. Melcombe (talk) 12:46, 28 June 2012 (UTC)

Ah ok, so it's outside the summation then. A few more steps would make it clearer. — Preceding unsigned comment added by 71.107.55.85 (talk) 07:10, 2 July 2012 (UTC)
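
Spelled out, using only the terms quoted in this thread, the intermediate steps would be:

```latex
\begin{align}
\frac{1}{n-1}\left[\sum_{i=1}^n \sigma^2 - n(\sigma^2/n)\right]
& = \frac{1}{n-1}\left[n\sigma^2 - \sigma^2\right] \\
& = \frac{(n-1)\sigma^2}{n-1} \\
& = \sigma^2.
\end{align}
```

The sum contributes n copies of \sigma^2 because the summand does not depend on i, while the second term sits outside the sum and collapses to a single \sigma^2.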

Formula doesn't make sense in context

In the final section, when discussing a particular random sampling, the article throws out the formula:

\operatorname{Var}(\overline{x}) = \operatorname{E}((\overline{x}-\mu)^2) = \sigma^2/n. \,

However, in the context of a single random sampling of the data, this formula doesn't make sense. It seems to want to measure the variance of the sample mean with respect to the real mean. However, in the case of a single random sampling, you just have one sample mean, and one number. Therefore the expected value is just that number.

Or is this formula stating the variance you'd get if you took a number of random samplings, took the sample mean of each of those samplings, and then calculated the variance of those sample means about the true mean?

If the second is true (and it's the only explanation that makes sense to me), this formula has been thrown in a bit out of context, given that the text above it discusses a single random sampling, then says "Also", and then seems to throw the formula up on the wall to see if it sticks...

Furthermore, its unexplained nature makes it very difficult to make the leap to the final formula below it, which essentially justifies the results of the whole page. — Preceding unsigned comment added by 71.107.55.85 (talk) 08:31, 28 June 2012 (UTC)

This relates to the question above. The formula complained of is set down because it is the thing that is substituted in the earlier formula to get the final result. Melcombe (talk) 12:50, 28 June 2012 (UTC)

I'm not complaining about the formula - I understand it's trying to link the two parts. I'm complaining about the context - reading from top to bottom (as is common), you're discussing one random sampling from the population, and then introducing a formula that *appears* to be talking about results from a set of random samplings. But it's unclear whether this is so, or I just misunderstand the formula. Or both. — Preceding unsigned comment added by 71.107.55.85 (talk) 07:12, 2 July 2012 (UTC)
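
For what it's worth, the second reading looks like the intended one: the expectation is taken over hypothetical repeated samples of size n, not within the one sample in hand. Assuming the x_i are independent draws, each with variance \sigma^2 (the standard setting, though the quoted passage does not restate it), the formula follows in a few lines:

```latex
\begin{align}
\operatorname{Var}(\overline{x})
& = \operatorname{Var}\left(\frac{1}{n}\sum_{i=1}^n x_i\right) \\
& = \frac{1}{n^2}\sum_{i=1}^n \operatorname{Var}(x_i) \\
& = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}.
\end{align}
```

Because the sample mean has expected value \mu, its variance is the same thing as \operatorname{E}((\overline{x}-\mu)^2), which is the form the article substitutes into the earlier expression.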

Putting the -1 in n-1.

Do I understand that the "real" answer in the example would be 36/5 = 7.2, the observed answer is 16/5 = 3.2 and using Bessel's correction would give: 16/(5-1) = 4? Is four then closer to the correct answer of 7.2 than 3.2, which the article text considers a low value? Thank you. Fotoguzzi (talk) 11:55, 1 February 2013 (UTC)
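
The arithmetic in the question can be checked in a few lines of Python, taking the sums 36 and 16 and the sample size 5 exactly as quoted above (the variable names are mine):

```python
from fractions import Fraction

n = 5
sum_sq_about_population_mean = 36  # numerator quoted for the "real" answer
sum_sq_about_sample_mean = 16      # numerator quoted for the observed answer

true_value = Fraction(sum_sq_about_population_mean, n)
uncorrected = Fraction(sum_sq_about_sample_mean, n)
corrected = Fraction(sum_sq_about_sample_mean, n - 1)

print(float(true_value), float(uncorrected), float(corrected))
print(abs(corrected - true_value) < abs(uncorrected - true_value))
```

For this one sample the corrected estimate is indeed closer. Bessel's correction only removes the bias on average, though, so a single sample can still land well away from the true value.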

Comment as a very occasional editor.

I had need to read this article. For me it read well overall. I found the article well set out in its flow of explanation for a person yet to understand this topic. Better than a number of other basic stats articles. Providing this feedback to those who have worked on this article. With my thanks.

CitizenofEarth001 (talk) 10:15, 17 February 2014 (UTC)

Proof Alternate #3 moved to #1 with intuition unhidden

The intuitive explanation in proof alternate #3 is clear enough for even a lay person to understand. I think this makes it an ideal candidate for being in the main text. MATThematical (talk) 05:00, 26 May 2014 (UTC)