# Talk:Chauvenet's criterion

WikiProject Statistics (Rated Start-class, Low-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

This article has been rated as Start-Class on the quality scale.
This article has been rated as Low-importance on the importance scale.

## Untitled

I'm a bit frustrated because I can't find any source that explicitly says that the Chauvenet of the criterion is William Chauvenet. The dates are about right (the criterion was published in 1863, according to 1), and I don't know of any other famous mathematical Chauvenets, but I would really prefer proof. Can anyone find some? LizardWizard 08:02, Mar 23, 2005 (UTC)

Response: The original Chauvenet work was printed as part of his treatise: Chauvenet, William. A Manual of Spherical and Practical Astronomy V. II. 1863. Reprint of 1891 5th ed. Dover, N.Y.: 1960. I believe Yale University has a copy of the original. The pages of the 5th Edition that you would want are in the appendix, pp. 474-566. By the way, the last sentence in his appendix says: "What I have given may serve the purpose of giving the reader greater confidence in the correctness and value of Peirce's Criterion."

S. Ross

## objections to example given; ambiguous and false

"The probability of taking data more than two standard deviations from the mean is roughly .05. Six measurements were taken, so the probability that one should be so far from the mean is .05*6 = .3"

I have several objections to this statement: the first sentence above is true in the case of data from a normal distribution for which the true mean and true variance are known. In general it is not true. Furthermore, in the example given, the observed (estimated) values of these parameters are used. One can't just replace true values of the parameters with the estimated values and expect whatever probabilistic statement one's trying to make to remain true (and not necessarily even "roughly" true as stated, especially for a sample of size 6).

Furthermore, even IF the data are from a normal distribution and IF one does happen to know the parameters' true values (or good estimates thereof), the probabilistic statement made (the second sentence above) is ambiguous and false: the probability that "one should be so far from the mean" (in the exact sense) is in fact zero, since the normal distribution has no point masses. If the event whose probability is meant to be calculated is "exactly one should be so far or further from the mean", then that value is actually $\binom{6}{1}(0.05)^{1}\times (1-0.05)^{5}\approx 0.232$, not 0.3 as claimed. Or, the probability that "at least one should be so far or further from the mean" is actually $1-(1-0.05)^{6}\approx 0.265$, not 0.3 as claimed.

I would fix the article myself but first I'd like to hear what others think about all this. Btyner 30 June 2005 22:46 (UTC)
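
For anyone checking the figures in this thread, the three quantities under discussion can be computed directly. This is only a minimal sketch: it takes p = 0.05 and n = 6 from the quoted example, assumes the six measurements are independent, and uses illustrative variable names.

```python
from math import comb

p = 0.05  # per-measurement tail probability quoted in the example
n = 6     # number of measurements in the example

# Value used in the quoted example: the expected number of tail
# measurements, which is not itself a probability.
naive = n * p

# P(exactly one of n independent measurements falls in the tail).
exactly_one = comb(n, 1) * p * (1 - p) ** (n - 1)

# P(at least one of n independent measurements falls in the tail).
at_least_one = 1 - (1 - p) ** n

print(f"naive n*p    = {naive:.3f}")
print(f"exactly one  = {exactly_one:.3f}")
print(f"at least one = {at_least_one:.3f}")
```

The naive product overstates both binomial figures because it counts samples with several tail values more than once.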

I'm the author, and, I apologize, not a statistician. However, even I recognize that the criterion as defined in reference materials is in some ways "bad statistics." As the criterion is defined, the value compared with .5 really is calculated by naively multiplying the sample size by the p-value. You are not actually finding the probability that one or more measurements should be so far or farther from the mean (I believe because it is meant to be an easy-to-apply criterion). See, for instance, [1], which seems to use the same source I did. However, a bit more research shows this doesn't seem to be well-standardized; contrast [2]. I apologize for putting a false statement in the example, and I welcome revisions. LizardWizard 01:18, August 22, 2005 (UTC)
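
The naive rule described in the reply above (multiply the sample size by the two-sided normal tail probability and compare with .5) could be sketched as follows. The function name `chauvenet_outliers`, the `threshold` parameter, the use of the N-1 standard deviation, and the data are all assumptions made for illustration; the thread itself notes that the definition is not well standardized.

```python
from math import erfc, sqrt
from statistics import mean, stdev


def chauvenet_outliers(data, threshold=0.5):
    """Return the values for which n * P(|Z| >= z) falls below threshold."""
    n = len(data)
    m = mean(data)
    s = stdev(data)  # sample standard deviation (N-1 divisor); an assumption
    flagged = []
    for x in data:
        z = abs(x - m) / s
        p_two_sided = erfc(z / sqrt(2))  # two-sided tail of a standard normal
        if n * p_two_sided < threshold:
            flagged.append(x)
    return flagged


# Hypothetical measurements, not taken from this discussion.
sample = [9, 10, 10, 10, 11, 50]
print(chauvenet_outliers(sample))
```

Note that the mean and standard deviation here are estimated from the same six points, which is exactly the substitution objected to earlier in this section.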

## standard deviation; sample standard deviation?

Since the example is a sample, wouldn't it be more appropriate to use the sample standard deviation (dividing by N-1 rather than N)? The change to the page on 17 November 2007 made it use N. I think this is likely incorrect, but I'm not mathy enough to know for sure. The next example, which gives the standard deviation as 0.7, is inconsistent as-is, so the page needs to be changed regardless.
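
To see how much the choice of divisor matters at this sample size, the two estimates can be compared side by side. This is a rough sketch on hypothetical data (six values with one suspect reading of 50), not the article's actual figures; `pstdev` divides by N and `stdev` divides by N-1.

```python
from statistics import pstdev, stdev

# Hypothetical six-point sample with one suspect value of 50.
sample = [9, 10, 10, 10, 11, 50]
trimmed = [x for x in sample if x != 50]

for label, values in (("all six", sample), ("50 removed", trimmed)):
    print(f"{label}: N divisor = {pstdev(values):.3f}, "
          f"N-1 divisor = {stdev(values):.3f}")
```

Under these assumed values the N-1 form gives about 16.34 and 0.71, while the N form gives about 14.92 and 0.63, so mixing the two conventions between steps produces visibly inconsistent numbers.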

## Under 'Example'

Under 'Example': what do you do with the new mean once the outlier of 50 is discarded? Perform the process again? 71.139.163.204 (talk) 00:15, 4 November 2014 (UTC)
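
Whether the test should be repeated is not settled anywhere on this page, so the following only illustrates what repetition would do if one chose to apply it. It is a sketch under stated assumptions: the function name `chauvenet_repeated`, the N-1 standard deviation, the rule of testing only the farthest value on each pass, and the data are all hypothetical.

```python
from math import erfc, sqrt
from statistics import mean, stdev


def chauvenet_repeated(data, threshold=0.5):
    """Apply the n*p < threshold test repeatedly; return (kept, rejected)."""
    kept = list(data)
    rejected = []
    while len(kept) > 2:
        n, m, s = len(kept), mean(kept), stdev(kept)
        if s == 0:
            break  # all remaining values identical; nothing to reject
        # Consider only the value farthest from the current mean.
        worst = max(kept, key=lambda x: abs(x - m))
        p = erfc(abs(worst - m) / (s * sqrt(2)))  # two-sided normal tail
        if n * p >= threshold:
            break  # farthest value passes, so stop
        kept.remove(worst)
        rejected.append(worst)
    return kept, rejected


# Hypothetical measurements, not taken from this discussion.
kept, rejected = chauvenet_repeated([9, 10, 10, 10, 11, 50])
print("kept:", kept, "rejected:", rejected)
```

With this data the second pass recomputes the mean and standard deviation from the five remaining values and rejects nothing further, so the loop stops after discarding 50.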