WikiProject Statistics (Rated B-class, Mid-importance)
What's a Renee Masse? (In the first sentence).
I did it that way so I wouldn't have to use * or x or × for multiplication. What would you prefer? dcljr 13:53 Dec 23, 2002 (UTC)
When using multi-letter variables a multiplication sign avoids ambiguity, and in this case coincidentally there was even a little more ambiguity.
Any of the three is fine with me, × is neatest, but more cumbersome to write. - Patrick 14:26 Dec 23, 2002 (UTC)
More info and clearer answers, please. I don't get a thing and my exam is tomorrow!
Outliers as 2 s.d.s away from mean
Is there not a second definition of outliers, as lying more than two standard deviations away from the mean? Or am I mixing other things up? I am a physicist, and it is a long time since I did "real" statistics... Batmanand | Talk 09:48, 28 September 2006 (UTC)
This is a poor definition of outliers because it changes upon recursion: the standard deviation is itself highly dependent on the outliers, so removing the flagged points and reapplying the rule can flag new ones. Check boxplot for a simple, easy-to-understand definition that is not distribution dependent.
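To illustrate the recursion point above, here is a minimal sketch, assuming the sample standard deviation and a cutoff of two standard deviations. The function names and the data are made up for illustration and do not come from the article.

```python
import statistics

def two_sd_outliers(data, k=2.0):
    """Return values lying more than k sample standard deviations from the mean."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    return [x for x in data if abs(x - mean) > k * sd]

def repeated_two_sd(data, k=2.0):
    """Apply the rule repeatedly, dropping the flagged points after each pass."""
    remaining = list(data)
    passes = []
    while len(remaining) > 2:
        flagged = two_sd_outliers(remaining, k)
        if not flagged:
            break
        passes.append(flagged)
        remaining = [x for x in remaining if x not in flagged]
    return passes, remaining

# Hypothetical data: 100 inflates the standard deviation enough to mask 30.
data = [10, 11, 12, 13, 14, 15, 16, 30, 100]
passes, remaining = repeated_two_sd(data)
print(passes)     # points flagged on each successive pass
print(remaining)  # points never flagged
```

On this made-up data the first pass flags only 100. Once it is removed the standard deviation shrinks, and 30 is flagged on the second pass, so the set of "outliers" depends on how many times the rule is applied.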
Isn't an outlier (German: Ausleger, Swedish: utliggare) also a supporting 2nd keel for a canoe or sailing boat that makes it almost a catamaran? Hmm... apparently this is called an outrigger on an outrigger canoe in English. Other languages would use "rig" for things that have sails. --LA2 23:04, 1 August 2007 (UTC)
I'm opposed to the "Mathematical definition" section in the article. Determining whether an observation is an outlier is ultimately a subjective decision, and any definition based on measures such as standard deviation or interquartile range is completely arbitrary.
Methods for identifying outliers
I want to know the source of the method that uses the Interquantile range mentioned in the text. Is it just a rule of thumb, or does it have a more objective explanation? --Forich (talk) 21:33, 9 June 2008 (UTC)
- It is a popular method, known as "Interquartile range" (IQR), where k = 3 is supposed to identify extreme outliers, and k = 1.5 mild outliers. This method has no scientific basis; it belongs to the category of Mumbo Jumbo methods in statistics. --Lambiam 04:51, 10 June 2008 (UTC)
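For readers unfamiliar with the rule under discussion, here is a minimal sketch of the IQR fences with k = 1.5 and k = 3. Quartile conventions differ between textbooks and software; this sketch assumes the "inclusive" method of Python's `statistics.quantiles`. The function names and the data are hypothetical.

```python
import statistics

def iqr_fences(data, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    # Assumption: quartiles computed with the "inclusive" convention.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def classify_outliers(data):
    """Split points into mild (beyond k = 1.5 only) and extreme (beyond k = 3)."""
    mild_lo, mild_hi = iqr_fences(data, 1.5)
    ext_lo, ext_hi = iqr_fences(data, 3.0)
    extreme = [x for x in data if x < ext_lo or x > ext_hi]
    mild = [x for x in data
            if (x < mild_lo or x > mild_hi) and x not in extreme]
    return {"mild": mild, "extreme": extreme}

# Hypothetical data: Q1 = 12 and Q3 = 16, so the IQR is 4.
data = [10, 11, 12, 13, 14, 15, 16, 25, 100]
print(iqr_fences(data, 1.5))    # fences for mild outliers
print(iqr_fences(data, 3.0))    # fences for extreme outliers
print(classify_outliers(data))
```

With these numbers the k = 1.5 fences are 6 and 22 and the k = 3 fences are 0 and 28, so 25 counts as a mild outlier and 100 as an extreme one. A different quartile convention can shift the fences slightly.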
Pointing a citation to its source
The second citation
- "2. ^ Grubbs, F. E.: 1969, Procedures for detecting outlying observations in samples. Technometrics 11, 1–21."
I googled it and found it on jstor at http://www.jstor.org/pss/1266761
Would it be a good idea to link directly to the article in the reference section? I didn't know how and I didn't know if that was appropriate or not. It seems appropriate though.
How is it spoken ?
- Former. Hear it at http://www.merriam-webster.com/dictionary/outlier Glrx (talk) 22:21, 7 December 2012 (UTC)
Much of this section is directly plagiarized from A Survey of Outlier Detection Methodologies (2004) by Hodge & Austin. For example,
"Type 1 - Determine the outliers with no prior knowledge of the data. This is essentially a learning approach analogous to unsupervised clustering. The approach processes the data as a static distribution, pinpoints the most remote points, and flags them as potential outliers." is word-for-word identical, as are the definitions of the following 2 types. At the absolute minimum their work should be cited. — Preceding unsigned comment added by 188.8.131.52 (talk) 21:48, 7 March 2013 (UTC)
Outliers exclusion citation
I think the phrase "Deletion of outlier data is a controversial practice frowned on by many scientists and science instructors" needs some kind of citation, or should be reformulated or removed. Why is it controversial? What can go wrong? Are there examples of bad things that happened because outliers were excluded? — Preceding unsigned comment added by 184.108.40.206 (talk) 10:26, 18 July 2013 (UTC)
Outlier as measurement error
The whole article refers to outliers as possible errors (e.g. "An outlier may be due to variability in the measurement or it may indicate experimental error"). This is not wrong, but it is incomplete. An outlier can also point to something real and unusual going on. As a colleague pointed out, it could be your next Nobel prize. If I remember correctly, NASA had detected the ozone hole in its data before Joe Farman published it, but had excluded those data points. So I would propose changing the tone of this article to emphasize that statistics helps to identify outliers, which are especially interesting points that may indicate measurement errors, unusual distributions (the tails) and possibly new phenomena. — Preceding unsigned comment added by Pjtverheijen (talk • contribs) 05:58, 12 March 2014 (UTC)