WikiProject Statistics (Rated B-class, Mid-importance)
What's a Renee Masse? (In the first sentence).
I did it that way so I wouldn't have to use * or x or × for multiplication. What would you prefer? dcljr 13:53 Dec 23, 2002 (UTC)
When using multi-letter variables a multiplication sign avoids ambiguity, and in this case coincidentally there was even a little more ambiguity.
Any of the three is fine with me, × is neatest, but more cumbersome to write. - Patrick 14:26 Dec 23, 2002 (UTC)
More info, but clearer answers. I don't get a thing and my exam is tomorrow!!!!
Outliers as 2 s.d.s away from mean
Is there not a second definition of outliers, as lying more than two standard deviations away from the mean? Or am I mixing other things up? I am a physicist, and it is a long time since I did "real" statistics... Batmanand | Talk 09:48, 28 September 2006 (UTC)
This is a poor definition of outliers because it changes upon recursion: the standard deviation is itself highly dependent on the outliers, so removing the flagged points and reapplying the rule can flag new ones. See boxplot for a simple, easy-to-understand definition that is not distribution dependent.
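To illustrate the recursion point, here is a minimal sketch in Python using only the standard library. The function names and the sample data are made up for illustration and are not taken from the article; the sample standard deviation (`statistics.stdev`) is an assumption, since the comment above does not say which estimator is meant.

```python
import statistics

def two_sd_outliers(data):
    """Return the points lying more than two sample standard deviations from the mean."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    return [x for x in data if abs(x - mean) > 2 * sd]

def repeated_two_sd(data):
    """Reapply the rule, removing flagged points after each pass, until nothing is flagged."""
    remaining = list(data)
    passes = []
    while len(remaining) > 2:  # stdev needs at least two points
        flagged = two_sd_outliers(remaining)
        if not flagged:
            break
        passes.append(flagged)
        remaining = [x for x in remaining if x not in flagged]
    return passes, remaining

# Hypothetical sample: ten unremarkable values plus 30 and 100.
sample = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 30, 100]
passes, remaining = repeated_two_sd(sample)
print(passes)     # [[100], [30]]
print(remaining)  # the values 10 through 19
```

On the first pass only 100 is flagged, because it inflates the standard deviation enough to hide 30. Once 100 is removed, 30 becomes an outlier under the same rule, which is the sense in which the definition is not stable.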
Isn't an outlier (German: Ausleger, Swedish: utliggare) also a supporting 2nd keel for a canoe or sailing boat that makes it almost a catamaran? Hmm... apparently this is called an outrigger on an outrigger canoe in English. Other languages would use "rig" for things that have sails. --LA2 23:04, 1 August 2007 (UTC)
I'm opposed to the "Mathematical definition" section in the article. Determining whether an observation is an outlier is ultimately a subjective decision, and any definition based on measures such as standard deviation or interquartile range is completely arbitrary.
Methods for identifying outliers
I want to know the source of the interquartile range method mentioned in the text. Is it just a rule of thumb, or does it have a more objective explanation? --Forich (talk) 21:33, 9 June 2008 (UTC)
- It is a popular method, known as "Interquartile range" (IQR), where k = 3 is supposed to identify extreme outliers, and k = 1.5 mild outliers. This method has no scientific basis; it belongs to the category of Mumbo Jumbo methods in statistics. --Lambiam 04:51, 10 June 2008 (UTC)
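For readers who want to see what the rule does in practice, here is a minimal sketch in Python. It assumes the usual form of the fences, Q1 − k·IQR and Q3 + k·IQR, and uses the default quartile method of `statistics.quantiles`; other quartile conventions give slightly different fences. The function names and sample data are hypothetical.

```python
import statistics

def iqr_fences(data, k):
    """Return (low, high) fences at Q1 - k*IQR and Q3 + k*IQR."""
    q1, _median, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def flag_outside(data, k):
    """Return the points falling outside the fences for the given k."""
    low, high = iqr_fences(data, k)
    return [x for x in data if x < low or x > high]

# Hypothetical sample: ten unremarkable values plus 30 and 100.
sample = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 30, 100]
print(flag_outside(sample, 1.5))  # [30, 100]  ("mild" threshold)
print(flag_outside(sample, 3))    # [100]      ("extreme" threshold)
```

The choice of k only moves the fences; nothing in the computation itself says where "mild" ends and "extreme" begins, which is the point being debated in this thread.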
Pointing a citation to its source
The second citation
- "2. ^ Grubbs, F. E.: 1969, Procedures for detecting outlying observations in samples. Technometrics 11, 1–21."
I googled it and found it on jstor at http://www.jstor.org/pss/1266761
Would it be a good idea to link directly to the article in the reference section? I didn't know how and I didn't know if that was appropriate or not. It seems appropriate though.
How is it pronounced?
- Former. Hear it at http://www.merriam-webster.com/dictionary/outlier Glrx (talk) 22:21, 7 December 2012 (UTC)
Much of this section is directly plagiarized from A Survey of Outlier Detection Methodologies (2004) by Hodge & Austin. For example,
"Type 1 - Determine the outliers with no prior knowledge of the data. This is essentially a learning approach analogous to unsupervised clustering. The approach processes the data as a static distribution, pinpoints the most remote points, and flags them as potential outliers." is word-for-word identical, as are the definitions of the following 2 types. At the absolute minimum their work should be cited. — Preceding unsigned comment added by 220.127.116.11 (talk) 21:48, 7 March 2013 (UTC)
Outliers exclusion citation
I think the phrase "Deletion of outlier data is a controversial practice frowned on by many scientists and science instructors" needs a citation or should be reformulated or removed. Why is it controversial? What can happen? Are there examples of bad things that happened because of outlier exclusion? — Preceding unsigned comment added by 18.104.22.168 (talk) 10:26, 18 July 2013 (UTC)