Talk:Misuse of statistics

From Wikipedia, the free encyclopedia

Data Dredging

The paragraphs on data dredging seem completely reasonable to me. I checked them against the main data dredging page, and against my textbook.

However, I thought it would be far too easy for people casually checking this page to take this section as a reason to believe that this kind of analysis is always wrong, and that's just not true. So I added a short paragraph explaining the caveat: it's OK to do it as long as you check yourself!

This is the first time I've edited Wikipedia... hope I did it right. 71.193.16.80 (talk) 17:53, 2 March 2008 (UTC)
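For anyone who wants to see what "check yourself" could look like in practice, here is a minimal sketch, not taken from the article. It assumes pure-noise data: 200 hypothetical candidate variables are dredged against a random outcome on one half of the sample, and only the best-looking one is re-tested on the held-out half. The sample sizes, the seed, and the helper name `pearson_r` are all illustrative assumptions.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(42)   # fixed seed so the run is repeatable
n_obs = 100               # total observations (hypothetical)
n_predictors = 200        # candidate variables to dredge through (hypothetical)

# Everything is independent noise, so any "finding" is spurious by construction.
outcome = [rng.gauss(0, 1) for _ in range(n_obs)]
predictors = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_predictors)]

half = n_obs // 2

# Step 1: dredge the first half of the data for the strongest correlation.
explore_rs = [pearson_r(p[:half], outcome[:half]) for p in predictors]
best = max(range(n_predictors), key=lambda i: abs(explore_rs[i]))
explore_r = explore_rs[best]

# Step 2: re-test only that one variable on the untouched second half.
holdout_r = pearson_r(predictors[best][half:], outcome[half:])

print(f"best predictor index: {best}")
print(f"correlation on exploration half: {explore_r:+.3f}")
print(f"correlation on holdout half:     {holdout_r:+.3f}")
```

The point of the design is that the holdout half plays no role in choosing the variable, so a correlation that survives there is much harder to explain as a selection artifact.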

Almost certain that this interpretation of the 95% C.I. is wrong

"In marketing terms all a company has to do to promote a neutral (useless) product is to find or conduct, for example, 20 studies with a confidence level of 95%. Even if the product is really useless, on average one of the 20 studies will show a positive effect purely by chance (this is what a 95% level of confidence means)"

The 95% C.I. refers to the "middle" 95% of the distribution; this means that the outer 5% is split between values above and values below the interval. So the "unusually good" experiments alone would be only the top 2.5% of the sample.

Long story short, I think that, on average, only 1 in 40 studies will show the positive effect.


P.S.: It's kind of ironic that an article about the misuse of statistics would (accidentally) misuse statistics. Akshayaj 21:31, 2 July 2007 (UTC)

It depends on whether the significance test is one-tailed or two-tailed. In this case, a one-tailed test would be used, so a false positive would arise in 1 out of 20 cases. In a two-tailed test of a useless product, one in 40 studies would show a spurious positive effect and one in 40 would show a spurious negative effect. --Wikiman2718 (talk) 18:40, 25 May 2019 (UTC)
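The arithmetic in this thread can be checked with a short sketch. It assumes the 20 studies are independent and that the product truly has no effect; the function names are made up for illustration. The one-tailed case puts the full 5% in the "product helps" tail, while the two-tailed case leaves only 2.5% there.

```python
def expected_false_positives(n_studies: int, alpha: float) -> float:
    """Expected number of studies that reject a true null hypothesis."""
    return n_studies * alpha


def prob_at_least_one(n_studies: int, alpha: float) -> float:
    """Chance that at least one of n independent studies rejects a true null."""
    return 1.0 - (1.0 - alpha) ** n_studies


n_studies = 20
one_tailed_alpha = 0.05        # whole 5% sits in the "product helps" tail
two_tailed_upper = 0.05 / 2    # two-tailed: only 2.5% is in the "helps" tail

expected_one = expected_false_positives(n_studies, one_tailed_alpha)
expected_two = expected_false_positives(n_studies, two_tailed_upper)
p_any_one = prob_at_least_one(n_studies, one_tailed_alpha)
p_any_two = prob_at_least_one(n_studies, two_tailed_upper)

print(f"one-tailed: expected spurious 'positive' studies = {expected_one:.2f}, "
      f"P(at least one) = {p_any_one:.3f}")
print(f"two-tailed: expected spurious 'positive' studies = {expected_two:.2f}, "
      f"P(at least one) = {p_any_two:.3f}")
```

Note that "one in 20 on average" is an expected count, not a guarantee: under these assumptions, the chance that at least one of the 20 studies comes out spuriously positive is well below 100%.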

Oversimplification of C.I. (again)

"Data mining is the examination of large compilations of statistics in order to find a correlation. Since the required confidence interval to establish a relationship between two parameters is usually pegged at 95% (meaning that there is a 95% chance that two parameters are related), there is also a 5% chance to find a correlation between two sets of completely random variables."

Usually, a researcher says there is a significant relationship between two random variables if they can reject the null hypothesis, which means that there is a <5% chance of this correlation showing up just by chance. While this is basically what the paragraph above is saying, I think it confuses the issue and unfairly pegs the error rate at 5% in every case.

As I'm not really sure how this is different from the section I edited above, I'll delete it and let someone who knows the specifics of data mining edit it back in. Akshayaj 21:51, 2 July 2007 (UTC)

Update

I see the data mining section has been put back in. I'm pretty sure this section is incorrect. For one thing, nobody ever uses confidence intervals to establish correlations, but rather uses R^2. The idea that there's a 5% chance that two independent variables show correlation "just by chance" is wrong.

However, I'm no expert on data mining, so I'll defer to the author of the paragraph and ask for another poster to confirm or deny. Akshayaj 19:09, 18 July 2007 (UTC)
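One way to probe the disputed 5% figure is a small simulation. This is a sketch under stated assumptions, not the method used in the article: each pair of variables is independent Gaussian noise, and each pair is tested at the 5% level (two-tailed) using the Fisher z-transform as a normal approximation. The number of pairs, the sample size, and the function names are all hypothetical.

```python
import math
import random
from statistics import NormalDist


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def is_significant(r, n, alpha=0.05):
    """Two-tailed test of zero correlation via the Fisher z approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)   # roughly standard normal under the null
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > z_crit


def spurious_rate(n_pairs=2000, n_obs=30, alpha=0.05, seed=0):
    """Fraction of independent noise pairs flagged as 'significantly' correlated."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_pairs):
        xs = [rng.gauss(0, 1) for _ in range(n_obs)]
        ys = [rng.gauss(0, 1) for _ in range(n_obs)]
        if is_significant(pearson_r(xs, ys), n_obs, alpha):
            hits += 1
    return hits / n_pairs


rate = spurious_rate()
print(f"fraction of unrelated pairs flagged at the 5% level: {rate:.3f}")
```

Under these assumptions the flagged fraction should land near the chosen significance level, which is the false-positive rate for unrelated pairs and not the probability that a flagged pair is truly related.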

Good book to read

The book _Lies, damn lies, and statistics_ is a good read on the misuse of statistics, as commonly used to promote politicians, policies, products, ideologies, and medical ideas regardless of their merit.


From Talk:Misuse Of Statistics:

The latest presidential election might provide some food for thought about the misuse of statistics.

What do you think of the exit poll numbers not matching the votes recorded on voting machines? Does the wide discrepancy prove anything? Are those exit poll numbers a good statistic misused, a flawed statistic, or simply a good one?

So many ways to misuse a statistic. So little time! (-;

bad example

" With a subject on which the general public has no personal knowledge of, you can fool a lot of people. For example you can say on TV "Most autistics are hopelessly incurable if raised without parents or normal education" and many people will only remember the first part of the claim, "Most autistics are hopelessly incurable". "

The suggestion that autism is curable is itself arguable. Whether hopelessly or otherwise - the behavioural symptoms can be addressed, but the underlying condition cannot be cured.

Unfortunately I can't think of a better alternative - but this is a wrong 'un.


I rewrote this section with better examples and proper citations. Jdoucett (talk) 02:38, 16 November 2010 (UTC)

Additional info???

I am wondering if there might be a box or section devoted to specific procedures for DIAGNOSING/CURING misuses? Or does that belong in a whole different entry, like "Detecting Misuse of Statistics"?


Quality of the article and NPOV

Obviously there are unlimited examples of people with something to sell abusing statistics, but let's keep the examples here away from hot-button issues; there's no benefit to it, and people can debate contemporary topics under the appropriate topic headings. The Michael Fumento bit reads like an example of the very thing it's attempting to illustrate: selective reporting. And its citation is a dead link. --DKEdwards 21:12, 21 November 2006 (UTC)

Jargon...

I only tagged the section "Linguistically asserting unit measure when it is empirically violated" for jargon, though other sections might need cleanup too. But that example needs to be completely rewritten, because there is no way the average reader can make any sense of it. I have a background in stats and I am not even sure what that section means. --70.80.234.196 (talk) 18:14, 2 May 2010 (UTC)

I've just deleted the whole section. It's very opaque, and seems off-topic to me - more about probability axioms than statistics. --Avenue (talk) 21:56, 10 August 2010 (UTC)

Introductory Comments

The introductory comments seem extreme given the lack of references. Are there any references for this? If professional statisticians fool themselves, by which standard do we judge this exactly? 68.35.128.100 (talk) 00:58, 30 August 2010 (UTC)

Null hypothesis vs. negligibly small correlation

Another quite common problem with statistics is that many people seem to care only about whether the null hypothesis is true, not about how strong the correlation actually is. For example, people want to know whether smoking can cause cancer, but if, e.g., only one in 10 billion smokers got cancer, they would still consider 'smoking causes cancer' an important statement, although for practical purposes the risk would be negligible compared to other cancer risk factors. In other words, the difference between 'nobody gets cancer from smoking' and '1 in 10 billion people gets cancer from smoking' is considered more important than the difference between '1 in 10 billion people gets cancer from smoking' and '1 in 20 people gets cancer from smoking'. This problem is related to the so-called zero-risk bias, but creating sensational news headlines from very small correlations should, in my opinion, also be considered misuse of statistics. O.mangold (talk) 12:21, 13 December 2010 (UTC)
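The gap between "statistically significant" and "practically important" can be made concrete with a sketch. The counts below are entirely hypothetical and chosen only to illustrate the point: with a very large sample, a two-proportion z-test (a standard textbook test, assumed here for illustration) flags a difference as significant even when the relative risk is barely above 1.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(cases_a, n_a, cases_b, n_b):
    """Two-tailed z-test for a difference between two proportions (pooled SE)."""
    p_a = cases_a / n_a
    p_b = cases_b / n_b
    pooled = (cases_a + cases_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts, not real data: 50 million people in each group.
n = 50_000_000
cases_unexposed = 500_000   # 1.00% of the unexposed group
cases_exposed = 505_000     # 1.01% of the exposed group

z, p_value = two_proportion_z_test(cases_unexposed, n, cases_exposed, n)
relative_risk = (cases_exposed / n) / (cases_unexposed / n)
risk_difference = (cases_exposed - cases_unexposed) / n

print(f"z = {z:.2f}, p-value = {p_value:.2e}")
print(f"relative risk = {relative_risk:.3f}, "
      f"absolute risk difference = {risk_difference:.4%}")
```

Reporting the effect size (relative risk, absolute risk difference) next to the p-value is one way to keep a tiny effect from being presented as a large one.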

Most basic misuse of statistics?

In my experience, the most basic misuse of statistics, particularly prevalent in the (news) media, is to take probabilities relevant to a general situation and apply them to a specific instance. Should this be added to the article, perhaps in the introduction?

Perhaps it is not so much a misuse of statistics, as a misunderstanding of statistics or ill-education?

Spartan26 (talk) 02:36, 14 October 2011 (UTC)

An important question is whether you can find a good citation for someone pointing out this problem. Melcombe (talk) 08:46, 14 October 2011 (UTC)

Challenge

I challenge the Non-enduring class fallacies section. Google searches all ultimately point to this article as the source. Unless citations are produced, the section will be deleted. 172.250.105.20 (talk) 19:49, 17 April 2014 (UTC)

Done. 172.250.105.20 (talk) 17:50, 3 May 2014 (UTC)

External links modified

Hello fellow Wikipedians,

I have just modified 5 external links on Misuse of statistics. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

When you have finished reviewing my changes, please set the checked parameter below to true or failed to let others know (documentation at {{Sourcecheck}}).

This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 18 January 2022).

  • If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
  • If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers. cyberbot II (Talk to my owner: Online) 17:31, 2 April 2016 (UTC)

Requested move 20 April 2018

The following is a closed discussion of a requested move. Please do not modify it. Subsequent comments should be made in a new section on the talk page. Editors desiring to contest the closing decision should consider a move review. No further edits should be made to this section.

The result of the move request was: not moved. Andrewa (talk) 17:44, 28 April 2018 (UTC)


Misuse of statistics → Error of statistics – Wikipedia has been the subject of criticism for gender bias. By renaming this article to Error of statistics, we can eliminate any possible gender bias for this article, and fix the redirect from the main statistics article. Brian Everlasting (talk) 02:05, 20 April 2018 (UTC)

Are you pulling our legs? "mis- (1) prefix meaning "bad, wrong," from Old English mis-, from Proto-Germanic *missa- "divergent, astray" (source also of Old Frisian and Old Saxon mis-, Middle Dutch misse-, Old High German missa-, German miß-, Old Norse mis-, Gothic missa-), perhaps literally "in a changed manner," and with a root sense of "difference, change" (compare Gothic misso "mutually"), and thus from PIE *mit-to-, from root *mei- (1) "to change." Productive as word-forming element in Old English (as in mislæran "to give bad advice, teach amiss"). In 14c.-16c. in a few verbs its sense began to be felt as "unfavorably" and was used as an intensive prefix with verbs already expressing negative feeling (as in misdoubt). Practically a separate word in Old and early Middle English (and often written as such). Old English also had an adjective (mislic "diverse, unlike, various") and an adverb (mislice "in various directions, wrongly, astray") derived from it, corresponding to German misslich (adj.)." There is no gender involved in "mis-". 128.135.96.56 (talk) 02:22, 20 April 2018 (UTC)
"miss (n.2) "the term of honour to a young girl" [Johnson], originally (c. 1600) a shortened form of mistress. By 1640s as "prostitute, concubine;" sense of "title for a young unmarried woman, girl" first recorded 1660s." "mistress (n.) early 14c., "female teacher, governess," from Old French maistresse "mistress (lover); housekeeper; governess, female teacher" (Modern French maîtresse), fem. of maistre "master" (see master (n.)). Sense of "a woman who employs others or has authority over servants" is from early 15c. Sense of "kept woman of a married man" is from early 15c." Not related. 128.135.96.56 (talk) 02:39, 20 April 2018 (UTC)
  • Strong oppose and speedy close - misconstrues or misunderstands the meaning/origin of the word. Seems like WP:POINT or WP:RIGHTGREATWRONGS, not a valid reason supported by WP:CRITERIA. -- Netoholic @ 03:28, 20 April 2018 (UTC)
  • Oppose. Like this editor's other recent move proposal, this is simply nonsensical. There is no gender bias, there is no other reason to move, and the suggested title is markedly worse than the current one. --JohnBlackburne (words, deeds) 07:06, 20 April 2018 (UTC)
  • Oppose. Ridiculous. Headbomb {t · c · p · b} 18:12, 26 April 2018 (UTC)
  • Closing comment: Where is BJAODN when I need it? Andrewa (talk) 17:44, 28 April 2018 (UTC)

The above discussion is preserved as an archive of a requested move. Please do not modify it. Subsequent comments should be made in a new section on this talk page or in a move review. No further edits should be made to this section.