Comment on Dubious

The citation (USEPA December 1992) contains numerous statistical tests, some presented as p-values and some as confidence intervals. Figures 5-1 through 5-4 show some of the test statistics used in the citation. In two of the figures the statistics are for individual studies and can be assumed to be prospective. In the other two, the statistics are for pooled studies and can be assumed to be retrospective. Table 5-9 includes the results of the test of the hypothesis RR = 1 versus RR > 1 for individual studies and for pooled studies. In the cited report, no distinction between prospective tests and retrospective tests was made. This is a departure from the traditional scientific method, which makes a strict distinction between predictions of the future and explanations of the past. Gjsis (talk)

Deleted history text from the lead

I noticed that the history text from the lead was deleted and the rationale given was "Remove lede para repeated almost exactly two paras later." This was an issue that was raised earlier in the discussion and I forgot to respond to it. I agree that we shouldn't repeat things needlessly. However, the lead is supposed to summarize the entire article (WP:LEAD). Describing the history of statistical significance should be a part of that summary. Rather than delete the text, I would strongly prefer to see it either paraphrased or at least have the text in the history section expanded so that it won't be perceived as needless repetition. danielkueh (talk) 21:02, 10 October 2016 (UTC)

I added a short note about the history back in. – SJ + 19:15, 24 October 2016 (UTC)

Journals banning significance testing

There is a small movement among some journals to ban significance testing as justification of results. This is largely in subfields where significance testing has been overused or misinterpreted. For instance, Basic and Applied Social Psychology did so in early 2015. I think this is worth mentioning somewhere in the article. Thoughts? – SJ + 19:15, 24 October 2016 (UTC)

I guess the issue would be due weight (WP:WEIGHT). Are there prominent secondary sources (e.g., review articles) that comment on and encourage this movement? Or is this just an editorial policy of a handful of journals? If the latter, I recommend holding off. danielkueh (talk) 20:18, 24 October 2016 (UTC)
Yes, it seems to be a big deal to some secondary sources. Some suggest that using null hypothesis significance testing to estimate the importance of a result is controversial. Here's Nature noting the controversy, here is Science News calling the method flawed, and here's an overview of the argument over P-values from a stats prof. All highlight the decision by BASP as a critical point in this field-wide discussion. – SJ + 04:05, 4 November 2016 (UTC)

Part of the debate: Why Most Published Research Findings Are False: [1]. Isambard Kingdom (talk) 20:28, 24 October 2016 (UTC)

Thanks for sharing, but the PLoS article doesn't recommend or encourage the banning of statistical significance. Instead, it recommends that researchers not just "chase statistical significance" and that they also improve other factors related to sample size and experimental design. danielkueh (talk) 20:40, 24 October 2016 (UTC)

This article may be really helpful as a citation in the Reproducibility section. This could help to provide insight as to why sometimes it is so difficult to reproduce a study when the original researchers "chased" statistical significance. (talk) 01:23, 7 November 2016 (UTC)

Timeline for introduction of 'null hypothesis' as concept

The null hypothesis wasn't given that name until 1935 (per Lady_tasting_tea). Perhaps there is a way, in the history section, to describe the original definition and the Neyman-Pearson results without using that term. At the least this could include a ref to Fisher's work clarifying the concept. – SJ + 04:24, 4 November 2016 (UTC)