|WikiProject Statistics||(Rated Start-class, Mid-importance)|
|WikiProject Mathematics||(Rated Start-class, Mid-importance)|
Critical significance level
Dear author, are you sure we cannot reject the null hypothesis at the 5% significance level? Maybe at 1%, since the p-value is .027?
- It's best not to address questions using 'Dear author' - articles are written collaboratively. Richard001 08:29, 23 May 2007 (UTC)
Working at critical significance level 0.05, we may not reject the null hypothesis in this case. The null hypothesis was that the die is fair (two-tailed), not that the die is loaded towards 6 (a one-tailed statement), but to calculate the binomial test statistic we run a one-tailed test. The result (P = 0.027) must then be multiplied by 2 before comparing with the critical significance level, 0.05. Since 2(0.027) = 0.054 > 0.05, we cannot reject the null hypothesis.
- Sorry - when the two-sided statistic is computed in R, the p-value is less than twice the one-tailed value. Hence
binom.test(51,235,(1/6),alternative="two.sided") returns p-value = 0.04375, and
binom.test(51,235,(1/6),alternative="greater") returns p-value = 0.02654. —Preceding unsigned comment added by Achristoffersen (talk • contribs) 09:43, 19 December 2008 (UTC)
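The one-tailed figure quoted above can be cross-checked without R. The following is a minimal Python sketch, not R's implementation; the function name `upper_tail` is made up for illustration. It sums the binomial probabilities of 51 or more sixes in 235 rolls of a fair die and compares twice that value with 0.05.

```python
from math import comb

def upper_tail(x, n, p):
    """P(X >= x) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

# 51 sixes in 235 rolls, null hypothesis p = 1/6 (numbers from the thread).
one_tailed = upper_tail(51, 235, 1 / 6)
doubled = 2 * one_tailed

print(f"one-tailed p-value: {one_tailed:.5f}")
print(f"doubled:            {doubled:.5f}")
print("doubled value exceeds 0.05:", doubled > 0.05)
```

If the R output quoted above is right, the first printed value should agree with 0.02654. The doubled value then sits just above 0.05 while R's two-sided 0.04375 sits just below it, which is why the two approaches lead to different conclusions here.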
Integrated explanation of how the two-tailed test works and why it is not the same as doubling the one-tailed test. Page now agrees with R :] —Preceding unsigned comment added by 220.127.116.11 (talk) 23:09, 27 May 2009 (UTC)
I see the logic of the two-tailed test of "equal effect size", but I am not sure it is right. At least it seems arguable to me. The effect size can be seen in probabilistic terms or in absolute terms. It really is a question of how you frame the question. Is it: What is the probability of getting Xroll(6) +/- Yroll(6)? Or is it: What is the probability of getting a value so far down the tail of the distribution? —Preceding unsigned comment added by 18.104.22.168 (talk) 14:33, 16 October 2010 (UTC)
- It's important to point out that good statistics programs, like R, do not calculate the two-tailed result using an "equal effect size" based on pure distance from the mean. That's a heuristic used for back-of-the-envelope calculations. Unfortunately, that heuristic has also been picked up by some less-than-R statistics packages. The correct way to calculate the two-tailed binomial test is to (assuming the observed outcome $x$ is less than the mean, for this example) sum
$\Pr(X \le x) + \Pr(X \ge x^*)$
- where the "special" $x^*$ is the lowest outcome above the mean with $\Pr(X = x^*) \le \Pr(X = x)$. That is, R cuts a horizontal line across the distribution at a height equal to the probability of the observed outcome. It then sums the area underneath that horizontal line (thus capturing both the left and right tails). If you simply double the one-sided result or use an absolute effect size, your two-sided result will be more and more incorrect as your distribution becomes more and more skewed (i.e., as the expected probability strays farther away from 0.5). (Side note: to see how R calculates the two-sided result, type "binom.test" alone (without quotes) in R.) —TedPavlic (talk/contrib/@) 22:32, 19 February 2012 (UTC)
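The "horizontal line" description above can be written out directly. This is a minimal Python sketch, not R's source code: `binom_pmf`, `two_sided_p` and the `rel_tol` slack are names and choices made up here. It adds up every outcome whose probability is no larger than that of the observed outcome, which for a single-peaked distribution such as the binomial picks out the two tails.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def two_sided_p(x, n, p, rel_tol=1e-7):
    """Sum the probabilities of all outcomes no more likely than the observed x.

    rel_tol is a small slack (an assumption of this sketch) so that outcomes
    tied with x in exact arithmetic are not lost to floating-point rounding.
    """
    threshold = binom_pmf(x, n, p) * (1 + rel_tol)
    probs = (binom_pmf(k, n, p) for k in range(n + 1))
    return sum(q for q in probs if q <= threshold)

# 51 sixes in 235 rolls, null hypothesis p = 1/6 (numbers from the thread).
p_two_sided = two_sided_p(51, 235, 1 / 6)
p_doubled = 2 * sum(binom_pmf(k, 235, 1 / 6) for k in range(51, 236))

print(f"threshold method: {p_two_sided:.5f}")
print(f"doubled one-tail: {p_doubled:.5f}")
```

With p = 1/6 the distribution is skewed to the right, so the lower tail picked out by the threshold carries less probability than the upper tail, and the threshold result comes out smaller than the doubled one-tailed value. If the sketch is correct, the first printed value should agree with the 0.04375 that R reports earlier on this page.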
An equal "effect size" would suggest a symmetrical distribution, which this is not. I have deleted the relevant paragraph. — Preceding unsigned comment added by 22.214.171.124 (talk) 00:19, 8 May 2012 (UTC)
In statistical software packages
The text in this section that describes how to carry out the binomial test in Matlab is incorrect. Specifically, this statement is wrong:
- In MATLAB, use binofit:
[phat,pci]=binofit(51,235,0.05) (generally two-tailed, one-tailed for the extreme cases "0 out of n" and "n out of n"). You will get back the probability for the dice to roll a six (phat) as well as the confidence interval (pci) for the confidence level of 95% = (1-0.05), respectively a significance of 5%.
In fact, binofit only returns a maximum likelihood estimate of the success probability, together with a confidence interval, and does not perform the hypothesis test. As far as I know, there is no built-in binomial test function in Matlab. I suggest that we delete the binofit line from the article. Paresnah (talk) 22:54, 26 July 2016 (UTC)
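To make the distinction concrete, here is a rough Python sketch of the kind of output described for binofit: a point estimate and a confidence interval, with no p-value. It assumes the interval is the Clopper-Pearson ("exact") one, which is an assumption here and not something stated on this page; the helper names `binom_cdf`, `bisect_root` and `estimate_with_interval` are invented for illustration.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def bisect_root(f, lo, hi, iters=100):
    """Root of f on [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2

def estimate_with_interval(x, n, alpha=0.05):
    """Maximum likelihood estimate x/n and an assumed Clopper-Pearson interval."""
    phat = x / n
    # Lower limit: the p at which P(X >= x) equals alpha/2 (0 when x == 0).
    lower = 0.0 if x == 0 else bisect_root(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2, 0.0, 1.0)
    # Upper limit: the p at which P(X <= x) equals alpha/2 (1 when x == n).
    upper = 1.0 if x == n else bisect_root(
        lambda p: binom_cdf(x, n, p) - alpha / 2, 0.0, 1.0)
    return phat, (lower, upper)

phat, (lower, upper) = estimate_with_interval(51, 235)
print(f"phat = {phat:.4f}, 95% interval = ({lower:.4f}, {upper:.4f})")
```

Nothing in this output is a p-value. One can check by eye whether 1/6 falls inside the interval, but that is a different procedure from the two-sided test discussed in the section above and need not give the same verdict.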