# Talk:P-value


## Source cited for definition of P-value

For an article about p-values, it is odd that the definition is taken from a journal paper that disparages the p-value: I would not expect a fair/balanced definition from such a source. If the definition was not taken from the cited paper, then the citation is wrong and is probably due to somebody's personal agenda.

## Lead paragraph

The lead paragraph is unintelligible. drh (talk) 17:40, 13 April 2015 (UTC)

How about starting with something along these lines? In statistics, a p-value is a measure of how likely an observed result is to have occurred by random chance alone (under certain assumptions). Thus low p values indicate unlikely results that suggest a need for further explanations (beyond random chance). Tayste (edits) 23:22, 13 April 2015 (UTC)

The problem with that definition is that it is wrong. — Preceding unsigned comment added by 108.184.177.134 (talk) 15:50, 11 May 2015 (UTC)

How is it wrong? I'm a statistics layperson, so I could be missing something, but it's not clear what the objection is. When P < .05, that means that there's less than a 5% chance that the null hypothesis is true. Which means there's less than a 5% chance that the result would have happened anyway regardless of the correlations that the experimenters think/hope/believe/suspect are involved. Please explain the allegation of wrongness. Quercus solaris (talk) 21:55, 12 May 2015 (UTC)
The "probability that significance will occur when the null hypothesis is true" is not the same thing as the "probability that the null hypothesis is true when significance has occurred," just as the probability of someone dying if they are struck by lightning (presumably quite high) is not the same as the probability that someone was struck by lightning if they are dead (presumably quite low). More generally, the probability of A given B is not the same as the probability of B given A. Consider a null hypothesis that is almost certainly true, such as "The color of my underwear on a given day does not affect the temperature on the surface of Mars." If you do a study to test that null hypothesis, 1 in 20 times the p-value will be less than .05. When that happens, by your logic we should conclude that there is a greater than 95% chance that the color of your underwear affects the temperature on the surface of Mars! I hope that answers your question. — Preceding unsigned comment added by 108.184.177.134 (talk) 14:35, 13 May 2015 (UTC)
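The "1 in 20 times" point above can be checked numerically. Below is a minimal Python sketch, not taken from any source cited in this thread. It assumes a two-sided z-test with a known standard deviation of 1, and the helper names (`z_test_p_value`, `false_positive_rate`), the sample size, the number of simulated studies, and the seed are all made up for illustration. Every simulated study has the null hypothesis true by construction.

```python
import random
from statistics import NormalDist, fmean


def z_test_p_value(sample, sigma=1.0):
    """Two-sided p-value for H0: true mean is 0, with known standard deviation sigma."""
    n = len(sample)
    z = fmean(sample) / (sigma / n ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(n_studies=20000, n_per_study=30, alpha=0.05, seed=0):
    """Share of simulated studies with p < alpha when H0 is true in every study."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    hits = 0
    for _ in range(n_studies):
        # Data drawn with true mean 0, so the null hypothesis really holds.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_per_study)]
        if z_test_p_value(sample) < alpha:
            hits += 1
    return hits / n_studies


rate = false_positive_rate()
print(f"share of studies with p < .05 under a true null: {rate:.3f}")
```

The printed share should land near 0.05, which is just the significance threshold. It says nothing about how likely the null hypothesis is to be true, since here it is true in every single run.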
I like the approach of thinking critically about a null hypothesis that is almost certainly true (more like "damn well has to be true"). But the way you frame it does not match how P is usually used in journal articles nor what was described at the start of the thread. P is never "the probability of significance". P is the probability of the null hypothesis being true. (Full stop.) Regarding "the probability that the null hypothesis is true when significance has occurred," that value is, by definition of significance, always less than the significance level (which is usually 5%). Significance is a yes/no (significant or not) dichotomy with the threshold set at a value of P (at heart an arbitrary value, although a sensibly chosen one). Therefore, when you mention "a greater than 95% chance that the color of your underwear affects the temperature on the surface of Mars", which means "a greater than 95% chance that the null hypothesis is false", it has nothing to do with any logic that I or User:Tayste mentioned. When a journal article reporting a medical RCT says that "significance was set at P < .05", what it is saying is that "a result was considered significant only if there was a less than 5% chance of it occurring regardless of the treatment." So your thought experiment doesn't jibe with how P is normally used, but I like your critical approach. Quercus solaris (talk) 01:03, 14 May 2015 (UTC)

The statement that "P is the probability of the null hypothesis being true" is unequivocally incorrect (though it may be a common misconception). If any journal article interprets p-values that way, that is their error. You stated that my reference to "a greater than 95% chance that the null hypothesis is false" "has nothing to do with any logic that I or User:Tayste mentioned." On the contrary, you referred to "a 5% chance that the null hypothesis is true," which is exactly equivalent to "a 95% chance that the null hypothesis is false," just as a 95% chance of it raining is equivalent to a 5% chance of it not raining. Or more generally, P(A) = 1 - P(not A). These are very elementary concepts of probability that should be understood before attempting to debate about the meaning of p-values. — Preceding unsigned comment added by 99.47.244.244 (talk) 21:25, 14 May 2015 (UTC)
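To put numbers on why the p-value is not the probability that the null hypothesis is true, here is a rough sketch using Bayes' theorem. The prior probabilities and the 80% power figure are assumptions chosen purely for illustration, and `prob_null_given_significant` is a hypothetical helper, not a standard function.

```python
def prob_null_given_significant(prior_null, alpha=0.05, power=0.8):
    """P(H0 true | result significant), via Bayes' theorem.

    prior_null: assumed probability that H0 is true before seeing any data
    alpha:      P(significant | H0 true), i.e. the significance level
    power:      assumed P(significant | H0 false)
    """
    sig_and_null = alpha * prior_null
    sig_and_alt = power * (1 - prior_null)
    return sig_and_null / (sig_and_null + sig_and_alt)


# A coin-toss prior versus a null that is almost certainly true
# (in the spirit of the underwear-and-Mars example).
for prior in (0.5, 0.999999):
    posterior = prob_null_given_significant(prior)
    print(f"prior P(H0) = {prior}: P(H0 | significant) = {posterior:.4f}")
```

When the null is almost certainly true to begin with, a significant result leaves it almost certainly true. The answer depends on the prior and on the power, and the p-value contains neither.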

I'm sorry, I see now how I was wrong and I see that your underlying math/logic points are correct, even though I still think the words they were phrased in, particularly how significance is mentioned, are not saying it right (correct intent but wording not right). Even though I was correct in the portion "Regarding "the probability that the null hypothesis is true when significance has occurred," that value is, by definition of significance, always less than the significance level (which is usually 5%). Significance is a yes/no (significant or not) dichotomy with the threshold set at a value of P (at heart an arbitrary value, although a sensibly chosen one)."—even though that portion is correct, I was wrong elsewhere. I see that the American Medical Association's Glossary of Statistical Terms says that P is the "probability of obtaining the observed data (or data that are more extreme) if the null hypothesis were exactly true.44(p206)" So one can say: "I choose 5% as my threshold. Assume the null hypothesis *is* exactly true. Then P = .05 means that the probability of getting this data distribution is 5%." But, as you correctly pointed out with your "1 in 20 times", if you run the experiment 100 times, you should expect to get that data distribution around 5 times anyway. So you can't look at one of those runs and find out anything about the null hypothesis's truth. But my brain was jumbling it into something like "If your data isn't junk (i.e., if your methods were valid) and you got an extreme-ish distribution, the observed distribution (observed = did happen) had only a 5% chance of having that much extremeness if the null hypothesis were true, so the fact that it did have it means that you can be 95% sure that the null hypothesis is false." It's weird, I'm pretty sure a lot of laypeople jumble it that way, but now I am seeing how wrong it is. 
Wondering why it is common if it is wrong, I think the biggest factor involved is that many experiments only have one run (often they *can* only have one run), and people forget about the "1 in 20 runs" idea. They fixate on looking at that one run's data, and thinking what? That a lot of meaning can be found there, when it's really not much? I don't know; already out of time to ponder. Sorry to have wasted your time on this. I do regret being remedial and I regret that only a sliver of the populace has a firm and robust grasp of statistics. Most of us, even if we try to teach ourselves by reading about it, quickly reach a point where it might as well be a mad scientist's squiggles on a chalkboard; we get lost in the math formulae. Even I, despite having a high IQ and doing fine in math in K-12, couldn't pass an algebra test anymore; too many years without studying. You were right that I shouldn't even be trying to debate the topic, but the portions where I was right made me feel the need to pursue it, to figure out what's what. Quercus solaris (talk) 22:53, 14 May 2015 (UTC)
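For anyone else who gets lost in the formulae, the glossary definition quoted above ("the observed data (or data that are more extreme) if the null hypothesis were exactly true") can be turned into a small calculation. This is only a sketch under stated assumptions: the coin-flip setup is a made-up example, `binomial_p_value` is a hypothetical helper, and "more extreme" is taken to mean "no more probable under the null than what was observed", which is one common two-sided convention among several.

```python
from math import comb


def binomial_p_value(heads, flips, p_null=0.5):
    """Two-sided exact p-value for a coin-flip experiment.

    Adds up, assuming H0 (the coin lands heads with probability p_null),
    the probability of every outcome that is no more likely than the
    observed number of heads.
    """
    pmf = [
        comb(flips, k) * p_null ** k * (1 - p_null) ** (flips - k)
        for k in range(flips + 1)
    ]
    observed = pmf[heads]
    # Small relative tolerance so ties are not lost to floating-point rounding.
    return sum(prob for prob in pmf if prob <= observed * (1 + 1e-9))


# Hypothetical data: 9 heads in 10 flips of a coin assumed fair under H0.
p = binomial_p_value(9, 10)
print(f"p = {p:.4f}")
```

For 9 heads in 10 flips, the outcomes counted are 0, 1, 9 and 10 heads, giving 22/1024, or about 0.021. That number is the probability of data at least this lopsided from a fair coin. It is not the probability that the coin is fair.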

## P value vs P-value vs p-value vs p value

I am avoiding italics here (because I can't be bothered checking how to render them). I know "P value" varies depending on style guide etc., but I think that we should at least be able to agree that, unless used as a modifier ("the p-value style"), a hyphen is unnecessary. Can we make this change? Pretty please. 36.224.218.156 (talk) 07:36, 7 May 2015 (UTC)

Just to clarify, there are two relevant topics here: one is about establishing a style point in Wikipedia's own style guide (WP:MOS), whereas the other is about this article's encyclopedic coverage of how the term is styled by the many other style guides that exist. As for the latter, the coverage is already correct as-is. As for the former, no objection to your preference; the place to get started in proposing it is Wikipedia talk:Manual of Style. From there, it may be handled either at that page or at Wikipedia talk:WikiProject Statistics, depending on who gets involved and how much they care about the styling. Quercus solaris (talk) 00:00, 8 May 2015 (UTC)
Can you do it? Pretty please. 27.246.138.86 (talk) 14:35, 16 May 2015 (UTC)
If inspiration strikes. I lack time to participate in WP:MOS generally, and the people who do participate usually end up with good styling decisions, so it doesn't become necessary to me to get involved. The hyphen in this instance doesn't irritate me (although I would not choose it myself), so I may not find the time to pursue it. But if anyone chooses to bother, I would give a support vote. Quercus solaris (talk) 14:58, 16 May 2015 (UTC)