# Talk:Chi-squared test

WikiProject Mathematics (Rated C-class, High-importance)
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Field: Probability and statistics
One of the 500 most frequently viewed mathematics articles.
WikiProject Statistics (Rated C-class, Top-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.


The explanation is completely in theoretical terms. I'm trying to understand an article better, and this piece is absolutely no help in doing so. Depaderico 04:11, 22 October 2007 (UTC)

The specifics of the examples are in the articles linked to, rather than in this article. Certainly more of those could be added, and this article remains somewhat stubby. But if you follow those links, you'll find some examples. Michael Hardy 20:21, 22 October 2007 (UTC)
I repeat my earlier assertion that this article would be aided by an example. It is, of course, fine to link to examples; however, the article should contain some examples. We are talking about an important statistical method, and this is what people will find when they google it. We should really try to explain it as clearly as possible, so that your average Joe (who has a less than 50% chance of having a good background in statistics) can walk away from this article knowing what a Chi-squared test is. Although I'm not saying that the article does a poor job of explaining the test--no, in fact the article is brilliantly written--, a concrete example is most useful in understanding something of this nature. --Depaderico (talk) 00:20, 1 April 2011 (UTC)

## Accuracy?

Whew. That last paragraph (after the table) is a blow-out non sequitur. Where did the p=0.5485 come from? --Preceding unsigned comment added by 70.79.12.65 (talk) 17:53, 13 January 2008 (UTC)

I'd like to know that too. I kinda forgot. --RoSeeker (talk) 17:12, 27 January 2008 (UTC)

Yeah, citing the equation that is used to calculate the 54.85% would be very beneficial. —Preceding unsigned comment added by 132.189.76.18 (talk) 19:46, 22 February 2008 (UTC)
The above editors are right. Also, I can't make sense of the statement that there is a 54 per cent probability of seeing "this data" if the coin is fair. We all know that if we toss a fair coin 100 times the result could easily be 47-53, but it is not more likely than not that we get exactly 47-53. 48-52 would also come up pretty often. Itsmejudith (talk) 15:43, 12 March 2008 (UTC)
I fixed the wording so that the interpretation of the ~54% is correct. Baccyak4H (Yak!) 18:22, 12 March 2008 (UTC)
Clearer now, thanks. Itsmejudith (talk) 08:47, 13 March 2008 (UTC)
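
For anyone retracing this thread: the 0.5485 figure can be reproduced if one assumes the table recorded 47 heads and 53 tails in 100 tosses (the counts mentioned in the comment above; the table itself is not quoted on this page). A minimal Python sketch, with variable names chosen only for illustration:

```python
import math

# Assumed counts (not quoted from the article's table): heads, tails in 100 tosses.
observed = [47, 53]
expected = [50, 50]  # what a fair coin predicts

# Pearson's statistic: sum of (observed - expected)^2 / expected.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# With 1 degree of freedom the upper-tail probability has a closed form:
# P(X > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(round(chi2, 2), round(p_value, 4))
```

Under that assumption the statistic is 0.36 and the tail probability comes out at roughly 0.5485. It is the approximate probability, for a fair coin, of a split at least as uneven as 47-53, not the probability of getting exactly 47-53.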

## Calculating P from Chi2 and DoF

This article seems to be a good introduction, but could use a lot more detail, such as:

• Another example with >1 DoF (Degrees of Freedom)
• DoF = (r-1)(c-1)
• Where do the P-values come from? (how are they computed?)

If anyone knows a formula/algorithm for calculating a P-value from the Chi2 and degrees of freedom, please let me know.

--Karuna8 (talk) 18:14, 17 March 2008 (UTC)

Thank you. An example taking us through every step of calculating the Chi square of a 2x2 contingency table would seem to be a basic requirement. Itsmejudith (talk) 18:39, 17 March 2008 (UTC)

## Move Pearson's chi-squared test to chi-squared test?

Re the above, there's more info at Pearson's chi-squared test. At one point this page (i.e. Chi-squared test) was just a disambiguation page, but it has slowly expanded, I think because it wasn't clear it was just meant to be a disambig page. We could just revert it to being a disambig page, but I've been thinking that, to prevent us going around the same slow circle again, it might be better to move the page currently at Pearson's chi-squared test to Chi-squared test with a note at the top along the lines of:

After all, I don't think there's any question that the vast majority of users who type in "chi-squared test" are looking for Pearson's. Nor is there any historical question that Pearson's paper introduced the use of the symbol chi in this context, so calling it simply "the chi-squared test" seems quite reasonable.

On Karuna8's last point, calculating a P-value from the Chi2 and degrees of freedom requires calculation of the cumulative distribution function of the chi-squared distribution, so more details are at chi-squared distribution, but in a nutshell you need to calculate the incomplete gamma function, which isn't simple. In the past people looked it up in a table, but most people these days use statistical software that has the chi-squared distribution's cdf programmed into it. I wouldn't know how to calculate it or program it from scratch. I'm sure that it's built into various software libraries based on algorithms in the relevant standard textbooks, but I'm equally sure it's not available on Wikipedia, I'm afraid, nor would I see adding it as a high priority. Qwfp (talk) 18:47, 17 March 2008 (UTC)
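
On the request for a written algorithm: for whole-number degrees of freedom, the upper tail can be built up from the standard recurrence for the regularized upper incomplete gamma function, Q(a+1, x) = Q(a, x) + x^a e^(-x) / Γ(a+1), starting from Q(1/2, x) = erfc(√x) or Q(1, x) = e^(-x). The sketch below is one way to do that with only Python's standard library. The function name `chi2_sf` is made up for illustration, and the sketch does not attempt non-integer degrees of freedom or high accuracy in the extreme tails.

```python
import math

def chi2_sf(x, k):
    """Approximate P(X > x) for a chi-squared variable with k degrees of freedom.

    k must be a positive whole number. Hypothetical helper, for illustration only.
    """
    if k < 1 or int(k) != k:
        raise ValueError("k must be a positive integer")
    if x <= 0:
        return 1.0
    h = x / 2.0
    if k % 2 == 0:
        a, q = 1.0, math.exp(-h)               # Q(1, h) = exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))    # Q(1/2, h) = erfc(sqrt(h))
    # Step a up to k/2 using Q(a+1, h) = Q(a, h) + h^a exp(-h) / Gamma(a+1).
    while a < k / 2.0:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return q

# Tabulated 5% critical values for 1, 2 and 3 degrees of freedom.
for x, k in [(3.841, 1), (5.991, 2), (7.815, 3)]:
    print(k, round(chi2_sf(x, k), 3))
```

Each of the three printed probabilities should come out close to 0.05, which is a quick sanity check against the usual printed tables. In practice a statistics library's chi-squared survival function is the safer choice.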

Thanks, I don't think moving is necessary, I just added a 'see also' to the Pearson's page for 'more detail'. That should suffice. Also thanks for the leads, if anyone knows of a written algorithm I could follow, let me know. --Karuna8 (talk) 18:55, 17 March 2008 (UTC)
Agree. I changed my mind; I didn't quite understand the difference before. Since 'chi2 test' really means Pearson's, I think the two should be merged. This page provides a good introduction and the Pearson's page has the more detailed parts. No disambig page is necessary since Yates is really just a modified Pearson's (my books call it Yates correction, not a different test). --Karuna8 (talk) 15:40, 18 March 2008 (UTC)
It could do with sorting out. Simplest would seem to be to make Chi-squared test a disambiguation page again and to move all the material currently here that provides an introduction to the Pearson chi-squared to the Pearson's chi-squared test article. As Karuna says, this can stand as the introduction and the more detailed material in the Pearson's article can simply follow. Itsmejudith (talk) 20:49, 18 March 2008 (UTC)

It's ridiculous to say that "chi-squared test" really means Pearson's. Such a merger is the opposite of what we need to do. Michael Hardy (talk) 16:41, 22 March 2008 (UTC)

Can you explain the difference then, because I don't see it. --Karuna8 (talk) 17:20, 22 March 2008 (UTC)

The difference is that Pearson's chi-squared test is used only for testing a null hypothesis about the sizes of the subsets that a population has been partitioned into. If you throw a die, you can get any of six outcomes; a null hypothesis may say the die is "fair", meaning all six happen equally frequently, or, in the language of the previous sentence, that all six of those subsets of the population are of equal sizes. A chi-squared test generally is any statistical test in which the probability distribution of the test statistic, assuming the null hypothesis is true, is a chi-squared distribution. A simple example would be a table in which the null hypothesis says just that rows and columns are independent. That's not Pearson's chi-squared test, but it is a chi-squared test. There are many other chi-squared tests besides Pearson's. Michael Hardy (talk) 17:52, 22 March 2008 (UTC)
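
To make the die example concrete, here is a small Python sketch with hypothetical counts from 60 throws (the numbers are invented for illustration). With six categories and no estimated parameters there are 6 - 1 = 5 degrees of freedom, and 11.07 is the usual tabulated 5% critical value for 5 degrees of freedom.

```python
# Hypothetical counts of faces 1..6 in 60 throws (invented for illustration).
observed = [8, 12, 9, 11, 10, 10]
n = sum(observed)
expected = [n / 6] * 6  # a fair die predicts equal counts

# Pearson's statistic: sum of (observed - expected)^2 / expected.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
dof = len(observed) - 1

CRITICAL_5PCT_DOF5 = 11.07  # tabulated 5% critical value for 5 degrees of freedom
reject_fair_die = chi2 > CRITICAL_5PCT_DOF5

print(chi2, dof, reject_fair_die)
```

With these made-up counts the statistic is well below the critical value, so the data give no reason to doubt that the die is fair.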

So would it be better to use a 2x2 contingency table, as I have seen done in introductory stats texts? For example, a company has managerial and non-managerial staff, male and female. The null hypothesis is that men and women are equally likely to be managers. We draw up a 2x2 grid with the numbers of workers actually found in the categories, compare with the expected and calculate the Pearson's chi-squared to see if the null can be excluded. Itsmejudith (talk) 11:40, 1 April 2008 (UTC)
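
A sketch of the 2x2 calculation described above, in Python. The staff counts are hypothetical and the function name `chi2_from_table` is made up for illustration. Expected counts are row total × column total / grand total, and the degrees of freedom follow the (r-1)(c-1) rule mentioned earlier on this page.

```python
import math

def chi2_from_table(table):
    """Return (statistic, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n  # expected count if rows and columns are independent
            stat += (obs - exp) ** 2 / exp
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof

# Hypothetical staff counts: rows = men, women; columns = managers, non-managers.
observed = [[30, 70],
            [15, 85]]

stat, dof = chi2_from_table(observed)

# Closed form for the upper tail, valid only when dof == 1.
p_value = math.erfc(math.sqrt(stat / 2))

print(round(stat, 3), dof, round(p_value, 4))
```

For these invented numbers the statistic is about 6.45 on 1 degree of freedom and the p-value is about 0.011, so the null hypothesis would be rejected at the 5% level. No Yates correction is applied in this sketch.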

## What more?

Given that this article has been classed as high priority and is still classed as a stub, can people add discussion here of what needs to be done to improve things. Melcombe (talk) 14:13, 21 April 2008 (UTC)

Enough exemplar material to take a reader who has only vaguely heard of the test through to being able to use it in a real situation. That's my priority, anyway. Itsmejudith (talk) 14:28, 21 April 2008 (UTC)
And could we use the same 2x2 contingency table as in the contingency table article? Itsmejudith (talk) 14:29, 21 April 2008 (UTC)
I think the discussion in the section above concludes that there are many different tests that can reasonably be called chi-squared tests, so that "the test" is not quite appropriate. Potentially what is needed here is a set of simple examples of the different tests to help readers to distinguish between them, but leaving most details to other articles. Melcombe (talk) 11:50, 22 April 2008 (UTC)

I have added a section on testing of the variance of a normal population, and I think that there is now enough to move this article out of the stub class ... so I have done so. Melcombe (talk) 16:21, 4 August 2008 (UTC)
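
For readers who want to see the variance test in numbers: for a sample of size n from a normal population, the statistic (n - 1)s²/σ₀² has a chi-squared distribution with n - 1 degrees of freedom when the true variance is σ₀². The Python sketch below uses a hypothetical sample and a hypothetical null variance; neither comes from the article.

```python
import statistics

# Hypothetical measurements and null-hypothesis variance (invented for illustration).
sample = [9.8, 10.2, 10.1, 9.9, 10.4, 9.6]
sigma0_sq = 0.1

n = len(sample)
s_sq = statistics.variance(sample)   # unbiased sample variance (divides by n - 1)
stat = (n - 1) * s_sq / sigma0_sq    # chi-squared with n - 1 dof under the null
dof = n - 1

print(round(stat, 3), dof)
```

Here the statistic is about 4.2 on 5 degrees of freedom, which would then be compared with chi-squared critical values for 5 degrees of freedom (for example, the tabulated upper 5% point of 11.07).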

## First Line

A chi-squared test (also chi-squared or $\chi^2$ test) is any statistical hypothesis test in which the test statistic has a chi-squared distribution when the null hypothesis is true, or any in which the probability distribution of the test statistic (assuming the null hypothesis is true) can be made to approximate a chi-squared distribution as closely as desired by making the sample size large enough.

The first line says that the null hypothesis is true, but that isn't always the case; it is more accurate to say that it has not been proved otherwise. --Preceding unsigned comment added by 202.89.166.150 (talk) 22:48, 16 November 2008 (UTC)

## Chi-squared test for variance in a normal population

This entire section must be removed / deleted as the author is confusing a normal distribution with that of a chi square distribution in his/her explanation. The normal variance of the Chi-Square Test is based upon that of a chi square distributive function but the explanation points to a normal distribution when the end user clicks on the hyperlink. To tell you guys the absolute truth, this entire wikipage should be re-written from scratch because the definitions and explanations seem to be written from the perspective of someone who is barely familiar with the subject matter shiznaw (talk) 19:31, 2 December 2012 (UTC)shiznaw@gmail.com John Allen Shaw, Econometrics, MA Univ of Utah

## The first sentence

The first sentence is ridiculously complicated! Statistics is very poorly explained on wikipedia, and this is one of the worst examples. — Preceding unsigned comment added by 137.43.182.222 (talk) 17:17, 12 December 2012 (UTC)

I agree. It needs a simple explanation in one paragraph so the reader gets the point, before giving the technical definition and any further explanation. And yes, examples would be good. I looked up this article to find out about the topic, and it sank so quickly into detail that I had to use what I already knew to understand it. (On first impression, I had no idea what it was talking about.)