Talk:Chi-squared distribution

Image

I would like to contribute this image if folks feel it would improve the Wiki. Since I have a COI with SAS Software and these charts are modeled in SAS, I wanted to post here first. Let me know if you feel the image would be useful.

Analytics447 (talk)

clean up derivation of 1 parameter pdf

The current notation has F_x and other symbols that are not defined. —Preceding unsigned comment added by Wilgamesh (talk • contribs) 17:28, 7 August 2009 (UTC)[reply]

needs historical treatment

I just came here looking for the history of Chi-Squared: credit for initial development, historical context, timeline, rise in popularity, etc. I think a growing criticism of Wikipedia's math treatment is an inconsistent voice about whether it wants to be a technical encyclopedia or to include elements of the history of the math. It probably needs both for completeness; however, in this case the article is sharply skewed in the technical direction with very little historical merit.

72.200.80.12 15:33, 3 June 2007 (UTC)[reply]

I am a grateful reader of many math articles on wikipedia. Access to the explanations of the knowledgeable editors of these pages has helped me many times. However, I agree with the comment posted on this talk page nearly two years ago. I suspect that those who can learn from an article in this style are already familiar with much of the information it contains. This article is too technical for me. I have a basic undergraduate education in math (through multi-variable calculus, but no separate statistics class). I post this simply as feedback to the editors as they reflect on the situation of their readers. —Preceding unsigned comment added by 67.169.186.230 (talk) 22:28, 27 March 2009 (UTC)[reply]

What does this sentence mean to tell me?

"If Xi are k independent, normally distributed random variables with mean 0 and variance 1..." Does it mean that X₁ is a normal variable whose mean is 0 variance is 1, X₂ is a normal variable whose mean is 0 and variance is 1, etc, or does it mean that if you looked at all k of the variables, their collective mean is 0 and variance is 1, or does it necessarily imply both meanings? I'm confused because I am accustomed to being told that we have a single variable X that we sample k observations of, not a whole bunch of different variables at once. 207.189.230.42 (talk) 07:16, 8 June 2009 (UTC)[reply]

There is no such thing as a "collective mean"; the mean is a property of a single random variable. Your first interpretation is correct, with the additional information that the different variables are also independent. --Zvika (talk) 07:41, 8 June 2009 (UTC)[reply]
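A small simulation may make the first interpretation concrete. This is only a sketch using Python's standard library; the helper name `chi_squared_sample`, the seed, and the choice k = 5 are assumptions made for illustration. Each draw uses k separate, independent N(0, 1) variables, and the sum of their squares should have mean close to k and variance close to 2k.

```python
import random
import statistics


def chi_squared_sample(k, rng):
    """One draw: the sum of squares of k independent N(0, 1) variables."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


rng = random.Random(0)  # fixed seed so the run is repeatable
k = 5                   # illustrative number of independent variables
draws = [chi_squared_sample(k, rng) for _ in range(100_000)]

sample_mean = statistics.fmean(draws)    # theory: k
sample_var = statistics.variance(draws)  # theory: 2k
print(sample_mean, sample_var)
```

Note that the k variables are distinct random variables, each with its own mean 0 and variance 1; nothing here pools them into a single "collective" mean.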

Generalized chi-squared distributions

What is $\mu$ in this section? —Preceding unsigned comment added by 69.223.43.14 (talk) 00:37, 6 August 2009 (UTC)[reply]

Do the results in this section assume that

$E[\operatorname{Re}\{Z_{i}\}\,\operatorname{Im}\{Z_{i}\}]=0$

for all i? (Real and imaginary parts uncorrelated.) If so, that should be noted. —Preceding unsigned comment added by 198.151.13.8 (talk) 16:49, 13 August 2009 (UTC)[reply]

You're right, it isn't clear if real and imaginary are uncorrelated. It's also unclear if the variances of real and imaginary parts are equal. And the variable mu was already defined to be something else in a previous section. It’s all very confusing.

This section is a blind copy&paste from the paper written by Björnson et al. At least we need to make sure the notations/symbols are consistent with the rest of this wiki entry. —Preceding unsigned comment added by Qiuxing (talk • contribs) 15:35, 22 October 2009 (UTC)[reply]

Is there any reason to think that the information in this section is notable? Has this "generalization" been used anywhere except in the two quoted papers? Both of them are very recent and written by the same people. Has this even been called a generalized chi-squared distribution anywhere? --Zvika (talk) 17:44, 22 October 2009 (UTC)[reply]

It is not clear how this case of complex normals relates to that for real normals, but I guess it must be close given some reparameterisation. I think a common non-standard problem is to find the distribution of quadratic forms of normal random variables (seen in questions on stats newsgroups) and, given this, a version of the present section relating more directly to real-valued normals would be notable ... perhaps in separate article, or here, or as an addition to Quadratic form (statistics). Melcombe (talk) 10:18, 23 October 2009 (UTC)[reply]

Since this is a sum of squares of absolute values of complex Gaussians, it seems to me that it's exactly the same as the sum of twice as many real Gaussians; the real and imaginary parts of a complex Gaussian are real Gaussians. So, indeed, perhaps the discussion should be about quadratic forms of real Gaussians. Have you got a reference for that? I would think that should appear in a textbook rather than requiring a quote from a recent paper. --Zvika (talk) 17:19, 23 October 2009 (UTC)[reply]

I don't think text books are strong on stuff related to this. But Johnson et al. Continuous Univariate Distributions vol 1 (2nd edition) has three things:

(p 442) conditions for a quadratic form to be exactly chi-squared;
(p 444) references to results for the cdf of a weighted sum of chi-squareds as a weighted sum of cdf's of F-distributions (although the weights are not given explicitly);
(p 450) references to numerical algorithms, including one which deals with a general quadratic form using numerical inversion of the characteristic function.

The dates for the references are 1962-1980 so it is not exactly new. The reference list to this paper (I can't see the paper itself) seems to contain many of the basic references. Melcombe (talk) 10:46, 26 October 2009 (UTC)[reply]

This section should be removed. It isn’t about chi-squared distributions. There are an infinite number of ways to generalize a chi-squared distribution. Are we going to try to list them all? And can anyone make an argument for why this generalization is significant? —Preceding unsigned comment added by 198.151.13.7 (talk • contribs)

I tend to agree.. I don't see much added value in the section as it currently stands. I suggest replacing the whole section with an {{expand section}} template. --Zvika (talk) 04:58, 29 October 2009 (UTC)[reply]

P.S. I don't mean that generalized chi-squared distributions don't have a place here, only that the current generalization doesn't seem to be notable. There are others which Melcombe mentioned which very well may be. --Zvika (talk) 04:59, 29 October 2009 (UTC)[reply]

A little care is needed regarding other articles linking here. There is a link from Erlang distribution directly to this section, but it doesn't actually relate to the present contents. Possibly the section should be made to fit in with this.

A very quick search showed that "generalized chi-square(d) distribution" is often used for one of the special cases of the quadratic form of normal variables, although one source meant a distribution as in Generalized gamma distribution. And of course there is the noncentral chi-squared distribution.

So, I suggest moving the present contents of the section to form a basis for an article "Generalized chi-squared distribution" (but noting that it is just a special case). Then replace the contents here with a brief section stating the various types of generalisation, with links to appropriate articles. The links to Wishart distribution and noncentral chi-squared distribution can then be moved out of the "see also" section and thus made more meaningful. Melcombe (talk) 10:45, 29 October 2009 (UTC)[reply]

Your proposed change to the current article sounds great. However, in my opinion moving the existing section to a new article doesn't solve the basic problem, which is that this material doesn't seem to be notable. In fact it seems to me that this material was written by a single IP who is apparently no longer watching this page (or at least not the talk page). Furthermore, that IP is registered to KTH which is where the authors of the only two papers citing this "generalization" are from. In short, this seems like a SPA with COI. --Zvika (talk) 16:40, 29 October 2009 (UTC)[reply]

They don't seem to be pushing anything too strongly. I would be reluctant to delete the citations .... after all the stats articles in general, including this one, contain far too few inline citations. But the material here might be reduced somewhat ... it may be rather a case of looking for an explicit formula which turns out not to be useful for anything further mathematically, in preference to accepting that a more general computer algorithm may be more useful in practice. Melcombe (talk) 09:57, 30 October 2009 (UTC)[reply]

I have started Generalized chi-squared distribution based on one source I know of, but there is not yet an obvious place in it to move the stuff from the subsection here. Melcombe (talk) 14:48, 30 October 2009 (UTC)[reply]

Done.. I hope you like it. --Zvika (talk) 18:56, 2 November 2009 (UTC)[reply]

The material referred to above was deleted (21:55, 22 December 2009 by 198.151.13.7) as not being directly relevant here, but I have placed a slightly edited version in Generalized chi-squared distribution as being more appropriate there. Melcombe (talk) 09:54, 23 December 2009 (UTC)[reply]

Simplify, Please

Could this please be translated into English? I'm sorry, but I'm sure this is perfectly clear for people who already know what it all means. I've got a degree in Mathematics, and I'm completely lost. Could the lead paragraph, at least, be modified so that somebody who has no background in statistics could at least get the general idea of what a Chi-squared distribution is? 71.236.100.183 (talk) 00:52, 19 August 2009 (UTC)[reply]

No offence intended, but how on earth did you get a degree in mathematics without learning about the Chi-squared distribution?! We learnt this at age 16 as part of the standard syllabus for English maths AS level. 78.105.234.140 (talk) 16:04, 20 August 2009 (UTC)[reply]

I have a maths degree from Cambridge University, and I didn't study this distribution either (I think it was taught in the second year stats course, but I only took first year probability). It's hardly the most common distribution, and learning lots of specific distributions isn't as important as understanding axiomatic probability theory anyway. In any case, it's hardly unreasonable to ask for clarity. I agree that the final sentence of the first paragraph is a bit unclear. Quietbritishjim (talk) 10:13, 21 August 2009 (UTC)[reply]

I agree that this needs to be simplified. If I knew more than the very basics of Chi-squared then I would not need to come to Wikipedia. The assumption needs to be that the reader has come to Wikipedia to find out how to use Chi-squared from scratch. This might best be achieved by having a more specific Introduction to Chi-squared section written to get the complete novice up to a level of using Chi-squared. This is an Encyclopedia. AussieRob (talk) 01:09, 11 February 2011 (UTC)[reply]

I have done quite a bit of statistics in the past and just came here to refresh myself for seeing if I should be using this distribution/test for some data. The article is utterly useless. It is only understandable if you already know what it is talking about and it reads more like college notes than an explanation in an encyclopaedia. It needs an introduction which describes the sort of data and relationships it works on and this needs to be in language that is as close to layman's as possible. Then it can define terms. Then it can get mathsy. —Preceding unsigned comment added by 88.104.110.246 (talk) 08:17, 2 March 2011 (UTC)[reply]

I agree with the foregoing. This article, and all mathematics articles on Wikipedia, are apparently the province of a small cadre of experienced mathematicians who enjoy writing for each other. None of them are of any use to an intelligent but uninformed layman. Please, somebody, rewrite this article with an introduction that explains what chi-squared probability distributions actually describe in real-world terms, and what they are USED for. EWAdams (talk) 19:44, 29 April 2014 (UTC)[reply]

I don't think this article can be simplified much. The chi-squared distribution is a less intuitive distribution than say the binomial or Poisson distribution (which can come from very easy-to-understand processes). The article starts with:

the chi-squared distribution (also chi-square or χ²-distribution) with k degrees of freedom is the distribution of a sum of the squares of k independent standard normal random variables.

The hyperlinked terms here are prerequisite knowledge (except for "degrees of freedom"). If you don't understand any of these, you won't understand the definition of the Chi-squared distribution, so you should learn these things first.

It would be like complaining that the article on Quantum Entanglement is too technical. Would you really expect to understand Quantum Entanglement without understanding Quantum Mechanics? The article could use analogies to explain it but then they would be straying from the truth. Monsterman222 (talk) 20:54, 1 October 2014 (UTC)[reply]

Notation

I suggest that we change the notation for chi² distribution from $\chi _{k}^{2}\,$ to $\chi ^{2}(k)\,$ (both of them can be encountered in the literature). The reason for such change is to have more uniform notation for different distributions across the Wikipedia: <name of distribution>(<parameters>). For example, we have N(μ,σ²) for normal, Poisson(λ) for Poisson, F(k₁,k₂) for F-distribution, Binomial(n,p) for binomial, etc. Such notation is also more convenient to display in-line using HTML: χ²(k). … stpasha » 20:54, 29 November 2009 (UTC)[reply]

I think either is OK, but the change will have to be in the entire article. --Zvika (talk) 07:01, 30 November 2009 (UTC)[reply]

isn't the conventional notation for degrees of freedom ν not k? —Preceding unsigned comment added by 144.118.202.147 (talk) 15:26, 5 June 2010 (UTC)[reply]

I have seen d, k, v, f (and obviously n, m etc when there's more than one) used for the degrees of freedom. I can't really tell which is the most common. --Paulginz (talk) 20:24, 22 July 2010 (UTC)[reply]

I was under the impression that for Z₁,...,Z_k, the number of degrees of freedom was in fact k-1, following from the fact that ${\frac {(n-1)s^{2}}{(\delta )^{2}}}\sim \chi _{k}^{2}\,$ — Preceding unsigned comment added by 131.251.252.71 (talk) 23:51, 2 June 2015 (UTC)[reply]
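One way to reconcile the two counts, offered as a sketch rather than as the article's wording: the degrees of freedom equal k when the Z_i are standard normal with known mean 0, and drop to n − 1 when n observations are centred on their own sample mean. The symbols X_i, $\bar X$, μ and σ² below are introduced only for this illustration, on the assumption that the δ in the comment above stands for the population standard deviation σ.

```latex
Z_1,\dots,Z_k \overset{\text{iid}}{\sim} N(0,1)
  \;\Longrightarrow\; \sum_{i=1}^{k} Z_i^{2} \sim \chi^{2}_{k},
\qquad
X_1,\dots,X_n \overset{\text{iid}}{\sim} N(\mu,\sigma^{2})
  \;\Longrightarrow\; \frac{(n-1)s^{2}}{\sigma^{2}}
  = \sum_{i=1}^{n} \frac{(X_i-\bar X)^{2}}{\sigma^{2}} \sim \chi^{2}_{n-1}.
```

The one lost degree of freedom in the second case comes from estimating μ by $\bar X$; it is not a property of the definition in terms of Z₁,...,Z_k.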

Approximately normal

Regarding the central limit theorem, the article makes the assertion

k > 50 is “approximately normal”

I don't know if this is a quote from the source or what but, regardless, the statement needs to be qualified in some fashion. Any value of k could be considered "approximately normal" depending on how one defines that. What is the criterion being applied to make this assertion?

--Mcorazao (talk) 20:37, 4 May 2010 (UTC)[reply]

I have the 1978 edition of the book cited, and it says that, for k > 50, the difference between the chi squared and normal distributions is negligible^[1]. Still, this is a matter of opinion rather than fact. One person's negligible is another's catastrophe. I think a link to the Berry–Esseen theorem would qualify the statement somewhat (or, at least, allow others to work it out for themselves), although the bounds it provides tend to substantially overestimate the error.

DonaghHorgan (talk) 12:34, 26 April 2013 (UTC)[reply]
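For anyone who wants to put a number on "negligible", the following sketch measures the largest gap between the chi-squared CDF and the CDF of a normal distribution with the same mean k and variance 2k. It assumes even k so that the exact CDF has a finite closed form; the function names `chi2_cdf_even` and `max_cdf_gap`, the grid size, and the grid range are illustrative choices, not anything taken from the cited book.

```python
import math
from statistics import NormalDist


def chi2_cdf_even(x, k):
    """Exact chi-squared CDF for even k: 1 - exp(-x/2) * sum_{j<k/2} (x/2)^j / j!."""
    if k <= 0 or k % 2:
        raise ValueError("this sketch only handles positive even k")
    if x <= 0:
        return 0.0
    half = x / 2.0
    term, total = 1.0, 0.0
    for j in range(k // 2):
        total += term            # add (x/2)^j / j!
        term *= half / (j + 1)   # advance to the next term of the sum
    return 1.0 - math.exp(-half) * total


def max_cdf_gap(k, steps=2000):
    """Largest |chi-squared CDF - matching normal CDF| on a grid over [0, k + 8 sd]."""
    normal = NormalDist(mu=k, sigma=math.sqrt(2 * k))
    upper = k + 8 * math.sqrt(2 * k)
    grid = (upper * i / steps for i in range(steps + 1))
    return max(abs(chi2_cdf_even(x, k) - normal.cdf(x)) for x in grid)


gap10 = max_cdf_gap(10)
gap50 = max_cdf_gap(50)
print(gap10, gap50)
```

The gap shrinks as k grows, but it never reaches zero, so whether a given k is "approximately normal" still depends on the tolerance one is willing to accept.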

References

^ Box, Hunter and Hunter (1978). Statistics for experimenters. Wiley. p. 118. ISBN 0471093157.

Skewness

how to find it? 174.91.224.96 (talk) 18:21, 28 November 2010 (UTC)[reply]
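A short derivation, as a sketch: the skewness is the third central moment divided by the variance raised to the power 3/2, and for a chi-squared variable X with k degrees of freedom both follow from the cumulants. The notation κ_n (cumulants), μ₃ (third central moment) and γ₁ (skewness) is introduced here for illustration.

```latex
\kappa_n = 2^{\,n-1}(n-1)!\,k
\;\Longrightarrow\;
\operatorname{Var}(X)=\kappa_2 = 2k,\quad \mu_3=\kappa_3 = 8k,
\qquad
\gamma_1 = \frac{\mu_3}{\operatorname{Var}(X)^{3/2}}
         = \frac{8k}{(2k)^{3/2}} = \sqrt{\frac{8}{k}}.
```

So the distribution is always right-skewed, and the skewness falls towards 0 as k increases.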

Scaled Chi square distribution

I feel that it is very important to mention scaled chi square distribution. Suppose Y is chi square with df k, what is the distribution of cY where c>0??? Jackzhp (talk) 03:21, 5 January 2011 (UTC)[reply]
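For reference, a change of variables answers this directly. The sketch below writes f_Y for the chi-squared density with k degrees of freedom and uses the shape and scale parameterization of the gamma distribution; those labels are assumptions of this note, not notation from the article.

```latex
f_{cY}(x) = \frac{1}{c}\,f_Y\!\left(\frac{x}{c}\right)
          = \frac{x^{k/2-1}\,e^{-x/(2c)}}{(2c)^{k/2}\,\Gamma(k/2)},
\quad x>0,
\qquad\text{so}\qquad
cY \sim \operatorname{Gamma}\!\left(\text{shape}=\tfrac{k}{2},\ \text{scale}=2c\right).
```

Setting c = 1 recovers the chi-squared distribution itself as the gamma distribution with shape k/2 and scale 2.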

Log

suppose Y has chi square distribution, what is the distribution of log(Y)? Jackzhp (talk) 02:51, 9 February 2011 (UTC)[reply]
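The density of W = log(Y) follows from the same change-of-variables argument; this is a sketch, with W and f_W introduced here for illustration and Y taken to have k degrees of freedom.

```latex
f_W(w) = f_Y\!\left(e^{w}\right)e^{w}
       = \frac{\exp\!\left(\tfrac{k}{2}\,w - \tfrac{1}{2}\,e^{w}\right)}{2^{k/2}\,\Gamma(k/2)},
\qquad -\infty < w < \infty .
```

Unlike Y, the variable W takes values on the whole real line.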

Connection with Laplace distribution

The assertion that if $X_{i}\sim \mathrm {Laplace} (\mu ,\beta )\,$ then $\sum _{i=1}^{n}{\frac {2}{\beta |X_{i}-\mu |}}\sim \chi ^{2}(2n)\,$ seems unlikely. In fact, ${\frac {2|X_{i}-\mu |}{\beta }}\sim \chi ^{2}(2)\,$ , so it seems far more likely that $\sum _{i=1}^{n}{\frac {2|X_{i}-\mu |}{\beta }}\sim \chi ^{2}(2n)\,$ .

Primrose61 (talk) 22:42, 18 October 2011 (UTC)[reply]

You are correct; I was reading the article and noticed the same error. The same error was on the Laplace distribution page. I fixed them both. John Lawrence (talk) 18:09, 21 November 2011 (UTC)[reply]
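The corrected statement can be checked numerically. The sketch below uses only Python's standard library; the helper names, the seed, and the parameter values n = 3, μ = 1.5, β = 0.7 are arbitrary choices for illustration. If the sum is χ²(2n), its mean should be near 2n and its variance near 4n.

```python
import random
import statistics


def laplace_draw(mu, beta, rng):
    """Draw from Laplace(mu, beta): a random sign times an exponential with mean beta."""
    magnitude = rng.expovariate(1.0 / beta)
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return mu + sign * magnitude


def scaled_abs_sum(n, mu, beta, rng):
    """Corrected statistic: the sum over i of 2 * |X_i - mu| / beta."""
    return sum(2.0 * abs(laplace_draw(mu, beta, rng) - mu) / beta for _ in range(n))


rng = random.Random(1)       # fixed seed so the run is repeatable
n, mu, beta = 3, 1.5, 0.7    # illustrative values only
values = [scaled_abs_sum(n, mu, beta, rng) for _ in range(50_000)]

laplace_mean = statistics.fmean(values)    # theory for chi-squared(2n): 2n
laplace_var = statistics.variance(values)  # theory for chi-squared(2n): 4n
print(laplace_mean, laplace_var)
```

The reasoning behind it is that |X_i − μ| is exponential with mean β, so 2|X_i − μ|/β is exponential with mean 2, which is the χ²(2) distribution.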

CDF graph is wrong

The graph shows that CDF at $0$ $= 0.1$ for $k = 1$ . --Yecril (talk) 14:34, 24 March 2012 (UTC)[reply]
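For comparison with the graph, the k = 1 CDF has a closed form in terms of the error function, since a χ²(1) variable is the square of a standard normal. A minimal sketch (the function name `chi2_cdf_k1` is made up for this note):

```python
import math


def chi2_cdf_k1(x):
    """CDF of chi-squared with 1 degree of freedom: P(Z^2 <= x) = erf(sqrt(x / 2))."""
    if x <= 0:
        return 0.0
    return math.erf(math.sqrt(x / 2.0))


cdf_at_zero = chi2_cdf_k1(0.0)
cdf_at_one = chi2_cdf_k1(1.0)  # equals P(|Z| <= 1) for a standard normal Z
print(cdf_at_zero, cdf_at_one)
```

The value at 0 is exactly 0, so a curve that starts at 0.1 cannot be right, although the curve does rise very steeply near the origin.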

Ambiguous reference to "shape parameterization of the gamma distribution"

In the subsection "Gamma, exponential, and related distributions", the first sentence refers to "the shape parameterization of the gamma distribution". If one visits the wikipedia page for the gamma distribution, all three parameterizations contain a "shape" parameter. Hence, referring to the "shape parameterization" does nothing to disambiguate the parameterization being referred to here. Craniator (talk) 01:58, 18 March 2013 (UTC)[reply]

The probability distribution graph is wrong, now

I think somebody meant to change the CDF, but changed the graph for the PDF, instead. — Preceding unsigned comment added by 171.67.87.31 (talk) 23:48, 26 April 2013 (UTC)[reply]

Needs work

This article literally makes no sense to me. I'm wondering if I can decode even a single sentence or equation. It is written as if the reader already knows much about the distribution. As it stands, you need a bachelor's in math just to figure out what the article is saying. ~EDDY ^{(talk/contribs)}~ 14:03, 1 July 2014 (UTC)[reply]

Made property more explicit

The article stated:

If Y is a vector of k i.i.d. standard normal random variables and A is a k×k idempotent matrix with rank k−n then the quadratic form Y^TAY is chi-squared distributed with k−n degrees of freedom.

I've added the necessary condition that A be symmetric. I get the impression that most experts would have assumed this from the fact that we have a quadratic form, but making this requirement explicit will help beginners. BTW, a counter-example to the theorem as originally stated would be: $A={\begin{bmatrix}1&1\\0&0\end{bmatrix}}$ , which is idempotent and leads to $Y_1^2+Y_1 Y_2$ which is not Chi-squared since it could potentially take on negative values. — Preceding unsigned comment added by Monsterman222 (talk • contribs) 20:44, 1 October 2014 (UTC)[reply]
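The counter-example can be verified by hand or with a few lines of code. The sketch below uses plain Python lists; the helper names `matmul` and `quadratic_form` and the test vector (1, −2) are illustrative choices.

```python
A = [[1, 1],
     [0, 0]]


def matmul(P, Q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(P[i][m] * Q[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]


def quadratic_form(y, M):
    """Compute y^T M y for a length-2 vector y."""
    return sum(y[i] * M[i][j] * y[j] for i in range(2) for j in range(2))


is_idempotent = matmul(A, A) == A                     # A @ A equals A
is_symmetric = all(A[i][j] == A[j][i] for i in range(2) for j in range(2))
value = quadratic_form([1, -2], A)                    # y1^2 + y1*y2 = 1 - 2
print(is_idempotent, is_symmetric, value)
```

A is idempotent but not symmetric, and the quadratic form is negative at y = (1, −2), which a chi-squared variable can never be.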

Medians missing

Please add to the article … what is the exact median of the χ² distribution for the first few integer values for degrees of freedom? 104.129.194.123 (talk) 17:55, 11 June 2015 (UTC)[reply]
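For the first two cases the median has a closed form, which the sketch below evaluates with Python's standard library. For k = 1, P(Z² ≤ m) = 2Φ(√m) − 1 = 1/2 gives m = (Φ⁻¹(0.75))²; for k = 2 the CDF is 1 − e^{−x/2}, so the median is 2 ln 2. The variable names are made up for this note, and larger k generally has to be handled numerically.

```python
import math
from statistics import NormalDist

# k = 1: solve 2 * Phi(sqrt(m)) - 1 = 1/2 for m
median_k1 = NormalDist().inv_cdf(0.75) ** 2

# k = 2: solve 1 - exp(-m / 2) = 1/2 for m
median_k2 = 2.0 * math.log(2.0)

print(median_k1, median_k2)
```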

Arrow notation

In the "Relation to other distributions" section, the first entry gives the limit as $k\to \infty$ . What does the arrow with the d on top $\left({\xrightarrow {d}}\right)$ mean? It's not obvious to me. MystRivenExile (talk • contribs) 15:39, 22 August 2016 (UTC)[reply]
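The arrow with a d on top is the usual notation for convergence in distribution. As a sketch of the definition, with X_k, X and the CDFs F introduced here for illustration, followed by the standardized chi-squared limit as an example:

```latex
X_k \xrightarrow{d} X
  \quad\Longleftrightarrow\quad
  \lim_{k\to\infty} F_{X_k}(x) = F_X(x)
  \ \text{at every } x \text{ where } F_X \text{ is continuous},
\qquad\text{e.g.}\qquad
\frac{\chi^{2}_{k}-k}{\sqrt{2k}} \xrightarrow{d} N(0,1).
```

In words: the CDFs converge pointwise, which is weaker than the random variables themselves converging.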

Mistake in Intro

Second paragraph. First sentence. I'm not sure what the author intended from this word. Perhaps something related to statistical independence since the first portion of this word links to an article on statistical independence. (indeution) Bradybray (talk) 17:08, 14 June 2017 (UTC)[reply]

Pronunciation

"Pronounced like chai tea, the chi-square,"

I think this is: 1) Wrong; and 2) Unhelpful.

1) The Greek letter chi starts with a 'k' sound, while chai tea starts with a 'ch' sound.

2) One unfamiliar foreign word is being illustrated with another unfamiliar foreign word?!

GeneCallahan (talk) 16:07, 29 June 2017 (UTC)[reply]

Moving page without consensus

Due to the page being moved unexpectedly and without discussion, the following is now featured on the About page part of Talk: "Chi-square distribution has been listed as a level-5 vital article in an unknown topic". The page should be moved back to Chi-squared distribution; there was no consensus to move it. Кирилл С1 (talk) 10:41, 27 May 2020 (UTC)[reply]

It was equally broken before the move. I added the topic=Mathematics to fix it (I think). Dicklyon (talk) 07:03, 7 September 2020 (UTC)[reply]

error in the "definitions > introduction" section?

In the "definitions > introduction" section, the 3 lines below seem to be wrong.

$\chi ^{2}={(m-Np)^{2} \over Npq}$

Using $N=Np+N(1-p)$, $N=m+(N-m)$, and $q=1-p$, this equation can be rewritten as

$\chi ^{2}={(m-Np)^{2} \over Np}+{(N-m-Nq)^{2} \over Nq}$

I would suggest replacing these lines with the lines below, and also using Z in place of χ²:

$Z^{2}={(m-Np)^{2} \over Npq}$

as Z is the usual letter used for a standard normal distribution.

Using $1=p+q$, this equation can be rewritten as

$Z^{2}={(p+q)(m-Np)^{2} \over Npq}={(m-Np)^{2} \over Np}+{(m-Np)^{2} \over Nq}$

BUT more important is that this part of the introduction makes no sense, as it is Cochran's theorem that shows that the variable below has a χ² distribution:

$D=\sum _{i=1}^{n}{\frac {(O_{i}-E_{i})^{2}}{E_{i}}}$

So I would recommend removing these very confusing lines and replacing them with a link to Cochran's theorem.
Regloor (talk) 13:47, 15 June 2021 (UTC)[reply]
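Whichever wording is preferred, the one-term and two-term forms quoted from the article can be compared with exact rational arithmetic. This sketch uses Python's `fractions` module; the function names and the test values of m, N and p are arbitrary illustrations.

```python
from fractions import Fraction


def one_term(m, N, p):
    """(m - Np)^2 / (Npq), with q = 1 - p."""
    q = 1 - p
    return (m - N * p) ** 2 / (N * p * q)


def two_terms(m, N, p):
    """(m - Np)^2 / (Np) + (N - m - Nq)^2 / (Nq), with q = 1 - p."""
    q = 1 - p
    return (m - N * p) ** 2 / (N * p) + (N - m - N * q) ** 2 / (N * q)


cases = [(3, 10, Fraction(1, 2)), (7, 20, Fraction(1, 3)), (0, 5, Fraction(4, 5))]
all_equal = all(one_term(m, N, p) == two_terms(m, N, p) for m, N, p in cases)
print(all_equal)
```

The two forms agree because N − m − Nq = −(m − Np) and 1/(Np) + 1/(Nq) = (p + q)/(Npq) = 1/(Npq), so the algebra in the article and the algebra proposed above arrive at the same quantity.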

Special case?

In what way is the chi square a special case of the gamma distribution? If the special case is defined by setting the gamma distribution rate parameter beta to 0.5 this ought to be clearly stated. 150.227.15.253 (talk) 19:39, 15 December 2022 (UTC)[reply]
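Written out as a sketch in the shape and rate parameterization (α and β are the symbols assumed here for shape and rate), setting α = k/2 and β = 1/2 in the gamma density gives the chi-squared density with k degrees of freedom:

```latex
f(x;\alpha,\beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,x^{\alpha-1}e^{-\beta x}
\quad\xrightarrow{\ \alpha=k/2,\ \beta=1/2\ }\quad
\frac{1}{2^{k/2}\,\Gamma(k/2)}\,x^{k/2-1}e^{-x/2},
\qquad x>0 .
```

Equivalently, in the shape and scale parameterization the scale is θ = 1/β = 2.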

Probability generating function

I am not sure we should have a PGF (probability generating function) in the infobox for a continuous distribution. It is actually a factorial moment generating function but if you understand that then you probably also understand that it is a simple adaptation of the moment generating function [take log(t)] and so do not need it stated in addition to the MGF. 193.240.203.36 (talk) 16:12, 18 May 2023 (UTC)[reply]
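The relationship mentioned above can be written out as follows; this is a sketch, with G denoting the generating function E[t^X] and M_X the moment generating function (1 − 2s)^{−k/2}, valid for s < 1/2.

```latex
G(t) = \operatorname{E}\!\left[t^{X}\right]
     = \operatorname{E}\!\left[e^{X\ln t}\right]
     = M_X(\ln t)
     = (1-2\ln t)^{-k/2},
\qquad 0 < t < \sqrt{e}.
```

The upper limit on t comes from the condition ln t < 1/2 needed for the moment generating function to exist.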
