Chi-square distribution
From Wikipedia, the free encyclopedia
[Figure: Probability density function]
[Figure: Cumulative distribution function]
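| Property | Value |
|---|---|
| notation | $\chi^2(k)$ or $\chi^2_k$ |
| parameters | $k \in \{1, 2, 3, \dots\}$, degrees of freedom |
| support | $x \in [0, +\infty)$ |
| pdf | $\frac{1}{2^{k/2}\,\Gamma(k/2)}\, x^{k/2-1} e^{-x/2}$ |
| cdf | $\frac{1}{\Gamma(k/2)}\,\gamma(k/2,\, x/2)$ |
| mean | $k$ |
| median | $\approx k\left(1 - \frac{2}{9k}\right)^3$ |
| mode | $\max\{k - 2,\, 0\}$ |
| variance | $2k$ |
| skewness | $\sqrt{8/k}$ |
| excess kurtosis | $12/k$ |
| entropy | $\frac{k}{2} + \ln\left(2\,\Gamma(k/2)\right) + \left(1 - \frac{k}{2}\right)\psi(k/2)$ |
| mgf | $(1 - 2t)^{-k/2}$ for $t < \tfrac{1}{2}$ |
| cf | $(1 - 2it)^{-k/2}$ [1] |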
In probability theory, the chi-square distribution (also chi-squared or χ²-distribution) with k degrees of freedom is the distribution of a sum of squares of k independent standard normal random variables. It is one of the most widely used probability distributions in inferential statistics, e.g. in hypothesis testing or in construction of confidence intervals.[2][3][4][5]
The best-known situations in which the chi-square distribution is used are the common chi-square tests for goodness of fit of an observed distribution to a theoretical one, and for the independence of two criteria of classification of qualitative data. Many other statistical tests also use this distribution, such as Friedman's analysis of variance by ranks.
Definition
If $X_1, \dots, X_k$ are independent, normally distributed random variables with mean 0 and variance 1, then the random variable
$$Q = \sum_{i=1}^{k} X_i^2$$
is distributed according to the chi-square distribution with k degrees of freedom. This is usually written
$$Q \sim \chi^2(k) \quad \text{or} \quad Q \sim \chi^2_k.$$
The chi-square distribution has one parameter: k, a positive integer that specifies the number of degrees of freedom (i.e. the number of $X_i$).
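The definition can be checked by simulation. The following is a minimal sketch, assuming only the Python standard library; the function names, sample size and seed are illustrative choices, not part of the article. It sums the squares of k standard normal draws and compares the sample mean and variance with the theoretical values k and 2k.

```python
import random
import statistics


def sample_chi_square(k, rng):
    """Draw one value as the sum of squares of k independent N(0, 1) draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


def simulate(k, n_samples=20000, seed=0):
    """Return the sample mean and sample variance of n_samples simulated values."""
    rng = random.Random(seed)
    draws = [sample_chi_square(k, rng) for _ in range(n_samples)]
    return statistics.fmean(draws), statistics.variance(draws)


mean, var = simulate(k=5)
# Theory: mean = k = 5 and variance = 2k = 10; the estimates fluctuate with the seed.
print(f"sample mean {mean:.3f}, sample variance {var:.3f}")
```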
Characteristics
Further properties of the chi-square distribution can be found in the summary table at the top of the article.
Probability density function
The probability density function (pdf) of the chi-square distribution is
$$f(x; k) = \begin{cases} \dfrac{1}{2^{k/2}\,\Gamma(k/2)}\, x^{k/2-1} e^{-x/2}, & x \ge 0, \\ 0, & \text{otherwise,} \end{cases}$$
where Γ(k/2) denotes the Gamma function, which has closed-form values at the half-integers.
For derivations of the pdf in the cases of one and two degrees of freedom, see Proofs related to chi-square distribution.
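As an illustration, the density $x^{k/2-1} e^{-x/2} / (2^{k/2}\,\Gamma(k/2))$ can be evaluated directly. This is a sketch using only the standard library; the function names and the integration settings are assumptions made for the example. Working on the log scale with `math.lgamma` avoids overflow for large k, and a crude midpoint rule confirms that the density integrates to approximately 1.

```python
import math


def chi2_pdf(x, k):
    """Chi-square density with k degrees of freedom, evaluated for x > 0."""
    if x <= 0:
        # The boundary value at x = 0 is not handled in this sketch.
        return 0.0
    log_density = ((k / 2 - 1) * math.log(x) - x / 2
                   - (k / 2) * math.log(2) - math.lgamma(k / 2))
    return math.exp(log_density)


def integrate_pdf(k, upper=100.0, steps=100000):
    """Midpoint-rule approximation of the integral of the density over (0, upper)."""
    h = upper / steps
    return h * sum(chi2_pdf((i + 0.5) * h, k) for i in range(steps))


print(chi2_pdf(2.0, 2))   # for k = 2 the density reduces to exp(-x/2) / 2
print(integrate_pdf(4))   # should be close to 1
```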
Cumulative distribution function
Its cumulative distribution function is
$$F(x; k) = \frac{\gamma(k/2,\, x/2)}{\Gamma(k/2)} = P(k/2,\, x/2),$$
where γ(s, z) is the lower incomplete Gamma function and P(s, z) is the regularized Gamma function.
Tables of this distribution, usually in its cumulative form, are widely available, and the function is included in many spreadsheets and statistical packages.
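Where no table or statistical package is at hand, the regularized Gamma function can be approximated with its standard power series. The sketch below is one such approach under stated assumptions (standard library only, illustrative function names and tolerances) and is not tuned for extreme arguments. It uses the expansion $\gamma(s, z) = z^s e^{-z} \sum_{n \ge 0} z^n / \bigl(s (s+1) \cdots (s+n)\bigr)$.

```python
import math


def regularized_lower_gamma(s, z, tol=1e-15, max_terms=10000):
    """P(s, z) = gamma(s, z) / Gamma(s), computed from the power series."""
    if z <= 0:
        return 0.0
    term = 1.0 / s          # n = 0 term of the series
    total = term
    for n in range(1, max_terms):
        term *= z / (s + n)  # ratio of consecutive series terms
        total += term
        if term < tol * total:
            break
    # Prefactor z^s * exp(-z) / Gamma(s), evaluated on the log scale.
    return total * math.exp(s * math.log(z) - z - math.lgamma(s))


def chi2_cdf(x, k):
    """Chi-square CDF with k degrees of freedom: P(k/2, x/2)."""
    return regularized_lower_gamma(k / 2, x / 2)


print(chi2_cdf(3.0, 2))  # for k = 2 the CDF is 1 - exp(-x/2)
```

Two special cases give convenient checks: for k = 2 the CDF is $1 - e^{-x/2}$, and for k = 1 it equals $\operatorname{erf}(\sqrt{x/2})$.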
Additivity
It follows from the definition of the chi-square distribution that the sum of independent chi-square variables is also chi-square distributed. Specifically, if $X_1, \dots, X_n$ are independent chi-square variables with $k_1, \dots, k_n$ degrees of freedom, respectively, then $Y = X_1 + \cdots + X_n$ is chi-square distributed with $k_1 + \cdots + k_n$ degrees of freedom.
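One way to verify additivity is through moment-generating functions. The short derivation below is a sketch that takes the mgf $(1 - 2t)^{-k/2}$ of a chi-square variable as given and writes $Y$ for the sum of the independent variables $X_1, \dots, X_n$. Independence lets the mgf of the sum factor into a product.

```latex
\begin{align*}
M_{X_i}(t) &= (1 - 2t)^{-k_i/2}, \qquad t < \tfrac{1}{2}, \\
M_{Y}(t) &= \operatorname{E}\!\left[e^{t (X_1 + \cdots + X_n)}\right]
          = \prod_{i=1}^{n} M_{X_i}(t)
          = (1 - 2t)^{-(k_1 + \cdots + k_n)/2}.
\end{align*}
```

The last expression is the mgf of a chi-square variable with $k_1 + \cdots + k_n$ degrees of freedom, and the mgf determines the distribution.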
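Information entropy
The information entropy is given by
$$H = -\int_{0}^{\infty} f(x; k) \ln f(x; k)\, dx = \frac{k}{2} + \ln\left(2\,\Gamma(k/2)\right) + \left(1 - \frac{k}{2}\right)\psi(k/2),$$
where ψ(x) is the Digamma function.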
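As a worked check of the closed form $H(k) = k/2 + \ln(2\,\Gamma(k/2)) + (1 - k/2)\,\psi(k/2)$, the cases k = 2 and k = 4 can be evaluated by hand. The sketch uses $\Gamma(1) = \Gamma(2) = 1$ and $\psi(2) = 1 - \gamma_{\mathrm{E}}$, where $\gamma_{\mathrm{E}} \approx 0.5772$ is the Euler–Mascheroni constant (a symbol introduced here only for this example).

```latex
\begin{align*}
H(2) &= 1 + \ln\bigl(2\,\Gamma(1)\bigr) + 0 \cdot \psi(1) = 1 + \ln 2 \approx 1.693, \\
H(4) &= 2 + \ln\bigl(2\,\Gamma(2)\bigr) - \psi(2)
      = 2 + \ln 2 - (1 - \gamma_{\mathrm{E}})
      = 1 + \ln 2 + \gamma_{\mathrm{E}} \approx 2.270.
\end{align*}
```

The value for k = 2 agrees with the entropy $1 - \ln\lambda$ of an exponential distribution with rate $\lambda = 1/2$.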
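Noncentral moments
The moments about zero of a chi-square distribution with k degrees of freedom are given by[6][7]
$$\operatorname{E}(X^m) = k (k+2) (k+4) \cdots (k + 2m - 2) = 2^m\, \frac{\Gamma(m + k/2)}{\Gamma(k/2)}.$$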
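The two standard expressions for the m-th raw moment, the product $k(k+2)\cdots(k+2m-2)$ and the Gamma-function ratio $2^m\,\Gamma(m + k/2)/\Gamma(k/2)$, can be compared numerically. This is a small sketch with illustrative function names; it also recovers the variance 2k from the first two moments.

```python
import math


def raw_moment_product(k, m):
    """E[X^m] as the product k (k + 2) ... (k + 2m - 2)."""
    result = 1
    for j in range(m):
        result *= k + 2 * j
    return result


def raw_moment_gamma(k, m):
    """E[X^m] as 2^m * Gamma(m + k/2) / Gamma(k/2), via log-gamma."""
    return 2.0 ** m * math.exp(math.lgamma(m + k / 2) - math.lgamma(k / 2))


k = 7
first = raw_moment_product(k, 1)    # equals k
second = raw_moment_product(k, 2)   # equals k (k + 2)
print(first, second - first ** 2)   # mean k and variance 2k
print(raw_moment_gamma(k, 3), raw_moment_product(k, 3))
```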
Cumulants
The cumulants are readily obtained by a (formal) power series expansion of the logarithm of the characteristic function:
$$\kappa_n = 2^{n-1} (n-1)!\, k.$$
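The expansion can be sketched as follows, taking the characteristic function $\varphi(t) = (1 - 2it)^{-k/2}$ as given and using $-\ln(1 - u) = \sum_{n \ge 1} u^n / n$. Matching coefficients against the defining series of the cumulants gives the general term.

```latex
\begin{align*}
\ln \varphi(t) &= -\frac{k}{2} \ln(1 - 2it) = \frac{k}{2} \sum_{n=1}^{\infty} \frac{(2it)^n}{n}, \\
\ln \varphi(t) &= \sum_{n=1}^{\infty} \kappa_n \frac{(it)^n}{n!}
\quad \Longrightarrow \quad
\kappa_n = \frac{k}{2} \cdot \frac{2^n\, n!}{n} = 2^{n-1} (n-1)!\, k.
\end{align*}
```

In particular $\kappa_1 = k$, $\kappa_2 = 2k$, $\kappa_3 = 8k$ and $\kappa_4 = 48k$, which give the skewness $\kappa_3 / \kappa_2^{3/2} = \sqrt{8/k}$ and the excess kurtosis $\kappa_4 / \kappa_2^2 = 12/k$.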
Asymptotic properties
By the central limit theorem, because a chi-square variable with k degrees of freedom is the sum of k independent, identically distributed random variables with finite mean and variance, its distribution converges to a normal distribution for large k (k > 50 is “approximately normal” according to [8]). Specifically, if X ~ χ²(k), then as k tends to infinity, the distribution of
$$\frac{X - k}{\sqrt{2k}}$$
tends to a standard normal distribution. However, convergence is slow, as the skewness is $\sqrt{8/k}$ and the excess kurtosis is 12 / k.
Other functions of the chi-square distribution converge more rapidly to a normal distribution. Some examples are:
- If X ~ χ²(k), then $\sqrt{2X}$ is approximately normally distributed with mean $\sqrt{2k - 1}$ and unit variance (result credited to R. A. Fisher).
- If X ~ χ²(k), then $\sqrt[3]{X/k}$ is approximately normally distributed with mean $1 - \frac{2}{9k}$ and variance $\frac{2}{9k}$ (Wilson and Hilferty, 1931).
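The three normal approximations can be compared against an exact value. The sketch below assumes an even number of degrees of freedom, for which the CDF has the closed form $1 - e^{-x/2} \sum_{j < k/2} (x/2)^j / j!$; the function names are illustrative. It evaluates each approximation at x = k for k = 10.

```python
import math


def normal_cdf(z):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def chi2_cdf_even(x, k):
    """Exact chi-square CDF for even k: 1 - exp(-x/2) * sum_{j < k/2} (x/2)^j / j!."""
    if k % 2 != 0:
        raise ValueError("this closed form requires an even k")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, k // 2):
        term *= half / j
        total += term
    return 1.0 - math.exp(-half) * total


def plain_normal(x, k):
    """Direct approximation: (X - k) / sqrt(2k) treated as standard normal."""
    return normal_cdf((x - k) / math.sqrt(2.0 * k))


def fisher(x, k):
    """Fisher: sqrt(2X) treated as normal with mean sqrt(2k - 1), unit variance."""
    return normal_cdf(math.sqrt(2.0 * x) - math.sqrt(2.0 * k - 1.0))


def wilson_hilferty(x, k):
    """Wilson-Hilferty: (X/k)^(1/3) treated as normal, mean 1 - 2/(9k), variance 2/(9k)."""
    mean = 1.0 - 2.0 / (9.0 * k)
    sd = math.sqrt(2.0 / (9.0 * k))
    return normal_cdf(((x / k) ** (1.0 / 3.0) - mean) / sd)


k, x = 10, 10.0
exact = chi2_cdf_even(x, k)
for name, approx in (("plain", plain_normal), ("Fisher", fisher),
                     ("Wilson-Hilferty", wilson_hilferty)):
    print(f"{name}: absolute error {abs(approx(x, k) - exact):.5f}")
```

At this point the direct approximation has the largest error and the Wilson–Hilferty approximation the smallest. The ranking can differ at other values of x, so a single point is only indicative.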
Related distributions
A chi-square variable with k degrees of freedom is defined as the sum of the squares of k independent standard normal random variables.
More generally, the chi-square distribution is related to any Gaussian random vector of length k as follows. If Y is a Gaussian random vector having mean vector μ and nonsingular covariance matrix C, then $X = (Y - \mu)' C^{-1} (Y - \mu)$ is chi-square distributed with k degrees of freedom. This is because subtracting μ and multiplying by $C^{-1/2}$ transforms the Gaussian vector into a vector of k i.i.d. standard normal components.
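The quadratic form $(Y - \mu)' C^{-1} (Y - \mu)$ can be illustrated in two dimensions without a linear-algebra library. The following sketch assumes a 2×2 positive-definite covariance matrix and uses a hand-written Cholesky factor to generate correlated Gaussian pairs; the helper names and the example values of μ and C are hypothetical. The simulated quadratic form should have mean close to 2 and variance close to 4.

```python
import math
import random
import statistics


def sample_gaussian_2d(mu, c, rng):
    """Draw Y = mu + L Z, where L is the Cholesky factor of the 2x2 matrix c."""
    (a, b), (_, d) = c
    l00 = math.sqrt(a)
    l10 = b / l00
    l11 = math.sqrt(d - l10 * l10)
    z0, z1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return (mu[0] + l00 * z0, mu[1] + l10 * z0 + l11 * z1)


def quadratic_form_2d(y, mu, c):
    """(y - mu)' C^{-1} (y - mu) for a symmetric 2x2 matrix c = [[a, b], [b, d]]."""
    (a, b), (_, d) = c
    det = a * d - b * b
    d0, d1 = y[0] - mu[0], y[1] - mu[1]
    # The inverse of [[a, b], [b, d]] is [[d, -b], [-b, a]] / det.
    return (d * d0 * d0 - 2.0 * b * d0 * d1 + a * d1 * d1) / det


def simulate_quadratic_form(n_samples=20000, seed=2):
    """Return the sample mean and variance of the quadratic form."""
    rng = random.Random(seed)
    mu = (1.0, -2.0)                 # hypothetical mean vector
    c = ((4.0, 1.2), (1.2, 1.0))     # hypothetical covariance matrix
    values = [quadratic_form_2d(sample_gaussian_2d(mu, c, rng), mu, c)
              for _ in range(n_samples)]
    return statistics.fmean(values), statistics.variance(values)


print(simulate_quadratic_form())  # theory for k = 2: mean 2, variance 4
```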
The sum of squares of statistically independent unit-variance Gaussian variables which do not have mean zero yields a generalization of the chi-square distribution called the noncentral chi-square distribution.
If Y is a vector of k i.i.d. standard normal random variables and A is a k×k idempotent matrix with rank k−n then the quadratic form Y′AY is chi-square distributed with k−n degrees of freedom.
The chi-square distribution is also naturally related to other distributions arising from the Gaussian. In particular,
- Y is F-distributed, Y ~ F(k1, k2), if $Y = \dfrac{X_1 / k_1}{X_2 / k_2}$, where X1 ~ χ²(k1) and X2 ~ χ²(k2) are statistically independent.
- If X is chi-square distributed, then $\sqrt{X}$ is chi distributed.
Generalizations
The chi-square distribution is obtained from the sum of the squares of k independent, zero-mean, unit-variance Gaussian random variables. Generalizations of this distribution can be obtained by summing the squares of other types of Gaussian random variables. Several such distributions are described below.
Noncentral chi-square distribution
The noncentral chi-square distribution is obtained from the sum of the squares of independent Gaussian random variables having unit variance and nonzero means.
Generalized chi-square distribution
The generalized chi-square distribution is obtained from the quadratic form z′Az where z is a zero-mean Gaussian vector having an arbitrary covariance matrix, and A is an arbitrary matrix.
Gamma, exponential, and related distributions
The chi-square distribution X ~ χ²(k) is a special case of the gamma distribution, in that X ~ Γ(k/2, 2), where the first parameter is the shape and the second is the scale.
Because the exponential distribution is also a special case of the Gamma distribution, if X ~ χ²(2), then X ~ Exp(1/2), an exponential distribution with rate 1/2.
The Erlang distribution is also a special case of the Gamma distribution, so if X ~ χ²(k) with even k, then X is Erlang distributed with shape parameter k/2 and rate parameter 1/2 (equivalently, scale parameter 2).
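These special cases can be confirmed by comparing densities pointwise. The sketch below writes each density out explicitly, using the shape/scale form for the gamma distribution and the rate form for the exponential and Erlang distributions; all function names are illustrative.

```python
import math


def chi2_pdf(x, k):
    """Chi-square density with k degrees of freedom, for x > 0."""
    return math.exp((k / 2 - 1) * math.log(x) - x / 2
                    - (k / 2) * math.log(2) - math.lgamma(k / 2))


def gamma_pdf(x, shape, scale):
    """Gamma density in the shape/scale parametrization, for x > 0."""
    return math.exp((shape - 1) * math.log(x) - x / scale
                    - shape * math.log(scale) - math.lgamma(shape))


def exponential_pdf(x, rate):
    """Exponential density with the given rate, for x >= 0."""
    return rate * math.exp(-rate * x)


def erlang_pdf(x, shape, rate):
    """Erlang density with integer shape and the given rate, for x > 0."""
    return (rate ** shape * x ** (shape - 1) * math.exp(-rate * x)
            / math.factorial(shape - 1))


for x in (0.5, 3.0, 10.0):
    print(x,
          math.isclose(chi2_pdf(x, 5), gamma_pdf(x, 2.5, 2.0)),
          math.isclose(chi2_pdf(x, 2), exponential_pdf(x, 0.5)),
          math.isclose(chi2_pdf(x, 6), erlang_pdf(x, 3, 0.5)))
```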
Applications
The chi-square distribution has numerous applications in inferential statistics, for instance in chi-square tests and in estimating variances. It enters the problem of estimating the mean of a normally distributed population and the problem of estimating the slope of a regression line via its role in Student’s t-distribution. It enters all analysis of variance problems via its role in the F-distribution, which is the distribution of the ratio of two independent chi-square random variables, each divided by its degrees of freedom.
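The following are some of the most common situations in which the chi-square distribution arises from a Gaussian-distributed sample.
- If $X_1, \dots, X_n$ are i.i.d. $N(\mu, \sigma^2)$ random variables, then $\sum_{i=1}^{n} (X_i - \bar{X})^2 \sim \sigma^2 \chi^2_{n-1}$, where $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$.
- The table below shows probability distributions with names starting with chi for some statistics based on $X_i \sim N(\mu_i, \sigma_i^2)$, $i = 1, \dots, k$, independent random variables:
| Name | Statistic |
|---|---|
| chi-square distribution | $\sum_{i=1}^{k} \left( \frac{X_i - \mu_i}{\sigma_i} \right)^2$ |
| noncentral chi-square distribution | $\sum_{i=1}^{k} \left( \frac{X_i}{\sigma_i} \right)^2$ |
| chi distribution | $\sqrt{\sum_{i=1}^{k} \left( \frac{X_i - \mu_i}{\sigma_i} \right)^2}$ |
| noncentral chi distribution | $\sqrt{\sum_{i=1}^{k} \left( \frac{X_i}{\sigma_i} \right)^2}$ |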
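The result for the sum of squared deviations from the sample mean can be illustrated by simulation. The sketch below uses the standard library only; the sample size n = 8, the values of μ and σ, and the function names are arbitrary choices for the example. It draws normal samples, forms $\sum_i (X_i - \bar{X})^2 / \sigma^2$, and compares its mean and variance with those of a chi-square distribution with n − 1 degrees of freedom, namely n − 1 and 2(n − 1).

```python
import random
import statistics


def scaled_sum_of_squares(n, mu, sigma, rng):
    """Sum of squared deviations from the sample mean, divided by sigma^2."""
    xs = [rng.gauss(mu, sigma) for _ in range(n)]
    x_bar = statistics.fmean(xs)
    return sum((x - x_bar) ** 2 for x in xs) / sigma ** 2


def simulate_statistic(n=8, mu=10.0, sigma=3.0, n_samples=20000, seed=1):
    """Return the sample mean and variance of the scaled statistic."""
    rng = random.Random(seed)
    values = [scaled_sum_of_squares(n, mu, sigma, rng) for _ in range(n_samples)]
    return statistics.fmean(values), statistics.variance(values)


mean, var = simulate_statistic()
# Theory for n = 8: mean n - 1 = 7 and variance 2 (n - 1) = 14.
print(f"sample mean {mean:.3f}, sample variance {var:.3f}")
```

One degree of freedom is lost because the deviations are measured from the estimated mean rather than from the true mean μ.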
See also
- Proofs related to chi-square distribution
- Cochran's theorem
- Inverse-chi-square distribution
- Degrees of freedom (statistics)
- Fisher's method for combining independent tests of significance
- Noncentral chi-square distribution
- Normal distribution
- Wishart distribution
- High-dimensional space
References
- [1] M.A. Sanders. "Characteristic function of the central chi-square distribution". http://www.planetmathematics.com/CentralChiDistr.pdf. Retrieved 2009-03-06.
- [2] Abramowitz, Milton; Stegun, Irene A., eds. (1965), "Chapter 26", Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, New York: Dover, ISBN 0-486-61272-4, http://www.math.sfu.ca/~cbm/aands/page_940.htm.
- [3] NIST (2006). Engineering Statistics Handbook - Chi-Square Distribution.
- [4] Johnson, N.L.; S. Kotz, N. Balakrishnan (1994). Continuous Univariate Distributions (Second Ed., Vol. 1, Chapter 18). John Wiley and Sons. ISBN 0-471-58495-9.
- [5] Mood, Alexander; Franklin A. Graybill, Duane C. Boes (1974). Introduction to the Theory of Statistics (Third Edition, pp. 241-246). McGraw-Hill. ISBN 0-07-042864-6.
- [6] Chi-square distribution, from MathWorld, retrieved Feb. 11, 2009.
- [7] M. K. Simon, Probability Distributions Involving Gaussian Random Variables, New York: Springer, 2002, eq. (2.35), ISBN 978-0-387-34657-1.
- [8] Box, Hunter and Hunter. Statistics for experimenters. Wiley. p. 46.
- Wilson, E.B.; Hilferty, M.M. (1931). The distribution of chi-square. Proceedings of the National Academy of Sciences, Washington, 17, 684–688.
External links
- Earliest Uses of Some of the Words of Mathematics: entry on Chi square has a brief history
- Comparison of noncentral and central distributions Density plot, critical value, cumulative probability, etc., online calculator based on R embedded in Mediawiki.
- Course notes on Chi-Square Goodness of Fit Testing from Yale University Stats 101 class. Example includes hypothesis testing and parameter estimation.
- On-line calculator for the significance of chi-square, in Richard Lowry's statistical website at Vassar College.
- Distribution Calculator Calculates probabilities and critical values for normal, t-, chi2- and F-distribution
- Chi-Square Calculator for critical values of Chi-Square in R. Webster West's applet website at University of South Carolina
- Chi-Square Calculator from GraphPad
- Table of Chi-squared distribution
- Mathematica demonstration showing the chi-squared sampling distribution of various statistics, e.g. Σx², for a normal population
- Simple algorithm for approximating cdf and inverse cdf for the chi-square distribution with a pocket calculator