Hotelling's T-squared distribution

In statistics, Hotelling's T-squared distribution arises as the distribution of a set of statistics that are natural generalisations of the statistics underlying Student's t distribution. In particular, it arises in multivariate statistics when testing for differences between the (multivariate) means of different populations, where the corresponding univariate problems would use a t-test. A random variable with this distribution is, up to a constant factor, F-distributed.

The distribution is named for Harold Hotelling, who developed it[1] as a generalization of Student's t distribution.

The distribution

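If the notation $T^2_{p,m}$ is used to denote a random variable having Hotelling's T-squared distribution with parameters $p$ and $m$, then, if a random variable $X$ has Hotelling's T-squared distribution,

$$X \sim T^2_{p,m},$$

then[1]

$$\frac{m-p+1}{pm}\, X \sim F_{p,\,m-p+1},$$

where $F_{p,\,m-p+1}$ is the F-distribution with parameters $p$ and $m-p+1$.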
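The relation to the F-distribution gives a direct way to evaluate tail probabilities and quantiles of the T-squared distribution: scale by (m − p + 1)/(pm) and use an F-distribution with p and m − p + 1 degrees of freedom. The following is a minimal Python sketch, assuming SciPy is available; the helper names `hotelling_t2_sf` and `hotelling_t2_ppf` are illustrative and not part of any library.

```python
from scipy import stats


def hotelling_t2_sf(x, p, m):
    """Upper-tail probability P(T2 > x) for T2 ~ T^2_{p,m} (assumes m >= p)."""
    scale = (m - p + 1) / (p * m)            # constant that maps T^2 onto F
    return stats.f.sf(scale * x, p, m - p + 1)


def hotelling_t2_ppf(q, p, m):
    """Quantile of T^2_{p,m}: take the F quantile and undo the scaling."""
    scale = (m - p + 1) / (p * m)
    return stats.f.ppf(q, p, m - p + 1) / scale


# Example: 95% critical value for p = 3, m = 20 (illustrative parameters).
crit = hotelling_t2_ppf(0.95, p=3, m=20)
print(crit, hotelling_t2_sf(crit, p=3, m=20))
```

With p = 1 the scaling constant equals 1, so the critical value coincides with the square of the two-sided Student's t critical value with m degrees of freedom.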

Hotelling's T-squared statistic

Hotelling's T-squared statistic is a generalization of Student's t statistic that is used in multivariate hypothesis testing, and is defined as follows.[1]

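Let $\mathcal{N}_p(\mu, \Sigma)$ denote a $p$-variate normal distribution with location $\mu$ and covariance $\Sigma$. Let

$$x_1, \dots, x_n \sim \mathcal{N}_p(\mu, \Sigma)$$

be $n$ independent random variables, which may be represented as $p \times 1$ column vectors of real numbers. Define

$$\bar{x} = \frac{x_1 + \cdots + x_n}{n}$$

to be the sample mean. It can be shown that[citation needed]

$$n(\bar{x} - \mu)' \Sigma^{-1} (\bar{x} - \mu) \sim \chi^2_p,$$

where $\chi^2_p$ is the chi-squared distribution with $p$ degrees of freedom. However, $\Sigma$ is often unknown and we wish to do hypothesis testing on the location $\mu$.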

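Define

$$W = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})'$$

to be the sample covariance. Here we denote transpose by an apostrophe. It can be shown that $W$ is positive-definite (almost surely, provided $n > p$) and that $(n-1)W$ follows a $p$-variate Wishart distribution with $n-1$ degrees of freedom.[2] Hotelling's T-squared statistic is then defined to be

$$t^2 = n(\bar{x} - \mu)' W^{-1} (\bar{x} - \mu)$$

because it can be shown that[citation needed]

$$t^2 \sim T^2_{p,\,n-1},$$

i.e.

$$\frac{n-p}{p(n-1)}\, t^2 \sim F_{p,\,n-p},$$

where $F_{p,\,n-p}$ is the F-distribution with parameters $p$ and $n-p$. To calculate a p-value, multiply the $t^2$ statistic by the above constant and take the upper-tail probability of the result under the $F_{p,\,n-p}$ distribution.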
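The one-sample procedure can be sketched in Python as follows. This is an illustrative sketch, not a reference implementation: it assumes NumPy and SciPy, takes the data as an n × p array with one observation per row and n > p, and uses a hypothetical function name `hotelling_one_sample`. The statistic is n times the quadratic form of the mean difference in the inverse sample covariance, and it is rescaled by (n − p)/(p(n − 1)) before being compared with an F-distribution with p and n − p degrees of freedom.

```python
import numpy as np
from scipy import stats


def hotelling_one_sample(X, mu0):
    """One-sample Hotelling T^2 test of H0: population mean equals mu0.

    X is an (n, p) array, one observation per row; requires n > p.
    Returns (t2, f_stat, p_value).
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    diff = X.mean(axis=0) - np.asarray(mu0, dtype=float)
    W = np.atleast_2d(np.cov(X, rowvar=False))   # sample covariance, divisor n - 1
    t2 = n * diff @ np.linalg.solve(W, diff)     # n (xbar - mu0)' W^{-1} (xbar - mu0)
    f_stat = (n - p) / (p * (n - 1)) * t2        # F(p, n - p) under H0
    p_value = stats.f.sf(f_stat, p, n - p)       # upper-tail probability
    return t2, f_stat, p_value


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
t2, f_stat, p_value = hotelling_one_sample(X, mu0=np.zeros(3))
print(f"T2 = {t2:.3f}, F = {f_stat:.3f}, p = {p_value:.3f}")
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse of the covariance matrix explicitly. For p = 1 the statistic reduces to the square of the usual one-sample t statistic.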

Hotelling's two-sample T-squared statistic

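If $x_1, \dots, x_{n_x} \sim \mathcal{N}_p(\mu, \Sigma)$ and $y_1, \dots, y_{n_y} \sim \mathcal{N}_p(\mu, \Sigma)$, with the samples independently drawn from two independent multivariate normal distributions with the same mean and covariance, and we define

$$\bar{x} = \frac{1}{n_x} \sum_{i=1}^{n_x} x_i, \qquad \bar{y} = \frac{1}{n_y} \sum_{i=1}^{n_y} y_i$$

as the sample means, and

$$W = \frac{\sum_{i=1}^{n_x} (x_i - \bar{x})(x_i - \bar{x})' + \sum_{i=1}^{n_y} (y_i - \bar{y})(y_i - \bar{y})'}{n_x + n_y - 2}$$

as the unbiased pooled covariance matrix estimate, then Hotelling's two-sample T-squared statistic is

$$t^2 = \frac{n_x n_y}{n_x + n_y} (\bar{x} - \bar{y})' W^{-1} (\bar{x} - \bar{y}) \sim T^2_{p,\,n_x+n_y-2}$$

and it can be related to the F-distribution by[2]

$$\frac{n_x + n_y - p - 1}{(n_x + n_y - 2)\,p}\, t^2 \sim F_{p,\,n_x+n_y-1-p}.$$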
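A two-sample version can be sketched in the same way. The code below is a hedged illustration assuming NumPy and SciPy, with a hypothetical function name `hotelling_two_sample`; each sample is an array with one observation per row, and the two populations are assumed to share a covariance matrix. It pools the two scatter matrices with divisor n_x + n_y − 2, forms the statistic with the factor n_x n_y/(n_x + n_y), and rescales it to an F-distribution with p and n_x + n_y − p − 1 degrees of freedom.

```python
import numpy as np
from scipy import stats


def hotelling_two_sample(X, Y):
    """Two-sample Hotelling T^2 test of equal means, assuming a common covariance.

    X is (n_x, p) and Y is (n_y, p); requires n_x + n_y - 2 >= p.
    Returns (t2, f_stat, p_value).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    nx, p = X.shape
    ny = Y.shape[0]
    dx = X - X.mean(axis=0)                          # centred samples
    dy = Y - Y.mean(axis=0)
    W = (dx.T @ dx + dy.T @ dy) / (nx + ny - 2)      # unbiased pooled covariance
    diff = X.mean(axis=0) - Y.mean(axis=0)
    t2 = nx * ny / (nx + ny) * diff @ np.linalg.solve(W, diff)
    f_stat = (nx + ny - p - 1) / ((nx + ny - 2) * p) * t2
    p_value = stats.f.sf(f_stat, p, nx + ny - p - 1)
    return t2, f_stat, p_value


# Synthetic data, for illustration only: the second sample has a shifted mean.
rng = np.random.default_rng(0)
X = rng.normal(size=(25, 2))
Y = rng.normal(loc=[0.8, 0.0], size=(20, 2))
print(hotelling_two_sample(X, Y))
```

For p = 1 this reduces to the square of the pooled-variance two-sample t statistic.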

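The non-null distribution of this statistic is the noncentral F-distribution (the ratio of a noncentral chi-squared random variable to an independent central chi-squared random variable, each divided by its degrees of freedom):

$$\frac{n_x + n_y - p - 1}{(n_x + n_y - 2)\,p}\, t^2 \sim F_{p,\,n_x+n_y-1-p}(\delta),$$

with

$$\delta = \frac{n_x n_y}{n_x + n_y}\, \nu' \Sigma^{-1} \nu,$$

where $\nu$ is the difference vector between the population means and $\Sigma$ is the common covariance matrix.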
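The noncentral F-distribution makes it possible to compute the power of the two-sample test for a hypothesised mean difference. The sketch below assumes NumPy and SciPy (`scipy.stats.ncf` is SciPy's noncentral F-distribution); the function name `two_sample_power` is hypothetical, and the mean difference and common covariance passed to it are population values the user must supply as assumptions.

```python
import numpy as np
from scipy import stats


def two_sample_power(nu, Sigma, nx, ny, alpha=0.05):
    """Power of the two-sample T^2 test when the population means differ by nu.

    nu is the assumed mean difference (length p, not all zero) and Sigma is
    the assumed common covariance matrix (p x p).
    """
    nu = np.atleast_1d(np.asarray(nu, dtype=float))
    Sigma = np.atleast_2d(np.asarray(Sigma, dtype=float))
    p = nu.size
    dfn, dfd = p, nx + ny - p - 1
    # Noncentrality parameter: (nx ny / (nx + ny)) nu' Sigma^{-1} nu
    delta = nx * ny / (nx + ny) * nu @ np.linalg.solve(Sigma, nu)
    f_crit = stats.f.ppf(1 - alpha, dfn, dfd)        # critical value under the null
    return stats.ncf.sf(f_crit, dfn, dfd, delta)     # P(reject) under the alternative


# Illustrative values only.
print(two_sample_power([1.0, 0.0], np.eye(2), nx=20, ny=20))
```

The power grows with the noncentrality parameter, so larger samples or a larger mean difference relative to the covariance make the test more likely to reject.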

References

  1. Hotelling, H. (1931). "The generalization of Student's ratio". Annals of Mathematical Statistics. 2 (3): 360–378. doi:10.1214/aoms/1177732979.
  2. Mardia, K. V.; Kent, J. T.; Bibby, J. M. (1979). Multivariate Analysis. Academic Press.