# Hotelling's T-squared distribution

In statistics, Hotelling's T-squared distribution is a univariate distribution proportional to the F-distribution. It arises as the distribution of a set of statistics that are natural generalizations of the statistics underlying Student's t-distribution. In particular, it arises in multivariate statistics when testing for differences between the (multivariate) means of different populations, where the corresponding univariate problems would use a t-test.

The distribution is named for Harold Hotelling, who developed it[1] as a generalization of Student's t-distribution.

## The distribution

If the $p\times 1$ vector $\mathbf d$ has a multivariate Gaussian distribution with zero mean and unit covariance matrix, $\mathcal{N}(\mathbf 0_p,\mathbf I_p)$, and $\mathbf M$ is a $p\times p$ matrix, independent of $\mathbf d$, with a Wishart distribution with unit scale matrix and m degrees of freedom, $W(\mathbf I_p,m)$, then $m\,\mathbf d'\mathbf M^{-1}\mathbf d$ has a Hotelling $T^2$ distribution with dimensionality parameter p and m degrees of freedom.[2]

Let $T^2_{p,m}$ denote Hotelling's T-squared distribution with parameters p and m. If a random variable X has this distribution,

$X \sim T^2_{p,m}$

then[1]

$\frac{m-p+1}{pm} X\sim F_{p,m-p+1}$

where $F_{p,m-p+1}$ is the F-distribution with parameters p and m−p+1.
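Because of this proportionality, tail probabilities and quantiles of $T^2_{p,m}$ can be obtained from any F-distribution routine. The following is a minimal sketch in Python, assuming SciPy is available; the function names `hotelling_t2_sf` and `hotelling_t2_ppf` are illustrative and do not belong to any library.

```python
from scipy import stats


def hotelling_t2_sf(x, p, m):
    """Upper-tail probability P(X > x) for X ~ T^2_{p,m}.

    Uses the relation (m - p + 1) / (p * m) * X ~ F_{p, m-p+1}.
    """
    if m - p + 1 <= 0:
        raise ValueError("need m >= p so that m - p + 1 is positive")
    scale = (m - p + 1) / (p * m)
    return stats.f.sf(scale * x, p, m - p + 1)


def hotelling_t2_ppf(q, p, m):
    """Quantile of T^2_{p,m}, obtained by undoing the same scaling."""
    if m - p + 1 <= 0:
        raise ValueError("need m >= p so that m - p + 1 is positive")
    scale = (m - p + 1) / (p * m)
    return stats.f.ppf(q, p, m - p + 1) / scale
```

For p = 1 the scaling constant is 1 and $F_{1,m}$ is the distribution of a squared Student's t variable with m degrees of freedom, so `hotelling_t2_ppf(0.95, 1, m)` should agree with the square of the two-sided 5% critical value of the t-distribution.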

## Hotelling's T-squared statistic

Hotelling's T-squared statistic is a generalization of Student's t statistic that is used in multivariate hypothesis testing, and is defined as follows.[1]

Let $\mathcal{N}_p(\boldsymbol{\mu},{\mathbf \Sigma})$ denote a p-variate normal distribution with location $\boldsymbol{\mu}$ and covariance ${\mathbf \Sigma}$. Let

${\mathbf x}_1,\dots,{\mathbf x}_n\sim \mathcal{N}_p(\boldsymbol{\mu},{\mathbf \Sigma})$

be n independent random variables, which may be represented as $p\times1$ column vectors of real numbers. Define

$\overline{\mathbf x}=\frac{\mathbf{x}_1+\cdots+\mathbf{x}_n}{n}$

to be the sample mean. It can be shown that

$n(\overline{\mathbf x}-\boldsymbol{\mu})'{\mathbf \Sigma}^{-1}(\overline{\mathbf x}-\boldsymbol{\mathbf\mu})\sim\chi^2_p ,$

where $\chi^2_p$ is the chi-squared distribution with p degrees of freedom. To show this, use the fact that $\overline{\mathbf x}\sim \mathcal{N}_p(\boldsymbol{\mu},{\mathbf \Sigma}/n)$ and derive the characteristic function of the random variable $\mathbf y=n(\overline{\mathbf x}-\boldsymbol{\mu})'{\mathbf \Sigma}^{-1}(\overline{\mathbf x}-\boldsymbol{\mathbf\mu})$:

$\phi_{\mathbf y}(\theta)=\operatorname{E} e^{i \theta \mathbf y},$
$=\operatorname{E} e^{i \theta n(\overline{\mathbf x}-\boldsymbol{\mu})'{\mathbf \Sigma}^{-1}(\overline{\mathbf x}-\boldsymbol{\mathbf\mu})}$
$= \int e^{i \theta n(\overline{\mathbf x}-\boldsymbol{\mu})'{\mathbf \Sigma}^{-1}(\overline{\mathbf x}-\boldsymbol{\mathbf\mu})} (2\pi)^{-\frac{p}{2}}|\boldsymbol\Sigma/n|^{-\frac{1}{2}}\, e^{ -\frac{1}{2}n(\overline{\mathbf x}-\boldsymbol\mu)'\boldsymbol\Sigma^{-1}(\overline{\mathbf x}-\boldsymbol\mu) }\,dx_{1}...dx_{p}$
$= \int (2\pi)^{-\frac{p}{2}}|\boldsymbol\Sigma/n|^{-\frac{1}{2}}\, e^{ -\frac{1}{2}n(\overline{\mathbf x}-\boldsymbol\mu)'(\boldsymbol\Sigma^{-1}-2 i \theta \boldsymbol\Sigma^{-1})(\overline{\mathbf x}-\boldsymbol\mu) }\,dx_{1}...dx_{p},$
$= |(\boldsymbol\Sigma^{-1}-2 i \theta \boldsymbol\Sigma^{-1})^{-1}/n|^{\frac{1}{2}} |\boldsymbol\Sigma/n|^{-\frac{1}{2}} \int (2\pi)^{-\frac{p}{2}} |(\boldsymbol\Sigma^{-1}-2 i \theta \boldsymbol\Sigma^{-1})^{-1}/n|^{-\frac{1}{2}} \, e^{ -\frac{1}{2}n(\overline{\mathbf x}-\boldsymbol\mu)'(\boldsymbol\Sigma^{-1}-2 i \theta \boldsymbol\Sigma^{-1})(\overline{\mathbf x}-\boldsymbol\mu) }\,dx_{1}...dx_{p},$
$= |(\mathbf I_p-2 i \theta \mathbf I_p)|^{-\frac{1}{2}},$
$= (1-2 i \theta)^{-\frac{p}{2}}.~~\blacksquare$
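The last two steps of this derivation can be spelled out as follows. The remaining integrand has the form of a p-variate normal density (with a complex-valued covariance matrix, so the argument is a formal one) and therefore integrates to 1. The prefactor then simplifies using $|c\mathbf A|=c^p|\mathbf A|$ for a scalar $c$ and a $p\times p$ matrix $\mathbf A$:

$|(\boldsymbol\Sigma^{-1}-2 i \theta \boldsymbol\Sigma^{-1})^{-1}/n|^{\frac{1}{2}}\,|\boldsymbol\Sigma/n|^{-\frac{1}{2}} = |(1-2 i \theta)^{-1}\boldsymbol\Sigma/n|^{\frac{1}{2}}\,|\boldsymbol\Sigma/n|^{-\frac{1}{2}} = (1-2 i \theta)^{-\frac{p}{2}}.$

This is the characteristic function of the $\chi^2_p$ distribution, which identifies the distribution of $\mathbf y$.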

However, ${\mathbf \Sigma}$ is often unknown and we wish to do hypothesis testing on the location $\boldsymbol{\mu}$.

### Sum of p squared t's

Define

${\mathbf W}=\frac{1}{n-1}\sum_{i=1}^n (\mathbf{x}_i-\overline{\mathbf x})(\mathbf{x}_i-\overline{\mathbf x})'$

to be the sample covariance. Here we denote transpose by an apostrophe. It can be shown that $\mathbf W$ is positive-definite (with probability one when n > p) and that $(n-1)\mathbf W$ follows a p-variate Wishart distribution with n−1 degrees of freedom.[3] Hotelling's T-squared statistic is then defined[4] to be

$t^2=n(\overline{\mathbf x}-\boldsymbol{\mu})'{\mathbf W}^{-1}(\overline{\mathbf x}-\boldsymbol{\mathbf\mu})$

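and, since $\overline{\mathbf x}$ and $\mathbf W$ are independent, it follows from the definition of the distribution above that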

$t^2 \sim T^2_{p,n-1}$

i.e.

$\frac{n-p}{p(n-1)}t^2 \sim F_{p,n-p} ,$

where $F_{p,n-p}$ is the F-distribution with parameters p and n−p. To calculate a p-value, multiply the $t^2$ statistic by the above constant and use the F-distribution.
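The one-sample procedure can be sketched in Python as follows, assuming NumPy and SciPy are available. The function name `hotelling_one_sample` and its argument names are hypothetical, chosen for illustration; the data matrix is assumed to hold one observation per row.

```python
import numpy as np
from scipy import stats


def hotelling_one_sample(X, mu0):
    """One-sample Hotelling T-squared test of H0: mean == mu0.

    X   : (n, p) array, one observation per row (assumed layout).
    mu0 : length-p hypothesised mean vector.
    Returns (t2, f_stat, p_value).
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    if n <= p:
        raise ValueError("need n > p for an invertible sample covariance")
    diff = X.mean(axis=0) - np.asarray(mu0, dtype=float)
    # Sample covariance W with divisor n - 1; atleast_2d covers p == 1.
    W = np.atleast_2d(np.cov(X, rowvar=False))
    # t^2 = n (xbar - mu)' W^{-1} (xbar - mu), via a linear solve
    # rather than an explicit matrix inverse.
    t2 = n * diff @ np.linalg.solve(W, diff)
    # Rescale so that the statistic follows F_{p, n-p} under H0.
    f_stat = (n - p) / (p * (n - 1)) * t2
    p_value = stats.f.sf(f_stat, p, n - p)
    return t2, f_stat, p_value
```

For p = 1 the statistic reduces to the square of the ordinary one-sample t statistic, so the p-value should match that of a two-sided t-test.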

## Hotelling's two-sample T-squared statistic

If ${\mathbf x}_1,\dots,{\mathbf x}_{n_x}\sim N_p(\boldsymbol{\mu},{\mathbf V})$ and ${\mathbf y}_1,\dots,{\mathbf y}_{n_y}\sim N_p(\boldsymbol{\mu},{\mathbf V})$, with the samples independently drawn from two independent multivariate normal distributions with the same mean and covariance, and we define

$\overline{\mathbf x}=\frac{1}{n_x}\sum_{i=1}^{n_x} \mathbf{x}_i \qquad \overline{\mathbf y}=\frac{1}{n_y}\sum_{i=1}^{n_y} \mathbf{y}_i$

as the sample means, and

${\mathbf W}= \frac{\sum_{i=1}^{n_x}(\mathbf{x}_i-\overline{\mathbf x})(\mathbf{x}_i-\overline{\mathbf x})' +\sum_{i=1}^{n_y}(\mathbf{y}_i-\overline{\mathbf y})(\mathbf{y}_i-\overline{\mathbf y})'}{n_x+n_y-2}$

as the unbiased pooled covariance matrix estimate, then Hotelling's two-sample T-squared statistic is

$t^2 = \frac{n_x n_y}{n_x+n_y}(\overline{\mathbf x}-\overline{\mathbf y})'{\mathbf W}^{-1}(\overline{\mathbf x}-\overline{\mathbf y}) \sim T^2(p, n_x+n_y-2)$

and it can be related to the F-distribution by[3]

$\frac{n_x+n_y-p-1}{(n_x+n_y-2)p}t^2 \sim F(p,n_x+n_y-1-p).$
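A minimal Python sketch of the two-sample statistic and its F-based p-value is given below, assuming NumPy and SciPy are available. The function name `hotelling_two_sample` is hypothetical, and each sample is assumed to be stored with one observation per row.

```python
import numpy as np
from scipy import stats


def hotelling_two_sample(X, Y):
    """Two-sample Hotelling T-squared test of equal mean vectors.

    X : (n_x, p) array, Y : (n_y, p) array, one observation per row.
    Assumes a common covariance matrix, as in the text.
    Returns (t2, f_stat, p_value).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    nx, p = X.shape
    ny, p_y = Y.shape
    if p != p_y:
        raise ValueError("X and Y must have the same number of columns")
    df2 = nx + ny - p - 1
    if df2 <= 0:
        raise ValueError("need n_x + n_y > p + 1")
    dx = X - X.mean(axis=0)
    dy = Y - Y.mean(axis=0)
    # Unbiased pooled covariance estimate W.
    W = (dx.T @ dx + dy.T @ dy) / (nx + ny - 2)
    diff = X.mean(axis=0) - Y.mean(axis=0)
    t2 = nx * ny / (nx + ny) * diff @ np.linalg.solve(W, diff)
    # Rescale so that the statistic follows F(p, n_x + n_y - 1 - p).
    f_stat = df2 / ((nx + ny - 2) * p) * t2
    p_value = stats.f.sf(f_stat, p, df2)
    return t2, f_stat, p_value
```

With p = 1 this reduces to the square of the pooled-variance two-sample t statistic.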

When the population means differ, the distribution of this statistic is the noncentral F-distribution (the ratio of a noncentral chi-squared random variable and an independent central chi-squared random variable, each divided by its degrees of freedom)

$\frac{n_x+n_y-p-1}{(n_x+n_y-2)p}t^2 \sim F(p,n_x+n_y-1-p;\delta),$

with

$\delta = \frac{n_x n_y}{n_x+n_y}\boldsymbol{\nu}'\mathbf{V}^{-1}\boldsymbol{\nu},$

where $\boldsymbol{\nu}$ is the difference vector between the population means.
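One use of the noncentral distribution is to approximate the power of the two-sample test for a given mean difference. The sketch below assumes NumPy and SciPy, uses `scipy.stats.ncf` for the noncentral F-distribution, and treats $\mathbf V$ as known for the purpose of the calculation; the function name `two_sample_power` and the default `alpha` are illustrative choices.

```python
import numpy as np
from scipy import stats


def two_sample_power(nu, V, nx, ny, alpha=0.05):
    """Power of the two-sample Hotelling test at level alpha.

    nu : length-p difference between the population means.
    V  : (p, p) common covariance matrix (assumed known here).
    """
    nu = np.asarray(nu, dtype=float)
    V = np.asarray(V, dtype=float)
    p = nu.size
    df2 = nx + ny - p - 1
    if df2 <= 0:
        raise ValueError("need n_x + n_y > p + 1")
    # Noncentrality parameter delta from the text.
    delta = nx * ny / (nx + ny) * nu @ np.linalg.solve(V, nu)
    # Critical value of the central F under the null hypothesis.
    f_crit = stats.f.ppf(1 - alpha, p, df2)
    # Probability that the noncentral F exceeds the critical value.
    return stats.ncf.sf(f_crit, p, df2, delta)
```

For fixed sample sizes, the computed power should increase as the mean difference grows relative to $\mathbf V$.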