
Scatter matrix: Difference between revisions

From Wikipedia, the free encyclopedia
→‎Definition: Another definition of S, more convenient computationally when you make a small change to the data and wish to update S
Line 12: Line 12:
The '''scatter matrix''' is the ''m''-by-''m'' [[positive semi-definite]] matrix


:<math>S = \sum_{j=1}^n (\mathbf{x}_j-\overline{\mathbf{x}})(\mathbf{x}_j-\overline{\mathbf{x}})^T</math>
:<math>S = \sum_{j=1}^n (\mathbf{x}_j-\overline{\mathbf{x}})(\mathbf{x}_j-\overline{\mathbf{x}})^T = \left( \sum_{j=1}^n \mathbf{x}_j \mathbf{x}_j^T \right) - n \overline{\mathbf{x}} \overline{\mathbf{x}}^T </math>


where <math>T</math> denotes [[matrix transpose]]. The scatter matrix may be expressed more succinctly as

Revision as of 15:21, 24 July 2013

For the notion in quantum mechanics, see scattering matrix.

In multivariate statistics and probability theory, the scatter matrix is a statistic used to estimate the covariance matrix of the multivariate normal distribution.

Definition

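Given n samples of m-dimensional data, represented as the m-by-n matrix <math>X=[\mathbf{x}_1,\mathbf{x}_2,\ldots,\mathbf{x}_n]</math>, the sample mean is

:<math>\overline{\mathbf{x}} = \frac{1}{n}\sum_{j=1}^n \mathbf{x}_j</math>

where <math>\mathbf{x}_j</math> is the jth column of <math>X</math>.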

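The scatter matrix is the m-by-m positive semi-definite matrix

:<math>S = \sum_{j=1}^n (\mathbf{x}_j-\overline{\mathbf{x}})(\mathbf{x}_j-\overline{\mathbf{x}})^T = \left( \sum_{j=1}^n \mathbf{x}_j \mathbf{x}_j^T \right) - n \overline{\mathbf{x}} \overline{\mathbf{x}}^T </math>

where <math>T</math> denotes matrix transpose. The scatter matrix may be expressed more succinctly as

:<math>S = X\,C_n\,X^T</math>

where <math>C_n</math> is the n-by-n centering matrix.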
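
The equivalent forms of the scatter matrix can be checked numerically. The following is a minimal sketch in Python with NumPy, not a reference implementation; the function names (`scatter_direct`, `scatter_raw_moments`, `scatter_centering`, `scatter_from_sums`) are hypothetical, and the centering matrix is assumed to be the usual <math>C_n = I_n - \tfrac{1}{n}\mathbf{1}\mathbf{1}^T</math>. The last function illustrates why the raw-moment form is convenient for updates: only the running sums and the count need to be stored.

```python
import numpy as np

def scatter_direct(X):
    """Sum of outer products of the centered columns of the m-by-n matrix X."""
    xbar = X.mean(axis=1, keepdims=True)   # m-by-1 sample mean
    D = X - xbar                           # centered data
    return D @ D.T

def scatter_raw_moments(X):
    """(sum_j x_j x_j^T) - n * xbar xbar^T."""
    n = X.shape[1]
    xbar = X.mean(axis=1, keepdims=True)
    return X @ X.T - n * (xbar @ xbar.T)

def scatter_centering(X):
    """X C_n X^T, assuming C_n = I_n - (1/n) * ones((n, n))."""
    n = X.shape[1]
    C = np.eye(n) - np.ones((n, n)) / n
    return X @ C @ X.T

def scatter_from_sums(sum_outer, sum_x, n):
    """Scatter matrix from running sums; n * xbar xbar^T = sum_x sum_x^T / n."""
    return sum_outer - (sum_x @ sum_x.T) / n

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 8))                # m = 3 dimensions, n = 8 samples
S = scatter_direct(X)

# Appending one sample only changes the running sums and the count.
x_new = rng.normal(size=(3, 1))
sum_outer = X @ X.T + x_new @ x_new.T
sum_x = X.sum(axis=1, keepdims=True) + x_new
S_updated = scatter_from_sums(sum_outer, sum_x, X.shape[1] + 1)
```

The raw-moment form subtracts two potentially large quantities, so in floating point it can lose accuracy when the mean is large relative to the spread of the data; the centered form is the safer choice when all samples are available at once.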

Application

The maximum likelihood estimate, given n samples, for the covariance matrix of a multivariate normal distribution can be expressed as the normalized scatter matrix

:<math>C_{ML}=\frac{1}{n}S.</math>
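
As an illustration of the normalization by n, the sketch below computes the scatter matrix divided by the number of samples and compares it with NumPy's `np.cov`, which treats rows as variables by default and divides by n when `bias=True`. The helper name `covariance_mle` is hypothetical, and the data are synthetic.

```python
import numpy as np

def covariance_mle(X):
    """Scatter matrix of the m-by-n matrix X divided by the sample count n."""
    n = X.shape[1]
    D = X - X.mean(axis=1, keepdims=True)  # center each row (variable)
    return (D @ D.T) / n

rng = np.random.default_rng(1)
X = rng.normal(size=(2, 50))               # m = 2 dimensions, n = 50 samples
C_ml = covariance_mle(X)

# np.cov with bias=True uses the same 1/n normalization.
C_numpy = np.cov(X, bias=True)
```

With the default `bias=False`, `np.cov` divides by n − 1 instead, so it returns the maximum likelihood estimate scaled by n/(n − 1).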

When the columns of <math>X</math> are independently sampled from a multivariate normal distribution, <math>S</math> has a Wishart distribution.
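
A small simulation can make the Wishart statement concrete. The sketch below relies on the standard result that the scatter matrix of n independent normal samples has n − 1 degrees of freedom, so its expected value is (n − 1)Σ; the covariance `Sigma`, the seed, and the trial count are arbitrary choices for illustration. Samples are stored as rows here, the transpose of the column convention used above.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, trials = 2, 10, 20000
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])

# samples[t] holds the n draws (as rows) for trial t; shape (trials, n, m).
samples = rng.multivariate_normal(np.zeros(m), Sigma, size=(trials, n))

# Center each trial about its own sample mean.
centered = samples - samples.mean(axis=1, keepdims=True)

# One m-by-m scatter matrix per trial: sum over samples of outer products.
scatters = np.einsum("tni,tnj->tij", centered, centered)

# The average scatter matrix should be close to (n - 1) * Sigma.
mean_scatter = scatters.mean(axis=0)
```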

See also