Schur product theorem

From Wikipedia, the free encyclopedia
Jump to: navigation, search

In mathematics, particularly in linear algebra, the Schur product theorem states that the Hadamard product of two positive definite matrices is also a positive definite matrix. The result is named after Issai Schur[1] (Schur 1911, p. 14, Theorem VII) (note that Schur signed as J. Schur in Journal für die reine und angewandte Mathematik.[2][3])

Proof[edit]

Proof using the trace formula[edit]

It is easy to show that for matrices M and N, the Hadamard product M \circ N considered as a bilinear form acts on vectors a, b as

a^T (M \circ N) b = \operatorname{Tr}(M \operatorname{diag}(a) N \operatorname{diag}(b))

where \operatorname{Tr} is the matrix trace and \operatorname{diag}(a) is the diagonal matrix having as diagonal entries the elements of a.

Since M and N are positive definite, we can consider their square-roots M^{1/2} and N^{1/2} and write

\operatorname{Tr}(M \operatorname{diag}(a) N \operatorname{diag}(b)) = \operatorname{Tr}(M^{1/2} M^{1/2} \operatorname{diag}(a) N^{1/2} N^{1/2} \operatorname{diag}(b)) = \operatorname{Tr}(M^{1/2} \operatorname{diag}(a) N^{1/2} N^{1/2} \operatorname{diag}(b) M^{1/2})

Then, for a=b, this is written as \operatorname{Tr}(A^T A) for A = N^{1/2} \operatorname{diag}(a) M^{1/2} and thus is positive. This shows that (M \circ N) is a positive definite matrix.

Proof using Gaussian integration[edit]

Case of M = N[edit]

Let X be an n-dimensional centered Gaussian random variable with covariance \langle X_i X_j \rangle = M_{ij}. Then the covariance matrix of X_i^2 and X_j^2 is

\operatorname{Cov}(X_i^2, X_j^2) = \langle X_i^2 X_j^2 \rangle - \langle X_i^2 \rangle \langle X_j^2 \rangle

Using Wick's theorem to develop \langle X_i^2 X_j^2 \rangle = 2 \langle X_i X_j \rangle^2 + \langle X_i^2 \rangle \langle X_j^2 \rangle we have

\operatorname{Cov}(X_i^2, X_j^2) = 2 \langle X_i X_j \rangle^2 = 2 M_{ij}^2

Since a covariance matrix is positive definite, this proves that the matrix with elements M_{ij}^2 is a positive definite matrix.

General case[edit]

Let X and Y be n-dimensional centered Gaussian random variables with covariances \langle X_i X_j \rangle = M_{ij}, \langle Y_i Y_j \rangle = N_{ij} and independt from each other so that we have

\langle X_i Y_j \rangle = 0 for any i, j

Then the covariance matrix of X_i Y_i and X_j Y_j is

\operatorname{Cov}(X_i Y_i, X_j Y_j) = \langle X_i Y_i X_j Y_j \rangle - \langle X_i Y_i \rangle \langle X_j Y_j \rangle

Using Wick's theorem to develop

\langle X_i Y_i X_j Y_j \rangle = \langle X_i X_j \rangle \langle Y_i Y_j \rangle +  \langle X_i Y_i \rangle \langle X_i Y_j \rangle + \langle X_i Y_j \rangle \langle X_j Y_i \rangle[citation needed]

and also using the independence of X and Y, we have

\operatorname{Cov}(X_i Y_i, X_j Y_j) = \langle X_i X_j \rangle \langle Y_i Y_j \rangle = M_{ij} N_{ij}

Since a covariance matrix is positive definite, this proves that the matrix with elements M_{ij} N_{ij} is a positive definite matrix.

Proof using eigendecomposition[edit]

Proof of positivity[edit]

Let M = \sum \mu_i m_i m_i^T and N = \sum \nu_i n_i n_i^T. Then

M \circ N = \sum_{ij} \mu_i \nu_j (m_i m_i^T) \circ (n_j n_j^T) = \sum_{ij} \mu_i \nu_j (m_i \circ n_j) (m_i \circ n_j)^T

Each (m_i \circ n_j) (m_i \circ n_j)^T is positive (but, except in the 1-dimensional case, not positive definite, since they are rank 1 matrices) and \mu_i \nu_j > 0, thus the sum giving M \circ N is also positive.

Complete proof[edit]

To show that the result is positive definite requires further proof. We shall show that for any vector a \neq 0, we have a^T (M \circ N) a > 0. Continuing as above, each a^T (m_i \circ n_j) (m_i \circ n_j)^T a \ge 0, so it remains to show that there exist i and j for which the inequality is strict. For this we observe that

a^T (m_i \circ n_j) (m_i \circ n_j)^T a = \left(\sum_k m_{i,k} n_{j,k} a_k\right)^2

Since N is positive definite, there is a j for which n_{j,k} a_k is not 0 for all k, and then, since M is positive definite, there is an i for which m_{i,k} n_{j,k} a_k is not 0 for all k. Then for this iand j we have \left(\sum_k m_{i,k} n_{j,k} a_k\right)^2 > 0. This completes the proof.

References[edit]

  1. ^ "Bemerkungen zur Theorie der beschränkten Bilinearformen mit unendlich vielen Veränderlichen". Journal für die reine und angewandte Mathematik (Crelle's Journal) 1911 (140): 1–00. 1911. doi:10.1515/crll.1911.140.1.  edit
  2. ^ Zhang, Fuzhen, ed. (2005). "The Schur Complement and Its Applications". Numerical Methods and Algorithms 4. doi:10.1007/b105056. ISBN 0-387-24271-6.  edit, page 9, Ch. 0.6 Publication under J. Schur
  3. ^ Ledermann, W. (1983). "Issai Schur and His School in Berlin". Bulletin of the London Mathematical Society 15 (2): 97–106. doi:10.1112/blms/15.2.97.  edit

External links[edit]