In mathematics, particularly in linear algebra, the Schur product theorem states that the Hadamard product of two positive definite matrices is also a positive definite matrix. The result is named after Issai Schur[1] (Schur 1911, p. 14, Theorem VII); note that Schur signed as J. Schur in Journal für die reine und angewandte Mathematik.[2][3]
Proof
Proof using the trace formula
For any matrices $M$ and $N$, the Hadamard product $M\circ N$ considered as a bilinear form acts on vectors $a,b$ as

$$a^{*}(M\circ N)b=\operatorname{tr}\left(M^{T}\operatorname{diag}(a^{*})\,N\operatorname{diag}(b)\right)$$

where $\operatorname{tr}$ is the matrix trace and $\operatorname{diag}(a)$ is the diagonal matrix having as diagonal entries the elements of $a$.
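As a sanity check, the trace identity above can be verified numerically. The sketch below uses NumPy with random complex test data (all matrices and vectors are illustrative, not from the source):

```python
import numpy as np

rng = np.random.default_rng(0)

# Random complex matrices and vectors (illustrative test data).
n = 4
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
N = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
a = rng.standard_normal(n) + 1j * rng.standard_normal(n)
b = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# Left side: a* (M ∘ N) b, with ∘ the entrywise (Hadamard) product.
lhs = a.conj() @ (M * N) @ b

# Right side: tr(M^T diag(a*) N diag(b)).
rhs = np.trace(M.T @ np.diag(a.conj()) @ N @ np.diag(b))

assert np.isclose(lhs, rhs)
```

Expanding both sides in indices gives $\sum_{i,k}\bar{a}_i M_{ik} N_{ik} b_k$, which is why the identity holds for arbitrary (not necessarily Hermitian) $M$ and $N$.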
Suppose $M$ and $N$ are positive definite, and therefore Hermitian. We can consider their square roots $M^{\frac{1}{2}}$ and $N^{\frac{1}{2}}$, which are also Hermitian, and write

$$\operatorname{tr}\left(M^{T}\operatorname{diag}(a^{*})N\operatorname{diag}(b)\right)=\operatorname{tr}\left({\overline{M}}^{\frac{1}{2}}{\overline{M}}^{\frac{1}{2}}\operatorname{diag}(a^{*})N^{\frac{1}{2}}N^{\frac{1}{2}}\operatorname{diag}(b)\right)=\operatorname{tr}\left({\overline{M}}^{\frac{1}{2}}\operatorname{diag}(a^{*})N^{\frac{1}{2}}N^{\frac{1}{2}}\operatorname{diag}(b){\overline{M}}^{\frac{1}{2}}\right)$$

using $M^{T}={\overline{M}}$ (since $M$ is Hermitian) and the cyclic property of the trace. Then, for $a=b$, this is written as $\operatorname{tr}(A^{*}A)$ for $A=N^{\frac{1}{2}}\operatorname{diag}(a){\overline{M}}^{\frac{1}{2}}$, and thus is strictly positive for $A\neq 0$, which occurs if and only if $a\neq 0$. This shows that $(M\circ N)$ is a positive definite matrix.
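The conclusion can be checked numerically: for two randomly generated Hermitian positive definite matrices, every eigenvalue of the Hadamard product should be strictly positive. A minimal NumPy sketch, with illustrative test matrices built in the $AA^{*}+cI$ form:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two random Hermitian positive definite matrices (A A* + n I form).
n = 5
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
M = A @ A.conj().T + n * np.eye(n)
N = B @ B.conj().T + n * np.eye(n)

# Hadamard (entrywise) product; Hermitian since M and N are.
H = M * N

# By the Schur product theorem, all eigenvalues of H are strictly positive.
eigs = np.linalg.eigvalsh(H)
assert np.all(eigs > 0)
```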
Proof using Gaussian integration
Case of M = N
Let $X$ be an $n$-dimensional centered Gaussian random variable with covariance $\langle X_{i}X_{j}\rangle =M_{ij}$. Then the covariance matrix of $X_{i}^{2}$ and $X_{j}^{2}$ is

$$\operatorname{Cov}(X_{i}^{2},X_{j}^{2})=\langle X_{i}^{2}X_{j}^{2}\rangle -\langle X_{i}^{2}\rangle \langle X_{j}^{2}\rangle$$

Using Wick's theorem to develop

$$\langle X_{i}^{2}X_{j}^{2}\rangle =2\langle X_{i}X_{j}\rangle ^{2}+\langle X_{i}^{2}\rangle \langle X_{j}^{2}\rangle$$

we have

$$\operatorname{Cov}(X_{i}^{2},X_{j}^{2})=2\langle X_{i}X_{j}\rangle ^{2}=2M_{ij}^{2}$$

Since a covariance matrix is positive definite, this proves that the matrix with elements $M_{ij}^{2}$ is a positive definite matrix.
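The identity $\operatorname{Cov}(X_{i}^{2},X_{j}^{2})=2M_{ij}^{2}$ lends itself to a Monte Carlo check. The sketch below samples a centered Gaussian with a small illustrative covariance matrix and compares the empirical covariance of the squared components against $2M_{ij}^{2}$ (the tolerance reflects sampling noise):

```python
import numpy as np

rng = np.random.default_rng(2)

# A small positive definite covariance matrix (illustrative example).
M = np.array([[2.0, 0.6],
              [0.6, 1.0]])

# Sample centered Gaussians with covariance M and square each component.
X = rng.multivariate_normal(np.zeros(2), M, size=2_000_000)
S = X**2

# Empirical Cov(X_i^2, X_j^2) should approach 2 * M_ij^2.
emp = np.cov(S.T)
assert np.allclose(emp, 2 * M**2, atol=0.1)
```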
General case
Let $X$ and $Y$ be $n$-dimensional centered Gaussian random variables with covariances $\langle X_{i}X_{j}\rangle =M_{ij}$, $\langle Y_{i}Y_{j}\rangle =N_{ij}$, independent from each other so that we have $\langle X_{i}Y_{j}\rangle =0$ for any $i,j$. Then the covariance matrix of $X_{i}Y_{i}$ and $X_{j}Y_{j}$ is

$$\operatorname{Cov}(X_{i}Y_{i},X_{j}Y_{j})=\langle X_{i}Y_{i}X_{j}Y_{j}\rangle -\langle X_{i}Y_{i}\rangle \langle X_{j}Y_{j}\rangle$$

Using Wick's theorem to develop

$$\langle X_{i}Y_{i}X_{j}Y_{j}\rangle =\langle X_{i}X_{j}\rangle \langle Y_{i}Y_{j}\rangle +\langle X_{i}Y_{i}\rangle \langle X_{j}Y_{j}\rangle +\langle X_{i}Y_{j}\rangle \langle X_{j}Y_{i}\rangle$$

and also using the independence of $X$ and $Y$, we have

$$\operatorname{Cov}(X_{i}Y_{i},X_{j}Y_{j})=\langle X_{i}X_{j}\rangle \langle Y_{i}Y_{j}\rangle =M_{ij}N_{ij}$$

Since a covariance matrix is positive definite, this proves that the matrix with elements $M_{ij}N_{ij}$ is a positive definite matrix.
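The general-case identity $\operatorname{Cov}(X_{i}Y_{i},X_{j}Y_{j})=M_{ij}N_{ij}$ can be checked the same way, by sampling two independent Gaussian vectors (the covariance matrices below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# Two small positive definite covariance matrices (illustrative).
M = np.array([[1.5, 0.4],
              [0.4, 1.0]])
N = np.array([[1.0, -0.3],
              [-0.3, 2.0]])

# Independent centered Gaussian samples with covariances M and N.
X = rng.multivariate_normal(np.zeros(2), M, size=2_000_000)
Y = rng.multivariate_normal(np.zeros(2), N, size=2_000_000)
Z = X * Y  # components Z_i = X_i Y_i

# Empirical Cov(X_i Y_i, X_j Y_j) should approach M_ij * N_ij.
emp = np.cov(Z.T)
assert np.allclose(emp, M * N, atol=0.05)
```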
Proof using eigendecomposition
Proof of positive semidefiniteness
Let $M=\sum _{i}\mu _{i}m_{i}m_{i}^{T}$ and $N=\sum _{i}\nu _{i}n_{i}n_{i}^{T}$. Then

$$M\circ N=\sum _{ij}\mu _{i}\nu _{j}(m_{i}m_{i}^{T})\circ (n_{j}n_{j}^{T})=\sum _{ij}\mu _{i}\nu _{j}(m_{i}\circ n_{j})(m_{i}\circ n_{j})^{T}$$

Each $(m_{i}\circ n_{j})(m_{i}\circ n_{j})^{T}$ is positive semidefinite (but, except in the one-dimensional case, not positive definite, since these are rank-1 matrices). Also, $\mu _{i}\nu _{j}>0$, thus the sum $M\circ N$ is also positive semidefinite.
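The rank-one expansion of $M\circ N$ above can be confirmed numerically by rebuilding the Hadamard product from the two eigendecompositions. A NumPy sketch with random symmetric positive definite test matrices:

```python
import numpy as np

rng = np.random.default_rng(4)

# Two random symmetric positive definite matrices (illustrative).
n = 4
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
M = A @ A.T + n * np.eye(n)
N = B @ B.T + n * np.eye(n)

# Eigendecompositions M = Σ μ_i m_i m_i^T and N = Σ ν_j n_j n_j^T.
mu, m = np.linalg.eigh(M)
nu, nv = np.linalg.eigh(N)

# Rebuild M ∘ N as Σ_{ij} μ_i ν_j (m_i ∘ n_j)(m_i ∘ n_j)^T.
H = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        v = m[:, i] * nv[:, j]  # entrywise product m_i ∘ n_j
        H += mu[i] * nu[j] * np.outer(v, v)

assert np.allclose(H, M * N)
```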
Proof of definiteness
To show that the result is positive definite requires further proof. We shall show that for any vector $a\neq 0$, we have $a^{T}(M\circ N)a>0$. Continuing as above, each $a^{T}(m_{i}\circ n_{j})(m_{i}\circ n_{j})^{T}a\geq 0$, so it remains to show that there exist $i$ and $j$ for which the inequality is strict. For this we observe that

$$a^{T}(m_{i}\circ n_{j})(m_{i}\circ n_{j})^{T}a=\left(\sum _{k}m_{i,k}n_{j,k}a_{k}\right)^{2}$$

Since $N$ is positive definite, its eigenvectors $n_{j}$ span the whole space, so there is a $j$ for which the vector with components $n_{j,k}a_{k}$ is not identically zero. Then, since $M$ is positive definite, its eigenvectors $m_{i}$ also span the whole space, so there is an $i$ for which $\sum _{k}m_{i,k}n_{j,k}a_{k}\neq 0$. For this $i$ and $j$ we have $\left(\sum _{k}m_{i,k}n_{j,k}a_{k}\right)^{2}>0$. This completes the proof.
References
^ Schur, J. (1911). "Bemerkungen zur Theorie der beschränkten Bilinearformen mit unendlich vielen Veränderlichen". Journal für die reine und angewandte Mathematik (Crelle's Journal). 1911 (140): 1–28. doi:10.1515/crll.1911.140.1.
^ Zhang, Fuzhen, ed. (2005). The Schur Complement and Its Applications. Numerical Methods and Algorithms. 4. doi:10.1007/b105056. ISBN 0-387-24271-6. Page 9, Ch. 0.6: publication under J. Schur.
^ Ledermann, W. (1983). "Issai Schur and His School in Berlin". Bulletin of the London Mathematical Society. 15 (2): 97–106. doi:10.1112/blms/15.2.97.