In mathematics, the log sum inequality is an inequality which is useful for proving several theorems in information theory.
Statement
Let $a_{1},\ldots ,a_{n}$ and $b_{1},\ldots ,b_{n}$ be nonnegative numbers. Denote the sum of all $a_{i}$ by $a$ and the sum of all $b_{i}$ by $b$. The log sum inequality states that

$$\sum _{i=1}^{n}a_{i}\log {\frac {a_{i}}{b_{i}}}\geq a\log {\frac {a}{b}},$$

with equality if and only if the ratios ${\frac {a_{i}}{b_{i}}}$ are equal for all $i$.
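The statement can be checked numerically. The sketch below (in Python; the helper names `log_sum_lhs` and `log_sum_rhs` are ours, not standard) compares both sides on random positive inputs and on an equality case where every ratio $a_i/b_i$ is the same:

```python
import math
import random

def log_sum_lhs(a_list, b_list):
    """Left-hand side: sum_i a_i * log(a_i / b_i)."""
    return sum(ai * math.log(ai / bi) for ai, bi in zip(a_list, b_list))

def log_sum_rhs(a_list, b_list):
    """Right-hand side: a * log(a / b), with a = sum(a_i), b = sum(b_i)."""
    a, b = sum(a_list), sum(b_list)
    return a * math.log(a / b)

random.seed(0)
a_list = [random.uniform(0.1, 5.0) for _ in range(10)]
b_list = [random.uniform(0.1, 5.0) for _ in range(10)]
assert log_sum_lhs(a_list, b_list) >= log_sum_rhs(a_list, b_list)

# Equality case: all ratios a_i / b_i equal (here a_i = 2 * b_i).
b_eq = [1.0, 2.0, 3.0]
a_eq = [2.0 * bi for bi in b_eq]
assert math.isclose(log_sum_lhs(a_eq, b_eq), log_sum_rhs(a_eq, b_eq))
```

The check uses strictly positive inputs; the statement for nonnegative numbers relies on the conventions $0\log 0 = 0$ and $0\log \frac{0}{0} = 0$, which this simple sketch does not handle.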
Proof
Notice that after setting $f(x)=x\log x$ we have

$$\begin{aligned}\sum _{i=1}^{n}a_{i}\log {\frac {a_{i}}{b_{i}}}&=\sum _{i=1}^{n}b_{i}f\left({\frac {a_{i}}{b_{i}}}\right)=b\sum _{i=1}^{n}{\frac {b_{i}}{b}}f\left({\frac {a_{i}}{b_{i}}}\right)\\&\geq bf\left(\sum _{i=1}^{n}{\frac {b_{i}}{b}}{\frac {a_{i}}{b_{i}}}\right)=bf\left({\frac {1}{b}}\sum _{i=1}^{n}a_{i}\right)=bf\left({\frac {a}{b}}\right)\\&=a\log {\frac {a}{b}},\end{aligned}$$

where the inequality follows from Jensen's inequality since ${\frac {b_{i}}{b}}\geq 0$, $\sum _{i}{\frac {b_{i}}{b}}=1$, and $f$ is convex.
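The key step in the chain is the Jensen bound $\sum_i \frac{b_i}{b} f\!\left(\frac{a_i}{b_i}\right) \geq f\!\left(\sum_i \frac{b_i}{b}\frac{a_i}{b_i}\right)$, applied with the weights $\frac{b_i}{b}$. A small numerical sketch of that step (variable names are ours):

```python
import math

def f(x):
    # The convex function used in the proof.
    return x * math.log(x)

a = [1.0, 2.0, 3.0]
b = [2.0, 1.0, 4.0]
B = sum(b)  # b in the proof's notation

# lhs = sum_i b_i f(a_i/b_i) = sum_i a_i log(a_i/b_i)
lhs = sum(bi * f(ai / bi) for ai, bi in zip(a, b))

# Jensen: the weights b_i/B are nonnegative and sum to 1, so averaging
# inside f can only decrease the value.  The weighted average of a_i/b_i
# telescopes to a/B, so b*f(of the average) equals a*log(a/b).
jensen = B * f(sum((bi / B) * (ai / bi) for ai, bi in zip(a, b)))
rhs = sum(a) * math.log(sum(a) / B)

assert lhs >= jensen
assert math.isclose(jensen, rhs)
```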
Applications
The log sum inequality can be used to prove several inequalities in information theory such as Gibbs' inequality or the convexity of Kullback–Leibler divergence.
For example, to prove Gibbs' inequality it is enough to substitute $p_{i}$ for $a_{i}$ and $q_{i}$ for $b_{i}$ to get

$$D_{\mathrm {KL} }(P\|Q)\equiv \sum _{i=1}^{n}p_{i}\log _{2}{\frac {p_{i}}{q_{i}}}\geq 1\log {\frac {1}{1}}=0.$$
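Since $\sum_i p_i = \sum_i q_i = 1$ for probability distributions, the right-hand side collapses to $1 \log 1 = 0$, which is exactly Gibbs' inequality. A minimal sketch of the resulting nonnegativity (the function name `kl_divergence` is ours, and the inputs are assumed strictly positive):

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) = sum_i p_i * log2(p_i / q_i), for strictly positive distributions."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q))

p = [0.5, 0.25, 0.25]
q = [0.25, 0.25, 0.5]
assert kl_divergence(p, q) >= 0                 # Gibbs' inequality
assert math.isclose(kl_divergence(p, p), 0.0)   # equality holds iff P = Q
```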
Generalizations
The inequality remains valid for $n=\infty$ provided that $a<\infty$ and $b<\infty$.
The proof above holds for any function $g$ such that $f(x)=xg(x)$ is convex, such as all continuous non-decreasing functions. Generalizations to convex functions other than the logarithm are given in Csiszár & Shields (2004).
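Replaying the proof with such a $g$ gives $\sum_i a_i\, g\!\left(\frac{a_i}{b_i}\right) \geq a\, g\!\left(\frac{a}{b}\right)$. The sketch below (helper name ours) checks this for $g(x)=\log x$, the ordinary log sum inequality, and for $g(x)=x$, where $f(x)=x^2$ is convex and the inequality becomes $\sum_i a_i^2/b_i \geq a^2/b$:

```python
import math

def generalized_log_sum(a, b, g):
    """Return (lhs, rhs) of sum_i a_i g(a_i/b_i) >= a g(a/b),
    valid whenever f(x) = x*g(x) is convex."""
    A, B = sum(a), sum(b)
    lhs = sum(ai * g(ai / bi) for ai, bi in zip(a, b))
    rhs = A * g(A / B)
    return lhs, rhs

a = [1.0, 4.0, 2.0]
b = [3.0, 1.0, 2.0]

# g(x) = log(x): the ordinary log sum inequality.
lhs, rhs = generalized_log_sum(a, b, math.log)
assert lhs >= rhs

# g(x) = x, so f(x) = x^2 is convex: sum a_i^2 / b_i >= a^2 / b.
lhs, rhs = generalized_log_sum(a, b, lambda x: x)
assert lhs >= rhs
```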
References
T.S. Han, K. Kobayashi, Mathematics of information and coding. American Mathematical Society, 2001. ISBN 0-8218-0534-7.
Information Theory course materials, Utah State University. Retrieved on 2009-06-14.
Csiszár, I.; Shields, P. (2004). "Information Theory and Statistics: A Tutorial" (PDF). Foundations and Trends in Communications and Information Theory. 1 (4): 417–528. doi:10.1561/0100000004. Retrieved 2009-06-14.