# Data processing inequality

The data processing inequality is an information-theoretic result stating that the information content of a signal cannot be increased by a local physical operation. It is often summarized as 'post-processing cannot increase information'.

## Definition

Let three random variables form the Markov chain $X\rightarrow Y\rightarrow Z$, meaning that the conditional distribution of $Z$ depends only on $Y$, so that $Z$ is conditionally independent of $X$ given $Y$. Specifically, we have such a Markov chain if the joint probability mass function can be written as

$p(x,y,z)=p(x)p(y|x)p(z|y)$

In this setting, no processing of $Y$, deterministic or random, can increase the information that $Y$ contains about $X$. Using the mutual information, this can be written as:

$I(X;Y)\geqslant I(X;Z)$

Equality, $I(X;Y)=I(X;Z)$, holds if and only if $I(X;Y|Z)=0$, that is, $Z$ and $Y$ contain the same information about $X$, in which case $X\rightarrow Z\rightarrow Y$ also forms a Markov chain.
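
The inequality can be checked numerically on small discrete examples. The following is a minimal sketch, not a proof: it draws a random Markov chain with the factorization $p(x,y,z)=p(x)p(y|x)p(z|y)$ and compares $I(X;Y)$ with $I(X;Z)$ in bits. The function names (`random_distribution`, `mutual_information`, `markov_chain_informations`), the alphabet sizes, and the seeds are illustrative assumptions and do not come from any particular library.

```python
import math
import random


def random_distribution(n, rng):
    """Return a random probability vector of length n."""
    weights = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


def mutual_information(joint):
    """Mutual information (in bits) of a joint pmf given as joint[a][b]."""
    p_a = [sum(row) for row in joint]        # marginal of the first variable
    p_b = [sum(col) for col in zip(*joint)]  # marginal of the second variable
    mi = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0:  # terms with p = 0 contribute nothing
                mi += p * math.log2(p / (p_a[a] * p_b[b]))
    return mi


def markov_chain_informations(nx=3, ny=4, nz=3, seed=0):
    """Build a random chain X -> Y -> Z and return (I(X;Y), I(X;Z))."""
    rng = random.Random(seed)
    p_x = random_distribution(nx, rng)
    p_y_given_x = [random_distribution(ny, rng) for _ in range(nx)]
    p_z_given_y = [random_distribution(nz, rng) for _ in range(ny)]

    # p(x, y) = p(x) p(y|x)
    p_xy = [[p_x[x] * p_y_given_x[x][y] for y in range(ny)]
            for x in range(nx)]
    # p(x, z) = sum_y p(x) p(y|x) p(z|y)
    p_xz = [[sum(p_xy[x][y] * p_z_given_y[y][z] for y in range(ny))
             for z in range(nz)]
            for x in range(nx)]
    return mutual_information(p_xy), mutual_information(p_xz)


if __name__ == "__main__":
    for seed in range(5):
        i_xy, i_xz = markov_chain_informations(seed=seed)
        print(f"seed={seed}: I(X;Y)={i_xy:.4f} bits, I(X;Z)={i_xz:.4f} bits")
```

For every seed the first value should be at least as large as the second, up to floating-point rounding. Choosing each row of `p_z_given_y` to be a distinct one-hot vector (an invertible relabeling of $Y$) would be one way to observe the equality case.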