# Dual total correlation

In information theory, dual total correlation (Han 1978), also called excess entropy (Olbrich et al. 2008), is one of the two known non-negative generalizations of mutual information. While total correlation is bounded above by the sum of the entropies of the n elements, dual total correlation is bounded above by the joint entropy of the n elements. Although well behaved, dual total correlation has received much less attention than total correlation. A measure known as "TSE-complexity" defines a continuum between total correlation and dual total correlation (Ay et al. 2001).

## Definition

For a set of n random variables $\{X_1,\ldots,X_n\}$, the dual total correlation $D(X_1,\ldots,X_n)$ is given by

$D(X_1,\ldots,X_n) = H\left( X_1, \ldots, X_n \right) - \sum_{i=1}^n H\left( X_i | X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_n \right) ,$

where $H(X_{1},\ldots,X_{n})$ is the joint entropy of the variable set $\{X_{1},\ldots,X_{n}\}$ and $H(X_{i} \mid \cdots)$ is the conditional entropy of the variable $X_{i}$ given the remaining $n-1$ variables.
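As a concrete illustration (a sketch, not part of the original sources), the definition can be computed directly for a discrete joint distribution, using the identity $H(X_i \mid \text{rest}) = H(X_1,\ldots,X_n) - H(\text{rest})$:

```python
from math import log2

def entropy(p):
    """Shannon entropy (in bits) of a distribution given as a dict outcome -> probability."""
    return -sum(q * log2(q) for q in p.values() if q > 0)

def marginal(joint, keep):
    """Marginalize a joint distribution (dict: tuple -> prob) onto the index set `keep`."""
    out = {}
    for outcome, prob in joint.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + prob
    return out

def dual_total_correlation(joint, n):
    """D = H(X_1..X_n) - sum_i H(X_i | rest), using H(X_i|rest) = H(joint) - H(rest)."""
    h_joint = entropy(joint)
    d = h_joint
    for i in range(n):
        rest = [j for j in range(n) if j != i]
        d -= h_joint - entropy(marginal(joint, rest))
    return d

# Toy example: X1, X2 fair coins, X3 = X1 XOR X2 (three binary variables).
joint = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}
print(dual_total_correlation(joint, 3))  # 2.0 bits: any two variables determine the third
```

In this example every conditional entropy $H(X_i \mid \text{rest})$ vanishes, so the dual total correlation attains its maximum, the joint entropy of 2 bits.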

## Normalized

The dual total correlation normalized to the interval $[0,1]$ is simply the dual total correlation divided by its maximum possible value $H(X_{1}, \ldots, X_{n})$,

$ND(X_1,\ldots,X_n) = \frac{D(X_1,\ldots,X_n)}{H(X_1,\ldots,X_n)} .$
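As a worked example (illustrative, not from the original text): for three binary variables with $X_1, X_2$ independent fair coins and $X_3 = X_1 \oplus X_2$, each variable is fully determined by the other two, so every conditional entropy vanishes and $D = H(X_1,X_2,X_3) = 2$ bits. The normalized measure is then

$ND(X_1,X_2,X_3) = \frac{2}{2} = 1 ,$

the maximum possible value, reflecting that the shared structure saturates the joint entropy.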

## Bounds

Dual total correlation is non-negative and bounded above by the joint entropy $H(X_1, \ldots, X_n)$.

$0 \leq D(X_1, \ldots, X_n) \leq H(X_1, \ldots, X_n) .$

Second, dual total correlation is closely related to the total correlation, $C(X_1, \ldots, X_n)$. In particular,

$\frac{C(X_1, \ldots, X_n)}{n-1} \leq D(X_1, \ldots, X_n) \leq (n-1) \; C(X_1, \ldots, X_n) .$
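Both bounds can be checked on a small example (illustrative, not from the original text): for three binary variables with $X_1, X_2$ independent fair coins and $X_3 = X_1 \oplus X_2$, one has $H(X_1,X_2,X_3) = 2$, $C = \sum_i H(X_i) - H(X_1,X_2,X_3) = 3 - 2 = 1$, and $D = 2$. The inequalities then read

$\frac{C}{n-1} = \frac{1}{2} \;\leq\; D = 2 \;\leq\; (n-1)\,C = 2 ,$

so the upper bound is tight for this distribution.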

## History

Han (1978) originally defined the dual total correlation as,

\begin{align} & {} \qquad D(X_1,\ldots,X_n) \\[10pt] & \equiv \left[ \sum_{i=1}^n H(X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_n ) \right] - (n-1) \; H(X_1, \ldots, X_n) \; . \end{align}

However, Abdallah and Plumbley (2010) showed that this is equivalent to the easier-to-understand form of the joint entropy minus the sum of conditional entropies, via the following:

\begin{align} & {} \qquad D(X_1,\ldots,X_n) \\[10pt] & \equiv \left[ \sum_{i=1}^n H(X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_n ) \right] - (n-1) \; H(X_1, \ldots, X_n) \\ & = \left[ \sum_{i=1}^n H(X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_n ) \right] + (1-n) \; H(X_1, \ldots, X_n) \\ & = H(X_1, \ldots, X_n) + \sum_{i=1}^n \left[ H(X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_n ) - H(X_1, \ldots, X_n) \right] \\ & = H\left( X_1, \ldots, X_n \right) - \sum_{i=1}^n H\left( X_i | X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_n \right)\; . \end{align}
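The equivalence can also be verified numerically. The sketch below (function names are illustrative, not from the sources) evaluates both forms on a random joint distribution and checks that they agree:

```python
import random
from math import log2

def entropy(p):
    """Shannon entropy (in bits) of a distribution given as a dict outcome -> probability."""
    return -sum(q * log2(q) for q in p.values() if q > 0)

def marginal(joint, keep):
    """Marginalize a joint distribution (dict: tuple -> prob) onto the index set `keep`."""
    out = {}
    for outcome, prob in joint.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + prob
    return out

def han_form(joint, n):
    """Han's 1978 definition: sum_i H(rest_i) - (n-1) * H(joint)."""
    h = entropy(joint)
    return sum(entropy(marginal(joint, [j for j in range(n) if j != i]))
               for i in range(n)) - (n - 1) * h

def conditional_form(joint, n):
    """Abdallah-Plumbley form: H(joint) - sum_i H(X_i | rest_i)."""
    h = entropy(joint)
    return h - sum(h - entropy(marginal(joint, [j for j in range(n) if j != i]))
                   for i in range(n))

# Random joint distribution over three ternary variables.
random.seed(0)
outcomes = [(a, b, c) for a in range(3) for b in range(3) for c in range(3)]
weights = [random.random() for _ in outcomes]
total = sum(weights)
joint = {o: w / total for o, w in zip(outcomes, weights)}

assert abs(han_form(joint, 3) - conditional_form(joint, 3)) < 1e-9
```

The check mirrors the algebra above: expanding $H - \sum_i (H - H(\text{rest}_i))$ gives $\sum_i H(\text{rest}_i) - (n-1)H$, which is exactly Han's expression.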

## References

• Han, T. S. (1978). Nonnegative entropy measures of multivariate symmetric correlations. Information and Control, 36, 133–156.
• Fujishige, S. (1978). Polymatroidal dependence structure of a set of random variables. Information and Control, 39, 55–72. doi:10.1016/S0019-9958(78)91063-X.
• Olbrich, E., Bertschinger, N., Ay, N., and Jost, J. (2008). How should complexity scale with system size? The European Physical Journal B – Condensed Matter and Complex Systems. doi:10.1140/epjb/e2008-00134-9.
• Abdallah, S. A. and Plumbley, M. D. (2010). A measure of statistical complexity based on predictive information. ArXiv e-prints. arXiv:1012.1890v1.
• Ay, N., Olbrich, E., and Bertschinger, N. (2001). A unifying framework for complexity measures of finite systems. European Conference on Complex Systems.