# Transfer entropy

Transfer entropy is a non-parametric statistic measuring the amount of directed (time-asymmetric) transfer of information between two random processes.[1][2][3] Transfer entropy from a process X to another process Y is the reduction in uncertainty about future values of Y obtained from knowing the past values of X, given the past values of Y. More specifically, if ${\displaystyle X_{t}}$ and ${\displaystyle Y_{t}}$ for ${\displaystyle t\in \mathbb {N} }$ denote two random processes and the amount of information is measured using Shannon's entropy, the transfer entropy can be written as:

${\displaystyle T_{X\rightarrow Y}=H\left(Y_{t}\mid Y_{t-1:t-L}\right)-H\left(Y_{t}\mid Y_{t-1:t-L},X_{t-1:t-L}\right),}$

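where H(X) is the Shannon entropy of X, and ${\displaystyle Y_{t-1:t-L}}$ denotes the L most recent past values ${\displaystyle Y_{t-1},\dots ,Y_{t-L}}$ (likewise for X). The above definition of transfer entropy has been extended to other types of entropy measures, such as Rényi entropy.[3][4]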
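For discrete-valued series, the definition above can be evaluated directly from empirical frequencies. The following is a minimal sketch of such a plug-in estimator, not a reference implementation: the function name `transfer_entropy` and its `history` parameter (playing the role of L) are illustrative choices, and the toy data, in which Y copies X with a one-step delay, is an assumption made only for demonstration.

```python
from collections import Counter
from math import log2
import random


def transfer_entropy(source, target, history=1):
    """Plug-in estimate (in bits) of the transfer entropy source -> target.

    Both inputs are equal-length sequences of hashable symbols;
    `history` is the number of past values used for each process.
    """
    if len(source) != len(target):
        raise ValueError("source and target must have the same length")

    # Count joint occurrences of (y_t, past of Y, past of X).
    # The pasts are stored oldest-first; the ordering does not affect the result.
    joint = Counter()
    for t in range(history, len(target)):
        y_past = tuple(target[t - history:t])
        x_past = tuple(source[t - history:t])
        joint[(target[t], y_past, x_past)] += 1
    n = sum(joint.values())

    # Marginal counts needed for the two conditional distributions.
    c_y_ypast = Counter()      # counts of (y_t, y_past)
    c_ypast_xpast = Counter()  # counts of (y_past, x_past)
    c_ypast = Counter()        # counts of y_past
    for (y_t, y_past, x_past), c in joint.items():
        c_y_ypast[(y_t, y_past)] += c
        c_ypast_xpast[(y_past, x_past)] += c
        c_ypast[y_past] += c

    # Sum p(y_t, y_past, x_past) * log2[ p(y_t | y_past, x_past) / p(y_t | y_past) ].
    te = 0.0
    for (y_t, y_past, x_past), c in joint.items():
        p_full = c / c_ypast_xpast[(y_past, x_past)]
        p_restricted = c_y_ypast[(y_t, y_past)] / c_ypast[y_past]
        te += (c / n) * log2(p_full / p_restricted)
    return te


# Toy data (an assumption for illustration): Y copies X with a one-step delay.
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(5000)]
y = [0] + x[:-1]

te_xy = transfer_entropy(x, y)  # expected to be close to 1 bit
te_yx = transfer_entropy(y, x)  # expected to be close to 0 bits
print(f"T_X->Y = {te_xy:.3f} bits, T_Y->X = {te_yx:.3f} bits")
```

In this toy setting the past of X determines the next value of Y completely, so the estimate from X to Y should approach one bit, while the estimate in the reverse direction should be near zero apart from a small positive finite-sample bias.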

Transfer entropy is conditional mutual information,[5][6] with the history of the influenced variable ${\displaystyle Y_{t-1:t-L}}$ in the condition:

${\displaystyle T_{X\rightarrow Y}=I(Y_{t};X_{t-1:t-L}\mid Y_{t-1:t-L}).}$
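The agreement between the entropy-difference form and the conditional mutual information form follows in one step from the definition of conditional mutual information. As a sketch, write ${\displaystyle Y^{-}=Y_{t-1:t-L}}$ and ${\displaystyle X^{-}=X_{t-1:t-L}}$ (a shorthand introduced here only for readability); then

${\displaystyle I(Y_{t};X^{-}\mid Y^{-})=H(Y_{t}\mid Y^{-})-H(Y_{t}\mid Y^{-},X^{-})=T_{X\rightarrow Y}.}$

For discrete-valued processes, expanding both conditional entropies gives the equivalent sum

${\displaystyle T_{X\rightarrow Y}=\sum _{y_{t},y^{-},x^{-}}p(y_{t},y^{-},x^{-})\log {\frac {p(y_{t}\mid y^{-},x^{-})}{p(y_{t}\mid y^{-})}},}$

which is non-negative and equals zero exactly when ${\displaystyle Y_{t}}$ is conditionally independent of ${\displaystyle X^{-}}$ given ${\displaystyle Y^{-}}$.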

Transfer entropy reduces to Granger causality for vector auto-regressive processes.[7] Hence, it is advantageous when the model assumption of Granger causality does not hold, for example, in the analysis of non-linear signals.[8][9] However, it usually requires more samples for accurate estimation.[10] The probabilities in the entropy formula can be estimated using different approaches (binning, nearest neighbors) or, in order to reduce complexity, using a non-uniform embedding.[11] While it was originally defined for bivariate analysis, transfer entropy has been extended to multivariate forms, either conditioning on other potential source variables[12] or considering transfer from a collection of sources,[13] although these forms again require more samples.
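The relationship to Granger causality can be illustrated numerically. For jointly Gaussian variables, as in the result cited in [7], the Granger statistic ${\displaystyle F=\ln(\sigma _{\text{restricted}}^{2}/\sigma _{\text{full}}^{2})}$ equals twice the transfer entropy in nats. The sketch below assumes a linear-Gaussian model and estimates both quantities from least-squares residual variances; the function names and the simulated coefficients (0.5 and 0.8) are hypothetical choices for illustration only.

```python
import numpy as np


def residual_variance(target, predictors):
    """Variance of the least-squares residual of `target` on `predictors` plus an intercept."""
    design = np.column_stack([np.ones(len(target))] + predictors)
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return np.var(target - design @ coef)


def gaussian_transfer_entropy(source, target, lag=1):
    """Return (transfer entropy in nats, Granger statistic) under a linear-Gaussian assumption."""
    n = len(target)
    y_now = target[lag:]
    # Lagged copies of each series, aligned with y_now.
    y_past = [target[lag - k:n - k] for k in range(1, lag + 1)]
    x_past = [source[lag - k:n - k] for k in range(1, lag + 1)]
    var_restricted = residual_variance(y_now, y_past)      # predict Y from its own past
    var_full = residual_variance(y_now, y_past + x_past)   # also use the past of X
    granger = np.log(var_restricted / var_full)
    return 0.5 * granger, granger


# Simulated first-order vector auto-regressive process in which X drives Y but not vice versa.
rng = np.random.default_rng(0)
n = 20000
x = np.zeros(n)
y = np.zeros(n)
noise_x = rng.standard_normal(n)
noise_y = rng.standard_normal(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + noise_x[t]
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + noise_y[t]

te_xy, f_xy = gaussian_transfer_entropy(x, y)
te_yx, f_yx = gaussian_transfer_entropy(y, x)
print(f"X->Y: TE = {te_xy:.4f} nats, Granger F = {f_xy:.4f}")
print(f"Y->X: TE = {te_yx:.4f} nats, Granger F = {f_yx:.4f}")
```

This parametric shortcut is only appropriate when the linear-Gaussian assumption is reasonable; for non-linear signals a model-free estimator (binning, nearest neighbors) is needed, at the cost of more samples.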

Transfer entropy has been used for estimation of functional connectivity of neurons,[13][14][15] social influence in social networks[8] and statistical causality between armed conflict events.[16] Transfer entropy is a finite version of the directed information which was defined in 1990 by James Massey[17] as ${\displaystyle I(X^{n}\to Y^{n})=\sum _{i=1}^{n}I(X^{i};Y_{i}|Y^{i-1})}$, where ${\displaystyle X^{n}}$ denotes the vector ${\displaystyle X_{1},X_{2},...,X_{n}}$ and ${\displaystyle Y^{n}}$ denotes ${\displaystyle Y_{1},Y_{2},...,Y_{n}}$. The directed information plays an important role in characterizing the fundamental limits (channel capacity) of communication channels with or without feedback[18][19] and gambling with causal side information.[20]
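As a small worked case of the directed information sum, taking ${\displaystyle n=2}$ gives

${\displaystyle I(X^{2}\to Y^{2})=I(X_{1};Y_{1})+I(X_{1},X_{2};Y_{2}|Y_{1}).}$

For comparison, the chain rule for ordinary mutual information reads

${\displaystyle I(X^{n};Y^{n})=\sum _{i=1}^{n}I(X^{n};Y_{i}|Y^{i-1}),}$

so the directed information differs only in replacing ${\displaystyle X^{n}}$ by ${\displaystyle X^{i}}$ in each term: the i-th term uses the source values up to time i rather than the whole source sequence, which is what makes the measure directional.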

## References

1. ^ Schreiber, Thomas (1 July 2000). "Measuring information transfer". Physical Review Letters. 85 (2): 461–464. arXiv:nlin/0001042. Bibcode:2000PhRvL..85..461S. doi:10.1103/PhysRevLett.85.461. PMID 10991308. S2CID 7411376.
2. ^ Seth, Anil (2007). "Granger causality". Scholarpedia. 2 (7): 1667. Bibcode:2007SchpJ...2.1667S. doi:10.4249/scholarpedia.1667.
3. ^ a b Hlaváčková-Schindler, Katerina; Palus, M; Vejmelka, M; Bhattacharya, J (1 March 2007). "Causality detection based on information-theoretic approaches in time series analysis". Physics Reports. 441 (1): 1–46. Bibcode:2007PhR...441....1H. CiteSeerX 10.1.1.183.1617. doi:10.1016/j.physrep.2006.12.004.
4. ^ Jizba, Petr; Kleinert, Hagen; Shefaat, Mohammad (2012-05-15). "Rényi's information transfer between financial time series". Physica A: Statistical Mechanics and Its Applications. 391 (10): 2971–2989. arXiv:1106.5913. Bibcode:2012PhyA..391.2971J. doi:10.1016/j.physa.2011.12.064. ISSN 0378-4371. S2CID 51789622.
5. ^ Wyner, A. D. (1978). "A definition of conditional mutual information for arbitrary ensembles". Information and Control. 38 (1): 51–59. doi:10.1016/s0019-9958(78)90026-8.
6. ^ Dobrushin, R. L. (1959). "General formulation of Shannon's main theorem in information theory". Uspekhi Mat. Nauk. 14: 3–104.
7. ^ Barnett, Lionel (1 December 2009). "Granger Causality and Transfer Entropy Are Equivalent for Gaussian Variables". Physical Review Letters. 103 (23): 238701. arXiv:0910.4514. Bibcode:2009PhRvL.103w8701B. doi:10.1103/PhysRevLett.103.238701. PMID 20366183. S2CID 1266025.
8. ^ a b Ver Steeg, Greg; Galstyan, Aram (2012). "Information transfer in social media". Proceedings of the 21st international conference on World Wide Web (WWW '12). ACM. pp. 509–518. arXiv:1110.2724. Bibcode:2011arXiv1110.2724V.
9. ^ Lungarella, M.; Ishiguro, K.; Kuniyoshi, Y.; Otsu, N. (1 March 2007). "Methods for quantifying the causal structure of bivariate time series". International Journal of Bifurcation and Chaos. 17 (3): 903–921. Bibcode:2007IJBC...17..903L. CiteSeerX 10.1.1.67.3585. doi:10.1142/S0218127407017628.
10. ^ Pereda, E; Quiroga, RQ; Bhattacharya, J (Sep–Oct 2005). "Nonlinear multivariate analysis of neurophysiological signals". Progress in Neurobiology. 77 (1–2): 1–37. arXiv:nlin/0510077. Bibcode:2005nlin.....10077P. doi:10.1016/j.pneurobio.2005.10.003. PMID 16289760. S2CID 9529656.
11. ^ Montalto, A; Faes, L; Marinazzo, D (Oct 2014). "MuTE: A MATLAB Toolbox to Compare Established and Novel Estimators of the Multivariate Transfer Entropy". PLOS ONE. 9 (10): e109462. Bibcode:2014PLoSO...9j9462M. doi:10.1371/journal.pone.0109462. PMC 4196918. PMID 25314003.
12. ^ Lizier, Joseph; Prokopenko, Mikhail; Zomaya, Albert (2008). "Local information transfer as a spatiotemporal filter for complex systems". Physical Review E. 77 (2): 026110. arXiv:0809.3275. Bibcode:2008PhRvE..77b6110L. doi:10.1103/PhysRevE.77.026110. PMID 18352093. S2CID 15634881.
13. ^ a b Lizier, Joseph; Heinzle, Jakob; Horstmann, Annette; Haynes, John-Dylan; Prokopenko, Mikhail (2011). "Multivariate information-theoretic measures reveal directed information structure and task relevant changes in fMRI connectivity". Journal of Computational Neuroscience. 30 (1): 85–107. doi:10.1007/s10827-010-0271-2. PMID 20799057. S2CID 3012713.
14. ^ Vicente, Raul; Wibral, Michael; Lindner, Michael; Pipa, Gordon (February 2011). "Transfer entropy—a model-free measure of effective connectivity for the neurosciences". Journal of Computational Neuroscience. 30 (1): 45–67. doi:10.1007/s10827-010-0262-3. PMC 3040354. PMID 20706781.
15. ^ Shimono, Masanori; Beggs, John (October 2014). "Functional clusters, hubs, and communities in the cortical microconnectome". Cerebral Cortex. 25 (10): 3743–57. doi:10.1093/cercor/bhu252. PMC 4585513. PMID 25336598.
16. ^ Kushwaha, Niraj; Lee, Edward D (July 2023). "Discovering the mesoscale for chains of conflict". PNAS Nexus. 2 (7): pgad228. doi:10.1093/pnasnexus/pgad228. ISSN 2752-6542. PMC 10392960. PMID 37533894.
17. ^ Massey, James (1990). "Causality, Feedback And Directed Information" (ISITA). CiteSeerX 10.1.1.36.5688.
18. ^ Permuter, Haim Henry; Weissman, Tsachy; Goldsmith, Andrea J. (February 2009). "Finite State Channels With Time-Invariant Deterministic Feedback". IEEE Transactions on Information Theory. 55 (2): 644–662. arXiv:cs/0608070. doi:10.1109/TIT.2008.2009849. S2CID 13178.
19. ^ Kramer, G. (January 2003). "Capacity results for the discrete memoryless network". IEEE Transactions on Information Theory. 49 (1): 4–21. doi:10.1109/TIT.2002.806135.
20. ^ Permuter, Haim H.; Kim, Young-Han; Weissman, Tsachy (June 2011). "Interpretations of Directed Information in Portfolio Theory, Data Compression, and Hypothesis Testing". IEEE Transactions on Information Theory. 57 (6): 3248–3259. arXiv:0912.4872. doi:10.1109/TIT.2011.2136270. S2CID 11722596.