Node influence metric

From Wikipedia, the free encyclopedia

In graph theory and network analysis, node influence metrics are measures that rank or quantify the influence of every node (also called vertex) within a graph. They are related to centrality indices. Applications include measuring the influence of each person in a social network, understanding the role of infrastructure nodes in transportation networks, the Internet, or urban networks, and the participation of a given node in disease dynamics.

Origin and development

The traditional approach to understanding node importance is via centrality indicators. Centrality indices are designed to produce a ranking which accurately identifies the most influential nodes. Since the mid-2000s, however, social scientists and network physicists have begun to question the suitability of centrality indices for understanding node influence. Centralities may indicate the most influential nodes, but they are considerably less informative for the vast majority of nodes, which are not highly influential.

Borgatti and Everett's 2006 review article[1] showed that the accuracy of centrality indices is highly dependent on network topology. This finding has been repeatedly observed since then (e.g.[2][3]). In 2012, Bauer and colleagues pointed out that centrality indices only rank nodes but do not quantify the difference between them.[4] In 2013, Sikic and colleagues presented strong evidence that centrality indices considerably underestimate the power of non-hub nodes.[5] The reason follows from the first finding: the accuracy of a centrality measure depends on network topology, but complex networks have heterogeneous topology. Hence a centrality measure which is appropriate for identifying highly influential nodes will most likely be inappropriate for the remainder of the network.[3]

This has inspired the development of novel methods designed to measure the influence of all network nodes. The most general of these are the accessibility, which uses the diversity of random walks to measure how accessible the rest of the network is from a given start node,[6] and the expected force, derived from the expected value of the force of infection generated by a node.[3] Both of these measures can be meaningfully computed from the structure of the network alone.

Accessibility

Accessibility is derived from the theory of random walks. It measures the diversity of self-avoiding walks which start from a given node. A walk on a network is a sequence of adjacent vertices; a self-avoiding walk visits each vertex at most once. The original work used simulated walks of length 60 to characterize the network of urban streets in a Brazilian city.[6] It was later formalized as a modified form of hierarchical degree which controls for both transmission probabilities and the diversity of walks of a given fixed length.[7]
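To make the notion of a self-avoiding walk concrete, the following minimal Python sketch enumerates every self-avoiding walk of a given length from a start node. The function name `self_avoiding_walks`, the adjacency-dictionary representation, and the example graph are illustrative assumptions, not part of the cited work, which used simulated rather than exhaustively enumerated walks.

```python
def self_avoiding_walks(adj, start, length):
    """Yield every self-avoiding walk with `length` edges starting at `start`.

    `adj` maps each node to the set of its neighbours (hypothetical
    representation of an undirected, unweighted graph).
    """
    def extend(walk):
        if len(walk) == length + 1:
            yield tuple(walk)
            return
        for neighbour in adj[walk[-1]]:
            # Self-avoiding: never step onto a vertex already in the walk.
            if neighbour not in walk:
                yield from extend(walk + [neighbour])

    yield from extend([start])


# Example: the path graph 0 - 1 - 2 - 3.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
walks_from_1 = sorted(self_avoiding_walks(path, 1, 2))
```

On this path graph, the only self-avoiding walk of length 2 from node 1 is 1, 2, 3, because the walk towards node 0 reaches a dead end after one step.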

Definition

The hierarchical degree measures the number of nodes reachable from a start node by performing walks of length $h$. For a fixed $h$ and walk type, each of these neighbors is reached with a (potentially different) probability $p_j^{(h)}$. Given a vector of such probabilities, the accessibility of node $i$ at scale $h$ is defined as

$$\kappa_i^{(h)} = \exp\left(-\sum_j p_j^{(h)} \log p_j^{(h)}\right)$$

The probabilities can be based on uniform-probability random walks, or additionally modulated by edge weights and/or explicit (per edge) transmission probabilities.[7]
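As a rough illustration, the sketch below computes an accessibility value for uniform-probability random walks, assuming the accessibility is the exponential of the Shannon entropy of the probabilities of reaching each node after a fixed number of steps. The function names, the adjacency-dictionary representation, and the use of plain (not self-avoiding) random walks are assumptions made for this example; it also assumes every node has at least one neighbour.

```python
import math


def walk_probabilities(adj, start, h):
    """Probability of being at each node after h uniform random-walk steps."""
    probs = {start: 1.0}
    for _ in range(h):
        nxt = {}
        for node, p in probs.items():
            share = p / len(adj[node])  # uniform choice among neighbours
            for neighbour in adj[node]:
                nxt[neighbour] = nxt.get(neighbour, 0.0) + share
        probs = nxt
    return probs


def accessibility(adj, start, h):
    """Exponential of the Shannon entropy of the h-step probabilities."""
    probs = walk_probabilities(adj, start, h)
    entropy = -sum(p * math.log(p) for p in probs.values() if p > 0)
    return math.exp(entropy)


# Example: a star with centre 0 and leaves 1, 2, 3.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
centre_value = accessibility(star, 0, 1)  # about 3: three equally likely leaves
leaf_value = accessibility(star, 1, 1)    # about 1: only the centre is reachable
```

Under this reading, the value can be interpreted as an effective number of nodes reached: it equals the number of reachable nodes when all are equally likely, and is smaller when the probabilities are uneven.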

Applications

Accessibility has been shown to reveal community structure in urban networks,[6] to correspond to the number of nodes which can be visited in a defined time period,[7] and to be predictive of the outcome of epidemiological SIR model spreading processes on networks with large diameter and low density.[2]

Expected force

The expected force measures node influence from an epidemiological perspective. It is the expected value of the force of infection generated by the node after two transmissions.

Definition

The expected force of a node $i$ is given by

$$ExF(i) = -\sum_{j \in J} \bar{d}_j \log(\bar{d}_j)$$

where the sum is taken over the set $J$ of all possible transmission clusters resulting from two transmissions starting from $i$, and $\bar{d}_j$ is the normalized cluster degree of cluster $j \in J$.

The definition naturally extends to directed networks by limiting the enumeration by edge direction. Likewise, extension to weighted networks, or networks with heterogeneous transmission probabilities, is a matter of adjusting the normalization of $\bar{d}_j$ to include the probability that that cluster forms. It is also possible to use more than two transmissions to define the set $J$.[3]
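The enumeration can be sketched in Python for an undirected, unweighted network. This is a minimal illustration under stated assumptions, not the reference implementation from [3]: it assumes the expected force is the entropy of the normalized cluster degrees, that each ordered pair of transmissions counts as a separate cluster, that a cluster's degree is the number of edges from its members to nodes outside it, and that normalization divides each degree by the sum over all clusters. The function name and graph representation are hypothetical.

```python
import math


def expected_force(adj, seed):
    """Entropy of normalized cluster degrees after two transmissions.

    `adj` maps each node to the set of its neighbours (assumed
    representation of an undirected, unweighted graph).
    """
    cluster_degrees = []
    for first in adj[seed]:                # first transmission: seed -> first
        infected = {seed, first}
        for source in (seed, first):       # second transmission from either
            for second in adj[source] - infected:
                cluster = infected | {second}
                # Cluster degree: edges from cluster members to outside nodes.
                degree = sum(len(adj[u] - cluster) for u in cluster)
                cluster_degrees.append(degree)

    total = sum(cluster_degrees)
    if total == 0:                         # no cluster can spread any further
        return 0.0
    return -sum((d / total) * math.log(d / total)
                for d in cluster_degrees if d > 0)


# Example: a star with centre 0 and leaves 1, 2, 3.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
centre_force = expected_force(star, 0)  # about 1.79 (log 6)
leaf_force = expected_force(star, 1)    # about 0.69 (log 2)
```

In this example the centre scores higher than a leaf, because more distinct transmission sequences start from it and each leaves the same number of edges open for further spread.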

Applications

The expected force has been shown to strongly correlate with SI, SIS, and SIR epidemic outcomes over a broad range of network topologies, both simulated and empirical.[3][8] It has also been used to measure the pandemic potential of world airports,[9] and has been mentioned in the context of digital payments,[10] ecology,[11] fitness,[12] and project management.[13]

Other approaches

Other authors suggest metrics which explicitly encode the dynamics of a specified process unfolding on the network. The dynamic influence is the proportion of infinite walks starting from each node, where walk steps are scaled such that the linear dynamics of the system are expected to converge to a non-null steady state.[14] The impact sums, over increasing walk lengths, the probability that a walk transmits to its end node and that the end node has not previously been visited by a shorter walk.[4] While both measures predict the outcome of the dynamical systems they encode well, in each case the authors acknowledge that results from one dynamic do not translate to other dynamics.

References

  1. Borgatti, Steve; Everett, Martin (2006). "A graph-theoretic perspective on centrality". Social Networks. 28: 466–484. doi:10.1016/j.socnet.2005.11.005.
  2. da Silva, Renato; Viana, Matheus; da F. Costa, Luciano (2012). "Predicting epidemic outbreak from individual features of the spreaders". J. Stat Mech Theor Exp. 2012 (07): P07005. arXiv:1202.0024. Bibcode:2012JSMTE..07..005A. doi:10.1088/1742-5468/2012/07/p07005.
  3. Lawyer, Glenn (2015). "Understanding the spreading power of all nodes in a network: a continuous-time perspective". Sci Rep. 5: 8665. arXiv:1405.6707. Bibcode:2015NatSR...5E8665L. doi:10.1038/srep08665. PMC 4345333. PMID 25727453.
  4. Bauer, Frank; Lizier, Joseph (2012). "Identifying influential spreaders and efficiently estimating infection numbers in epidemic models: A walk counting approach". Europhys Lett. 99 (6): 68007. arXiv:1203.0502. Bibcode:2012EL.....9968007B. doi:10.1209/0295-5075/99/68007.
  5. Sikic, Mile; Lancic, Alen; Antulov-Fantulin, Nino; Stefanic, Hrvoje (2013). "Epidemic centrality -- is there an underestimated epidemic impact of network peripheral nodes?". The European Physical Journal B. 86 (10): 1–13. arXiv:1110.2558. doi:10.1140/epjb/e2013-31025-5.
  6. Travencolo, B. a. N.; da F. Costa, Luciano (2008). "Accessibility in complex networks". Phys Lett A. 373 (1): 89–95. doi:10.1016/j.physleta.2008.10.069.
  7. Viana, Matheus; Batista, Joao; da F. Costa, Luciano (2012). "Effective number of accessed nodes in complex networks". Phys Rev E. 85 (3 pt 2): 036105.
  8. Lawyer, Glenn (2014). "Technical Report: Performance of the Expected Force on AS-level Internet topologies". arXiv:1406.4785. Bibcode:2014arXiv1406.4785L.
  9. Lawyer, Glenn (2016). "Measuring the potential of individual airports for pandemic spread over the world airline network". BMC Infectious Diseases. 16: 70. doi:10.1186/s12879-016-1350-4. PMC 4746766. PMID 26861206.
  10. Milkau, Udo; Bott, Jürgen (2015). "Digitalisation in payments: From interoperability to centralised models?". Journal of Payments Strategy & Systems. 9 (3).
  11. Jordan, Lyndon; Maguire, Sean; Hofmann, Hans; Kohda, Masanori (2016). "The social and ecological costs of an 'over-extended' phenotype". Proceedings of the Royal Society B. 283 (1822): 20152359. doi:10.1098/rspb.2015.2359. PMC 4721094. PMID 26740619.
  12. Pereira, Vanessa; Gama, Maria; Sousa, Filipe; Lewis, Theodore; Gobatto, Claudio; Manchado-Gobatto, Fúlvia (2015). "Complex network models reveal correlations among network metrics, exercise intensity and role of body changes in the fatigue process". Scientific Reports. 5: 10489. Bibcode:2015NatSR...510489P. doi:10.1038/srep10489. PMC 4440209. PMID 25994386.
  13. Ellinas, Christos; Allan, Neil; Durugbo, Christopher; Johansson, Anders (2015). "How Robust Is Your Project? From Local Failures to Global Catastrophes: A Complex Networks Approach to Project Systemic Risk". PLoS One. 10: e0142469. Bibcode:2015PLoSO..1042469E. doi:10.1371/journal.pone.0142469. PMC 4659599. PMID 26606518.
  14. Klemm, Konstantin; Serrano, M Angeles; Eguiluz, Victor; Miguel, Maxi San (2012). "A measure of individual role in collective dynamics". Sci Rep. 2: 292. Bibcode:2012NatSR...2E.292K. doi:10.1038/srep00292.