Betweenness centrality

Betweenness centrality is an indicator of a node's centrality in a network. It is equal to the number of shortest paths from all vertices to all others that pass through that node. A node with high betweenness centrality has a large influence on the transfer of items through the network, under the assumption that transfer follows shortest paths. The concept finds wide application, including in computer and social networks,[1][2] biology,[3][4] transport,[5][6] and scientific cooperation.[7] The development of betweenness centrality is generally attributed to the sociologist Linton Freeman.[8] The idea was proposed earlier by the mathematician J. Anthonisse, but his work was never published.[9]

Definition

The betweenness centrality of a node v is given by the expression:

g(v)= \sum_{s \neq v \neq t}\frac{\sigma_{st}(v)}{\sigma_{st}}

where \sigma_{st} is the total number of shortest paths from node s to node t and \sigma_{st}(v) is the number of those paths that pass through v.
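
The definition can be evaluated directly on a small graph. The following is a minimal sketch, assuming an unweighted graph stored as an adjacency dictionary; the function names `bfs_counts` and `betweenness_by_definition` are illustrative and not taken from any library. It uses the fact that a shortest path from s to t passes through v exactly when d(s,v) + d(v,t) = d(s,t), in which case \sigma_{st}(v) = \sigma_{sv} \sigma_{vt}.

```python
from collections import deque

def bfs_counts(adj, source):
    """Return (dist, sigma): hop distances and shortest-path counts from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:              # first time w is reached
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:     # u precedes w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma

def betweenness_by_definition(adj):
    """Sum sigma_st(v) / sigma_st over ordered pairs (s, t) with s != v != t."""
    info = {s: bfs_counts(adj, s) for s in adj}
    g = {v: 0.0 for v in adj}
    for s in adj:
        dist_s, sigma_s = info[s]
        for t in adj:
            if t == s or t not in dist_s:
                continue                   # skip s == t and unreachable targets
            for v in adj:
                if v == s or v == t or v not in dist_s:
                    continue
                dist_v, sigma_v = info[v]
                if t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                    g[v] += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return g

# Path graph a - b - c: b lies on the only a-c path, counted in both directions.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(betweenness_by_definition(path))  # {'a': 0.0, 'b': 2.0, 'c': 0.0}
```

Because the sketch sums over ordered pairs, values for an undirected graph are twice what a sum over unordered pairs would give. The triple loop is meant for illustration on small graphs only.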

Note that the betweenness centrality of a node scales with the number of pairs of nodes, as implied by the summation indices. The calculation may therefore be rescaled by dividing by the number of pairs of nodes not including v, so that g \in [0,1]. The divisor is (N-1)(N-2) for directed graphs and (N-1)(N-2)/2 for undirected graphs, where N is the number of nodes in the giant component. Note that this rescaling is relative to the highest possible value, which is reached when a single node lies on every shortest path. This is often not the case, and a normalization can be performed without a loss of precision:

\mbox{normal}(g(v)) = \frac{g(v) - \min(g)}{\max(g) - \min(g)}

which results in:

\max(\mbox{normal}) = 1
\min(\mbox{normal}) = 0

Note that this will always be a scaling from a smaller range into a larger range, so no precision is lost.
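
Both rescalings can be sketched in a few lines. This is an illustrative example only: the function names and the sample scores are assumptions, the sample values are the unordered-pair betweenness scores of a five-node path graph a-b-c-d-e, and returning zeros when all scores are equal is a choice made here, since the formula is undefined in that case.

```python
def rescale_by_pairs(g, n, directed):
    """Divide raw scores by the number of node pairs that exclude the node itself."""
    pairs = (n - 1) * (n - 2)
    if not directed:
        pairs /= 2                     # unordered pairs for undirected graphs
    return {v: score / pairs for v, score in g.items()}

def min_max_normalize(g):
    """Map scores linearly so the smallest becomes 0 and the largest becomes 1."""
    lo, hi = min(g.values()), max(g.values())
    if hi == lo:
        # Assumption: the formula divides by zero here, so return all zeros.
        return {v: 0.0 for v in g}
    return {v: (score - lo) / (hi - lo) for v, score in g.items()}

# Unordered-pair betweenness of the path graph a - b - c - d - e.
raw = {"a": 0.0, "b": 3.0, "c": 4.0, "d": 3.0, "e": 0.0}
print(rescale_by_pairs(raw, n=5, directed=False))
print(min_max_normalize(raw))
```

In this example the middle node c is rescaled to 4/6 by the pair count, because it does not lie on every shortest path, while the min-max normalization maps it to exactly 1.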

The load distribution in real and model networks

Model networks

It has been shown that the load distribution of a scale-free network follows a power law given by a load exponent \delta,[10]

P(g) \approx g^{-\delta} (1)

This implies the following scaling relation with the degree of the node:

g(k) \approx k^\eta

where g(k) is the average load of vertices with degree k. The exponents \delta and \eta are not independent, since equation (1) implies[11]

P(g)= \int dk P(k) \delta (g-k^\eta)

For large g, and therefore large k, the expression becomes

P(g \gg 1) = \int dk \, k^{-\gamma} \delta (g-k^\eta) \sim g^{-1-\frac{\gamma -1}{\eta}}

Comparing this exponent with that of equation (1) yields the following equality:

\eta=\frac{\gamma -1}{\delta -1}
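
The equality above can be checked step by step. The following sketch assumes that \delta(\cdot) inside the integral denotes the Dirac delta function (not the load exponent \delta) and applies the standard change-of-variables rule for it; the symbol k_0, the root of g = k^\eta, is introduced here for convenience.

```latex
\delta(g - k^\eta) = \frac{\delta(k - k_0)}{\eta \, k_0^{\eta - 1}}, \qquad k_0 = g^{1/\eta}

\int dk \, k^{-\gamma} \delta(g - k^\eta)
  = \frac{k_0^{-\gamma}}{\eta \, k_0^{\eta - 1}}
  = \frac{1}{\eta} \, g^{-\frac{\gamma + \eta - 1}{\eta}}
  = \frac{1}{\eta} \, g^{-1 - \frac{\gamma - 1}{\eta}}

-\delta = -1 - \frac{\gamma - 1}{\eta}
  \quad \Rightarrow \quad
  \eta = \frac{\gamma - 1}{\delta - 1}
```

The last line matches the exponent of the result with the exponent -\delta of equation (1).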

The important exponent appears to be \eta which describes how the betweenness centrality depends on the connectivity. The situation which maximizes the betweenness centrality for a vertex is when all shortest paths are going through it, which corresponds to a tree structure (a network with no clustering). In the case of a tree network the maximum value of \eta = 2 is reached.[11]

\eta = 2 \rightarrow \delta = \frac{\gamma +1}{2}

This maximal value of \eta (and hence minimum of \delta) puts bounds on the load exponents for networks with non-vanishing clustering.

\eta \le 2 \rightarrow \delta \ge \frac{\gamma +1}{2}

In this case, the exponents \delta and \eta are not universal and depend on details of the network (average connectivity, correlations, etc.).

Real networks

Real-world scale-free networks, such as the Internet, also follow a power-law load distribution.[12] This is an intuitive result. Scale-free networks arrange themselves to create short path lengths across the network by creating a few hub nodes with much higher connectivity than the majority of the network. These hubs naturally experience much higher loads because of this added connectivity.

Weighted networks

In a weighted network the links connecting the nodes are no longer treated as binary interactions, but are weighted in proportion to their capacity, influence, frequency, etc., which adds another dimension of heterogeneity within the network beyond the topological effects. A node's strength in a weighted network is given by the sum of the weights of its adjacent edges.

s_{i} = \sum_{j=1}^{N} a_{ij}w_{ij}

Here a_{ij} and w_{ij} are the entries of the adjacency and weight matrices for nodes i and j, respectively. Analogous to the power-law distribution of degree found in scale-free networks, the strength of a node follows a power law in its degree as well:

s(k) \approx k^\beta \,

A study of the average value s(b) of the strength for vertices with betweenness b shows that the functional behavior can be approximated by a scaling form [13]

s(b)\approx b^{\alpha}
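
The node strength s_i defined earlier in this section can be computed directly from the two matrices. The following is a minimal sketch, assuming both matrices are given as square lists of lists; the function name `node_strengths` and the example weights are illustrative.

```python
def node_strengths(adjacency, weights):
    """Return s_i = sum_j a_ij * w_ij for every node i."""
    n = len(adjacency)
    return [
        sum(adjacency[i][j] * weights[i][j] for j in range(n))
        for i in range(n)
    ]

# Three nodes in a line: edge 0-1 has weight 2.0, edge 1-2 has weight 0.5.
adjacency = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
weights = [
    [0.0, 2.0, 0.0],
    [2.0, 0.0, 0.5],
    [0.0, 0.5, 0.0],
]
print(node_strengths(adjacency, weights))  # [2.0, 2.5, 0.5]
```

The middle node has degree 2 and strength 2.5, which shows how strength carries information that degree alone does not.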

Algorithms

Calculating the betweenness and closeness centralities of all the vertices in a graph involves calculating the shortest paths between all pairs of vertices, which takes \Theta(|V|^3) time with the Floyd–Warshall algorithm, modified to count all shortest paths between two nodes rather than find just one. On a sparse graph, Johnson's algorithm may be more efficient, taking O(|V|^2 \log |V| + |V| |E|) time. On unweighted graphs, calculating betweenness centrality takes O(|V| |E|) time using Brandes' algorithm.[14]
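
The following is a compact sketch of Brandes' approach for unweighted graphs, written from the commonly described form of the algorithm rather than copied from the cited paper. It assumes the graph is an adjacency dictionary without loops or multiple edges; the function name `brandes_betweenness` is illustrative. Each source runs one breadth-first search that counts shortest paths, followed by a backward pass that accumulates pair dependencies, which gives the O(|V| |E|) running time.

```python
from collections import deque

def brandes_betweenness(adj):
    """Betweenness of every node, summed over ordered (source, target) pairs."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                         # nodes in order of non-decreasing distance
        preds = {v: [] for v in adj}       # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}        # number of shortest paths from s
        dist = {v: -1 for v in adj}        # -1 marks "not reached yet"
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Backward pass: accumulate dependencies from the farthest nodes inward.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc

# Star graph: every path between two leaves passes through the center 0.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(brandes_betweenness(star))  # {0: 6.0, 1: 0.0, 2: 0.0, 3: 0.0}
```

For an undirected graph each shortest path is visited once from each endpoint, so the scores from this sketch would be halved to obtain unordered-pair values.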

In calculating the betweenness and closeness centralities of all vertices in a graph, it is assumed that graphs are undirected and connected, with loops and multiple edges allowed. When specifically dealing with network graphs, the graphs are often without loops or multiple edges in order to maintain simple relationships (where edges represent connections between two people or vertices). In this case, when Brandes' algorithm is used, the final centrality scores are divided by 2 to account for each shortest path being counted twice.[14]

Another algorithm generalizes Freeman's betweenness, computed on geodesics, and Newman's betweenness, computed on all paths, by introducing a hyper-parameter that controls the trade-off between exploration and exploitation. Its time complexity is proportional to the number of edges times the number of nodes in the graph.[15]

The concept of centrality was extended to a group level as well.[16] Group betweenness centrality shows the proportion of geodesics connecting pairs of non-group members that pass through a group of nodes. Brandes' algorithm for computing the betweenness centrality of all vertices was modified to compute the group betweenness centrality of one group of nodes with the same asymptotic running time.[16]

Related concepts

Betweenness centrality is related to a network's connectivity. Network connectivity is the minimum number of elements that need to be removed to disconnect the remaining nodes from each other.

References

  1. ^ Brandes, Ulrik (2008). "On variants of shortest-path betweenness centrality and their generic computation". Social Networks 30: 136–145. doi:10.1016/j.socnet.2007.11.001. 
  2. ^ Cuzzocrea, Alfredo; Papadimitriou, Alexis; Katsaros, Dimitrios; Manopoulus, Yanis (2012). "Edge betweenness centrality: A novel algorithm for QoS-based topology control over wireless sensor networks". Journal of Network and Computer Applications 35: 1210–1217. doi:10.1016/j.jnca.2011.06.001. 
  3. ^ Estrada, Ernesto (2007). "Characterization of topological keystone species Local, global and meso-scale centralities in food webs". Ecological Complexity 4: 48–57. 
  4. ^ Martin Gonzalez, Ana M.; Dalsgaard, Bo; Olesen, Jens M. (2010). "Centrality measures and the importance of generalist species in pollination networks". Ecological Complexity 7: 36–43. doi:10.1016/j.ecocom.2009.03.008. 
  5. ^ Wang, Jiaoe; Huihui, Mo; Wang, Fahui; Jin, Fengjun (2011). "Exploring the network structure and nodal centrality of China’s air transport network: A complex network approach". Journal of Transport Geography 19: 712–721. doi:10.1016/j.jtrangeo.2010.08.012. 
  6. ^ Rodriguez-Deniz, Hector (2012). "Using SAS® to Measure Airport Connectivity: An Application of Weighted Betweenness Centrality for the FAA National Plan of Integrated Airport Systems (NPIAS)". Proceedings of the SAS Global Forum 2012, Paper 162-2012. 
  7. ^ Abassi, Alireza; Hossain, Liaquat; Leydesdorff, Loet (2012). "Betweenness centrality as a driver of preferential attachment in the evolution of research collaboration networks". Journal of Informetrics 6: 403–412. doi:10.1016/j.joi.2012.01.002. 
  8. ^ Freeman, Linton (1977). "A set of measures of centrality based on betweenness". Sociometry 40: 35–41. doi:10.2307/3033543. 
  9. ^ Newman, M.E.J. (2010). Networks: An Introduction. Oxford, UK: Oxford University Press. ISBN 978-0199206650. 
  10. ^ K. I. Goh, B. Kahng, D. Kim (12 Dec 2001). "Universal Behavior of Load Distribution in Scale-Free Networks". Physical Review Letters 87 (27): 278701. doi:10.1103/physrevlett.87.278701. 
  11. ^ a b M. Barthélemy Eur. Phys. J. B 38, 163–168 (2004)
  12. ^ Kwang-Il Goh, Eulsik Oh, Hawoong Jeong, Byungnam Kahng, and Doochul Kim. PNAS (2002) vol. 99 no. 2
  13. ^ A. Barrat, M. Barthelemy, R. Pastor-Satorras, and A. Vespignani. PNAS (2004) vol. 101 no. 11
  14. ^ a b Ulrik Brandes. A faster algorithm for betweenness centrality (PDF). 
  15. ^ Amin Mantrach et al. "The sum-over-paths covariance kernel: A novel covariance measure between nodes of a directed graph", IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(6), pages 1112–1126, 2010. 
  16. ^ a b Puzis, R., Yagil, D., Elovici, Y., Braha, D. (2009) Collaborative attack on Internet users’ anonymity, Internet Research 19(1)