Adjusted mutual information

From Wikipedia, the free encyclopedia
  (Redirected from Adjusted Mutual Information)
Jump to: navigation, search

Adjusted mutual information (AMI) is used for comparing clusterings.[1] It corrects the effect of agreement solely due to chance between clusterings, similar to the way the adjusted rand index corrects the Rand index. It is closely related to variation of information[2]: when a similar adjustment is made to the VI index, it becomes equivalent to the AMI.[1] The adjusted measure however is no longer metrical.[3]

Contents

[edit] Preliminaries: Mutual Information

Let S be a set of N data points \{s_1, s_2,\ldots s_N\}. We consider the case of hard clustering. Given two clusterings of S, namely U=\{U_1, U_2,\ldots, U_R\} with R clusters, and V=\{V_1, V_2,\ldots, V_C\} with C clusters (\cap_{i=1}^RU_i=\cap_{j=1}^C V_j=\emptyset,
\cup_{i=1}^RU_i=\cup_{j=1}^C V_j=S), the information on cluster overlap between U and V can be summarized in the form of a RxC contingency table M=[n_{ij}]^{i=1 \ldots
R}_{j=1 \ldots C} where nij denotes the number of objects that are common to clusters Ui and Vj.

Suppose that we pick an object at random from S, then the probability that the object falls into cluster Ui is:

P(i)=\frac{|U_i|}{N}

We define the entropy associated with the clustering U as:

H(U)=-\sum_{i=1}^R P(i)\log P(i)

H(U) is non-negative and takes the value 0 only when there is no uncertainty determining an object's cluster membership, i.e., there is only one cluster. Similarly, the entropy of the clustering V can be calculated as:

H(V)=-\sum_{j=1}^C P'(j)\log P'(j) where P'(j) = | Vj | / n.

Now we arrive at the mutual information (MI) between two clusterings[citation needed]:

MI(U,V)=\sum_{i=1}^R \sum_{j=1}^C P(i,j)\log \frac{P(i,j)}{P(i)P'(j)}

where P(i,j) denotes the probability that a point belongs to cluster Ui in U and cluster Vj in V:

P(i,j)=\frac{|U_i \cap V_j|}{N}

MI is a non-negative quantity upper bounded by the entropies H(U) and H(V). It quantifies the information shared by the two clusterings and thus can be employed as a clustering similarity measure.

[edit] Adjustment for chance

Like the Rand index, the base line value of mutual information between two random clusterings does not take on a constant value, and tends to be larger when the two clusterings have a larger number of clusters (with a fixed number of data point N). By adopting a hypergeometric model of randomness, it can be shown that the expected mutual information between two random clusterings is:

E\{MI(U,V)\}=\sum_{i=1}^R \sum_{j=1}^C \sum_{n_{ij}=(a_i+b_j-N)^+
}^{\min(a_i, b_j)} \frac{n_{ij}}{N}\log ( \frac{ N.n_{ij}}{a_i b_j})
\frac{a_i!b_j!(N-a_i)!(N-b_j)!}{N!n_{ij}!(a_i-n_{ij})!(b_j-n_{ij})!(N-a_i-b_j+n_{ij})!}

where (ai + bjN) + denotes max(0,ai + bjN).

Then the adjusted measure[1] for the mutual information can be:

 AMI(U,V)= \frac{MI(U,V)-E\{MI(U,V)\}} {\max{\{H(U),H(V)\}}-E\{MI(U,V)\}}

which takes a value of 1 when the two clusterings are identical and 0 when the MI between two clusterings equals to that expected by chance.

[edit] References

  1. ^ a b c Vinh, N. X.; Epps, J.; Bailey, J. (2009). "Information theoretic measures for clusterings comparison". Proceedings of the 26th Annual International Conference on Machine Learning - ICML '09. pp. 1. doi:10.1145/1553374.1553511. ISBN 9781605585161.  edit
  2. ^ Meila, M. (2007). "Comparing clusterings—an information based distance". Journal of Multivariate Analysis 98 (5): 873–895. doi:10.1016/j.jmva.2006.11.013.  edit
  3. ^ Vinh, Nguyen Xuan; Epps, Julien; Bailey, James (2010), "Information Theoretic Measures for Clusterings Comparison: Variants, Properties, Normalization and Correction for Chance", The Journal of Machine Learning Research 11 (oct): 2837–54, http://jmlr.csail.mit.edu/papers/volume11/vinh10a/vinh10a.pdf 

[edit] Others

Matlab code for computing the adjusted mutual information

Personal tools
Namespaces
Variants
Actions
Navigation
Interaction
Toolbox
Print/export