# Tversky index


The Tversky index, named after Amos Tversky, is an asymmetric similarity measure on sets that compares a variant to a prototype. It can be seen as a generalization of the Sørensen–Dice coefficient and the Tanimoto coefficient (also known as the Jaccard index).

For sets X and Y the Tversky index is a number between 0 and 1 given by

$S(X,Y)={\frac {|X\cap Y|}{|X\cap Y|+\alpha |X-Y|+\beta |Y-X|}}$

Here, $X-Y$ denotes the relative complement of Y in X.

Further, $\alpha ,\beta \geq 0$ are parameters of the Tversky index. Setting $\alpha =\beta =1$ produces the Tanimoto coefficient; setting $\alpha =\beta =0.5$ produces the Sørensen–Dice coefficient.
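Both special cases can be checked directly from the definition, using only the identity $|X|=|X\cap Y|+|X-Y|$ and its counterpart for Y. For $\alpha =\beta =1$ the denominator is the size of the union:

$S(X,Y)={\frac {|X\cap Y|}{|X\cap Y|+|X-Y|+|Y-X|}}={\frac {|X\cap Y|}{|X\cup Y|}}$

For $\alpha =\beta =0.5$ the denominator is half the sum of the two set sizes:

$S(X,Y)={\frac {|X\cap Y|}{|X\cap Y|+{\tfrac {1}{2}}|X-Y|+{\tfrac {1}{2}}|Y-X|}}={\frac {2|X\cap Y|}{|X|+|Y|}}$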

If we consider X to be the prototype and Y to be the variant, then $\alpha$ corresponds to the weight of the prototype and $\beta$ corresponds to the weight of the variant. Tversky measures with $\alpha +\beta =1$ are of special interest.
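As an illustration, the definition can be sketched in a few lines of Python. The function name `tversky_index`, the example sets, and the convention of returning 1.0 when the denominator is zero are assumptions made for this sketch; the definition above does not specify that case.

```python
def tversky_index(x, y, alpha, beta):
    """Tversky index S(X, Y) with X as the prototype and Y as the variant."""
    if alpha < 0 or beta < 0:
        raise ValueError("alpha and beta must be non-negative")
    x, y = set(x), set(y)
    common = len(x & y)   # |X ∩ Y|
    only_x = len(x - y)   # |X - Y|, weighted by alpha
    only_y = len(y - x)   # |Y - X|, weighted by beta
    denominator = common + alpha * only_x + beta * only_y
    if denominator == 0:
        # Assumption of this sketch: a zero denominator (for example,
        # two empty sets) is reported as full similarity.
        return 1.0
    return common / denominator


# Hypothetical example sets: |X ∩ Y| = 2, |X - Y| = 2, |Y - X| = 1.
X = {1, 2, 3, 4}
Y = {3, 4, 5}

jaccard = tversky_index(X, Y, 1.0, 1.0)   # 2 / 5 = 0.4
dice = tversky_index(X, Y, 0.5, 0.5)      # 2 / 3.5, about 0.571
forward = tversky_index(X, Y, 0.8, 0.2)   # 2 / 3.8, about 0.526
backward = tversky_index(Y, X, 0.8, 0.2)  # 2 / 3.2, about 0.625
```

With unequal weights, swapping the roles of prototype and variant changes the result (`forward` differs from `backward`), which is the asymmetry the index is designed to capture.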

Because of this inherent asymmetry (in general $S(X,Y)\neq S(Y,X)$ when $\alpha \neq \beta$), the Tversky index does not meet the criteria for a similarity metric. If symmetry is needed, a variant of the original formulation that uses min and max functions has been proposed:

$S(X,Y)={\frac {|X\cap Y|}{|X\cap Y|+\beta \left(\alpha a+(1-\alpha )b\right)}}$

where

$a=\min \left(|X-Y|,|Y-X|\right)$,

$b=\max \left(|X-Y|,|Y-X|\right)$.

This formulation also reassigns the roles of the parameters $\alpha$ and $\beta$. Here, $\alpha$ controls the balance between the smaller and the larger of $|X-Y|$ and $|Y-X|$ in the denominator, while $\beta$ controls the weight of the set differences, which together make up the symmetric difference $X\,\triangle \,Y$, relative to $|X\cap Y|$.
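The symmetric variant can be sketched in Python in the same way. The function name `symmetric_tversky`, the example sets, and the handling of a zero denominator are assumptions made for this sketch, not part of the proposed formulation.

```python
def symmetric_tversky(x, y, alpha, beta):
    """Symmetric Tversky variant based on the min and max of the set differences."""
    x, y = set(x), set(y)
    common = len(x & y)    # |X ∩ Y|
    only_x = len(x - y)    # |X - Y|
    only_y = len(y - x)    # |Y - X|
    a = min(only_x, only_y)
    b = max(only_x, only_y)
    denominator = common + beta * (alpha * a + (1 - alpha) * b)
    if denominator == 0:
        # Assumption of this sketch: a zero denominator (for example,
        # two empty sets) is reported as full similarity.
        return 1.0
    return common / denominator


# Hypothetical example sets: |X ∩ Y| = 2, a = 1, b = 2.
X = {1, 2, 3, 4}
Y = {3, 4, 5}

s_xy = symmetric_tversky(X, Y, 0.3, 1.0)
s_yx = symmetric_tversky(Y, X, 0.3, 1.0)    # same value as s_xy

as_jaccard = symmetric_tversky(X, Y, 0.5, 2.0)  # 2 / 5 = 0.4
as_dice = symmetric_tversky(X, Y, 0.5, 1.0)     # 2 / 3.5, about 0.571
```

Because `a` and `b` depend only on the unordered pair of set differences, swapping the arguments leaves the result unchanged. With $\alpha =0.5$ the bracketed term becomes half the size of the symmetric difference, so $\beta =2$ reproduces the Tanimoto coefficient and $\beta =1$ the Sørensen–Dice coefficient.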