Overlap coefficient

From Wikipedia, the free encyclopedia

The overlap coefficient,[note 1] or Szymkiewicz–Simpson coefficient,[citation needed][3][4][5] is a similarity measure that quantifies the overlap between two finite sets. It is related to the Jaccard index and is defined as the size of the intersection divided by the size of the smaller of the two sets:

$$\operatorname{overlap}(A, B) = \frac{|A \cap B|}{\min(|A|, |B|)}$$

Note that $0 \le \operatorname{overlap}(A, B) \le 1$. If set A is a subset of B or the converse, then the overlap coefficient is equal to 1.
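The definition can be illustrated with a minimal Python sketch. The function name `overlap_coefficient` and the choice to raise an error for empty sets are assumptions made for this illustration; the definition itself does not say how the 0/0 case should be handled.

```python
def overlap_coefficient(a, b):
    """Return |A ∩ B| / min(|A|, |B|) for two finite sets."""
    a, b = set(a), set(b)
    if not a or not b:
        # With an empty set the ratio becomes 0/0. Raising here is an
        # assumption of this sketch, not part of the definition.
        raise ValueError("overlap coefficient is undefined for an empty set")
    return len(a & b) / min(len(a), len(b))


# Two elements are shared and the smaller set has three elements.
partial = overlap_coefficient({1, 2, 3}, {2, 3, 4, 5})

# The first set is a subset of the second, so the coefficient is 1.
nested = overlap_coefficient({1, 2}, {1, 2, 3, 4})
```

Here `partial` evaluates to 2/3, while `nested` evaluates to 1 because every element of the smaller set also lies in the larger one.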

See also


Notes

  1. The use of the term "overlap" appears in the comment for formula #27 in Table 2 of McGill et al. (1979),[1] which references Sager & Lockemann (1976).[2]


References

  1. McGill, M.; Koll, M.; Noreault, T. (October 1979). An Evaluation of Factors Affecting Document Ranking by Information Retrieval Systems. Syracuse, NY: School of Information Studies, Syracuse University.
  2. Sager, W. K. H.; Lockemann, P. C. (1976). "Classification of Ranking Algorithms". International Forum on Information and Documentation. 1 (4): 12–25.
  3. Simpson, G. G. (January 1943). "Mammals and the Nature of Continents". American Journal of Science. 241 (1): 1–31.
  4. Simpson, G. G. (July 1947). "Holarctic Mammalian Faunas and the Continental Relationships During the Cenozoic". Bulletin of the Geological Society of America. 58 (7): 613–688.
  5. Simpson, G. G. (1960). "Notes on the Measurement of Faunal Resemblance". American Journal of Science. 258-A: 300–311.