Canberra distance

The Canberra distance is a numerical measure of the distance between pairs of points in a vector space, introduced in 1966[1] and refined in 1967[2] by Godfrey N. Lance and William T. Williams. It is a weighted version of L₁ (Manhattan) distance.[3] The Canberra distance has been used as a metric for comparing ranked lists[3] and for intrusion detection in computer security.[4] It has also been used to analyze the gut microbiome in different disease states.[5]

Definition

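The Canberra distance d between vectors p and q in an n-dimensional real vector space is given as follows:

$$d(\mathbf{p}, \mathbf{q}) = \sum_{i=1}^{n} \frac{|p_i - q_i|}{|p_i| + |q_i|}$$

where

$$\mathbf{p} = (p_1, p_2, \dots, p_n) \text{ and } \mathbf{q} = (q_1, q_2, \dots, q_n)$$

are vectors.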
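
As a minimal sketch, the distance can be computed by summing |p_i − q_i| / (|p_i| + |q_i|) over the n coordinates. The function name `canberra_distance` is hypothetical. Treating a coordinate where both entries are zero as contributing 0 (instead of the undefined 0/0) is an assumption made here; the definition above does not specify that case.

```python
from typing import Sequence


def canberra_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Sum of |p_i - q_i| / (|p_i| + |q_i|) over all coordinates."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of coordinates")

    total = 0.0
    for p_i, q_i in zip(p, q):
        denominator = abs(p_i) + abs(q_i)
        if denominator == 0:
            # Assumption: a 0/0 term (both coordinates zero) contributes 0.
            continue
        total += abs(p_i - q_i) / denominator
    return total


if __name__ == "__main__":
    # Only the first coordinate differs: |1 - 3| / (1 + 3) = 0.5
    print(canberra_distance([1.0, 2.0], [3.0, 2.0]))
```

Because each term is divided by the magnitudes of the two coordinates, every term lies between 0 and 1, so differences near zero count for more than equal differences between large values.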

The Adkins form of the Canberra metric divides the distance d by (n − Z), where Z is the number of attributes that are zero in both p and q.
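
The normalization can be sketched as follows: sum the per-coordinate terms |p_i − q_i| / (|p_i| + |q_i|), count the Z coordinates that are zero in both vectors, and divide the sum by (n − Z). The function name `canberra_adkins` is hypothetical, and returning 0 when every coordinate is zero in both vectors (n = Z) is an assumption made here, since the text does not cover that case.

```python
from typing import Sequence


def canberra_adkins(p: Sequence[float], q: Sequence[float]) -> float:
    """Canberra distance divided by (n - Z), Z = coordinates zero in both vectors."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of coordinates")

    n = len(p)
    shared_zeros = 0  # Z in the text
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0 and q_i == 0:
            # Counted in Z and left out of the sum (avoids 0/0).
            shared_zeros += 1
            continue
        total += abs(p_i - q_i) / (abs(p_i) + abs(q_i))

    if n == shared_zeros:
        # Assumption: no informative coordinates, so report zero distance.
        return 0.0
    return total / (n - shared_zeros)


if __name__ == "__main__":
    # Terms: 2/4 = 0.5, (shared zero, skipped), 2/2 = 1.0; n - Z = 3 - 1 = 2
    print(canberra_adkins([1.0, 0.0, 0.0], [3.0, 0.0, 2.0]))
```

Dividing by (n − Z) averages the distance over the coordinates that carry information, so the result stays between 0 and 1 regardless of how many attributes are absent from both vectors.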

See also

Notes

  1. Lance, Godfrey N.; Williams, William T. (1966). "Computer programs for hierarchical polythetic classification ("similarity analysis")". Computer Journal. 9 (1): 60–64. doi:10.1093/comjnl/9.1.60.
  2. Lance, Godfrey N.; Williams, William T. (1967). "Mixed-data classificatory programs I. Agglomerative Systems". Australian Computer Journal: 15–20.
  3. Jurman, Giuseppe; Riccadonna, Samantha; Visintainer, Roberto; Furlanello, Cesare (2009). "Canberra Distance on Ranked Lists". In Shivani Agrawal; Chris Burges; Koby Crammer (eds.). Proceedings, Advances in Ranking – NIPS 09 Workshop. pp. 22–27.
  4. Emran, Syed Masum; Ye, Nong (2002). "Robustness of chi-square and Canberra distance metrics for computer intrusion detection". Quality and Reliability Engineering International. 18 (1): 19–28. doi:10.1002/qre.441.
  5. Hill-Burns, Erin M.; Debelius, Justine W.; Morton, James T.; Wissemann, William T.; Lewis, Matthew R.; Wallen, Zachary D.; Peddada, Shyamal D.; Factor, Stewart A.; Molho, Eric; Zabetian, Cyrus P.; Knight, Rob; Payami, Haydeh (May 2017). "Parkinson's disease and Parkinson's disease medications have distinct signatures of the gut microbiome". Movement Disorders. 32 (5): 739–749. doi:10.1002/mds.26942. PMC 5469442. PMID 28195358.

References