Edit distance

From Wikipedia, the free encyclopedia
Jump to: navigation, search

In information theory and computer science, the edit distance between two strings of characters generally refers to the Levenshtein distance.[1] However, according to Nico Jacobs, “The term ‘edit distance’ is sometimes used to refer to the distance in which insertions and deletions have equal cost and replacements have twice the cost of an insertion”.[2]

It may also refer to the whole class of string metrics that measure distance as the (weighted or unweighted) number of operations required to transform a string into another. There are several different ways to define an edit distance, depending on which edit operations are allowed: replace, delete, insert, transpose, and so on. There are algorithms to calculate its value under various definitions:

Notes [edit]

  1. ^ Algorithms and Theory of Computation Handbook, CRC Press LLC, 1999, "Levenshtein distance", in Dictionary of Algorithms and Data Structures, Paul E. Black, ed., U.S. National Institute of Standards and Technology, 14 August 2008 (accessed 31 October 2011).
  2. ^ See p. 190 in Jacobs, Nico “Relational Sequence Learning and User Modelling”. Katholieke Universiteit Leuven, Faculteit Wetenschappen, Faculteit Toegepaste Wetenschappen; Departement Computerwetenschappen, Leuven, Oct. 2004, xvi + 235 pp.

See also [edit]

External links [edit]