Levenshtein transducer
In computer science, Levenshtein automata for a formal language are the family of finite state automata that can recognize the set V of all words in the language for which the Levenshtein distance to an arbitrary word w does not exceed a particular constant. Levenshtein transducers are extended Levenshtein automata that can in addition generate all strings in V at a fixed Levenshtein distance from a given w.
A Levenshtein automaton for W can be constructed in linear time with respect to the length of W, and can identify V in less time than would be needed if the Levenshtein distance to W was calculated for each word in the language (a problem with quadratic time complexity).
Since Levenshtein automata are finite-state machines, they can be described in (some) regular expression frameworks and finite-state algebra applies to them. In particular, Levenshtein transducers can be composed with other finite-state transducers.
Experimental implementations of the Levenshtein automaton exist in Python[1][2] and Java[3]
[edit] Footnotes
- ^ Barrette-Lapierre, Jean-Philippe. Moman. May 14, 2009. Retrieved on 2010-10-29.
- ^ Johnson, Nick. Damn Cool Algorithms: Levenshtein Automata. July 28, 2010. Retrieved on 2010-10-29.
- ^ McCandless, Mike. Lucene's FuzzyQuery is 100 times faster in 4.0. March 24, 2011. Retrieved on 2011-06-23.
[edit] External links
- Ahmed Hassan, Sara Noeman and Hany Hassan (2008). Language Independent Text Correction using Finite State Automata. Proc. IJCNLP.
- Petar N. Mitankin (2005). Universal Levenshtein Automata. Building and Properties. Thesis for the Sofia University St. Kliment Ohridski.
- Klaus U. Schulz and Stoyan Mihov (2002). Fast String Correction with Levenshtein-Automata. International Journal of Document Analysis and Recognition, 5(1):67--85, 2002.
| This computer science article is a stub. You can help Wikipedia by expanding it. |