The algorithm begins with an implicit suffix tree containing the first character of the string. Then it steps through the string adding successive characters until the tree is complete. This order addition of characters gives Ukkonen's algorithm its "on-line" property. The original algorithm presented by P. Weiner proceeded backward from the last character to the first one from the shortest to the longest suffix. A simpler algorithm was found by Edward M. McCreight, going from the longest to the shortest suffix.
The naive implementation for generating a suffix tree going forward requires O(n2) or even O(n3) time complexity in big O notation, where n is the length of the string. By exploiting a number of algorithmic techniques, Ukkonen reduced this to O(n) (linear) time, for constant-size alphabets, and O(n log n) in general, matching the runtime performance of the earlier two algorithms.
- Ukkonen, E. (1995). "On-line construction of suffix trees". Algorithmica 14 (3): 249–260. doi:10.1007/BF01206331.
- Weiner, P. (1973). "Linear pattern matching algorithms". 14th Annual Symposium on Switching and Automata Theory (swat 1973). pp. 1–9. doi:10.1109/SWAT.1973.13.
- McCreight, E. M. (1976). "A Space-Economical Suffix Tree Construction Algorithm". Journal of the ACM 23 (2): 262. doi:10.1145/321941.321946.
- Original Ukkonen's paper PDF | PDF with figures
- McCreight's paper in PDF
- Weiner's paper in PDF
- Detailed explanation in plain English
- Fast String Searching With Suffix Trees Mark Nelson's tutorial. Has an implementation example written with C++.
- Implementation in C with detailed explanation
- Lecture slides by Guy Blelloch
- Ukkonen's homepage
- Text-Indexing project (Ukkonen's linear-time construction of suffix trees)
- Implementation in C Part 1 Part 2 Part 3 Part 4 Part 5 Part 6
|This computer science article is a stub. You can help Wikipedia by expanding it.|