Talk:Trie

Erroneous Python example[edit]

The Python example seems to be wrong as it lacks a recursive "find". — Preceding unsigned comment added by 178.2.130.142 (talk) 10:12, 15 February 2014 (UTC)
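
A minimal sketch of such a find, assuming each node is a plain dict mapping characters to child nodes, with a special 'value' entry marking terminal nodes (illustrative only, not the article's code):

    def find(node, key):
        if not key:                    # whole key consumed
            return node.get('value')   # the value if terminal, else None
        child = node.get(key[0])       # descend one character
        if child is None:
            return None                # no branch for this character
        return find(child, key[1:])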

Disadvantages[edit]

I think there should be a section describing the disadvantages of tries. Currently, there are a few disadvantages listed under the "comparison with other data structures" section.

And one disadvantage that I don't see mentioned anywhere is the space/time tradeoff of managing child nodes: either you need a (potentially large and sparse) array of indexes at each node, or you need to implement a secondary search algorithm to find the appropriate child (e.g., the Python example uses a hash lookup over a dictionary of children). While the indexed approach may not be an issue for relatively shallow tries using ASCII (i.e., the example at the top of the page), it will quickly become an issue with deep tries and/or large value spaces (e.g., Unicode). Perhaps someone with more practical experience can comment on this? —Preceding unsigned comment added by 64.94.198.98 (talk) 17:21, 25 May 2010 (UTC)
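
A sketch of the two approaches described above (illustrative; the array variant assumes single-byte characters):

    # Array-indexed children: O(1) child lookup, but 256 slots per
    # node even when most are unused (far worse for Unicode).
    class ArrayNode:
        def __init__(self):
            self.children = [None] * 256   # index = ord(character)
            self.terminal = False

    # Dict-based children: compact for sparse nodes, but each step
    # pays for a hash lookup instead of a direct index.
    class DictNode:
        def __init__(self):
            self.children = {}             # character -> child node
            self.terminal = False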

Misleading Diagram[edit]

A better diagram for the page would show a trie where there are more than two edges from each node. The current diagram makes a trie seem too much like a binary search tree, and the idea that each node can have multiple edges depending on the lexicon is lost. —Preceding unsigned comment added by 134.197.40.199 (talk) 13:29, 11 February 2009 (UTC)

This has now been fixed. me_and (talk) 09:39, 20 May 2010 (UTC)

As replacement of other data structures (hash table)[edit]

The article currently says that "The worst-case lookup speed in an imperfect hash table is O(log(N)) time." I would think that the worst case would be O(N), in the case where all keys hash to the same bucket and the resulting linked list of length N must be traversed. Tom Hubbard 02:27, 11 September 2007 (UTC)

OK, it looks like someone fixed this -- thanks. Tom Hubbard (talk) 14:00, 28 November 2007 (UTC)

Complexity of tries[edit]

I am just wondering, Wolfskeeper, whether you have understood complexity theory and tries yet, or whether you just go around breaking articles because you have not (yet). Perhaps we should sort this out before you make further changes that might need to be undone as well. --ISee 10:28, 28 July 2005 (UTC)


The pseudocode for lookup looks wrong. It will never find strings that are prefixes of other strings that are also stored in the trie. Am I mistaken?

  • That's exactly what I thought. I think the other algorithm was actually wrong entirely, since it would only return true if there was a NULL node (no node at all) at the end of the word. I've made the correction of adding a "terminal" field which denotes the end of a word. —EatMyShortz 13:52, 30 April 2006 (UTC)
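
To illustrate the bug and the fix (a sketch assuming nodes with a children dict and a terminal flag, not the article's exact code): with "ten" and "tent" both stored, the node for "ten" has a child, so a test for childlessness misses it.

    # Wrong: treats "end of word" as "node has no children", so a key
    # that is a prefix of another key ("ten" inside "tent") is missed.
    def contains_broken(node, word):
        for ch in word:
            if ch not in node.children:
                return False
            node = node.children[ch]
        return not node.children       # False for "ten"

    # Right: an explicit terminal field marks where keys end.
    def contains(node, word):
        for ch in word:
            if ch not in node.children:
                return False
            node = node.children[ch]
        return node.terminal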


"Average lookup speed is theoretically the same, but a trie is faster in practice." That seems like a highly dubious claim, unsupported by evidence! In my experience the opposite is usually true (the main benefit of tries is thus that it can grow easily, and it may take less space since prefixes of keys are shared).

  • The performance is highly related to the usage pattern. In a hash table, you can retrieve a value while touching only 1-2 cache lines on average, whereas in a trie this can be up to its depth (the length of the prefix). However, a hash table will almost always require these cache fetches, whereas the trie, especially in a tight loop, will often have most or all of the lines cached (and especially for negative results). --66.7.145.210 18:27, 5 October 2006 (UTC)
    • Also, as far as I know, a hash table lookup requires the search key to be hashed first, while the lookup in a trie can start almost instantly. Does anybody else know the Judy Arrays or Google sparsehash? I think these are an implementation of tries (at least its JudySL class), although the term "trie" is never used to describe either of them. --195.3.81.25 10:15, 27 August 2007 (UTC)


I'm not certain enough of this to edit the document without research, but... if you are going to take the length of keys into account when discussing the complexity of lookup with tries, then you should do the same with the other methods -- otherwise you are comparing apples to oranges. Normally, in discussing binary trees and hash tables, it is assumed that the keys are small constant-sized objects. When discussing tries, you have to think of strings. In this case, each comparison in a binary-tree search is an O(m) string compare, so the complexity of the lookup is not O(log n) but O(m log n). Similarly, the expected lookup time for a hash table is O(m), not O(1), because forming the hash typically takes O(m) time (you can avoid looking at all characters, but that probably affects the expected lookup time). --David Gudeman —Preceding unsigned comment added by 69.181.140.228 (talk) 04:22, 14 July 2008 (UTC)

Diagram Numbering[edit]

In the diagram of trie on the right hand side of http://en.wikipedia.org/w/index.php?title=Trie&oldid=65534606 the elements in the trie are numbered. What do these numbers represent? I can't for the life of me work it out - could an explanation also be added to the article?

I don't think they are intended to have any meaning. The use of numbers may not have been the most effective way to make their point, which is that only nodes with valid keys store values. Notice that the "t" and "te" nodes are not marked with numbers. This is because "t" and "te" are not real English words; therefore, they cannot serve as keys to any value (this _seems_ to be the idea). I don't think readers will be confused by this in general though, because the third paragraph provides an explanation. Maybe this was added after you made your request. Danielx 08:19, 21 August 2006 (UTC)

I agree with Danielx about the meaning of the numbers (i.e. they are merely arbitrary "values" associated with the string "keys"). However I have a problem with the diagram caption. Since, as Danielx points out, "t" and "te" are not marked with numbers then they are not "keys", but just key prefixes. The caption lists them as if they were keys and I believe that may be confusing. Paritybit (talk) 03:48, 10 September 2008 (UTC)

Binary?[edit]

It looks like tries are not binary (i.e., a node can have more than two children). This would be much clearer if the diagram featured a node with more than two children. (Whoops, forgot to sign, here it goes:) --Thenickdude 05:03, 2 September 2006 (UTC)

True, that. I should fix the diagram. Deco 02:33, 2 September 2006 (UTC)
I updated it so a couple of nodes have three children. Superm401 - Talk 23:03, 8 March 2009 (UTC)

Knuth reference[edit]

When looking up the reference by Knuth (The Art of Computer Programming, Volume 3: Sorting and Searching), I couldn't find a third edition, only a second edition from March 1998 (first printing).

More information can also be found on the publisher's website: http://www.aw-bc.com/catalog/academic/product/0,1144,0201896850,00.html

Maybe someone with more experience (this is my first Wikipedia post) could verify this and modify the article.

Thanks in advance. Jwk3402 13:04, 2 April 2007 (UTC)

Other names[edit]

Is discrimination tree another name for a trie? Wikipedia doesn't have an entry for D. T. and Google isn't helping. -- Ralph Corderoy 11:50, 28 May 2007 (UTC)

Huffman coding[edit]

Maybe Huffman coding should be added to "See also", since the basic idea seems to be very similar to me. Subwy (talk) 20:22, 13 August 2008 (UTC)

Yeah sort of. You can interpret a Huffman tree as being a trie, and Huffman coding as describing the search paths followed when you look up a sequence of characters in that trie. It's not a bad analogy. Dcoetzee 20:43, 13 August 2008 (UTC)
I too think so, I am adding it to the article. --Jyoti (talk) 06:46, 15 November 2008 (UTC)

Why did Edward Fredkin choose that word?[edit]

Since he pronounced it as a homophone of ‘tree’, didn't he realize that it was a pretty stupid choice, because that would make it impossible to distinguish the two words in speech? If he was so desperate to combine ‘tree’ and ‘retrieve’, surely he could have done better? Shinobu (talk) 22:06, 5 October 2008 (UTC)

Example for implicit branch labels?[edit]

The article currently states that the label of the branches is often implicit in the ordering of the branches. I'm trying to think of a good example of this but all I can think of are binary prefix codes, which sort of use binary lookup trees anyway. All the examples in the article seem to assume every node has a map of its children (though my Haskell isn't all that up to scratch). Perhaps an explicit example would be useful... Wppds (talk) 09:05, 8 January 2009 (UTC)

If keys were made up of US-ASCII lower-case letters, and each node had an array of 26 child pointers, you wouldn't have to say "this character is a 'c'", because you could know that from the pointer to the node being in the 3rd position. — Preceding unsigned comment added by JimJJewett (talkcontribs) 21:51, 7 May 2013 (UTC)
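
A sketch of that layout (assuming lower-case US-ASCII keys; the names are illustrative):

    # The branch label is implicit in the slot position: slot 0 means
    # 'a', slot 2 means 'c', so no node stores the character itself.
    class Node:
        def __init__(self):
            self.children = [None] * 26    # one slot per letter
            self.terminal = False

    def child_for(node, ch):
        return node.children[ord(ch) - ord('a')]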

Section on lexicographic sorting using a trie is unclear and in some cases, wrong[edit]

Only a pre-order traversal can be used to sort the keys in lexicographic order. Furthermore, a lexicographic ordering of edge pointers within each node is required in order to use the pre-order traversal. In other words, the children of a node would have to be traversed in lexicographic order.

Please see this link for a reference: http://books.google.com/books?id=G_U8ghxhT4wC&pg=PA379&dq=lexographic+sorting+using+trie
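
A sketch of that traversal (assuming nodes with a children dict and a terminal flag; sorted() supplies the lexicographic ordering of the edges):

    # Pre-order traversal, visiting children in lexicographic order,
    # yields the stored keys in sorted order.
    def sorted_keys(node, prefix=""):
        keys = [prefix] if node.terminal else []
        for ch in sorted(node.children):
            keys.extend(sorted_keys(node.children[ch], prefix + ch))
        return keys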

Calvin (talk) 17:22, 18 February 2009 (UTC)

Lexicographic ordering and post-order traversal[edit]

Calvin is right: the article mistakenly claims that post-order traversal of the trie will result in lexicographically-reversed output. That's quite misleading and wrong. A better, pedagogical approach might be to consider an Euler traversal (generalized to trees with more than two children) in this part of the discussion. An Euler tour of a tree structure covers both the pre-order and post-order traversal of the tree in the same pass. Pre-order events can be used to construct and output the strings in lexicographic order, but you also need to capture post-order events in order to know when to remove characters from the prefix string.
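
A sketch of that idea (illustrative; assumes nodes with a children dict and a terminal flag, and that the empty string is not a stored key): pre-order events extend the prefix and emit keys, post-order events trim it.

    # Iterative Euler tour: each node is handled on the way down
    # (pre-order: extend the prefix, maybe emit a key) and on the way
    # back up (post-order: drop its character from the prefix).
    def sorted_keys(root):
        out, prefix = [], []
        stack = [("pre", "", root)]
        while stack:
            event, ch, node = stack.pop()
            if event == "post":
                prefix.pop()               # leaving node: trim prefix
                continue
            if ch:                         # root carries no character
                prefix.append(ch)
                stack.append(("post", ch, node))
                if node.terminal:
                    out.append("".join(prefix))
            # push children in reverse so the smallest is popped first
            for c in sorted(node.children, reverse=True):
                stack.append(("pre", c, node.children[c]))
        return out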

--Here2L8 (talk) 20:26, 5 April 2009 (UTC)

trie = DFA ?[edit]

Other than the confusing name, how is this different from a deterministic finite automaton? 12.108.187.222 (talk) 20:25, 7 May 2009 (UTC)

It's a special case of a DFA -- precisely, it's a DFA whose transition graph is a tree. --196.215.6.126 (talk) 14:35, 17 July 2010 (UTC)
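
To make the correspondence concrete, here is a sketch (illustrative naming, not from the article) that numbers the trie nodes to obtain explicit DFA states: transitions follow child edges, and accepting states are the terminal nodes.

    # delta maps (state, character) -> state; state 0 is the start.
    def trie_to_dfa(root):
        delta, accepting, states = {}, set(), [root]
        i = 0
        while i < len(states):
            node = states[i]
            if node.terminal:
                accepting.add(i)              # end of a key = accepting
            for ch, child in sorted(node.children.items()):
                delta[(i, ch)] = len(states)  # transition on ch
                states.append(child)
            i += 1
        return delta, accepting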

sorting[edit]

In the sorting section, the two statements

Pre-order traversal is a kind of depth-first traversal. In-order traversal is another kind of depth-first traversal that is more appropriate for outputting the values that are in a binary search tree rather than a trie.

are confusing. Should one of them be breadth-first traversal? Kowloonese (talk) 17:37, 30 October 2009 (UTC)

Compared to hash tables[edit]

The article currently says, “[…] tries have the peculiar feature that the time to insert, or to delete or to find is almost identical … (unlike say hash tables where deletion is extremely fast, but finding and especially insertion is considerably slower).“ The part in parentheses makes no sense. First of all, finding must be performed before deletion so deletion cannot be faster (and isn’t). It also shouldn’t be faster than insertion, neither in open nor in closed addressing. I’m removing that part since it’s wrong (or at best easily misunderstood), and also unnecessary. —Preceding unsigned comment added by 77.186.145.151 (talk) 15:51, 18 July 2010 (UTC)

To be fair I could interpret this as referring to deletion via an iterator, which could maintain a pointer to the data item, in which case it does appear to represent an actual (but relatively unimportant) distinction between tries and hash tables. I support removing it. The distinction of actual importance is the prefix sharing by tries. Dcoetzee 18:33, 18 July 2010 (UTC)
I think the phrase in parentheses is simply reversed -- hash tables may take extraordinarily long to delete an element if using a closed hash, because otherwise the entire table needs to be re-hashed so that probing still works: subsequent searches need to know that they should consider the deleted table entry a possible collision. This can be avoided by using tombstone values, or by instead using a chained/open hash table. —Preceding unsigned comment added by 216.239.45.4 (talk) 19:24, 25 August 2010 (UTC)
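
A sketch of the tombstone approach in a linear-probing table (illustrative, and assuming the table is never completely full):

    DELETED = object()   # tombstone: slot was occupied once

    # table: fixed-size list of None / DELETED / (key, value) slots.
    # Writing a tombstone instead of None keeps later probe chains
    # intact, so the table does not have to be re-hashed on delete.
    def delete(table, key):
        i = hash(key) % len(table)
        while table[i] is not None:          # None = never used: miss
            if table[i] is not DELETED and table[i][0] == key:
                table[i] = DELETED
                return True
            i = (i + 1) % len(table)
        return False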

Is a trie a data structure or an algorithm?[edit]

The opening sentence says a trie is a data structure. But then the first section is headed Advantages relative to other search algorithms, and it begins, "Unlike most other algorithms, tries...".

So is a trie a data structure or an algorithm? On this talk page under trie = DFA ? an anonymous poster concludes that a trie is a DFA. That seems to be more accurate than either data structure or algorithm. IOLJeff (talk) 14:27, 3 July 2011 (UTC)

It's a data structure, of course: it stores data. Like other data structures, it comes with algorithms for accessing (searching) and updating it - that's what the first section seems to be talking about. You could view a trie as a DFA, but note that you can insert or remove strings from a trie at any time, and this results in different DFAs, so this viewpoint applies only to immutable tries. -- X7q (talk) 18:18, 3 July 2011 (UTC)

Advantages/Disadvantages mess[edit]

The article is currently a real mess when it comes to trying to understand why and when tries are used instead of hash tables, for example. There is the "advantages" section, and there is the "instead of other data structures" section, both trying to say the same thing, but both completely missing the point:

Tries are mostly (or only?) used to hold a relatively dense list of words - it could be a spell-checking dictionary (the only good example given in this article), a routing table (see Patricia tree, a variant of the trie), and so on. The advantages of tries over hash tables here are: 1. they often take less memory (as whole words aren't stored); 2. they are MUCH faster to build given a sorted word list (as well as easily producing a sorted word list); 3. while hash tables can only do exact word matching, tries make it easy to do longest-prefix matching, approximate string matching, and so on.
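
For instance, longest-prefix matching (point 3) falls out of a single walk down the trie; a sketch, assuming nodes with a children dict and a terminal flag (and that the empty string is not a stored key):

    # Walk down the trie, remembering the last terminal node passed:
    # that is the longest stored key that prefixes the query.
    def longest_prefix(root, query):
        node, best = root, None
        for i, ch in enumerate(query):
            if ch not in node.children:
                break
            node = node.children[ch]
            if node.terminal:
                best = query[:i + 1]
        return best    # None if no stored key prefixes the query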

195.110.40.7 (talk) 08:47, 13 March 2012 (UTC)

There are also some statements in there for which we have to make, let's say, charitable assumptions in order for them to be true at all - for instance, the claim that searching a binary tree takes O(m log n) time, which assumes that comparing a key takes O(m) time. Elsewhere the article describes hash tables as constant-time, which requires the opposite (and more customary) assumption that key-sized operations are constant-time. I suggest removing the entire advantages/disadvantages section. It does not seem to reflect nor contribute to a good understanding of what's actually going on in these data structures, and it is uncited. 130.179.29.61 (talk) 19:18, 15 March 2013 (UTC)

Please verify the "A compiling Python version" version[edit]

The code uses a defaultdict, but then the algorithms sometimes rely on getting a KeyError, which is precisely what a defaultdict with a factory prevents. Perhaps setdefault would work better.
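
To spell out the problem (a sketch, not the article's exact code): with a defaultdict, merely probing for a child silently creates it, so a lookup can never raise the KeyError the algorithms depend on; a plain dict with setdefault keeps insertion convenient without breaking lookup.

    from collections import defaultdict

    # With a defaultdict, lookup mutates the trie: missing children
    # are created on the spot, so the expected KeyError never fires
    # (and every failed search quietly grows the structure).
    node = defaultdict(dict)
    _ = node['x']              # side effect: child 'x' now exists

    # A plain dict restores the distinction: setdefault for inserts,
    # ordinary indexing (raising KeyError on a miss) for lookups.
    node = {}
    node.setdefault('x', {})   # insert path: create child if missing
    try:
        node['y']              # lookup path
    except KeyError:
        pass                   # a miss is detectable again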

JimJJewett (talk) 21:59, 7 May 2013 (UTC)

The "compiling Python version" compiles but doesn't work indeed :) Like you say, the "remove" never hits the False clause because of the factory passed to defaultdict, and prefix goes out of range when going down to empty string. I've re-written it to work as folows, although the prefix method should be removed, or the lookup method should be re-written not to cache searches so that they make sense together. BACbKA (talk) 19:04, 12 May 2013 (UTC)

Since nobody seems to care, I'm moving my code into the article, even though I haven't addressed the prefix vs lookup issue. At least it works now... Thanks for the alert, JimJJewett! BACbKA (talk) 21:07, 11 June 2013 (UTC)

Confusion about keys[edit]

Would be clearer to change: Unlike a binary search tree, no node in the tree stores the key associated with that node

to

Unlike a binary search tree, no node in the tree stores the entire key associated with that node --129.42.208.183 (talk) 22:00, 20 January 2014 (UTC)

Your formulation would suggest that nodes store part of the key. That's not true: it's the edges (links) that encode the keys, not the nodes. The nodes only carry values or "final" bits. QVVERTYVS (hm?) 22:33, 20 January 2014 (UTC)