Talk:Radix tree

From Wikipedia, the free encyclopedia
Jump to: navigation, search
          This article is of interest to the following WikiProjects:
WikiProject Computing (Rated C-class, Mid-importance)
WikiProject icon This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
C-Class article C  This article has been rated as C-Class on the project's quality scale.
 Mid  This article has been rated as Mid-importance on the project's importance scale.
 
WikiProject Computer science (Rated C-class, Mid-importance)
WikiProject icon This article is within the scope of WikiProject Computer science, a collaborative effort to improve the coverage of Computer science related articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
C-Class article C  This article has been rated as C-Class on the project's quality scale.
 Mid  This article has been rated as Mid-importance on the project's importance scale.
 

tree[edit]

I'm deleting a couple of claims that predecessor/successor are O(1). Consider a radix tree with contents {0, 00, 000, ..., 0...0, 1}, where we have strings of 0's of length 1 through n. Then the straightforward predecessor algorithm would take O(n) time to find the predecessor of "1". (You could do it in O(1) if you also threaded a doubly-linked list through all your leaves, but the same could be said for any binary search tree.) If there's some clever algorithm I'm not thinking of, could you cite it (or at least describe the algorithm, possibly here on the Talk page) before reinstating these claims? Thanks, Cwitty 21:31, 7 October 2006 (UTC)

This counterexample applies to all dictionary data structures. For example, suppose you add n keys of length n to a hash table, and then do a lookup. Even after you've found the right hash bucket, you still need n operations to compare your key to the existing key. So it's O(n). Except it's not: hash table lookup is O(1). The mistake is to confound the key length and the number of elements. In typical practice key lengths are effectively constant; they don't dependent on the number of elements in your dictionary.
In a radix tree the obvious predecessor and successor algorithms take time proportional to the key length, but this is effectively constant, so they are O(1). Reilly 14:16, 14 November 2006 (UTC)
The claim that predecessor and successor algorithms run in constant time is probably based on a similar claim for trees in general. These algorithms actually run in *amortized* O(1) time. For radix trees, the number of nodes is linear in the number of entries, so it is correct to say that predecessor and successor can be computed in *amortized* O(1) time and not some runtime dependant on the length of the entries. For prefix trees this would not be true. 68.191.211.204 (talk) 13:44, 15 December 2009 (UTC)
Not really. Hash algorithm takes a key and gives you the address, no intermediate lookups required, it's just one lookup operation (unless there are clashes). In a radix tree you do active lookups, without intermediate lookups you are going nowhere. Compare this with slow disk lookups: hash would compute in active memory (or even CPU registers) and give you the address immediately (just go and fetch it), while with radix tree you will have to access the disk (slowly!) many times because index may not fit into active memory (certainly not CPU registers). And please remember that O() notation is used when n is big (hence assumption that index does not fit in memory). Note that famous O(1) scheduler in Linux is a misnomer, it is really of O(log(n)) complexity (it's just process id space is limited to 16bit space -- the same argument you tried here).
Actually, the argument is valid: actually computing the hash function is O(k), where k is the key length, same as the radix tree lookup — this is just glossed over when people usually think about hash tables, because dictionary key lengths are usually small(ish); but they're definitely a lot more expensive than doing bitwise operations and integer addition which are the basic operations in radix tree algorithms. Wtrmute (talk) 23:11, 23 September 2008 (UTC)

Could anyone please explain the meaning of zeros and ones in the picture of Overview? There's is also "song" word which is not in the tree at the top of the article which messes everything. —Preceding unsigned comment added by 130.225.195.72 (talk) 13:23, 17 July 2008 (UTC)

This page have many errors. One of the most blatant is that PATRICIA-trie is not a RADIX tree and it does not look the way it is described here. PATRICIA-tree uses lossy path-compression. See, e.g. the original paper of Morrison, the work Gwehenberger, where he describes a similar data-structure and the 3d Book of the Arts of Computer Programming. Moreover, the RADIX tree that is the main name of this entry is not widely-accepted computer term. Instead, it is a colloquialism that occurred in the mid-eighties.

I agree - patricia trees are definitely something else than regular tries with strings labeling the edges instead of characters. After I got my hands on the original paper, this article started looking downright misleading. This article either needs a complete rewrite, or a new article that describes the real PATRICIA index structure is needed.
--Cynebeald (talk) 18:30, 16 March 2009 (UTC)
I also agree, this page is very misleading in that it implies that a PATRICIA trie stores strings verbatim, it does not so does not allow you to enumerate stored strings, or to be sure you have an exact match (see here for an example: www.cs.umd.edu/~meesh/420/Notes/MountNotes/lecture25-tries.pdf). Also the pseudo-code is so vague in how 'is a prefix of' is determined, which is the critical part of the algorithm, to be useless. You would need some secondary index for the outbound edges of a node for this to be efficient, and it is is completely unclear what that is. In fact I have been unable to find anything that looks very much like this in the literature; a burst trie stores multi-character nodes in containers. I charged ahead and implemented something very odd on the basis of reading this article. A complete rewrite would be good, at the moment this article's presence does more harm than good. Silasdavis (talk) 12:30, 5 June 2014 (UTC)

Diagram[edit]

I have added a simple diagram (and I removed reqdiagram) . User:rocchini 2007-05-17

The nice diagram given at the front gives the misleading impression that each node always has two children. That would be true if you branch on bits; but if branching on characters there can be as many children as there are letters in your character set. Bhyde (talk) 15:52, 13 March 2010 (UTC)

Agreed. The diagram at the top of the article has a poor choice of sample strings, suggesting there must always be two children. -- Ralph Corderoy (talk) 11:34, 8 August 2010 (UTC)

How would the strings that don't end at a leaf eg: "roman" be indicated? 210.48.82.11 (talk) 00:38, 14 December 2010 (UTC)

Second Diagram[edit]

The second diagram depicts a structure that is not a tree. It is also not a trie, prefix tree, or radix tree. The article provides no explanation for the image. 68.191.211.204 (talk) 13:44, 15 December 2009 (UTC)

Nomenclature[edit]

This article should be named "Radix trie" not "Radix tree", because Patricia tries and radix tries are tries, not (search) trees. This article never once mentions that Patricia tries are also called Radix-2 tries. Indeed, the article never once mentions why radix tries are called radix tries: you can choose binary, quaternary, octary, and so forth through the higher-order powers of two radixes. At each node you can diverge to two children via the 2 values of the next 1 bit (i.e., Patricia trie = Radix-2 trie) or you can diverge to four children via 4 values of the next 2 bits (i.e., Radix-4 trie) or you can diverge to eight children via 8 values of the next 4 bits (i.e., Radix-8 trie). This article does not clearly present the fact that the entire concept of the Internet's routing is based on radix tries, especially inter-autonomous-system protocols such as BGP. —optikos (talk) 06:38, 5 May 2010 (UTC)

I am not sure why an invented name (eg. crit-bit trees) for these tries has been injected into this article. It is not a common name, nor particularly important among other re-inventions of this structure over the years. Oyigit (talk) 14:50, 30 July 2010 (UTC)

Rather, one should ask why a mostly *unrelated* tree structure has been injected into this article. Crit-bit trees are **not** tries, but an optimized flavour of binary search trees. A crit-bit tree is a binary search tree which only branches at every bit position where two subsets of previously inserted keys differed, according to the bit's value in the newly inserted key. Unlike a trie, it therefore does not contain the full key in its structure (it is perfectly possible, and valid, for a key to differ in one or several bits with any previously seen key, before the "critical" bit) and needs to store the key in leaf nodes for an unambiguous lookup. — Preceding unsigned comment added by 92.192.19.64 (talk) 14:32, 24 July 2012 (UTC)