Talk:AA tree

From Wikipedia, the free encyclopedia

is this right?[edit]

I'm not sure this is right: "AA trees require O(log(N)) bits of metadata per node, in the form of an integer 'level'." If this were true, then to store 32 billion nodes or so I'd need a 32-bit level. But what is the level, anyway? It's basically how many child nodes you've got until you hit a leaf node (with leaf nodes having a value of 1), if I'm not mistaken. Don't take my word for it since I am learning this for the first time, but I'm thinking a balanced binary search tree with 4 billion nodes can have a maximum level of 64 or so, i.e. roughly double log(N). If we ignore the estimated factor of 2 (I could be way off here, but this is what I think) then the number of bits required by the level should be O(log(log(N))). This article is in critical need of the attention of an expert. What kind of complexity guarantees do AA trees have? It is not always perfectly balanced, but in what way does this affect performance? Is this article the same as the trees listed here? http://eternallyconfuzzled.com/tuts/datastructures/jsw_tut_andersson.aspx
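The arithmetic above can be checked mechanically. A minimal Python sketch, assuming the standard AA-tree property that a node at level k roots a subtree of at least 2**k − 1 nodes (so the root's level is at most log2(n + 1)); the function names are illustrative, not from the article:

```python
import math

def max_level(n):
    # A node at level k roots a subtree of at least 2**k - 1 nodes
    # (assumed here; it follows from the correspondence with 2-3 trees),
    # so the root's level is at most log2(n + 1).
    return math.floor(math.log2(n + 1))

def level_bits(n):
    # Bits needed to store any level value for an n-node tree.
    return max(1, math.ceil(math.log2(max_level(n) + 1)))

# For ~4 billion nodes the level is at most 32 and fits in 6 bits:
print(max_level(2**32 - 1), level_bits(2**32 - 1))  # -> 32 6
```

So the level itself is O(log(N)), but the storage for it is O(log(log(N))) bits, which supports the comment's objection.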

aa[edit]

There seem to be lots of errors in this article: Insert does not have a while loop. Insert compares the new data item with ones already in the tree, so it does not cater for data types without an ordering. -Nic Roets 09:53, 15 July 2006 (UTC)

Recursion[edit]

I see one of the editors argued that recursion should be freely used in the example code, and the code I replaced used recursion. My opinion is that recursion should NOT be used in the sample code because a reader may just want to implement the code in assembly. Recursion may also significantly increase the space requirements of the code.

Comment on above: The AA tree's major feature is algorithmic simplicity. The current "assembler-friendly non-recursive" example code does the public a major disservice by obfuscating that. I recently implemented an AA tree using online resources, and the code on the current Wikipedia page was totally useless for either understanding the algorithm or helping to write the code. I think whether the example code is recursive or non-recursive does not matter at all. The only thing that matters is whether the code illustrates the underlying algorithm. And there the current code fails absolutely. —Preceding unsigned comment added by 217.159.130.170 (talk) 12:25, 16 October 2007 (UTC)

If people want to implement in assembly there are other resources on the web. nishantjr (talk) 18:39, 11 December 2008 (UTC)
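To make the simplicity argument concrete, here is a short recursive sketch in Python (the names Node, skew, split, and insert are illustrative, not taken from the article's code):

```python
class Node:
    def __init__(self, key, level=1, left=None, right=None):
        self.key, self.level, self.left, self.right = key, level, left, right

def skew(t):
    # Rotate right to remove a horizontal left link.
    if t and t.left and t.left.level == t.level:
        l = t.left
        t.left, l.right = l.right, t
        return l
    return t

def split(t):
    # Rotate left and promote to remove two consecutive horizontal right links.
    if t and t.right and t.right.right and t.level == t.right.right.level:
        r = t.right
        t.right, r.left = r.left, t
        r.level += 1
        return r
    return t

def insert(t, key):
    # Plain recursive BST insert, rebalancing on the way back up.
    if t is None:
        return Node(key)
    if key < t.key:
        t.left = insert(t.left, key)
    elif key > t.key:
        t.right = insert(t.right, key)
    return split(skew(t))

root = None
for k in range(1, 8):
    root = insert(root, k)
```

Inserting 1 through 7 in order yields a perfectly balanced tree rooted at 4 with level 3, which is arguably the clearest demonstration of why the recursive form is favored for exposition.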

Pseudocode[edit]

I totally agree with the above. Look at the Wikipedia page on algorithms. This stuff should be explained at a high level. I think this page should be rewritten with pseudocode, which is what I may or may not be about to do, depending on whether or not I can follow through on it. I just implemented the data structure in Haskell, and this page's description of the algorithm basically conflicts with the description of the original inventor when it comes to node deletion. —Preceding unsigned comment added by 18.248.3.158 (talk) 20:59, 9 December 2007 (UTC)

I rewrote it according to the above. I deleted the example deletion, as pretty as it was, because it was no longer consistent with the algorithm described on the page.

Rewrite[edit]

I think the recent edits (176876788 onwards) are a step backwards. I approve of the move to pseudocode in principle but I don't think it's been done well. Criticisms of the page rewrite include:

  • Removal of the brief definition at the beginning of the article is unhelpful
  • The 'five invariants' given do not insist that the level of a node must be at most one less than the level of its parent
  • The pseudocode for skew and split implies that the entire tree is passed as the parameter
  • Insertion: 'the increment of level(T)' is not the only thing that could necessitate recursion up the tree
  • I don't like the pseudocode in general. It's not really pseudocode anyway as it's full of implied functions. e.g. nil(T) should be phrased as "if T is the empty tree". Also what does node(X, 1, Nil, Nil) mean? See http://en.wikipedia.org/wiki/Pseudocode#Examples_of_pseudocode.
  • Insert pseudocode: there's no need to assume the existence of the functions 'successor' and 'predecessor'. AA trees need not be threaded.
  • The preamble before the deletion pseudocode is totally misleading. There is certainly no need to skew and split "the entire level" - remember the operations are all O(log(n)) for a tree of n nodes. It's unnecessary to criticize Andersson's deletion algorithm as being unoptimised. The sentence "This approach was favored, because when laid down conceptually, it has three easily understood separate steps:" sounds like it's taken from a dissertation and is not neutral. Also "easily understood" is likely to make the reader feel inadequate. Even for people who have a thorough grounding in advanced data structures, it is by no means obvious that you need at most three skews and two splits during each iteration of the rebalance after deletion.

I suggest a revert. Dragonmeat (talk) 23:43, 10 December 2007 (UTC)
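For reference, the "three skews and two splits" step mentioned above can be sketched in Python as follows. This is a self-contained sketch, not the article's code: decrease_level and rebalance are illustrative names, and the standard skew/split helpers are restated so it runs on its own:

```python
class Node:
    def __init__(self, key, level=1, left=None, right=None):
        self.key, self.level, self.left, self.right = key, level, left, right

def level(t):
    return t.level if t else 0  # missing children count as level 0

def skew(t):
    # Rotate right to remove a horizontal left link.
    if t and t.left and t.left.level == t.level:
        l = t.left
        t.left, l.right = l.right, t
        return l
    return t

def split(t):
    # Rotate left and promote to remove two consecutive horizontal right links.
    if t and t.right and t.right.right and t.level == t.right.right.level:
        r = t.right
        t.right, r.left = r.left, t
        r.level += 1
        return r
    return t

def decrease_level(t):
    # After a deletion below, t's level may be too high; pull it down,
    # capping the right child's level as well.
    should_be = min(level(t.left), level(t.right)) + 1
    if should_be < t.level:
        t.level = should_be
        if should_be < level(t.right):
            t.right.level = should_be

def rebalance(t):
    # At most three skews and two splits restore the invariants at t.
    decrease_level(t)
    t = skew(t)
    if t.right:
        t.right = skew(t.right)
        if t.right.right:
            t.right.right = skew(t.right.right)
    t = split(t)
    if t.right:
        t.right = split(t.right)
    return t

# Example: node 5 just lost its left child, so its level (2) is too high.
t = Node(5, 2, None, Node(10, 1, Node(7), Node(15)))
t = rebalance(t)
```

On this example, rebalance demotes 5 to level 1, skews 10 over 7, then splits, leaving a valid subtree rooted at 7 on level 2.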

Rewrite can be done[edit]

The pseudocode is difficult to understand. I implemented the tree in Haskell (which seems to outperform Data.Map! - still testing), but it wasn't as simple as stated. This wasn't because the tree is very complex; the pseudocode contains a lot of implicit assumptions. For example, in the function split:

        We have two horizontal right links.  Take the middle node, elevate it, and return it.
        R = right(T) 
        right(T) := left(R)
        left(R) := T
        level(R) := level(R) + 1
        return R

Note: := is an assignment (the pointer is dereferenced), and = is a pointer assignment. (A hint can be found in skew.)

R = right T -- save a pointer to the right node
right T  = left R -- The right node of T is replaced by the left node of R
left R  = T -- The left node of R is replaced by T
level R  = level R + 1 -- The level of R is incremented by 1
return R -- Return the modified right node

I didn't spot := and = at first, so especially the level R = level R + 1 line became somewhat difficult to grasp. Do I have to change the level of right(T), the original value, or R, the modified value? This is partially because in Haskell := and = are the same operation: they replace a value. I had to implement the tree to understand the code, and I tested it rather thoroughly to spot the bug.
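The distinction becomes unambiguous in a language with explicit mutation. Here is a Python sketch of the quoted split (illustrative, and assuming, as the pseudocode does, that the two-horizontal-right-links precondition already holds):

```python
class Node:
    def __init__(self, key, level=1, left=None, right=None):
        self.key, self.level, self.left, self.right = key, level, left, right

def split(t):
    # Precondition (assumed, as in the quoted pseudocode): t, right(t)
    # and right(right(t)) form two horizontal right links.
    r = t.right          # R = right(T): plain '=' just saves the pointer
    t.right = r.left     # right(T) := left(R): ':=' updates a field ...
    r.left = t           # left(R) := T: ... through that pointer
    r.level += 1         # level(R) := level(R) + 1: R is still the very
    return r             # same node as right(T); only its children changed

# Two horizontal right links 1 -> 2 -> 3, all on level 1:
a, b, c = Node(1), Node(2), Node(3)
a.right, b.right = b, c
root = split(a)
```

This answers the question above directly: R and right(T) name the same node throughout, so incrementing level(R) after the rotations is the same as incrementing the level of the original right child.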

Here is the situation in Haskell:

 
split :: AATree a b -> AATree a b
split t@(Node d a b l (Node dr ra rb rl rr@(Node drr _ _ _ _)))
    | d == drr  = Node (dr + 1) ra rb (Node d a b l rl) rr
    | otherwise = t
{-- Catch all other cases --}
split t = t

-- Reading the pattern against the pseudocode: (Node dr ra rb rl rr) is
-- right(T) = R, l is left(T), rl is left(right(T)) = left(R), and
-- rr is right(right(T)) = right(R). The guard d == drr compares level(T)
-- with level(right(right(T))). In the result, (dr + 1) is
-- level(R) := level(R) + 1; if you didn't catch the difference between
-- := and =, the + 1 could just as easily be placed on the wrong node.

In Haskell it is easy to overlook the difference between := and =. There are probably more languages which don't have pointers or references.

Besides that, the algorithm seems to be correct. It nicely matches the behaviour of a binary search tree.

I became somewhat overly verbose, but what I wanted to say is that I don't think a revert is necessary. The page just needs some explanation.

--195.241.100.248 (talk) 17:13, 19 July 2010 (UTC)

Name[edit]

The article should start by mentioning what AA stands for; I am guessing Arne Andersson but I don't know. 80.203.66.120 (talk) 19:31, 8 August 2008 (UTC)

I added that, as I saw that it had previously been removed from the article without a stated reason. 80.203.66.120 (talk) 21:17, 8 August 2008 (UTC)

Wrong assumption regarding removal of internal nodes[edit]

"Because of the AA property of all nodes of level greater than one having two children, the successor or predecessor node will be in level 1, making their removal trivial."

This sentence is either false or misleading (depending on the meaning of "or"). The successor/predecessor does not have two children, but it might still have one, making removal nontrivial. Removing an internal node may involve two swaps with a successor/predecessor. For instance, in the following tree, removing 0 may require two swaps (0 with 1 and 1 with 2):

    0
   / \
 -1   1
       \
        2

Koning robot (talk) 08:05, 25 May 2011 (UTC)
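The two-swap behaviour is easy to reproduce. A minimal Python sketch of successor/predecessor deletion (illustrative; the AA rebalance step is omitted since it is orthogonal to the point about swaps):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def delete(t, key):
    if t is None:
        return None
    if key < t.key:
        t.left = delete(t.left, key)
    elif key > t.key:
        t.right = delete(t.right, key)
    elif t.left is None and t.right is None:
        return None                       # a leaf: remove it directly
    elif t.right is not None:
        s = t.right                       # find the in-order successor ...
        while s.left:
            s = s.left
        t.key = s.key                     # ... first swap (0 takes 1's key)
        t.right = delete(t.right, s.key)  # recurse: may trigger a second swap
    else:
        p = t.left                        # only a left child: use predecessor
        while p.right:
            p = p.right
        t.key = p.key
        t.left = delete(t.left, p.key)
    return t

# The tree from the comment: deleting 0 cascades into two swaps (0<-1, 1<-2).
root = Node(0, Node(-1), Node(1, None, Node(2)))
root = delete(root, 0)
```

The recursive call on the successor's key handles the case the comment raises: the successor sits at level 1 but may still have a right child, so its own removal can require a further swap rather than being trivial.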