AVL tree

From Wikipedia, the free encyclopedia

An example of a non-AVL tree

In computer science, an AVL tree is a self-balancing binary search tree, and the first such data structure to be invented. In an AVL tree the heights of the two child subtrees of any node differ by at most one; for this reason it is also said to be height-balanced. Lookup, insertion, and deletion all take O(log n) time in both the average and worst cases. Additions and deletions may require the tree to be rebalanced by one or more tree rotations.

The AVL tree is named after its two inventors, G.M. Adelson-Velsky and E.M. Landis, who published it in their 1962 paper "An algorithm for the organization of information."

Principles

As nodes are added and deleted, each node keeps track of how deep its left and right subtrees are.

So, for example, a two-node tree with a root and one child (which we will say is on the left) will have a root whose left subtree depth is 1 and whose right subtree depth is 0. If we then add a child to that child, the root's left subtree depth becomes 2. (Do not confuse the depth of a subtree with the count of the number of children.)

For any node, if the subtree depths are identical or differ only by one, then the tree at that point is highly balanced (not perfectly, but as perfectly as AVL can provide). If, however, the difference becomes two (and remember that only one node is added at a time, so subtree depths always change in steps of one), then some rebalancing work can be done which shifts a node over from the deeper side to the shallower side.
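For illustration only (the article itself gives no code), here is a minimal Python sketch of such a node; the names AVLNode, height, update_height and balance are invented for this example, and a single cached subtree height stands in for separate left and right depth counts:

 class AVLNode:
     """Binary search tree node that caches the height of its own subtree."""
     def __init__(self, key):
         self.key = key
         self.left = None
         self.right = None
         self.height = 1          # a lone leaf has height 1; an empty subtree counts as 0

 def height(node):
     """Height of the subtree rooted at node (0 for an empty subtree)."""
     return node.height if node is not None else 0

 def update_height(node):
     """Recompute the cached height from the children's heights."""
     node.height = 1 + max(height(node.left), height(node.right))

 def balance(node):
     """Left height minus right height; an AVL tree keeps this in {-1, 0, +1}."""
     return height(node.left) - height(node.right)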

The same tree after being height-balanced

Operations

The basic operations of an AVL tree involve carrying out the same algorithms as would be carried out on an unbalanced binary search tree, followed by some fix-up work which keeps the tree balanced.

Insertion

Insertion into an AVL tree is carried out by inserting the given value into the tree as if it were an unbalanced binary search tree. The new node will have affected the depth counts of some or all of the nodes above it in the tree, so we then move up the tree from the new node, correcting the depth count values.

Once this has been done, we then start once more at the new node and move up the tree; when we find a node which now has a difference between its left and right depths of 2, we need to do some rebalancing.
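As a sketch of how this might look (again hypothetical Python, building on the AVLNode helpers above; the rebalance helper is sketched further down where rotations are discussed):

 def insert(node, key):
     """Insert key as in a plain BST, then fix heights and balance on the way back up.
     Returns the (possibly new) root of this subtree."""
     if node is None:
         return AVLNode(key)               # new nodes always enter the tree as leaves
     if key < node.key:
         node.left = insert(node.left, key)
     else:
         node.right = insert(node.right, key)
     update_height(node)                   # correct this node's depth count
     return rebalance(node)                # rotate here if the difference has become 2

This recursive form checks every ancestor on the way back up, but for insertion at most one of those checks will actually trigger a rotation, as noted below.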

The purpose of rebalancing is to shift one node from the overly-deep subtree into the shallower subtree, which of course has the effect of rebalancing the tree.

We do this by performing tree rotation.
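The two basic rotations could be sketched as follows (same hypothetical Python as above; rotate_right shifts weight from a too-deep left side over to the right, and rotate_left is the mirror image):

 def rotate_right(node):
     """Make node's left child the new root of this subtree; node becomes its right child."""
     pivot = node.left
     node.left = pivot.right       # pivot's right subtree crosses over to node
     pivot.right = node
     update_height(node)           # node is now the lower of the two, so update it first
     update_height(pivot)
     return pivot                  # new root of this subtree

 def rotate_left(node):
     """Mirror image: make node's right child the new root of this subtree."""
     pivot = node.right
     node.right = pivot.left
     pivot.left = node
     update_height(node)
     update_height(pivot)
     return pivot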

Now, a node with a difference of 2 between its left and right subtree depths necessarily has a grandchild: the most extreme case is where the node has no children on one side and a child and grandchild on the other. The new node is of course the grandchild, since all new nodes enter the tree as leaves.

The purpose of the rotation we perform is to balance the subtree involving the new node, its parent and its grandparent. We know this subtree is unbalanced because the grandparent has a difference of 2 between the depths of its left and right children.

The rotations needed to balance this subtree depend on the positions of the new node and its parent, because the rotations required depend on the arrangement of that subtree.

The easiest case is when the parent and child are on the same side; that is, the new node is on the same side of its parent as its parent is of the grandparent.

Here (taking the case where both are on the left), we simply rotate the grandparent right and we have balanced the subtree; the mirror case is handled with a left rotation.

The more complicated case is when the new node is on the opposite side of its parent from the side its parent is on under the grandparent.

Here, we have to rotate the parent left first, before rotating the grandparent right. We need this first rotation because our goal is to rebalance the subtree, but the new node is greater than its parent (and less than the grandparent) and so needs to end up as the parent of both of them. Rotating the parent left moves the new node up the tree, so that when we then rotate the grandparent right, the new node ends up as the root of the subtree, which is correct.

Once an unbalanced node has been located and the rotation(s) performed, checking can stop; it will only ever be necessary to perform a single rebalance.
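Both cases can be combined into a single rebalancing helper; the following continues the hypothetical Python sketch (it checks one node and performs the single or double rotation described above):

 def rebalance(node):
     """If node's subtree depths differ by 2, rotate to restore balance.
     Returns the new root of this subtree."""
     if balance(node) > 1:                        # left side is two deeper
         if balance(node.left) < 0:               # opposite-side (left-right) case:
             node.left = rotate_left(node.left)   # rotate the parent left first
         return rotate_right(node)                # then rotate the grandparent right
     if balance(node) < -1:                       # right side is two deeper (mirror cases)
         if balance(node.right) > 0:              # right-left case: rotate the parent right first
             node.right = rotate_right(node.right)
         return rotate_left(node)                 # then rotate the grandparent left
     return node                                  # already balanced; nothing to do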

Deletion

Deletion is performed as for a normal binary search tree (for a node with two children, this involves finding the in-order successor and moving it into the spot occupied by the node being deleted).

The depth counts are then updated (working upwards from the parent of the in-order successor) and then the depth counts are checked (again, working upwards from the parent of the in-order successor) to see if rotations are required to balance the tree.
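A hedged sketch of deletion in the same hypothetical Python (it assumes distinct keys and reuses the helpers above; note that, unlike insertion, rotations may be needed at more than one ancestor):

 def delete(node, key):
     """Delete key as in a plain BST, then fix heights and balance on the way back up."""
     if node is None:
         return None
     if key < node.key:
         node.left = delete(node.left, key)
     elif key > node.key:
         node.right = delete(node.right, key)
     else:
         if node.left is None:                 # zero or one child: splice the node out
             return node.right
         if node.right is None:
             return node.left
         successor = node.right                # two children: find the in-order successor,
         while successor.left is not None:     # the leftmost node of the right subtree
             successor = successor.left
         node.key = successor.key              # move it into the spot being vacated
         node.right = delete(node.right, successor.key)
     update_height(node)
     return rebalance(node)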

Lookup

Lookup in an AVL tree is performed exactly as in an unbalanced binary search tree, and thus takes O(log n) time, since an AVL tree is always kept balanced. No special provisions need to be taken, and the tree's structure is not modified by lookups. (This is in contrast to splay tree lookups, which do modify their tree's structure.)
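For completeness, a lookup sketch in the same hypothetical Python; it is exactly the unbalanced binary search tree search and does not touch the structure:

 def lookup(node, key):
     """Standard binary search tree lookup; the tree is never modified."""
     while node is not None:
         if key == node.key:
             return node
         node = node.left if key < node.key else node.right
     return None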

References

  • G. Adelson-Velskii and E.M. Landis, "An algorithm for the organization of information." Doklady Akademii Nauk SSSR, 146:263–266, 1962 (Russian). English translation by Myron J. Ricci in Soviet Math. Doklady, 3:1259–1263, 1962.
  • Donald Knuth. The Art of Computer Programming, Volume 3: Sorting and Searching, Third Edition. Addison-Wesley, 1997. ISBN 0-201-89685-0. Pages 458–475 of section 6.2.3: Balanced Trees. Note that Knuth calls AVL trees simply "balanced trees".