WAVL tree

In computer science, a WAVL tree or weak AVL tree is a self-balancing binary search tree. WAVL trees are named after AVL trees, another type of balanced search tree, and are closely related both to AVL trees and red–black trees, which all fall into a common framework of rank balanced trees. Like other balanced binary search trees, WAVL trees can handle insertion, deletion, and search operations in time O(log n) per operation.

WAVL trees are designed to combine some of the best properties of both AVL trees and red–black trees. One advantage of AVL trees over red–black trees is that they are more balanced: they have height at most $\log _{\varphi }n\approx 1.44\log _{2}n$ (for a tree with n data items, where $\varphi$ is the golden ratio), while red–black trees have larger maximum height, $2\log _{2}n$ . If a WAVL tree is created using only insertions, without deletions, then it has the same small height bound that an AVL tree has. On the other hand, red–black trees have the advantage over AVL trees that they perform less restructuring of their trees. In AVL trees, each deletion may require a logarithmic number of tree rotation operations, while red–black trees have simpler deletion operations that use only a constant number of tree rotations. WAVL trees, like red–black trees, use only a constant number of tree rotations, and the constant is even better than for red–black trees.

WAVL trees were introduced by Haeupler, Sen & Tarjan (2015). The same authors also provided a common view of AVL trees, WAVL trees, and red–black trees as all being a type of rank-balanced tree.

Definition

As with binary search trees more generally, a WAVL tree consists of a collection of nodes, of two types: internal nodes and external nodes. An internal node stores a data item, and is linked to its parent (except for a designated root node that has no parent) and to exactly two children in the tree, the left child and the right child. An external node carries no data, and has a link only to its parent in the tree. These nodes are arranged to form a binary tree, so that for any internal node x the parents of the left and right children of x are x itself. The external nodes form the leaves of the tree. The data items are arranged in the tree in such a way that an inorder traversal of the tree lists the data items in sorted order.

What distinguishes WAVL trees from other types of binary search tree is its use of ranks. These are numbers, stored with each node, that provide an approximation to the distance from the node to its farthest leaf descendant. The ranks are required to obey the following properties:

• Every external node has rank 0
• If a non-root node has rank r, then the rank of its parent must be either r + 1 or r + 2.
• An internal node with two external children must have rank exactly 1.

Operations

Searching

Searching for a key k in a WAVL tree is much the same as in any balanced binary search tree data structure. One begins at the root of the tree, and then repeatedly compares k with the data item stored at each node on a path from the root, following the path to the left child of a node when k is smaller than the value at the node or instead following the path to the right child when k is larger than the value at the node. When a node with value equal to k is reached, or an external node is reached, the search stops.

If the search stops at an internal node, the key k has been found. If instead, the search stops at an external node, then the position where k would be inserted (if it were inserted) has been found.

Insertion

Insertion of a key k into a WAVL tree is performed by performing a search for the external node where the key should be added, replacing that node by an internal node with data item k and two external-node children, and then rebalancing the tree. The rebalancing step can be performed either top-down or bottom-up, but the bottom-up version of rebalancing is the one that most closely matches AVL trees.

In this rebalancing step, one assigns rank 1 to the newly created internal node, and then follows a path upward from each node to its parent, incrementing the rank of each parent node if necessary to make it greater than the new rank of its child, until one of three stopping conditions is reached.

• If the path of incremented ranks reaches the root of the tree, then the rebalancing procedure stops, without changing the structure of the tree.
• If the path of incremented ranks reaches a node whose parent's rank previously differed by two, and (after incrementing the rank of the node) still differs by one, then again the rebalancing procedure stops without changing the structure of the tree.
• If the procedure increases the rank of a node x, so that it becomes equal to the rank of the parent y of x, but the other child of y has a rank that is smaller by two (so that the rank of y cannot be increased) then again the rebalancing procedure stops. In this case, by performing at most two tree rotations, it is always possible to rearrange the tree nodes near x and y in such a way that the ranks obey the constraints of a WAVL tree, leaving the rank of the root of the rotated subtree unchanged.

Thus, overall, the insertion procedure consists of a search, the creation of a constant number of new nodes, a logarithmic number of rank changes, and a constant number of tree rotations.

Deletion

As with binary search trees more broadly, deletion operations on an internal node x that has at least one external-node child may be performed directly, by removing x from the tree and reconnecting the other child of x to the parent of x. If, however, both children of a node x are internal nodes, then we may follow a path downward in the tree from x to the leftmost descendant of its right child, a node y that immediately follows x in the sorted ordering of the tree nodes. Then y has an external-node child (its left child). We may delete x by performing the same reconnection procedure at node y (effectively, deleting y instead of x) and then replacing the data item stored at x with the one that had been stored at y.

In either case, after making this change to the tree structure, it is necessary to rebalance the tree and update its ranks. As in the case of an insertion, this may be done by following a path upwards in the tree and changing the ranks of the nodes along this path until one of three things happens: the root is reached and the tree is balanced, a node is reached whose rank does not need to be changed, and again the tree is balanced, or a node is reached whose rank cannot be changed. In this last case a constant number of tree rotations completes the rebalancing stage of the deletion process.

Overall, as with the insertion procedure, a deletion consists of a search downward through the tree (to find the node to be deleted), a continuation of the search farther downward (to find a node with an external child), the removal of a constant number of new nodes, a logarithmic number of rank changes, and a constant number of tree rotations.

It is worthwhile to compare the result of a delete which would cause rotations at multiple levels in an AVL tree with the rotation and rank changes performed in a WAVL tree. In the second image the node with value 12 has been deleted followed by a right rotation and assigning of all external nodes rank zero.

Computational complexity

Each search, insertion, or deletion in a WAVL tree involves following a single path in the tree and performing a constant number of steps for each node in the path. In a WAVL tree with n items that has only undergone insertions, the maximum path length is $\log _{\varphi }n\approx 1.44\log _{2}n$ . If both insertions and deletions may have happened, the maximum path length is $2\log _{2}n$ . Therefore, in either case, the worst-case time for each search, insertion, or deletion in a WAVL tree with n data items is O(log n).

Related structures

WAVL trees are closely related to both AVL trees and red–black trees. Every AVL tree can have ranks assigned to its nodes in a way that makes it into a WAVL tree. And every WAVL tree can have its nodes colored red and black (and its ranks reassigned) in a way that makes it into a red–black tree. However, some WAVL trees do not come from AVL trees in this way and some red–black trees do not come from WAVL trees in this way.

AVL trees

An AVL tree is a kind of balanced binary search tree in which the two children of each internal node must have heights that differ by at most one. The height of an external node is zero, and the height of any internal node is always one plus the maximum of the heights of its two children. Thus, the height function of an AVL tree obeys the constraints of a WAVL tree, and we may convert any AVL tree into a WAVL tree by using the height of each node as its rank.

The key difference between an AVL tree and a WAVL tree arises when a node has two children with the same rank or height. In an AVL tree, if a node x has two children of the same height h as each other, then the height of x must be exactly h + 1. In contrast, in a WAVL tree, if a node x has two children of the same rank r as each other, then the rank of x can be either r + 1 or r + 2. This greater flexibility in ranks also leads to a greater flexibility in structures: some WAVL trees cannot be made into AVL trees even by modifying their ranks, because they include nodes whose children's heights differ by more than one.

If a WAVL tree is created only using insertion operations, then its structure will be the same as the structure of an AVL tree created by the same insertion sequence, and its ranks will be the same as the ranks of the corresponding AVL tree. It is only through deletion operations that a WAVL tree can become different from an AVL tree. In particular this implies that a WAVL tree created only through insertions has height at most $\log _{\varphi }n\approx 1.44\log _{2}n$ .

Red–black trees

A red–black tree is a balanced binary search tree in which each node has a color (red or black), satisfying the following properties:

• External nodes are black.
• If an internal node is red, its two children are both black.
• All paths from the root to an external node have equal numbers of black nodes.

red–black trees can equivalently be defined in terms of a system of ranks, stored at the nodes, satisfying the following requirements (different than the requirements for ranks in WAVL trees):

• The rank of an external node is always 0 and its parent's rank is always 1.
• The rank of any non-root node equals either its parent's rank or its parent's rank minus 1.
• No two consecutive edges on any root-leaf path have rank difference 0.

The equivalence between the color-based and rank-based definitions can be seen, in one direction, by coloring a node black if its parent has greater rank and red if its parent has equal rank. In the other direction, colors can be converted to ranks by making the rank of a black node equal to the number of black nodes on any path to an external node, and by making the rank of a red node equal to its parent.

The ranks of the nodes in a WAVL tree can be converted to a system of ranks of nodes, obeying the requirements for red–black trees, by dividing each rank by two and rounding up to the nearest integer. Because of this conversion, for every WAVL tree there exists a valid red–black tree with the same structure. Because red–black trees have maximum height $2\log _{2}n$ , the same is true for WAVL trees. However, there exist red–black trees that cannot be given a valid WAVL tree rank function.

Despite the fact that, in terms of their tree structures, WAVL trees are special cases of red–black trees, their update operations are different. The tree rotations used in WAVL tree update operations may make changes that would not be permitted in a red–black tree, because they would in effect cause the recoloring of large subtrees of the red–black tree rather than making color changes only on a single path in the tree. This allows WAVL trees to perform fewer tree rotations per deletion, in the worst case, than red-black trees.