Day–Stout–Warren algorithm

The Day–Stout–Warren (DSW) algorithm is a method for efficiently balancing binary search trees – that is, decreasing their height to O(log n) nodes, where n is the total number of nodes. Unlike a self-balancing binary search tree, it does not do this incrementally during each operation, but periodically, so that its cost can be amortized over many operations. The algorithm was designed by Quentin F. Stout and Bette Warren in a 1986 CACM paper,^[1] based on work done by Colin Day in 1976.^[2]

The algorithm runs in linear (O(n)) time and is in-place. The original algorithm by Day generates as compact a tree as possible: all levels of the tree are completely full except possibly the bottom-most. It operates in two phases. First, the tree is turned into a linked list by means of an in-order traversal, reusing the pointers in the (threaded) tree's nodes. A series of left-rotations forms the second phase.^[3] The Stout–Warren modification generates a complete binary tree, namely one in which the bottom-most level is filled strictly from left to right. This is a useful transformation to perform if it is known that no more inserts will be done. The modification does not require the tree to be threaded, nor does it require more than constant space to operate.^[1] Like the original algorithm, Day–Stout–Warren operates in two phases, the first entirely new, the second a modification of Day's rotation phase.^[1]^[3]
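As a worked illustration of "all levels full except possibly the bottom-most" (the symbol h and the example value n = 10 are introduced here and are not taken from the cited papers; height is counted in edges), the number of nodes on the bottom level and the resulting height of such a tree on n nodes can be written as follows. The first quantity is the one the pseudocode below calls leaves.

```latex
\[
  \text{leaves} = n + 1 - 2^{\lfloor \log_2 (n+1) \rfloor},
  \qquad
  h = \lfloor \log_2 n \rfloor
\]
\[
  n = 10:\quad
  \text{leaves} = 11 - 2^{3} = 3,
  \qquad
  h = \lfloor \log_2 10 \rfloor = 3
\]
```

Here the top three levels hold 1 + 2 + 4 = 7 nodes and the remaining 3 nodes sit on the bottom level. When n + 1 is a power of two the formula gives 0, meaning every level is full.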

A 2002 article by Timothy J. Rolfe brought attention back to the DSW algorithm;^[3] the naming is from the section title "6.7.1: The DSW Algorithm" in Adam Drozdek's textbook.^[4] Rolfe cites two main advantages: "in circumstances in which one generates an entire binary search tree at the beginning of processing, followed by item look-up access for the rest of processing" and "pedagogically within a course on data structures where one progresses from the binary search tree into self-adjusting trees, since it gives a first exposure to doing rotations within a binary search tree."

Pseudocode

The following is a presentation of the basic DSW algorithm in pseudocode, after the Stout–Warren paper.^[1]^{[note 1]} It consists of a main routine with three subroutines. The main routine is given by:

1. Allocate a node, the "pseudo-root", and make the tree's actual root the right child of the pseudo-root.
2. Call tree-to-vine with the pseudo-root as its argument.
3. Call vine-to-tree on the pseudo-root and the size (number of elements) of the tree.
4. Make the tree's actual root equal to the pseudo-root's right child.
5. Dispose of the pseudo-root.

The subroutines are defined as follows:^{[note 2]}

routine tree-to-vine(root)
    // Convert tree to a "vine", i.e., a sorted linked list,
    // using the right pointers to point to the next node in the list
    tail ← root
    rest ← tail.right
    while rest ≠ nil
        if rest.left = nil
            tail ← rest
            rest ← rest.right
        else
            temp ← rest.left
            rest.left ← temp.right
            temp.right ← rest
            rest ← temp
            tail.right ← temp

routine vine-to-tree(root, size)
    leaves ← size + 1 − 2^{⌊log₂(size + 1)⌋}
    compress(root, leaves)
    size ← size − leaves
    while size > 1
        compress(root, ⌊size / 2⌋)
        size ← ⌊size / 2⌋

routine compress(root, count)
    scanner ← root
    for i ← 1 to count
        child ← scanner.right
        scanner.right ← child.right
        scanner ← scanner.right
        child.right ← scanner.left
        scanner.left ← child
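
The main routine and the three subroutines above can be sketched in Python as follows. This is a minimal illustration rather than the authors' code: the Node class, the function names, and the helpers inorder and height are assumptions made for the example, and tree_to_vine returns the node count so that the size does not have to be known in advance.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical BST node used only for this sketch."""
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def tree_to_vine(root: Node) -> int:
    """Flatten the tree under root.right into a sorted right-linked vine.

    Returns the number of nodes counted while walking the vine.
    """
    size = 0
    tail = root
    rest = tail.right
    while rest is not None:
        if rest.left is None:
            # rest is already in vine position: move on and count it.
            tail = rest
            rest = rest.right
            size += 1
        else:
            # Right rotation: lift rest.left above rest.
            temp = rest.left
            rest.left = temp.right
            temp.right = rest
            rest = temp
            tail.right = temp
    return size


def compress(root: Node, count: int) -> None:
    """Perform count left rotations along the right spine below root."""
    scanner = root
    for _ in range(count):
        child = scanner.right
        scanner.right = child.right
        scanner = scanner.right
        child.right = scanner.left
        scanner.left = child


def vine_to_tree(root: Node, size: int) -> None:
    """Turn the vine under root.right into a balanced tree."""
    # (size + 1).bit_length() - 1 equals floor(log2(size + 1)).
    leaves = size + 1 - (1 << ((size + 1).bit_length() - 1))
    compress(root, leaves)
    size -= leaves
    while size > 1:
        compress(root, size // 2)
        size //= 2


def dsw_balance(tree: Optional[Node]) -> Optional[Node]:
    """Main routine: balance the tree and return its new root."""
    pseudo_root = Node(key=0, right=tree)  # the key of the pseudo-root is unused
    size = tree_to_vine(pseudo_root)
    vine_to_tree(pseudo_root, size)
    return pseudo_root.right


def inorder(node: Optional[Node]) -> list[int]:
    """Helper for checking: keys in sorted (in-order) sequence."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


def height(node: Optional[Node]) -> int:
    """Helper for checking: height in edges, -1 for an empty tree."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


if __name__ == "__main__":
    # Degenerate input: a left-leaning chain holding the keys 1..10.
    chain = None
    for k in range(1, 11):
        chain = Node(k, left=chain)
    balanced = dsw_balance(chain)
    print(inorder(balanced), height(balanced))
```

For the ten-node chain in the demo, vine_to_tree calls compress with counts 3, 3 and 1 (leaves = 3, then ⌊7/2⌋ and ⌊3/2⌋), and the height drops from 9 to 3 while the in-order sequence of keys is unchanged.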

Notes

[note 1] This version does not produce perfectly balanced nodes; Stout and Warren present a modification that does, in which the first call to compress is replaced by a different subroutine.
[note 2] In the original presentation, tree-to-vine computed the tree's size as it went. For the sake of brevity, we assume this number to be known in advance.

References

[1] Stout, Quentin F.; Warren, Bette L. (September 1986). "Tree rebalancing in optimal space and time" (PDF). Communications of the ACM. 29 (9): 902–908. doi:10.1145/6592.6599. hdl:2027.42/7801. S2CID 18599490.
[2] Day, A. Colin (1976). "Balancing a Binary Tree". Comput. J. 19 (4): 360–361. doi:10.1093/comjnl/19.4.360.
[3] Rolfe, Timothy J. (December 2002). "One-Time Binary Search Tree Balancing: The Day/Stout/Warren (DSW) Algorithm". SIGCSE Bulletin. 34 (4). ACM SIGCSE: 85–88. doi:10.1145/820127.820173. S2CID 14051647. Archived from the original on 2012-12-13.
[4] Drozdek, Adam (1996). Data Structures and Algorithms in C++. PWS Publishing Co. pp. 173–175. ISBN 0-534-94974-6.
