2–3–4 tree

From Wikipedia, the free encyclopedia

Revision as of 04:00, 8 December 2010

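A 2-3-4 tree (also called a 2-4 tree), in computer science, is a self-balancing data structure that is commonly used to implement dictionaries. The numbers refer to the node types: every internal node is a 2-node, a 3-node, or a 4-node, holding one, two, or three data elements and having two, three, or four children, respectively. 2-3-4 trees are B-trees of order 4; like B-trees in general, they support search, insertion, and deletion in O(log n) time. All external nodes of a 2-3-4 tree are at the same depth.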
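The node layout and the search operation can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the names `Node` and `search` are assumptions made for this sketch, and the children (5) and (15) in the sample tree are invented to make the example complete.

```python
from bisect import bisect_left


class Node:
    """A 2-3-4 tree node: 1 to 3 sorted keys and, unless it is a leaf,
    exactly len(keys) + 1 children (hypothetical layout for this sketch)."""

    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])

    def is_leaf(self):
        return not self.children


def search(node, key):
    """Return True if key is stored in the subtree rooted at node."""
    while True:
        # Index of the first key >= the search key; also the index of the
        # child whose interval contains the search key.
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if node.is_leaf():
            return False
        node = node.children[i]


# Illustrative tree: root (10, 20) with a 4-node as its rightmost child.
root = Node([10, 20], [Node([5]), Node([15]), Node([22, 24, 29])])
```

Because every leaf is at the same depth and each step descends one level, the loop runs at most once per level of the tree.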

2-3-4 trees are an isometry of red-black trees, meaning that they are equivalent data structures. In other words, for every 2-3-4 tree, there exists at least one red-black tree with data elements in the same order. Moreover, insertion and deletion operations on 2-3-4 trees that cause node expansions, splits and merges are equivalent to the color-flipping and rotations in red-black trees. Introductions to red-black trees usually introduce 2-3-4 trees first, because they are conceptually simpler. 2-3-4 trees, however, can be difficult to implement in most programming languages because of the large number of special cases involved in operations on the tree. Red-black trees are simpler to implement, so they tend to be used instead.
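The correspondence can be illustrated by mapping a single 2-3-4 node onto a small red-black subtree: a 2-node becomes one black node, a 3-node becomes a black node with one red child, and a 4-node becomes a black node with two red children. The following sketch assumes a hypothetical `RB` record and a `to_red_black` helper; for 3-nodes it arbitrarily picks the left-leaning form, although the right-leaning form is equally valid (which is why more than one red-black tree can correspond to a given 2-3-4 tree).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RB:
    """A red-black tree node (hypothetical record for this sketch)."""
    key: int
    color: str  # "B" for black, "R" for red
    left: Optional["RB"] = None
    right: Optional["RB"] = None


def to_red_black(keys, children=None):
    """Map one 2-3-4 node to a red-black subtree.

    keys: the node's 1 to 3 sorted keys.
    children: already-converted child subtrees, or None for a leaf.
    """
    c = children or [None] * (len(keys) + 1)
    if len(keys) == 1:  # 2-node: a single black node
        return RB(keys[0], "B", c[0], c[1])
    if len(keys) == 2:  # 3-node: black node with a red left child
        return RB(keys[1], "B", RB(keys[0], "R", c[0], c[1]), c[2])
    if len(keys) == 3:  # 4-node: black middle key, two red children
        return RB(keys[1], "B",
                  RB(keys[0], "R", c[0], c[1]),
                  RB(keys[2], "R", c[2], c[3]))
    raise ValueError("a 2-3-4 node holds between 1 and 3 keys")
```

Every root-to-leaf path in the result passes through exactly one black node per 2-3-4 node, which is how the equal-depth property of the 2-3-4 tree turns into the equal black-height property of the red-black tree.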

The lower-left leaf. If fusions are performed with some care, the lower-left (leftmost) leaf remains the same node for as long as the tree is in use. Given a pointer to that leaf, the minimum element can therefore be found in constant time, O(1). By starting an in-order traversal at that leaf, the p lowest elements can be retrieved in O(p log n) time.

Insertion

To insert a value, we start at the root of the 2-3-4 tree:

  1. If the current node is a 4-node:
    • Remove and save the middle value of the 4-node, leaving a 3-node.
    • Split the remaining 3-node into a pair of 2-nodes (the now missing middle value is handled in the next step).
    • If this is the root node (which thus has no parent), the middle value becomes the new root 2-node and the tree height increases by 1. Ascend into the root.
    • Otherwise, push the middle value up into the parent node. Ascend into the parent node.
  2. Find the child whose interval contains the value to be inserted.
  3. If that child is empty (that is, the current node is a leaf), insert the value into the current node and finish.
    • Otherwise, descend into the child and repeat from step 1.[1][2]
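The steps above can be sketched in Python roughly as follows. The names `Node`, `Tree234`, `insert` and `inorder` are assumptions made for this illustration; the sketch assumes distinct keys and makes no attempt to handle duplicates specially.

```python
from bisect import bisect_left


class Node:
    """1 to 3 sorted keys; a leaf has no children (hypothetical layout)."""

    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])


class Tree234:
    def __init__(self):
        self.root = Node([])  # an empty tree is a single empty leaf

    def insert(self, key):
        node, parent = self.root, None
        while True:
            if len(node.keys) == 3:  # step 1: the current node is a 4-node
                middle = node.keys[1]
                left = Node(node.keys[:1], node.children[:2])
                right = Node(node.keys[2:], node.children[2:])
                if parent is None:
                    # Root split: the middle value becomes the new root.
                    parent = self.root = Node([middle], [left, right])
                else:
                    # Push the middle value up; the parent has room because
                    # any 4-node on the path was split on the way down.
                    i = parent.children.index(node)
                    parent.keys.insert(i, middle)
                    parent.children[i:i + 1] = [left, right]
                node = parent  # ascend
            i = bisect_left(node.keys, key)  # step 2: pick the interval
            if not node.children:            # step 3: child is empty
                node.keys.insert(i, key)
                return
            parent, node = node, node.children[i]  # descend, repeat step 1


def inorder(node):
    """Return the keys of the subtree in sorted order."""
    if not node.children:
        return list(node.keys)
    out = []
    for i, child in enumerate(node.children):
        out += inorder(child)
        if i < len(node.keys):
            out.append(node.keys[i])
    return out
```

Splitting 4-nodes on the way down means an insertion never has to walk back up the tree more than one level, at the cost of occasionally splitting a node that would not strictly have needed it.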

Example

To insert the value "25" into this 2-3-4 tree:

  • Begin at the root (10, 20) and descend towards the rightmost child (22, 24, 29). (Its interval (20, ∞) contains 25.)
  • Node (22, 24, 29) is a 4-node, so its middle element 24 is pushed up into the parent node.
  • The remaining 3-node (22, 29) is split into a pair of 2-nodes (22) and (29). Ascend back into the new parent (10, 20, 24).
  • Descend towards the rightmost child (29). (Its interval (24, ∞) contains 25.)
  • Node (29) has no leftmost child. (The child for interval (24, 29) is empty.) Stop here and insert value 25 into this node, which becomes (25, 29).

Deletion

Deletion is the more complex operation and involves many special cases.

First the element to be deleted needs to be found. The element must be in a node at the bottom of the tree; otherwise, it must be swapped with another element which precedes it in in-order traversal (which must be in a bottom node) and that element removed instead.

If the element is to be removed from a 2-node, then a node with no elements would result. This is called underflow. To solve underflow, an element is pulled from the parent node into the node where the element is being removed, and the vacancy created in the parent node is replaced with an element from a sibling node. (Sibling nodes are those which share the same parent node.) This is called transfer.

If the siblings are themselves 2-nodes, a transfer does not help, because the sibling that gives up an element would be left with no elements. Instead, an element is pulled down from the parent node and the two sibling nodes are fused together with it into a single node. This is called fusion.

If the parent is a 2-node, fusion leaves the parent without elements, so underflow occurs on the parent node. This is resolved by applying the methods above to the parent. Underflow may thus move from one parent node to the next up the tree as deletions and replacements are made, which is referred to as underflow cascading.
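Transfer and fusion can be sketched as a single repair step applied to an underflowed child. The function name `fix_underflow` and the `Node` layout are assumptions for this illustration. The sketch handles one level only: after a fusion the parent may itself be left with no keys, and the caller is expected to repeat the step one level up (or, if the parent was the root, to make the fused node the new root).

```python
class Node:
    """1 to 3 sorted keys; a leaf has no children (hypothetical layout)."""

    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])


def fix_underflow(parent, i):
    """Repair parent.children[i], which currently holds zero keys.

    Returns "transfer" or "fusion" to report which method was used.
    """
    node = parent.children[i]
    kids = parent.children

    # Transfer from the left sibling if it can spare an element.
    if i > 0 and len(kids[i - 1].keys) > 1:
        sib = kids[i - 1]
        node.keys.insert(0, parent.keys[i - 1])  # pull down from parent
        parent.keys[i - 1] = sib.keys.pop()      # refill from sibling
        if sib.children:                         # move the subtree along
            node.children.insert(0, sib.children.pop())
        return "transfer"

    # Transfer from the right sibling if it can spare an element.
    if i + 1 < len(kids) and len(kids[i + 1].keys) > 1:
        sib = kids[i + 1]
        node.keys.append(parent.keys[i])
        parent.keys[i] = sib.keys.pop(0)
        if sib.children:
            node.children.append(sib.children.pop(0))
        return "transfer"

    # Fusion: both neighbours are 2-nodes. Merge the underflowed node,
    # one adjacent sibling, and the parent key that separates them.
    j = i - 1 if i > 0 else i
    left, right = kids[j], kids[j + 1]
    left.keys = left.keys + [parent.keys.pop(j)] + right.keys
    left.children = left.children + right.children
    del kids[j + 1]
    return "fusion"
```

For example, with parent (10, 20) and children (5), an empty node, and (22, 24), the empty node receives 20 from the parent and the parent receives 22 from the right sibling.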

Deletion in a 2-3-4 tree is O(log n), assuming transfer and fusion run in constant time, O(1).[1][3]
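The logarithmic bound follows from the height of the tree. As a sketch, let h be the number of edges on any root-to-leaf path and n the number of stored elements (both symbols are introduced here for illustration). The tree holds the fewest elements when every node is a 2-node and the most when every node is a 4-node, so:

```latex
2^{h+1} - 1 \;\le\; n \;\le\; 4^{h+1} - 1
\quad\Longrightarrow\quad
\log_4(n+1) - 1 \;\le\; h \;\le\; \log_2(n+1) - 1
```

A deletion visits each level a constant number of times, so with constant-time transfer and fusion the total work is proportional to h, which is O(log n).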

References

  1. Ford, William; Topp, William (2002), Data Structures with C++ Using STL (2nd ed.), New Jersey: Prentice Hall, p. 683, ISBN 0-13-085850-1
  2. Goodrich, Michael T; Tamassia, Roberto; Mount, David M (2002), Data Structures and Algorithms in C++, Wiley, ISBN 0-471-20208-8
  3. Grama, Ananth (2004), "(2,4) Trees" (PDF), CS251: Data Structures Lecture Notes, Department of Computer Science, Purdue University, retrieved 2008-04-10