Jump to content

Talk:Red–black tree

Page contents not supported in other languages.
From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 99.11.197.75 (talk) at 00:04, 5 May 2011. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

NIL nodes yet again

The current description of "NIL" nodes is confusing. First, I would suggest that it is more precise to call a "NIL node" "the empty tree". Then one could say that NIL is a symbol for the empty tree. Second, the use of NIL nodes is not peculiar to red-black trees. In fact, many tree implementations use the empty tree such that every non-empty tree has two children, and one doesn't need separate cases for nodes with two children, one left child, one right child, or no children. However, it should be clear that this is an implementation detail, not a requirement. Calling these NIL nodes leaves is unnecessarily confusing, as a "leaf node" could also refer to a node with both children empty. 99.11.197.75 (talk) 00:04, 5 May 2011 (UTC)[reply]

Insert description more complicated than necessary?

The following is from [1]:

                        B(z)
                        /  \
                      R(x)  d
                      /  \
                     a   R(y)
                         /  \
                        b    c
  
                         ||
                         \/

      B(z)              R(y)                 B(x)
      /  \              /  \                 /  \
    R(y)  d    =>     B(x) B(z)    <=       a   R(y)
    /  \              / \  / \                  /  \
  R(x)  c             a b  c d                 b   R(z)
  /  \                                             /  \
 a    b                                           c    d

                         /\
                         ||

                       B(x)
                       /  \
                      a   R(z)
                          /  \
                        R(y)  d
                        /  \
                       b    c

These changes are followed by recursion if R(y)'s new parent is also red. The only other step is to recolor R(y) Black if it is the new root. 128.146.32.245 (talk) 00:08, 29 April 2011 (UTC)[reply]

Mistake in the formal proof?

I think the following sentence from the formal proof is wrong (problematic part highlighted):

As such it has two children both of which have a black-height of either bh() or bh()-1 (depending on whether is red or black).

Given the definition of the black-height bh (relevant part highlighted):

bh(v) = the number of black nodes (not counting v if it is black) from v to any leaf in the subtree (called the black-height).

I think the black-height of the children is bh(v') or bh(v') - 1 depending on whether these children are red or black, and not depending on v' color.

Also, the sentence may be interpreted as if both children have the same color, which is not necessarily the case (the children may have different colors).

Can someone confirm? These details do not invalidate the proof, but they make it a bit harder to follow. Olivier Teuliere (talk) 10:16, 21 March 2011 (UTC)[reply]

Deletion error?

It cannot be correct to say that if N is the new root then the tree is balanced. In a three element tree where all elements are black deleting the root node produces a tree where the black height is different on either left or right unless the child element is coloured red in an additional step. Or have I missed something here? 79.69.42.27 (talk) 21:23, 17 July 2010 (UTC)[reply]

The various deletion cases all refer to a node with at most one (non-leaf) child. The case of nodes with 2 children (like in your example) is dealt with in the first paragraph of the "Removal" section. Olivier Teuliere (talk) 09:56, 21 March 2011 (UTC)[reply]

rotate missing

for completeness, adn to be able to see what is going on at one point, there should be implementations of the rotation functions available. —Preceding unsigned comment added by 213.61.9.75 (talk) 09:20, 9 June 2010 (UTC)[reply]

tree

I would like to say thank you for this great article. Also, it might be good to link to http://webpages.ull.es/users/jriera/Docencia/AVL/AVL%20tree%20applet.htm. It's a java applet that demonstrates a red-black tree, as well as AVL and splay trees. --Anonymous CS Student

I think the tree in the image is actually invalid, since the left-left-left path and the left-right-right path only passes through two black nodes. Κσυπ Cyp   01:46, 30 Nov 2003 (UTC)

Now right-right-right-... passes through 3 black nodes, the rest through 2. Swapping the colour of 15 and 17 should work. Nice tree, though... Κσυπ Cyp   01:41, 4 Dec 2003 (UTC)
Oops, I'm a complete idiot. I was trying not to copy an existing image, but I guess I didn't understand red-black trees as well as I thought. Sorry for the problems.
Derrick Coetzee 14:58, 4 Dec 2003 (UTC)
I fixed the image, but I'm wondering if it could be misleading now, since no node now has a red and a black child (not counting nil children). I don't want to give any misimpressions.
Derrick Coetzee 15:05, 4 Dec 2003 (UTC)
The tree seems correct now. To have a node with both a red and a black child, it would be possible to remove node 6, and invert the colour of node 1, 8 and 11. And then if the tree looks too small, maybe add a red child for 15. Nice tree, in either case. Κσυπ Cyp   16:58, 4 Dec 2003 (UTC)

tree does not seem correct. The shorted path is not all black nodes.

The tree image is not correct since nodes 6, 22, and 27 are red even though they are also leaves. Red-Black trees don't allow for red leaf nodes. —Preceding unsigned comment added by 99.5.101.121 (talk) 06:08, 30 May 2009 (UTC)[reply]

About that whole nil-leaves discussion: Why do you need to remember that there are nil leaves? What does this actually do to change the definition? The example tree without the nil leaves would not violate the definition, and I cannot think of any others that would. Tjdw 19:43, 30 May 2004 (UTC)[reply]

It would violate the requirement that all leaves are black. It may be a good idea to add this to the article. I think nil leaves aren't strictly necessary, but red-black trees are typically presented this way. Derrick Coetzee 22:18, 30 May 2004 (UTC)[reply]
More importantly, it would violate the rule that each direct path must have the same number of black nodes. If one disregards the leaves, a black root with a black left child would pass the rule (the only path is two black nodes). Include the leaves and one can more clearly see that there's actually 3 paths, one with 2 black nodes, and two with 3 black nodes (including the leaves).
The picture should be updated to include a black node that has a red and a black child, this would make it more clear why the leaves are important to include in the counting.
It is possible though to have a correctly balanced red black tree without taking leaves into account, by replacing the counting rule by three simpler rules:
1) Any black node, except for the root node, must always have a sibling (ie, it cannot be an only child).
2) A red node that has a sibling always has exactly two black children.
3) A red node that has no sibling always has no children. John Hendrikx (talk) 13:10, 4 March 2010 (UTC)[reply]

This article leaves me confused as to how the behaviour and performance of a red-black tree differs from other binary trees. Could someone who understands the subject please add a note explaining why one might choose this structure over e.g. AVL trees? Alternatively, might a new topic like "binary tree implementations: advantages and disadvantages" be helpful?

A comparison is a great idea; of course, we know their theoretical (asympototic worst-case) performances are all the same. I don't really know how they differ in practice. Maybe someone does? Derrick Coetzee 14:09, 27 Aug 2004 (UTC)

On another note, this page needs well-organized, detailed explanations of the operations and their performance. I'll do this eventually if no one else does. Derrick Coetzee 16:22, 27 Aug 2004 (UTC)

And sure enough, I did, complete with pretty diagrams. It'd be really helpful if someone could fact-check me, though — this stuff is pretty complicated. Derrick Coetzee 07:41, 18 Sep 2004 (UTC)

I'm not sure if the description of the insertion operation is complete. Consider the following tree(. is a placeholder, ignore it):
0(B)
.\
.45(R)

After inserting the value 207, the algorithm yields:
207(B)
./
45(B)
.\
..0(R)

Which isn't a valid Red-Black tree(it violates property 4). --Ryan Stone

You may find it easier to write your trees using s-expression syntax, like this:
(0 black (45 red nil nil) nil)
(0 black (45 red nil nil) (207 black nil nil))
This insertion does fall into Case 2, but this tree does not actually occur, because newly inserted nodes are always initially coloured red. The actual tree produced would look like this:
(0 black (45 red nil nil) (207 red nil nil))
This satisfies property 4. If you think the text is unclear about this, please edit. Perhaps a diagram for Case 2 would help. Derrick Coetzee 22:29, 20 Sep 2004 (UTC)
That tree isn't a binary search tree, is it? You have 0 at the root and 47 in the left subtree, which can't happen.
No, it isn't; just the example they gave. Good catch. I'm sure they meant to put a negative number, though, just as you meant to put 45. Derrick Coetzee 07:34, 21 Sep 2004 (UTC)
Ok, maybe I'm just applying the algorithm wrong, but here's what I get:
Step 1: Start with: (0 black nil (45 red nil nil))
Step 2: Insert value 207 into tree as in normal binary search tree: (0 black nil(45 red nil (207 red nil nil)))
Step 3: Parent is red, uncle is black; apply case 4(rotate around parent): (0 black nil(207 red (45 red nil nil) nil))
Step 4: Apply case 5(rotate around grandparent): (207 black (0 red nil (45 black nil nil)) nil)
--Ryan Stone

Ok, I think that I've found the algorithm that will handle the above case(and similar ones correctly):
Cases 1-3 are unchanged
Cases 4 and 5 only apply when the parent(P in the diagram) is a left child of the grandparent
Case 6: The parent(P) is red, the uncle(U) is black, the new node(N) is the left child of P and P is a right child of its parent(G). Perform a right rotation on P, and then apply case 7 to P.
Case 7: The parent(P) is red, the uncle(U) is black, the new node(N) is the right child of P and P is a right child of its parent(G). Perform a left rotation on G, paint G red and paint P black.
I've tried implementing this and it seems to work, but I'm hesitant to make any changes to the article until somebody who actual knows this stuff confirms that this should work.
--Ryan Stone

It's true that when P is the right child of its parent, we're in a different, symmetrical situation, where N must be the opposite child of its parent, and rotations are performed in the opposite direction. I admit this isn't obvious. One way of dealing with this is to add the steps you have described. Another way is to say that when P is on the right side, right and left are reversed throughout steps 4 and 5; we are effectively acting on the mirrored tree. I've added a note to this effect — do you think this is a good solution? Derrick Coetzee 15:56, 21 Sep 2004 (UTC)
A similar note was added for deletion, where left and right must be reversed if N is the right child instead of the left. Although I didn't put it in the article, reversing left and right throughout is always okay, because you're effectively acting on the mirrored red-black tree, which is a binary search tree when considered with the opposite order. Derrick Coetzee 16:03, 21 Sep 2004 (UTC)


I don't understand a couple of things about the deletion algorithm. The cases are supposed to handle the case deleting a black node with two black children, at least one of which is leaf. However, wouldn't property 4 be violated if a black node had one child who was a leaf and the other child was black but was not a leaf node?(in other words, wouldn't be simpler just to say the cases apply to a black node with two leaf nodes as children?)

My second question is about case 3 of the deletion algorithm. I can see that the algorithm will ensure that all paths through P will have one fewer black node. However, won't any path that does not pass through P be unaffected, and have one more black node than any path through P?
--Ryan Stone

Ok, I see the problem. If you look closely at case 2a of the San Diego State University: CS 660, after colouring the sibling red, you colour the parent double black and then perform the same procedure on it. I think that this also answers my first question, as well.
--Ryan Stone
I've changed the article to note that the parent node becomes double-black after case 3 is applied. Am I correct when I say that a double-black node has one fewer black node along any paths running through it? Also, is my wording clear enough?--Ryan Stone 19:03, 22 Sep 2004 (UTC)


Case 3 still mentions double-black nodes. I'm not sure how to reword that to avoid using the term double-black; anyone else have any ideas?--Ryan Stone 16:13, 24 Sep 2004 (UTC)

I've removed the reference to double-black nodes in case 3 of deletion. I've also removed the statement in case 1 that it applies when the root node was deleted and replaced. Because the rebalancing algorithm is recursive, it's possible to fall into case 1 even if the root node was not deleted.--Ryan Stone 21:24, 29 Sep 2004 (UTC)

I've slightly reworded deletion case 3 to make it clearer why property 4 is still violated. I've made a change to the operations paragraph. Case 3 of the insertion rebalancing procedure and case 3 of the deletion rebalancing procedure are the only two recursive cases. Both only involve colour changes; not rotations, so we can say that the insertion and deletion operations require O(log n) colour changes and no more than 2 rotations. Because colour changes are so quick(in practice, merely changing the value of a variable), it gives a better picture of the speed of the insertion and deletion algorithms IMO.

Now, the article states that insertion and deletion require O(log n) space. I assume that's because of the stack space needed when we hit a recursive case. However, aren't both algorithms tail-recursive?(ie wouldn't it be possible to convert them to iterative implementations?) The algorithms are complex enough that I'm not sure if I'd like to make that change, but I think that it would be quite possible to do so.--Ryan Stone 16:03, 30 Sep 2004 (UTC)

Ok, I understand now. Persistent implementations for Red-Black Trees require O(log n) space, but not the normal version.--Ryan Stone 14:50, 1 Oct 2004 (UTC)
Hey, thanks for your changes. I believe they're all correct. Sorry that the statement about the persistent implementations was unclear. I was confused why the deletion algorithm didn't seem to be recursive before, so that helps. Thanks again. Derrick Coetzee 22:03, 1 Oct 2004 (UTC)

I don't understand the purpose of property #2, "All leaves are black", if you're going to use "nil" nodes for all leaves, and then say you don't even need to draw them. If you just got rid of property #2, and "nil" nodes, would it not work just as well?

(Yes, the nil nodes confused me, too. You started out by saying nodes hold data, and leaf nodes are black, and then show a diagram where the leaves appear to be red -- and then 3 paragraphs later say that the boxes that are shaped differently from the other nodes and don't hold any data are actually the leaves.)

The terminology section does conflict with the use of nil leaves. Leaf nodes do not store data in this presentation. It is possible to eliminate nil leaves by removing rule 2 and changing rule 3 to "red nodes only have black children". Unfortunately, this introduces a large number of special cases in the later proofs, since we have to deal with each internal node having 1 or 2 children. By introducing nil leaves and colouring them black, we can now say there is a black child where before we'd have to say "black or no child". I would like to admit the simplest presentation possible of the main invariants, but not if it sacrifices the overall presentation too much. Thanks for your feedback though; I realise nil leaves are often confusing and will think about how to ameliorate this problem. Deco 12:15, 28 Dec 2004 (UTC)

Leaf nodes again

Note that the use of the leaf-nodes in the articles conflict with the link in the terminology section.--Loading 12:11, 11 May 2005 (UTC)[reply]

Perhaps these "Nil" nodes code be done away with altogether? I think it'd be easier just to say "black or missing" (or "non-red") than to introduce a funny construct to make certain special cases disappear. It's also not that hard in actual code to do without them.--ghakko

Well, it's easy to say "black or missing" but it's not so easy to draw black or missing in the diagrams. I copied the nil leaf presentation from an online source, but it could be presented either way, it would just be a lot of work to change over and I don't see a particularly compelling reason not to do it this way. Deco 19:42, 14 May 2005 (UTC)[reply]
A very important information is missing: Red Black Trees are isomorphics to 2-3-4 Trees (B-Trees of order 4) (http://pedia.nodeworks.com/2/2-/2-3/2-3-4_trees/). 22:15, 28 May 2005 (UTC)

I think, we should add an explanation to the Delete case about the possibility that N (the deleted node's child) may be a "Nil" node. In order that the tree remains well defined, N must remain a leaf node even after all transformations. You can verify (for example from the images) that it will, but I think it is not obvious. Should I add the explanation? Another thing is that maybe the text would be simplified if we used consistently "P" instead of "N's parent" throughout the text. What do you think? Drak 22:58, 6 January 2006 (UTC)[reply]

Go for it. I'll review your changes. Thanks for looking over this. Deco 00:39, 7 January 2006 (UTC)[reply]

Deleting a red node

In the early paragraphs of the Deletion section, you state "If we are deleting a red node, we can simply replace it with its black child, if any." This is indeed the case if there is at most one child. If this red node has two children, however, the algorithm becomes a lot more complicated, unless you simply replace the red node by one of the children and reinsert every element of the other child's subtree. Schellhammer

You can assume that the node has at most one child. The reason for this is explained in the previous paragraph. I'm sorry that it wasn't clear. The idea is that you delete the node exactly as you would delete a node in a BST. To do this, you must find a node that immediately precedes or follows the node to be deleted in inorder traversal order. We need to copy its value into the node to be deleted, then delete the node it was copied from. Such a node will always have zero or one children, which simplifies the remaining cases. Deco 23:52, 28 July 2005 (UTC)[reply]
Okay, got you now. Thanks!
Something else. I was just wondering: in deleting a node with two children, is there's a simple way to determine if replacing the value by the largest preceeding or the smallest following value is the faster alternative? Schellhammer
The simplest way I can think of is to add two values to each node indicating the number of steps needed to reach the smallest and largest node in that subtree. These are easy to maintain in log time and allow you to choose optimally. This requires quite a bit of space and time to maintain though; the most effective strategy is probably just to choose one at random. Deco 02:10, 3 August 2005 (UTC)[reply]

I think I encountered a problem in the deletion algorithm yet again connected with the leaf-nodes. You state: "If the node has no non-leaf children, we simply remove it." However, if the node is black and has only leaf children (which are also black), we reduce the black height in this branch if we simply remove the node. Consider the tree built by inserting the integers 1 to 4 and then delete the integer 1. Here's a nice applet which can be used to see the restructuring following the deletion. (It is German, but if you consider the translations Anfang=start Zurück=back Einfügen=insert Weiter=proceed and Löschen=delete (mark node to delete after hitting the button) it should be easy to handle.) Schellhammer

Thanks for catching this. I'm afraid I've forgotten far too much about red-black trees. I believe we proceed through all the cases whether or not the child is a leaf node or non-leaf node, and they all work out if the child is simply a leaf node since they don't assume that N has children. Deco 02:10, 3 August 2005 (UTC)[reply]
For what it's worth, to avoid further correctness issues I'm going to try to come up with some ML code that precisely mirrors the discussion. I should be able to test it and inject it into the article then to provide some concreteness. Deco 02:52, 3 August 2005 (UTC)[reply]
I've just finished implementing the red-black tree as Visual-Basic classes (don't ask why!) and I took the deletion algorithm from this wiki-page. Empiric evidence (i.e. deleting any one of 1000 nodes in the tree) shows that the algorithm now is correct. Schellhammer 13:16, 3 August 2005 (GMT+1)
Thanks for your feedback, that's reassuring. I've got some C code for insertion and as soon as I figure out how the article formatting will work, I'll add a bit of code to each case. Then I'll do the same for deletion. This should help clarify things further. Out of curiosity, does your code use parent pointers or do you get around this somehow? The description in this article seems to almost require it. Deco 19:19, 3 August 2005 (UTC)[reply]
Yes, my code uses parent pointers as well. Actually, I don't know how you could do without, seeing that Insertion Case 3 has a recursive call upwards. By the way, the note in the diagram for this case should read "May violate property 2". Schellhammer 11:13, 4 August 2005 (UTC)[reply]
It's possible to do without parent pointers by keeping a linked list during downward traversal; push a pointer to the current node onto the list just before descending into a child. The list elements can allocated on the stack. My C implementation was like this because I was being silly and trying to save space.
The only other way I can think of handling Insertion Case 3 without parent pointers by having all the insertion functions return boolean values; this would let the caller check if the rebalancing has propagated upwards and continue it if necessary. Deletion is like this, but much messier (because of the need to swap two elements just before starting). I had to do it this way in Haskell (which has no pointers), and it was very painful.
Ghakko 14:30, 4 August 2005 (UTC)[reply]

The code

I've now added some C code to this page. All this code has been compiled and thoroughly tested, not just against crashes but that the red-black properties are preserved after every single insert/delete operation. I tried to keep the code as simple and readable as possible, and in line with what the text and diagrams describe to the letter. I hope it adds some concreteness to the article. I chose C because it's good for this presentation, since it deals well with trees with parent pointers (cyclic data structures are kinda annoying in functional languages) and is familiar to most programmers. Deco 07:32, 4 August 2005 (UTC)[reply]

Nice job! That'll help a lot in future implementations. However, in the Deletion part, you state "the insertion code works with either representation", i.e. designing leaf-nodes as node objects or NULL pointers. I think you rather meant the insertion algorithm, since you use constructions like child->color in the delete_one_child function and child may be a leaf if the original n (the one we deleted) had no non-leaf children. Similarly, calling any of the delete_caseX functions with a NULL pointer as parameter will cause problems. The code probably still works if you'd add the (admittedly irritating) if-statements everywhere to catch this case. Schellhammer 11:32, 4 August 2005 (UTC)[reply]
Sorry, I meant the code in the insertion section. There was only one place in which a leaf could be encountered during insertion, but deletion turned out to be much trickier, because it may be that the parameter n is a leaf, which prevents me from accessing any of the surrounding nodes if it's represented with NULL. I would have to additionally pass in at least its parent, which requires updating the parent as we perform rotations, and it's just a mess. Deco 17:30, 4 August 2005 (UTC)[reply]

I find it confusing that the insertion code is using NULL-pointer as leaves, but the removal code suddenly implies that you use real nodes as leaves. Perhaps it should be either both null pointer, or both real nodes. 21:53, 12. March 2010 —Preceding unsigned comment added by 85.177.97.53 (talk) 20:53, 12 March 2010 (UTC)[reply]

Errors in algorithm

(copied from an e-mail sent to Deco by Pavel)

case 3. if G was a root, you violate rule 2.

Also, if its parent is red, it violates rule 4. This is mentioned in the text and the code, and used to be in the image, but I wanted to keep the image easy to update. I fixed the text to mention that it might violate rule 2 also, and to be clearer that we always recursively process G.

case 5. path from leaf 4 and 5 contains 2 black nodes, path from other leafs contains 1 black nodes, that violates rule 5.

The triangles are not meant to represent leaves but subtrees. Some subtrees have black circles at the top to indicate that their root must be black. The rotation does not alter the number of black nodes along any path (it is the same before and after), and so does not threaten rule 5. The reason rule 5 is not violated initially is that subtrees 4 and 5 are not as deep as subtrees 1, 2, and 3; in fact, the uncle may even be a leaf itself.
If there's any way I can clarify this further please ask. Thanks again. Deco 14:16, 18 August 2005 (UTC)[reply]

Top-Down Insertion

Can someone change this to the more effecient single pass top-down insertion method?

It sounds interesting. Can you give a reference? However, I am not completely sure if it should replace the algorithm explained in the article. One of the objectives of this article is to explain the well-established algorithm for red-black trees, and I think the current article does it very well (thank you, authors!). Of course a note about a more efficient algorithm would be helpful.
Here's a reference on the top-down insertion: [1]. The notes paraphrase Data Structures & Problem Solving Using Java by Mark Allen Weiss.
The top-down approach to insertion works by preventing insertion case 3. During the tree descent, any black node with two red children is recolored to red and its children recolored to black (like in case 3). This preserves property 5 but may violate property 4. When property 4 is violated, it is guaranteed that the uncle of the new red node is black, since the grandparent node would have been recolored during the descent if both of its children were black. (See the reference if this is confusing.) As a result, one or two rotations (like in case 5) can be applied to restore property 4. The search then continues.
Once the leaf is ready to be inserted, insertion case 3 is impossible. This prevents the need to recursively ascend back up the tree. This makes the entire operation top-down. Rotations may still be required to preserve properties 4 and 5 after inserting the leaf.
Top-down insertion seems to be less efficient than bottom-up insertion. More rotations are required since each insertion can require multiple sets of rotations. Rather than letting the leaf node "push up" on the tree to find the proper rotation point, the top-down approach rotates at every such rotation point along the path on the way down. I will try to make a diagram to explain this better.
I wrote a C program to test my hypothesis that the top-down insertion is less efficient, and it appears to be correct. A top-down insertion requires more rotations for the same data set. However the top-down algorithm does work and it may make sense to publish it anyway. Cintrom 22:39, 5 May 2006 (UTC)[reply]
Thank you, Cintrom! I see that even if the top-down algorithm is less efficient than the bottom-up one, it is sometimes useful for educational purposes because of its simplicity. In addition, it is probably good to keep in mind that there is also a top-down algorithm for red-black trees, because it may come in handy when you derive a new algorithm from red-black trees.

Error in Insertion Case 4?

According to the rules of a R-B tree, "both children of a red node are black". In the Case 4 example, N and P are both red after insertion.

Case 4 does not complete the insertion, as you can see from the sample code - the node P is then processed using Case 5. The text says, "then, we deal with the former parent node P using Case 5". Deco 17:33, 17 October 2006 (UTC)[reply]

In case 4 the code that calls insert_case5 is indented as if there were a line with an else missing, but from the narrative, I think the code is right, but the indention is misleading. Also, is it too obvious to say in the last sentence that property 4 is still violated? I found the parenthesized comment about the sub-tree labelled "1" confusing, so I tried to touch it up. The other two things I left alone. Bill Smith 23:35, 24 November 2006 (UTC)[reply]

I implemented the insertion algorithm in OCaml and verified that the indention was wrong, so I corrected it.Bill Smith 17:11, 26 November 2006 (UTC)[reply]
The misleading indention was probably a result of tabs. There is no else. Your clarifications are welcome. Deco 14:42, 27 November 2006 (UTC)[reply]

The code

Is the C code correct? For example:

node grandparent(node n) {
    return n->parent->parent;
}

I don't think that's correct: shouldn't dots (.) instead of arrows (->) for accessing members in a struct that isn't referenced by a pointer? And isn't the parent member a pointer to the parent struct? If it is, then the function definition should be:

node* grandparent(node n)

Tell me if I'm wrong.

--wj32 talk | contribs 07:22, 8 November 2006 (UTC)[reply]

You are correct that the arrow operator is used with pointers. I just assumed that somewhere in the code that isn't shown there was "typedef struct rbtreenode * node;". This explains the lack of the struct keyword, as well. Since the article is about the data structure rather than the C language, I don't think that noting C language syntax is strictly necessary. If we were going to present a play-by-play on how to implement a red/black tree in C, we'd have to note that code is required to keep track of which node is the root node, maybe reccomend the definition of a structure to hold the root and a comparator function pointer, etc.
Perhaps the wording should be changed to something like "We'll demonstrate each case with example C-like code"? Conor H. 06:54, 23 November 2006 (UTC)[reply]
This is actual C code copied from a real working C program. That was how I verified its correctness. Yes, node is a pointer type. I suppose my stylistic conventions are a bit unusual. Deco 05:53, 28 November 2006 (UTC)[reply]

I think that the assertion that deletions are O(1) complexity must be incorrect. If the deleted node has two subtrees, then it must find the rightmost node of the left subtree or leftmost node of the right subtree. This find is admittedly quick since it is just following pointers, but nevertheless could be the height of the tree, so it is O(log(n)). Any comments? Caviare 01:08, 25 January 2007 (UTC)[reply]

Yes, it should be log n... just check other reference sources and that's what they'll say. enochlau (talk) 02:47, 25 January 2007 (UTC)[reply]

npov

A- This article isn't neutral

B- It uses "we" an unnecessary number of times.

You need to actually provide instances of NPOV. You can't slap a template up there and expect us to guess at what you're talking about. I'm removing the template until you do so. TheWarlock 01:37, 9 May 2007 (UTC)[reply]
Oh, and I forgot to mention: sign your posts. Put four tildes after your post so we know who you are. TheWarlock 01:38, 9 May 2007 (UTC)[reply]
Should it still be changed to use a non-we structure? j.engelh 08:20, 11 May 2007 (UTC)[reply]
It is preferable to have it use "we" than for it to be unintelligible but use the third-person out of grammatical tradition. If you can rephrase out the "we" and still have the article at least as understandable as it is now, go for it. TheWarlock 01:01, 12 May 2007 (UTC)[reply]
The use of "we" has nothing to do with NPOV. If you don't like the style feel free to fix it, just make it readable. Dcoetzee 20:06, 11 May 2007 (UTC)[reply]

Mistake in proof of correctness in Step 3?

The article currently says

Note that this [call to insert_case1] is the only recursive call, and it occurs prior to any rotations, which proves that a constant number of rotations occur.

This is far from evident. I'm not saying it's false that a constant number of rotations occur, only that this can't properly be called proof. It's easy to imagine a recursive function that makes its recursive call before doing some chunk of work (like a rotation) and yields a linear number of chunks of work done. For example, consider

postorder_tree_print(node n)
{
    postorder_tree_print(n->left);
    postorder_tree_print(n->right);
    print_node(n);
}

The only recursive calls occur before print_node, but calling postorder_tree_print on the root of a tree results in a number of calls to print_node equal to the number of nodes in the tree, which is not a constant.

Like Heawood reading Kempe's four-color proof, I'm smart enough to see the mistake but I'm not smart enough (or at least not interested enough) to fix it.

Pqrstuv 23:13, 1 July 2007 (UTC)[reply]

The fact that it's the only recursive call, and that it returns immediately after calling it, is what makes this evident. Apologies for not being clearer - I'll attempt to reword it. Dcoetzee 00:31, 19 January 2008 (UTC)[reply]

Removed from article

Every simple path from a node to a descendant leaf contains the same number of black nodes, either counting or not counting the null black nodes. (Counting or not counting the null black nodes does not affect the structure as long as the choice is used consistently.). (Someone needs to fix this wording, since it is not true in its present form. It could be interpreted to mean that a path from node A to a descendant leaf would have the same number of black nodes as a path from node B to a descendent leaf. It must be made clear that the "from a node" is fixed but the "descendant leaf" is allowed to vary.)

The section from (Someone needs... has been removed because it's not part of the article, it's a comment on the article. Text moved here so that someone who knows about the subject can hopefully fix it. -- Roleplayer (talk) 11:49, 18 January 2008 (UTC)[reply]

Possible insert case 5 error

Insertion, case 5 says:

"In this case, a right rotation on the parent P is performed; the result is a tree where the former parent P is now the parent of both the new node N and the former grandparent G."

It should say

"...a right rotation on the GRANDparent G is performed;..."

Most implementations define the rotation functions RotateRight(Node* n) or RotateLeft(Node* n) such that n is the root of the subtree being rotated, rather than one of the children. The code to implement the line in question would then be:

RotateRight(g);

rather than

RotateRight(p);

This could be just a terminology issue, but it will confuse people when they read source code and see irreconcilable differences with what is listed here.

Iterative implementation

The current article contains the iterative version (the version without using recursion) of implemenation of the removal operation. However, I doubt its usefulness because of the following reasons. Before removing the code, I would like to know what others think.

  • I do not think a long code like this fits well in the Wikipedia. The aim of the article is not to build a ready-to-use library, but to explain clearly what a red-black tree is, including how it works and how it is used. The article does its job well without this code.
  • It contains a small bug. In the lines
    /* delete_case3(n): */
    
    //s = sibling(n);//not needed
    
    the assignment to s is needed, as is explained in the text of case 2. I hesitate to correct just this because I am not sure about the correctness of the other part.
  • IMHO this code is hard to understand because almost everything is written inside the single for(;;) loop. Cases 4–6 (and possibly also case 2) should be moved out of the loop for better readability.

In addition, I do not see any reasons to include the iterative implementation only for the removal operation and not for the insertion operation. If the current code ever helps understanding of how removal works, then I think the iterative implementation of insertion would also be helpful. --fcp2007 (talk) 14:31, 4 May 2008 (UTC)[reply]

I personally am using the iterative implementation in benchmarks vs treaps. I find it very useful. It's kind of funny... I only read this discussion because I wanted to post about the bug described above! The same bug exists in case 6. Smilindog2000 (talk) 14:48, 21 May 2008 (UTC)[reply]
I agree about removing the iterative implementation personally. Nobody should be using code from Wikipedia articles, they're not appropriately licensed - the code is only there to explain the concepts. Dcoetzee 02:07, 25 December 2008 (UTC)[reply]

C code vs. pseudo-code

I find that pseudo-code would be far more appropriate for this article. After all, the article is about a data structure and the theory behind implementing it. A reference implementation should be hosted elsewhere, not inside an encyclopedia. I would recommend replacing the C code with pseudo-code, however I didn't want to do this myself until certain everybody agrees. -- Jerome Baum —Preceding unsigned comment added by 77.176.73.67 (talk) 01:19, 15 December 2008 (UTC)[reply]

There used to be pseudocode. I replaced it with C code fragments originally because of a series of spurious "fixes" that made the cases incorrect; it was the only way I could verify with any certainty that the code was valid, since with runnable code these fixes could be tested. However, nowadays I've moved more towards using pseudocode and the code block templates to control spurious fixes. I think the article should also give more attention to the diversity of different methods for implementing the operations, and to various choices that can be made for the properties. Dcoetzee 01:50, 25 December 2008 (UTC)[reply]

citation for Rudolf Bayer

Paper title: Symmetric binary B-Trees: Data structure and maintenance algorithms there is a pdf available here for 34 bucks. http://www.springerlink.com/content/qh51m2014673513j/ —Preceding unsigned comment added by 140.177.205.91 (talk) 20:55, 14 August 2009 (UTC)[reply]

"Simple path"?

The fifth listed property under "Properties" gave me long pause. Red-black trees do not contain cycles. Qualifying "paths" with "simple" suggests otherwise, and merely confused me. Is there any reason "simple" should not be deleted? Strebe (talk) 20:42, 12 April 2010 (UTC)[reply]

I think the intended meaning is that the path does not repeat edges. For example, it doesn't go from the root down to the leaf, then back up, then back down to some other leaf (this is normally called a walk). In the context of binary trees I think this clarification is unnecessary. Dcoetzee 21:31, 14 April 2010 (UTC)[reply]

Is insertion case 4 mandatory? why?

I am very new to this subject, but with careful study, I am able to follow this article. However I found insertion case 4 to be quite confusing. It took a little too much pondering and a little guessing to get to where I think I understand it. A couple points:

1) Unlike all the other cases, this case doesn't appear to ever complete any insertions. This is kinda confusing in itself. So I suggest that it is explained that this step does not solve a particular case, but rather converts the case, so it's the same as case 5. (I would do this edit myself if I was sure I'm right about what's going on.)

2) The language makes this step sound optional. ie it says that the rotation "can be performed" not "must be performed" or "is performed". This coupled with my confusion as to why this is being done in the first place makes for a confused me. So I suggest that this step be made explicitly mandatory (assuming I'm right and it is.)

I hope the above feedback enables someone to make this article even more clear.

Thank you, - JasonWoof (talk) 02:12, 19 April 2010 (UTC)[reply]

Removal:case 3

>> Case 3: P, S, and S's children are black
We remove N and all N's childs are leafs, so if P ans S are black, S's children should be leafs too(becouse we must have the same number of black nodes on every path). On the picture we see them as nodes that have children... —Preceding unsigned comment added by 91.103.66.4 (talk) 16:32, 18 May 2010 (UTC)[reply]

Incomplete case 5

When inserting 3, 2, 1 in order you get the following tree:

 3[black] (child: 2[red] (left: 1[red]))

And then you run into case 5, which attempts to rotate using 1's grand parent (3) as pivot, which fails, because 3 has no parent. In this case, 2 should have been the pivot. So the wording instead of:

In this case, a right rotation on the parent of P is performed;

should be

In this case, a right rotation on the parent of P, G is performed, if G is not the root node. In that case, a rotation on P is performed.

And the code accordingly

 rotate_right( g->parent ? g : n->parent );
 rotate_left( g->parent ? g : n->parent );

Also it should be clarified that the "rotate_"-functions use the Pivot as the argument (which is the explained in Tree rotation)

--77.8.172.59 (talk) 02:59, 20 May 2010 (UTC)[reply]

You're right about the pivot being the argument, but your suggestion for a fix is wrong. I don't know why, but I've tried with and without your suggestion, and your suggestion causes problems. ;-) ArlenCuss (talk) 02:59, 20 April 2011 (UTC)[reply]

Insertion Case 3

According to the wiki:

Case 3: If both the parent P and the uncle U are red, then both nodes can be repainted black and the grandparent G becomes red (to maintain Property 5 (All paths from any given node to its leaf nodes contain the same number of black nodes)).

Shouldn't it be to maintain Property 4? As Property 5 isn't violated by the addition of a red node.

Edit: After proper analysis I think both Properties should be mentioned. Property 4 for the reason why P and U are turned black and Property 5 for the reason why G turns red. RevoltPT (talk) 15:04, 11 June 2010 (UTC)[reply]

C++ reference

The "often" in "In the C++ Standard Template Library, the containers std::set<Value> and std::map<Key,Value> are often based on red-black trees" does not make sense to me...134.58.42.46 (talk) 15:34, 22 December 2010 (UTC)[reply]

Most (possibly all) implementations of these STL containers use red-black trees, but any data structure that fulfills the behavioral requirements could be used. Hence, “often”. Strebe (talk) 19:04, 22 December 2010 (UTC)[reply]
I'll change it to "typically". (Prevents giving the impression that any red-black implementation might actually turn out to be based on a different data structure on Wednesdays and bank holidays.)  Card Zero  (talk) 08:26, 7 February 2011 (UTC)[reply]
  1. ^ http://en.wikibooks.org/wiki/F_Sharp_Programming/Advanced_Data_Structures. {{cite web}}: Missing or empty |title= (help)