Talk:B+ tree

From Wikipedia, the free encyclopedia
Jump to: navigation, search
WikiProject Computer science (Rated Start-class, High-importance)
WikiProject icon This article is within the scope of WikiProject Computer science, a collaborative effort to improve the coverage of Computer science related articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Start-Class article Start  This article has been rated as Start-Class on the project's quality scale.
 High  This article has been rated as High-importance on the project's importance scale.
 

inner insert breay, this diagram erringly splits nodes right when they reach capacity, thus precluding there ever being any "full" nodes. B-tree algorithms (and B+-tree) are supposed to split nodes only after capacity is reached and a new value is tried to be inserted in a full node. 132.198.12.98 21:30, 30 November 2007 (UTC)

The diagram is incorrect and has been removed. While there are some implementations of B+ trees that do split before the block is full (this allows you to perform the split after insertion is completed), this is not the "normal" procedure.78.91.39.182 15:13, 1 December 2007 (UTC)

Diagram[edit]

The diagram isn't very clear. It doesn't show that the data matches the keys, and it makes the 'linked list' pointers much different than the data pointers. —Preceding unsigned comment added by 24.26.131.178 (talk) 04:46, 4 December 2007 (UTC)


There should be step-by-step insertion/deletion diagrams showing a few common scenarios. --Aelon (talk) 10:32, 22 December 2007 (UTC)


The diagram contains a major mistake: in a B+ tree, right subtrees have keys greater than or equal to those in parent node. The root node must therefore be corrected to 4 - 6, and the issue is dealt with. Note: the very insertion algorithm described in this article ("Insert the new leaf's smallest key and address into parent") confirms the same. 26.09.2009

Merge suggested[edit]

It seems to me like the main difference between B-Trees and B+-Trees is a relatively minor one that has little effect on how operations proceed, which renders much of this article redundant with B-tree. I suggest this article be merged into B-tree and become a section of that article. Dcoetzee 03:58, 2 December 2008 (UTC)

Diagram[edit]

Am I not correct that the current diagram doesn't show a B+ Tree at all? First, nodes will have between N and 2N entries. Which integer N is 3 twice?

You are not correct. B+ trees are order B trees (a node must have between 1 and B children), and B doesn't have to be 2, it can take any integer value greater than 1.Michealt (talk) 11:47, 29 August 2011 (UTC)

Tree Order[edit]

Ramakrishnan's book on DBMS says that tree order of d means size of each internal node can range between d <= m <= 2d. However this article says it ranges between d/2 <= m <= d. I am not sure which one is correct. But majority of the CS Schools around take Ramakrishnan's book as reference. So this may be written wrong in the article. — Preceding unsigned comment added by 95.15.75.81 (talk) 18:18, 17 December 2011 (UTC)

Merge or Additional Clarity[edit]

The only sentence that separates this from the B-tree article is In a B+ tree, in contrast to a B-tree, all records are stored at the leaf level of the tree; only keys are stored in interior nodes.

Can someone add why you'd use a B+ tree instead of a B tree? There seem to be a lot of database solutions using B+ trees, so there has to be a reason, but without that reason, this article doesn't differ substantially from the B tree article. Dean (talk) 14:14, 22 July 2010 (UTC)

I've added some content to the implementation section describing at least one advantage of a B+-tree over a B-tree in the context of database systems. It seems to me that merging these two articles wouldn't be a great idea; while it's true that the articles are similar, that's largely due to a lack of detail on the insert and delete algorithms. There are subtle differences in how these work. If someone were to put in detailed pseudocode for these algorithms, they'd be somewhat different. That would make for quite a large merged article. While it's true that merging them now without that content might make some sense, it just means that someday, when someone wants to put in the detailed pseudocode, someone else would suggest splitting them. Maybe someone should just put in detailed code now? SeparateWays (talk) 19:02, 22 July 2010 (UTC)

Dean, as I understand it, a B-tree node takes up 50% more space (depending on data types, of course). This means that with the same block size a B+-tree fans out quicker, allowing you to traverse less tree levels to get to the leaf. — Preceding unsigned comment added by 82.139.81.0 (talk) 03:17, 18 May 2012 (UTC)

SQlite's use of B+ trees[edit]

At the very top of the article it is mentioned that SQlite uses B+ trees for its indices. The referenced link [[1]] says that B+ trees are only used to store data and not table indices. Should the reference to SQlite be dropped entirely or should there be an extra clause to say "SQlite just uses them for data!"? — Preceding unsigned comment added by 137.48.72.149 (talk) 13:54, 25 January 2012 (UTC)

I think it's fair to keep SQLite on that list as it is. The term "B+ tree" is only relevant when data records consist of a lookup field (e.g. table's primary key) and non-lookup data fields (rest of table data).
In the case of regular database indexes, the whole index record is the lookup key, therefore a B+ tree and B-tree would be equivalent anyway. I suspect that this observation also applies to other databases. -- intgr [talk] 13:40, 2 February 2012 (UTC)

Deletion of seperator value from index node[edit]

as seperator keys are copied from leaf and can be subsequently moved to higher index nodes, while deleting a key value we have to delete the key twice, once from the leaf and once from the index, right?

                     9
                  /     \
           2 5 - -       11    17        21     -
          / | \         /      |         |  \
              5 6 7 - 9 10 - - 11 12 13 -

like in this case if we want to delete 9, we first delete 9 from leaf; which will be prevented from underflowing by borrowing 11 from right sibling, adjusting their seperator entry in parent to 12 subsequently. The tree will then look like

                     9
                  /     \
           2 5 - -       12     17       21     -
          / | \         /       |        |  \
              5 6 7 - 10 11 - - 12 13 - -

but should we then delete the entry 9 from the root? which will need another deletion operation. Or we just keep that entry, it would take some space though and can acumulate to a larger space.

--Samik ganguly (talk) 10:30, 5 April 2012 (UTC)

Maximum possible number of keys in the root node as a leaf[edit]

Hello,

Quotation from the article: "(The root is also the single leaf, in this case.) This node is permitted to have as little as one key if necessary, and at most b."

Are you sure that the root node possible maximum number of keys as a leaf is b and not (b - 1) like any other "regular" leaf?

Thank you. — Preceding unsigned comment added by 217.128.106.129 (talk) 21:17, 10 November 2012 (UTC)

PostgreSQL's use of B+ trees[edit]

PostgreSQL is capable of using B+ trees: http://www.postgresql.org/about/ Ddugovic (talk) 16:59, 25 December 2012 (UTC)

That page only mentions B+ trees in relation to GiST, which is a framework for building advanced inverted index structures. The fact that it *can* be used for building B+ trees doesn't mean that anyone uses it that way. I think it would be very misleading to list it on this article because of GiST.
Normal ("btree" type) indexes in Postgres are not B+ trees. The distinction between B+ trees and B-trees is kind of nonsense for database indexes in the first place -- all the columns in the index itself *are* the lookup key, and they're the same on the leaf level as any other level. The *record* itself is generally stored in a separate structure -- in the case of Postgres, it's the table heap.
The distinction does make sense for databases which store whole table data in a tree, such as InnoDB (which uses a normal B-tree) or Oracle's index-organized tables (and others). Postgres always stores table data in a heap, not a tree, thus the B+ tree/B-tree classificaiton is not applicable. -- intgr [talk] 23:03, 25 December 2012 (UTC)

So, what is a B+tree?[edit]

I. Here it states, "A B+ tree can be viewed as a B-tree in which each node contains only keys (not key-value pairs)"

II. However, in the B tree article : "In the B+-tree, copies of the keys are stored in the internal nodes; the keys and records are stored in leaves"

Is is I, II, or neither?

Search algorithm, insertion, and merging gibberish[edit]

First, a function just calls another function. What is k_0 or k_i? Why are there no brackets like in k_[i+1]? Where does p come from? etc .. etc... etc ...

"B-trees grow at the root and not at the leaves" Umm ....

Is there any reason whey the "merging" garbage should not be deleted?

Insertion: "B-trees grow at the root and not at the leaves."[edit]

I read the referenced "fundamentals of database systems" and after that I do not understand this saying either. If a leaf node is split the tree grows at its bottom as well, which is the common case. Maybe it would be better to say that **new levels** are added by splitting the root node and adding a new root.