
Talk:Heap (data structure)

From Wikipedia, the free encyclopedia

This is the current revision of this page, as edited by AHMartin (talk | contribs) at 18:36, 15 January 2024 (Is "almost complete" correct?: Reply). The present address (URL) is a permanent link to this version.


Unifying heaps


The articles about heaps keep switching from min to max heaps and back (e.g. in this article the definition is for max heap while the complexity is for min heap). I think we should use min-heap (or max-heap) consistently and say in ONE place that the other is just the opposite. What do you think? Jirka6 18:38, 25 February 2007 (UTC)[reply]

I agree. Bender2k14 (talk) 00:06, 15 October 2009 (UTC)[reply]

Definition


I was under the impression that heaps must also be complete trees...that is...the last level of the tree must be filled with a left to right ordering but maybe this is not true for all heaps...

That only holds for binary heaps. --Mellum 09:59, 3 Dec 2004 (UTC)
Imagine a scenario where the bottom level is completely full. If a new node were to be added to the heap, how could it maintain the property of being a complete binary tree, since the leftmost node would not be filled at the last level? Sheepdude 19:46, 20 April 2007 (UTC)[reply]
You are talking about a perfect tree, not a complete tree. See Types of binary trees. Bender2k14 (talk) 00:16, 15 October 2009 (UTC)[reply]

Let A and B be nodes of a heap, such that B is a child of A. The heap must then satisfy the following condition (heap property): key(A) ≥ key(B)

This is only for the special case of a max-heap, isn't it? --Abdull 17:38, 6 March 2006 (UTC)[reply]

Yes, special in the sense that a heap is either a minheap or a maxheap. Bender2k14 (talk) 00:11, 15 October 2009 (UTC)[reply]

complete table of run time complexity


Anybody able to complete the table of the run time complexity? It also would be nice to have more information than worst case or amortised cost.


Cyc 14:05, 2 May 2006 (UTC)[reply]

That table is not correct. As the proof on the Binary Heap page shows, the running time for createHeap should be O(n). I will look into the createHeap times for the other heap implementations to see if they are wrong as well. Bender2k14 21:42, 12 October 2007 (UTC)[reply]

Ditto: the running time for createHeap should be O(n). If it were O(1) as claimed, then sorting is O(n) as follows: for i = 1 to n do { createHeap; remove root } — Preceding unsigned comment added by 96.60.253.126 (talk) 13:45, 28 January 2012 (UTC)[reply]

According to the cited source (pages 159 and 506, 3rd ed.), createHeap on a Binary Heap is O(n) but the amortized worst case is Θ(1).
Sergio91pt (talk) 12:21, 11 March 2012 (UTC)[reply]
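(A note for later readers: the O(n) bound discussed above is easy to check against Python's standard library, whose heapq.heapify is the linear-time bottom-up build. A minimal sketch; the variable names are just illustrative:)

```python
import heapq

# Linear-time build: heapify sifts down from the last internal node,
# doing O(n) total work because most nodes sit near the leaves.
data = [9, 4, 7, 1, 3, 8, 5]
heapq.heapify(data)          # createHeap in O(n)
assert data[0] == 1          # root of a min-heap holds the minimum

# If createHeap were O(1), popping the root n times would sort in O(n),
# contradicting the Omega(n log n) comparison-sort lower bound.
sorted_out = [heapq.heappop(data) for _ in range(len(data))]
assert sorted_out == [1, 3, 4, 5, 7, 8, 9]
```

This is exactly the heapsort argument made by 96.60.253.126 above, run in reverse.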

heapifying


You know you guys use the term heapifying in this topic? The topic is a mishmash of stuff in its current revision and I might clean it up like I did with tree (data structure). Hope no one objects. Josh Froelich 16:43, 16 February 2007 (UTC)[reply]

confusing


I'm a graduate level EE who has been working for 20 years - and I don't understand. Perhaps the explanation could be less circular. Please keep in mind that this data is used by people who do not already know much about the subject.

Agreed. This entry does not read well, and provides no explanations or definitions. —Preceding unsigned comment added by Chieffancypants (talkcontribs) 03:47, 11 January 2008 (UTC)[reply]

Key needs to be explained in the definition


It should be explained what a "key" is, because it is in the definition with no explanation, which makes the rest of the article useless to anyone who doesn't already know what a heap is, which in turn makes the article useless for 95% of people visiting the page, because if you know what a key is there's only a small chance you don't know what a heap is. There isn't even a link (although, note that a link to key wouldn't be adequate in this case, because it is probably possible to give a short non-technical description of a "key"). 76.175.72.51 (talk) 17:19, 13 April 2009 (UTC)[reply]

I agree. I don't know what kind of key this article is referring to, and so the introduction is meaningless to me. Could someone please explain what a heap key is? 75.72.9.174 (talk) 20:09, 21 December 2013 (UTC)[reply]
Yes, "key" is one of several problems in the first paragraph [July2015 version]. A definition of "heap" should be given in the first paragraph. In my last LISP class this page developed into a source of jokes: "if you don't know what you are talking about, elevate the discussion to things that no one knows" and "now politicians will tell us about data structures", etc. wrstark (talk) 15:28, 22 July 2015 (UTC)wrstark[reply]

I know what a "key" is for sorting, but it's unclear to me what determines whether one item in the heap is the parent of another item. The article says that heaps can be used to implement stacks and queues, but the factor that determines parentage is needed to see the relationship. 206.47.113.103 (talk) 16:54, 10 January 2016 (UTC)[reply]

Sublinear Claim


I just removed the claim that heaps can be used to find the max, min, both the max and min, median, or k-th largest element in sublinear time because that is not true. You have to look at each element at least once, so the running time is at least linear. Bender2k14 (talk) 00:20, 15 October 2009 (UTC)[reply]

I'm pretty sure that the sublinear claim refers to finding these elements after the heap has been constructed. Otherwise that claim would be pointless, and should be removed altogether. Paradoctor (talk) 06:45, 10 November 2009 (UTC)[reply]
The current statement "Finding the min, max, both the min and max, median, or even the k-th largest element can be done in linear time using heaps" also seems spurious. Firstly, you can easily find min, max and both min and max in linear time without using any kind of special data structure. And … can you really find the median in linear time in any way in which a heap is instrumental? (The standard linear median-finding algorithm doesn't use a heap.) You can find the k-th largest element in linear time with a heap 'if' you assume that k is a constant. Otherwise, if you don't want k popping up in the running time, using the same linear selection algorithm as for the median would be the solution. No heap involved, though. The most natural claims to have in this bullet point (related to selection) would be that it can find the k-th largest or smallest item in (for example) O(n lg k) or O(n + k lg k) time. Mlhetland (talk) 19:58, 12 July 2010 (UTC)[reply]
I removed the corresponding section according to the above discussion. --mellum (talk) 13:05, 2 August 2010 (UTC)[reply]
It's easy to find min/max in O(1) using heaps, and update them in O(ln(n)). You can't do that with the linear algorithm, it takes O(n) each and every time. After thinking about it for a while, I came up with O(1) for finding the median also using min/max heaps that update in O(ln n). But I added a ref tag.- Wolfkeeper 13:53, 2 August 2010 (UTC)[reply]
I hope you don't mind if I remove it again - the article you're citing refers to finding a median or kth element in constant time for data that's already in a heap. The O(ln n) time for an update is correct, but when you're constructing the heap you have to do that n times, so it's O(n ln n) for heap construction. Tristanreid (talk) 23:06, 20 February 2013 (UTC)[reply]
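(For reference, the O(n log k) selection claim Mlhetland mentions is straightforward to realize with a size-k min-heap over the input stream; the function name below is made up for illustration:)

```python
import heapq

def kth_largest(items, k):
    """Find the k-th largest element in O(n log k) time using a
    size-k min-heap: the heap's root is the k-th largest seen so far."""
    heap = list(items[:k])
    heapq.heapify(heap)                  # O(k) build
    for x in items[k:]:                  # n - k iterations ...
        if x > heap[0]:
            heapq.heapreplace(heap, x)   # ... each costing O(log k)
    return heap[0]

assert kth_largest([3, 1, 4, 1, 5, 9, 2, 6], 3) == 5
```

Note this is sublinear in neither sense: the whole input is still scanned once, which is the point made at the top of this thread.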

Brodal queue


Fibonacci queue cites this paper. From the abstract:

I think this should be mentioned here. Paradoctor (talk) 12:01, 7 November 2009 (UTC)[reply]

Weak heaps?


Is there any particular reason why weak heaps are not mentioned? Maybe this variant is not so well known? --188.126.207.212 (talk) 17:46, 23 August 2010 (UTC)[reply]

If you can write up something about weak heaps with proper sources, be bold. Qwertyus (talk) 15:04, 19 January 2011 (UTC)[reply]

Rm pseudo queues


I've removed all reference to "Pseudo Queues". I've never heard of them, and the reference given seems to have nothing to do whatsoever with heaps/priority queues. Qwertyus (talk) 15:09, 19 January 2011 (UTC)[reply]

Pairing Heap Amortized Linear Time Insert Operation


Hi folks,

Why is the insert operation in the pairing heap implementation denoted as amortized linear? Shouldn't it be linear without amortized analysis?

Greetings! —Preceding unsigned comment added by 91.18.86.134 (talk) 16:10, 21 February 2011 (UTC)[reply]

Because that's the way it's given in the citation. Btw., I've removed all the citation needed tags for the pairing heap operations, since they are all described in the reference given. I'm not going to cite it again in every field of the table. Qwertyus (talk) 10:48, 14 July 2011 (UTC)[reply]

Fibonacci heaps


I found this source for the Fibonacci heaps; however, it says that every operation is done in amortized time, mind you I haven't read the whole thing. I think it is the original paper. Plaga701 (talk) 10:41, 27 July 2011 (UTC)[reply]

Leaf heap


I just came across this term in course slides from MIT OCW, by J.B. Orlin, Network Optimization, spring 2003. I'd like to create a new article to describe this object. However, I cannot find a formal public reference now, since the slides at hand were inherited from a senior. In short, this is a heap with all real elements placed in leaves, and all the internal nodes have the value of the smaller child. Every operation costs log(n) time. I don't think this structure is practical, but it's reasonable to note it here to make a complete article. Can anybody help proceed along this line? Hupili (talk) 09:16, 4 February 2012 (UTC)[reply]
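(For what it's worth, the structure described above can be sketched as a complete binary tree whose leaves hold the elements and whose internal nodes cache the smaller child's value, i.e. a tournament tree. All names below are made up for illustration and assume the description given in the slides:)

```python
class LeafMinHeap:
    """Sketch of a 'leaf heap': the n elements live in the leaves of a
    complete binary tree; each internal node stores the minimum of its
    two children. find-min is O(1); updating a leaf is O(log n) because
    only the values on the path to the root must be recomputed."""

    def __init__(self, values):
        self.n = len(values)
        # 1-based array layout: internal nodes 1..n-1, leaves n..2n-1.
        self.tree = [None] * self.n + list(values)
        for i in range(self.n - 1, 0, -1):
            self.tree[i] = min(self.tree[2 * i], self.tree[2 * i + 1])

    def find_min(self):
        return self.tree[1]

    def update(self, leaf_index, value):
        i = self.n + leaf_index
        self.tree[i] = value
        while i > 1:                      # walk the O(log n) root path
            i //= 2
            self.tree[i] = min(self.tree[2 * i], self.tree[2 * i + 1])

h = LeafMinHeap([5, 2, 8, 3])
assert h.find_min() == 2
h.update(1, 9)            # overwrite the 2 with a 9
assert h.find_min() == 3
```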


Radix heap


Also from 'J.B. Orlin, Network Optimization, spring 2003.', a specialized data structure to improve Dial's algorithm (an implementation of Dijkstra's shortest path algorithm). It uses exponentially growing bucket widths to improve the find-min procedure. Hupili (talk) 02:48, 5 February 2012 (UTC)[reply]

Origin of Phrase "the Heap"


I deleted this line:

Some early popular languages such as Lisp provided dynamic memory allocation using heap data structures, which gave the memory area its name.

The line is mysterious to me because I don't see how heaps (the data structure) are useful in the context of memory management. What would the heap property be? Early Lisp implementations generally used mark-and-sweep to reclaim memory.

It was added by an anonymous editor (IP 82.180.29.126) on 20 April 2011. It was justified by a footnote to Cormen et al., but here is what Cormen et al. actually say:

We note that the term "heap" was originally coined in the context of heapsort, but it has since come to refer to "garbage-collected storage" such as the languages Lisp and Java provide. Our heap data structure is not garbage-collected storage, and whenever we refer to heaps in this book, we shall mean the structure defined in this chapter.

In The Art of Computer Programming, Vol. 1, 2nd Ed. p. 435, Knuth says the following on the use of "heap" in the context of dynamic memory:

Several authors began about 1975 to call the pool of available memory a "heap". But in the present series of books, we will use the word only in its more traditional sense related to priority queues.

I found an article dated 1985 which uses what it calls a "heap data structure" to manage memory in a Forth implementation. However the data structure described is not the heap which is the topic of this Wikipedia page.

Googling the line shows that it has been repeated multiple times on the web, but none of the references predate the 20 April 2011 Wikipedia edit. — Preceding unsigned comment added by Grubbiv (talkcontribs) 17:42, 19 August 2012 (UTC)[reply]

Floyd's algorithm


There's currently a reference to Floyd's algorithm in section Implementation and operations, but there's no mention of an application to heaps on the algorithm's page. I think the linked one is not the correct "Floyd's algorithm" in this case. If so, the link should be removed. — Preceding unsigned comment added by 77.56.7.92 (talk) 02:27, 8 February 2013 (UTC)[reply]

Contradiction of running times


The running time for Fibonacci heaps appears to contradict between the table at Heap (data structure)#Comparison of theoretic bounds for variants and at Fibonacci heap#Summary of running times. Please check which one is correct. — Preceding unsigned comment added by Theemathas (talkcontribs) 08:32, 30 October 2013 (UTC)[reply]

The timings on both pages appear identical? Although the order is somewhat different and delete operation is missing on the heap page. 178.116.166.208 (talk) 18:02, 16 November 2013 (UTC)[reply]

Sorry, I didn't notice that the rows are permuted. However, while most of the data are consistent, some entries still disagree:

  • decrease-key for pairing heap
  • merge for binary heap

If I missed anything, please post here. Theemathas (talk) 06:37, 5 January 2014 (UTC)[reply]

Binomial heap should also have O(1) amortized insertion, as stated by its page. Alexis Beingessner (talk) 20:57, 24 February 2014 (UTC)[reply]

I have replaced the tables on each page with a template with corrected data combined from both pages. I believe this fixes all the issues listed above. Wingedsubmariner (talk) 04:02, 1 September 2014 (UTC)[reply]

Binomial Heap should have O(1) find-min. It is trivial to keep track of the minimum element, just by using a constant amount of space to store it. — Preceding unsigned comment added by Agnishom (talkcontribs) 05:41, 29 March 2017 (UTC)[reply]
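(The constant-space trick suggested above can be sketched generically: cache the minimum alongside the structure, update the cache in O(1) on insert, and recompute it during delete-min, where a logarithmic cost is already being paid. A hedged illustration using Python's heapq as a stand-in for the binomial heap's root list; the class name is invented:)

```python
import heapq

class MinTrackingHeap:
    """Illustrative wrapper: keep the current minimum in O(1) extra
    space so find-min never has to scan the roots. The cache is updated
    in O(1) on insert; delete-min refreshes it from the underlying
    heap, where the O(log n) cost already lives."""

    def __init__(self):
        self.heap = []          # stand-in for a binomial heap's trees
        self.minimum = None

    def insert(self, key):
        heapq.heappush(self.heap, key)
        if self.minimum is None or key < self.minimum:
            self.minimum = key  # O(1) cache update

    def find_min(self):
        return self.minimum     # O(1), no scan needed

    def delete_min(self):
        out = heapq.heappop(self.heap)
        self.minimum = self.heap[0] if self.heap else None
        return out

h = MinTrackingHeap()
for k in [4, 2, 7]:
    h.insert(k)
assert h.find_min() == 2
assert h.delete_min() == 2
assert h.find_min() == 4
```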


Could someone replace the absurd link to completeness with an appropriate one? (I'm not sure which one that would be, or I'd have done it by now.)

I'm amazed that anyone would fail to expect completeness to be a disambiguation page. Common sense. Michael Hardy (talk) 22:41, 3 May 2014 (UTC)[reply]

heap vs. binary heap


There are two distinct articles, but in this one we see: "... a heap is a complete binary tree ..." -- fixed with "If ..."

min heap vs. max confusion


Article states that " ...the word heap will always refer to a min-heap ..." and then refers to the diagram of a max-heap! -- fixed --- silly sentence removed

Naming of operations


The common operations are a huge mess at the moment with their naming ranging from abc-xyz to AbcXyz and literally EVERYTHING in between. Also, the parameters are sometimes specified, sometimes not, regardless of whether the function actually takes parameters or not. — Preceding unsigned comment added by 93.139.145.237 (talk) 05:07, 5 July 2014 (UTC)[reply]


Incorrect short definition for heapify operation


To me, the short definition in the Operations section for heapify is not correct. It currently states: "heapify: create a heap out of given array of elements". However, to me this procedure is more for ensuring the heap contract. CLRS describes a MAX-HEAPIFY method in Section 6.2 ("Maintaining the heap property"). Skiena's Algorithm Design Manual, Section 4.3.3 ("Extracting the Minimum") explains "This percolate-down operation is also called heapify, because it merges two heaps (the subtrees below the original root) with a new key."

I personally find the "create-heap" operation definition ("create-heap: create an empty heap") a bit useless, adding to the confusion around "heapify" ("create a heap out of given array of elements"). To me, "create-heap" (what I would call "build-heap") should use the current definition of "heapify" and a new one presented based on CLRS, Skiena, et al.

Njuk-njuk (talk) 13:05, 21 August 2015 (UTC)[reply]
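(To make the terminology dispute above concrete: in CLRS, MAX-HEAPIFY is the percolate-down that restores the heap property at a single node, while building a heap from an array calls it bottom-up on every internal node. A rough Python sketch of both, with invented function names:)

```python
def max_heapify(a, i, n):
    """CLRS-style percolate-down: assuming the subtrees under node i's
    children are already max-heaps, sift a[i] down until the heap
    property holds at i. O(log n)."""
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest

def build_max_heap(a):
    """What the article's Operations section calls 'heapify' (and what
    Njuk-njuk would call 'build-heap'): run max_heapify bottom-up on
    every internal node. O(n) overall."""
    for i in range(len(a) // 2 - 1, -1, -1):
        max_heapify(a, i, len(a))

a = [3, 9, 2, 1, 4, 5]
build_max_heap(a)
assert a[0] == 9
assert all(a[(i - 1) // 2] >= a[i] for i in range(1, len(a)))
```

The naming collision is visible even here: one name, two operations with different costs and preconditions.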

Puzzling diagram


What is the sorting property for the heap in the diagram? It's a maxheap with all the perfect squares on the right main subtree and all the primes on the left, but I cannot come up with any rule for how the nodes were chosen beyond that.— Preceding unsigned comment added by 2601:19a:c102:74d5:7805:8d9:c4ee:94ab (talkcontribs)

Clarifying min vs max, esp in the lead


I see several past comments about potential confusion surrounding min and max heaps but I think the problem is still there, specifically in the lead itself. There are three issues:

  1. The lead contradicts itself by defining heap property in terms of (first sentence) only "max heaps" but then goes on (fourth sentence) to include "min heaps".
  2. Even that restricted definition is further contradicted by the subsequent mentions of max and min heaps in terms of whether the correct relationships are (first sentence) "greater than" and "less than", or (third and fourth sentences) "greater than or equal to" and "less than or equal to". Searching around the web, the "or equal to" versions are by far the more common.
  3. HOWEVER, that then makes the references to "root node" confusing. Putting everything up to that point together it would appear that one could have a heap with multiple nodes, all with the same key (whatever that means). But in that case, what is meant by "highest" or "lowest" (in terms, respectively, of the roots of a max heap and min heap)? Part of the problem here is, as someone earlier noted, it's not exactly clear what a key is. The addition of "or value" helps, but not enough. Is it really the case that one could have a parent and child with the same key (value)? If so, in what way are they: a. different nodes, and; b. parent and child? If not (esp. to a.), then why is the heap property defined in terms of "...or equal to" properties?

Until I hit the third of those points, I was just going to change the text from its current:

In computer science, a heap is a specialized tree-based data structure that satisfies the heap property: if P is a parent node of C, then the key (the value) of node P is greater than the key of node C. A heap can be classified further as either a "max heap" or a "min heap". In a max heap, the keys of parent nodes are always greater than or equal to those of the children and the highest key is in the root node. In a min heap, the keys of parent nodes are less than or equal to those of the children and the lowest key is in the root node.

to the following:

In computer science, a heap is a specialized tree-based data structure that satisfies the following heap property: for all nodes C and P, if P is a parent of C, then the key (the value) of P is either greater than or equal to (max heap) or less than or equal to (min heap) the key of C. The node with the highest (max heap) or lowest (min heap) key is called the root node.

However, as I say, that may still be an insufficient definition if you really can have a multiple-node heap with all keys the same (i.e. as seems to be allowed by the "or equal to" component of the heap property). So I've left it unchanged for now. Maybe someone who knows this stuff better than I do could have a look? (And when you are, could you also check the grammar of my replacement definition. I'm trying to say that the relations are either all greater-than-or-equal-to or all less-than-or-equal-to, and I could say it in formal logic, but in the plain English needed here I'm not completely sure I've captured the correct logical associations. Thx!) Sleety Dribble (talk) 20:08, 9 September 2017 (UTC)[reply]

Totally agree about the internal contradiction, which was the most glaring issue. The "≥" versus ">" thing was an issue, and I addressed that too. Basically just used the text you proposed with a couple tweaks, closer to plain English than formal logic. Then I changed the definition of root node to dodge the case when all nodes have equal key.
Details: Yes, I do think you can have a parent and child with the same key. NB I'm not a pro at this, though. They are different nodes in the sense that they are different "slots" in which things are stored. They're parent and child in that they satisfy the heap property. Kind of like how [1,2,3] and [1,2,2,3] and [7,7,7] all satisfy the "perfectly sorted list" property. If someone asks me to sort the list [7,7,7], maybe I'm temporarily bemused about why they're asking, or whether each 7 is "really" a different thing from its neighbors, but at the end of the day I just return a sorted list ;).
Some things became clear to me when I read the NIST definition of key, which is simply "The part of a group of data by which it is sorted [or] indexed...." It confused me initially too, because I'm used to thinking of key and value as different things. But I gather heaps are much more closely related to simple arrays than to key:value stores. The key is the value; there's no "separate" key by which to look up the value.
Finally, I see what you mean about clarifying that "the relations are either all greater-than-or-equal to or all less-than-or-equal-to." I didn't yet address this, but I might. I like what NIST says about "more extreme than or equal to." Probably would require splitting 1st sentence into 2, which might be worth it.--Officiallyover (talk) 14:42, 8 October 2017 (UTC)[reply]
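(The equal-keys point debated above is easy to demonstrate: an array of identical keys already satisfies the "≤" form of the heap property, so a parent and child may share a key. A quick Python check; the helper name is invented:)

```python
import heapq

def is_min_heap(a):
    """True iff every parent key is <= each child key, i.e. the
    'or equal to' form of the heap property, which permits duplicates."""
    return all(a[(i - 1) // 2] <= a[i] for i in range(1, len(a)))

assert is_min_heap([7, 7, 7])       # all-equal keys form a valid heap
a = [7, 7, 7]
heapq.heapify(a)                    # heapify leaves it unchanged
assert a == [7, 7, 7]
assert is_min_heap([1, 2, 2, 3])    # duplicates across levels are fine
assert not is_min_heap([2, 1])      # a strict violation is still caught
```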

I would like to offer my page of animated illustrations about the Heap (http://www.chrislaux.com/heap.html) for the external links section. I believe the animations explain the heap nicely in visual form, to go with the text-focused explanations here.

Ctlaux (talk) 19:02, 3 September 2019 (UTC)[reply]

Is first image ("Example of a binary max-heap...") correct?


I'm not a specialist, but is the first image (https://en.wikipedia.org/wiki/Heap_(data_structure)#/media/File:Max-Heap.svg) correct? In the "array representation" part (at the image bottom), shouldn't the slot with value "17" connect with "2" and "7", instead of the slot with value "3"? Joint ventura (talk) 18:37, 11 February 2021 (UTC)[reply]

(For the record), there was apparently a glitch, which got fixed 6 days after you noticed it. AHMartin (talk) 17:55, 15 January 2024 (UTC)[reply]

Is "almost complete" correct?


As a beginner programmer, I was reading this article and was confused by the term "almost complete." There's no definition of the property on this page, and I can only find definitions of "almost complete" as it applies specifically to binary trees. However, it's clearly stated that a heap can be nonbinary. So does "almost complete" mean anything in this context? What exact property of heaps is it meant to describe? (In addition, even for binary heaps, "almost complete" excludes the "perfect" case, which doesn't seem right at all.) --66.189.60.1 (talk) 01:09, 7 May 2021 (UTC)[reply]

Given the existence of the (linked) article binary heap, this Heap page has no business discussing the details of binary heap implementations. Also, the nomenclature variants for "complete", "perfect", "full", etc. binary trees are discussed in the (linked) article Binary tree § Types_of_binary_trees. But since "almost complete" is a variant characterization of the tree underlying a binary heap's implementation, I changed it to "complete". Now this article uses the term consistently. AHMartin (talk) 18:36, 15 January 2024 (UTC)[reply]