Bucket queue

From Wikipedia, the free encyclopedia
Jump to navigation Jump to search

In the design and analysis of data structures, a bucket queue[1] (also called a bucket priority queue[2] or bounded-height priority queue[3]) is a priority queue for prioritizing elements whose priorities are small integers. It has the form of an array of buckets: an array data structure, indexed by the priorities, whose cells contain buckets of items with the same priority as each other.

The bucket queue is the priority-queue analogue of pigeonhole sort (also called bucket sort), a sorting algorithm that places elements into buckets indexed by their priorities and then concatenates the buckets. Using a bucket queue as the priority queue in a selection sort gives a form of the pigeonhole sort algorithm.

Applications of the bucket queue include computation of the degeneracy of a graph as well as fast algorithms for shortest paths and widest paths for graphs with weights that are small integers or are already sorted. Its first use[2] was in a shortest path algorithm by Dial (1969).[4]

Basic data structure[edit]

This structure can handle the insertions and deletions of elements with integer priorities in the range from 0 to some known bound C, as well as operations that find the element with minimum (or maximum) priority. It consists of an array A of container data structures, where array cell A[p] stores the collection of elements with priority p. It can handle the following operations:

  • To insert an element x with priority p, add x to the container at A[p].
  • To remove an element x with priority p, remove x from the container at A[p]
  • To find an element with the minimum priority, perform a sequential search to find the first non-empty container, and then choose an arbitrary element from this container.

In this way, insertions and deletions take constant time, while finding the minimum priority element takes time O(C).[1][3]

Optimizations[edit]

As an optimization, the data structure can also maintain an index L that lower-bounds the minimum priority of an element. When inserting a new element, L should be updated to the minimum of its old value and the new element's priority. When searching for the minimum priority element, the search can start at L instead of at zero, and after the search L should be left equal to the priority that was found in the search.[3] In this way the time for a search is reduced to the difference between the previous lower bound and its next value; this difference could be significantly smaller than C. For applications of monotone priority queues such as Dijkstra's algorithm in which the minimum priorities form a monotonic sequence, the sum of these differences is at most C, so the total time for a sequence of n operations is O(n + C), rather than the slower O(nC) time bound that would result without this optimization.

Another optimization (already given by Dial 1969) can be used to save space when the priorities are monotonic and, at any point in time, fall within a range of r values rather than extending over the whole range from 0 to C. In this case, one can index the array by the priorities modulo r rather than by their actual values. The search for the minimum priority element should always begin at the previous minimum, to avoid priorities that are higher than the minimum but have lower moduli.[1]

Multi-level bucket queue[edit]

When implementing monotone priority queues with a large range of priorities, it is possible to reduce space by using several levels of bucket queue. Priorities closest to the most recently extracted are placed in the lowest level, which has a limited range. More distant priorities are placed in higher-level buckets whose size equals the full range of the lower level.

For example, suppose we have a monotone min-queue over a range of 1000000, implemented as three levels of bucket queues of size 100. Further suppose the most recently extracted value had priority 123456. Values in the range 123456123499 are kept in buckets 56–99 at the lowest level. Values in the range 123500129999 are kept in buckets 35–99 at the second level, ignoring the last two digits. And values in the range 130000999999 are kept in buckets 13–99 at the third level.

Whenever one level is found to be empty, the next level up is searched for the minimum non-empty bucket, whose entries are then distributed across the empty level. If there are k levels, an entry may have to pass through all of them, taking k times as long as a one-level bucket queue, but this is a small increase in time for a large saving in space.

The bound on the time spent per entry depends on the priority queue being monotone, so that entries are only ever moved to lower levels.

Applications[edit]

A bucket queue can be used to maintain the vertices of an undirected graph, prioritized by their degrees, and repeatedly find and remove the vertex of minimum degree.[3] This greedy algorithm can be used to calculate the degeneracy of a given graph. It takes linear time, with or without the optimization that maintains a lower bound on the minimum priority, because each vertex is found in time proportional to its degree and the sum of all vertex degrees is linear in the number of edges of the graph.[5]

In Dijkstra's algorithm for shortest paths in positively-weighted directed graphs, a bucket queue can be used to obtain a time bound of O(n + m + dc), where n is the number of vertices, m is the number of edges, d is the diameter of the network, and c is the maximum (integer) link cost.[6] This variant of DIjkstra's algorithm is also known as Dial's algorithm, after Robert B. Dial, who published it in 1969.[7] In this algorithm, the priorities will only span a range of width c + 1, so the modular optimization can be used to reduce the space to O(n + c).[1] A variant of the same algorithm can be used for the widest path problem, and (in combination with methods for quickly partitioning non-integer edge weights) leads to near-linear-time solutions to the single-source single-destination version of this problem.[8]

References[edit]

  1. ^ a b c d Mehlhorn, Kurt; Sanders, Peter (2008), "10.5.1 Bucket Queues", Algorithms and Data Structures: The Basic Toolbox, Springer, p. 201, ISBN 9783540779773 .
  2. ^ a b Edelkamp, Stefan; Schroedl, Stefan (2011), "3.1.1 Bucket Data Structures", Heuristic Search: Theory and Applications, Elsevier, pp. 90–92, ISBN 9780080919737 . See also p. 157 for the history and naming of this structure.
  3. ^ a b c d Skiena, Steven S. (1998), The Algorithm Design Manual, Springer, p. 181, ISBN 9780387948607 .
  4. ^ Dial, Robert B. (1969), "Algorithm 360: Shortest-path forest with topological ordering [H]", Communications of the ACM, 12 (11): 632–633, doi:10.1145/363269.363610 .
  5. ^ Matula, David W.; Beck, L. L. (1983), "Smallest-last ordering and clustering and graph coloring algorithms", Journal of the ACM, 30 (3): 417–427, doi:10.1145/2402.322385, MR 0709826 .
  6. ^ Varghese, George (2005), Network Algorithmics: An Interdisciplinary Approach to Designing Fast Networked Devices, Morgan Kaufmann, ISBN 9780120884773 .
  7. ^ Dial, Robert B. (1969), "Algorithm 360: Shortest-path forest with topological ordering [H]", Communications of the ACM, 12 (11): 632–633, doi:10.1145/363269.363610 .
  8. ^ Gabow, Harold N.; Tarjan, Robert E. (1988), "Algorithms for two bottleneck optimization problems", Journal of Algorithms, 9 (3): 411–417, doi:10.1016/0196-6774(88)90031-4, MR 0955149