Jump to content

X + Y sorting

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by David Eppstein (talk | contribs) at 00:59, 3 July 2020 (Applications: Hernandez Barrera 1996). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In computer science, X + Y sorting is the problem of sorting pairs of numbers by their sum. Given two finite sets X and Y, both of the same length, the problem is to order all pairs (x, y) in the Cartesian product X × Y in numerical order by the value of x + y. The problem is attributed to Elwyn Berlekamp.[1][2]

Classical comparison sorting

Unsolved problem in computer science:
Is there an sorting algorithm faster than ?

This problem can be solved using a straightforward comparison sort on the Cartesian product of the two given sets. When the sets have size , their product has size , and the time for a comparison sorting algorithm is . This is the best upper bound known on the time for this problem. Whether sorting can be done in a more slowly-growing time bound is an open problem.[3][2]

Harper et al. (1975) suggest separately sorting and , and then constructing a two-dimensional matrix of the values of that is sorted both by rows and by columns before using this partially-sorted data to complete the sort of . Although this can reduce the number of comparisons needed by a constant factor, compared to naive comparison sorting, they show that any comparison sorting algorithm that can work for arbitrary matrices that are sorted by rows and columns still requires comparisons, so additional information about the set beyond this matrix ordering would be needed for any faster sorting algorithm.[1]

Number of orderings

The numbers in the two input lists for the sorting problem can be interpreted as Cartesian coordinates of a point in the -dimensional space . If one partitions the space into cells, so the set has a fixed ordering within each cell, then the boundaries between these cells are hyperplanes defined by an equality of pairs . The number of hyperplanes that can be determined in this way is , and the number of cells that this number of hyperplanes can divide a space of dimension into is less than . Therefore, the set has at most different possible orderings.[1][4]

Number of comparisons

The number of comparisons required to sort is certainly lower than for ordinary comparison sorting: Michael Fredman showed in 1976 that sorting can be done using only O(n2) comparisons. More generally, he shows that any set of elements, whose sorted ordering has already been restricted to a family of orderings, can be sorted using comparisons, by a form of binary insertion sort. For the sorting problem, , and , so and Fredman's bound implies that only comparisons are needed. However, the time needed to decide which comparisons to perform may be significantly higher than the bound on the number of comparisons.[4] If only comparisons between elements of are allowed, then there is also a matching lower bound of on the number of comparisons needed.[4][5]

The first actual algorithm that achieves both comparisons and total complexity was published sixteen years later. The algorithm first recursively sorts the two sets and , and uses the equivalence to infer the sorted orderings of and without additional comparisons. Then, it merges the two sets and and uses the merged order and the equivalence to infer the sorted order of without additional comparisons. The part of the algorithm that recursively sorts (or equivalently ) does so by splitting into two equal sublists and , recursively sorting and , inferring the ordering on as above, and merging the sorted results , , and together.[6]

Non-comparison-based algorithms

On a RAM machine with word size w and integer inputs 0 ≤ {x, y} < n = 2w, the problem can be solved in O(n log n) operations by means of the fast Fourier transform.[1]

Applications

Steven Skiena recounts a practical application in transit fare minimisation, an instance of the shortest path problem: given fares x and y for trips from departure A to some intermediate destination B and from B to final destination C, with only certain pairs of fares allowed to be combined, find the cheapest combined trip from A to C. Skiena's solution consists of sorting the pairs of fares as an instance of the sorting problem, and then testing the resulting pairs in this sorted order until finding one that is allowed. For this problem, one can use a priority queue of pairs, initialized to contain a single pair with the cheapest overall pair of fares. Then, when a pair is found to be disallowed, two more pairs formed by combining and with their successors in their respective sorted lists of single-hop fares can be added (if not already present) to the priority queue. In this way, each successive pair can be found in logarithmic time, and only the pairs up to the first allowable one need to be sorted.[3]

Several other problems in computational geometry have equivalent or harder complexity to sorting, including constructing Minkowski sums of staircase polygons, finding the crossing points of an arrangement of lines in sorted order by their -coordinates, listing pairs of points in sorted order by their distances, and testing whether one rectilinear polygon can be translated to fit within another.[7]

See also

References

  1. ^ a b c d Harper, L. H.; Payne, T. H.; Savage, J. E.; Straus, E. (1975). "Sorting X + Y". Communications of the ACM. 18 (6): 347–349. doi:10.1145/360825.360869.
  2. ^ a b Demaine, Erik; Erickson, Jeff; O'Rourke, Joseph (20 August 2006). "Problem 41: Sorting X + Y (Pairwise Sums)". The Open Problems Project. Retrieved 23 September 2014.
  3. ^ a b Skiena, Steven (2008). "4.4 War Story: Give me a Ticket on an Airplane". The Algorithm Design Manual (2nd ed.). Springer. pp. 118–120. doi:10.1007/978-1-84800-070-4_4.
  4. ^ a b c Fredman, Michael L. (1976). "How good is the information theory bound in sorting?". Theoretical Computer Science. 1 (4): 355–361. doi:10.1016/0304-3975(76)90078-5.
  5. ^ Dietzfelbinger, Martin (1989). "Lower bounds for sorting of sums". Theoretical Computer Science. 66 (2): 137–155. doi:10.1016/0304-3975(89)90132-1. MR 1019082.
  6. ^ Lambert, Jean-Luc (1992). "Sorting the sums (xi + yj) in O(n2) comparisons". Theoretical Computer Science. 103 (1): 137–141. doi:10.1016/0304-3975(92)90089-X.
  7. ^ Hernández Barrera, Antonio (1996). "Finding an o(n2 log n) algorithm is sometimes hard" (PDF). Proceedings of the 8th Canadian Conference on Computational Geometry (CCCG'96). pp. 289–294.