User:Earlster/Sandbox

Earlster/Sandbox
	Example of insertion sort sorting a list of random numbers. Example of insertion sort sorting a list of random numbers.
Class	Sorting algorithm
Data structure	Array
Worst-case performance
Best-case performance
Average performance
Worst-case space complexity	total,

Unlike bucket sorting, the elements are sorted as they are input

Proxmap Sorting, or Proxmap, is a sorting algorithm that works by partitioning an array of data items, or keys, into a number of buckets. If keys are "well distributed" among the buckets, sorting occurs in $O(n)$ time, much faster than comparison-based sorting that can do no better than $O(nlogn)$ . During generation a proxmap is created, which is used to find keys in an average of $O(1.5)$ time. This algorithm also scales up.

Keys are dropped into each bucket using insertion sort. Since bucket sort is not a comparison sort, the Ω(n log n) lower bound is inapplicable. The computational complexity estimates involve the number of buckets and the key used.

Overview

Proxmap works as follows:

Initialize: Create and initialize 4 arrays
Process: Using a carefully chosen key function, come with a series of "buckets" using the arrays
Drop: Go over the original array, dropping each object into its bucket, insertion sorting as needed.
Collect: Visit the buckets in order and put all elements back into the original array.

Example

Consider a full array: A[0 to n-1] with n keys. Let i be an index of A. Sort A's keys into array A2 of equal size.

Array table
1	1	2	3
2	2	4	6
3	3	6	9
4	4	8	12
5	5	10	15

Pseudocode

function proxmap-sort(array, mapKey) is
  hitCount ← new array of size n
  location ← new array of size n
  proxmap ← new array of size n
  sortedArray ← sorted array of size n
  for i = 0 to (length(array)-1) do
    insert array[i] into buckets[keyFunction(array[i], k)]
  for i = 0 to n - 1 do
    next-sort(buckets[i])
  return the concatenation of buckets[0], ..., buckets[n-1]

Here array is the array to be sorted and n is the number of buckets to use. The key function such as keyFunction(x,k) returns the k most significant bits of x (floor(x/2^(size(x)-k))); different functions can be used to translate the range of elements in array to n buckets, such as translating the letters A–Z to 0–25 or returning the first character (0–255) for sorting strings. Another method could simply take the floor of the incoming data divided by 10, or floor(k/10) Buckets get sorted as the data comes in, not all at the end.

Optimizations

Many optimizations include using the same array to store the sorted data, reusing the hitCount, proxMap, and location arrays used to sort the data. To be expanded on later...

Generic bucket sort related to Proxmap

The most common variant of bucket sort operates on a list of n numeric inputs between zero and some maximum value M and divides the value range into n buckets each of size M/n. If each bucket is sorted using insertion sort, the sort can be shown to run in expected linear time (where the average is taken over all possible inputs).^[1] However, the performance of this sort degrades with clustering; if many values occur close together, they will all fall into a single bucket and be sorted slowly. This holds a similar story with Proxmap, if the buckets are too large or too small then the performance of the function degrades severely.

Comparison with other sorting algorithms

Generally, it works relatively fast when compared with other sorting algorithms as its not comparison-based. The function is nearly constant access time which makes it very appealing for large databases. Despite the O(n) build time, it makes up for it with its O(1.5) average access time. If the array doesn't need to be updated often, the access time makes this function more favorable.

References

^ Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0-262-03293-7. Section 8.4: Bucket sort, pp.174–177.

Thomas A. Standish "Using O(n) ProxmapSort and O(1) ProxmapSearch to Motivate CS2 Students (Part I)" from Donald Bren School of Information and Computer Sciences at UC Irvine.
Norman Jacobson "A Synopsis of ProxmapSort & ProxmapSearch" from Donald Bren School of Information and Computer Sciences at UC Irvine.

External links

http://www.cs.uah.edu/~rcoleman/CS221/Sorting/ProxMapSort.html

[1] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0-262-03293-7. Section 8.4: Bucket sort, pp.174–177.

[1]

v t e Sorting algorithms
Theory	Computational complexity theory Big O notation Total order Lists Inplacement Stability Comparison sort Adaptive sort Sorting network Integer sorting X + Y sorting Transdichotomous model Quantum sort
Exchange sorts	Bubble sort Cocktail shaker sort Odd–even sort Comb sort Gnome sort Proportion extend sort Quicksort
Selection sorts	Selection sort Heapsort Smoothsort Cartesian tree sort Tournament sort Cycle sort Weak-heap sort
Insertion sorts	Insertion sort Shellsort Splaysort Tree sort Library sort Patience sorting
Merge sorts	Merge sort Cascade merge sort Oscillating merge sort Polyphase merge sort
Distribution sorts	American flag sort Bead sort Bucket sort Burstsort Counting sort Interpolation sort Pigeonhole sort Proxmap sort Radix sort Flashsort
Concurrent sorts	Bitonic sorter Batcher odd–even mergesort Pairwise sorting network Samplesort
Hybrid sorts	Block merge sort Kirkpatrick–Reisch sort Timsort Introsort Spreadsort Merge-insertion sort
Other	Topological sorting Pre-topological order Pancake sorting Spaghetti sort
Impractical sorts	Stooge sort Slowsort Bogosort