Bubble sort

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Djh2400 (talk | contribs) at 01:19, 17 October 2008 (→‎Pseudocode implementation: (spacing conformity)). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Bubble sort
Class: Sorting algorithm
Data structure: Array
Worst-case performance: O(n²)
Worst-case space complexity: O(n) total, O(1) auxiliary
Optimal: No

Bubble sort is a simple sorting algorithm. It works by repeatedly stepping through the list to be sorted, comparing each pair of adjacent items and swapping them if they are in the wrong order. The pass through the list is repeated until no swaps are needed, which indicates that the list is sorted. The algorithm gets its name from the way smaller elements "bubble" to the top of the list. Because it only uses comparisons to operate on elements, it is a comparison sort.

Analysis

Performance

Bubble sort has worst-case complexity O(n²), where n is the number of items being sorted. Many other sorting algorithms have a substantially better worst-case complexity of O(n log n), so bubble sort should not be used when n is large.

Rabbits and turtles

The positions of the elements in bubble sort will play a large part in determining its performance. Large elements at the beginning of the list do not pose a problem, as they are quickly swapped. Small elements towards the end, however, move to the beginning extremely slowly. This has led to these types of elements being named rabbits and turtles, respectively.
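
The effect can be illustrated with a short Python sketch that counts how many passes the basic early-exit bubble sort needs. The function name `count_passes` and the two sample lists are hypothetical, chosen only for this illustration.

```python
def count_passes(items):
    """Return how many passes an early-exit bubble sort makes over a copy of items."""
    a = list(items)
    passes = 0
    swapped = True
    while swapped:
        swapped = False
        passes += 1
        for i in range(len(a) - 1):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
    return passes


rabbit = [5, 1, 2, 3, 4]  # one large element at the beginning
turtle = [2, 3, 4, 5, 1]  # one small element at the end

print(count_passes(rabbit))  # 2: one pass to move the 5, one to confirm
print(count_passes(turtle))  # 5: the 1 moves only one position per pass
```

Both lists have a single element out of place, yet the turtle costs more than twice as many passes as the rabbit.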

Various efforts have been made to eliminate turtles and so improve the speed of bubble sort. Cocktail sort, which alternates the direction of its passes, moves turtles faster, but it still retains O(n²) worst-case complexity. Comb sort compares elements that are large gaps apart and can move turtles extremely quickly, before proceeding to smaller and smaller gaps to smooth out the list. Its average speed is comparable to that of faster algorithms such as quicksort.
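
The gap idea behind comb sort can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `comb_sort` is hypothetical, and the shrink factor of 1.3 is an assumption (a commonly quoted value, not one given in this article).

```python
def comb_sort(items, shrink=1.3):
    """Return a sorted copy of items using gap-based comparisons.

    The shrink factor of 1.3 is an assumed, commonly quoted value.
    """
    a = list(items)
    gap = len(a)
    swapped = True
    # Keep going until the gap has shrunk to 1 and a full pass makes no swap.
    while gap > 1 or swapped:
        gap = max(1, int(gap / shrink))
        swapped = False
        for i in range(len(a) - gap):
            if a[i] > a[i + gap]:
                a[i], a[i + gap] = a[i + gap], a[i]
                swapped = True
    return a


print(comb_sort([2, 3, 4, 5, 1]))  # [1, 2, 3, 4, 5]
```

With a gap of 1 the inner loop is exactly a bubble sort pass, so the final phase guarantees a sorted result; the earlier, larger gaps only serve to move turtles most of the way in a few steps.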

Step-by-step example

Let us take the array of numbers "5 1 4 2 8" and sort it from lowest number to greatest number using the bubble sort algorithm. In each step, the two elements being compared are shown in square brackets.

First Pass:
( [5 1] 4 2 8 ) → ( [1 5] 4 2 8 ) Here, the algorithm compares the first two elements and swaps them, since 5 > 1.
( 1 [5 4] 2 8 ) → ( 1 [4 5] 2 8 ) Swap, since 5 > 4.
( 1 4 [5 2] 8 ) → ( 1 4 [2 5] 8 ) Swap, since 5 > 2.
( 1 4 2 [5 8] ) → ( 1 4 2 [5 8] ) Now, since these elements are already in order (8 > 5), the algorithm does not swap them.
Second Pass:
( [1 4] 2 5 8 ) → ( [1 4] 2 5 8 )
( 1 [4 2] 5 8 ) → ( 1 [2 4] 5 8 ) Swap, since 4 > 2.
( 1 2 [4 5] 8 ) → ( 1 2 [4 5] 8 )
( 1 2 4 [5 8] ) → ( 1 2 4 [5 8] )
Now, the array is already sorted, but the algorithm does not know that it is finished. It needs one whole pass without any swap to know the array is sorted.
Third Pass:
( [1 2] 4 5 8 ) → ( [1 2] 4 5 8 )
( 1 [2 4] 5 8 ) → ( 1 [2 4] 5 8 )
( 1 2 [4 5] 8 ) → ( 1 2 [4 5] 8 )
( 1 2 4 [5 8] ) → ( 1 2 4 [5 8] )
Finally, the array is sorted, and the algorithm can terminate.
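
The passes above can be reproduced with a short Python sketch that records the state of the array after every comparison. The function name `bubble_sort_trace` is hypothetical and exists only for this illustration.

```python
def bubble_sort_trace(items):
    """Return one list per pass, holding the array state after each comparison."""
    a = list(items)
    passes = []
    swapped = True
    while swapped:
        swapped = False
        states = []
        for i in range(len(a) - 1):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
            states.append(tuple(a))
        passes.append(states)
    return passes


for number, states in enumerate(bubble_sort_trace([5, 1, 4, 2, 8]), start=1):
    print(f"Pass {number}:")
    for state in states:
        print("  ", state)
```

For the input "5 1 4 2 8" this prints three passes of four states each, matching the right-hand columns of the example.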

Pseudocode implementation

A simple way to express bubble sort in pseudocode is as follows:

procedure bubbleSort( A : list of sortable items ) defined as:
  do
    swapped := false
    for each i in 0 to length( A ) - 2 do:
      if A[ i ] > A[ i + 1 ] then
        swap( A[ i ], A[ i + 1 ] )
        swapped := true
      end if
    end for
  while swapped
end procedure
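
As a rough guide, the procedure can be written in Python along the following lines. This is a minimal sketch that assumes a zero-indexed list sorted in place; the inner loop stops at the second-to-last index so that `a[i + 1]` always refers to an existing element.

```python
def bubble_sort(a):
    """Sort the list a in place, repeating passes until one makes no swap."""
    swapped = True
    while swapped:
        swapped = False
        # i runs from 0 to len(a) - 2, so a[i + 1] stays in bounds.
        for i in range(len(a) - 1):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
    return a


print(bubble_sort([5, 1, 4, 2, 8]))  # [1, 2, 4, 5, 8]
```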

The algorithm can also be expressed as:

procedure bubbleSort( A : list of sortable items ) defined as:
  for each i in 0 to length( A ) - 2 do:
    for each j in length( A ) - 1 downto i + 1 do:
      if A[ j - 1 ] > A[ j ] then
        swap( A[ j - 1 ], A[ j ] )
      end if
    end for
  end for
end procedure


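Unlike the first version, this one has no early exit: it always performs every pass, but each pass covers a shorter range, because the smallest remaining element has already been bubbled to the front. Shrinking the range in this way is discussed under Alternative implementations below.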
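
A possible Python rendering of this second form is sketched below, assuming a zero-indexed list. The function name `bubble_sort_fixed_passes` is hypothetical. Each outer iteration sweeps from the end of the list toward the front and leaves the smallest remaining element at index `i`.

```python
def bubble_sort_fixed_passes(a):
    """Sort a in place with a fixed number of passes and no early exit."""
    n = len(a)
    for i in range(n - 1):
        # j runs from n - 1 down to i + 1, comparing a[j - 1] with a[j].
        for j in range(n - 1, i, -1):
            if a[j - 1] > a[j]:
                a[j - 1], a[j] = a[j], a[j - 1]
    return a


print(bubble_sort_fixed_passes([5, 1, 4, 2, 8]))  # [1, 2, 4, 5, 8]
```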

Alternative implementations

One way to optimize bubble sort is to note that each pass moves the largest remaining element to the end of the list. Given a list of size n, the nth element is therefore guaranteed to be in its final place after the first pass, so it suffices to sort the remaining n - 1 elements. After the second pass, the (n - 1)th element is likewise in its final place, and so on.

In pseudocode, this changes the procedure as follows:

procedure bubbleSort( A : list of sortable items ) defined as:
  n := length( A )
  do
    swapped := false
    n := n - 1
    for each i in 0 to n - 1 do:
      if A[ i ] > A[ i + 1 ] then
        swap( A[ i ], A[ i + 1 ] )
        swapped := true
      end if
    end for
  while swapped
end procedure

We can then do bubbling passes over increasingly smaller parts of the list. More precisely, instead of making n - 1 comparisons on every pass (about n² in total in the worst case), the procedure makes at most (n - 1) + (n - 2) + ... + 1 comparisons. This sums to n(n - 1) / 2, which is still O(n²) but roughly halves the number of comparisons in the worst case. The number of swaps is unchanged.
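
The comparison count can be checked with a small Python sketch of the shrinking-range procedure. The function name `bubble_sort_shrinking` is hypothetical, and returning the number of comparisons is an addition made purely for illustration.

```python
def bubble_sort_shrinking(a):
    """Sort a in place over a shrinking range; return the number of comparisons."""
    comparisons = 0
    n = len(a)
    swapped = True
    while swapped:
        swapped = False
        n -= 1  # the last element of the previous range is already in place
        for i in range(n):
            comparisons += 1
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
    return comparisons


data = [6, 5, 4, 3, 2, 1]  # reversed input is the worst case
print(bubble_sort_shrinking(data))  # 15, which equals 6 * 5 / 2
print(data)  # [1, 2, 3, 4, 5, 6]
```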

In practice

Although bubble sort is one of the simplest sorting algorithms to understand and implement, its O(n²) complexity means it is far too inefficient for use on lists having more than a few elements. Even among simple O(n²) sorting algorithms, algorithms like insertion sort are usually considerably more efficient, unless the data is already in nearly sorted order.

Due to its simplicity, bubble sort is often used to introduce the concept of an algorithm, or a sorting algorithm, to introductory computer science students. However, some researchers such as Owen Astrachan have gone to great lengths to disparage bubble sort and its continued popularity in computer science education, recommending that it no longer even be taught.[1]

The Jargon File, which famously calls bogosort "the archetypical perversely awful algorithm", also calls bubble sort "the generic bad algorithm".[2] Donald Knuth, in his famous The Art of Computer Programming, concluded that "the bubble sort seems to have nothing to recommend it, except a catchy name and the fact that it leads to some interesting theoretical problems", some of which he discusses therein.

Bubble sort is asymptotically equivalent in running time to insertion sort in the worst case, but the two algorithms differ greatly in the number of swaps necessary. Experimental results such as those of Astrachan have also shown that insertion sort performs considerably better even on random lists. For these reasons many modern algorithm textbooks avoid using the bubble sort algorithm in favor of insertion sort.

Bubble sort also interacts poorly with modern CPU hardware. It requires at least twice as many writes as insertion sort, twice as many cache misses, and asymptotically more branch mispredictions. Experiments by Astrachan sorting strings in Java show bubble sort to be roughly 5 times slower than insertion sort and 40% slower than selection sort[citation needed].
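
The difference in writes can be explored with a rough Python sketch that counts array assignments in both algorithms. The function names are hypothetical, and the counting convention is an assumption: a swap is counted as two writes, and insertion sort is counted as one write per shifted element plus one write to place the moved item. Other conventions, such as counting a temporary variable in the swap, give different figures.

```python
def bubble_sort_writes(items):
    """Count array writes made by bubble sort (a swap counts as two writes)."""
    a = list(items)
    writes = 0
    n = len(a)
    swapped = True
    while swapped:
        swapped = False
        n -= 1
        for i in range(n):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                writes += 2
                swapped = True
    return writes


def insertion_sort_writes(items):
    """Count array writes made by a shifting insertion sort."""
    a = list(items)
    writes = 0
    for k in range(1, len(a)):
        key = a[k]
        j = k
        while j > 0 and a[j - 1] > key:
            a[j] = a[j - 1]  # shift one element to the right
            writes += 1
            j -= 1
        if j != k:
            a[j] = key  # place the moved item
            writes += 1
    return writes


reversed_list = list(range(100, 0, -1))
print(bubble_sort_writes(reversed_list))     # 9900
print(insertion_sort_writes(reversed_list))  # 5049
```

Under this convention the ratio on a reversed list of 100 items is close to two. The sketch measures writes only and says nothing about cache misses or branch mispredictions.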

Variations

  • Odd-even sort is a parallel version of bubble sort, for message passing systems.
  • In some cases, the sort works from right to left (the opposite direction), which is more appropriate for partially sorted lists, or lists with unsorted items added to the end.
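
The odd-even variation can be sketched sequentially in Python as shown below. The function name `odd_even_sort` is hypothetical, and this single-threaded loop only simulates the two phases; in a message passing system the comparisons within one phase touch disjoint pairs and could run at the same time.

```python
def odd_even_sort(items):
    """Return a sorted copy of items using alternating odd and even phases."""
    a = list(items)
    is_sorted = False
    while not is_sorted:
        is_sorted = True
        # Phase starting at index 1, then phase starting at index 0.
        for start in (1, 0):
            for i in range(start, len(a) - 1, 2):
                if a[i] > a[i + 1]:
                    a[i], a[i + 1] = a[i + 1], a[i]
                    is_sorted = False
    return a


print(odd_even_sort([5, 1, 4, 2, 8]))  # [1, 2, 4, 5, 8]
```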

References