Jump to content

Timsort: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
image uploader forgot to remove twinkle-title-warning after adding licence to the image
timsort uses less than log(n!) comparisons, not log(n) comparisons - you can't even look at every observation in log(n) time
Line 26: Line 26:
Like merge sort, it is a [[stable sort|stable]] [[comparison sort]] with a [[Best, worst and average case|worst-case time complexity]] of <math>\Theta(n \log n)</math> <ref>http://mail.python.org/pipermail/python-dev/2002-July/026837.html</ref>.
Like merge sort, it is a [[stable sort|stable]] [[comparison sort]] with a [[Best, worst and average case|worst-case time complexity]] of <math>\Theta(n \log n)</math> <ref>http://mail.python.org/pipermail/python-dev/2002-July/026837.html</ref>.


According to [[information theory]], no [[comparison sort]] can perform fewer than <math>\log_2(n!)</math> comparisons in the average case, which requires that comparison sorting takes <math>\Omega(n \log n)</math> time on random data. However, on real-world data, Timsort often requires far fewer than <math>\log_2(n)</math> comparisons, because it exploits the fact that sublists of the data may already be in order or reverse order.<ref>Alex Martelli, ''Python in a Nutshell'', [http://books.google.com/books?id=JnR9hQA3SncC&pg=PA57&lpg=PA57&dq=timsort+non-recursive&source=bl&ots=Ja6REu123x&sig=PHJ77HYJPwb0I2RmTnMqmdmL7Fc&hl=en&ei=BWCCSo2aBIKBtwet5eHKCg&sa=X&oi=book_result&ct=result&resnum=7#v=onepage&q=timsort%20non-recursive&f=false p. 57].</ref>
According to [[information theory]], no [[comparison sort]] can perform fewer than <math>\log_2(n!)</math> comparisons in the average case, which requires that comparison sorting takes <math>\Omega(n \log n)</math> time on random data. However, on real-world data, Timsort often requires far fewer than <math>\log_2(n!)</math> comparisons, because it exploits the fact that sublists of the data may already be in order or reverse order.<ref>Alex Martelli, ''Python in a Nutshell'', [http://books.google.com/books?id=JnR9hQA3SncC&pg=PA57&lpg=PA57&dq=timsort+non-recursive&source=bl&ots=Ja6REu123x&sig=PHJ77HYJPwb0I2RmTnMqmdmL7Fc&hl=en&ei=BWCCSo2aBIKBtwet5eHKCg&sa=X&oi=book_result&ct=result&resnum=7#v=onepage&q=timsort%20non-recursive&f=false p. 57].</ref>


== See also ==
== See also ==

Revision as of 17:27, 12 November 2009

Timsort
Step-by-step visualisation of timsort
Timsort in action on a list of numbers. For short lists of numbers like this one, Timsort is equivalent to Insertion Sort
ClassSorting algorithm
Data structureArray
Worst-case performance
Best-case performance
Average performance
Worst-case space complexity
OptimalSometimes

Timsort is a hybrid sorting algorithm derived from merge sort and insertion sort, designed to perform well on many kinds of real-world data. It was invented by Tim Peters in 2002, for use in versions 2.3 and later of the Python programming language, and is also used in Java as well [1].

Tim Peters describes the algorithm as follows [2]:

[...] an adaptive, stable, natural mergesort, modestly called

timsort (hey, I earned it <wink>). It has supernatural performance on many kinds of partially ordered arrays (less than lg(N!) comparisons needed, and as few as N-1), yet as fast as Python's previous highly tuned samplesort hybrid on random arrays.

In a nutshell, the main routine marches over the array once, left to right, alternately identifying the next run, then merging it into the previous runs "intelligently". Everything else is complication for speed, and some

hard-won measure of memory efficiency.

Like merge sort, it is a stable comparison sort with a worst-case time complexity of [3].

According to information theory, no comparison sort can perform fewer than comparisons in the average case, which requires that comparison sorting takes time on random data. However, on real-world data, Timsort often requires far fewer than comparisons, because it exploits the fact that sublists of the data may already be in order or reverse order.[4]

See also



References