Jump to content

Tree sort

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 71.141.148.143 (talk) at 22:08, 31 December 2008. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Tree sort
ClassSorting algorithm
Data structureArray
Worst-case performance
Optimal?

A tree sort is a sort algorithm that builds a binary search tree from the keys to be sorted, and then traverses the tree so that the keys come out in sorted order. Adding items to a binary search tree is an O(log(n)) process, so, therefore, adding n items becomes an O(nLog(n)) process, making Tree Sort a so-called, 'fast sort'. Its ideal use is when sorting the elements of a stream from a file. Other sorts would be a slower O(nlogn) because of additional time to load the elements to a temporary data structure, whereas in a tree sort the act of loading the input into a data structure is sorting it.

Efficiency

Ironically, the worst case scenario for efficiency is handing a Tree Sort algorithm a pre-sorted set. This would make the time needed to insert all elements into the binary tree O(n2) rather than O(nLog(n)). The dominant process in the Tree Sort algorithm is the "insertion" into the binary tree, assuming that the time needed for retrieval is less than or equal to O(n).

If a self-balancing binary search tree algorithm is used, the worst case is O(n log n) because insertion in a balanced binary search tree is always O(log n).

Example

In a simple functional programming form, the algorithm (in Haskell) would look something like this:

data Tree a = Leaf | Node (Tree a) a (Tree a)

insert :: Ord a => a -> Tree a -> Tree a
insert x Leaf = Node Leaf x Leaf
insert x (Node t y t') | x <= y = Node (insert x t) y t'
insert x (Node t y t') | x > y = Node t y (insert x t')

flatten :: Tree a -> [a]
flatten Leaf = []
flatten (Node t x t') = flatten t ++ [x] ++ flatten t'

treesort :: Ord a => [a] -> [a]
treesort = flatten . foldr insert Leaf

Mind that in the above example, not just the insertion algorithm, but also the retrieval algorithm, have O(n2) worst case scenarios. Finding variants using balanced trees and a linear retrieval algorithm are left as an exercise to the reader.

An example using a self-balancing binary search tree in C++, with O(n log n) worst-case:

#include <set>       // for multiset
#include <algorithm> // for copy
#include <iterator>  // for iterator_traits
 
template <typename Iterator>
void binary_tree_sort(Iterator begin, Iterator end)
{
    // C++'s multiset class is a self-balancing binary search tree that allows duplicates
    // Add each element in input range to the tree
    std::multiset<typename std::iterator_traits<Iterator>::value_type> tree(begin, end);
 
    // Read elements in ascending order by simply traversing the tree.
    std::copy(tree.begin(), tree.end(), begin);
}