# Polyphase merge sort

A polyphase merge sort is an algorithm which decreases the number of runs at every iteration of the main loop by merging runs into larger runs. It is used for external sorting.

## Ordinary merge sort

Typically, a merge sort splits items into sorted runs and then recursively merges each run into larger runs. When there's only one run left, that is the sorted result.

An ordinary merge sort could use four working files organized as a pair of input files and a pair of output files. At each iteration, two input files are read. The odd numbered runs of the two input files are merged to the first output file, and the even numbered runs are merged to the second output file. When the input is exhausted, the new output files are used as the input for the next iteration. The number of runs decreases by a factor of 2 at each iteration. At each iteration, the same level/phase of merge occurs—a file is either completely read or completely written during an iteration.

If the four files were on four separate tape drives, watching an ordinary merge sort would show some interesting details. On the first iteration, only one input drive is used—the other input file is empty. On subsequent iterations, each input drive runs at half speed,[1] while one output drive runs at full speed and the second output drive stands idle waiting for the next run. The situation is even worse when six tape drives are used—at least two will stand idle. Someone watching the tapes spin would wonder if the idle drives could be more useful.

The polyphase merge found a way to use the idle drives. It can sort using just three sequential files rather than the four required by merge sort.

## Polyphase merge

The polyphase merge changes the game. There might be $N$ files, but the polyphase merge will read from $N-1$ files and write only one output file at a time. The writing to that output file continues until an input file is exhausted, and then that input file becomes the new output file. The number of runs in each file is related to Fibonacci numbers and Fibonacci numbers of higher order.[2][3]

## Perfect 3 file polyphase merge sort

It is easiest to look at the polyphase merge starting from its ending conditions and working backwards. At the start of each iteration, there will be two input files and one output file. At the end of the iteration, one input file will have been completely consumed and will become the output file for the next iteration. The current output file will become an input file for the next iteration. The remaining files (just one in the 3 file case) have only been partially consumed and their remaining runs will be input for the next iteration.

File 1 just emptied and became the new output file. One run is left on each input tape, and merging those runs together will make the sorted file.

File 1 (out):                                           <1 run> *        (the sorted file)
File 2 (in ): ... | <1 run> *               -->     ... <1 run> | *          (consumed)
File 3 (in ):     | <1 run> *                           <1 run> | *          (consumed)

|    marks the read pointer of the file
*    marks end of file

Stepping back to the previous iteration, we were reading from 1 and 2. One run is merged from 1 and 2 before file 1 goes empty. Notice that file 2 is not completely consumed—it has one run left to match the final merge (above).

File 1 (in ): ... | <1 run> *                      ... <1 run> | *
File 2 (in ):     | <2 run> *           -->            <1 run> | <1 run> *
File 3 (out):                                          <1 run> *

Stepping back another iteration, 2 runs are merged from 1 and 3 before file 3 goes empty.

File 1 (in ):     | <3 run>                        ... <2 run> | <1 run> *
File 2 (out):                               -->        <2 run> *
File 3 (in ): ... | <2 run> *                          <2 run> | *

Stepping back another iteration, 3 runs are merged from 2 and 3 before file 2 goes empty.

File 1 (out):                                          <3 run> *
File 2 (in ): ... | <3 run> *               -->    ... <3 run> | *
File 3 (in ):     | <5 run> *                          <3 run> | <2 run> *

Stepping back another iteration, 5 runs are merged from 1 and 2 before file 1 goes empty.

File 1 (in ): ... | <5 run> *                      ... <5 run> | *
File 2 (in ):     | <8 run> *               -->        <5 run> | <3 run> *
File 3 (out):                                          <5 run> *

Looking at the number of runs merged working backwards: 1, 1, 2, 3, 5, ... reveals a Fibonacci sequence.

For everything to work out right, the initial file to be sorted must be distributed to the proper input files and each input file must have the correct number of runs on it. In the example, that would mean an input file with 13 runs would write 5 runs to file 1 and 8 runs to file 2.

In practice, the input file won't happen to have a Fibonacci number of runs it (and the number of runs won't be known until after the file has been read). The fix is to pad the input files with dummy runs to make the required Fibonacci sequence.

For comparison, the ordinary merge sort will combine 16 runs in 4 passes using 4 files. The polyphase merge will combine 13 runs in 5 passes using only 3 files. Alternatively, a polyphase merge will combine 17 runs in 4 passes using 4 files. (Sequence: 1, 1, 1, 3, 5, 9, 17, 31, 57, 105, 193, 355, 653, 1201, ...)

An iteration (or pass) in ordinary merge sort involves reading and writing the entire file. An iteration in a polyphase sort does not read or write the entire file,[4] so a typical polyphase iteration will take less time than a merge sort iteration. Additionally, on tapes that can be read backward (even if they can only be written forward) there will be no intermediate rewinds: after the distribution phase (where the input tape contents are distributed among the other tapes) all tapes are read only backward. This means "straight runs" and "reversed runs" have to be set up correctly so that the last run on each tape is a reversed run which, read backward, produces one sorted forward run on the final output tape.

## References

1. ^ The two input drives are throttled by the output drive's speed. They cannot provide data faster than the output drive can write it.
2. ^ Donald Knuth, The Art of Computer Programming, Volume 3, Addison Wesley, 1973, Algorithm 5.4.2D.
3. ^ http://oopweb.com/Algorithms/Documents/Sman/Volume/ExternalSorting.html
4. ^ The first and last iterations do read and write the entire file.
• Bradley, James (1982), File and Data Base Techniques, Holt, Rinehart and Winston, ISBN 0-03-058673-9
• Sedgewick, Robert (1983), Algorithms, Addison-Wesley, pp. 163–165, ISBN 0-201-06672-6