sort (Unix)

In Unix-like operating systems, sort is a standard command-line program that prints the lines of its input, or of the concatenation of all files listed in its argument list, in sorted order. Sorting is based on one or more sort keys extracted from each line of input. By default, the entire line is taken as the sort key, and blank space is the default field separator.

The -r flag reverses the sort order.

Examples

Sort a file in alphabetical order

$ cat phonebook
Smith, Brett     555-4321
Doe, John        555-1234
Doe, Jane        555-3214
Avery, Cory      555-4132
Fogarty, Suzie   555-2314

$ sort phonebook
Avery, Cory      555-4132
Doe, Jane        555-3214
Doe, John        555-1234
Fogarty, Suzie   555-2314
Smith, Brett     555-4321

Sort by number

The -n option makes the program sort according to numerical value:

 $ du /bin/* | sort -n
 4       /bin/domainname
 24      /bin/ls
 102     /bin/sh
 304     /bin/csh
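
The effect of -n is easiest to see next to the default ordering. The following is a minimal sketch using made-up sample numbers; LC_ALL=C is set only so that the default (lexical) result is the same on every system.

```shell
# Hypothetical sample data: three numbers, one per line.
# LC_ALL=C pins the collation so the lexical result is predictable.
lexical=$(printf '10\n9\n100\n' | LC_ALL=C sort)
numeric=$(printf '10\n9\n100\n' | LC_ALL=C sort -n)

# Without -n the lines are compared character by character,
# so "100" sorts before "9".
printf 'default order:\n%s\n' "$lexical"

# With -n the lines are compared by numerical value.
printf 'numeric order:\n%s\n' "$numeric"
```

The default order comes out as 10, 100, 9, while the numeric order is 9, 10, 100.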

Sort the current directory by file size

 $ ls -sk | sort -n
   96 Nov1.txt
  128 _arch_backup.lst
  128 _arch_backup.lst.tmp
 1708 NMON

Columns or fields

In old versions of sort, the +1 option made the program sort on the second column of data (+2 for the third, and so on). This usage is deprecated; the -k option should be used instead. Note that -k counts columns from 1, so "-k 2" selects the second column:

$ cat zipcode
Adam  12345
Bob   34567
Joe   56789
Sam   45678
Wendy 23456
   
$ sort -nk 2 zipcode
Adam  12345
Wendy 23456
Bob   34567
Sam   45678
Joe   56789

Sort on multiple fields

The -k m,n option defines a sort key that starts at column m and ends at column n, so a single key may span several fields. Several -k options can be given, in which case later keys break ties left by earlier ones:

$ cat quota
fred 2000
bob 1000
an 1000
chad 1000
don 1500
eric 5000

$ sort -k2,2n -k1,1 quota
an 1000
bob 1000
chad 1000
don 1500
fred 2000
eric 5000

Here the primary sort uses column 2: -k2,2n specifies a key that starts and ends at column 2, and the n suffix requests numeric ordering for that key. If -k2 were used instead, the sort key would begin at column 2 and extend to the end of the line, spanning all the fields in between. -k1,1 breaks ties using the value in column 1, sorted alphabetically by default. Note that an, bob and chad have the same quota and are therefore sorted alphabetically in the final output.
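
The difference between an open-ended key (-k2) and a bounded key (-k2,2) only shows up when there is a further column. The sketch below uses hypothetical three-column data that is not part of the quota file; LC_ALL=C is set only to make the comparison predictable.

```shell
# Hypothetical three-column data; both lines tie on column 2.
data=$(printf 'a 1 z\nb 1 y\n')

# -k2: the key runs from column 2 to the end of the line, so column 3
# decides the tie ("1 y" sorts before "1 z").
open_key=$(printf '%s\n' "$data" | LC_ALL=C sort -k2)

# -k2,2 -k1,1: the first key is column 2 only; column 1 breaks the tie.
bounded_key=$(printf '%s\n' "$data" | LC_ALL=C sort -k2,2 -k1,1)

printf 'sort -k2:\n%s\n' "$open_key"
printf 'sort -k2,2 -k1,1:\n%s\n' "$bounded_key"
```

With -k2 the line "b 1 y" comes first, whereas with -k2,2 -k1,1 the line "a 1 z" comes first.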

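Sorting a pipe-delimited file

The -t option sets the field separator character. In this example zipcode is assumed to use a vertical bar between its fields: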

$ sort -t'|' -k2 zipcode
Adam|12345
Wendy|23456
Bob|34567
Sam|45678
Joe|56789

Sorting a tab-delimited file

Sorting a file with tab-separated values requires a tab character to be specified as the column delimiter. This illustration uses the dollar-quote notation[1][2] of shells such as Bash and ksh93 to specify the tab as a C escape sequence.

 $ sort -k2,2 -t $'\t' phonebook 
 Doe, John	555-1234
 Fogarty, Suzie	555-2314
 Doe, Jane	555-3214
 Avery, Cory	555-4132
 Smith, Brett	555-4321
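
In a shell without dollar-quote notation, one possible workaround is to produce the tab with printf and pass it to -t through a variable. This is a sketch only: the file name phonebook.tsv and its two entries are made up for illustration, and mktemp -d is assumed to be available (it is widespread but not specified by POSIX).

```shell
# Hypothetical two-line, tab-separated file in a temporary directory.
tmpdir=$(mktemp -d)
printf 'Smith, Brett\t555-4321\nDoe, John\t555-1234\n' > "$tmpdir/phonebook.tsv"

# Command substitution strips only trailing newlines, so the tab survives.
tab=$(printf '\t')

# Sort on the second tab-separated field (the phone number).
sorted=$(sort -t "$tab" -k2,2 "$tmpdir/phonebook.tsv")
printf '%s\n' "$sorted"

rm -rf "$tmpdir"
```

Setting the separator explicitly matters here because the name field itself contains a space, which the default blank separator would treat as a field boundary.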

Sort in reverse

The -r option reverses the order of the sort:

$ sort -nrk 2 zipcode
Joe   56789
Sam   45678
Bob   34567
Wendy 23456
Adam  12345

Sort in random order

The GNU implementation has a -R/--random-sort option based on hashing. This is not a full random shuffle, because identical lines are always sorted together. A true random shuffle is provided by the shuf utility, which is also part of GNU Core Utilities.
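
The grouping behaviour can be observed with a small input that contains duplicates. This sketch assumes GNU Core Utilities is installed, since neither sort -R nor shuf is specified by POSIX; the four-line input is made up for illustration.

```shell
# Assumes GNU coreutils: sort -R and shuf are GNU extensions.
input=$(printf 'a\nb\na\nb\n')

# sort -R: identical lines hash to the same value, so they stay adjacent.
# The result is either a,a,b,b or b,b,a,a.
grouped=$(printf '%s\n' "$input" | sort -R)

# shuf: a random permutation, in which duplicates may be separated.
shuffled=$(printf '%s\n' "$input" | shuf)

printf 'sort -R:\n%s\n' "$grouped"
printf 'shuf:\n%s\n' "$shuffled"
```

The order printed by either command varies from run to run; only the adjacency of identical lines under sort -R is fixed.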

Sorting algorithm

The implementation in GNU Core Utilities, used on Linux, employs the merge sort algorithm.
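
The merging of already sorted runs, which is the core step of merge sort, is also available directly through the standard -m option. The sketch below illustrates only that merging step on two made-up files; it does not show how GNU sort works internally. The file names a and b are hypothetical, and mktemp -d is assumed to be available.

```shell
# Two hypothetical input files, each already sorted numerically.
tmpdir=$(mktemp -d)
printf '1\n4\n7\n' > "$tmpdir/a"
printf '2\n3\n9\n' > "$tmpdir/b"

# -m merges the presorted files into one sorted stream
# instead of sorting everything from scratch.
merged=$(sort -n -m "$tmpdir/a" "$tmpdir/b")
printf '%s\n' "$merged"

rm -rf "$tmpdir"
```

The merged output is 1, 2, 3, 4, 7, 9, one number per line.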

References

  1. "The GNU Bash Reference Manual, for Bash, Version 4.2: Section 3.1.2.4 ANSI-C Quoting". Free Software Foundation, Inc. 28 December 2010. Retrieved 1 February 2013. "Words of the form $'string' are treated specially. The word expands to string, with backslash-escaped characters replaced as specified by the ANSI C standard."
  2. "KornShell FAQ". Retrieved 1 February 2013. "The $'...' string literal syntax was added to ksh93 to solve the problem of entering special characters in scripts. It uses ANSI-C rules to translate the string between the '...'."