|This article does not cite any references or sources. (December 2009)|
In Unix and Unix-like operating systems, a filter is a program that gets most of its data from its standard input (the main input stream) and writes its main results to its standard output (the main output stream). Unix filters are often used as elements of pipelines. The pipe operator ("|") on a command line signifies that the main output of the command to the left is passed as main input to the command on the right.
The classic filter would be grep, which at it simplest prints to its output any lines containing a character string. The following is an example:
cut -d : -f 1 /etc/passwd | grep foo
This finds all registered users that have "foo" as part of their username by using the cut command to take the first field (username) of each line of the Unix system password file and passing them all as input to grep, which searches its input for lines containing the character string "foo" and prints them on its output.
Common Unix filter programs are: cat, cut, grep, head, sort, uniq, and tail. Programs like awk and sed can be used to build quite complex filters because they are fully programmable. Unix filters can be also used by Data Scientists to get a quick overview about a file based dataset.
List of Unix filter programs
- Data Analysis with the Unix Shell - Bernd Zuther, comSysto GmbH, 2013