In computing, process substitution is a form of inter-process communication that allows the input or output of a command to appear as a file. The command is substituted in-line, where a file name would normally occur, by the command shell. This allows programs that normally only accept files to directly read from or write to another program.
The following examples use Bash syntax.
$ diff <(sort file1) <(sort file2)
The <(command) expression tells the command interpreter to run command and make its output appear as a file. The command can be any arbitrarily complex shell command.
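The example above can be tried with a small self-contained sketch, assuming Bash and the standard sort and diff utilities. The scratch directory and the file contents are invented for illustration only.

```shell
#!/usr/bin/env bash
# Sketch: compare two unsorted files without creating sorted copies.
# The scratch directory and file contents are illustrative assumptions.
workdir=$(mktemp -d)
printf '%s\n' pear apple fig > "$workdir/file1"
printf '%s\n' fig pear apple > "$workdir/file2"

# Each <(...) expands to a file name from which diff reads the sorted lines.
if diff <(sort "$workdir/file1") <(sort "$workdir/file2") > /dev/null; then
    same_lines=yes
else
    same_lines=no
fi
echo "same lines: $same_lines"

rm -r "$workdir"
```

Because both files hold the same lines in a different order, diff finds no difference once each input has been sorted.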
Without process substitution, the alternatives are:
1. Save the output of the command(s) to a temporary file, then read the temporary file(s).
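$ sort file2 > /tmp/file2.sorted
$ sort file1 | diff - /tmp/file2.sorted
$ rm /tmp/file2.sorted
2. Create a named pipe (also known as a FIFO), start one command writing to the named pipe in the background, then run the other command with the named pipe as input.
$ mkfifo /tmp/sort2.fifo
$ sort file2 > /tmp/sort2.fifo &
$ sort file1 | diff - /tmp/sort2.fifo
$ rm /tmp/sort2.fifo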
Both alternatives are rather more cumbersome.
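To make the comparison concrete, the following sketch runs the temporary-file approach, the named-pipe approach, and process substitution on the same input and keeps each result in a variable. It assumes Bash with mkfifo available; the scratch directory, file contents, and variable names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: three ways to feed sorted data to diff; all should agree.
workdir=$(mktemp -d)
printf '%s\n' b a c > "$workdir/file1"
printf '%s\n' c d a > "$workdir/file2"

# 1. Temporary file.
sort "$workdir/file2" > "$workdir/file2.sorted"
via_tempfile=$(sort "$workdir/file1" | diff - "$workdir/file2.sorted")

# 2. Named pipe: the writer waits in the background until diff opens the pipe.
mkfifo "$workdir/sort2.fifo"
sort "$workdir/file2" > "$workdir/sort2.fifo" &
via_fifo=$(sort "$workdir/file1" | diff - "$workdir/sort2.fifo")
wait

# 3. Process substitution: no explicit setup or cleanup.
via_procsub=$(diff <(sort "$workdir/file1") <(sort "$workdir/file2"))

echo "$via_procsub"
rm -r "$workdir"
```

All three variables should hold the same diff output; only the amount of setup and cleanup differs.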
Process substitution can also be used to capture output that would normally go to a file and redirect it to the input of a process. The Bash syntax for writing to a process is >(command). Here is an example using the tee, wc and gzip commands that counts the lines in a file with wc -l and compresses it with gzip in one pass:
$ tee >(wc -l >&2) < bigfile | gzip > bigfile.gz
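A runnable variant of this example is sketched below, assuming Bash and gzip. The five-line input file is made up, and the line count written to standard error is captured into a variable (an addition for illustration) so that it can be checked afterwards.

```shell
#!/usr/bin/env bash
# Sketch: count and compress a file in a single pass over its contents.
workdir=$(mktemp -d)
printf 'line %d\n' 1 2 3 4 5 > "$workdir/bigfile"

# tee writes one copy to the >(...) "file", where wc -l counts the lines and
# reports on stderr, and another copy to stdout, which gzip compresses.
# The surrounding { ...; } 2>&1 captures that stderr report into $count.
count=$( { tee >(wc -l >&2) < "$workdir/bigfile" | gzip > "$workdir/bigfile.gz"; } 2>&1 )
count=${count//[[:space:]]/}      # some wc implementations pad the number

# Decompress again to confirm that the archive holds the same number of lines.
restored=$(gzip -dc "$workdir/bigfile.gz" | wc -l)
restored=${restored//[[:space:]]/}

echo "counted: $count, restored: $restored"
rm -r "$workdir"
```

Capturing standard error this way also makes the shell wait for wc to finish, which a bare >(command) does not guarantee.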
The main advantages of process substitution over its alternatives are:
- Simplicity: The commands can be given in-line; there is no need to save temporary files or create named pipes first.
- Performance: Reading directly from another process is often faster than having to write a temporary file to disk, then read it back in. This also saves disk space.
- Parallelism: The substituted process can be running concurrently with the command reading its output or writing its input, taking advantage of multiprocessing to reduce the total time for the computation.
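The parallelism point can be illustrated with a rough sketch in which two deliberately slow commands feed a single reader. The slow_source function is a hypothetical stand-in for an expensive command, and the timing relies on Bash's SECONDS variable, which counts whole seconds only.

```shell
#!/usr/bin/env bash
# slow_source is a hypothetical stand-in for an expensive command.
slow_source() {
    sleep 2
    echo "$1"
}

SECONDS=0   # Bash counts elapsed whole seconds in this variable.
# Both substituted commands are started before cat begins reading,
# so their two-second delays overlap instead of adding up.
output=$(cat <(slow_source first) <(slow_source second))
elapsed=$SECONDS

echo "$output"
echo "elapsed: about ${elapsed}s"
```

On an otherwise idle machine the elapsed time should be close to two seconds rather than the four that running the two commands one after the other would take.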
Under the hood, process substitution works by creating a named pipe (or, on systems that provide /dev/fd, a pipe reachable under that directory) and then substituting its name on the command line. (Because of this, process substitution is sometimes known as "anonymous named pipes.") To illustrate the steps involved, consider the following simple process substitution:
$ diff file1 <(sort file2)
The steps the shell performs are:
- Create a new named pipe. This special file is often named something like /dev/fd/63 on Unix-like systems; you can see the name by passing a process substitution to a command such as echo.
- Execute the substituted command in the background (sort file2 in this case), piping its output to the named pipe.
- Execute the primary command, replacing the substituted command with the name of the named pipe. In this case, the full command might expand to something like diff file1 /dev/fd/63.
- When execution is finished, remove the named pipe.
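The steps above can be approximated by hand, as in the following sketch. It is only an emulation: where /dev/fd is available, Bash typically hands out an already-open pipe under that directory rather than calling mkfifo. The scratch directory and file contents are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Show the name the shell substitutes; the exact path varies by system.
substituted_name=$(echo <(true))
echo "substituted name: $substituted_name"

# Hand-written approximation of: diff file1 <(sort file2)
workdir=$(mktemp -d)
printf '%s\n' a b c > "$workdir/file1"
printf '%s\n' c a b > "$workdir/file2"

mkfifo "$workdir/pipe"                      # step 1: create the pipe
sort "$workdir/file2" > "$workdir/pipe" &   # step 2: writer in the background
diff "$workdir/file1" "$workdir/pipe"       # step 3: pass the pipe's name
manual_status=$?
wait
rm "$workdir/pipe"                          # step 4: remove the pipe

# The same comparison, with the shell doing the bookkeeping.
diff "$workdir/file1" <(sort "$workdir/file2")
procsub_status=$?

rm -r "$workdir"
```

Both diff invocations compare the same data, so they should finish with the same exit status.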
Process substitution has some limitations: the "files" created are not seekable, which means the process reading from or writing to the file cannot perform random access; it must read or write once from start to finish. Programs that explicitly check the type of a file before opening it may refuse to work with process substitution, because the "file" resulting from process substitution is not a regular file. "It is not possible to obtain the exit code of a process substitution command from the shell that created the process substitution."
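Two of these limitations can be observed with a short sketch, assuming Bash. The describe function is a hypothetical helper that mimics a program checking the file type before opening it.

```shell
#!/usr/bin/env bash
# describe is a hypothetical helper that reports what kind of file it is given.
describe() {
    if [ -f "$1" ]; then
        echo regular
    elif [ -p "$1" ]; then
        echo pipe
    else
        echo other
    fi
}

# The substituted "file" does not pass the regular-file test.
kind=$(describe <(echo data))
echo "kind: $kind"

# The status the shell sees is that of cat, not of the substituted command,
# even though the substituted command exits with status 3.
cat <(exit 3)
outer_status=$?
echo "outer status: $outer_status"
```

On typical Linux systems the helper reports a pipe; the exact classification may vary by platform, but it should not be a regular file.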
- "Bash Reference Manual". The GNU Project. Free Software Foundation. 23 December 2009. Retrieved 1 October 2011.
- Cooper, Mendel (30 August 2011). "Advanced Bash-Scripting Guide". The Linux Documentation Project. Retrieved 1 October 2011.
- Frazier, Mitch (22 May 2008). "Bash Process Substitution". Linux Journal. Retrieved 1 October 2011.
- "ProcessSubstitution". Greg's Wiki. 27 June 2011.