cat (Unix)


The cat program is a standard Unix utility that reads files in sequence and writes their contents to standard output; it can be used to concatenate and display files. The name is an abbreviation of catenate, a synonym of concatenate.

Usage

The Single Unix Specification specifies that when the "cat" program is given files in a sequence as arguments, it will output their contents to the standard output in the same sequence. It mandates the support of one option flag, -u (unbuffered), by which each byte is written to standard output without buffering as it is read. Many operating systems do this by default and ignore the flag.

If one of the input filenames is specified as a single hyphen (-), then cat reads from standard input at that point in the sequence. If no files are specified, cat reads from standard input only.
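
The placement of the hyphen can be illustrated with a minimal sketch. The file names a.txt and b.txt are hypothetical and are created in a scratch directory only for the demonstration:

```shell
# Hypothetical input files, created in a temporary directory.
tmpdir=$(mktemp -d)
printf 'first\n' > "$tmpdir/a.txt"
printf 'last\n'  > "$tmpdir/b.txt"

# Standard input (here, the output of printf) is read at the
# position where "-" appears in the argument list.
result=$(printf 'middle\n' | cat "$tmpdir/a.txt" - "$tmpdir/b.txt")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

The three lines come out in argument order: first, middle, last.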

The cat command syntax is:

cat [options] [file_names]

cat will concatenate (put together) the input files in the order given and, unless its output is redirected, write the result to standard output, which is normally the terminal screen. It can also be used to write the files into a new file as follows:

cat [options] [file_names] > newfile.txt


You can also use a pipe to send the data to a different program. For example, to page through two files in sequence using the less command, you would use the following command:

cat file1 file2 | less


Options

Both the BSD versions of cat (as per the OpenBSD manpage) and the GNU coreutils version of cat specify the following options, except where noted:

-b (GNU only: --number-nonblank), number non-blank output lines
-n (GNU only: --number), number all output lines
-s (GNU only: --squeeze-blank), squeeze multiple adjacent blank lines into one
-v (GNU only: --show-nonprinting), display nonprinting characters as if they were visible, except for tabs and the end-of-line character
-t implies -v but also displays tabs as ^I; on GNU, -T (--show-tabs) displays tabs without implying -v
-e implies -v but also displays end-of-line characters as $; on GNU, -E (--show-ends) does so without implying -v
-A (GNU only, long form --show-all), show all characters, including tabs and end-of-line characters as ^I and $; equivalent to -vET
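
The following sketch exercises the short options that the BSD and GNU versions share. The file sample.txt is a hypothetical input created for the demonstration, and the exact column padding of the numbered output may differ between implementations:

```shell
# Hypothetical sample: two text lines separated by three blank lines.
tmpdir=$(mktemp -d)
printf 'one\n\n\n\ntwo\n' > "$tmpdir/sample.txt"

numbered_all=$(cat -n "$tmpdir/sample.txt")        # every line gets a number
numbered_nonblank=$(cat -b "$tmpdir/sample.txt")   # only "one" and "two" get numbers
squeezed=$(cat -s "$tmpdir/sample.txt")            # the blank run collapses to one line
visible=$(printf 'a\tb\n' | cat -et)               # tab shown as ^I, line end as $

printf '%s\n' "$numbered_all" "$numbered_nonblank" "$squeezed" "$visible"

rm -rf "$tmpdir"
```

With -n the line "two" is numbered 5, whereas with -b it is numbered 2, because the blank lines are not counted.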

Use cases

cat can be used to pipe a file to a program that expects data only on its input stream.

As cat simply catenates streams of bytes, it can also be used to concatenate binary files, in which case it joins the byte sequences without interpreting them.

As such, the two main use cases are text files and other (binary) files.

Text use

As a simple example, to concatenate two text files and write the result to a new file, you can use the following command:

cat file1.txt file2.txt > newcombinedfile.txt


With option -n, cat can also number lines as follows:


cat -n file1.txt file2.txt > newnumberedfile.txt


Concatenation of text works properly only for files that share the same encoding, such as ASCII. cat does not interpret its input, so it provides no way to properly concatenate Unicode text files that begin with a byte order mark (BOM): the BOM of every file after the first ends up in the middle of the output. In the same way, files using different text encodings cannot be concatenated properly with cat alone.
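
The BOM problem can be made visible with a short sketch. It assumes two hypothetical UTF-8 files that each start with the three-byte BOM (EF BB BF); the workaround with tail is one possible approach, valid only when the second file is known to carry a BOM:

```shell
# Two hypothetical UTF-8 files, each starting with a BOM (octal 357 273 277).
tmpdir=$(mktemp -d)
printf '\357\273\277alpha\n' > "$tmpdir/a.txt"
printf '\357\273\277beta\n'  > "$tmpdir/b.txt"

# Plain concatenation copies the second BOM into the middle of the output.
cat "$tmpdir/a.txt" "$tmpdir/b.txt" > "$tmpdir/combined.txt"
second_line_hex=$(sed -n 2p "$tmpdir/combined.txt" | od -An -tx1 | tr -d ' \n')

# Workaround (assumes b.txt really has a BOM): skip its first three bytes.
{ cat "$tmpdir/a.txt"; tail -c +4 "$tmpdir/b.txt"; } > "$tmpdir/fixed.txt"
fixed_line_hex=$(sed -n 2p "$tmpdir/fixed.txt" | od -An -tx1 | tr -d ' \n')

echo "plain cat:    $second_line_hex"
echo "BOM stripped: $fixed_line_hex"

rm -rf "$tmpdir"
```

In the plain result the second line begins with the bytes ef bb bf before the text; in the adjusted result it begins directly with the text.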

Other files

For many structured binary data sets, however, the result may not parse properly, for example if each file has its own header or footer, so this use of cat is often not especially useful. For some multimedia digital container formats the resulting file is valid, which provides an effective means of appending files, particularly video streams. Significantly, the MPEG program stream (MPEG-1 and MPEG-2) and DV (Digital Video) formats can be concatenated, because such a stream is fundamentally a stream of packets.

Further, any other video format can be concatenated by transcoding to one of these privileged formats, concatenating via cat, and then transcoding back.
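
Because cat copies bytes without regard to their structure, the pieces of any file can be rejoined exactly. The sketch below illustrates this with arbitrary data standing in for a binary file; the file names and the use of split and cmp are choices made for the demonstration, not part of the description above:

```shell
# 4 KiB of arbitrary bytes stands in for any binary file.
tmpdir=$(mktemp -d)
dd if=/dev/urandom of="$tmpdir/original.bin" bs=1024 count=4 2>/dev/null

# Cut it into 1 KiB pieces named piece.aa, piece.ab, and so on.
split -b 1024 "$tmpdir/original.bin" "$tmpdir/piece."

# The glob expands in sorted order, so the pieces are rejoined in sequence.
cat "$tmpdir"/piece.* > "$tmpdir/rejoined.bin"

# cmp -s succeeds only if the two files are byte-for-byte identical.
if cmp -s "$tmpdir/original.bin" "$tmpdir/rejoined.bin"; then
  verdict=identical
else
  verdict=different
fi
echo "$verdict"

rm -rf "$tmpdir"
```

Whether the rejoined bytes form a valid file of a given format still depends on the format, as described above.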

Unix culture

Jargon File definition

The Jargon File version 4.4.7 lists this as the definition of cat:

  1. To spew an entire file to the screen or some other output sink without pause (syn. blast).
  2. By extension, to dump large amounts of data at an unprepared target or with no intention of browsing it carefully. Usage: considered silly. Rare outside Unix sites. See also dd, BLT.

Among Unix fans, cat(1) is considered an excellent example of user-interface design, because it delivers the file contents without such verbosity as spacing or headers between the files, and because it does not require the files to consist of lines of text, but works with any sort of data.

Among Unix critics, cat(1) is considered the canonical example of bad user-interface design, because of its woefully unobvious name. It is far more often used to blast a single file to standard output than to concatenate two or more files. The name cat for the former operation is just as unintuitive as, say, LISP's cdr.

Useless use of cat

UUOC (from comp.unix.shell on Usenet) stands for "useless use of cat". comp.unix.shell observes: "The purpose of cat is to concatenate (or catenate) files. If it is only one file, concatenating it with nothing at all is a waste of time, and costs you a process." This is also referred to as "cat abuse". Nevertheless, the following usage is common:

cat filename | command arg1 arg2 argn

This can be rewritten using redirection of stdin instead, in either of the following forms (the latter is more traditional):

<filename command arg1 arg2 argn
command arg1 arg2 argn < filename

Beyond other benefits, the input redirection forms allow command to seek in the file, whereas the cat form does not: command then reads from a pipe, which is not seekable.
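
The difference in what the command receives on standard input can be observed directly. This sketch assumes a system that provides /dev/stdin (as Linux does); stdin_kind and data.txt are hypothetical names introduced for the demonstration:

```shell
# Hypothetical helper: report what kind of object standard input is.
# Assumes /dev/stdin is available on the system.
stdin_kind() {
  if [ -p /dev/stdin ]; then
    echo pipe            # not seekable
  elif [ -f /dev/stdin ]; then
    echo file            # regular file, seekable
  else
    echo other
  fi
}

tmpdir=$(mktemp -d)
printf 'some data\n' > "$tmpdir/data.txt"

via_cat=$(cat "$tmpdir/data.txt" | stdin_kind)
via_redirect=$(stdin_kind < "$tmpdir/data.txt")

echo "cat file | command : $via_cat"
echo "command < file     : $via_redirect"

rm -rf "$tmpdir"
```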

Another common case where cat is unnecessary is where a command defaults to operating on stdin but will read from a file if the filename is given as an argument. This is the case for many common commands; for example, the following:

cat "$file" | grep "$pattern"
cat "$file" | less

can instead be written as:

grep "$pattern" "$file"
less "$file"
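
As a quick check that the forms are interchangeable for a single file, the following sketch runs grep three ways on a hypothetical file, fruits.txt, and compares the results:

```shell
# Hypothetical input file and pattern.
tmpdir=$(mktemp -d)
file="$tmpdir/fruits.txt"
pattern='an'
printf 'banana\ncherry\nmango\n' > "$file"

with_cat=$(cat "$file" | grep "$pattern")      # extra cat process and a pipe
with_argument=$(grep "$pattern" "$file")       # file name passed as an argument
with_redirect=$(grep "$pattern" < "$file")     # stdin redirected from the file

if [ "$with_cat" = "$with_argument" ] && [ "$with_cat" = "$with_redirect" ]; then
  echo "all three forms print the same lines"
fi

rm -rf "$tmpdir"
```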

A common interactive use of cat for a single file is to output the content of a file to standard output. However, if the output is piped or redirected, cat is unnecessary.

Unless at least two files are named, the use of cat has no significant benefit. A UUOC campaign aims to eliminate this inefficiency from shell scripts by using redirection instead.

A similar but less significant issue is the use of echo to start a pipeline, as this can often be replaced by redirection from a string (a here string), as in:

echo -e 'user\npass' | ftp localhost
ftp localhost <<< $'user\npass'

This is less significant as echo is often internally implemented in the shell, and in any case is lighter-weight than cat.
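
The here-string syntax (<<<) is not available in every shell. Where portability matters, a here-document is one alternative. In this sketch, count_lines is a hypothetical stand-in for a program that reads its standard input, such as ftp in the example above:

```shell
# Hypothetical stand-in for a program that consumes standard input.
count_lines() { wc -l | tr -d ' '; }

# Pipeline form: an extra command feeds the input.
via_pipe=$(printf 'user\npass\n' | count_lines)

# Here-document form: the shell supplies the input itself.
via_heredoc=$(count_lines <<'EOF'
user
pass
EOF
)

echo "pipe: $via_pipe line(s), here-document: $via_heredoc line(s)"
```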

Benefits of using cat

The primary benefits of using cat, even when unnecessary, are avoiding human error and improving legibility. cat with one named file is safer where human error is a concern: one mistaken use of the default redirection symbol ">" instead of "<" (often adjacent on keyboards) may permanently delete[1] the file you meant to read.[2] In terms of legibility, a sequence of commands starting with cat and connected by pipes has a clear left-to-right flow of information, in contrast with the back-and-forth syntax and backwards-pointing arrows of using stdin redirection. Contrast:

command < in | command2 > out
<in command | command2 > out

with:

cat in | command | command2 > out
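
The risk of the mistyped redirection can be reproduced safely in a scratch directory. The sketch below also shows the shell's noclobber option (set -C), which is not discussed above and is included as one possible safeguard; in.txt is a hypothetical file:

```shell
tmpdir=$(mktemp -d)

# The typo: ">" where "<" was meant. The shell truncates in.txt
# before sort even starts, so the original contents are lost.
printf 'precious data\n' > "$tmpdir/in.txt"
sort > "$tmpdir/in.txt" </dev/null
after_typo=$(wc -c < "$tmpdir/in.txt" | tr -d ' ')

# Same typo with noclobber enabled: the redirection is refused.
printf 'precious data\n' > "$tmpdir/in.txt"
set -C
if (sort > "$tmpdir/in.txt") </dev/null 2>/dev/null; then
  outcome=clobbered
else
  outcome=protected
fi
set +C
after_noclobber=$(cat "$tmpdir/in.txt")

echo "bytes left after typo: $after_typo"
echo "with noclobber: $outcome ($after_noclobber)"

rm -rf "$tmpdir"
```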

Culture

Since 1995, occasional awards for UUOC have been given out, usually by Perl programmer Randal L. Schwartz. There is a web page devoted to this and other similar awards.[3] In British hackerdom the activity of fixing instances of UUOC is sometimes called demoggification.[4]

Other operating systems

The equivalent command in the VMS, CP/M, DOS, OS/2, and Microsoft Windows operating system command shells is type.

In DOS/Windows multiple files may be combined with the "copy /b" command syntax, for example:

copy /b file1.txt + file2.txt file3.txt


This copies file1.txt and file2.txt in binary mode to one file, file3.txt.

References

  1. More accurately stated, ">" will truncate the file.
  2. The default behavior for redirection is to clobber the file to its immediate right.
  3. Useless Use of Cat Award
  4. moggy is a chiefly British word for "(mongrel) cat", hence demoggification literally means "removal of (non-special) cats".
