file (command)

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Glenn (talk | contribs) at 05:00, 30 December 2009 (catchg Category:Unix SUS2008 utilities). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

file is a standard Unix program for determining the type of data contained in a computer file.

History

file first appeared in Unix Research Version 4[1] in 1973. System V brought a major update with several important changes, most notably moving the file type information out of the binary and into an external text file.

All major BSD and Linux distributions use a free, open-source reimplementation written from scratch in 1986-87 by Ian Darwin[2]. It was expanded by Geoff Collyer in 1989 and has since had input from many others, including Guy Harris, Chris Lowth and Eric Fischer; since late 1993 its maintenance has been organized by Christos Zoulas.

Specification

The Single Unix Specification (SUS) specifies that a series of tests is performed on the file specified on the command line:

  1. if the file cannot be read, or its status or type cannot be determined, file will indicate that the file was processed and that its type was undetermined
  2. file must be able to determine the types directory, FIFO, socket, block special, and character special
  3. zero-length files are identified as such
  4. an initial part of the file is considered, and file is to use position-sensitive tests
  5. the entire file is considered, and file is to use context-sensitive tests
  6. otherwise, the file is identified as a data file
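
The order of the tests above can be illustrated with a short Python sketch. It is not the code of any real file implementation: the two entries in POSITION_TESTS, the printable-ASCII check standing in for a context-sensitive test, and the returned labels are all assumptions chosen for illustration.

```python
import os
import stat

# Hypothetical position-sensitive tests: (offset, expected bytes, description).
POSITION_TESTS = [
    (0, b"\x7fELF", "ELF"),
    (0, b"\x1f\x8b", "gzip compressed data"),
]

# File types that can be recognized from the file status alone (step 2).
SPECIAL_TYPES = [
    (stat.S_ISDIR, "directory"),
    (stat.S_ISFIFO, "fifo"),
    (stat.S_ISSOCK, "socket"),
    (stat.S_ISBLK, "block special"),
    (stat.S_ISCHR, "character special"),
]


def classify_bytes(data):
    """Steps 3 to 6: classify the contents of a regular file."""
    # Step 3: zero-length files are identified as such.
    if len(data) == 0:
        return "empty"
    # Step 4: position-sensitive tests compare bytes at fixed offsets.
    for offset, expected, description in POSITION_TESTS:
        if data[offset:offset + len(expected)] == expected:
            return description
    # Step 5: a crude stand-in for a context-sensitive test that looks at
    # the entire file: every byte is printable ASCII or common whitespace.
    if all(32 <= byte < 127 or byte in b"\t\n\r" for byte in data):
        return "ASCII text"
    # Step 6: nothing matched, so the file is reported as plain data.
    return "data"


def classify(path):
    """Steps 1 and 2, then hand regular files to classify_bytes."""
    # Step 1: the status of the file cannot be determined.
    try:
        mode = os.stat(path).st_mode
    except OSError:
        return "cannot determine type"
    # Step 2: types that are known from the file status alone.
    for is_type, name in SPECIAL_TYPES:
        if is_type(mode):
            return name
    # Step 1 again: the file exists but cannot be read.
    try:
        with open(path, "rb") as handle:
            return classify_bytes(handle.read())
    except OSError:
        return "cannot determine type"
```

For instance, classify(".") returns "directory" without ever opening the path, while a file beginning with the bytes 1f 8b is reported as gzip data by the position-sensitive step before the text heuristic is reached.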

file's position-sensitive tests are normally implemented by matching various locations within the file against a textual database of magic numbers (see the Usage section). Because it inspects the file's contents, this approach differs from simpler methods that rely on file extensions and from labeling schemes like MIME.

In most implementations, the file command uses a database to drive the probing of the lead bytes. That database is stored in a file called "magic", usually located at /etc/magic, /usr/share/file/magic or a similar path.
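
As a rough illustration of how a textual database can drive the probing, the following Python sketch parses a few rules and matches them against a byte string. The four-column layout (offset, type, value, description) is a simplified subset assumed for this example; real magic files support many more types and features. MAGIC_TEXT, parse_magic and identify are hypothetical names, not part of any file implementation.

```python
import struct

# Hypothetical excerpt in a simplified magic-like format:
# offset, type, expected value, description (whitespace separated).
MAGIC_TEXT = r"""
# offset  type     value     description
0         string   \177ELF   ELF
0         beshort  0x1f8b    gzip compressed data
0         string   P6        Netpbm PPM "rawbits" image data
257       string   ustar     tar archive
"""

# struct formats for the numeric types handled by this sketch.
NUMERIC_FORMATS = {"beshort": ">H", "leshort": "<H", "belong": ">I", "lelong": "<I"}


def parse_magic(text):
    """Turn the textual rules into (offset, expected bytes, description)."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        offset, kind, value, description = line.split(None, 3)
        if kind == "string":
            # Interpret backslash escapes such as \177 as raw bytes.
            expected = value.encode("ascii").decode("unicode_escape").encode("latin-1")
        else:
            # Numeric types are packed with the byte order the type names.
            expected = struct.pack(NUMERIC_FORMATS[kind], int(value, 0))
        rules.append((int(offset, 0), expected, description))
    return rules


def identify(data, rules):
    """Return the description of the first rule whose bytes match."""
    for offset, expected, description in rules:
        if data[offset:offset + len(expected)] == expected:
            return description
    return "data"
```

Keeping the rules in text rather than in code means new file types can be recognized by editing the database alone, which is the motivation the History section gives for the System V change.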

Usage

The SUS mandates the following options:

-M file, specify a specially formatted file containing position-sensitive tests; default position-sensitive tests and context-sensitive tests will not be performed
-m file, as for -M, but default tests will be performed after the tests contained in file
-d, perform default position-sensitive and context-sensitive tests on the given file; this is the default behaviour unless -M or -m is specified
-h, identify links as such, unless the link points to a nonexistent file
-i, do not classify the file further than to identify it as one of: nonexistent, directory, FIFO, socket, block special, character special, symbolic link, regular file, empty file, unreadable file, executable, ar archive, extended cpio format, extended tar format, shell script, C programming language source, FORTRAN programming language source, or a data file

Other Unix and Unix-like operating systems may add options beyond these.
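
The effect of -M, -m and -d on which tests run, and in what order, can be summarized in a small Python sketch. The function name plan_tests and the "default" label are hypothetical, and the sketch assumes that at most one of -M and -m is given.

```python
def plan_tests(dash_M=None, dash_m=None):
    """Return the ordered list of test sources implied by the options.

    dash_M and dash_m are the file names given to -M and -m (or None).
    """
    if dash_M is not None:
        # -M: only the tests from the named file; the defaults are skipped.
        return [dash_M]
    if dash_m is not None:
        # -m: tests from the named file first, then the default tests.
        return [dash_m, "default"]
    # -d, or no option at all: the default tests only.
    return ["default"]
```

With a hypothetical test file named mytests, plan_tests(dash_m="mytests") yields the named file followed by the defaults, whereas plan_tests(dash_M="mytests") yields the named file alone.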

Examples

# file file.c
file.c: C program text

# file program
program: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked 
    (uses shared libs), stripped

# file /dev/wd0a
/dev/wd0a: block special (0/0)

# file -s /dev/hda1
/dev/hda1: Linux/i386 ext2 filesystem

# file -s /dev/hda5
/dev/hda5: Linux/i386 swap file

# file compressed.gz
compressed.gz: gzip compressed data, deflated, original filename, `compressed', last
    modified: Thu Jan 26 14:08:23 2006, os: Unix

# file data.ppm
data.ppm: Netpbm PPM "rawbits" image data

References

  1. See the copy of the UNIX V4 man page.
  2. The history of this program is recorded in its private CVS repository; see the log of the main program.
