|Initial release||1973 as part of Unix Research Version 4; 1986 open-source reimplementation|
|Operating system||Unix, Unix-like, Cross-platform|
|Type||file type detector|
|License||BSD license, CDDL|
The original version of file appeared in Unix Research Version 4 in 1973. System V brought a major update with several important changes, most notably moving the file type information out of the binary and into an external text file.
Most major BSD and Linux distributions use a free, open-source reimplementation written from scratch by Ian Darwin in 1986-87. It was expanded by Geoff Collyer in 1989 and has since had input from many others, including Guy Harris, Chris Lowth and Eric Fischer; from late 1993 onward its maintenance has been organized by Christos Zoulas. The OpenBSD system has its own subset implementation written from scratch, but still uses the Darwin/Zoulas collection of information in magic file format.
The Single Unix Specification (SUS) specifies that a series of tests are performed on the file specified on the command line:
- if the file cannot be read, or its Unix file type is undetermined, the file program will indicate that the file was processed but its type was undetermined.
- file must be able to identify the following file types: directory, FIFO, socket, block special file, and character special file
- zero-length files are identified as such
- an initial part of the file is considered and file is to use position-sensitive tests
- the entire file is considered and file is to use context-sensitive tests
- the file is identified as a data file
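The order of these tests can be sketched in Python. This is only an illustration under stated assumptions: the function name `classify`, the returned strings, and the two hard-coded signatures are hypothetical, and the context-sensitive step is reduced to a crude printable-ASCII check. It is not how any real implementation of file is written.

```python
import os
import stat
import tempfile

def classify(path):
    """Hypothetical sketch of the test order; not a real implementation."""
    try:
        st = os.stat(path)
    except OSError:
        return "type undetermined"      # the file could not be examined
    mode = st.st_mode
    # Unix file types other than regular files.
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISFIFO(mode):
        return "fifo"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISBLK(mode):
        return "block special"
    if stat.S_ISCHR(mode):
        return "character special"
    # Zero-length files are reported as such.
    if st.st_size == 0:
        return "empty"
    try:
        with open(path, "rb") as f:
            data = f.read()
    except OSError:
        return "type undetermined"      # the file could not be read
    # Position-sensitive tests: fixed bytes at fixed offsets.
    if data.startswith(b"\x1f\x8b"):
        return "gzip compressed data"
    if data.startswith(b"\x7fELF"):
        return "ELF"
    # Context-sensitive test (crude): the whole file is printable ASCII.
    if all(b in b"\t\n\r" or 32 <= b < 127 for b in data):
        return "ASCII text"
    # Nothing matched: fall back to a plain data file.
    return "data"

with tempfile.TemporaryDirectory() as d:
    sample = os.path.join(d, "note.txt")
    with open(sample, "wb") as f:
        f.write(b"hello\n")
    print(classify(d))       # directory
    print(classify(sample))  # ASCII text
```

Each step returns as soon as it succeeds, so the cheaper checks on the file's metadata run before any of its content is read.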
file's position-sensitive tests are normally implemented by matching various locations within the file against a textual database of magic numbers (see the Usage section). This differs from other simpler methods such as file extensions and schemes like MIME.
In most implementations, the file command uses a database to drive the probing of the leading bytes. That database is stored in a file called magic, usually located at /etc/magic, /usr/share/file/magic or a similar path.
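The idea of a text database driving the probing can be sketched as follows. The three-column format used here (offset, hex bytes, description) is a deliberately simplified, made-up format and is not the syntax of the real magic file; the names `TOY_MAGIC`, `load_rules` and `match_magic` are likewise hypothetical.

```python
# Hypothetical simplified database: "offset  hex-bytes  description".
# This is NOT the syntax of the real magic file.
TOY_MAGIC = """\
# offset  bytes       description
0         1f8b        gzip compressed data
0         7f454c46    ELF
0         5036        Netpbm PPM "rawbits" image data
257       7573746172  tar archive
"""

def load_rules(text):
    """Parse the toy database into (offset, pattern, description) tuples."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blanks and comments
        offset, hex_bytes, description = line.split(None, 2)
        rules.append((int(offset), bytes.fromhex(hex_bytes), description))
    return rules

def match_magic(data, rules):
    """Return the description of the first rule whose bytes match, else None."""
    for offset, pattern, description in rules:
        if data[offset:offset + len(pattern)] == pattern:
            return description
    return None

RULES = load_rules(TOY_MAGIC)
print(match_magic(b"\x1f\x8b\x08\x00", RULES))  # gzip compressed data
print(match_magic(b"plain text", RULES))        # None
```

Keeping the rules in text rather than in code means new formats can be recognized by editing the database alone, which is the motivation for the external file.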
The SUS mandates the following options:
- -M file, specify a specially formatted file containing position-sensitive tests; default position-sensitive tests and context-sensitive tests will not be performed.
- -m file, as for -M, but default tests will be performed after the tests contained in file.
- -d, perform default position-sensitive and context-sensitive tests on the given file; this is the default behaviour unless -M or -m is specified.
- -h, do not dereference symbolic links that point to an existing file or directory.
- -L, dereference the symbolic link that points to an existing file or directory.
- -i, do not classify the file further than to identify it as either: nonexistent, a block special file, a character special file, a directory, a FIFO, a socket, a symbolic link, or a regular file. Linux and BSD systems behave differently with this option and instead output an Internet media type (“MIME type”) identifying the recognized file format.
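The difference between -h and -L, together with the coarse classification described for -i, can be illustrated with a short Python sketch. The function `basic_type` and its `dereference` parameter are hypothetical names; the sketch only shows the distinction between examining a link and examining its target, and its handling of dangling links is an assumption.

```python
import os
import stat
import tempfile

def basic_type(path, dereference=False):
    """Classify by Unix file type only.

    dereference=False examines a symbolic link itself (like -h);
    dereference=True examines what the link points to (like -L).
    """
    try:
        if dereference:
            try:
                st = os.stat(path)        # follows symbolic links
            except FileNotFoundError:
                st = os.lstat(path)       # dangling link: examine the link
        else:
            st = os.lstat(path)           # does not follow symbolic links
    except FileNotFoundError:
        return "nonexistent"
    mode = st.st_mode
    if stat.S_ISLNK(mode):
        return "symbolic link"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISFIFO(mode):
        return "FIFO"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISBLK(mode):
        return "block special file"
    if stat.S_ISCHR(mode):
        return "character special file"
    return "regular file"

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "target")
    link = os.path.join(d, "link")
    open(target, "wb").close()
    os.symlink(target, link)
    print(basic_type(link))                    # symbolic link
    print(basic_type(link, dereference=True))  # regular file
```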
The command tells only what the file looks like, not what it is (in the case where file looks at the content). It is easy to fool the program by putting a magic number into a file the content of which does not match it. Thus the command is not usable as a security tool other than in specific situations.
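A small Python sketch shows why a magic-number match alone proves little. It compares a check that looks only at the first two bytes with one that actually tries to decompress the data; the helper names `looks_like_gzip` and `is_valid_gzip` are hypothetical, and the two-byte check stands in for a position-sensitive test rather than reproducing what file does.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"   # the two leading bytes of a gzip stream

def looks_like_gzip(data):
    """Position-sensitive check only: compare the leading bytes."""
    return data.startswith(GZIP_MAGIC)

def is_valid_gzip(data):
    """Stronger check: the data must actually decompress."""
    try:
        gzip.decompress(data)
        return True
    except (OSError, EOFError, zlib.error):
        return False

genuine = gzip.compress(b"hello")
forged = GZIP_MAGIC + b"this is not compressed data at all"

print(looks_like_gzip(genuine), is_valid_gzip(genuine))  # True True
print(looks_like_gzip(forged), is_valid_gzip(forged))    # True False
```

The forged buffer passes the magic-number check but fails as soon as anything tries to use it as gzip data.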
$ file program
program: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), stripped
$ file /dev/hda1
/dev/hda1: block special (0/0)
$ file -s /dev/hda1
/dev/hda1: Linux/i386 ext2 filesystem
Note that -s is a non-standard option available only on some platforms, which tells file to read device files and try to identify their contents rather than merely identifying them as device files. Normally file does not try to read device files since reading such a file can have undesirable side effects.
$ file compressed.gz
compressed.gz: gzip compressed data, deflated, original filename, `compressed', last modified: Thu Jan 26 14:08:23 2006, os: Unix
$ file data.ppm
data.ppm: Netpbm PPM "rawbits" image data
$ file /bin/cat
/bin/cat: Mach-O universal binary with 2 architectures
/bin/cat (for architecture ppc7400): Mach-O executable ppc
/bin/cat (for architecture i386): Mach-O executable i386
As of version 4.00 of the Ian Darwin/Christos Zoulas version of file, the functionality of file is incorporated into a libmagic library that is accessible via C (and C-compatible) linking; file is implemented using that library.