Hard link
In computing, a hard link is a reference, or pointer, to physical data on a storage volume. On most file systems, all named files are hard links. The name associated with the file is simply a label that refers the operating system to the actual data. As such, more than one name can be associated with the same data. Whichever name is used to open the file, any changes made affect the same underlying data and are visible under every other name. Hard links can only refer to data that exists on the same file system.
On Unix-like systems, hard links can be created with the link() system call, or the ln utility.
On Microsoft Windows, hard links can be created only on NTFS volumes, either with fsutil hardlink or mklink. Also, the Cygwin set of utilities has a Unix-like ln command.
The process of unlinking disassociates a name from the data on the volume. The data is still accessible as long as at least one link that points to it still exists. When the last link is removed, the space is considered free. A process ambiguously called undeleting allows the recreation of links to data that is no longer associated with a name. However, this process is not available on all systems and is often not reliable.
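The linking and unlinking behaviour described above can be sketched in Python, whose os.link and os.unlink functions wrap the corresponding system calls on Unix-like systems. This is a minimal illustration rather than a definitive implementation; the function name and the file names first.txt and second.txt are hypothetical, and the temporary directory is assumed to sit on a file system that supports hard links.

```python
import os
import tempfile


def link_then_unlink(text):
    """Give one piece of data two names, remove the first name,
    and read the data back through the second name."""
    with tempfile.TemporaryDirectory() as workdir:
        first = os.path.join(workdir, "first.txt")    # hypothetical name
        second = os.path.join(workdir, "second.txt")  # hypothetical name

        with open(first, "w") as f:
            f.write(text)

        os.link(first, second)  # add a second name for the same data
        os.unlink(first)        # remove one name; the data itself stays

        first_still_exists = os.path.exists(first)
        with open(second) as f:
            return first_still_exists, f.read()


if __name__ == "__main__":
    print(link_then_unlink("hello"))
```

The first name disappears, but the data remains readable because one link to it still exists.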
Link counter
Most file systems that support hard links use reference counting. An integer value is stored with each physical data section. This integer represents the total number of links that have been created to point to the data. When a new link is created, this value is increased by one. When a link is removed, the value is decreased by one. Maintaining this value helps prevent data loss, because the data is released only once the count reaches zero. It is also the simplest way for the file system to track the use of a given area of storage, as zero values indicate free space and nonzero values indicate used space.
On Unix, the reference count for a file or directory is returned by the stat() or fstat() system calls in the st_nlink field of struct stat. In contrast, programming language implementations that use reference counting rarely expose the reference count to the program being executed, since this information is just an implementation detail.
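As a rough illustration of the link counter, the following Python sketch reads st_nlink through os.stat, which exposes the fields of struct stat on Unix-like systems. The function name and file names are hypothetical, and the sketch assumes a file system that maintains a conventional link count for regular files.

```python
import os
import tempfile


def link_counts():
    """Record st_nlink before linking, after linking, and after unlinking."""
    with tempfile.TemporaryDirectory() as workdir:
        original = os.path.join(workdir, "data.txt")    # hypothetical name
        extra = os.path.join(workdir, "data-link.txt")  # hypothetical name

        with open(original, "w") as f:
            f.write("payload")

        counts = [os.stat(original).st_nlink]       # one name so far

        os.link(original, extra)
        counts.append(os.stat(original).st_nlink)   # count increased by one

        os.unlink(extra)
        counts.append(os.stat(original).st_nlink)   # count decreased by one

        return counts


if __name__ == "__main__":
    print(link_counts())
```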
Example
In the figure to the right, there are two hard links named "LINK A.TXT" and "LINK B.TXT". They have both been linked - that is, made to point - to the same physical data.
If the filename "LINK A.TXT" is opened in an editor, modified and saved, then those changes will be visible even if the filename "LINK B.TXT" is opened for viewing, since both filenames point to the same data. The same is true if the file is opened as "LINK B.TXT" - or any other name associated with the data.
Additional links can also be created to the physical data. The user need only specify the name of an existing link; the operating system will resolve the location of the actual data section.
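The example can be reproduced with a short Python sketch. It assumes that the modification is written in place to the existing data; a program that saves by replacing the file with a new one would not behave this way. The function name and the third name "LINK C.TXT" are hypothetical additions for illustration.

```python
import os
import tempfile


def edit_via_one_name():
    """Modify the data through one name and read it through the others."""
    with tempfile.TemporaryDirectory() as workdir:
        link_a = os.path.join(workdir, "LINK A.TXT")
        link_b = os.path.join(workdir, "LINK B.TXT")
        link_c = os.path.join(workdir, "LINK C.TXT")  # hypothetical third name

        with open(link_a, "w") as f:
            f.write("first draft\n")
        os.link(link_a, link_b)

        # Append in place through LINK A.TXT.
        with open(link_a, "a") as f:
            f.write("revision\n")

        # Create an additional link by naming an existing link (LINK B.TXT).
        os.link(link_b, link_c)

        with open(link_b) as f:
            seen_via_b = f.read()
        with open(link_c) as f:
            seen_via_c = f.read()

        # samefile() reports whether two names refer to the same file.
        return seen_via_b, seen_via_c, os.path.samefile(link_a, link_c)


if __name__ == "__main__":
    print(edit_via_one_name())
```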
If one of the links is removed (e.g., with the Unix rm command), then the data is still accessible under any other links that remain. If all of the links are removed and no process has the file open, then the space occupied by the data will be considered free, allowing it to be reused in the future for other files. These semantics allow an open file to be deleted without affecting the process that uses it - an action which is impossible on file systems with a 1-to-1 relationship between directory entries and data.
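Deleting a file that is still open can be sketched as follows. This assumes a Unix-like system; on other platforms the removal of an open file may be refused. The function name and the file name scratch.dat are hypothetical.

```python
import os
import tempfile


def read_after_unlink(payload):
    """Remove the only name of an open file, then keep reading from it."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "scratch.dat")  # hypothetical name

        with open(path, "wb") as f:
            f.write(payload)

        with open(path, "rb") as handle:
            os.unlink(path)                     # last link removed while open
            name_exists = os.path.exists(path)  # the name is gone
            data = handle.read()                # the data is still readable

        # The storage can be reclaimed only after the handle is closed.
        return name_exists, data


if __name__ == "__main__":
    print(read_after_unlink(b"still here"))
```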
Limitations of hard links
There are some issues with hard links that can sometimes make them unsuitable. First of all, because the link is identical to the thing it points to, it becomes difficult to give a command such as "list all the contents of this directory recursively but ignore any links". Most modern operating systems do not allow hard links to directories, in order to prevent endless recursion. (Mac OS X 10.5 "Leopard" is a notable exception[1].) Another drawback of hard links is that they have to be located within the same file system, and most large systems today consist of multiple file systems.
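A program can cope with the same-file-system restriction by attempting the link and checking for the error that POSIX systems report when the two paths lie on different file systems (EXDEV). The sketch below assumes a Unix-like system; try_hard_link and demo are hypothetical helper names, and the demo only exercises the successful case, since a second mounted file system cannot be assumed.

```python
import errno
import os
import tempfile


def try_hard_link(existing, new_name):
    """Return True if the link was created, or False if the two paths
    are on different file systems. Other errors are re-raised."""
    try:
        os.link(existing, new_name)
        return True
    except OSError as exc:
        if exc.errno == errno.EXDEV:  # cross-device link not permitted
            return False
        raise


def demo():
    """Link two names inside one directory, which share a file system."""
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "source.txt")  # hypothetical name
        alias = os.path.join(workdir, "alias.txt")    # hypothetical name
        with open(source, "w") as f:
            f.write("x")
        return try_hard_link(source, alias)


if __name__ == "__main__":
    print(demo())
```

When try_hard_link returns False, a caller might fall back to a symbolic link or to copying the data, depending on what the application needs.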
See also
- Symbolic link
- NTFS junction point
- alias (Mac OS)
- shadow (OS/2)
- ln (Unix): the ln command, which is used to create new links on Unix-like systems.