Andrew File System
The Andrew File System (AFS) is a distributed file system which uses a set of trusted servers to present a homogeneous, location-transparent file name space to all the client workstations. It was developed by Carnegie Mellon University as part of the Andrew Project. It is named after Andrew Carnegie and Andrew Mellon. Its primary use is in distributed computing.
AFS has several benefits over traditional networked file systems, particularly in the areas of security and scalability. It is not uncommon for enterprise AFS deployments to exceed 25,000 clients. AFS uses Kerberos for authentication, and implements access control lists on directories for users and groups. Each client caches files on the local filesystem for increased speed on subsequent requests for the same file. This also allows limited filesystem access in the event of a server crash or a network outage.
AFS uses a weak consistency model. Read and write operations on an open file are directed only to the locally cached copy. When a modified file is closed, the changed portions are copied back to the file server. Cache consistency is maintained by a callback mechanism: when a file is cached, the server makes a note of this and promises to inform the client if the file is updated by someone else. Callbacks are discarded and must be re-established after any client, server, or network failure, including a time-out. Re-establishing a callback involves a status check and does not require re-reading the file itself.
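The callback behaviour described above can be illustrated with a minimal sketch in Python. This is a toy model, not the AFS protocol: the class names `FileServer` and `Client`, the per-file version counter, the `lose_callbacks` helper, and the path `/afs/demo/notes` are all assumptions made for illustration, and the sketch writes back the whole file on close rather than only the changed portions.

```python
class FileServer:
    """Toy file server that remembers which clients hold a callback per file."""

    def __init__(self):
        self.files = {}      # path -> (version, data)
        self.callbacks = {}  # path -> set of clients holding a callback

    def status(self, client, path):
        """Cheap status check: return the version and (re-)register a callback."""
        self.callbacks.setdefault(path, set()).add(client)
        return self.files.get(path, (0, b""))[0]

    def fetch(self, client, path):
        """Return (version, data) and register a callback for the client."""
        self.callbacks.setdefault(path, set()).add(client)
        return self.files.get(path, (0, b""))

    def store(self, client, path, data):
        """Accept a written-back file and notify every other callback holder."""
        version = self.files.get(path, (0, b""))[0] + 1
        self.files[path] = (version, data)
        for other in self.callbacks.get(path, set()) - {client}:
            other.break_callback(path)
        self.callbacks[path] = {client}
        return version


class Client:
    """Toy client that serves reads and writes from its local cache."""

    def __init__(self, server):
        self.server = server
        self.cache = {}     # path -> (version, data)
        self.valid = set()  # paths whose callback is still in force
        self.dirty = set()  # paths modified locally since they were opened
        self.fetches = 0    # number of full-file transfers from the server

    def open(self, path):
        if path in self.valid:
            return  # callback still held: trust the cached copy
        version = self.server.status(self, path)
        cached = self.cache.get(path)
        if cached is None or cached[0] != version:
            self.cache[path] = self.server.fetch(self, path)
            self.fetches += 1
        self.valid.add(path)

    def read(self, path):
        return self.cache[path][1]

    def write(self, path, data):
        self.cache[path] = (self.cache[path][0], data)
        self.dirty.add(path)

    def close(self, path):
        if path in self.dirty:  # write back only on close
            version = self.server.store(self, path, self.cache[path][1])
            self.cache[path] = (version, self.cache[path][1])
            self.dirty.discard(path)

    def break_callback(self, path):
        self.valid.discard(path)  # server reports that the file changed

    def lose_callbacks(self):
        self.valid.clear()  # simulate a failure or time-out


server = FileServer()
a, b = Client(server), Client(server)
a.open("/afs/demo/notes")
b.open("/afs/demo/notes")
a.write("/afs/demo/notes", b"hello")
print(b.read("/afs/demo/notes"))  # b'' because a has not closed the file yet
a.close("/afs/demo/notes")        # write-back breaks b's callback
b.open("/afs/demo/notes")
print(b.read("/afs/demo/notes"))  # b'hello'
```

In this model, a client that loses its callbacks without the file having changed performs only the status check on the next open; the version comparison stands in for whatever status information a real client would compare.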
A consequence of the file locking strategy is that AFS does not support large shared databases or record updating within files shared between client systems. This was a deliberate design decision based on the perceived needs of the university computing environment. It leads, for example, to the use of a single file per message in the original email system for the Andrew Project, the Andrew Message System, rather than a single file per mailbox (i.e., maildir instead of mbox). See "AFS and buffered I/O Problems" for handling shared databases.
A significant feature of AFS is the volume, a tree of files, sub-directories and AFS mountpoints (links to other AFS volumes). Volumes are created by administrators and linked at a specific named path in an AFS cell. Once a volume is created, users of the filesystem may create directories and files in it as usual without concern for the physical location of the volume. A volume may have a quota assigned to it in order to limit the amount of space consumed. As needed, AFS administrators can move a volume to another server and disk location without the need to notify users; indeed, the operation can occur while files in that volume are being used.
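A small Python sketch may help make the location transparency of volumes concrete. It is a simplified model under stated assumptions: the `Cell` class, its method names, the cell name `example.org`, the server names `fs1` and `fs2`, and the partition labels are hypothetical and do not correspond to any AFS API.

```python
class Cell:
    """Toy AFS cell: mount points name volumes, and volumes name locations."""

    def __init__(self):
        self.mounts = {}    # mount path -> volume name
        self.location = {}  # volume name -> (server, partition)
        self.quota_kb = {}  # volume name -> quota in kilobytes
        self.used_kb = {}   # volume name -> space consumed in kilobytes

    def create_volume(self, name, server, partition, mount_path, quota_kb):
        self.mounts[mount_path.rstrip("/")] = name
        self.location[name] = (server, partition)
        self.quota_kb[name] = quota_kb
        self.used_kb[name] = 0

    def move_volume(self, name, server, partition):
        # Only the location record changes; every path stays the same.
        self.location[name] = (server, partition)

    def resolve(self, path):
        """Map a path to (volume, server, partition) via the longest mount."""
        matches = [m for m in self.mounts
                   if path == m or path.startswith(m + "/")]
        if not matches:
            raise FileNotFoundError(path)
        volume = self.mounts[max(matches, key=len)]
        return (volume, *self.location[volume])

    def charge(self, path, kb):
        """Account for kb of new data, enforcing the volume quota."""
        volume = self.resolve(path)[0]
        if self.used_kb[volume] + kb > self.quota_kb[volume]:
            raise OSError(f"quota exceeded on volume {volume}")
        self.used_kb[volume] += kb


cell = Cell()
cell.create_volume("user.alice", "fs1", "/vicepa",
                   "/afs/example.org/user/alice", quota_kb=100)
print(cell.resolve("/afs/example.org/user/alice/notes.txt"))
cell.move_volume("user.alice", "fs2", "/vicepb")
print(cell.resolve("/afs/example.org/user/alice/notes.txt"))
```

The path passed to `resolve` is identical before and after the move; only the server and partition in the result differ, which is the property that lets administrators relocate a volume without notifying users.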
AFS volumes can be replicated to read-only cloned copies. When accessing files in a read-only volume, a client system will retrieve data from a particular read-only copy. If at some point that copy becomes unavailable, clients will look for any of the remaining copies. Again, users of that data are unaware of the location of the read-only copy; administrators can create and relocate such copies as needed. The AFS command suite guarantees that all read-only volumes contain exact copies of the original read-write volume at the time the read-only copy was created.
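The read-only replication behaviour can be sketched in Python as follows. This is an illustrative model only: the `ReplicatedVolume` class, the `release` method name, and the site names `ro1` and `ro2` are assumptions, and the sketch refreshes every site at once without modelling partial failures during a refresh.

```python
import copy


class ReplicatedVolume:
    """Toy volume with one read-write copy and several read-only snapshots."""

    def __init__(self, sites):
        self.rw = {}                            # path -> contents (read-write)
        self.ro = {site: {} for site in sites}  # site -> snapshot of self.rw
        self.down = set()                       # sites currently unreachable

    def release(self):
        """Refresh every read-only site with an exact copy of the read-write data."""
        for site in self.ro:
            self.ro[site] = copy.deepcopy(self.rw)

    def read(self, path):
        """Read from the first reachable read-only site, failing over in order."""
        for site, snapshot in self.ro.items():
            if site in self.down:
                continue
            return site, snapshot[path]
        raise ConnectionError("no read-only copy is reachable")


vol = ReplicatedVolume(["ro1", "ro2"])
vol.rw["motd"] = "v1"
vol.release()
vol.rw["motd"] = "v2"    # not visible to readers until the next release
print(vol.read("motd"))  # ('ro1', 'v1')
vol.down.add("ro1")      # the preferred copy becomes unavailable
print(vol.read("motd"))  # ('ro2', 'v1')
```

Readers keep seeing the contents as of the last release, whichever site answers, which mirrors the guarantee that each read-only copy matches the read-write volume at the time the copy was created.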
The file name space on an Andrew workstation is partitioned into a shared and a local name space. The shared name space (usually mounted as /afs on the Unix filesystem) is identical on all workstations. The local name space is unique to each workstation. It contains only temporary files needed for workstation initialization and symbolic links to files in the shared name space.
The Andrew File System heavily influenced Version 4 of Sun Microsystems' popular Network File System (NFS). Additionally, a variant of AFS, the Distributed File System (DFS), was adopted by the Open Software Foundation in 1989 as part of its Distributed Computing Environment.
There are three major implementations: Transarc (IBM), OpenAFS and Arla. The Transarc software is losing support and is deprecated. AFS (version two) is also the predecessor of the Coda file system.
A fourth implementation has existed in the Linux kernel source code since at least version 2.6.10. Committed by Red Hat, it is a fairly simple implementation that was still in its early stages of development, and therefore incomplete, as of January 2013.
The following Access Control List (ACL) permissions can be granted:
- Lookup (l): allows a user to list the contents of the AFS directory, examine the ACL associated with the directory and access subdirectories.
- Insert (i): allows a user to add new files or subdirectories to the directory.
- Delete (d): allows a user to remove files and subdirectories from the directory.
- Administer (a): allows a user to change the ACL for the directory. Users always have this right on their home directory, even if they accidentally remove themselves from the ACL.
Permissions that affect files and subdirectories include:
- Read (r): allows a user to look at the contents of files in a directory and list files in subdirectories. Files that are to be granted read access to any user, including the owner, need to have the standard UNIX "owner read" permission set.
- Write (w): allows a user to modify files in a directory. Files that are to be granted write access to any user, including the owner, need to have the standard UNIX "owner write" permission set.
- Lock (k): allows a user to run programs that need to "flock" files in the directory.
Additionally, AFS defines eight application-specific ACL rights, (A) through (H), which have no effect on access to files.
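The permission letters listed above can be modelled with a short Python sketch. It is a simplified illustration, not the AFS implementation: the function names, the dictionary-based ACL, and the rule that a user's rights are the union of their own entry and their groups' entries are assumptions of the sketch. The interaction with the UNIX "owner read" and "owner write" bits follows the descriptions of the Read and Write rights.

```python
import stat

# Permission letters and their names, as listed in the text.
RIGHTS = {
    "l": "lookup", "i": "insert", "d": "delete", "a": "administer",
    "r": "read", "w": "write", "k": "lock",
}


def parse_rights(letters):
    """Validate a rights string such as 'rl' and return it as a set."""
    unknown = set(letters) - set(RIGHTS)
    if unknown:
        raise ValueError(f"unknown rights: {sorted(unknown)}")
    return frozenset(letters)


def effective_rights(acl, user, groups):
    """Union of the rights granted to the user and to each of their groups."""
    rights = set(parse_rights(acl.get(user, "")))
    for group in groups:
        rights |= parse_rights(acl.get(group, ""))
    return rights


def can_read_file(acl, user, groups, mode):
    # Needs 'r' on the directory ACL and the UNIX owner-read bit on the file.
    return "r" in effective_rights(acl, user, groups) and bool(mode & stat.S_IRUSR)


def can_write_file(acl, user, groups, mode):
    # Needs 'w' on the directory ACL and the UNIX owner-write bit on the file.
    return "w" in effective_rights(acl, user, groups) and bool(mode & stat.S_IWUSR)


# Hypothetical directory ACL: alice has every right, group staff may look up and read.
acl = {"alice": "rlidwka", "staff": "rl"}
print(can_read_file(acl, "bob", ["staff"], 0o644))   # True
print(can_write_file(acl, "bob", ["staff"], 0o644))  # False
print(can_write_file(acl, "alice", [], 0o444))       # False: owner-write bit is clear
```

The last call shows why the mode bits matter even for a user with full directory rights: in this model, clearing the owner-write bit on a file blocks writes for everyone, including the owner.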