Snapshot (computer storage)
In computer systems, a snapshot is the state of a system at a particular point in time. The term was coined by analogy with a snapshot in photography. It can refer to an actual copy of the state of a system or to a capability provided by certain systems.
A full backup of a large data set may take a long time to complete. On multi-tasking or multi-user systems, that data may be written to while it is being backed up. This prevents the backup from being atomic and introduces a version skew that may result in data corruption. For example, if a user moves a file from a directory that has not yet been processed into a directory that has already been backed up, that file will be missing from the backup media entirely, because the backup of the destination directory finished before the file was added. Version skew may also corrupt files whose size or contents change while they are being read.
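The moved-file case can be reproduced with a small simulation. This is only an illustrative sketch: the function `backup_with_concurrent_move`, the in-memory directory tree and the file names are all hypothetical, and a real backup tool walks an actual file system rather than a dictionary.

```python
def backup_with_concurrent_move(tree, order, move_after):
    """Copy directories one at a time; apply one file move mid-backup.

    tree: dict mapping directory name -> set of file names (the live data).
    order: directory names in the order the backup visits them.
    move_after: (index, filename, src, dst); the move is applied right
                after the directory at that index has been copied.
    """
    backup = {}
    idx, name, src, dst = move_after
    for i, directory in enumerate(order):
        backup[directory] = set(tree[directory])  # copy this directory now
        if i == idx:
            tree[src].remove(name)                # a user moves the file...
            tree[dst].add(name)                   # ...while the backup runs
    return backup


live = {"a": {"notes.txt"}, "b": {"report.txt"}}
# The backup copies "a" first; then report.txt is moved from "b" into "a".
result = backup_with_concurrent_move(live, ["a", "b"], (0, "report.txt", "b", "a"))
backed_up_files = set().union(*result.values())
print(sorted(backed_up_files))  # report.txt is absent from the backup
```

The file still exists in the live data, but neither directory copy contains it, because each directory was read at a different moment.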
One approach to safely backing up live data is to temporarily disable write access to the data during the backup, either by stopping the accessing applications or by using the locking API provided by the operating system to enforce exclusive read access. This is tolerable for low-availability systems (such as desktop computers and small workgroup servers, on which regular downtime is acceptable). High-availability 24/7 systems, however, cannot bear service stoppages.
To avoid downtime, high-availability systems may instead perform the backup on a snapshot, a read-only copy of the data set frozen at a point in time, and allow applications to continue writing to their data. Most snapshot implementations are efficient and can create snapshots in O(1) time. In other words, the time and I/O needed to create the snapshot do not increase with the size of the data set; by contrast, the time and I/O required for a direct backup are proportional to the size of the data set. In some systems, once the initial snapshot of a data set has been taken, subsequent snapshots copy only the changed data and use a system of pointers to reference the initial snapshot. Such pointer-based snapshots consume less disk capacity than repeatedly cloning the data set would.
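One way to picture constant-time, pointer-based snapshots is as a chain of layers, where each layer holds only the blocks written since the previous snapshot. The sketch below is a toy model under that assumption; the `Layer` and `Volume` classes are hypothetical and do not describe any particular product. Frozen layers are treated as read-only by convention.

```python
class Layer:
    """A set of changed blocks plus a pointer to the layer it is based on."""

    def __init__(self, parent=None):
        self.blocks = {}      # block number -> bytes written in this layer
        self.parent = parent  # older layer, or None for the base

    def read(self, n):
        layer = self
        while layer is not None:      # follow pointers towards older layers
            if n in layer.blocks:
                return layer.blocks[n]
            layer = layer.parent
        return b""                    # assumption: unwritten blocks read as empty


class Volume:
    def __init__(self):
        self.top = Layer()

    def write(self, n, data):
        self.top.blocks[n] = data

    def read(self, n):
        return self.top.read(n)

    def snapshot(self):
        """Freeze the current layer and start a new one; no data is copied."""
        frozen = self.top
        self.top = Layer(parent=frozen)
        return frozen


vol = Volume()
vol.write(0, b"alpha")
vol.write(1, b"beta")
snap1 = vol.snapshot()
vol.write(1, b"BETA")
snap2 = vol.snapshot()
vol.write(0, b"ALPHA")
```

Here `snapshot()` does the same small amount of work regardless of how many blocks the volume holds, and `snap2` stores only the one block that changed after `snap1`. The trade-off in this model is that reads may have to walk several layers.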
Read-write snapshots are sometimes called branching snapshots, because they implicitly create diverging versions of their data. Aside from backups and data recovery, read-write snapshots are frequently used in virtualization, sandboxing and virtual hosting setups because of their usefulness in managing changes to large sets of files.
Some Unix systems have snapshot-capable logical volume managers. These implement copy-on-write on entire block devices by copying changed blocks to other storage just before they are overwritten within "parent" volumes, thus preserving a self-consistent past image of the block device. Filesystems on such snapshot images can later be mounted as if they were on read-only media.
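The copy-before-overwrite behaviour described above can be sketched as follows. This is a simplified in-memory model, not the code of any real volume manager; the class `CowVolume` and its methods are hypothetical names chosen for illustration.

```python
class CowVolume:
    """Toy block device whose snapshots keep old blocks in separate storage."""

    def __init__(self, blocks):
        self.blocks = list(blocks)  # the live "parent" volume
        self.snapshots = []

    def create_snapshot(self):
        snap = {}                   # block number -> preserved old content
        self.snapshots.append(snap)
        return snap

    def write(self, n, data):
        for snap in self.snapshots:
            if n not in snap:
                snap[n] = self.blocks[n]  # save the old block before overwriting
        self.blocks[n] = data

    def read_snapshot(self, snap, n):
        # Preserved copy if the block has changed, otherwise the shared live block.
        return snap[n] if n in snap else self.blocks[n]


vol = CowVolume([b"a", b"b", b"c"])
snap = vol.create_snapshot()
vol.write(1, b"B")
vol.write(1, b"BB")  # second write: the old block is already preserved
```

Only the first write to a block after the snapshot triggers a copy, so the snapshot's storage grows with the amount of changed data rather than with the size of the volume.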
Some volume managers also allow creation of writable snapshots, extending the copy-on-write approach by disassociating any blocks modified within the snapshot from their "parent" blocks in the original volume. Such a scheme could also be described as performing additional copy-on-write operations triggered by writes to snapshots.
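A writable snapshot can be modelled by giving the snapshot its own private block store. The sketch below is an assumption-laden toy, with hypothetical classes `ParentVolume` and `WritableSnapshot`; it only shows how a block stops being shared once either side writes to it.

```python
class WritableSnapshot:
    def __init__(self, parent):
        self.parent = parent
        self.own = {}  # blocks no longer shared with the parent volume

    def read(self, n):
        return self.own[n] if n in self.own else self.parent.blocks[n]

    def write(self, n, data):
        self.own[n] = data  # disassociate this block from its parent block


class ParentVolume:
    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def snapshot(self):
        snap = WritableSnapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, n, data):
        for snap in self.snapshots:
            # Preserve the old content once, unless the snapshot already
            # holds its own version of this block.
            snap.own.setdefault(n, self.blocks[n])
        self.blocks[n] = data


origin = ParentVolume([b"a", b"b"])
branch = origin.snapshot()
branch.write(0, b"X")  # the branch diverges from the origin
origin.write(0, b"A")  # must not disturb the branch's own block 0
origin.write(1, b"B")  # the branch keeps the old block 1
```

After these writes the two versions have diverged: the origin holds the new blocks, while the branch holds its own block 0 and the preserved original block 1.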
Some file systems, such as WAFL,[note 1] fossil for Plan 9 from Bell Labs, and ODS-5, internally track old versions of files and make snapshots available through a special namespace. Others, like UFS2, provide an operating system API for accessing file histories. In NTFS, access to snapshots is provided by the Volume Shadow Copy Service (VSS) in Windows XP and Windows Server 2003 and by Shadow Copy in Windows Vista. Melio FS provides snapshots via the same VSS interface for shared storage. Snapshots have also been available in the NSS (Novell Storage Services) file system on NetWare since version 4.11, and more recently on Linux platforms in the Open Enterprise Server product.
EMC's Isilon OneFS clustered storage platform implements a single scalable file system that supports read-only snapshots at the file or directory level. Any file or directory within the file system can be snapshotted, and the system dynamically chooses between a copy-on-write and a point-in-time snapshot, depending on which method it determines to be optimal.
On Linux, the Btrfs and OCFS2 file systems support creating snapshots (cloning) of individual files. Btrfs also supports the creation of snapshots of subvolumes. On AIX, JFS2 also supports snapshots.
Time Machine, included in Apple's Mac OS X v10.5 operating system, is not a snapshotting scheme but a system-level incremental backup service: it merely watches mounted volumes for changes and periodically copies changed files to a specially designated volume using hard links.[note 2]
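The general idea of an incremental backup built on hard links can be sketched as below. This is not Time Machine's implementation: the function `incremental_backup` is hypothetical, it handles only a flat directory of regular files, and it detects changes by comparing file contents rather than by watching the volume. It assumes a file system that supports hard links.

```python
import filecmp
import os
import shutil
import tempfile


def incremental_backup(source, previous, target):
    """Back up the flat directory `source` into the new directory `target`.

    Files identical to their copy in `previous` are hard-linked, not copied.
    """
    os.makedirs(target)
    for name in sorted(os.listdir(source)):
        src = os.path.join(source, name)
        old = os.path.join(previous, name) if previous else None
        dst = os.path.join(target, name)
        if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
            os.link(old, dst)       # unchanged: share data with the old backup
        else:
            shutil.copy2(src, dst)  # new or changed: store a fresh copy


root = tempfile.mkdtemp()
data = os.path.join(root, "data")
os.makedirs(data)
for name, text in (("a.txt", "one"), ("b.txt", "two")):
    with open(os.path.join(data, name), "w") as f:
        f.write(text)

first = os.path.join(root, "backup.0")
second = os.path.join(root, "backup.1")
incremental_backup(data, None, first)
with open(os.path.join(data, "b.txt"), "w") as f:
    f.write("two, edited")
incremental_backup(data, first, second)
```

Each backup directory looks like a complete copy, yet the unchanged `a.txt` occupies disk space only once. Unlike a snapshot, each run still has to scan the source, and the files are not captured at a single instant.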
- Application checkpointing
- Logical Volume Manager (Linux)
- Persistence (computer science)
- R1Soft Hot copy
- Shadow Copy
- Storage Hypervisor
- System image
- WAFL is not a file system. WAFL is a file layout that provides mechanisms that enable a variety of file systems and technologies that want to access disk blocks.
- Time Machine is not a file system and it does not make use of a snapshotting feature. It is only included here as a reference.
- "LVM HOWTO". 3.8. Snapshots. tldp.org. Retrieved 2013-09-29.
- "Optimized Storage Solution for Enterprise Scale Hyper-V Deployments" (PDF). Microsoft. March 2010. p. 15. Retrieved 25 October 2012.
- Garimella, Neeta (2006-04-26). "Understanding and exploiting snapshot technology for data protection, Part 1: Snapshot technology overview".
- Harwood, Mike (2003-09-24). "Storage Basics: Backup Strategies".
- Project web page of rsnapshot