Copy-on-write (CoW or COW), sometimes referred to as implicit sharing or shadowing, is a resource-management technique used in computer programming to efficiently implement a "duplicate" or "copy" operation on modifiable resources. If a resource is duplicated but not modified, it is not necessary to create a new resource; the resource can be shared between the copy and the original. Modifications must still create a copy, hence the technique: the copy operation is deferred to the first write. By sharing resources in this way, it is possible to significantly reduce the resource consumption of unmodified copies, while adding a small overhead to resource-modifying operations.
In virtual memory management
Copy-on-write finds its main use in sharing the virtual memory of operating system processes, in the implementation of the fork system call. Typically, the child process does not modify any memory and immediately executes a new program, replacing its address space entirely. It would therefore be wasteful to copy all of the parent's memory during a fork, and the copy-on-write technique is used instead.
Copy-on-write can be implemented efficiently using the page table by marking certain pages of memory as read-only and keeping a count of the number of references to the page. When data is written to these pages, the kernel intercepts the write attempt and allocates a new physical page, initialized with the copy-on-write data, although the allocation can be skipped if there is only one reference. The kernel then updates the page table with the new (writable) page, decrements the number of references, and performs the write. The new allocation ensures that a change in the memory of one process is not visible in another's.
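The effect of this mechanism can be observed from user space with a minimal sketch, assuming a POSIX system whose kernel implements fork with copy-on-write pages. The function name parent_value_after_child_write is hypothetical and chosen only for illustration. The visible behaviour is the same as with an eager copy; copy-on-write changes only when the copying happens.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks, lets the child overwrite `value`,
// and returns what the parent sees afterwards (-1 on error).
int parent_value_after_child_write() {
    volatile int value = 1;   // after fork, parent and child initially share this page
    pid_t pid = fork();
    if (pid < 0) return -1;   // fork failed
    if (pid == 0) {
        value = 2;            // on a CoW kernel the first write to the shared page faults and is resolved privately
        _exit(value == 2 ? 0 : 1);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) return -1;
    return value;             // the parent's page was never modified
}
```

Calling the function from a test program returns 1: the child's write is not visible in the parent.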
The copy-on-write technique can be extended to support efficient memory allocation by keeping a page of physical memory filled with zeros. When memory is allocated, all the pages returned refer to the page of zeros and are marked copy-on-write. This way, physical memory is not allocated for the process until data is written, allowing processes to reserve more virtual memory than there is physical memory and to use memory sparsely, at the risk of running out of physical memory when the reserved pages are eventually written. The combined algorithm is similar to demand paging.
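A minimal sketch of sparse allocation, assuming a Linux-like system where anonymous private mappings are zero-filled on demand. The function name sparse_allocation_demo and the 64 MiB size are illustrative assumptions. Whether the kernel backs untouched pages with a shared zero page is an implementation detail that this code cannot observe directly; it only shows that reads see zeros and that a single write affects one location.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

// Hypothetical helper: reserves a large zero-filled region and touches one byte.
bool sparse_allocation_demo() {
    const std::size_t len = std::size_t{64} << 20;   // 64 MiB of virtual address space
    void* p = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return false;

    unsigned char* bytes = static_cast<unsigned char*>(p);
    // Reads of untouched pages return zeros.
    bool zero_before = bytes[0] == 0 && bytes[len - 1] == 0;

    bytes[len / 2] = 42;   // first write to this page; physical memory is needed only now
    bool ok = zero_before && bytes[len / 2] == 42 && bytes[len / 2 + 1] == 0;

    munmap(p, len);
    return ok;
}
```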
Loading the libraries of an application is another use of the copy-on-write technique. The dynamic linker maps libraries as private, as shown in the following system-call trace. Any write to the mapped library pages triggers a copy-on-write in the virtual memory manager.
openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
mmap(NULL, 3906144, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0)
mmap(0x7f8a3ced4000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1b0000)
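The behaviour of a private file mapping can be checked with a minimal sketch, assuming a POSIX system with a writable /tmp directory. The function name private_mapping_leaves_file_unchanged and the temporary file name are hypothetical. The sketch writes through a MAP_PRIVATE mapping and then reads the file back through the descriptor.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: returns true if a write through a MAP_PRIVATE
// mapping is visible in the mapping but not in the underlying file.
bool private_mapping_leaves_file_unchanged() {
    char path[] = "/tmp/cow_demo_XXXXXX";   // assumed location for the scratch file
    int fd = mkstemp(path);
    if (fd < 0) return false;
    unlink(path);                           // the file disappears once fd is closed

    if (write(fd, "hello", 5) != 5) { close(fd); return false; }

    void* p = mmap(nullptr, 5, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) { close(fd); return false; }

    char* view = static_cast<char*>(p);
    view[0] = 'J';                          // modifies a private copy of the page

    char on_disk[5] = {};
    bool ok = pread(fd, on_disk, 5, 0) == 5 &&
              std::memcmp(on_disk, "hello", 5) == 0 &&   // file content is unchanged
              view[0] == 'J';                            // mapping shows the write

    munmap(p, 5);
    close(fd);
    return ok;
}
```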
In software
In multithreaded systems, COW can be implemented without traditional locking by using atomic operations such as compare-and-swap to increment or decrement the internal reference counter. Since the original resource is never altered, it can safely be copied by multiple threads (after the reference count has been increased) without performance-expensive locking such as mutexes. If the reference counter reaches 0, then by definition only one thread was holding a reference, so the resource can safely be deallocated, again without performance-expensive locking mechanisms. The benefit of not having to copy the resource (and the resulting performance gain over traditional deep copying) therefore holds in both single- and multithreaded systems.
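The reference-counting scheme can be sketched as follows. CowString is a hypothetical illustration type, not taken from any library, and it uses atomic fetch-and-add and fetch-and-subtract rather than an explicit compare-and-swap loop. The sketch assumes that each CowString object is used by one thread at a time while different objects sharing one buffer may live in different threads.

```cpp
#include <atomic>
#include <cassert>
#include <string>
#include <utility>

// Hypothetical copy-on-write string with an atomic reference count.
class CowString {
    struct Rep {
        std::atomic<int> refs{1};
        std::string data;
        explicit Rep(std::string s) : data(std::move(s)) {}
    };
    Rep* rep_;

    void release() {
        // fetch_sub returns the previous value: 1 means this was the last reference.
        if (rep_->refs.fetch_sub(1, std::memory_order_acq_rel) == 1) delete rep_;
    }
    void detach() {
        // More than one owner: take a private copy before writing.
        if (rep_->refs.load(std::memory_order_acquire) > 1) {
            Rep* fresh = new Rep(rep_->data);
            release();
            rep_ = fresh;
        }
    }

public:
    explicit CowString(std::string s) : rep_(new Rep(std::move(s))) {}
    CowString(const CowString& o) : rep_(o.rep_) {
        rep_->refs.fetch_add(1, std::memory_order_relaxed);   // copying only bumps the count
    }
    CowString& operator=(const CowString& o) {
        o.rep_->refs.fetch_add(1, std::memory_order_relaxed); // increment first: safe for self-assignment
        release();
        rep_ = o.rep_;
        return *this;
    }
    ~CowString() { release(); }

    const std::string& read() const { return rep_->data; }
    void append(const std::string& s) { detach(); rep_->data += s; }
    bool shares_with(const CowString& o) const { return rep_ == o.rep_; }
};
```

Copying a CowString shares the buffer; the first call to append on a shared object allocates a separate buffer and leaves the other owners untouched.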
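The following C++ fragment illustrates the behaviour of a copy-on-write std::string implementation, which the language standard permitted before C++11:
std::string x("Hello");
std::string y = x;  // x and y use the same buffer
y += ", World!";    // now y uses a different buffer
                    // x still uses the same old buffer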
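Whether a particular standard library shares string buffers on copy can be probed with a minimal sketch like the one below. The function names are hypothetical. On a library that conforms to C++11 or later, the probe would be expected to report no sharing, because those standards effectively rule out copy-on-write strings; the value semantics hold either way.

```cpp
#include <cassert>
#include <string>

// Hypothetical probe: true if a freshly copied std::string shares its buffer with the source.
bool copy_shares_buffer() {
    const std::string x(64, 'a');   // long enough to avoid the small-string optimization
    const std::string y = x;
    return x.data() == y.data();
}

// Appends to a copy; the caller's string must be unaffected on any implementation.
std::string append_to_copy(const std::string& x) {
    std::string y = x;
    y += ", World!";
    return y;
}
```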
In the PHP programming language, all types except references are implemented as copy-on-write. For example, strings and arrays are passed by reference, but when modified, they are duplicated if more than one reference to them exists. This allows them to act as value types without the performance problems of copying on assignment or making them immutable.
In the Qt framework, many types are copy-on-write ("implicitly shared" in Qt's terms). Qt uses atomic compare-and-swap operations to increment or decrement the internal reference counter. Since the copies are cheap, Qt types can often be safely used by multiple threads without the need of locking mechanisms such as mutexes. The benefits of CoW are thus valid in both single- and multithreaded systems.
In computer storage
CoW may also be used as the underlying mechanism for snapshots, such as those provided by logical volume management, file systems such as Btrfs and ZFS, and database servers such as Microsoft SQL Server. Typically, the snapshots store only the modified data and are stored close to the main array, so they are only a weak form of incremental backup and cannot substitute for a full backup. Some systems also use a CoW technique to avoid fuzzy backups, which otherwise occur when any file in the set being backed up changes during the backup.
When implementing snapshots, there are two techniques:
- The original storage is never modified. When a write request is made, it is redirected away from the original data into a new storage area. This is called "redirect-on-write" (ROW).
- When a write request is made, the original data are copied into a new storage area, and then the original data are modified. This is called "copy-on-write" (CoW).
Despite the names, the term copy-on-write usually refers to the first technique. The second technique performs two data writes per modification where ROW performs one; it is difficult to implement efficiently and is therefore used infrequently.
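The difference in write cost between the two snapshot techniques can be sketched with a toy in-memory model. The types RowVolume and CowSnapshotVolume and their members are hypothetical, a snapshot is assumed to be taken at construction, and the counters only tally data-block writes, ignoring metadata updates that a real system would also perform.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Redirect-on-write: the original blocks are never modified.
struct RowVolume {
    std::vector<std::string> base;                 // snapshot view
    std::map<std::size_t, std::string> redirected; // new data lives here
    int data_writes = 0;
    explicit RowVolume(std::vector<std::string> b) : base(std::move(b)) {}

    void write(std::size_t i, const std::string& v) {
        redirected[i] = v;                         // one write, to the new area
        ++data_writes;
    }
    const std::string& read_live(std::size_t i) const {
        auto it = redirected.find(i);
        return it != redirected.end() ? it->second : base[i];
    }
    const std::string& read_snapshot(std::size_t i) const { return base[i]; }
};

// Copy-on-write in the strict sense: preserve the old block, then overwrite in place.
struct CowSnapshotVolume {
    std::vector<std::string> blocks;               // live view
    std::map<std::size_t, std::string> preserved;  // old data saved for the snapshot
    int data_writes = 0;
    explicit CowSnapshotVolume(std::vector<std::string> b) : blocks(std::move(b)) {}

    void write(std::size_t i, const std::string& v) {
        if (preserved.find(i) == preserved.end()) {
            preserved[i] = blocks[i];              // first write: copy the old data away
            ++data_writes;
        }
        blocks[i] = v;                             // second write: modify the original
        ++data_writes;
    }
    const std::string& read_live(std::size_t i) const { return blocks[i]; }
    const std::string& read_snapshot(std::size_t i) const {
        auto it = preserved.find(i);
        return it != preserved.end() ? it->second : blocks[i];
    }
};
```

In this model, the first modification of a block costs one data write on RowVolume and two on CowSnapshotVolume, while both keep the snapshot view intact.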
The qcow2 (QEMU copy on write) disk image format uses the copy-on-write technique to reduce disk image size.
Some Live CDs (and Live USBs) use copy-on-write techniques to give the impression of being able to add and delete files in any directory, without actually making any changes to the CD (or USB flash drive).
In high-reliability software
Phantom OS uses CoW at all levels, not just a database or file system. At any time, a computer running this system can fail, and then, when it starts again, the software and operating system resume operation. Only small amounts of work can be lost.
The basic approach is that all program data are kept in virtual memory. On some schedule, a summary of all software data is written to virtual memory, forming a log that tracks the current value and location of each value.
When the computer fails, a recent copy of the log and other data remain safe on disk. When operation resumes, operating system software reads the log to restore consistent copies of all the programs and data.
This approach uses copy-on-write at all levels in all software, including application software. This requires support within the application programming language. In practice, Phantom OS permits only languages that generate Java bytecode.
See also
- Demand paging
- Dirty COW – a computer security vulnerability for the Linux operating system kernel
- Flyweight pattern
- Memory management
- Memory mapping
- Persistent data structure
- Snapshot (computer storage)