|Introduced||November 2005 with OpenSolaris|
|Directory contents||Extensible hash table|
|Max. volume size||256 quadrillion zebibytes (2^128 bytes)|
|Max. file size||16 exbibytes (2^64 bytes)|
|Max. number of files||
|Max. filename length||255 ASCII characters (fewer for multibyte character encodings such as Unicode)|
|Forks||Yes (called "extended attributes", but they are full-fledged streams)|
|File system permissions||POSIX, NFSv4 ACLs|
|Supported operating systems||Solaris, OpenSolaris, illumos distributions, OpenIndiana, FreeBSD, Mac OS X Server 10.5 (only read-only support), NetBSD, Linux via third-party kernel module or ZFS-FUSE, OSv|
ZFS is a combined file system and logical volume manager designed by Sun Microsystems. The features of ZFS include protection against data corruption, support for high storage capacities, efficient data compression, integration of the concepts of filesystem and volume management, snapshots and copy-on-write clones, continuous integrity checking and automatic repair, RAID-Z and native NFSv4 ACLs.
The ZFS name is registered as a trademark of Oracle Corporation; although it was briefly given the retrofitted expanded name "Zettabyte File System", it is no longer considered an initialism. Originally, ZFS was proprietary, closed-source software developed internally by Sun as part of Solaris, with a team led by the CTO of Sun's storage business unit and Sun Fellow, Jeff Bonwick. In 2005, the bulk of Solaris, including ZFS, was licensed as open-source software under the Common Development and Distribution License (CDDL), as the OpenSolaris project. ZFS became a standard feature of Solaris 10 in June 2006.
In 2010, Oracle stopped the releasing of source code for new OpenSolaris and ZFS development, effectively forking their closed-source development from the open-source branch. In response, OpenZFS was created as a new open-source development umbrella project, aiming at bringing together individuals and companies that use the ZFS filesystem in an open-source manner.
- 1 Overview and ZFS design goals
- 2 Features
- 2.1 Data integrity
- 2.2 RAID
- 2.3 Capacity
- 2.4 Encryption
- 2.5 Other features
- 2.5.1 Storage pools
- 2.5.2 Caching mechanisms: ARC (L1), L2ARC, Transaction groups, SLOG (ZIL)
- 2.5.3 Copy-on-write transactional model
- 2.5.4 Snapshots and clones
- 2.5.5 Sending and receiving snapshots
- 2.5.6 Dynamic striping
- 2.5.7 Variable block sizes
- 2.5.8 Lightweight filesystem creation
- 2.5.9 Adaptive endianness
- 2.5.10 Deduplication
- 2.5.11 Additional capabilities
- 3 Limitations
- 4 Platforms
- 4.1 Solaris
- 4.2 BSD
- 4.3 Linux
- 4.4 List of operating systems supporting ZFS
- 5 History
- 6 See also
- 7 References
- 8 Bibliography
- 9 External links
Overview and ZFS design goals
ZFS compared to most other file systems
Historically, the management of stored data has involved two aspects — the physical management of block devices such as hard drives and SD cards, and devices such as RAID controllers that present a logical single device based upon multiple physical devices (often undertaken by a volume manager, array manager, or suitable device driver), and the management of files stored as logical units on these logical block devices (a file system).
- Example: A RAID array of 2 hard drives and an SSD caching disk is controlled by Intel's RST system, part of the chipset and firmware built into a desktop computer. The user sees this as a single volume, containing an NTFS-formatted drive of their data, and NTFS is not necessarily aware of the manipulations that may be required (such as rebuilding the RAID array if a disk fails). The management of the individual devices and their presentation as a single device, is distinct from the management of the files held on that apparent device.
ZFS is unusual because, unlike most other storage systems, it unifies both of these roles and acts as both the volume manager and the file system. Therefore, it has complete knowledge of both the physical disks and volumes (including their condition, status, and logical arrangement into volumes) and of all the files stored on them. ZFS is designed to ensure (subject to suitable hardware) that data stored on disks cannot be lost due to physical errors, misprocessing by the hardware or operating system, or bit rot events and data corruption which may happen over time. Its complete control of the storage system is used to ensure that every step, whether related to file management or disk management, is verified, confirmed, corrected if needed, and optimized, in a way that storage controller cards and separate volume and file managers cannot achieve.
ZFS also includes a mechanism for snapshots and replication, including snapshot cloning; the former is described by the FreeBSD documentation as one of its "most powerful features", having features that "even other file systems with snapshot functionality lack". Very large numbers of snapshots can be taken, without degrading performance, allowing snapshots to be used prior to risky system operations and software changes, or an entire production ("live") file system to be fully snapshotted several times an hour, in order to mitigate data loss due to user error or malicious activity. Snapshots can be rolled back "live" or the file system at previous points in time viewed, even on very large file systems, leading to "tremendous" savings in comparison to formal backup and restore processes, or cloned "on the spot" to form new independent file systems.
Summary of key differentiating features
Examples of features specific to ZFS which facilitate its objective include:
- Designed for long term storage of data, and indefinitely scaled datastore sizes with zero data loss, and high configurability.
- Hierarchical checksumming of all data and metadata, ensuring that the entire storage system can be verified on use, and confirmed to be correctly stored, or remedied if corrupt. Checksums are stored with a block's parent block, rather than with the block itself. This contrasts with many file systems where checksums (if held) are stored with the data so that if the data is lost or corrupt, the checksum is also likely to be lost or incorrect.
- Can store a user-specified number of copies of data or metadata, or selected types of data, to improve the ability to recover from data corruption of important files and structures.
- Automatic rollback of recent changes to the file system and data, in some circumstances, in the event of an error or inconsistency.
- Automated and (usually) silent self-healing of data inconsistencies and write failure when detected, for all errors where the data is capable of reconstruction. Data can be reconstructed using all of the following: error detection and correction checksums stored in each block's parent block; multiple copies of data (including checksums) held on the disk; write intentions logged on the SLOG (ZIL) for writes that should have occurred but did not occur (after a power failure); parity data from RAID/RAIDZ disks and volumes; copies of data from mirrored disks and volumes.
- Native handling of standard RAID levels and additional ZFS RAID layouts ("RAIDZ"). The RAIDZ levels stripe data across only the disks required, for efficiency (many RAID systems stripe indiscriminately across all devices), and checksumming allows rebuilding of inconsistent or corrupted data to be minimised to those blocks with defects;
- Native handling of tiered storage and caching devices, which is usually a volume related task. Because it also understands the file system, it can use file-related knowledge to inform, integrate and optimize its tiered storage handling which a separate device cannot;
- Native handling of snapshots and backup/replication which can be made efficient by integrating the volume and file handling. ZFS can routinely take snapshots several times an hour of the data system, efficiently and quickly. (Relevant tools are provided at a low level and require external scripts and software for utilization).
- Native data compression and deduplication, although the latter is largely handled in RAM and is memory hungry.
- Efficient rebuilding of RAID arrays — a RAID controller often has to rebuild an entire disk, but ZFS can combine disk and file knowledge to limit any rebuilding to data which is actually missing or corrupt, greatly speeding up rebuilding;
- Ability to identify data that would have been found in a cache but has been discarded recently instead; this allows ZFS to reassess its caching decisions in light of later use and facilitates very high cache hit levels;
- Alternative caching strategies can be used for data that would otherwise cause delays in data handling. For example, synchronous writes which are capable of slowing down the storage system can be converted to asynchronous writes by being written to a fast separate caching device, known as the SLOG (sometimes called the ZIL - ZFS Intent Log).
- Highly tunable - many internal parameters can be configured for optimal functionality.
- Can be used for high availability clusters and computing, although not fully designed for this use.
Inappropriately specified systems
Unlike many file systems, ZFS is intended to work in a specific way and towards specific ends. It expects or is designed with the assumption of a specific kind of hardware environment. If the system is not suitable for ZFS, then ZFS may underperform significantly. Common system design failures:
- Inadequate RAM: ZFS may use a large amount of memory in many scenarios;
- Inadequate disk free space: ZFS uses copy on write for data storage; its performance may suffer if the disk pool gets too close to full;
- No efficient dedicated SLOG device when synchronous writing is prominent: this is notably the case for NFS and ESXi; even SSD-based systems may need a separate SLOG device for expected performance. The SLOG device is only used for writing, apart from when recovering from a system error. It can often be small: in FreeNAS, for example, the SLOG device only needs to store the largest amount of data likely to be written in about 10 seconds (or the size of two 'transaction groups'), although it can be made larger to allow a longer lifetime of the device. SLOG is therefore unusual in that its main criteria are pure write functionality, low latency, and loss protection; usually little else matters.
- Lack of suitable caches, or misdesigned caches: for example, ZFS can cache read data in RAM ("ARC") or a separate device ("L2ARC"); in some cases adding extra ARC is needed, in other cases adding extra L2ARC is needed, and in some situations adding extra L2ARC can even degrade performance, by forcing RAM to be used for lookup data for the slower L2ARC, at the cost of less room for data in the ARC.
- Use of hardware RAID cards, perhaps in the mistaken belief that these will 'help' ZFS. While routine for other file systems, ZFS handles RAID natively, and is designed to work with a raw and unmodified low-level view of storage devices, so it can fully use its functionality. A separate RAID card may leave ZFS less efficient and reliable. For example, ZFS checksums all data, but most RAID cards will not do this as effectively, or for cached data. Separate cards can also mislead ZFS about the state of data, for example after a crash, or by mis-signalling exactly when data has safely been written, and in some cases this can lead to issues and data loss. Separate cards can also slow down the system, sometimes greatly, by adding latency to every data read/write operation, or by undertaking full rebuilds of damaged arrays where ZFS would have only needed to do minor repairs of a few seconds.
One major feature that distinguishes ZFS from other file systems is that it is designed with a focus on data integrity by protecting the user's data on disk against silent data corruption caused by data degradation, current spikes, bugs in disk firmware, phantom writes (the previous write did not make it to disk), misdirected reads/writes (the disk accesses the wrong block), DMA parity errors between the array and server memory or from the driver (since the checksum validates data inside the array), driver errors (data winds up in the wrong buffer inside the kernel), accidental overwrites (such as swapping to a live file system), etc.
Research published in 2012 showed that none of the then-major and widespread filesystems (such as UFS, Ext, XFS, JFS, or NTFS), nor hardware RAID (which has some issues with data integrity), provided sufficient protection against data corruption problems. Initial research indicates that ZFS protects data better than earlier efforts. It is also faster than UFS and can be seen as its replacement.
ZFS data integrity
For ZFS, data integrity is achieved by using a Fletcher-based checksum or a SHA-256 hash throughout the file system tree. Each block of data is checksummed and the checksum value is then saved in the pointer to that block—rather than at the actual block itself. Next, the block pointer is checksummed, with the value being saved at its pointer. This checksumming continues all the way up the file system's data hierarchy to the root node, which is also checksummed, thus creating a Merkle tree. In-flight data corruption or phantom reads/writes (the data written/read checksums correctly but is actually wrong) are undetectable by most filesystems as they store the checksum with the data. ZFS stores the checksum of each block in its parent block pointer so the entire pool self-validates.
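The parent-stored checksum scheme described above can be illustrated with a minimal Python sketch. It is not ZFS's on-disk format: the `BlockPointer` class, the dictionary standing in for a disk, and the two-level tree are assumptions made purely for illustration, and SHA-256 is used because it is one of the checksums named in the text.

```python
import hashlib

def checksum(data: bytes) -> bytes:
    """256-bit checksum of a block (SHA-256, one of the options named above)."""
    return hashlib.sha256(data).digest()

class BlockPointer:
    """Hypothetical pointer: child address plus the checksum of the child's contents."""
    def __init__(self, address: int, child_checksum: bytes):
        self.address = address
        self.child_checksum = child_checksum

    def serialize(self) -> bytes:
        # 8-byte address followed by the 32-byte checksum (40 bytes in total).
        return self.address.to_bytes(8, "big") + self.child_checksum

def write_tree(disk: dict, leaves: list) -> BlockPointer:
    """Write leaf blocks plus one indirect block; return the root pointer."""
    pointers = []
    for address, data in enumerate(leaves):
        disk[address] = data
        pointers.append(BlockPointer(address, checksum(data)))
    indirect = b"".join(p.serialize() for p in pointers)
    indirect_address = len(leaves)
    disk[indirect_address] = indirect
    # The checksum of the indirect block lives in *its* parent, the root pointer.
    return BlockPointer(indirect_address, checksum(indirect))

def verify_tree(disk: dict, root: BlockPointer) -> bool:
    """Walk down from the root, checking every block against its parent's checksum."""
    indirect = disk[root.address]
    if checksum(indirect) != root.child_checksum:
        return False
    for offset in range(0, len(indirect), 40):
        address = int.from_bytes(indirect[offset:offset + 8], "big")
        expected = indirect[offset + 8:offset + 40]
        if checksum(disk[address]) != expected:
            return False
    return True

disk = {}
root = write_tree(disk, [b"alpha", b"beta", b"gamma"])
intact = verify_tree(disk, root)
disk[1] = b"bet4"  # simulate silent corruption of one leaf block
corruption_detected = not verify_tree(disk, root)
```

Because the checksum of each block is held one level up, a block that is silently replaced or damaged cannot also carry a matching checksum with it.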
When a block is accessed, regardless of whether it is data or meta-data, its checksum is calculated and compared with the stored checksum value of what it "should" be. If the checksums match, the data are passed up the programming stack to the process that asked for it; if the values do not match, then ZFS can heal the data if the storage pool provides data redundancy (such as with internal mirroring), assuming that the copy of data is undamaged and with matching checksums. If the storage pool consists of a single disk, it is possible to provide such redundancy by specifying copies=2 (or copies=3), which means that data will be stored twice (or three times) on the disk, effectively halving (or, for copies=3, reducing to one third) the storage capacity of the disk. If redundancy exists, ZFS will fetch a copy of the data (or recreate it via a RAID recovery mechanism), and recalculate the checksum—ideally resulting in the reproduction of the originally expected value. If the data passes this integrity check, the system can then update the faulty copy with known-good data so that redundancy can be restored.
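The read-verify-repair cycle described above can be sketched as follows. This is an illustrative model only: the function name `self_healing_read` and the use of dictionaries as mirrored devices are assumptions, not part of any ZFS interface.

```python
import hashlib

def checksum(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def self_healing_read(mirrors: list, address: int, expected: bytes) -> bytes:
    """Return the first copy whose checksum matches, then repair the bad copies."""
    good = None
    for device in mirrors:
        candidate = device.get(address)
        if candidate is not None and checksum(candidate) == expected:
            good = candidate
            break
    if good is None:
        # Corruption is detected, but no redundant copy is available to heal it.
        raise IOError("no copy matches the stored checksum")
    for device in mirrors:
        if device.get(address) != good:
            device[address] = good  # overwrite the faulty copy with known-good data
    return good

payload = b"important data"
stored_checksum = checksum(payload)   # held in the parent block pointer
disk_a = {0: b"importent data"}       # silently corrupted copy
disk_b = {0: payload}                 # intact mirror copy
result = self_healing_read([disk_a, disk_b], 0, stored_checksum)
```

After the call, both simulated devices hold the intact data again, which corresponds to redundancy being restored.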
ZFS and hardware RAID
If the disks are connected to a RAID controller, it is most efficient to configure it as an HBA in JBOD mode (i.e. turn off the RAID function). If a hardware RAID card is used, ZFS always detects all data corruption but cannot always repair data corruption because the hardware RAID card will interfere. Therefore, the recommendation is to not use a hardware RAID card, or to flash a hardware RAID card into JBOD/IT mode. For ZFS to be able to guarantee data integrity, it needs to either have access to a RAID set (so all data is copied to at least two disks), or if one single disk is used, ZFS needs to enable redundancy (copies) which duplicates the data on the same logical drive. ZFS copies are a useful feature on notebooks and desktop computers, since the disks are large and the feature at least provides some limited redundancy with just a single drive.
There are several reasons as to why it is better to rely solely on ZFS by using several independent disks and RAID-Z or mirroring.
When using hardware RAID, the controller usually adds controller-dependent data to the drives which prevents software RAID from accessing the user data. While it is possible to read the data with a compatible hardware RAID controller, this inconveniences consumers as a compatible controller usually isn't readily available. Using the JBOD/RAID-Z combination, any disk controller can be used to resume operation after a controller failure.
Note that hardware RAID configured as JBOD may still detach drives that do not respond in time (as has been seen with many energy-efficient consumer-grade hard drives), and as such, may require TLER/CCTL/ERC-enabled drives to prevent drive dropouts.
Software RAID using ZFS
ZFS offers software RAID through its RAID-Z and mirroring organization schemes.
RAID-Z is a data/parity distribution scheme like RAID-5, but uses dynamic stripe width: every block is its own RAID stripe, regardless of blocksize, resulting in every RAID-Z write being a full-stripe write. This, when combined with the copy-on-write transactional semantics of ZFS, eliminates the write hole error. RAID-Z is also faster than traditional RAID 5 because it does not need to perform the usual read-modify-write sequence.
As all stripes are of different sizes, RAID-Z reconstruction has to traverse the filesystem metadata to determine the actual RAID-Z geometry. This would be impossible if the filesystem and the RAID array were separate products, whereas it becomes feasible when there is an integrated view of the logical and physical structure of the data. Going through the metadata means that ZFS can validate every block against its 256-bit checksum as it goes, whereas traditional RAID products usually cannot do this.
In addition to handling whole-disk failures, RAID-Z can also detect and correct silent data corruption, offering "self-healing data": when reading a RAID-Z block, ZFS compares it against its checksum, and if the data disks did not return the right answer, ZFS reads the parity and then figures out which disk returned bad data. Then, it repairs the damaged data and returns good data to the requestor.
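The idea of using parity together with the block checksum to work out which disk returned bad data can be sketched in Python. This is a simplified single-parity model using XOR, not the actual RAID-Z implementation; the function names and the fixed-size columns are assumptions for illustration.

```python
import hashlib
from functools import reduce

def checksum(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def xor_blocks(blocks: list) -> bytes:
    """Byte-wise XOR of equally sized blocks (single-parity calculation)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def read_stripe(columns: list, parity: bytes, expected: bytes):
    """Return (data, index of repaired column or None)."""
    data = b"".join(columns)
    if checksum(data) == expected:
        return data, None
    # Checksum mismatch: assume each column in turn is bad, rebuild it from
    # the remaining columns plus parity, and test against the checksum.
    for bad in range(len(columns)):
        others = [c for i, c in enumerate(columns) if i != bad]
        rebuilt = xor_blocks(others + [parity])
        candidate = columns[:bad] + [rebuilt] + columns[bad + 1:]
        data = b"".join(candidate)
        if checksum(data) == expected:
            return data, bad
    raise IOError("stripe is unrecoverable with single parity")

columns = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(columns)
expected = checksum(b"".join(columns))

damaged = [b"AAAA", b"BxBB", b"CCCC"]  # one disk silently returns bad data
data, bad_index = read_stripe(damaged, parity, expected)
```

A conventional RAID 5 layer has parity but no independent checksum of the data, so it has no comparable way to decide which of the columns is the wrong one.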
RAID-Z does not require any special hardware: it does not need NVRAM for reliability, and it does not need write buffering for good performance. With RAID-Z, ZFS provides fast, reliable storage using cheap, commodity disks.
There are three different RAID-Z modes: RAID-Z1 (similar to RAID 5, allows one disk to fail), RAID-Z2 (similar to RAID 6, allows two disks to fail), and RAID-Z3 (allows three disks to fail). The need for RAID-Z3 arose recently because RAID configurations with future disks (say, 6–10 TB) may take a long time to repair, the worst case being weeks. During those weeks, the rest of the disks in the RAID are stressed more because of the additional intensive repair process and might subsequently fail, too. By using RAID-Z3, the risk involved with disk replacement is reduced.
Mirroring, the other ZFS RAID option, is essentially the same as RAID 1, allowing any number of disks to be mirrored.
Resilvering and scrub
ZFS has no tool equivalent to fsck (the standard Unix and Linux data checking and repair tool for file systems). Instead, ZFS has a built-in scrub function which regularly examines all data and repairs silent corruption and other problems. Some differences are:
- fsck must be run on an offline filesystem, which means the filesystem must be unmounted and is not usable while being repaired, while scrub is designed to be used on a mounted, live filesystem, and does not need the ZFS filesystem to be taken offline.
- fsck usually only checks metadata (such as the journal log) but never checks the data itself. This means, after a fsck, the data might still be corrupt.
- fsck cannot always validate and repair data when checksums are stored with data (often the case in many file systems), because the checksums may also be corrupted or unreadable. ZFS always stores checksums separately from the data they verify, improving reliability and the ability of scrub to repair the volume. ZFS also stores multiple copies of data - metadata in particular may have upwards of 4 or 6 copies (multiple copies per disk and multiple disk mirrors per volume), greatly improving the ability of scrub to detect and repair extensive damage to the volume, compared to fsck.
- scrub checks everything, including metadata and the data. The effect can be observed by comparing fsck to scrub times – sometimes a fsck on a large RAID completes in a few minutes, which means only the metadata was checked. Traversing all metadata and data on a large RAID takes many hours, which is exactly what scrub does.
ZFS is a 128-bit file system, so it can address 1.84 × 10^19 times more data than 64-bit systems such as Btrfs. The maximum limits of ZFS are designed to be so large that they should never be encountered in practice. For instance, fully populating a single zpool with 2^128 bits of data would require 10^24 3 TB hard disk drives.
Some theoretical limits in ZFS are:
- 2^48: number of entries in any individual directory
- 16 exbibytes (2^64 bytes): maximum size of a single file
- 16 exbibytes: maximum size of any attribute
- 256 quadrillion zebibytes (2^128 bytes): maximum size of any zpool
- 2^56: number of attributes of a file (actually constrained to 2^48 for the number of files in a directory)
- 2^64: number of devices in any zpool
- 2^64: number of zpools in a system
- 2^64: number of file systems in a zpool
With Oracle Solaris, the encryption capability in ZFS is embedded into the I/O pipeline. During writes, a block may be compressed, encrypted, checksummed and then deduplicated, in that order. The policy for encryption is set at the dataset level when datasets (file systems or ZVOLs) are created. The wrapping keys provided by the user/administrator can be changed at any time without taking the file system offline. The default behaviour is for the wrapping key to be inherited by any child data sets. The data encryption keys are randomly generated at dataset creation time. Only descendant datasets (snapshots and clones) share data encryption keys. A command to switch to a new data encryption key for the clone or at any time is provided—this does not re-encrypt already existing data, instead utilising an encrypted master-key mechanism.
Unlike traditional filesystems which reside on single devices and thus require a volume manager to use more than one device, ZFS filesystems are built on top of virtual storage pools called zpools. A zpool is constructed of virtual devices (vdevs), themselves constructed of block devices: files, hard drive partitions, or entire drives, with the latter being the recommended usage. Block devices within a vdev may be configured in different ways, depending on needs and space available: non-redundantly (similar to RAID 0), as a mirror (RAID 1) of two or more devices, as a RAID-Z group of three or more devices, or as a RAID-Z2 (similar to RAID-6) group of four or more devices. In July 2009, triple-parity RAID-Z3 was added to OpenSolaris. RAID-Z is a data-protection technology featured by ZFS in order to reduce the block overhead in mirroring.
Thus, a zpool (ZFS storage pool) is vaguely similar to a computer's RAM. The total RAM pool capacity depends on the number of RAM memory sticks and the size of each stick. Likewise, a zpool consists of one or more vdevs. Each vdev can be viewed as a group of hard disks (or partitions, or files, etc.). Each vdev should have redundancy, otherwise if a vdev is lost, then the whole zpool is lost. Thus, each vdev should be configured as RAID-Z1, RAID-Z2, mirror, etc. It is not possible to change the number of drives in an existing vdev (Block Pointer Rewrite will allow this, and also allow defragmentation), but it is always possible to increase storage capacity by adding a new vdev to a zpool. It is possible to swap a drive to a larger drive and resilver (repair) the zpool. If this procedure is repeated for every disk in a vdev, then the zpool will grow in capacity when the last drive is resilvered. A vdev will have the same base capacity as the smallest drive in the group. For instance, a vdev consisting of three 500 GB and one 700 GB drive, will have a capacity of 4×500 GB.
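The capacity rule in the example above (every drive in a vdev counts as if it were the size of the smallest one) can be written as a short Python sketch. The function names are hypothetical, and the figures are base capacity before any space consumed by mirroring or parity.

```python
def vdev_base_capacity(drive_sizes_gb: list) -> int:
    """Base capacity of a vdev: each drive contributes the size of the smallest drive."""
    return min(drive_sizes_gb) * len(drive_sizes_gb)

def pool_base_capacity(vdevs: list) -> int:
    """A pool grows by adding vdevs, so its base capacity is the sum over its vdevs."""
    return sum(vdev_base_capacity(v) for v in vdevs)

# The example from the text: three 500 GB drives and one 700 GB drive.
mixed = vdev_base_capacity([500, 500, 500, 700])

# Replacing drives one at a time only pays off once the last small drive is gone.
partly_upgraded = vdev_base_capacity([700, 700, 500, 700])
fully_upgraded = vdev_base_capacity([700, 700, 700, 700])

# Adding a second vdev to the pool increases capacity immediately.
pool = pool_base_capacity([[500, 500, 500, 700], [1000, 1000]])
```

The partly upgraded vdev still reports 4×500 GB, which is why the pool only grows when the last drive has been resilvered.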
In addition, pools can have hot spares to compensate for failing disks. When mirroring, block devices can be grouped according to physical chassis, so that the filesystem can continue in the case of the failure of an entire chassis.
Storage pool composition is not limited to similar devices, but can consist of ad-hoc, heterogeneous collections of devices, which ZFS seamlessly pools together, subsequently doling out space as needed. Arbitrary storage device types can be added to existing pools to expand their size.
The storage capacity of all vdevs is available to all of the file system instances in the zpool. A quota can be set to limit the amount of space a file system instance can occupy, and a reservation can be set to guarantee that space will be available to a file system instance.
Caching mechanisms: ARC (L1), L2ARC, Transaction groups, SLOG (ZIL)
ZFS uses different layers of disk cache to speed up read and write operations. Ideally, all data should be stored in RAM, but that is usually too expensive. Therefore, data is automatically cached in a hierarchy to optimize performance vs cost. Frequently accessed data will be stored in RAM, and less frequently accessed data can be stored on slower media, such as solid state drives (SSDs). Data that is not often accessed is not cached and left on the slow hard drives. If old data is suddenly read a lot, ZFS will automatically move it to SSDs or to RAM.
ZFS caching mechanisms include one each for reads and writes, and in each case, two levels of caching can exist, one in computer memory (RAM) and one on fast storage (usually solid state drives (SSDs)), for a total of four caches.
|Cache level||Where stored||Read cache||Write cache|
|First level cache||In RAM||Known as ARC, due to its use of a variant of the adaptive replacement cache (ARC) algorithm. RAM will always be used for caching, thus this level is always present. The efficiency of the ARC algorithm means that disks will often not need to be accessed, provided the ARC size is sufficiently large. If RAM is too small there will hardly be any ARC at all; in this case, ZFS always needs to access the underlying disks which impacts performance considerably.||Handled by means of "transaction groups" - writes are collated over a short period (typically 5 – 30 seconds) up to a given limit, with each group being written to disk ideally while the next group is being collated. This allows writes to be organized more efficiently for the underlying disks at the risk of minor data loss of the most recent transactions upon power interruption or hardware fault. In practice the power loss risk is avoided by ZFS write journaling and by the SLOG/ZIL second tier write cache pool (see below), so writes will only be lost if a write failure happens at the same time as a total loss of the second tier SLOG pool, and then only when settings related to synchronous writing and SLOG use are set in a way that would allow such a situation to arise. If data is received faster than it can be written, data receipt is paused until the disks can catch up.|
|Second level cache||On fast storage devices (which can be added or removed from a "live" system without disruption in current versions of ZFS, although not always in older versions)||Known as L2ARC ("Level 2 ARC"), optional. ZFS will cache as much data in L2ARC as it can, which can be tens or hundreds of gigabytes in many cases. L2ARC will also considerably speed up deduplication if the entire deduplication table can be cached in L2ARC. It can take several hours to fully populate the L2ARC from empty (before ZFS has decided which data are "hot" and should be cached). If the L2ARC device is lost, all reads will go out to the disks which slows down performance, but nothing else will happen (no data will be lost).||Known as SLOG or ZIL ("ZFS Intent Log"), optional, but an SLOG will be created on the main storage devices if no cache device is provided. This is the second tier write cache, and is often misunderstood. Strictly speaking, ZFS does not use the SLOG device to cache its disk writes. Rather, it uses SLOG to ensure writes are captured to a permanent storage medium as quickly as possible, so that in the event of power loss or write failure, no data which was acknowledged as written is lost. The SLOG device allows ZFS to speedily store writes and report them as written to the main storage pool devices. In the normal course of activity, the SLOG will never be referred to or read, and therefore does not act as a cache; its purpose is to safeguard data in flight during the time taken for collation and writing out, in case the eventual write were to fail. If all goes well, then the storage pool will be updated at some point within the next 5 to 60 seconds, when the current transaction group is collated and written out to disk (see above), at which point the saved writes on the SLOG are simply ignored and overwritten. 
If the write eventually fails, or the system suffers a crash or fault preventing its writing, then ZFS can recreate all the writes that should have taken place from the SLOG (the only time it is read from), and can repair the data loss. This becomes crucial if a large number of synchronous writes take place (such as with ESXi, NFS and some databases), where the client requires confirmation of successful writing before continuing its activity; the SLOG allows ZFS to confirm writing is successful much more quickly than if it had to write to the main store every time, without the risk involved in misleading the client as to the state of data storage. If there is no SLOG device then part of the main data pool will be used for the same purpose, although this is slower. If the log device is lost, it is possible to lose the latest writes, therefore the log device should be mirrored. In earlier versions of ZFS, loss of the log device could result in loss of the entire zpool, although this is no longer the case. Therefore, one should upgrade ZFS if planning to use a separate log device.|
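The interplay between transaction groups and the intent log described in the table can be modelled with a small Python sketch. It is a toy model under stated assumptions: the class `PoolSketch` and its methods are hypothetical, the "disk" and "log" are in-memory containers, and only the ordering of events is meant to resemble the description.

```python
class PoolSketch:
    """Toy model: writes collect in RAM, sync writes are also logged durably."""

    def __init__(self):
        self.disk = {}         # main storage pool (survives a crash)
        self.open_txg = {}     # transaction group being collated in RAM
        self.intent_log = []   # SLOG/ZIL records on stable storage (survives a crash)

    def write(self, key, value, sync=False):
        if sync:
            # A synchronous write is logged before it is acknowledged.
            self.intent_log.append((key, value))
        self.open_txg[key] = value
        return "ack"

    def commit_txg(self):
        """Write the collated group to the pool; logged records are no longer needed."""
        self.disk.update(self.open_txg)
        self.open_txg.clear()
        self.intent_log.clear()

    def crash(self):
        """Power loss: everything held only in RAM disappears."""
        self.open_txg.clear()

    def recover(self):
        """The only time the log is read: replay acknowledged sync writes."""
        for key, value in self.intent_log:
            self.disk[key] = value
        self.intent_log.clear()

pool = PoolSketch()
pool.write("a", 1, sync=True)   # acknowledged only after being logged
pool.write("b", 2)              # asynchronous: held in RAM until the next commit
pool.crash()
pool.recover()
after_crash = dict(pool.disk)

pool.write("c", 3)
pool.commit_txg()
after_commit = dict(pool.disk)
```

In this model the synchronous write survives the crash through the log, while the asynchronous write that had not yet been committed is lost, which mirrors the trade-off described in the table.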
Copy-on-write transactional model
ZFS uses a copy-on-write transactional object model. All block pointers within the filesystem contain a 256-bit checksum or 256-bit hash (currently a choice between Fletcher-2, Fletcher-4, or SHA-256) of the target block, which is verified when the block is read. Blocks containing active data are never overwritten in place; instead, a new block is allocated, modified data is written to it, then any metadata blocks referencing it are similarly read, reallocated, and written. To reduce the overhead of this process, multiple updates are grouped into transaction groups, and ZIL (intent log) write cache is used when synchronous write semantics are required. The blocks are arranged in a tree, as are their checksums (see Merkle signature scheme).
Snapshots and clones
An advantage of copy-on-write is that, when ZFS writes new data, the blocks containing the old data can be retained, allowing a snapshot version of the file system to be maintained. ZFS snapshots are created very quickly, since all the data composing the snapshot is already stored. They are also space efficient, since any unchanged data is shared among the file system and its snapshots.
Writeable snapshots ("clones") can also be created, resulting in two independent file systems that share a set of blocks. As changes are made to any of the clone file systems, new data blocks are created to reflect those changes, but any unchanged blocks continue to be shared, no matter how many clones exist. This is an implementation of the Copy-on-write principle.
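How snapshots and clones share unchanged blocks can be illustrated with a minimal Python sketch. The `Dataset` class and `clone` function are hypothetical stand-ins; a dictionary of block names plays the role of the block pointer tree, and Python object identity stands in for "the same block on disk".

```python
from types import MappingProxyType

class Dataset:
    """Hypothetical file system: a map from block name to block contents."""

    def __init__(self, blocks=None):
        self.blocks = dict(blocks or {})

    def write(self, name: str, data: bytes) -> None:
        # Copy-on-write: never modify a block in place, point at a new one.
        self.blocks[name] = bytes(data)

    def snapshot(self):
        # Copies only the pointers (the map), not the block contents,
        # and returns a read-only view.
        return MappingProxyType(dict(self.blocks))

def clone(snapshot) -> Dataset:
    """A writable file system that starts out sharing every block of the snapshot."""
    return Dataset(snapshot)

fs = Dataset()
fs.write("a", b"old contents")
fs.write("b", b"unchanged")

snap = fs.snapshot()
shared_before = snap["a"] is fs.blocks["a"]   # same block, no data copied

fs.write("a", b"new contents")                # the live file system diverges
snapshot_kept_old = snap["a"] == b"old contents"
still_shared = snap["b"] is fs.blocks["b"]    # the untouched block is still shared

writable = clone(snap)
writable.write("b", b"changed in clone")
origin_untouched = fs.blocks["b"] == b"unchanged"
```

Taking the snapshot copies no block contents, which is why creation is fast and why space is only consumed as the file system or a clone diverges.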
Sending and receiving snapshots
ZFS file systems can be moved to other pools, also on remote hosts over the network, as the send command creates a stream representation of the file system's state. This stream can either describe complete contents of the file system at a given snapshot, or it can be a delta between snapshots. Computing the delta stream is very efficient, and its size depends on the number of blocks changed between the snapshots. This provides an efficient strategy, e.g., for synchronizing offsite backups or high availability mirrors of a pool.
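The difference between a full stream and a delta stream can be sketched in Python. This is a conceptual model only: the `send` and `receive` functions below are hypothetical and compare block maps directly, whereas the text does not describe how ZFS itself identifies changed blocks.

```python
def send(new_snapshot: dict, old_snapshot: dict = None) -> dict:
    """Build a stream: full contents, or only the changes since old_snapshot."""
    if old_snapshot is None:
        return {"changed": dict(new_snapshot), "deleted": []}
    changed = {
        name: data
        for name, data in new_snapshot.items()
        if name not in old_snapshot or old_snapshot[name] != data
    }
    deleted = [name for name in old_snapshot if name not in new_snapshot]
    return {"changed": changed, "deleted": deleted}

def receive(target: dict, stream: dict) -> None:
    """Apply a stream to the receiving side's copy of the file system."""
    for name in stream["deleted"]:
        target.pop(name, None)
    target.update(stream["changed"])

monday = {"a": b"1", "b": b"2", "c": b"3"}
tuesday = {"a": b"1", "b": b"2-modified", "d": b"4"}

backup = {}
receive(backup, send(monday))                 # initial full replication
delta = send(tuesday, monday)                 # incremental stream
receive(backup, delta)
blocks_in_delta = len(delta["changed"]) + len(delta["deleted"])
```

The delta carries only the modified, added and removed blocks, so its size tracks the amount of change between the two snapshots and not the size of the file system.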
Dynamic striping across all devices to maximize throughput means that as additional devices are added to the zpool, the stripe width automatically expands to include them; thus, all disks in a pool are used, which balances the write load across them.
Variable block sizes
ZFS uses variable-sized blocks, with 128 KB as the default size. Available features allow the administrator to tune the maximum block size which is used, as certain workloads do not perform well with large blocks. If data compression is enabled, variable block sizes are used. If a block can be compressed to fit into a smaller block size, the smaller size is used on the disk to use less storage and improve IO throughput (though at the cost of increased CPU use for the compression and decompression operations).
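The space saving from compression combined with variable block sizes can be sketched as below. The 512-byte sector size, the use of `zlib` as a stand-in compressor, and the function name `on_disk_size` are assumptions made for illustration; the exact allocation rules in ZFS differ.

```python
import zlib

SECTOR = 512  # assumed allocation unit, in bytes


def on_disk_size(block, sector=SECTOR):
    """Return the space allocated for a block, storing it compressed
    only when the compressed form occupies fewer sectors."""

    def round_up(n):
        return -(-n // sector) * sector  # ceiling to a whole sector

    raw = round_up(len(block))
    compressed = round_up(len(zlib.compress(block)))
    return min(raw, compressed)
```

A 128 KB block of zeros shrinks to a single sector in this model, while a block of incompressible data is stored at its original size, so the CPU cost of compression is only repaid for compressible data.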
Lightweight filesystem creation
In ZFS, filesystem manipulation within a storage pool is easier than volume manipulation within a traditional filesystem; the time and effort required to create or expand a ZFS filesystem is closer to that of making a new directory than it is to volume manipulation in some other systems.
Pools and their associated ZFS file systems can be moved between different platform architectures, including systems implementing different byte orders. The ZFS block pointer format stores filesystem metadata in an endian-adaptive way; individual metadata blocks are written with the native byte order of the system writing the block. When reading, if the stored endianness does not match the endianness of the system, the metadata is byte-swapped in memory.
This does not affect the stored data; as is usual in POSIX systems, files appear to applications as simple arrays of bytes, so applications creating and reading data remain responsible for doing so in a way independent of the underlying system's endianness.
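The endian-adaptive scheme described above can be sketched with a toy metadata block. The layout used here (a one-byte byte-order flag followed by a single 64-bit field) and the function names are hypothetical; the real block pointer format is different. The point is only that the writer uses its native order and the reader swaps in memory when the orders differ.

```python
import struct
import sys


def write_metadata(value, byteorder=sys.byteorder):
    """Store a 64-bit value in the writer's native order, preceded by a flag."""
    flag = b"\x01" if byteorder == "little" else b"\x00"
    fmt = "<Q" if byteorder == "little" else ">Q"
    return flag + struct.pack(fmt, value)


def read_metadata(block, host_order=sys.byteorder):
    """Load the value as if it were host-native, then swap if it was not."""
    stored_order = "little" if block[0] == 1 else "big"
    native_fmt = "<Q" if host_order == "little" else ">Q"
    (value,) = struct.unpack(native_fmt, block[1:9])
    if stored_order != host_order:
        # Byte-swap in memory: reverse the eight bytes of the loaded value.
        value = int.from_bytes(value.to_bytes(8, "little"), "big")
    return value
```

A block written with `byteorder="big"` and read with `host_order="little"` yields the original value, and no swap is performed when writer and reader agree.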
Data deduplication capabilities were added to the ZFS source repository at the end of October 2009, and relevant OpenSolaris ZFS development packages have been available since December 3, 2009 (build 128).
Effective use of deduplication may require large RAM capacity; recommendations range between 1 and 5 GB of RAM for every TB of storage. Insufficient physical memory or lack of ZFS cache can result in virtual memory thrashing when using deduplication, which can either lower performance or result in complete memory starvation. Solid-state drives (SSDs) can be used to cache deduplication tables, thereby speeding up deduplication performance.
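As a worked example of the rule of thumb above, the following sketch turns the quoted range of 1 to 5 GB of RAM per TB of storage into an estimate for a given pool. The function name is hypothetical, and the result is only as good as the rule of thumb itself; actual memory use depends on the data being deduplicated.

```python
def dedup_ram_range_gb(pool_tb, low_gb_per_tb=1.0, high_gb_per_tb=5.0):
    """Return the (low, high) RAM estimate in GB for a pool of pool_tb TB."""
    return pool_tb * low_gb_per_tb, pool_tb * high_gb_per_tb
```

Under this rule, a 10 TB pool would call for roughly 10 to 50 GB of RAM when deduplication is enabled.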
Other storage vendors use modified versions of ZFS to achieve very high data compression ratios. Two examples in 2012 were GreenBytes and Tegile. In May 2014, Oracle bought GreenBytes for its ZFS deduplication and replication technology.
- Explicit I/O priority with deadline scheduling.
- Claimed globally optimal I/O sorting and aggregation.
- Multiple independent prefetch streams with automatic length and stride detection.
- Parallel, constant-time directory operations.
- End-to-end checksumming, using a kind of "Data Integrity Field", allowing data corruption detection (and recovery if you have redundancy in the pool).
- Transparent filesystem compression. Supports LZJB, gzip and LZ4.
- Intelligent scrubbing and resilvering (resyncing).
- Load and space usage sharing among disks in the pool.
- Ditto blocks: Configurable data replication per filesystem, with zero, one or two extra copies requested per write for user data, and with that same base number of copies plus one or two for metadata (according to metadata importance). If the pool has several devices, ZFS tries to replicate over different devices. Ditto blocks are primarily an additional protection against corrupted sectors, not against total disk failure.
- ZFS design (copy-on-write + superblocks) is safe when using disks with write cache enabled, if they honor the write barriers. This feature provides safety and a performance boost compared with some other filesystems.
- On Solaris, when entire disks are added to a ZFS pool, ZFS automatically enables their write cache. This is not done when ZFS only manages discrete slices of the disk, since it does not know if other slices are managed by non-write-cache safe filesystems, like UFS. The FreeBSD implementation can handle disk flushes for partitions thanks to its GEOM framework, and therefore does not suffer from this limitation.
- Per-user and per-group quotas support.
- Filesystem encryption since Solaris 11 Express.
- Pools can be imported in read-only mode.
- It is possible to recover data by rolling back entire transactions at the time of importing the zpool.
- ZFS is not a clustered filesystem; however, clustered ZFS is available from third parties.
- Snapshots can be taken manually or automatically. The older versions of stored data that they contain can be exposed to client systems via CIFS (also known as SMB/Samba and used by Microsoft Windows file shares) as "Previous versions", "VSS shadow copies", or "File history", and AFP (for Apple devices) as "Apple Time Machine".
Limitations in preventing data corruption
A 2010 paper examining the ability of file systems to detect and prevent data corruption observed that ZFS itself is effective in detecting and correcting data errors on storage devices, but that it assumes data in RAM are "safe", and not prone to error. Thus when ZFS caches pages, or stores copies of metadata, in RAM, or holds data in its "dirty" cache for writing to disk, no test is made whether the checksums still match the data at the point of use. Much of this risk can be mitigated by use of ECC RAM but the authors considered that error detection related to the page cache and heap would allow ZFS to handle certain classes of error more robustly.
The authors of a study entirely dedicated to ZFS data integrity point out that "[...] a single bit flip in memory causes a small but non-negligible percentage of runs to experience a failure", with the probability of committing bad data to disk varying from 0% to 3.6% (according to the workload).
However, one of the main architects of ZFS, Matt Ahrens, explains that checksumming of data in memory can be turned on with the ZFS_DEBUG_MODIFY flag (zfs_flags=0x10), which addresses these concerns: https://arstechnica.com/civis/viewtopic.php?f=2&t=1235679&p=26303271#p26303271
Limitations specific to ZFS
- Capacity expansion is normally achieved by adding groups of disks as a top-level vdev: simple device, RAID Z, RAID Z2, RAID Z3, or mirrored. Newly written data will dynamically start to use all available vdevs. It is also possible to expand the array by iteratively swapping each drive in the array with a bigger drive and waiting for ZFS to self-heal; the heal time will depend on the amount of stored information, not the disk size.
- As of Solaris 10 Update 11 and Solaris 11.2, it is neither possible to reduce the number of top-level vdevs in a pool, nor to otherwise reduce pool capacity. This functionality was said to be in development already in 2007.
- It is not possible to add a disk as a column to a RAID Z, RAID Z2 or RAID Z3 vdev. However, a new RAID Z vdev can be created instead and added to the zpool.
- Some traditional nested RAID configurations, such as RAID 51 (a mirror of RAID 5 groups), are not configurable in ZFS. Vdevs can only be composed of raw disks or files, not other vdevs. However, a ZFS pool effectively creates a stripe (RAID 0) across its vdevs, so the equivalent of a RAID 50 or RAID 60 is common.
- Reconfiguring the number of devices in a top-level vdev requires copying data offline, destroying the pool, and recreating the pool with the new top-level vdev configuration. There are two exceptions: extra redundancy can be added to an existing mirror at any time, and, if all top-level vdevs are mirrors with sufficient redundancy, the zpool split command can be used to remove a vdev from each top-level vdev in the pool, creating a second pool with identical data.
- Resilver (repair) of a crashed disk in a ZFS RAID can take a long time (this is not unique to ZFS; it applies to all types of RAID, in one way or another). This means that very large volumes can take several days to repair or to get back to full redundancy after severe data corruption or failure, and during this time a second disk failure may occur, especially as the repair puts additional stress on the system as a whole. In turn, this means that configurations that only allow for recovery from a single disk failure, such as RAID Z1 (similar to RAID 5), should be avoided. With large disks, one should therefore use RAID Z2 (which tolerates two disk failures) or RAID Z3 (which tolerates three). ZFS RAID differs from conventional RAID, however, in that it reconstructs only live data and metadata when replacing a disk, not the entirety of the disk including blank and garbage blocks, so replacing a member disk in a ZFS pool that is only partially full takes considerably less time than with conventional RAID.
- IOPS performance of a ZFS storage pool can suffer if the ZFS RAID is not appropriately configured. This applies to all types of RAID, in one way or another. If the zpool consists of only one group of disks configured as, say, eight disks in RAID Z2, then the IOPS performance will be that of a single disk (read speed will be equivalent to 8 disks, but write speed will be similar to a single disk). There are ways to mitigate this problem, for instance adding SSDs as L2ARC cache, which can boost IOPS into the 100,000s. In short, a zpool should consist of several vdevs, each consisting of 8–12 disks, if using RAID Z. It is not recommended to create a zpool with a single large vdev, say 20 disks, because IOPS performance will be that of a single disk, which also means that resilver time will be very long (possibly weeks with future large drives).
- Online shrink is not supported.
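The IOPS rule of thumb in the list above can be expressed as simple arithmetic. This is a rough sketch under stated assumptions: each RAID Z vdev is taken to deliver about the random IOPS of one member disk, the pool is taken to spread load evenly across its vdevs, and caching (ARC, L2ARC) is ignored. The function name and the per-disk figure used in the example are hypothetical.

```python
def estimate_pool_iops(vdev_disk_counts, disk_iops):
    """Estimate random IOPS for a pool of RAID Z vdevs.

    vdev_disk_counts lists the number of disks in each vdev; under the
    rule of thumb only the number of vdevs matters, not their width.
    """
    return len(vdev_disk_counts) * disk_iops
```

With an assumed 100 IOPS per disk, twenty disks arranged as a single vdev give an estimate of 100 IOPS, whereas the same disks arranged as two vdevs of ten give 200, which is why several smaller vdevs are preferred over one wide vdev.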
Solaris 10 update 2 and later
After Oracle's Solaris 11 Express release, the OS/Net consolidation (the main OS code) was made proprietary and closed-source, and further ZFS upgrades and implementations inside Solaris (such as encryption) are not compatible with other non-proprietary implementations which use previous versions of ZFS.
When creating a new ZFS pool, to retain the ability to access the pool from other non-proprietary Solaris-based distributions, it is recommended to upgrade to Solaris 11 Express from OpenSolaris (snv_134b), and thereby stay at ZFS version 28.
OpenSolaris 2008.05, 2008.11 and 2009.06 use ZFS as their default filesystem. There are over a dozen 3rd-party distributions, of which nearly a dozen are mentioned here. (OpenIndiana and illumos are two new distributions not included on the OpenSolaris distribution reference page.)
By upgrading from OpenSolaris snv_134 to both OpenIndiana and Solaris 11 Express, one also has the ability to upgrade and separately boot Solaris 11 Express on the same ZFS pool, but one should not install Solaris 11 Express first because of ZFS incompatibilities introduced by Oracle past ZFS version 28.
OpenZFS on OSX (abbreviated to O3X) is an implementation of ZFS for macOS. O3X is under active development, with close relation to ZFS on Linux and illumos' ZFS implementation, while maintaining feature flag compatibility with ZFS on Linux. O3X implements zpool version 5000, and includes the Solaris Porting Layer (SPL) originally authored for MacZFS, which has been further enhanced to include a memory management layer based on the illumos kmem and vmem allocators. O3X is fully featured, supporting LZ4 compression, deduplication, ARC, L2ARC, and SLOG.
MacZFS is free software providing support for ZFS on macOS. The stable legacy branch provides up to ZFS pool version 8 and ZFS filesystem version 2. The development branch, based on ZFS on Linux and OpenZFS, provides updated ZFS functionality, such as up to ZFS zpool version 5000 and feature flags.
A proprietary implementation of ZFS (Zevo) was available at no cost from GreenBytes, Inc., implementing up to ZFS file system version 5 and ZFS pool version 28. Zevo offered a limited ZFS feature set, pending further commercial development; it was sold to Oracle in 2014, with unknown future plans.
FreeBSD's ZFS implementation is fully functional; the only missing features are kernel CIFS server and iSCSI, but the latter can be added using externally available packages. Samba can be used to provide a userspace CIFS server.
FreeBSD 7-STABLE (where updates to the series of versions 7.x are committed to) uses zpool version 6.
FreeBSD 8 includes a much-updated implementation of ZFS, and zpool version 13 is supported. zpool version 14 support was added to the 8-STABLE branch on January 11, 2010, and is included in FreeBSD release 8.1. zpool version 15 is supported in release 8.2. The 8-STABLE branch gained support for zpool version v28 and zfs version 5 in early June 2011. These changes were released mid-April 2012 with FreeBSD 8.3.
FreeBSD 9.2-RELEASE is the first FreeBSD version to use the new "feature flags" based implementation, and thus pool version 5000.
MidnightBSD, a desktop operating system derived from FreeBSD, supports ZFS storage pool version 6 as of 0.3-RELEASE. This was derived from code included in FreeBSD 7.0-RELEASE. An update to storage pool 28 is in progress in 0.4-CURRENT and based on 9-STABLE sources around FreeBSD 9.1-RELEASE code.
pfSense and PCBSD
NAS4Free, an embedded open source network-attached storage (NAS) distribution based on FreeBSD, has the same ZFS support as FreeBSD: ZFS storage pool version 5000. This project is a continuation of the FreeNAS 7 series project.
Being based on the FreeBSD kernel, Debian GNU/kFreeBSD has ZFS support from the kernel, although additional userland tools are required. ZFS can be used as the root or /boot file system, in which case the required GRUB configuration is performed by the Debian installer (since the Wheezy release).
As of 31 January 2013, the ZPool version available is 14 for the Squeeze release, and 28 for the Wheezy-9 release.
Although the ZFS filesystem supports Linux-based operating systems, difficulties arise for Linux distribution maintainers wishing to provide native support for ZFS in their products due to potential legal incompatibilities between the CDDL license used by the ZFS code, and the GPL license used by the Linux kernel. To enable ZFS support within Linux, a loadable kernel module containing the CDDL-licensed ZFS code must be compiled and loaded into the kernel. According to the Free Software Foundation, the wording of the GPL license legally prohibits redistribution of the resulting product as a derivative work, though this viewpoint has caused some controversy.
ZFS on FUSE
One potential workaround to licensing incompatibility was trialed in 2006, with an experimental port of the ZFS code to Linux's FUSE system. The filesystem ran entirely in userspace instead of being integrated into the Linux kernel, and was therefore not considered a derivative work of the kernel. This approach was functional, but suffered from significant performance penalties when compared with integrating the filesystem as a native kernel module running in kernel space. As of 2016, the ZFS on FUSE project appears to be defunct.
Native ZFS on Linux
- 2008: prototype to determine viability
- 2009: initial ZVOL and Lustre support
- 2010: development moved to GitHub
- 2011: POSIX layer added
- 2011: community of early adopters
- 2012: production usage of ZFS
- 2013: stable GA release
As of August 2014, ZFS on Linux uses the OpenZFS pool version number 5000, which indicates that the features it supports are defined via feature flags. This pool version is an unchanging number that is expected to never conflict with version numbers given by Oracle.
Another native port for Linux was developed by KQ InfoTech in 2010. This port used the zvol implementation from the Lawrence Livermore National Laboratory as a starting point. A release supporting zpool v28 was announced in January 2011. In April 2011, KQ Infotech was acquired by sTec, Inc., and their work on ZFS ceased. Source code of this port can be found on GitHub.
The work of KQ InfoTech was ultimately integrated into the LLNL's native port of ZFS for Linux.
Source code distribution
While the license incompatibility may arise with the distribution of compiled binaries containing ZFS code, it is generally agreed that distribution of the source code itself is not affected by this. In Gentoo, configuring a ZFS root filesystem is well documented and the required packages can be installed from its package repository. Slackware also provides documentation on supporting ZFS, both as a kernel module and when built into the kernel.
The question of the CDDL license's compatibility with the GPL license resurfaced in 2015, when the Linux distribution Ubuntu announced that it intended to make precompiled OpenZFS binary kernel modules available to end-users directly from the distribution's official package repositories. In 2016, Ubuntu announced that a legal review resulted in the conclusion that providing support for ZFS via a binary kernel module was not in violation of the provisions of the GPL license. Others followed Ubuntu's conclusion, while the FSF and SFC reiterated their opposing view.
Ubuntu 16.04 LTS ("Xenial Xerus"), released on April 21, 2016, allows the user to install the OpenZFS binary packages directly from the Ubuntu software repositories. As of April 2017, no legal challenge has been brought against Canonical regarding the distribution of these packages.
List of operating systems supporting ZFS
List of Operating Systems, distributions and add-ons that support ZFS, the zpool version it supports, and the Solaris build they are based on (if any):
|OS||Zpool version||Sun/Oracle Build #||Comments|
|Oracle Solaris 11.3||37||0.5.11-0.175.3.1.0.5.0|
|Oracle Solaris 10 1/13 (U11)||32|
|Oracle Solaris 11.2||35||0.5.11-0.175.2.0.0.42.0|
|Oracle Solaris 11 2011.11||34||b175|
|Oracle Solaris Express 11 2010.11||31||b151a||licensed for testing only|
|OpenSolaris (last dev)||22||b134|
|OpenIndiana||5000||b147||OpenIndiana creates a name clash with naming their code b151a|
|Nexenta Core 3.0.1||26||b134+||GNU userland|
|NexentaStor Community 3.0.1||26||b134+||up to 18 TB, web admin|
|NexentaStor Community 3.1.0||28||b134+||GNU userland|
|NexentaStor Community 4.0||5000||b134+||up to 18 TB, web admin|
|NexentaStor Enterprise||28||b134+||not free, web admin|
|GNU/kFreeBSD "Squeeze" (as of 1/31/2013)||14||||Requires package "zfsutils"|
|GNU/kFreeBSD "Wheezy-9" (as of 2/21/2013)||28||||Requires package "zfsutils"|
|zfs-fuse 0.7.2||23||||suffered from performance issues; now defunct|
|ZFS on Linux 0.6.5.8||5000||||0.6.0 release candidate has POSIX layer|
|KQ Infotech's ZFS on Linux||28||||now defunct; code integrated into LLNL-supported ZFS on Linux|
|StormOS "hail"||||||based on Nexenta|
|MilaX 0.5||20||b128a||small size|
|FreeNAS 8.0.2 / 8.2||15|
|FreeNAS 8.3.0||28||||based on FreeBSD 8.3|
|FreeNAS 9.1.0||5000||||based on FreeBSD 9.1|
|NAS4Free 10.2.0.2/10.3.0.3||5000||||based on FreeBSD 10.2/10.3|
|EON NAS (v0.6)||22||b130||embedded NAS|
|EON NAS (v1.0beta)||28||b151a||embedded NAS|
|napp-it||28/5000||Illumos/ Solaris||Storage appliance with Web-UI for OpenIndiana (Hipster), OmniOS, Solaris 11 or Linux (ZFS management)|
|OmniOS||28/5000||illumos-omnios branch||minimal storage server distribution based on Illumos|
|SmartOS||28/5000||Illumos b151+||minimal live distribution based on Illumos (boots from USB/CD) suited for cloud and hypervisor use (KVM)|
|macOS 10.5, 10.6, 10.7, 10.8, and 10.9||5000||||MacZFS|
|macOS 10.6, 10.7 and 10.8||28||||ZEVO|
|Ubuntu 16.04 LTS||5000||||offers native support via installable binary module|
ZFS was designed and implemented by a team at Sun led by Jeff Bonwick, Bill Moore and Matthew Ahrens. It was announced on September 14, 2004, but development started in 2001. Source code for ZFS was integrated into the main trunk of Solaris development on October 31, 2005 and released as part of build 27 of OpenSolaris on November 16, 2005. Sun announced that ZFS was included in the 6/06 update to Solaris 10 in June 2006, one year after the opening of the OpenSolaris community.
The name at one point was said to stand for "Zettabyte File System", but by 2006 was no longer considered to be an abbreviation. A ZFS file system can store up to 256 quadrillion zebibytes (2^128 bytes).
In September 2007, NetApp sued Sun claiming that ZFS infringed some of NetApp's patents on Write Anywhere File Layout. Sun counter-sued in October the same year claiming the opposite. The lawsuits were ended in 2010 with an undisclosed settlement.
Open source implementations
- 2005: Source code was released as part of OpenSolaris.
- 2006: Development of a FUSE port for Linux started.
- 2007: Apple started porting ZFS to Mac OS X.
- 2008: A port to FreeBSD was released as part of FreeBSD 7.0.
- 2008: Development of a native Linux port started.
- 2009: Apple's ZFS project closed. The MacZFS project continued to develop the code.
- 2010: OpenSolaris was discontinued. Further development of ZFS on Solaris was no longer open source.
- 2010: illumos was founded as an open source successor, and continued to develop ZFS in the open. Ports of ZFS to other platforms continued porting upstream changes from illumos.
- 2013: The OpenZFS project begins, aiming at coordinated open-source development of ZFS.
Use in commercial products
- 2008: Sun shipped a line of ZFS-based 7000-series storage appliances.
- 2013: Oracle shipped ZS3 series of ZFS-based filers and seized first place in the SPC-2 benchmark with one of them.
- 2013: iXsystems ships ZFS-based NAS devices called FreeNAS for SOHO and TrueNAS for the enterprise.
- 2014: Netgear ships a line of ZFS-based NAS devices called ReadyDATA, designed to be used in the enterprise.
Detailed release history
With ZFS in Oracle Solaris: as new features are introduced, the version numbers of the pool and file system are incremented to designate the format and features available. Features that are available in specific file system versions require a specific pool version.
Distributed development of OpenZFS involves feature flags and pool version 5000, an unchanging number that is expected to never conflict with version numbers given by Oracle. Legacy version numbers still exist for pool versions 1–28, implied by the version 5000. Illumos uses pool version 5000 for this purpose. Future on-disk format changes are enabled / disabled independently via feature flags.
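The difference between numbered pool versions and feature flags can be sketched as a compatibility check. This is a simplified model, not the logic of any real implementation: the function `can_import` is hypothetical, the feature names in the example are used only as illustrative strings, and distinctions such as features that are enabled but not in use are ignored.

```python
LAST_SHARED_VERSION = 28      # last pool version common to both lineages
FEATURE_FLAGS_VERSION = 5000  # fixed version used with feature flags


def can_import(pool_version, pool_features, impl_version, impl_features):
    """Decide whether an implementation can open a pool in this toy model."""
    if pool_version == FEATURE_FLAGS_VERSION:
        # Feature-flag pool: every feature on the pool must be understood.
        return (
            impl_version == FEATURE_FLAGS_VERSION
            and set(pool_features) <= set(impl_features)
        )
    # Numbered pool: version 5000 implies support for versions 1-28 only.
    if impl_version == FEATURE_FLAGS_VERSION:
        return pool_version <= LAST_SHARED_VERSION
    return pool_version <= impl_version
```

In this model a version 28 pool can be opened by a feature-flags implementation, a version 35 pool cannot, and a version 5000 pool is usable only by an implementation that knows every feature recorded on it, for example `can_import(5000, {"lz4_compress"}, 5000, {"lz4_compress", "async_destroy"})`.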
|ZFS Filesystem Version Number||Release date||Significant changes|
|1||OpenSolaris Nevada build 36||First release|
|2||OpenSolaris Nevada b69||Enhanced directory entries. In particular, directory entries now store the object type. For example, file, directory, named pipe, and so on, in addition to the object number.|
|3||OpenSolaris Nevada b77||Support for sharing ZFS file systems over SMB. Case insensitivity support. System attribute support. Integrated anti-virus support.|
|4||OpenSolaris Nevada b114||Properties: userquota, groupquota, userused and groupused|
|5||OpenSolaris Nevada b137||System attributes; symlinks now their own object type|
|6||Solaris 11.1||Multilevel file system support|
|ZFS Pool Version Number||Release date||Significant changes|
|1||OpenSolaris Nevada b36||First release|
|2||OpenSolaris Nevada b38||Ditto Blocks|
|3||OpenSolaris Nevada b42||Hot spares, double-parity RAID-Z (raidz2), improved RAID-Z accounting|
|4||OpenSolaris Nevada b62||zpool history|
|5||OpenSolaris Nevada b62||gzip compression for ZFS datasets|
|6||OpenSolaris Nevada b62||"bootfs" pool property|
|7||OpenSolaris Nevada b68||ZIL: adds the capability to specify a separate Intent Log device or devices|
|8||OpenSolaris Nevada b69||ability to delegate zfs(1M) administrative tasks to ordinary users|
|9||OpenSolaris Nevada b77||CIFS server support, dataset quotas|
|10||OpenSolaris Nevada b77||Devices can be added to a storage pool as "cache devices"|
|11||OpenSolaris Nevada b94||Improved zpool scrub / resilver performance|
|12||OpenSolaris Nevada b96||Snapshot properties|
|13||OpenSolaris Nevada b98||Properties: usedbysnapshots, usedbychildren, usedbyrefreservation, and usedbydataset|
|14||OpenSolaris Nevada b103||passthrough-x aclinherit property support|
|15||OpenSolaris Nevada b114||Properties: userquota, groupquota, userused and groupused; also required FS v4|
|16||OpenSolaris Nevada b116||STMF property support|
|17||OpenSolaris Nevada b120||triple-parity RAID-Z|
|18||OpenSolaris Nevada b121||ZFS snapshot holds|
|19||OpenSolaris Nevada b125||ZFS log device removal|
|20||OpenSolaris Nevada b128||zle compression algorithm that is needed to support the ZFS deduplication properties in ZFS pool version 21, which were released concurrently|
|21||OpenSolaris Nevada b128||Deduplication|
|22||OpenSolaris Nevada b128||zfs receive properties|
|23||OpenSolaris Nevada b135||slim ZIL|
|24||OpenSolaris Nevada b137||System attributes. Symlinks now their own object type. Also requires FS v5.|
|25||OpenSolaris Nevada b140||Improved pool scrubbing and resilvering statistics|
|26||OpenSolaris Nevada b141||Improved snapshot deletion performance|
|27||OpenSolaris Nevada b145||Improved snapshot creation performance (particularly recursive snapshots)|
|28||OpenSolaris Nevada b147||Multiple virtual device replacements|
|29||Solaris Nevada b148||RAID-Z/mirror hybrid allocator|
|30||Solaris Nevada b149||ZFS encryption|
|31||Solaris Nevada b150||Improved 'zfs list' performance|
|32||Solaris Nevada b151||One MB block support|
|33||Solaris Nevada b163||Improved share support|
|34||Solaris 11.1 (0.5.11-0.175.1.0.0.24.2)||Sharing with inheritance|
|35||Solaris 11.2 (0.5.11-0.175.2.0.0.42.0)||Sequential resilver|
|36||Solaris 11.3||Efficient log block allocation|
|37||Solaris 11.3||LZ4 compression|
|5000||OpenZFS||Unchanging pool version to signify that the pool indicates new features after pool version 28 using ZFS feature flags rather than by incrementing the pool version|
Note: The Solaris version under development by Sun since the release of Solaris 10 in 2005 was codenamed 'Nevada', and was derived from what was the OpenSolaris codebase. 'Solaris Nevada' is the codename for the next-generation Solaris OS to eventually succeed Solaris 10 and this new code was then pulled successively into new OpenSolaris 'Nevada' snapshot builds. OpenSolaris is now discontinued and OpenIndiana forked from it. A final build (b134) of OpenSolaris was published by Oracle (2010-Nov-12) as an upgrade path to Solaris 11 Express.
The first indication of Apple Inc.'s interest in ZFS was an April 2006 post on the opensolaris.org zfs-discuss mailing list where an Apple employee mentioned being interested in porting ZFS to their Mac OS X operating system. In the release version of Mac OS X 10.5, ZFS was available in read-only mode from the command line, without the ability to create zpools or write to them. Before the 10.5 release, Apple released the "ZFS Beta Seed v1.1", which allowed read-write access and the creation of zpools; however, the installer for the "ZFS Beta Seed v1.1" has been reported to only work on version 10.5.0, and has not been updated for version 10.5.1 and above. In August 2007, Apple opened a ZFS project on their Mac OS Forge web site. On that site, Apple provided the source code and binaries of their port of ZFS, which includes read-write access, but there was no installer available until a third-party developer created one. In October 2009, Apple announced a shutdown of the ZFS project on Mac OS Forge, ending their own hosting of and involvement in ZFS. No explanation was given, just the following statement: "The ZFS project has been discontinued. The mailing list and repository will also be removed shortly." Apple would eventually release the legally required, CDDL-derived portion of the source code of their final public beta of ZFS, code named "10a286". Complete ZFS support was once advertised as a feature of Snow Leopard Server (Mac OS X Server 10.6). However, by the time the operating system was released, all references to this feature had been silently removed from its features page. Apple has not commented regarding the omission.
Apple's "10a286" source code release, and versions of the previously released source and binaries, have been preserved and new development has been adopted by a group of enthusiasts. The MacZFS project acted quickly to mirror the public archives of Apple's project before the materials would have disappeared from the internet, and then to resume its development elsewhere. The MacZFS community has curated and matured the project, supporting ZFS for all Mac OS releases since 10.5. The project has an active mailing list. As of July 2012, MacZFS implements zpool version 8 and ZFS version 2, from the October 2008 release of Solaris. Additional historical information and commentary can be found on the MacZFS web site and FAQ.
The 17th September 2013 launch of OpenZFS included ZFS-OSX, which will become a new version of MacZFS, as the distribution for Darwin.
- APFS – for Apple operating systems
- Btrfs – for Linux
- Comparison of file systems
- HAMMER – a file system with a similar feature set for DragonFly BSD
- LFS – BSD Log Structured Filesystem
- List of file systems
- LVM – Logical Volume Manager (Linux), supports snapshots
- LZJB – data compression algorithm used in ZFS
- NILFS – a Linux file system with checksumming (but not scrubbing), also supporting snapshots
- ReFS – a Microsoft file system with built-in resiliency features
- Sun Open Storage
- Veritas File System and Veritas Volume Manager – similar to ZFS
- Versioning file systems – List of versioning file systems
- Write Anywhere File Layout – a similar file system by NetApp
- The OpenZFS Project
- Comparison of SVM mirroring and ZFS mirroring
- EON ZFS Storage (NAS) distribution
- ZFS on Linux Homepage
- End-to-end Data Integrity for File Systems: A ZFS Case Study
- ZFS – The Zettabyte File System, archived from the original on February 28, 2013
- ZFS and RAID-Z: The Über-FS?
- ZFS: The Last Word In File Systems, by Jeff Bonwick and Bill Moore
- Visualizing the ZFS intent log (ZIL), April 2013, by Aaron Toponce