Nested RAID levels

From Wikipedia, the free encyclopedia
Levels of nested RAID,[1] also known as hybrid RAID,[2] combine two or more of the standard levels of RAID (redundant array of independent disks) to gain performance, additional redundancy, or both.

RAID 0+1

Typical RAID 0+1 setup.

A RAID 0+1 (also called RAID 01) is a RAID level used for both replicating and sharing data among disks.[3] RAID 0+1 is a mirror of stripes. The usable capacity of a RAID 0+1 array is the same as that of a RAID 1 array, where half of the total capacity is used to mirror the other half: $(N/2) \cdot S_{\mathrm{min}}$, where $N$ is the total number of drives and $S_{\mathrm{min}}$ is the capacity of the smallest drive in the array. The minimum number of disks required to implement RAID 0+1 is three, where the data is striped across two disks in RAID 0 and then all data is mirrored on a third disk, but it is more common to use a minimum of four disks.
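The capacity formula above can be checked with a few lines of Python. This is only an illustrative sketch: the function name and the example drive sizes are assumptions, not part of any RAID tool.

```python
def raid01_usable_capacity(drive_sizes_gb):
    """Usable capacity of a RAID 0+1 array, computed as (N / 2) * S_min.

    drive_sizes_gb: capacities of all drives in the array (hypothetical input).
    """
    n = len(drive_sizes_gb)        # N: total number of drives
    s_min = min(drive_sizes_gb)    # S_min: capacity of the smallest drive
    return (n / 2) * s_min


# Four 120 GB drives: half the raw capacity is used for the mirror.
print(raid01_usable_capacity([120, 120, 120, 120]))  # 240.0

# A smaller drive limits every drive to its size.
print(raid01_usable_capacity([100, 120, 120, 120]))  # 200.0
```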

RAID 1+0

Typical RAID 1+0 setup.

A RAID 1+0, sometimes called RAID 1&0 or RAID 10, is similar to a RAID 0+1 except that the order of the RAID levels is reversed: RAID 10 is a stripe of mirrors.[3]

Near versus far, advantages for bootable RAID

A nonstandard definition of "RAID 10" was created for the Linux MD driver.[4] RAID 10 as recognized by the Storage Networking Industry Association (SNIA), and as generally implemented by RAID controllers, is a RAID 0 array of mirrors (which may be two-way or three-way mirrors)[5] and requires a minimum of four drives. Linux "RAID 10", by contrast, can be implemented with as few as two disks.

Implementations that support two disks, such as Linux RAID 10,[4] offer a choice of layouts. In one layout, the copies of a block of data are "near" each other: at the same address on different devices, or at a predictable offset. Each disk access is split into full-speed disk accesses to different drives, yielding read and write performance like that of RAID 0, but without necessarily guaranteeing that every stripe is on both drives.

Another layout uses "a more RAID 0 like arrangement over the first half of all drives, and then a second copy in a similar layout over the second half of all drives - making sure that all copies of a block are on different drives." This layout has high read performance because only one of the two copies must be located on each access, but writing requires more head seeking because both copies must be located and written. Highly predictable offsets minimize the seeking in either configuration.

"Far" configurations may be especially useful for hybrid SSDs with large caches of 4 GB (compared with the 64 MB more typical of spinning-platter drives in 2010) and, by 2011, 64 GB (a capacity by then available on a single chip). They may also be useful for small, pure-SSD bootable RAIDs that are not reliably attached to network backup, and so must retain data for hours or days, but that are quite sensitive to the cost, power and complexity of more than two disks. Write access on SSDs is extremely fast, so the multiple accesses become less of a speed problem: at PCIe x4 SSD speeds, the theoretical maximum of 730 MB/s is already more than double the theoretical maximum of SATA-II at 300 MB/s.

Another use for these configurations is to continue using slower disk interfaces in NAS or low-end RAID/SAS systems (notably SATA-II at 300 MB/s or 3 Gbit/s) rather than replacing them with faster ones (USB 3.0 at 5 Gbit/s, SATA-III at 600 MB/s or 6 Gbit/s, PCIe x4 at 730 MB/s, PCIe x8 at 1460 MB/s, etc.). A pair of identical SATA-II disks combined with any of a hybrid SSD, OS caching to an SSD, or a large software write cache could be expected to achieve performance identical to SATA-III. Three or four such disks could achieve at least read performance similar to PCIe x8 or striped SATA-III if properly configured to minimize seek time (predictable offsets, redundant copies of the most-accessed data).
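The difference between the "near" and "far" layouts described in this section can be made concrete with a toy model. The sketch below is a simplified illustration with two copies of each chunk. It is not the Linux MD driver's code, and the function names and the use of chunk numbers as labels are assumptions made for the example.

```python
def near_layout(num_disks, copies, num_chunks):
    """Toy 'near' layout: copies of a chunk sit next to each other in a stripe.

    Returns a list of rows; each row has one entry per disk.
    """
    # Repeat every chunk `copies` times, then deal the slots out row by row.
    slots = [chunk for chunk in range(num_chunks) for _ in range(copies)]
    rows = []
    for start in range(0, len(slots), num_disks):
        row = slots[start:start + num_disks]
        row += [None] * (num_disks - len(row))  # pad an incomplete last row
        rows.append(row)
    return rows


def far_layout(num_disks, num_chunks):
    """Toy 'far' layout with two copies.

    The first half of every disk holds a RAID 0 style stripe; the second half
    holds a second copy shifted by one disk, so both copies of a chunk are
    always on different drives.
    """
    rows_per_half = -(-num_chunks // num_disks)  # ceiling division
    first = [[None] * num_disks for _ in range(rows_per_half)]
    second = [[None] * num_disks for _ in range(rows_per_half)]
    for chunk in range(num_chunks):
        row, disk = divmod(chunk, num_disks)
        first[row][disk] = chunk
        second[row][(disk + 1) % num_disks] = chunk
    return first + second


print(near_layout(3, 2, 3))  # [[0, 0, 1], [1, 2, 2]]
print(far_layout(2, 4))      # [[0, 1], [2, 3], [1, 0], [3, 2]]
```

In the far example, a sequential read of chunks 0 to 3 can be served from the first half of both disks exactly as in RAID 0, while each write must touch two widely separated locations.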

Examples

Note: A1, A2, et cetera each represent one data block; each column represents one disk.

More typically, larger arrays of disks are combined for professional applications. In high-end configurations, enterprise storage experts expected PCIe and SAS storage to dominate and eventually replace interfaces designed for spinning metal,[6] and expected these interfaces to become further integrated with Ethernet and network storage. This suggested that rarely accessed data stripes could often be located over networks, and that very large arrays using protocols like iSCSI would become more common. Pictured in this section is an example in which three 120 GB RAID 1 arrays are striped together to make 360 GB of total storage space.

Redundancy and data-loss recovery capability

All but one drive from each RAID 1 set could fail without damaging the data. However, if the failed drive is not replaced, the single working hard drive in the set then becomes a single point of failure for the entire array. If that single hard drive then fails, all data stored in the entire array is lost. As is the case with RAID 0+1, if a failed drive is not replaced in a RAID 10 configuration, then a single uncorrectable media error on the surviving drive of that mirror would result in data loss. Some RAID 10 vendors address this problem by supporting a "hot spare" drive, which automatically replaces a failed drive in the array and is rebuilt from the surviving mirror.
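The failure rule above (the array survives as long as every RAID 1 set keeps at least one working drive) can be sketched as a small check. The drive names and the function name are hypothetical and serve only as an illustration.

```python
def raid10_survives(mirror_sets, failed_drives):
    """Return True if every mirror set still has at least one working drive.

    mirror_sets: list of lists of drive names, one inner list per RAID 1 set.
    failed_drives: set of names of drives that have failed.
    """
    return all(
        any(drive not in failed_drives for drive in mirror)
        for mirror in mirror_sets
    )


sets = [["a1", "a2"], ["b1", "b2"], ["c1", "c2"]]

# One failure in each mirror set: data is still intact.
print(raid10_survives(sets, {"a1", "b2", "c1"}))  # True

# Both drives of one mirror set fail: the whole array is lost.
print(raid10_survives(sets, {"a1", "a2"}))        # False
```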

Performance (speed)

According to manufacturer specifications[7] and official independent benchmarks,[8][9] in most cases RAID 10 provides better throughput and latency than all other RAID levels except RAID 0 (which wins in throughput).

RAID 10 is the preferable RAID level for I/O-intensive applications such as database, email, and web servers, as well as for any other use requiring high disk performance.[10]

Efficiency (potential waste of storage)

The usable capacity of a RAID 10 array is $\sum_i V_{i,\mathrm{min}}$, where $V_{i,\mathrm{min}}$ is the capacity of the smallest disk in mirror set $i$ and the sum is taken over all the mirror sets. If each mirror set contains the same number $M$ of disks, the smallest disk in each mirror set has capacity $C$, and there are $N$ disks in total, this simplifies to usable capacity $= (N/M) \cdot C$.
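As a worked example of this formula, the following sketch sums the smallest disk of each mirror set and compares the result with the simplified (N/M) C form. The function name and the disk sizes are assumptions chosen to match the three 120 GB mirror sets used as an example earlier in the article.

```python
def raid10_usable_capacity(mirror_sets):
    """Sum, over all mirror sets, of the smallest disk in each set."""
    return sum(min(sizes) for sizes in mirror_sets)


# Three two-way mirror sets of 120 GB disks.
uniform = [[120, 120], [120, 120], [120, 120]]
print(raid10_usable_capacity(uniform))  # 360

# The simplified form (N / M) * C gives the same value for uniform sets.
n, m, c = 6, 2, 120
print((n / m) * c)  # 360.0

# With unequal disks, each mirror set contributes only its smallest disk.
print(raid10_usable_capacity([[100, 120], [120, 160]]))  # 220
```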

Implementation

The Linux kernel RAID 10 implementation (from version 2.6.9 onwards) is not nested: the mirroring and striping are done in one process. Only certain layouts are standard RAID 10.[4] See also the Linux MD RAID 10 and RAID 1.5 sections in the Non-standard RAID article for details.

RAID 100 (RAID 1+0+0)

Representative RAID-100 Setup.
(Note: A1, B1, et cetera each represent one data sector; each column represents one disk.)

A RAID 100, sometimes also called RAID 10+0, is a stripe of RAID 10s. This is logically equivalent to a wider RAID 10 array, but is generally implemented using software RAID 0 over hardware RAID 10. Being "striped two ways", RAID 100 is described as a "plaid RAID".[11] Below is an example in which two sets of two 120 GB RAID 1 arrays are striped and re-striped to make 480 GB of total storage space:

The failure characteristics are identical to those of RAID 10: all but one drive from each RAID 1 set could fail without loss of data. However, the remaining disk from the RAID 1 becomes a single point of failure for the already degraded array. Often the top-level stripe is done in software. Some vendors call the top-level stripe a MetaLun (LUN stands for logical unit number) or a Soft Stripe.

The major benefit of RAID 100 (and plaid RAID in general) over single-level RAID is that it spreads the load across multiple RAID controllers, giving better random read performance and mitigating hotspot risk on the array. For these reasons, RAID 100 is often the best choice for very large databases, where the hardware RAID controllers limit the number of physical disks allowed in each standard array. Implementing nested RAID levels allows virtually limitless spindle counts in a single logical volume.
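To illustrate what "striped two ways" means, the sketch below maps a logical block number first to one of the RAID 10 sets (the top-level stripe) and then to a mirror pair inside that set (the inner stripe). It is a simplified model that assumes a stripe unit of one block at both levels. The function name and parameters are hypothetical and do not describe any particular controller.

```python
def raid100_locate(block, num_raid10_sets, mirrors_per_set):
    """Map a logical block to (RAID 10 set, mirror pair, offset on the pair).

    Assumes a stripe unit of one block at both stripe levels.
    """
    # Top-level RAID 0: rotate blocks across the RAID 10 sets.
    set_index = block % num_raid10_sets
    inner_block = block // num_raid10_sets
    # Inner RAID 0 of each RAID 10: rotate across its mirror pairs.
    mirror_index = inner_block % mirrors_per_set
    offset = inner_block // mirrors_per_set
    return set_index, mirror_index, offset


# Two RAID 10 sets, each striping over two RAID 1 pairs.
for block in range(5):
    print(block, raid100_locate(block, 2, 2))
# 0 (0, 0, 0)
# 1 (1, 0, 0)
# 2 (0, 1, 0)
# 3 (1, 1, 0)
# 4 (0, 0, 1)
```

Consecutive blocks alternate between the two RAID 10 sets, which is how the load ends up spread across separate controllers in this model.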

RAID 0+3 and 3+0

RAID 0+3

Diagram of a 0+3 array

RAID level 0+3 or RAID level 03 is a dedicated parity array across striped disks. Each block of data at the RAID 3 level is broken up amongst RAID 0 arrays where the smaller pieces are striped across disks.

However, this is a perilous arrangement. One drive from any one of the underlying RAID 0 sets can fail, and other drives in that same RAID 0 set can then also fail without loss of data. After the first failure, however, any drive in the other RAID 0 sets is a single point of failure for the entire RAID 03 array. (More advanced recovery techniques might be possible if, and only if, the RAID 0 controllers stripe data exactly in lock-step with each other; however, this may not be reliable.)

Rebuilding the array involves the entire array (as opposed to RAID 30, where it involves only half the array). This is relevant to both rebuild time and performance during rebuild.

RAID 30

Diagram of a 3+0 array

RAID level 30 is also known as striping of dedicated parity arrays. It is a combination of RAID level 3 and RAID level 0. RAID 30 provides high data transfer rates combined with high data reliability. RAID 30 is best implemented on two RAID 3 disk arrays, with data striped across both disk arrays. RAID 30 breaks up data into smaller blocks and then stripes the blocks of data to each RAID 3 set. RAID 3 in turn breaks up data into smaller blocks, calculates parity by performing an exclusive OR on the blocks, and then writes the blocks to all but one drive in the array. The parity created using the exclusive OR is then written to the last drive in each RAID 3 array. The size of each block is determined by the stripe size parameter, which is set when the RAID is created.
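The exclusive OR parity described above can be demonstrated in a few lines. This is a minimal sketch of the parity idea only, using made-up two-byte blocks. It is not how any particular RAID 3 controller is implemented, and the function name is an assumption.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(
        reduce(lambda a, b: a ^ b, column)  # XOR the bytes at one position
        for column in zip(*blocks)
    )


# Three data blocks, as they would be written to three data drives.
data = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]

# The parity block is written to the dedicated parity drive.
parity = xor_blocks(data)

# If the second data drive fails, XOR-ing the surviving data blocks
# with the parity block rebuilds the missing block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])  # True
```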

One drive from each of the underlying RAID 3 sets can fail. Until the failed drives are replaced the other drives in the sets that suffered such a failure are a single point of failure for the entire RAID 30 array. In other words, if one of those drives fails, all data stored in the entire array is lost. The time spent in recovery (detecting and responding to a drive failure, and the rebuild process to the newly inserted drive) represents a period of vulnerability to the RAID set. RAID 30's strength over both RAID 03 and a large RAID 3 is that only the HDDs of one RAID 3 set (that is, at most half the HDDs) are affected during the rebuild process, which means both shorter rebuild times, and a less severe performance hit during the rebuild.

RAID 50 (RAID 5+0)

Representative RAID-50 Setup.
(Note: A1, B1, et cetera each represent one data block; each column represents one disk; Ap, Bp, et cetera each represent parity information for each distinct RAID 5 and may represent different values across the RAID 5 (that is, Ap for A1 and A2 can differ from Ap for A3 and A4).)

A RAID 50 combines the straight block-level striping of RAID 0 with the distributed parity of RAID 5.[3] This is a RAID 0 array striped across RAID 5 elements. It requires at least six drives.

Below is an example where three collections of 240 GB RAID 5s are striped together to make 720 GB of total storage space:

One drive from each of the RAID 5 sets could fail without loss of data. However, if the failed drive is not replaced, the remaining drives in that set then become a single point of failure for the entire array. If one of those drives fails, all data stored in the entire array is lost. The time spent in recovery (detecting and responding to a drive failure, and the rebuild process to the newly inserted drive) represents a period of vulnerability to the RAID set.

In the example below, datasets may be striped across both RAID sets. A dataset with 5 blocks would have 3 blocks written to the first RAID set, and the next 2 blocks written to RAID set 2.

RAID-50 Setup consisting of two sets of four drives each.
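The five-block example above can be reproduced with a short sketch. It assumes that each four-drive RAID 5 set holds three data blocks per stripe (the fourth block being parity) and that the top-level RAID 0 fills one set's stripe before moving to the next set. The function name is hypothetical.

```python
def distribute_blocks(num_blocks, num_sets, drives_per_set):
    """Assign consecutive data blocks to RAID 5 sets in a RAID 50.

    Returns one list of block numbers per RAID 5 set.
    """
    data_per_stripe = drives_per_set - 1  # one block per stripe is parity
    placement = [[] for _ in range(num_sets)]
    for block in range(num_blocks):
        set_index = (block // data_per_stripe) % num_sets
        placement[set_index].append(block)
    return placement


# Two sets of four drives each, dataset of five blocks (numbered from 0).
print(distribute_blocks(5, 2, 4))  # [[0, 1, 2], [3, 4]]
```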

The configuration of the RAID sets affects the overall fault tolerance. A construction of three seven-drive RAID 5 sets has higher capacity and storage efficiency, but can tolerate at most three drive failures, and only if each is in a different set. Because the reliability of the system depends on quick replacement of the bad drive so that the array can rebuild, it is common to construct three six-drive RAID 5 sets, each with a hot spare that can immediately start rebuilding the array on failure. This does not address the issue that the array is put under maximum strain, reading every bit to rebuild, precisely when it is most vulnerable. A construction of seven three-drive RAID 5 sets can handle up to seven drive failures, if they are in different RAID 5 sets, but has lower capacity and storage efficiency.
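The trade-off between the two 21-drive constructions above can be put into numbers. The sketch below assumes equal-sized drives and counts capacity in whole drives; the function name and the dictionary keys are illustrative choices.

```python
def raid50_summary(num_sets, drives_per_set):
    """Summarize a RAID 50 built from equal RAID 5 sets of equal drives."""
    total_drives = num_sets * drives_per_set
    # Each RAID 5 set gives up one drive's worth of space to parity.
    data_drives = num_sets * (drives_per_set - 1)
    return {
        "total_drives": total_drives,
        "data_drives": data_drives,
        "efficiency": data_drives / total_drives,
        # Best case: exactly one failed drive in every RAID 5 set.
        "max_tolerated_failures": num_sets,
    }


print(raid50_summary(3, 7))  # 18 data drives, efficiency ~0.857, up to 3 failures
print(raid50_summary(7, 3))  # 14 data drives, efficiency ~0.667, up to 7 failures
```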

RAID 50 improves upon the performance of RAID 5 particularly during writes, and provides better fault tolerance than a single RAID level does. This level is recommended for applications that require high fault tolerance, capacity and random positioning performance.

As the number of drives in a RAID set increases, and as the capacity of the drives increases, the fault-recovery time increases correspondingly, because rebuilding the RAID set takes longer.

RAID 51

Diagram of a RAID 51 setup.

A RAID 51 or RAID 5+1 is an array that consists of two RAID 5 arrays that are mirrors of each other. Generally this configuration is used so that each RAID 5 resides on a separate controller. In this configuration, reads and writes are balanced across both RAID 5 arrays. Some controllers support RAID 51 across multiple channels and cards, with hinting to keep the different slices synchronized. However, a RAID 51 can also be built using a layered RAID technique. In this configuration, the two RAID 5 arrays have no idea that they are mirrors of each other, and the RAID 1 has no idea that its underlying disks are RAID 5 arrays. If the configuration is layered, the RAID controllers cannot exploit the fact that each HDD is mirrored for a faster rebuild process, but have to go through the lengthier RAID 5 rebuild process. This configuration can sustain the failure of all disks in either of the arrays, plus up to one additional disk from the other array, before suffering data loss. The usable capacity of a RAID 51 is that of N - 1 drives, where N is the number of drives making up each individual RAID 5 set.
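The failure tolerance and capacity stated above can be expressed as a small sketch. It models the layered case, in which data remains readable as long as at least one of the two RAID 5 arrays has lost no more than one drive, and it assumes equal-sized drives. The function names are hypothetical.

```python
def raid51_survives(failed_in_first, failed_in_second):
    """Layered RAID 51: at least one RAID 5 side must still be readable.

    A RAID 5 array remains readable with at most one failed drive.
    """
    return failed_in_first <= 1 or failed_in_second <= 1


def raid51_usable_capacity(drives_per_raid5, drive_size_gb):
    """Capacity of N - 1 drives, N being the drives in each RAID 5 set."""
    return (drives_per_raid5 - 1) * drive_size_gb


# Two mirrored four-drive RAID 5 arrays of 120 GB drives.
print(raid51_survives(4, 1))           # True: one whole side plus one more drive
print(raid51_survives(2, 2))           # False: both sides have lost two drives
print(raid51_usable_capacity(4, 120))  # 360
```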

RAID 05 (RAID 0+5)

A RAID 0+5 consists of several RAID 0 arrays (a minimum of three) that are grouped into a single RAID 5 set. The total capacity is that of N - 1 of the RAID 0 arrays, where N is the total number of RAID 0 arrays that make up the RAID 5. This configuration is not generally used in production systems.

RAID 53

Note that RAID 53 is typically used as a name for RAID 30 or 0+3.[12]

RAID 60 (RAID 6+0)

RAID-60 (RAID 6+0) setup consisting of two sets of four drives each

A RAID 60 combines the straight block-level striping of RAID 0 with the distributed double parity of RAID 6. That is, it is a RAID 0 array striped across RAID 6 elements. It requires at least eight disks.[3]

References


  1. ^ Delmar, Michael Graves (2003). "Data Recovery and Fault Tolerance". The Complete Guide to Networking and Network+. Cengage Learning. p. 448. ISBN 1-4018-3339-X. 
  2. ^ Mishra, S. K.; Vemulapalli, S. K.; Mohapatra, P (1995). "Dual-Crosshatch Disk Array: A Highly Reliable Hybrid-RAID Architecture". Proceedings of the 1995 International Conference on Parallel Processing: Volume 1. CRC Press. pp. I–146ff. ISBN 0-8493-2615-X. 
  3. ^ a b c d "Selecting a RAID level and tuning performance". IBM Systems Software Information Center. IBM. 2011. p. 1. 
  4. ^ a b c Brown, Neil (27 August 2004). "RAID10 in Linux MD driver". 
  5. ^ http://www.snia.org/tech_activities/standards/curr_standards/ddf/SNIA-DDFv1.2.pdf
  6. ^ Cole, Arthur (24 August 2010). "SSDs: From SAS/SATA to PCIe". IT Business Edge. 
  7. ^ "Intel Rapid Storage Technology: What is RAID 10?". Intel. 16 November 2009. 
  8. ^ "IBM and HP 6-Gbps SAS RAID Controller Performance" (PDF). Demartek. October 2009. 
  9. ^ "Summary Comparison of RAID Levels". PCGuide.com. 17 April 2001. 
  10. ^ Gupta, Meeta (2002). Storage Area Network Fundamentals. Cisco Press. p. 268. ISBN 1-58705-065-X. 
  11. ^ McKinstry, Jim. "Server Management: Questions and Answers". Sys Admin. Archived from the original on 19 January 2008. 
  12. ^ Kozierok, Charles M. (17 April 2001). "RAID Levels 0+3 (03 or 53) and 3+0 (30)". The PC Guide.