Talk:RAID: Difference between revisions

Revision as of 03:05, 2 February 2013

This is the talk page for discussing improvements to the RAID article.
This is not a forum for general discussion of the article's subject.

RAID was an Engineering and technology good articles nominee, but did not meet the good article criteria at the time. There may be suggestions below for improving the article. Once these issues have been addressed, the article can be renominated. Editors may also seek a reassessment of the decision if they believe there was a mistake.

Article milestones
Date	Process	Result
May 22, 2006	Good article nominee	Not listed

Computing: Software C‑class High‑importance

	This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
C	This article has been rated as C-class on Wikipedia's content assessment scale.
High	This article has been rated as High-importance on the project's importance scale.
	This article is supported by WikiProject Software (assessed as Mid-importance).
	This article is supported by Computer hardware task force (assessed as High-importance).

Archives

Index 1, 2, 3, 4, 5, 6, 7

This page has archives. Sections older than 60 days may be automatically archived when more than 4 sections are present.

References

IEEE.org, as of 4-6 Dec 1991 and updated as of Aug 2002, cited Inexpensive ^[1]

IEEE.org, as of June 29 2009-July 2 2009 and updated as of 29 September 2009, cited Independent ^[2]

NetApp - Independent ^[3]

EMC - Independent ^[4]

Western Digital - Independent ^[5]

A paper from Duke University Dept. of Computer Science in 1993 - Inexpensive ^[6]

Edit revert

I have reverted two edits by 145.225.17.100 (talk · contribs) ( http://en.wikipedia.org/w/index.php?title=RAID&diff=523846804&oldid=523363466 and http://en.wikipedia.org/w/index.php?title=RAID&diff=523846913&oldid=523846804 ) because they removed the entirety of the Standard Raids section, removed part of the cite in the nested/hybrid section, and added no content. I'm going to assume this is either vandalism or accidental, though the IP does have other changes elsewhere that look less suspicious. Is anything needed other than undoing the edit? Nazzy (talk) 15:58, 19 November 2012 (UTC)

Nope! If an edit seems accidental or otherwise incoherent, and is not explained in the edit summary (the editor you mention didn't leave edit summaries), simply undoing it is the best solution. On a similar note, always providing edit summaries helps others understand your edits. – voidxor (talk | contrib) 08:07, 19 January 2013 (UTC)

reliability terms mistake

In the section "reliability terms" it says: "System failure is defined as loss of data and its rate will depend on the type of RAID. For RAID 0 this is equal to the logical failure rate, as there is no redundancy." However, wouldn't the failure rate be higher, assuming each drive in the array has a chance of failure and the data is spread across more than one drive? ~patrick
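
As a rough illustration of the question above: if each drive is assumed to fail independently with the same probability over some period (an assumption made here for the sketch, not something the article states), then a RAID 0 array loses data as soon as any one drive fails, so the array's failure probability grows with the number of drives. The function name and the example figures below are hypothetical.

```python
def raid0_failure_probability(per_drive_failure_prob: float, drives: int) -> float:
    """Probability that a RAID 0 array loses data in a given period.

    Assumes (for illustration only) that drives fail independently and
    with the same probability. RAID 0 has no redundancy, so the array
    fails if at least one drive fails.
    """
    if drives < 1:
        raise ValueError("an array needs at least one drive")
    if not 0.0 <= per_drive_failure_prob <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    # P(at least one failure) = 1 - P(no drive fails)
    return 1.0 - (1.0 - per_drive_failure_prob) ** drives


# Hypothetical figures: a 3% chance of failure per drive over the period.
for n in (1, 2, 4):
    print(n, round(raid0_failure_probability(0.03, n), 4))
```

Under these assumptions a four-drive stripe is noticeably more likely to fail than a single drive, which is the concern raised in the comment.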

Claim that RAID 3 is theoretical and not used in practice is incorrect.

The article claims RAID 3 is a 'theoretical' RAID level and 'not used in practice.' This is incorrect: EMC Clariion and VNX support a RAID 3 option. – Preceding unsigned comment added by 98.229.192.30 (talk) 21:28, 14 December 2012 (UTC)

Yes, true: there do appear to be several older implementations that used RAID 3. The article text has now been changed to say that RAID 3 is not widely used in practice, with a link to the BSD Unix implementation. I believe that is an accurate assessment of current usage. Is the RAID 3 option still supported in the product lines mentioned above? It would be good to find a reference giving estimated usage of these RAID levels in contemporary products.

121.45.193.118 (talk) 12:09, 20 December 2012 (UTC)

You can see the RAID 3 reference, as well as 0, 1, 1/0, 5, and 6, in the current (1/7/2013) datasheet for EMC VNX on their website. http://www.emc.com/collateral/hardware/data-sheets/h8520-vnx-family-ds.pdf – Preceding unsigned comment added by 168.159.213.60 (talk) 20:02, 7 January 2013 (UTC)

Duplicate sentence

The sentence "a single drive failure results in reduced performance of the entire array until the failed drive has been replaced and the associated data rebuilt" appears twice in the description of RAID 5. Somebody should fix it. 173.34.179.126 (talk) 14:19, 22 December 2012 (UTC)

Done. 121.45.193.118 (talk) 22:09, 22 December 2012 (UTC)

Various improvements - feedback please.

I've just made a few edits to improve clarity.

I'm looking for constructive criticism regarding the following ones:

(and feel free to make them for me.)

CONFUSING:

"Background scrubbing can be used to detect and recover from UREs (which are latent and invisibly compensated for dynamically by the RAID array) as a background process, by reconstruction from the redundant RAID data and then re-writing and re-mapping to a new sector; and so reduce the risk of double-failures to the RAID system[42][43] (see Data scrubbing above)."

Idea:

Add a subheading to this section and the one above what's CONFUSING

and

= Mitigations

REWORD THUS:

Double....

Background data scrubbing is a RAID feature that [mitigates the problem|attempts to address the issue] by having a background process read drive blocks when a drive is idle and whenever a URE occurs, using the redundant RAID data to reconstruct, re-write and re-map the logical block to a new spare block, and so...

or perhaps

Background data scrubbing is a RAID feature that [mitigates the problem|attempts to address the issue] by having a background process read drive sectors when a drive is idle and whenever a URE occurs, using the redundant RAID data to reconstruct, re-write and re-map the logical sector to a new spare sector, and so... Where the read triggers a Recoverable Read Error, the drive itself reconstructs, re-writes and re-maps the logical sector to a new spare sector...

Section on Atomicity: Wow, this is confusingly written. Perhaps it should kick off with the simple statement, like "Databases are designed to maintain data consistency despite the non-atomicity of the write process of the disk drives on which they normally store data, even when there is unexpected power loss." If I'm not mistaken, either disk drives can't perform atomic write processes, period, or ones that can are rare enough to merit little more than a parenthetical.

Write case reliability:

P(aragraph)3 doesn't mesh with P1 and P2. In fact all 3 are about write-back, but only P3 calls it that. Merge 'em.

Hardware Labeling:

Some systems have a feature intended to reduce operator error and facilitate drive failure management by having an LED by a failed drive indicate that it is a drive that has failed.

History:

REMOVE: With S/38 checksum, when a disk failed, the system stopped and was powered off. Under maintenance, the bad disk was replaced and the new disk was fully recovered using RAID parity bits. While checksum had 10%-30% overhead and was not concurrent recovery, non-concurrent recovery was still a far better solution than a reload of the entire system. With 30% overhead and the then high expense of extra disk, few customers implemented checksum.[citation needed]

doesn't make sense.

It's so confusingly written I don't know what it's trying to say or how to fix it.

--Elvey (talk) 11:42, 30 December 2012 (UTC)
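
For anyone weighing the proposed rewording of the scrubbing paragraph, the behaviour it describes can be sketched in a few lines. This is a deliberately simplified model and not how any particular controller works: it assumes a single XOR parity block per stripe (as in RAID 5), represents an unreadable block (a URE) as `None`, and leaves out the re-mapping to a spare sector, which real drives handle in firmware. The names `xor_blocks` and `scrub_stripe` are made up for the illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def scrub_stripe(stripe):
    """Return a repaired copy of one stripe.

    `stripe` is a list of blocks (data blocks plus one XOR parity block).
    A block that could not be read is represented as None. With single
    parity, one missing block equals the XOR of all surviving blocks.
    """
    unreadable = [i for i, block in enumerate(stripe) if block is None]
    if not unreadable:
        return list(stripe)  # nothing to repair
    if len(unreadable) > 1:
        raise ValueError("single parity cannot rebuild more than one block")
    repaired = list(stripe)
    survivors = [block for block in stripe if block is not None]
    repaired[unreadable[0]] = xor_blocks(survivors)
    return repaired


# Hypothetical stripe: three data blocks and their parity.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
stripe = data + [xor_blocks(data)]
damaged = [stripe[0], None, stripe[2], stripe[3]]  # second block hit a URE
print(scrub_stripe(damaged) == stripe)
```

The sketch also shows why scrubbing reduces the double-failure risk: the latent error is repaired while the other blocks are still readable, rather than being discovered during a rebuild.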

RAID 10 compared to RAID 5 section moved here for improvement

The section ==Mirroring versus parity RAID levels in relational databases== is very problematic. For example, the claim that "Given the rare nature of drive failures in general, and the exceedingly low probability of multiple concurrent drive failures occurring within the same RAID, the choice of RAID 5 over RAID 10 often comes down to the preference of the storage administrator" is incorrect: UREs are extremely likely to happen with large disks, and thus RAID 5 is not recommended for large databases by the major storage manufacturers (as noted in the main article). The references given are to a non-authoritative web site and do not support the claims in any case. A well-referenced section on this topic could be useful, but the current section is misleading to users of RAID. The current text is copied below for improvement before possibly returning to the main article. 121.45.213.224 (talk) 06:54, 28 January 2013 (UTC)
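
To make the URE point above concrete, here is a back-of-the-envelope sketch. It assumes a URE rate of one error per 10^14 bits read (a figure commonly quoted in consumer drive datasheets, used here purely as an assumption) and treats errors as independent per bit, which real drives may not obey. The function name and the array size are hypothetical.

```python
import math


def rebuild_ure_probability(surviving_drives: int, drive_bytes: int,
                            ure_rate_per_bit: float = 1e-14) -> float:
    """Chance of hitting at least one URE while rebuilding a RAID 5 array.

    A RAID 5 rebuild must read every surviving drive in full. Assuming
    independent errors at `ure_rate_per_bit`, the probability of at
    least one URE is 1 - (1 - rate) ** bits_read.
    """
    bits_read = surviving_drives * drive_bytes * 8
    # log1p/expm1 keep precision when the per-bit rate is tiny.
    return -math.expm1(bits_read * math.log1p(-ure_rate_per_bit))


# Hypothetical array: seven 2 TB drives, one failed, six read in full.
p = rebuild_ure_probability(surviving_drives=6, drive_bytes=2 * 10**12)
print(f"{p:.2f}")  # roughly 0.62 under these assumptions
```

Under these assumptions a rebuild is more likely than not to encounter a URE, which is the basis of the objection to the "exceedingly low probability" wording.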

Mirroring versus parity RAID levels in relational databases

A common opinion (and one which serves to illustrate the dynamics of proper RAID deployment) is that RAID 10 (a non-parity, mirrored RAID) is inherently better for relational databases than RAID 5, because RAID 5 requires the recalculation and redistribution of parity data on a per-write basis.^[1]

There are, however, other considerations which must be taken into account other than simply those regarding performance. RAID 5 and other non-mirror-based arrays offer a lower degree of resiliency than RAID 10 by virtue of RAID 10's mirroring strategy. In a RAID 10, I/O can continue even in spite of multiple drive failures. By comparison, in a RAID 5 array, any failure involving more than one drive renders the array itself unusable by virtue of parity recalculation being impossible to perform. Thus, RAID 10 is frequently favored because it provides the lowest level of risk.^[2]

Modern SAN design largely masks any performance hit while a RAID is in a degraded state, by virtue of being able to perform rebuild operations both in-band or out-of-band with respect to existing I/O traffic. Given the rare nature of drive failures in general, and the exceedingly low probability of multiple concurrent drive failures occurring within the same RAID, the choice of RAID 5 over RAID 10 often comes down to the preference of the storage administrator, particularly when weighed against other factors such as cost, throughput requirements, and physical spindle availability.^[2]^{[failed verification]}

RAID 10 space efficiency calculation seems off

It seems that RAID 10 and RAID 1 should have the same space efficiency formula, namely 50%, since they both only use mirroring to tolerate single drive failures. – Preceding unsigned comment added by 99.245.3.15 (talk) 19:40, 1 February 2013 (UTC)
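
One way to check the formulas is to write them out. The sketch below assumes RAID 1 mirrors the same data onto every drive in the set, and RAID 10 stripes across mirrored groups of a fixed width (two-way by default). Both are assumptions about the layout rather than something taken from the article, and the function names are hypothetical.

```python
from fractions import Fraction


def raid1_space_efficiency(drives: int) -> Fraction:
    """Usable fraction when every drive holds a full copy of the data."""
    if drives < 2:
        raise ValueError("RAID 1 needs at least two drives")
    return Fraction(1, drives)


def raid10_space_efficiency(drives: int, mirror_width: int = 2) -> Fraction:
    """Usable fraction for a stripe across mirrored groups of `mirror_width`."""
    if mirror_width < 2:
        raise ValueError("each mirror group needs at least two drives")
    if drives < 2 * mirror_width or drives % mirror_width:
        raise ValueError("drives must form at least two complete mirror groups")
    # Each group stores one drive's worth of data on `mirror_width` drives.
    return Fraction(1, mirror_width)


print(raid1_space_efficiency(2))   # two-way mirror
print(raid10_space_efficiency(4))  # two mirrored pairs, striped
print(raid1_space_efficiency(3))   # three-way mirror
```

Under these assumptions the two levels agree at 50% when the mirrors are two-way, and differ only if RAID 1 is taken to mean an n-way mirror of all drives.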

^ "RAID Classifications". BytePile.com. 2012-04-10. Retrieved 2012-08-26.
^ ^a ^b "RAID Classifications". BytePile.com. 2012-04-10. Retrieved 2012-08-26.

[1] ttp://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=183060

[2] ttp://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=5270303

[3] ttps://kb.netapp.com/support/index?page=content&id=3010575

[4] ttp://www.emc.com/backup-and-recovery/disk-library/disk-library-for-mainframe.htm#!offerings

[5] ttp://www.wdc.com/en/products/products.aspx?id=210

[6] ttp://people.ee.duke.edu/~kst/markovpapers/Manish-RAID.pdf

@@ Line 55: / Line 55: @@
 == Edit revert ==
 I have reverted two edits by {{User|145.225.17.100}} ( http://en.wikipedia.org/w/index.php?title=RAID&diff=523846804&oldid=523363466 and http://en.wikipedia.org/w/index.php?title=RAID&diff=523846913&oldid=523846804 )due to the fact they removed the entity of the Standard Raids section, removed part of the cite in the nested/hybrid section, and added no content.  I'm going to assume this is either vandalism or accidental, though the IP does have other changes elsewhere that look less suspicious.  Anything needed other than undoing the edit? [[User:Nazzy|Nazzy]] ([[User talk:Nazzy|talk]]) 15:58, 19 November 2012 (UTC)
@@ Line 62: / Line 63: @@
 in the section "reliability terms" it says: "System failure is defined as loss of data and its rate will depend on the type of RAID. For RAID 0 this is equal to the logical failure rate, as there is no redundancy." however wouldn't the faliure rate be higher, asuming each drive in the array has a chance of faliure and the data is spread across more than one drive? ~patrick
-== Any citations to support claim for RAID 5 in IBM S/38? ==
-In the history section the claim is made that:
-"In October 1986, the IBM S/38 announced "checksum" - an operating system software level implementation of what became RAID-5"
-which is one year before Patterson defined the term RAID.
-I have done a pretty thorough search for literature evidence for this and can find none- does anyone have a reference. Otherwise I will remove it in the future.
-[[Special:Contributions/121.45.215.68|121.45.215.68]] ([[User talk:121.45.215.68|talk]])  <span style="font-size: smaller;" class="autosigned">—Preceding [[Wikipedia:Signatures|undated]] comment added 04:35, 2 December 2012 (UTC)</span><!--Template:Undated--> <!--Autosigned by SineBot-->
-== Section "RAID 10 versus RAID 5 in relational databases" problematic ==
-The section "RAID 10 versus RAID 5 in relational databases" has several problems- the main one is that RAID5 is no longer recommended or used for any large data set, presumably including most databases. Could this section be generalised to compare parity versus non-parity (mirrored) RAID schemes?
-Also, there is a lack of references. [[Special:Contributions/121.45.215.68|121.45.215.68]] ([[User talk:121.45.215.68|talk]])  <span style="font-size: smaller;" class="autosigned">—Preceding [[Wikipedia:Signatures|undated]] comment added 22:44, 3 December 2012 (UTC)</span><!--Template:Undated--> <!--Autosigned by SineBot-->
 == Claim that RAID 3 is theoretical and not used in practice is incorrect. ==
@@ Line 175: / Line 163: @@
 A well-referenced section on this topic could be useful, but the current section is misleading to users of RAID. The current text is copied below for improvement before possibly returning to the main article. [[Special:Contributions/121.45.213.224|121.45.213.224]] ([[User talk:121.45.213.224|talk]]) 06:54, 28 January 2013 (UTC)
-==Mirroring versus parity RAID levels in relational databases==
+== Mirroring versus parity RAID levels in relational databases ==
 A common opinion (and one which serves to illustrate the dynamics of proper RAID deployment) is that RAID&nbsp;10 (a non-parity, mirrored RAID) is inherently better for [[relational databases]] than RAID&nbsp;5, because RAID&nbsp;5 requires the recalculation and redistribution of parity data on a per-write basis.<ref>{{Cite web|url=http://www.bytepile.com/raid_class.php#5 |title=RAID Classifications |publisher=BytePile.com |date=2012-04-10 |accessdate=2012-08-26}}</ref>