Talk:Standard RAID levels

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Tcncv (talk | contribs) at 05:30, 11 September 2022 (→‎RAID 6 Simplified parity scheme is wrong?: Agree that secion is flawed and support removal). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

WikiProject Computing: C-class, Low-importance
This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as C-class on Wikipedia's content assessment scale.
This article has been rated as Low-importance on the project's importance scale.
This article is supported by Computer hardware task force (assessed as Mid-importance).

Clarifying the Need for Reed Solomon coding in RAID-6

In the RAID-6 section, the explanation of the second parity says that orthogonal, diagonal and Reed Solomon coding can all be used. However, when Reed Solomon is explained, it is presented as necessary. It isn't necessary; the other two methods work equally well. Both of them require calculating a second parity block on a write, but this is less CPU-intensive than an equivalently sized RS code. I will add some clarifying language to this effect.

RAID 6 Simplified parity scheme is wrong?

I'm no expert on erasure codes, but I'm pretty sure the Simplified Parity Example section is wrong.

Consider the case where all data bits are zero. In that case, P and Q will also be all zeros, and if we lose disks Di and Dj (or any two data disks), A and B will also be all zeros. But if we use the given equations for A and B to try to recover Di and Dj:

A = Di xor Dj = 0
B = shift^i(Di) xor shift^j(Dj) = 0

We can easily see that these equations are true when Di and Dj are all zeros, or all ones. So there is no unique solution for Di or Dj, and we cannot recover the lost data.

This problem is not limited to the case where the data bits are all zeros (or all ones). I think there are always two solutions to the equations, regardless of the inputs.
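
The non-uniqueness can be checked by brute force. The following is a minimal Python sketch, assuming 8-bit chunks, a left rotate standing in for the article's shift, and syndromes of the form A = Di xor Dj and B = rot^i(Di) xor rot^j(Dj). The function names are hypothetical and are not taken from the article.

```python
def rotl(value, amount, width=8):
    """Rotate a width-bit integer left by amount bits."""
    amount %= width
    mask = (1 << width) - 1
    return ((value << amount) | (value >> (width - amount))) & mask


def syndromes(d_i, d_j, i, j, width=8):
    """Return (A, B) for two lost chunks under the assumed rotate-based scheme."""
    a = d_i ^ d_j
    b = rotl(d_i, i, width) ^ rotl(d_j, j, width)
    return a, b


def count_solutions(a, b, i, j, width=8):
    """Count every (Di, Dj) pair that reproduces the given A and B."""
    return sum(
        1
        for x in range(1 << width)
        for y in range(1 << width)
        if syndromes(x, y, i, j, width) == (a, b)
    )


# All-zero data and all-one data give the same syndromes.
print(syndromes(0x00, 0x00, 0, 1), syndromes(0xFF, 0xFF, 0, 1))

# Arbitrary data: count how many candidate pairs match its syndromes.
a, b = syndromes(0x5A, 0xC3, 2, 3)
print(count_solutions(a, b, 2, 3))
```

Under these assumptions every consistent (A, B) pair has at least two candidate solutions, and more when the difference between the two rotation amounts shares a factor with the chunk width.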

Also, I haven't found any other source that describes this algorithm. It seems to only exist on Wikipedia.

I did find some RAID-6 coding schemes which are similar to this one, but not identical: EVENODD and RDP. They are described nicely in [The RAID-6 Liberation Codes].

PromisesPromises (talk) 22:46, 18 January 2022 (UTC)

Yea, I recently read through this when talking to a friend and went... wait a sec, these 2k equations aren't actually linearly independent. If you view Di and Dj as vectors in Z2^n, and use a simple Shift(Dj), you end up with a system like [100 100, 010 010, 001 001, 100 010, 010 001, 001 100] = v, for k=3, n=3. Using any shift function on Dj for any size system will give you one redundant equation, and *not* a well-defined solution for your system.
However, to convince myself I did a little more work, and used (as a simple *working* example) the polynomial field Z2[x]/<x^3+x+1>, which is an 8-element field over {0,1}: x^3+x+1 is irreducible because it is a cubic and neither element of Z2, 0 or 1, is a root. You can see that x has order 7 in this field (keep multiplying 1 by x and reducing via the relation x^3+x+1=0), so it generates all 7 nonzero elements and can be used as a generator per the general case in the next section. Finally, use multiplication by g=x as our "shift" operator instead. With Dj=y0 + y1*x + y2*x^2, we have g*Dj = y0*x + y1*x^2 + y2*(x^3 -> x + 1) = y2 + (y0+y2)*x + y1*x^2. Using these in our linear system from above instead gives [100 100, 010 010, 001 001, 100 001, 010 101, 001 010]. A little work shows that these equations ARE independent, and thus any parity blocks P and Q can uniquely reconstruct Di and Dj.
For a little more abstract explanation (sorry, I'm actually figuring this out as I go), we need Di+Dj=0 and Di+g*Dj=0 to have the unique solution Di=Dj=0. Combining these gives Dj+g*Dj=(I+g*)Dj=0 requiring Dj=0, thus I+g* must be invertible as a linear operator, 1 is *not* an eigenvalue of the operator g*, and g(v)=v has no nonzero solution. As all nonzero elements of our field are invertible, we have g*v=v -> g=1, which was an invalid choice for our (2nd) operator. Thus finally, (I+g*) is invertible, and our system implies Dj=0, which then by our original system Di+Dj=0 gives Di=0. This solution is actually general, I just used the easiest choice of size, suitable finite field, and generator I came up with.
I'm sure there's a MUCH easier way of explaining this, but the upshot here is that (I+Shift^n) is NOT an invertible operator ever in our setup, since as you've seen Di = {11111111} is an eigenvector with eigenvalue 1, satisfying v=Shift^n(v). However, in the field defined by an irreducible polynomial of degree n over Z2, any generator g of the field's nonzero elements gives a proper set of equations, and the 2^n-1 distinct powers of g each give a unique operator to apply to each drive, hence the D0 + g*D1 + g^2*D2 +... that you see in the next wiki section. Since g is cyclic and its powers loop back to 1, the above proof works with any g^i and g^j, requiring that (g^(i-j)-I) is invertible for each pair of distinct powers, which checks out by the same method. Ok, I apologize, but I figured I might as well post this instead of erasing it. 174.134.36.149 (talk) 07:01, 18 April 2022 (UTC)
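
The contrast between a rotate and multiplication by g=x can be sketched in Python. This is only an illustration, assuming 3-bit chunks stored as integers with bit k holding the coefficient of x^k, and the field Z2[x]/<x^3+x+1> from the comment above. The helper names are made up for this sketch.

```python
def mul_by_x(v):
    """Multiply a 3-bit element of Z2[x]/<x^3+x+1> by x."""
    v <<= 1
    if v & 0b1000:
        v ^= 0b1011  # reduce with x^3 = x + 1
    return v


def rot3(v):
    """Rotate a 3-bit value left by one bit (the plain 'shift' operator)."""
    return ((v << 1) | (v >> 2)) & 0b111


def image_size(op):
    """Number of distinct values of v xor op(v); 8 means I + op is invertible."""
    return len({v ^ op(v) for v in range(8)})


def solutions(op, a, b):
    """All (Di, Dj) pairs with Di xor Dj == a and Di xor op(Dj) == b."""
    return [
        (di, dj)
        for di in range(8)
        for dj in range(8)
        if di ^ dj == a and di ^ op(dj) == b
    ]


print(image_size(rot3), image_size(mul_by_x))
print(solutions(rot3, 0, 0))
print(solutions(mul_by_x, 0, 0))
```

With the rotate, the all-ones vector is left unchanged, so I + Shift collapses two inputs onto each output. With multiplication by x, the zero syndromes admit only the zero solution.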
At the end of the example: A xor B = D3 xor (D3 rotate-shift 3). Take a 4-bit HDD whose bits are abcd. Then we know a xor d, b xor a, c xor b, and d xor c. The xor of the first three equals the last, d xor c. So we have only 3 bits of information, but the capacity of the HDD is 4 bits. By the way, the example therefore cannot work on 2+2 or 3+2 drives either. So the example is wrong and must be deleted. X00000 (talk) 09:07, 7 July 2022 (UTC)
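
The 4-bit counting argument can be enumerated directly. This is a small Python sketch, assuming the bits abcd are stored with a as the most significant bit and that "rotate-shift 3" means a left rotate by three positions (abcd becomes dabc). The names are illustrative only.

```python
def rotl4(v, n):
    """Rotate a 4-bit value left by n bits."""
    n %= 4
    return ((v << n) | (v >> (4 - n))) & 0xF


def a_xor_b(d):
    """The combined value D xor (D rotated by 3) for a 4-bit chunk D."""
    return d ^ rotl4(d, 3)


# Collect every value the combination can take over all 16 possible chunks.
images = {a_xor_b(d) for d in range(16)}
print(len(images))

# Each result has even parity, so its last bit is the xor of the other three.
print(all(bin(v).count("1") % 2 == 0 for v in images))
```

Only half of the 16 possible outputs are ever produced, which corresponds to the missing fourth bit of information.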

Why was my article section removal reverted? I thought the result of this talk section was that the section in the article should be deleted... --2003:D7:DF0B:AA00:76D4:35FF:FE54:CF0D (talk) 20:31, 10 September 2022 (UTC)

I agree with all of the above. A simple bit-shift (actually a rotate) cannot be used as the generator. The statement in the article "On a bitwise level, this represents a system of 2k equations in 2k unknowns which uniquely determine the lost data" is flawed in that it does not consider the possibility that the equations are not independent, which is in fact the case for a simple bit-shift/rotate.
For example, for k = 8 and two lost data chunks, calculating A and B as described gives two chunk-level equations. At the bit level, these expand to 16 equations, one for each bit of A and of B.
If you combine the first 15 equations, the result matches the last. Thus the equations are not independent and cannot be uniquely solved. A more suitable generator such as a Linear-feedback shift register mentioned in the general section avoids this issue.
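
The dependence can be confirmed by computing the rank of the bit-level system over GF(2). This is a hedged Python sketch: it assumes k = 8, lost chunks with indices i = 0 and j = 1, syndromes A = Di xor Dj and B = g^i(Di) xor g^j(Dj), and, for comparison, an LFSR-style step that multiplies by 2 modulo the polynomial 0x11D. That polynomial is my assumption for illustration and is not taken from the article. All function names are hypothetical.

```python
def rotl1(v):
    """Rotate an 8-bit value left by one bit."""
    return ((v << 1) | (v >> 7)) & 0xFF


def lfsr_step(v):
    """Multiply by 2 in GF(2^8) modulo x^8+x^4+x^3+x^2+1 (0x11D, assumed)."""
    v <<= 1
    if v & 0x100:
        v ^= 0x11D
    return v


def apply_n(step, v, n):
    """Apply the linear step function n times."""
    for _ in range(n):
        v = step(v)
    return v


def equation_rows(step, i, j, width=8):
    """Rows of the 2*width bit-level equations as bitmasks over the unknowns.

    Columns 0..width-1 are the bits of Di, columns width..2*width-1 those of Dj.
    Rows 0..width-1 are the bits of A, rows width..2*width-1 the bits of B.
    """
    rows = [0] * (2 * width)
    for col in range(2 * width):
        unit = 1 << (col % width)
        power = j if col >= width else i
        b_part = apply_n(step, unit, power)
        for r in range(width):
            if (unit >> r) & 1:
                rows[r] |= 1 << col
            if (b_part >> r) & 1:
                rows[width + r] |= 1 << col
    return rows


def gf2_rank(rows):
    """Rank over GF(2) using XOR elimination keyed by highest set bit."""
    pivots = {}
    for row in rows:
        while row:
            top = row.bit_length() - 1
            if top not in pivots:
                pivots[top] = row
                break
            row ^= pivots[top]
    return len(pivots)


rotate_rows = equation_rows(rotl1, 0, 1)
lfsr_rows = equation_rows(lfsr_step, 0, 1)
print(gf2_rank(rotate_rows), gf2_rank(lfsr_rows))
```

Under these assumptions the rotate-based system has rank 15 out of 16, while the LFSR-based system has full rank.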
Since the section does not reference any sources supporting the use of simple shift/rotate operation as the generator, I expect this may have been some original research which was both incomplete and flawed, and I support its removal. -- Tom N talk/contrib 05:29, 11 September 2022 (UTC)

Outdated performance information

Many of the performance sources are from around 2015 and seem outdated. For example, the RAID 0 Performance section states that it is often applied in gaming, but looking at any recent PC hardware forum discussion (e.g. r/pcgaming/ 2020, Linus Tech Tips forum 2021), it seems that today's SSDs, especially NVMe ones, are fast enough even for simultaneous playing, recording, and streaming. Therefore, RAID 0 has no practical benefits for gaming nowadays and almost nobody does it. And when someone wants a large logical volume, JBOD is recommended instead. Finally, it seems that for non-server use RAID 0 might only help when working with 8K or 4K60 footage that isn't encoded for hardware acceleration, or when compiling a giant codebase, as it might help with faster random IOPS. I wasn't able to find any reliable enough sources for this, so I'm unable to edit the article. Yaqub Kabanoki (talk) 12:56, 27 August 2022 (UTC)

There are lots of uses for RAID0, and the article mentions just a few. I used to use a four-disk RAID0 as an intermediate backup target before writing to tape. Currently, the claim is sourced, so imho you'd need a reliable source to remove it as well, something along the lines of "nobody uses RAID0 for gaming rigs anymore". --Zac67 (talk) 15:18, 27 August 2022 (UTC)