Chipkill
Revision as of 17:25, 13 November 2008
In computer memory systems, Chipkill is IBM's trademark for a form of advanced error checking and correcting (ECC) memory technology that protects against the failure of any single memory chip, as well as multi-bit errors from any portion of a single memory chip. It does this by scattering the bits of each ECC word across multiple memory chips, so that the failure of any one chip affects at most one bit of each ECC word. Because a single-bit error in a word is correctable, memory contents can be reconstructed despite the complete failure of one chip. The equivalent system from Sun Microsystems is called Extended ECC, and the equivalent from HP is called Chipspare. A similar system from Intel is called SDDC (Single Device Data Correction).
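The bit-scattering idea can be illustrated with a small simulation. The sketch below is not IBM's actual code or layout: it assumes a textbook Hamming(7,4) single-error-correcting code and seven hypothetical one-bit-wide chips, with bit i of every codeword stored on chip i. All function names are made up for the illustration.

```python
import random

N_CHIPS = 7  # assumption: one chip per bit of a Hamming(7,4) codeword


def encode(nibble):
    """Encode a 4-bit value as a 7-bit Hamming codeword (bit positions 1..7)."""
    c = [0] * 8  # index 0 is unused so indices match Hamming positions
    c[3], c[5], c[6], c[7] = [(nibble >> i) & 1 for i in range(4)]
    c[1] = c[3] ^ c[5] ^ c[7]  # parity over positions with bit 0 set
    c[2] = c[3] ^ c[6] ^ c[7]  # parity over positions with bit 1 set
    c[4] = c[5] ^ c[6] ^ c[7]  # parity over positions with bit 2 set
    return c[1:]


def decode(bits):
    """Correct at most one flipped bit, then return the 4-bit value."""
    c = [0] + list(bits)
    syndrome = ((c[1] ^ c[3] ^ c[5] ^ c[7])
                | (c[2] ^ c[3] ^ c[6] ^ c[7]) << 1
                | (c[4] ^ c[5] ^ c[6] ^ c[7]) << 2)
    if syndrome:  # a non-zero syndrome is the position of the bad bit
        c[syndrome] ^= 1
    return sum(bit << i for i, bit in enumerate((c[3], c[5], c[6], c[7])))


def store(values):
    """Scatter each codeword across the chips: bit i goes to chip i."""
    chips = [[] for _ in range(N_CHIPS)]
    for value in values:
        for chip, bit in zip(chips, encode(value)):
            chip.append(bit)
    return chips


def load(chips):
    """Gather one bit per chip for every word and decode it."""
    return [decode(word) for word in zip(*chips)]


def kill_chip(chips, index, rng):
    """Simulate a whole-chip failure by replacing its contents with noise."""
    chips[index] = [rng.randint(0, 1) for _ in chips[index]]
```

Storing a few values with `store`, calling `kill_chip` on any one of the seven chips, and then calling `load` returns the original values, because each word sees at most one wrong bit. Real implementations use wider words and multi-bit chips, typically with stronger symbol-based codes, so this shows only the principle.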
Chipkill is frequently combined with dynamic bit-steering, so that if a chip fails (or exceeds a threshold of bit errors), a spare memory chip takes its place. The concept is similar to RAID, which protects against disk failure, but applied to individual memory chips. IBM developed the technology in the early to mid-1990s. An important reliability, availability and serviceability (RAS) feature, Chipkill technology is deployed primarily on SSDs, mainframes and midrange Unix or Linux servers.
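The threshold-and-spare behaviour of bit-steering described above can be sketched as a small bookkeeping model. This is an illustrative assumption, not a description of any vendor's memory controller: the class name, the single spare chip, and the threshold value are all hypothetical, and the step of copying ECC-corrected data onto the spare is only noted in a comment.

```python
class BitSteering:
    """Toy model: logical bit lanes mapped onto physical chips, plus one spare."""

    def __init__(self, n_lanes, threshold):
        self.lane_to_chip = list(range(n_lanes))  # lane i starts on chip i
        self.spare = n_lanes          # index of the unused spare chip
        self.threshold = threshold    # tolerated corrected errors per chip
        self.errors = [0] * (n_lanes + 1)

    def report_corrected_error(self, lane):
        """Count an ECC-corrected error; return True if the lane was steered."""
        chip = self.lane_to_chip[lane]
        self.errors[chip] += 1
        if self.errors[chip] > self.threshold and self.spare is not None:
            # A real controller would first rebuild the lane's contents
            # via ECC and write them to the spare; omitted here.
            self.lane_to_chip[lane] = self.spare
            self.spare = None  # the single spare is now in use
            return True
        return False
```

With `BitSteering(7, 3)`, the first three errors reported on a lane are only counted; the fourth exceeds the threshold and remaps that lane to the spare chip. Later failures on other lanes are still corrected by ECC in this model but can no longer be steered away.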
References
- Timothy J. Dell, A White Paper on the Benefits of Chipkill-Correct ECC for PC Server Main Memory, IBM Microelectronics Division, 1997. http://www.ece.umd.edu/courses/enee759h.S2003/references/chipkill_white_paper.pdf
- Intel® E7500 Chipset MCH Intel® x4 Single Device Data Correction (x4 SDDC) Implementation and Validation, Intel Application Note AP-726, August 2002. http://www.intel.com/Assets/PDF/appnote/292274.pdf