Data scrubbing

Data scrubbing is an error correction technique in which a background task periodically inspects memory or storage for errors and corrects any it detects using a copy of the data. Scrubbing makes it less likely that single correctable errors will accumulate, and so reduces the risk of uncorrectable errors.

Introduction

Data integrity is a high priority in the writing, reading, storage, transmission, and processing of computer data, both in operating systems and in storage and transmission systems. However, hardware RAID has known data integrity problems, and none of the currently widely used file systems provides sufficient protection against data corruption.^[1]^[2]^[3]^[4]^[5] To address this, data scrubbing routinely checks data for inconsistencies and, more generally, helps prevent data loss from hardware or software failures. Scrubbing is commonly used in memory, disk arrays, FPGAs, and file systems as a mechanism for error detection and correction.^[6]

In RAID or similar disk file systems

With data scrubbing, the RAID controller periodically reads all the disks in a RAID array and checks for defective blocks before applications actually access them. This reduces the probability of silent data corruption and data loss due to bit errors.^[7]
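
On a parity-based array, one form such a pass can take is a consistency check: read every stripe and confirm that the stored parity still equals the XOR of the data blocks. The sketch below is a simplified model, not the behaviour of any particular controller; the stripe layout and the names `xor_blocks` and `scrub_stripes` are assumptions made for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def scrub_stripes(stripes):
    """Return the indices of stripes whose stored parity does not match
    the XOR of their data blocks."""
    inconsistent = []
    for index, (data_blocks, parity) in enumerate(stripes):
        if xor_blocks(data_blocks) != parity:
            inconsistent.append(index)
    return inconsistent


# Two hypothetical stripes of a three-disk array: two data blocks plus parity.
good = ([b"\x0f\xf0", b"\xff\x00"], b"\xf0\xf0")
bad = ([b"\x0f\xf0", b"\xff\x01"], b"\xf0\xf0")  # one flipped bit in a data block

print(scrub_stripes([good, bad]))  # prints [1]
```

A parity mismatch shows that a stripe is inconsistent but not which block is wrong, which is one reason per-block checksums are used by the file systems described later.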

In Dell PowerEdge RAID environments a "patrol read" can perform data scrubbing and preventative maintenance.^[8]

In Linux MD RAID

Linux MD RAID, a software RAID implementation, provides data consistency checks and automated repair of detected inconsistencies. These checks are usually scheduled as a weekly cron job. Maintenance is performed by issuing the operation "check", "repair", or "idle" to each of the examined MD devices. The status of each operation, along with the general RAID status, is always available.^[9]^[10]^[11]
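
As a rough sketch of how these operations are issued: the kernel md documentation describes a `sync_action` file under `/sys/block/<device>/md/` that accepts the operations named above, and a `mismatch_cnt` file that reports the inconsistencies found. The helper names below and the device name `md0` are assumptions for illustration. The demonstration runs against a temporary directory standing in for sysfs, so it is safe to run on any machine; on a real system the writes require root.

```python
import tempfile
from pathlib import Path

VALID_ACTIONS = {"check", "repair", "idle"}


def md_dir(device, sysfs_root="/sys/block"):
    """Directory holding the md attributes of one array."""
    return Path(sysfs_root) / device / "md"


def request_action(device, action, sysfs_root="/sys/block"):
    """Write a scrub operation to the array's sync_action file."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    (md_dir(device, sysfs_root) / "sync_action").write_text(action + "\n")


def read_status(device, sysfs_root="/sys/block"):
    """Read the current operation and the mismatch counter."""
    directory = md_dir(device, sysfs_root)
    return {
        "sync_action": (directory / "sync_action").read_text().strip(),
        "mismatch_cnt": int((directory / "mismatch_cnt").read_text().strip()),
    }


# Demonstration against a fake sysfs tree (hypothetical device "md0").
with tempfile.TemporaryDirectory() as fake_sysfs:
    fake_md = md_dir("md0", fake_sysfs)
    fake_md.mkdir(parents=True)
    (fake_md / "sync_action").write_text("idle\n")
    (fake_md / "mismatch_cnt").write_text("0\n")

    request_action("md0", "check", sysfs_root=fake_sysfs)
    demo_status = read_status("md0", sysfs_root=fake_sysfs)

print(demo_status)
```

A weekly cron job could call `request_action` once per array with the default `sysfs_root`.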

In Btrfs

Btrfs, a copy-on-write (CoW) file system for Linux, provides fault isolation, corruption detection and correction, and file system scrubbing. If the file system detects a checksum mismatch while reading a block, it first tries to obtain (or create) a good copy of the block from another device, provided that internal mirroring or RAID techniques are in use. Btrfs can also run an online check of the entire file system by starting a scrub job in the background. The scrub job scans the entire file system for integrity, reports any bad blocks it finds along the way, and automatically attempts to repair them.^[12]^[13]^[14]
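
The checksum-and-mirror idea described above can be modelled in a few lines. This is a toy simulation, not Btrfs code: blocks are Python byte strings, the two "devices" are lists, and `zlib.crc32` stands in for the file system's real checksum. All names are assumptions for illustration.

```python
import zlib


def checksum(block: bytes) -> int:
    """Stand-in checksum; a real file system stores its own per block."""
    return zlib.crc32(block)


def scrub(mirrors, checksums):
    """Verify every block on every mirror and rewrite bad copies from a
    good one. Returns (repaired, unrecoverable) lists of block indices."""
    repaired, unrecoverable = [], []
    for index, expected in enumerate(checksums):
        copies = [mirror[index] for mirror in mirrors]
        good = next((c for c in copies if checksum(c) == expected), None)
        if good is None:
            unrecoverable.append(index)  # no intact copy left to repair from
            continue
        for mirror in mirrors:
            if checksum(mirror[index]) != expected:
                mirror[index] = good
                if index not in repaired:
                    repaired.append(index)
    return repaired, unrecoverable


blocks = [b"alpha", b"beta", b"gamma"]
checksums = [checksum(b) for b in blocks]
mirror_a = list(blocks)
mirror_b = list(blocks)

mirror_a[1] = b"bexa"    # silent corruption on one device only
mirror_a[2] = b"gamme"   # both copies of block 2 are damaged
mirror_b[2] = b"gamna"

result = scrub([mirror_a, mirror_b], checksums)
print(result)  # prints ([1], [2])
```

Block 1 is repaired from the intact mirror. Block 2 can only be reported, because no copy matches its checksum.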

In ZFS

ZFS, a combined file system and logical volume manager, features (among other things) protection against data corruption, continuous integrity checking, and automatic repair. Sun Microsystems designed ZFS from the ground up with a focus on data integrity, to protect data on disk against bugs in disk firmware, ghost writes, and similar faults.^[15]

ZFS provides a repair tool called "scrub" that examines the stored data and repairs silent data corruption caused by bit rot and other problems.
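
A scrub is normally started from the command line with `zpool scrub` and monitored with `zpool status`. The sketch below wraps those two commands so that a scheduled script could call them. The function names and the pool name `tank` are assumptions for illustration, and `start_scrub` is defined but not called, since it needs a real pool and suitable privileges.

```python
import subprocess


def scrub_command(pool):
    """Command line that starts a scrub of the given pool."""
    return ["zpool", "scrub", pool]


def status_command(pool):
    """Command line that reports pool health and scrub progress."""
    return ["zpool", "status", pool]


def start_scrub(pool):
    """Start a scrub and return the pool status text."""
    subprocess.run(scrub_command(pool), check=True)
    completed = subprocess.run(
        status_command(pool), check=True, capture_output=True, text=True
    )
    return completed.stdout


print(" ".join(scrub_command("tank")))  # prints: zpool scrub tank
```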

In FPGA

In an FPGA, scrubbing means reprogramming the device. Done periodically, it prevents errors from accumulating without the need to first locate an error in the configuration bitstream, which simplifies the design.^[16]
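
The following toy model illustrates periodic reprogramming without error location. It assumes that a known-good ("golden") copy of the bitstream is kept outside the device, which is a common arrangement but is not stated in the text above. The configuration memory is modelled as a `bytearray`, and all names are hypothetical.

```python
def count_upsets(config_memory, golden):
    """Number of bits that differ from the golden bitstream."""
    return sum(bin(a ^ b).count("1") for a, b in zip(config_memory, golden))


def blind_scrub(config_memory, golden):
    """Rewrite the whole configuration from the golden copy without
    first checking where, or whether, it has been corrupted."""
    config_memory[:] = golden


golden = bytes([0b10101010, 0b11110000, 0b00001111])
config = bytearray(golden)

# Simulate two upsets accumulating between scrub passes.
config[1] ^= 0b00000100
config[2] ^= 0b10000000
before = count_upsets(config, golden)

blind_scrub(config, golden)
after = count_upsets(config, golden)

print(before, after)  # prints 2 0
```

`count_upsets` is used here only to show the effect. The scrub itself never needs to compare or locate anything.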

In memory

Memory scrubbing detects and corrects bit errors in computer memory by using ECC memory, other copies of the data, or other error-detecting codes.
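
To make the mechanism concrete, here is a minimal sketch that uses a Hamming(7,4) code as a stand-in for the wider codes used by real ECC memory. Each 4-bit value is stored as a 7-bit codeword, and a scrub pass walks the memory, computes each word's syndrome, and flips back any single bit that has changed. The memory is a plain Python list and all names are assumptions for illustration.

```python
def encode(nibble: int) -> int:
    """Encode 4 data bits as a 7-bit Hamming codeword.
    Bit (p - 1) of the result holds codeword position p (1..7)."""
    data = [(nibble >> i) & 1 for i in range(4)]
    bits = [0] * 8                      # index 0 unused
    bits[3], bits[5], bits[6], bits[7] = data
    bits[1] = bits[3] ^ bits[5] ^ bits[7]   # parity over positions 1,3,5,7
    bits[2] = bits[3] ^ bits[6] ^ bits[7]   # parity over positions 2,3,6,7
    bits[4] = bits[5] ^ bits[6] ^ bits[7]   # parity over positions 4,5,6,7
    return sum(bits[p] << (p - 1) for p in range(1, 8))


def syndrome(word: int) -> int:
    """XOR of the positions of all set bits: 0 for a valid codeword,
    otherwise the position of a single flipped bit."""
    result = 0
    for position in range(1, 8):
        if (word >> (position - 1)) & 1:
            result ^= position
    return result


def scrub(memory: list) -> int:
    """Correct single-bit errors in place; return how many were fixed."""
    corrected = 0
    for address, word in enumerate(memory):
        position = syndrome(word)
        if position:
            memory[address] = word ^ (1 << (position - 1))
            corrected += 1
    return corrected


memory = [encode(n) for n in (0b1011, 0b0000, 0b0110)]
clean = list(memory)
memory[0] ^= 1 << 4   # flip codeword position 5 in the first word
memory[2] ^= 1 << 0   # flip codeword position 1 in the third word

corrected = scrub(memory)
print(corrected, memory == clean)  # prints 2 True
```

The example also shows why the pass is periodic. A code of this kind repairs one flipped bit per word, so each word should be scrubbed before a second error lands in it.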

References

[1] For example UFS, Ext, XFS, JFS, or NTFS. Vijayan Prabhakaran (2006). "IRON FILE SYSTEMS" (PDF). Doctor of Philosophy in Computer Sciences. University of Wisconsin-Madison. Retrieved 9 June 2012.
[2] "Parity Lost and Parity Regained".
[3] "An Analysis of Data Corruption in the Storage Stack" (PDF).
[4] "Impact of Disk Corruption on Open-Source DBMS" (PDF).
[5] "Baarf.com". Baarf.com. Retrieved November 4, 2011.
[6] "Checking ZFS File System Integrity". Oracle Solaris ZFS Administration Guide. Oracle. Retrieved 25 November 2012.
[7] Ulf Troppens, Wolfgang Mueller-Friedt, Rainer Erkens, Rainer Wolafka, Nils Haustein. Storage Networks Explained: Basics and Application of Fibre Channel SAN, NAS, iSCSI, InfiniBand and FCoE. John Wiley and Sons, 2009. p. 39.
[8] "About PERC 6 and CERC 6i Controllers". Retrieved 2013-06-20. The Patrol Read feature is designed as a preventative measure to ensure physical disk health and data integrity. Patrol Read scans for and resolves potential problems on configured physical disks.
[9] "RAID Administration". Linux Raid Wiki. Retrieved 2013-09-20.
[10] "Software RAID and LVM: Data scrubbing". Arch Linux. Retrieved 2013-09-20.
[11] "md.txt kernel documentation". Linux Kernel Documentation. Retrieved 2013-09-20.
[12] "btrfs Wiki: Features". The btrfs Project. Retrieved 2013-09-20.
[13] Bierman, Margaret; Grimmer, Lenz (August 2012). "How I Use the Advanced Capabilities of Btrfs". Retrieved 2013-09-20.
[14] Coekaerts, Wim (2011-09-28). "btrfs scrub - go fix corruptions with mirror copies please!". Retrieved 2013-09-20.
[15] Bonwick, Jeff (2005-12-08). "ZFS End-to-End Data Integrity". Retrieved 2013-09-19.
[16] FPGAs on Mars

External links

Soft Errors in Electronic Memory
