In computer main memory, auxiliary storage and computer buses, data redundancy is the existence of data that is additional to the actual data and permits correction of errors in stored or transmitted data. The additional data can be simply a complete copy of the actual data, or only select pieces of data that allow detection of errors and reconstruction of lost or damaged data up to a certain level.
For example, by storing additional check bits alongside each memory word, ECC memory can detect and correct single-bit errors within that word, while RAID 1 mirrors data across two hard disk drives (HDDs) that form one logical storage unit, allowing stored data to survive a complete failure of one drive. Data redundancy can also be used as a measure against silent data corruption; for example, file systems such as Btrfs and ZFS checksum both data and metadata and keep copies of stored data, so that silent corruption can be detected and its effects repaired.
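The combination of checksums and copies can be illustrated with a minimal Python sketch. It is not how Btrfs, ZFS, or any RAID implementation actually works; the `MirroredStore` class, its two in-memory dictionaries standing in for drives, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


def checksum(block: bytes) -> str:
    """Return a hex digest used to detect corruption of a block."""
    return hashlib.sha256(block).hexdigest()


class MirroredStore:
    """Hypothetical store that keeps two copies of every block plus a checksum."""

    def __init__(self) -> None:
        self.copies = [{}, {}]  # two dictionaries stand in for two drives
        self.sums = {}          # checksum recorded at write time

    def write(self, key: str, data: bytes) -> None:
        for copy in self.copies:
            copy[key] = data
        self.sums[key] = checksum(data)

    def read(self, key: str) -> bytes:
        expected = self.sums[key]
        # Detection: find any copy whose checksum still matches.
        good = next(
            (c[key] for c in self.copies if checksum(c[key]) == expected),
            None,
        )
        if good is None:
            raise IOError(f"all copies of {key!r} are corrupted")
        # Repair: overwrite every damaged copy with the intact one.
        for copy in self.copies:
            if checksum(copy[key]) != expected:
                copy[key] = good
        return good


store = MirroredStore()
store.write("block0", b"hello")
store.copies[0]["block0"] = b"hellp"  # simulate silent corruption of one copy
recovered = store.read("block0")      # returns the intact data and repairs copy 0
```

The checksum alone only detects the damage; it is the second copy that makes repair possible, which is why the two forms of redundancy are used together.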
Data redundancy of a different nature occurs in database systems that repeat a field in two or more tables. When introduced deliberately, this is called database denormalization; it is usually done to improve the performance of database queries (shortening the database response time), at the expense of more complicated database management, a risk of corrupting the data, and an increase in the required amount of storage.
For instance, when customer data are duplicated and attached to each product bought, the redundancy is a known source of inconsistency, since the same customer might appear with different values for a given attribute. Data redundancy leads to data anomalies and corruption and should generally be avoided by design; applying database normalization removes redundancy and makes efficient use of storage. Proper use of foreign keys likewise minimizes data redundancy and the chance of destructive anomalies. However, concerns of efficiency and convenience can sometimes result in a redundant data design despite the risk of corrupting the data.
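The customer example can be sketched with Python's built-in `sqlite3` module. The table and column names (`purchases_flat`, `customers`, `purchases`) and the sample rows are hypothetical; the sketch only contrasts a redundant layout with a normalized one linked by a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys unenforced by default
cur = conn.cursor()

# Redundant design: customer attributes are repeated on every purchase row.
cur.execute(
    "CREATE TABLE purchases_flat ("
    "id INTEGER PRIMARY KEY, customer_name TEXT, customer_email TEXT, product TEXT)"
)
cur.executemany(
    "INSERT INTO purchases_flat (customer_name, customer_email, product) VALUES (?, ?, ?)",
    [("Ann", "ann@old.example", "lamp"), ("Ann", "ann@old.example", "desk")],
)
# An update that touches only one of the duplicated rows creates an inconsistency.
cur.execute(
    "UPDATE purchases_flat SET customer_email = 'ann@new.example' WHERE product = 'lamp'"
)
flat_emails = {
    row[0]
    for row in cur.execute(
        "SELECT customer_email FROM purchases_flat WHERE customer_name = 'Ann'"
    )
}

# Normalized design: each customer is stored once and referenced by a foreign key.
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
cur.execute(
    "CREATE TABLE purchases ("
    "id INTEGER PRIMARY KEY, "
    "customer_id INTEGER NOT NULL REFERENCES customers(id), "
    "product TEXT)"
)
cur.execute("INSERT INTO customers (id, name, email) VALUES (1, 'Ann', 'ann@old.example')")
cur.executemany(
    "INSERT INTO purchases (customer_id, product) VALUES (?, ?)",
    [(1, "lamp"), (1, "desk")],
)
# A single update now changes the value seen by every purchase.
cur.execute("UPDATE customers SET email = 'ann@new.example' WHERE id = 1")
normalized_emails = {
    row[0]
    for row in cur.execute(
        "SELECT c.email FROM purchases p JOIN customers c ON c.id = p.customer_id"
    )
}
```

In the redundant table the same customer ends up with two different email addresses, while the normalized tables report a single value. The price is the join needed to read a purchase together with its customer, which is the query cost that denormalization trades away.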