Data loss is an error condition in information systems in which information is destroyed by failures or neglect in storage, transmission, or processing. Information systems implement backup and disaster recovery equipment and processes to prevent data loss or restore lost data.
Data loss is distinguished from data unavailability, which may arise from a network outage. Although the two have substantially similar consequences for users, data unavailability is temporary, while data loss may be permanent. Data loss is also distinct from a data breach, an incident in which data falls into the wrong hands, although the term data loss has also been used for such incidents.
Types of data loss
- Intentional action
  - Intentional deletion of a file or program
- Unintentional action
  - Accidental deletion of a file or program
  - Misplacement of CDs or memory sticks
  - Administration errors
  - Inability to read unknown file format
- Power failure, resulting in data in volatile memory not being saved to permanent memory.
- Hardware failure, such as a head crash in a hard disk.
- A software crash or freeze, resulting in data not being saved.
- Software bugs or poor usability, such as not confirming a file delete command.
- Business failure (vendor bankruptcy), where data is stored with a software vendor using software as a service (SaaS) and SaaS data escrow has not been provisioned.
- Data corruption, such as file system corruption or database corruption.
Studies show hardware failure and human error are the two most common causes of data loss, accounting for roughly three quarters of all incidents. Another cause of data loss is a natural disaster. While the probability of data loss due to natural disaster is small, the only way to prepare for such an event is to store backup data in a separate physical location.
Cost of data loss
The cost of a data loss event is directly related to the value of the data and to the length of time for which the data is needed but unavailable. Consider:
- The cost of continuing without the data
- The cost of recreating the data
- The cost of notifying users in the event of a compromise
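The three components above can be combined into a rough estimate. The following is a minimal sketch, not an established costing model: the `LossEvent` fields, the per-hour and per-notification rates, and the assumption that the components simply add up are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class LossEvent:
    """Hypothetical inputs for estimating the cost of one data loss event."""
    hours_unavailable: float           # time the data is needed but unavailable
    cost_per_hour: float               # cost of continuing without the data, per hour
    recreation_cost: float             # one-off cost of recreating the data
    users_to_notify: int = 0           # relevant only if the data was compromised
    cost_per_notification: float = 0.0


def estimate_cost(event: LossEvent) -> float:
    """Add up the three cost components (an assumed, purely additive model)."""
    continuing = event.hours_unavailable * event.cost_per_hour
    notifying = event.users_to_notify * event.cost_per_notification
    return continuing + event.recreation_cost + notifying
```

In this model the cost of continuing without the data grows linearly with the time the data remains unavailable, which reflects the relationship described above; real costs may well grow faster than that.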
The frequency and impact of data loss can be greatly reduced by taking proper precautions. Different types of data loss demand different precautions. For example, multiple power circuits with battery backup and a generator protect only against power failures. Similarly, a journaling file system and RAID storage protect only against certain types of software and hardware failure. Regular data backups are an important asset when recovering from a data loss event, but they do not prevent user errors or system failures.
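As an illustration of a regular backup step, the following minimal sketch copies a file into a backup directory and checks the copy against the original with a SHA-256 checksum. The function names `sha256_of` and `backup_file` and the single-directory layout are assumptions made for this example; a real backup system would also handle scheduling, versioning, and off-site storage.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source: Path, backup_dir: Path) -> Path:
    """Copy `source` into `backup_dir` and verify that the copy matches."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copies the data and, where possible, metadata
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"backup of {source} failed verification")
    return target
```

Verifying the copy guards against a silently corrupted backup, but it does not protect against the cases noted above: a file deleted or damaged by user error before the backup runs is backed up in its damaged state or not at all.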
Data recovery is often performed by specialized commercial services that have developed their own, often proprietary, methods of recovering data from physically damaged media. Service costs at data recovery labs usually depend on the type of damage and the type of storage medium, as well as on the required security or cleanroom procedures.
File system corruption can frequently be repaired by the user or the system administrator. For example, a deleted file is typically not immediately overwritten on disk, but more often simply has its entry deleted from the file system index. In such a case, the deletion can often be reversed, provided the file's data has not yet been overwritten.
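The behaviour described above can be illustrated with a toy model. The `ToyFileSystem` class below is a hypothetical simplification written for this example, not the design of any real file system: each file occupies exactly one block, and deleting a file only removes its entry from the index.

```python
from typing import Dict, List, Optional


class ToyFileSystem:
    """Toy model: an index maps file names to block numbers."""

    def __init__(self, n_blocks: int) -> None:
        self.blocks: List[Optional[bytes]] = [None] * n_blocks
        self.index: Dict[str, int] = {}
        self.deleted: Dict[str, int] = {}  # entries removed from the index

    def write(self, name: str, data: bytes) -> None:
        in_use = set(self.index.values())
        free = [i for i in range(len(self.blocks)) if i not in in_use]
        if not free:
            raise OSError("no free blocks")
        # Prefer blocks that were never written; otherwise reuse the block
        # of a deleted file, destroying its old contents.
        free.sort(key=lambda i: self.blocks[i] is not None)
        block = free[0]
        self.blocks[block] = data
        self.index[name] = block
        # A deleted file whose block was just reused can no longer be recovered.
        self.deleted = {n: b for n, b in self.deleted.items() if b != block}

    def delete(self, name: str) -> None:
        # Only the index entry is removed; the data stays in its block.
        self.deleted[name] = self.index.pop(name)

    def undelete(self, name: str) -> bool:
        block = self.deleted.pop(name, None)
        if block is None:
            return False  # unknown name, or the block was overwritten
        self.index[name] = block
        return True

    def read(self, name: str) -> bytes:
        return self.blocks[self.index[name]]
```

In this model `undelete` succeeds only while the deleted file's block has not been reused, which is why further writes to a device reduce the chance of recovering data from it.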
Successful recovery from data loss generally requires implementation of an effective backup strategy. Without an implemented backup strategy, recovery requires reinstallation of programs and regeneration of data. Even with an effective backup strategy, restoring a system to the precise state it was in prior to the data loss event is extremely difficult. Some level of compromise between granularity of recoverability and cost is necessary. Furthermore, a data loss event may not be immediately apparent. An effective backup strategy must also consider the cost of maintaining the ability to recover lost data for long periods of time.
A highly effective backup system would have duplicate copies of every file and program that were immediately accessible whenever a data loss event was noticed. However, in most situations, there is an inverse correlation between the value of a unit of data and the length of time it takes to notice the loss of that data. Taking this into consideration, many backup strategies decrease the granularity of restorability as the time since the potential data loss event increases. By this logic, recovery from recent data loss events is easier and more complete than recovery from data loss events that happened further in the past.
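One way to decrease granularity with age is a tiered retention rule. The sketch below is an assumed example of such a rule, not a standard: the function name `backups_to_keep` and the thresholds of 7 and 60 days are hypothetical. It keeps every backup from the most recent days, then the newest backup per ISO week, then the newest backup per calendar month.

```python
from datetime import date
from typing import Iterable, List


def backups_to_keep(backup_dates: Iterable[date], today: date,
                    daily_days: int = 7, weekly_days: int = 60) -> List[date]:
    """Select which backups to retain; all others may be pruned."""
    keep = []
    seen_buckets = set()
    for d in sorted(set(backup_dates), reverse=True):  # newest first
        age = (today - d).days
        if age <= daily_days:
            bucket = ("day", d)                  # keep every recent backup
        elif age <= weekly_days:
            iso = d.isocalendar()
            bucket = ("week", iso[0], iso[1])    # one per ISO year and week
        else:
            bucket = ("month", d.year, d.month)  # one per calendar month
        if bucket not in seen_buckets:
            seen_buckets.add(bucket)
            keep.append(d)
    return sorted(keep)
```

Under this rule a loss noticed within a week can be restored to within a day of the event, while a loss noticed months later can only be restored to a monthly snapshot.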
Recovery is also related to the type of data loss event. Recovering a single lost file is substantially different from recovering an entire system that was destroyed in a disaster. An effective backup regimen has some proportionality between the magnitude of the data loss and the magnitude of the effort required to recover. For example, it should be far easier to restore a single lost file than to recover the entire system.
Initial steps upon data loss
If data loss occurs, there are steps that increase the chances of a successful recovery. First, avoid all write operations to the affected storage device. This includes not starting the system connected to the affected device. Many operating systems create temporary files or files required for booting, and these files may occupy or overwrite the area of lost data, rendering it partially or completely unrecoverable. Other write operations, such as copying, deleting, or altering files, should also be avoided.
Upon realizing that data loss has occurred, it is often best to shut down the computer and remove the affected drive from the unit. Re-attach the drive to a secondary computer through a write blocker device, and attempt to recover the lost data.
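Recovery attempts are commonly made on a copy of the drive, not on the drive itself. The following is a minimal sketch of making such a copy, under stated assumptions: the function name `image_device` is hypothetical, the source would be a device path that depends on the operating system, and read errors from damaged media are not handled. Opening the source read-only in software is not a substitute for a write blocker, since the operating system may still write to an attached drive.

```python
import hashlib
from pathlib import Path


def image_device(source: Path, image: Path, chunk_size: int = 1 << 20) -> str:
    """Copy `source` to a new file `image`; return the SHA-256 of the data read."""
    digest = hashlib.sha256()
    # "rb" opens the source for reading only; this code never writes to it.
    # "xb" refuses to overwrite an existing image file.
    with source.open("rb") as src, image.open("xb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()
```

The returned checksum can be recorded and later compared with the image file to confirm that the working copy has not changed during recovery attempts.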