Disk image: Difference between revisions

From Wikipedia, the free encyclopedia
merge from virtual disk image
Revision as of 09:39, 13 December 2011

A disk image is a single file or storage device containing the complete contents and structure of a data storage medium or device, such as a hard drive, tape drive, floppy disk, optical disc, or USB flash drive. A disk image is usually made by taking a complete sector-by-sector copy of the source medium, thereby replicating both the structure and the contents of the storage device exactly.
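A sector-by-sector copy can be illustrated with a minimal Python sketch. The function name `copy_image` and the 512-byte block size are assumptions made for illustration only; real imaging tools add error handling, progress reporting, and direct device access.

```python
def copy_image(source_path, image_path, block_size=512):
    """Copy source_path to image_path one block at a time.

    source_path may be an ordinary file or, with sufficient privileges,
    a device node. Returns the number of bytes copied.
    """
    copied = 0
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:          # end of the source medium
                break
            dst.write(block)       # blocks are written in their original order
            copied += len(block)
    return copied
```

Because every block is copied in order, including blocks that hold no file data, the resulting file has the same length and layout as the source.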

Some disk imaging utilities omit unused space from the source media, or compress the data to reduce storage requirements. The resulting files are typically referred to as archive files, as they are not literally disk images.

Disk image file formats may be open standards, such as the ISO image format for optical disc images, or proprietary to particular software applications.

History

Disk images were originally used for backup and disk cloning of floppy disk media, where replication or storage of an exact structure was necessary and efficient.

Uses

Disk images are used heavily for duplication of optical media, including DVDs and Blu-ray discs. They are also used to make perfect clones of hard disks.

Virtualization

A hard disk image is interpreted by a Virtual Machine Monitor as a system hard disk drive. IT administrators and software developers administer these images through offline operations using built-in or third-party tools. A hard disk image for a given Virtual Machine Monitor has a specific file type extension, e.g., .vmdk for VMware VMDK, .vhd for Xen and Microsoft Hyper-V, and .vdi for Oracle VM VirtualBox.

Hard drive imaging is used in several major application areas:

  • Forensic imaging or acquisition is the process in which the entire drive contents are imaged to a file and checksum values (often referred to as “hash values”) are calculated to verify the integrity of the image file, for example in court cases. Forensic images are acquired with software tools, and some hardware cloning tools have added forensic functionality.
  • Drive cloning is typically used to replicate the contents of a hard drive for use in another system. Software-only programs can usually do this, as it generally requires cloning only the file structure and the files themselves.
  • Data recovery imaging, like forensic imaging, is the process of imaging every single sector on the source drive to another medium from which the required files can be retrieved. In data recovery situations, the integrity of the file structure cannot be relied upon, so a complete sector copy is mandatory. The similarities to forensic imaging end there. Forensic images are typically acquired using software tools such as EnCase and FTK, but such tools have significantly limited ability to deal with drives that have hard errors, which is often the case in data recovery and the reason the drive was submitted for recovery in the first place.
Data recovery imaging tools must be able to pre-configure drives by disabling certain attributes (such as SMART and G-List re-mapping), to work with unstable drives (drive or read instability can be caused by minute mechanical wear and other issues), and to read data from “bad sectors.” Read instability is a major factor when working with drives under operating systems such as Windows, because a typical operating system is limited in its ability to deal with drives that take a long time to read. For these reasons, software that relies on the BIOS and operating system to communicate with the hard drive is often unsuccessful in data recovery imaging; separate hardware control of the source hard drive is required to achieve the full spectrum of data recovery imaging. The operating system (through the BIOS) follows a set of protocols for communicating with the drive that cannot be violated, such as when the hard drive detects a bad sector. A hard drive’s protocols may not allow “bad” data to be propagated through to the operating system; instead, firmware on the drive may compensate by rereading sectors until checksums, CRCs, or ECCs pass, or may use ECC data to recreate damaged data.
Data recovery imaging does not necessarily produce an image file. Typically, it is performed drive to drive, so no image file is required.
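Two ideas from the list above can be sketched in Python: computing a hash value to verify an image, and continuing past unreadable sectors while recording where they were. The names `sha256_of_file`, `image_sectors`, and the `read_sector` callback are hypothetical, and SHA-256 is an assumed choice of hash. As the text notes, real data recovery imaging usually needs hardware-level control of the drive, which this sketch does not attempt.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_sectors(read_sector, sector_count, sector_size=512):
    """Image sector_count sectors using a caller-supplied reader.

    read_sector(i) is a hypothetical callback that returns the bytes of
    sector i or raises OSError when the sector cannot be read.
    Unreadable sectors are zero-filled so that later sectors keep their
    original offsets, and their indices are returned as a list.
    """
    image = bytearray()
    bad_sectors = []
    for i in range(sector_count):
        try:
            data = read_sector(i)
        except OSError:
            data = bytes(sector_size)   # placeholder for the lost sector
            bad_sectors.append(i)
        image += data
    return bytes(image), bad_sectors
```

Comparing the digest of the image with a digest recorded at acquisition time shows whether the image file has changed since it was taken.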

Two storage allocation schemes predominate across Virtual Machine Monitor implementations:

  1. Preallocate the entire storage for the virtual disk upon creation
  2. Dynamically grow the storage on demand

In the preallocated scheme, the virtual disk is implemented either as a collection of flat files, typically 2 GB each, collectively called a split flat file, or as a single large monolithic flat file. The preallocated storage scheme is also referred to as thick provisioning.[1]
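The arithmetic behind a split flat file can be sketched as follows. The helper names `split_extents` and `locate` are hypothetical, and the sketch assumes that "2GB" means 2 GiB and that the extents are simply concatenated in order; actual formats may differ.

```python
GIB = 1024 ** 3


def split_extents(disk_size, extent_size=2 * GIB):
    """Return the sizes of the flat files needed to hold disk_size bytes."""
    full, rest = divmod(disk_size, extent_size)
    return [extent_size] * full + ([rest] if rest else [])


def locate(offset, extent_size=2 * GIB):
    """Map a virtual-disk byte offset to (extent index, offset in extent)."""
    return divmod(offset, extent_size)
```

Under these assumptions, a 5 GiB virtual disk is stored as two 2 GiB files and one 1 GiB file, and byte offset 3 GiB falls 1 GiB into the second file.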

In the dynamic growth scheme, the virtual disk can again be implemented using split or monolithic files, except that storage is allocated on demand. Several Virtual Machine Monitor implementations initialize the storage with zeros before providing it to the running virtual machine. The dynamic growth storage scheme is also referred to as thin provisioning.[1]
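The difference between the two schemes can be approximated at the file-system level with a short Python sketch. This is an analogy only: the function names are hypothetical, virtual disk formats generally track allocation inside the image file itself, and whether `truncate` produces a sparse file depends on the host file system.

```python
import os


def create_thick(path, size, chunk=1 << 20):
    """Preallocate: write zeros for every byte of the disk up front."""
    zeros = bytes(chunk)
    remaining = size
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(zeros[:n])
            remaining -= n


def create_thin(path, size):
    """Allocate on demand: set the logical size without writing data."""
    with open(path, "wb") as f:
        f.truncate(size)


def allocated_bytes(path):
    """Bytes actually allocated on disk, or None where unsupported.

    st_blocks counts 512-byte units on POSIX systems and is not
    available on every platform.
    """
    blocks = getattr(os.stat(path), "st_blocks", None)
    return None if blocks is None else blocks * 512
```

Both files report the same logical size, but on file systems that support sparse files the thin one typically occupies far less space until data is written to it.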

There are two modes in which a raw disk can be mapped for use by a virtual machine:

Virtual mode
The mapped disk is presented to the guest operating system as if it were a logical volume or a virtual disk file, and its real hardware characteristics are hidden. In this mode, file locking provides data protection through isolation for concurrent updates, and the copy-on-write operation enables snapshots. Virtual mode also offers portability across storage hardware because it presents the same behavior as a virtual disk file.
Physical mode
In this mode, also called pass-through mode, the Virtual Machine Monitor bypasses the I/O virtualization layer and passes all I/O commands directly to the device. All physical characteristics of the underlying hardware are exposed to the guest operating system. There is no file locking to provide data protection.
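How a snapshot can leave the original data untouched may be sketched with a simplified overlay. The class `CowOverlay` is hypothetical, and it redirects writes to a separate delta instead of copying old blocks aside, which is only one of several ways snapshot mechanisms are built; it is not a description of any particular Virtual Machine Monitor.

```python
class CowOverlay:
    """Simplified snapshot overlay on top of a read-only base image."""

    def __init__(self, base, block_size=512):
        self.base = base              # bytes of the base image, never modified
        self.block_size = block_size
        self.delta = {}               # block index -> bytes written after the snapshot

    def read_block(self, index):
        # Prefer the modified copy; fall back to the base image.
        if index in self.delta:
            return self.delta[index]
        start = index * self.block_size
        return self.base[start:start + self.block_size]

    def write_block(self, index, data):
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        self.delta[index] = bytes(data)
```

Discarding `delta` returns the disk to its snapshot state, since the base image was never written to.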

Software distribution

On computers running Mac OS X, disk images are ubiquitous for software downloads, which are usually obtained with a web browser. The images are typically compressed Apple Disk Image (.dmg suffix) files, and are usually opened by mounting them directly, without using a real disk.

Software packages for Windows are also sometimes distributed as disk images, including ISO images. Although Windows did not natively support mounting disk images to the file system before Windows 7, several software options are available to do this; see Comparison of ISO image software.

System backup

Some backup programs only back up user files; boot information and files locked by the operating system, such as those in use at the time of the backup, may not be saved on some operating systems. A disk image contains all files, faithfully replicating all data. For this reason, disk images are also used for backing up CDs and DVDs.

Files other than software can usually be backed up with file-based backup software, which is preferred because it usually saves time or space: it never copies unused space (as a bit-identical image does), is usually capable of incremental backups, and is generally more flexible. For software files, however, file-based backup solutions may fail to reproduce all necessary characteristics, particularly on Windows systems. For example, certain Windows registry keys use short filenames, which are sometimes not reproduced by file-based backup; some commercial software uses copy protection that causes problems if a file is moved to a different disk sector; and file-based backups do not always reproduce metadata such as security attributes. Creating a bit-identical disk image is one way to ensure that the system backup is exactly the same as the original. Bit-identical images can be made in Linux with dd, which is available on nearly all live CDs.

Most commercial imaging software is "user-friendly" and "automatic" but may not create bit-identical images. Such programs offer most of the same advantages as bit-identical imaging, but they may allow restoring to partitions of a different size or file-allocation size, and thus may not place files on exactly the same sectors. Additionally, if they do not support Windows Vista, they may slightly move or realign partitions and thus make Vista unbootable (see Windows Vista startup process).

Rapid deployment of clone systems

Large enterprises often need to buy or replace computer systems in large numbers. Installing the operating system and programs on each of them one by one requires a lot of time and effort and carries a significant risk of human error. Therefore, system administrators use disk imaging to quickly clone the fully prepared software environment of a reference system. This method saves time and effort and allows administrators to focus on the unique distinctions that each system must bear.

Emulation

Emulators frequently use disk images to simulate the floppy drive of the computer being emulated. This is usually simpler to program than accessing a real floppy drive (particularly if the disks are in a format not supported by the host operating system), and allows a large library of software to be managed.
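Part of the reason an image is simple to program against is that a sector address reduces to a byte offset. The sketch below assumes a raw image of a 1.44 MB floppy (80 cylinders, 2 heads, 18 sectors per track, 512 bytes per sector) with sectors stored in logical order; the function names are hypothetical, and other machines and image formats use different geometries and layouts.

```python
SECTOR_SIZE = 512
CYLINDERS, HEADS, SECTORS_PER_TRACK = 80, 2, 18   # assumed 1.44 MB geometry


def chs_to_lba(cylinder, head, sector):
    """Convert a cylinder/head/sector address to a logical sector number.

    Sector numbers start at 1, as is conventional for CHS addressing.
    """
    if not (0 <= cylinder < CYLINDERS
            and 0 <= head < HEADS
            and 1 <= sector <= SECTORS_PER_TRACK):
        raise ValueError("address outside the assumed geometry")
    return (cylinder * HEADS + head) * SECTORS_PER_TRACK + (sector - 1)


def read_sector(image, cylinder, head, sector):
    """Return one sector from a raw floppy image held in a bytes object."""
    offset = chs_to_lba(cylinder, head, sector) * SECTOR_SIZE
    return image[offset:offset + SECTOR_SIZE]
```

With this geometry the image holds 2,880 sectors, or 1,474,560 bytes, and an emulated drive read becomes a slice of the image.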

Imaging process

Creating a disk image is achieved with a suitable program. Different disk imaging programs have varying capabilities, and may focus on hard drive imaging (including hard drive backup, restore and rollout), or optical media imaging (CD/DVD images).

File formats

In most cases, a file format is tied to a particular software package. The software defines and uses its own, often proprietary, image format, though some formats are widely supported by competing products. An exception to proprietary image formats is the ISO image for optical discs, which collectively includes the ISO 9660 and Universal Disk Format (UDF) formats, both defined by open standards. These formats are supported by nearly all optical disc software packages.
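Because ISO 9660 is an open standard, an image can be recognized without vendor software. The sketch below relies on the standard's layout, in which volume descriptors begin at logical sector 16 (2,048-byte sectors) and carry the identifier "CD001" in bytes 1 to 5. The function name is hypothetical, and the check is deliberately minimal: images that carry only UDF structures use different identifiers and would not be recognized.

```python
ISO_SECTOR_SIZE = 2048
FIRST_DESCRIPTOR_SECTOR = 16      # the first 16 sectors are the system area


def looks_like_iso9660(path):
    """Return True if the file has an ISO 9660 volume descriptor at sector 16."""
    with open(path, "rb") as f:
        f.seek(FIRST_DESCRIPTOR_SECTOR * ISO_SECTOR_SIZE)
        header = f.read(6)        # descriptor type byte followed by the identifier
    return len(header) == 6 and header[1:6] == b"CD001"
```

A file shorter than 32,774 bytes yields a short read and is reported as not matching.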

Software

Software that can be used to write and access disk images includes:

  • dd (Unix)
  • RawWrite and RawWrite2 (MS-DOS)
  • RawWrite for Windows (Microsoft Windows)

See also

References

  1. "VMWare ESX Configuration Guide" (PDF). VMware, Inc. 18 May 2010. http://www.vmware.com/pdf/vsphere4/r40/vsp_40_esx_server_config.pdf. Retrieved 10 December 2010.