Disk compression

From Wikipedia, the free encyclopedia

A disk compression software utility increases the amount of information that can be stored on a hard disk drive of a given size. Unlike a file compression utility, which compresses only specified files and requires the user to designate them, a disk compression utility works automatically, without the user needing to be aware of its existence. When information is written to the hard disk, the utility compresses it; when information is read back, the utility decompresses it.
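The compress-on-write, decompress-on-read behaviour described above can be illustrated with a minimal sketch. The class name `TransparentCompressedStore` and the use of Python's `zlib` module are assumptions made for illustration only; real utilities used their own algorithms and worked at the level of disk sectors rather than named files.

```python
import zlib


class TransparentCompressedStore:
    """Toy store that compresses on write and decompresses on read (hypothetical)."""

    def __init__(self):
        # name -> compressed bytes; this dict stands in for the hard disk
        self._blocks = {}

    def write(self, name: str, data: bytes) -> None:
        # The caller passes plain data; compression happens behind the scenes.
        self._blocks[name] = zlib.compress(data)

    def read(self, name: str) -> bytes:
        # The caller gets plain data back and never sees the compressed form.
        return zlib.decompress(self._blocks[name])

    def stored_size(self) -> int:
        # Bytes actually occupied on the simulated disk.
        return sum(len(blob) for blob in self._blocks.values())


store = TransparentCompressedStore()
text = b"disk compression " * 1000
store.write("README.TXT", text)
print(len(text), "bytes written,", store.stored_size(), "bytes stored")
```

The calling code only ever handles uncompressed data, which is why applications needed no changes to run on a compressed disk.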

Early disk compression utilities

Early disk compression utilities overrode the standard operating system routines for disk access. Since all software applications accessed the hard disk through these routines, they continued to work after disk compression had been installed.

Disk compression utilities were especially popular in the early 1990s, when microcomputer hard disks were still relatively small (20 to 80 megabytes). Hard drives were also rather expensive at the time, costing roughly 10 USD per megabyte. For users who bought disk compression applications, the software proved, in the short term, a more economical way to gain disk space than replacing their current drive with a larger one. A good disk compression utility could, on average, double the available space with negligible speed loss. Disk compression fell into disuse by the late 1990s, as advances in hard drive technology and manufacturing led to increased capacities and lower prices.

Note: While the most familiar disk compression utilities were designed to work on DOS systems, the concept was not specific to DOS. The utility DiskDoubler, for example, worked on the Apple Macintosh platform.

Modern disk compression utilities

Starting in the 2010s, the higher cost per gigabyte of fast but expensive SSD storage, together with the growth of slate tablets with fixed, non-upgradeable storage, led to a resurgence of disk compression utilities. Windows 10 includes a new option[1] to compress Windows binaries and program files. ZIPmagic offers[2] three disk compression solutions.

Modern disk compression solutions do not override the native operating system routines; instead, they leverage compression technologies built into the operating system. For example, ZIPmagic DriveSpace[3] is based on NTFS compression, which has been a feature of Windows[4] for two decades, having been introduced in Windows NT 3.51; ZIPmagic DoubleSpace[5] similarly extends WIMBoot,[6] a very recent Windows feature.

Common early disk compression utilities

Standalone utilities

The initial compression utilities were sold independently. A user had to specifically choose to install and configure the software.

  • Stacker from Stac Electronics
  • XtraDrive from Integrated Information Technology (IIT)
  • SuperStor Pro from AddStor
  • DoubleDisk Gold from Vertisoft Systems
  • DiskDoubler from Salient Software

Bundled utilities

The idea of bundling disk compression into new machines appealed to resellers and users. Resellers liked that they could claim more storage space; users liked that they did not have to configure the software.

Other utilities

While Windows XP, from Microsoft, included both native support for compression and a command-line utility named 'compact' that compresses files on NTFS volumes, this compression is not implemented as a separate "compressed drive" like those above.

How early stage disk compression works

Disk compression usually creates a single large file, which becomes a virtual hard drive. This is similar to how a single physical hard drive can be partitioned into multiple virtual drives. The compressed drive is accessed via a device driver.
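The idea of a single host file acting as a virtual drive can be sketched as follows. Everything here is an illustrative assumption rather than the format of any real product: the class name `CompressedVolume`, the file name `VOLUME.000`, the 512-byte sector size, the in-memory index, and the use of `zlib`. The sketch also never reclaims the space of overwritten sectors, which a real driver would have to do.

```python
import os
import tempfile
import zlib

SECTOR_SIZE = 512  # assumed logical sector size of the virtual drive


class CompressedVolume:
    """Toy 'device driver' exposing sectors stored compressed in one host file."""

    def __init__(self, host_path):
        self.host_path = host_path
        self.index = {}  # sector number -> (offset, length) inside the host file
        with open(host_path, "wb"):
            pass  # create the empty container file on the host drive

    def write_sector(self, number, data):
        if len(data) != SECTOR_SIZE:
            raise ValueError("a sector must be exactly SECTOR_SIZE bytes")
        blob = zlib.compress(data)
        with open(self.host_path, "r+b") as f:
            f.seek(0, os.SEEK_END)  # append the compressed sector to the container
            offset = f.tell()
            f.write(blob)
        self.index[number] = (offset, len(blob))

    def read_sector(self, number):
        if number not in self.index:
            return bytes(SECTOR_SIZE)  # never-written sectors read back as zeros
        offset, length = self.index[number]
        with open(self.host_path, "rb") as f:
            f.seek(offset)
            return zlib.decompress(f.read(length))


with tempfile.TemporaryDirectory() as host_drive:
    volume = CompressedVolume(os.path.join(host_drive, "VOLUME.000"))
    sector = b"A" * SECTOR_SIZE
    volume.write_sector(0, sector)
    roundtrip = volume.read_sector(0)
    unwritten = volume.read_sector(7)
    container_size = os.path.getsize(volume.host_path)
```

Programs address the volume by sector number only; the mapping to offsets inside the large host file is hidden in the driver.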

Compressing existing drives

A newly created compressed drive was initially empty. The utility that created the drive would usually offer to "compress a current drive". This meant the utility would:

  1. Create an empty compressed drive, stored on the existing drive.
  2. Transfer existing files on the old drive to the new compressed drive.
  3. Increase the size of the new compressed drive as necessary to accommodate more files and allow empty space when done.
  4. Swap the drive letters once all files had been transferred.

Usually certain system files would not be transferred. For example, OS swap files would remain only on the host drive.
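The in-place conversion steps above can be simulated in a few lines. This is only a sketch under stated assumptions: drives are modelled as dictionaries of file contents, `zlib` stands in for the real compression algorithm, and the function name `compress_drive_in_place`, the `keep_on_host` parameter, and the file names are all hypothetical.

```python
import zlib


def compress_drive_in_place(drives, letter, new_host_letter, keep_on_host=()):
    """Simulate converting drive `letter` into a compressed drive.

    `drives` maps a drive letter to a dict of {file name: file bytes}.
    """
    host = drives[letter]
    compressed = {}  # step 1: an empty compressed drive stored on the host
    for name in list(host):
        if name in keep_on_host:
            continue  # system files such as swap files stay on the host drive
        # Steps 2 and 3: move one file at a time. The compressed drive grows
        # as files arrive, while the host frees the space the original used.
        compressed[name] = zlib.compress(host.pop(name))
    # Step 4: swap the drive letters so programs still find files at `letter`.
    drives[new_host_letter] = host
    drives[letter] = compressed
    return drives


drives = {"C": {"REPORT.TXT": b"quarterly " * 500, "SWAPFILE.SWP": b"\0" * 1024}}
compress_drive_in_place(drives, "C", "F", keep_on_host={"SWAPFILE.SWP"})
print(sorted(drives["C"]), sorted(drives["F"]))
```

Moving files one at a time is what allows the conversion to succeed on a nearly full drive, since the space freed by each moved file becomes available to the growing compressed drive.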

Compressing the boot drive

Note that the device driver had to be loaded before the compressed drive could be accessed. A compressed drive C: therefore required changes to the boot process, as follows:

  1. BIOS loads sector 0 of the first physical hard drive (the partition sector).
  2. The partition sector loads sector 0 of the bootable partition, which in this case is the host drive.
  3. Host drive sector 0 loads (in the case of MS-DOS) IO.SYS and begins CONFIG.SYS processing.
  4. The compression device driver is loaded. The compressed drive becomes C:; the host drive usually becomes F:.
  5. Processing continues from the compressed drive.

Performance impacts

On systems with slower hard drives, disk compression could actually increase system performance. This was accomplished in two ways:

  1. Once compressed, there was less data to be stored.
  2. Disk accesses would often be batched together for efficiency.

If the system frequently had to wait for hard drive accesses to complete (I/O bound), converting the hard drive to compressed drives could speed up the system significantly. However, compressing and decompressing the data increased CPU utilization. If the system was already CPU bound, disk compression would decrease overall performance.
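The trade-off between disk time and CPU time can be made concrete with a rough model. The function `read_time_seconds` and every figure below are hypothetical, chosen only to show the direction of the effect, and the model assumes that disk transfer and decompression do not overlap.

```python
def read_time_seconds(size_mb, disk_mb_per_s, ratio=1.0, cpu_mb_per_s=None):
    """Estimate the time to read `size_mb` of data.

    ratio        -- compression ratio (2.0 means the data occupies half the space)
    cpu_mb_per_s -- decompression throughput; None means no compression is used
    """
    disk_time = (size_mb / ratio) / disk_mb_per_s  # less data crosses the disk
    cpu_time = 0.0 if cpu_mb_per_s is None else size_mb / cpu_mb_per_s
    return disk_time + cpu_time


# Hypothetical machine: 10 MB of data on a 0.5 MB/s disk, 2:1 compression.
plain = read_time_seconds(10, disk_mb_per_s=0.5)
io_bound = read_time_seconds(10, disk_mb_per_s=0.5, ratio=2.0, cpu_mb_per_s=2.0)
cpu_bound = read_time_seconds(10, disk_mb_per_s=0.5, ratio=2.0, cpu_mb_per_s=0.5)
print(plain, io_bound, cpu_bound)
```

With a CPU that decompresses faster than the disk delivers data, the halved transfer time outweighs the added CPU work; with a slow CPU, the decompression cost dominates and the compressed read takes longer than the uncompressed one.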

Drawbacks

Some common drawbacks to using disk compression:

  • Not all compression utilities would confirm the absence of errors in the file system before compressing a disk in place. Some errors, such as crosslinked files, could result in additional data loss during the transfer process.[7]
  • The compressed drive was only visible if the device driver was loaded and the compressed drive was mounted. A boot disk, for example, might not contain the driver.
  • Users did not always realize that the large file on the host drive contained the compressed drive. While it was usually "hidden" by default,[8] users who found the large file curious or suspicious were able to delete it. This would normally result in data loss.

Modern disk compression utilities

  • Windows 10 Disk Cleanup Utility from Microsoft (OS based)
  • DoubleSpace from ZIPmagic Software (WIMBoot based)
  • DriveSpace from ZIPmagic Software (NTFS compression based)

How modern disk compression works

WIMBoot

WIMBoot creates a single, read-only, highly compressed WIM file. Pointer files are placed on the target file system in place of the actual files; these pointers work in conjunction with the WOF (Windows Overlay Filter) driver to serve actual file data from the compressed WIM file. Unlike with early stage disk compression tools, when these files are updated, the changes are not compressed or stored back inside the WIM file; instead, any time a file is opened for write access, the WOF driver fully extracts the original compressed file to the target file system, and from that point onwards the file is uncompressed. One drawback of this approach is that disk space is wasted on duplicated storage: files opened for write access under WIMBoot consume space both inside the WIM file and, uncompressed, outside of it.
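The extract-on-write behaviour and the resulting duplicated storage can be modelled with a short sketch. The class `WimBootOverlay` is a hypothetical toy, not the WIM format or the real driver interface, and `zlib` stands in for the actual WIM compression.

```python
import zlib


class WimBootOverlay:
    """Toy model of pointer files backed by a read-only compressed image."""

    def __init__(self, files):
        # The image is built once and never modified afterwards.
        self._image = {name: zlib.compress(data) for name, data in files.items()}
        # Files extracted to the target file system after being opened for write.
        self._extracted = {}

    def read(self, name):
        if name in self._extracted:
            return self._extracted[name]  # served from the uncompressed copy
        return zlib.decompress(self._image[name])  # served from the image

    def write(self, name, data):
        # Opening for write first extracts the full original file ...
        if name not in self._extracted and name in self._image:
            self._extracted[name] = zlib.decompress(self._image[name])
        # ... and the new contents are then stored uncompressed.
        self._extracted[name] = data

    def disk_usage(self):
        image_bytes = sum(len(blob) for blob in self._image.values())
        extracted_bytes = sum(len(data) for data in self._extracted.values())
        return image_bytes, extracted_bytes


overlay = WimBootOverlay({"app.dll": b"code " * 2000})
image_before, extracted_before = overlay.disk_usage()
overlay.write("app.dll", b"patched " * 2000)
image_after, extracted_after = overlay.disk_usage()
print(image_before, extracted_before, image_after, extracted_after)
```

After the write, the image is exactly as large as before, so the modified file is paid for twice: once compressed inside the image and once uncompressed outside it.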

NTFS

NTFS compression is transparent and works on a file-by-file basis. Compressing an entire disk with NTFS compression processes each file on the disk individually. As files are read or written, data is transparently decompressed and recompressed. In the case of large files, recompression often continues in the background for an indeterminate amount of time, until the file system has been able to compress all file data. While NTFS compression is very fast (having been designed for the hardware of the early 1990s), its compression ratios are not nearly as high as those of WIMBoot.
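Per-file compression of this kind is usually described as operating on fixed-size compression units, so that one part of a file can be read or rewritten without touching the rest. The sketch below illustrates that idea under assumptions that do not come from this article: a 4 KiB cluster size, units of 16 clusters, a rule that a unit is stored compressed only when doing so saves at least one cluster, and `zlib` as a stand-in for the algorithm NTFS actually uses. The function names are hypothetical.

```python
import os
import zlib

CLUSTER = 4096       # assumed cluster size in bytes
UNIT = 16 * CLUSTER  # assumed compression unit: 16 clusters (64 KiB here)


def clusters_needed(nbytes):
    return -(-nbytes // CLUSTER)  # ceiling division


def compress_file(data):
    """Return a list of (is_compressed, payload) tuples, one per unit."""
    units = []
    for start in range(0, len(data), UNIT):
        raw = data[start:start + UNIT]
        packed = zlib.compress(raw)
        # Keep the compressed form only if it saves at least one cluster.
        if clusters_needed(len(packed)) < clusters_needed(len(raw)):
            units.append((True, packed))
        else:
            units.append((False, raw))
    return units


def read_unit(units, index):
    # Reading one unit never requires decompressing the others.
    is_compressed, payload = units[index]
    return zlib.decompress(payload) if is_compressed else payload


text = (b"log line\n" * 20000)[:2 * UNIT]  # two units of repetitive text
noise = os.urandom(UNIT)                   # one unit of incompressible data
data = text + noise
units = compress_file(data)
print([is_compressed for is_compressed, _ in units])
```

Working in independent units keeps random access cheap, but it also limits how much redundancy the compressor can exploit, which is one plausible reason for the modest compression ratios mentioned above.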

Performance

NTFS compression has been demonstrated[9] to significantly improve read performance on compressed systems, and similar results are expected for WIMBoot. With either technology, write performance is typically not improved, but the degradation is tolerable.

References

  1. ^ http://www.zdnet.com/hands-on-with-windows-10-preview-build-9879_p3-7000035759/
  2. ^ http://tabtimes.com/news/ittech-os-windows/2014/07/21/zipmagic-compression-doubles-storage-windows-8-tablets
  3. ^ http://www.zip-magic.com/drive-space.html
  4. ^ http://technet.microsoft.com/en-us/library/cc767961.aspx
  5. ^ http://www.zip-magic.com/doublespace.html
  6. ^ http://technet.microsoft.com/en-us/library/dn594399.aspx
  7. ^ In crosslinked files, two files are storing at least part of their data in the same location. At least part of one file (the "bad" file) is always lost in this instance. However, if the "bad" file is copied and then deleted, part of the "good" file is deleted as well. Microsoft ScanDisk was created, in part, to perform a better check of the file system prior to compression than the old MS-DOS CHKDSK utility.
  8. ^ For example, DOS associated up to four attributes with files: System, Hidden, Read-Only, and Archivable. Files with the System or Hidden attributes are often not displayed by default. Files with the System or Read-Only attribute cannot be deleted with the "Erase" (or "Del") DOS command. Most compression utilities would mark the drive file with at least one or more of the System, Hidden, and Read-Only attributes (many would use all three). However, files marked with such attributes can be viewed and deleted by other means. In addition, the user can also remove attributes.
  9. ^ http://forums.anandtech.com/showthread.php?t=2249021
