Magnetic tape data storage
Magnetic tape data storage uses digital recording on magnetic tape to store digital information. Modern magnetic tape is most commonly packaged in cartridges and cassettes. The device that performs actual writing or reading of data is a tape drive. Autoloaders and tape libraries are frequently used to automate cartridge handling.
When storing large amounts of data, tape can be substantially less expensive than disk or other data storage options. Tape storage has always been used with large computer systems. Modern usage is primarily as a high-capacity medium for backups and archives. As of 2013, the highest-capacity tape cartridges can store 8.5 TB of uncompressed data.
- 1 Open reels
- 2 Cartridges and cassettes
- 3 Technical details
- 4 Viability
- 5 Chronological list of tape formats
- 6 See also
- 7 References
- 8 External links
Initially, magnetic tape for data storage was wound on large (10.5 in/26.67 cm) reels. This de facto standard for large computer systems persisted through the late 1980s. Tape cartridges and cassettes were available as early as the mid-1970s and were frequently used with small computer systems. With the introduction of the IBM 3480 cartridge in 1984, large computer systems started to move away from open-reel tapes and towards cartridges.
Magnetic tape was first used to record computer data in 1951 on the Eckert-Mauchly UNIVAC I. The UNISERVO drive's recording medium was a thin metal strip of nickel-plated phosphor bronze, 0.5 inch (12.7 mm) wide. Recording density was 128 characters per inch (198 micrometres per character) on eight tracks at a linear speed of 100 in/s (2.54 m/s), yielding a data rate of 12,800 characters per second. Of the eight tracks, six were data, one was a parity track, and one was a clock, or timing, track. Allowing for the empty space between tape blocks, the actual transfer rate was around 7,200 characters per second. A small reel of mylar tape provided separation between the metal tape and the read/write head.
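The UNISERVO figures above follow from simple arithmetic. The sketch below reproduces them; the function names are illustrative only, and the "effective fraction" is just the ratio of the two transfer rates quoted in the text.

```python
def raw_data_rate(density_cpi, speed_ips):
    """Characters per second passing the head while the tape runs at full speed."""
    return density_cpi * speed_ips


def micrometres_per_character(density_cpi):
    """Length of tape occupied by one character (1 inch = 25,400 micrometres)."""
    return 25_400 / density_cpi


peak = raw_data_rate(128, 100)           # 128 cpi at 100 in/s
pitch = micrometres_per_character(128)   # roughly 198 micrometres
# Share of the peak rate that remains once inter-block space is accounted for,
# using the 7,200 characters per second quoted in the text.
effective_fraction = 7_200 / peak

print(peak, round(pitch, 1), effective_fraction)
```

On these numbers, a little over half of the peak rate is available for data once the space between blocks is taken into account.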
IBM computers from the 1950s used ferrous-oxide coated tape similar to that used in audio recording. IBM's technology soon became the de facto industry standard. Magnetic tape was 0.5 inch (12.7 mm) wide and wound on removable reels up to 10.5 inches (267 mm) in diameter. Different tape lengths were available, with 1,200 feet (370 m) and 2,400 feet (730 m) on one-and-a-half-mil-thick film being somewhat standard. During the 1980s, longer tape lengths such as 3,600 feet (1,100 m) became available using a much thinner PET film. Most tape drives could support a maximum reel size of 10.5 inches (267 mm).
A so-called mini-reel was common for software distribution. These were 7-inch (18 cm) reels, often with no fixed length: as a cost-saving measure, the tape was sized to fit the amount of data recorded on it.
Early IBM tape drives, such as the IBM 727 and IBM 729, were mechanically sophisticated floor-standing drives that used vacuum columns to buffer long u-shaped loops of tape. Between active control of powerful reel motors and vacuum control of these u-shaped tape loops, fast start and stop of the tape at the tape-to-head interface could be achieved: 1.5 ms from stopped tape to full speed of 112.5 inches per second (2.86 m/s). The fast acceleration is possible because the tape mass in the vacuum columns is small; the length of tape buffered in the columns provides time to spin the high inertia reels. When active, the two tape reels thus fed tape into or pulled tape out of the vacuum columns, intermittently spinning in rapid, unsynchronized bursts resulting in visually striking action. Stock shots of such vacuum-column tape drives in motion were widely used to represent "the computer" in movies and television.
Early half-inch tape had 7 parallel tracks of data along the length of the tape, allowing six-bit characters plus one bit of parity to be written across the tape. This was known as 7-track tape. With the introduction of the IBM System/360 mainframe, 9-track tapes were developed to support the new 8-bit characters that it used. Effective recording density increased over time. Common 7-track densities started at 200 cpi, then rose to 556 and finally 800 cpi; 9-track tapes had densities of 800, 1600, and 6250 cpi. This translates into about 5 MB to 140 MB per standard-length (2400 ft) reel of tape. End of file was designated by a tape mark and end of tape by two tape marks.
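As a rough check on the capacity range quoted above, the raw capacity of a reel is its length times the recording density. The sketch below assumes one character per inch-position across the tape and ignores inter-block gaps entirely, so it gives an upper bound; the function name is illustrative.

```python
INCHES_PER_FOOT = 12


def raw_reel_capacity_bytes(length_ft, density_cpi):
    """Upper bound on reel capacity: every inch of tape holds data, no gaps."""
    return length_ft * INCHES_PER_FOOT * density_cpi


for density in (200, 556, 800, 1600, 6250):
    megabytes = raw_reel_capacity_bytes(2400, density) / 1e6
    print(f"{density:>5} cpi: {megabytes:6.1f} MB raw")
```

The gap-free figure at 6250 cpi comes out higher than the 140 MB quoted in the text, which is consistent with part of the tape being lost to inter-block gaps.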
At least partly due to the success of the S/360, 9-track tapes were very widely used throughout the industry during the 1970s and 1980s.
LINCtape and its derivative, DECtape, were variations on this "round tape." They were essentially a personal storage medium. The tape was 0.75 inches (19 mm) wide and featured a fixed formatting track which, unlike standard tape, made it feasible to read and rewrite blocks repeatedly in place. LINCtapes and DECtapes had similar capacity and data transfer rate to the diskettes that displaced them, but their "seek times" were on the order of thirty seconds to a minute. The inter-record gap (IRG) of magnetic tape is 0.75 inches (19 mm).
Cartridges and cassettes
In the context of magnetic tape, the term cassette usually refers to an enclosure that holds two reels with a single span of magnetic tape. The term cartridge is more generic, but frequently means a single reel of tape in a plastic enclosure.
The type of packaging is a large determinant of the load and unload times as well as the length of tape that can be held. A tape drive that uses a single-reel cartridge has a take-up reel in the drive, while cassettes have the take-up reel in the cassette. A tape drive (or "transport" or "deck") uses precisely controlled motors to wind the tape from one reel to the other, passing a read/write head as it does.
A different type of tape cartridge has a continuous loop of tape wound on a special reel that allows tape to be withdrawn from the center of the reel and then wrapped up around the edge. This type is similar to a cassette in that there is no take-up reel inside the tape drive.
The IBM 7340 Hypertape drive, introduced in 1961, used a cassette with 1-inch-wide (2.5 cm) tape capable of holding 2 million 6-bit characters per cassette.
In the 1970s and 1980s, audio Compact Cassettes were frequently used as an inexpensive data storage system for home computers. Compact cassettes were logically, as well as physically, sequential; they had to be rewound and read from the start to load data. Early cartridges were available before personal computers had affordable disk drives, and could be used as random access devices, automatically winding and positioning the tape, albeit with access times of many seconds.
Most modern magnetic tape systems use reels that are fixed inside a cartridge to protect the tape and facilitate handling. Modern cartridge formats include DDS/DAT, DLT and LTO with capacities in the tens to thousands of gigabytes.
Medium width is the primary classification criterion for tape technologies. Half inch has historically been the most common width of tape for high capacity data storage. Many other sizes exist and most were developed to either have smaller packaging or higher capacity.
Recording method is also an important way to classify tape technologies, generally falling into two categories:
The linear method arranges data in long parallel tracks that span the length of the tape. Multiple tape heads simultaneously write parallel tape tracks on a single medium. This method was used in early tape drives. It is the simplest recording method, but has the lowest data density.
A variation on linear technology is linear serpentine recording, which uses more tracks than tape heads. Each head still writes one track at a time. After making a pass over the whole length of the tape, all heads shift slightly and make another pass in the reverse direction, writing another set of tracks. This procedure is repeated until all tracks have been read or written. By using the linear serpentine method, the tape medium can have many more tracks than read/write heads. Compared to simple linear recording, using the same tape length and the same number of heads, the data storage capacity is substantially higher.
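The pass-by-pass behaviour of linear serpentine recording can be sketched as a small planner. The track numbering used here (each head owns a contiguous band of tracks and steps one track per pass) is an assumption made for illustration; real formats define their own layouts.

```python
def serpentine_passes(total_tracks, heads):
    """Return a list of (pass_number, direction, tracks_written) tuples.

    Hypothetical layout: the tape is split into one band per head, and on
    pass p head h writes track h * passes + p, so every head shifts by one
    track between passes and the direction alternates.
    """
    if heads <= 0 or total_tracks % heads != 0:
        raise ValueError("total_tracks must be a positive multiple of heads")
    passes = total_tracks // heads
    plan = []
    for p in range(passes):
        direction = "forward" if p % 2 == 0 else "reverse"
        tracks = [h * passes + p for h in range(heads)]
        plan.append((p, direction, tracks))
    return plan


for pass_number, direction, tracks in serpentine_passes(8, 2):
    print(pass_number, direction, tracks)
```

With 8 tracks and 2 heads the tape is traversed four times, which is why the same length of tape holds more data than under simple linear recording with the same number of heads.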
Scanning recording methods write short dense tracks across the width of the tape medium, not along the length. Tape heads are placed on a drum or disk which rapidly rotates while the relatively slowly moving tape passes it.
An early method used to get a higher data rate than the prevailing linear method was transverse scan. In this method a spinning disk, with the tape heads embedded in the outer edge, is placed perpendicular to the path of the tape. This method is used in Ampex's DCRsi instrumentation data recorders and the old Ampex quadruplex videotape system. Another early method was arcuate scan. In this method, the heads are on the face of a spinning disk which is laid flat against the tape. The path of the tape heads makes an arc.
In a typical format, data is written to tape in blocks with inter-block gaps between them, and each block is written in a single operation with the tape running continuously during the write. However, since the rate at which data is written to or read from the tape drive is not deterministic, a tape drive usually has to cope with a difference between the rate at which data goes on and off the tape and the rate at which data is supplied or demanded by its host.
Various methods have been used alone and in combination to cope with this difference. The tape drive can be stopped, backed up, and restarted (known as shoe-shining, because of increased wear of both medium and head). A large memory buffer can be used to queue the data. The host can assist this process by choosing appropriate block sizes to send to the tape drive. There is a complex tradeoff between block size, the size of the data buffer in the record/playback deck, the percentage of tape lost on inter-block gaps, and read/write throughput.
Finally, modern tape drives offer a speed-matching feature, in which the drive can dynamically decrease the physical tape speed by as much as 50% to avoid shoe-shining.
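The speed-matching idea can be illustrated with a toy decision function. This is only a sketch under stated assumptions: the function name, the linear relation between tape speed and data rate, and the default 50% floor taken from the text are illustrative and do not describe any particular drive's firmware.

```python
def choose_tape_speed(host_rate, native_rate, min_speed_fraction=0.5):
    """Pick a tape speed as a fraction of full speed.

    Returns (speed_fraction, can_stream). If the host cannot keep up even at
    the slowest supported speed, the drive would have to stop and reposition
    (shoe-shine) or rely on its buffer, so can_stream is False.
    """
    if host_rate <= 0 or native_rate <= 0:
        raise ValueError("rates must be positive")
    wanted = min(host_rate / native_rate, 1.0)
    if wanted >= min_speed_fraction:
        return wanted, True
    return min_speed_fraction, False


print(choose_tape_speed(120, 160))  # host slightly slower than the drive
print(choose_tape_speed(40, 160))   # host far slower than the drive
```

In the second call the host supplies data at a quarter of the drive's native rate, which is below the assumed floor, so slowing the tape alone is not enough to keep it streaming.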
The size of the inter-block gap is constant, while the size of the data block is based on the number of bytes in the block. Thus a given length tape reel can hold much less data when written with smaller block sizes than with larger.
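The effect of block size on usable capacity can be estimated as follows. The sketch assumes a fixed gap length and a uniform recording density; the example values (1600 cpi, a 0.75 inch gap as quoted earlier in the text, a 2,400 ft reel) are only illustrative, and the function names are hypothetical.

```python
def tape_utilization(block_bytes, density_cpi, gap_inches):
    """Fraction of tape length that holds data rather than inter-block gap."""
    block_inches = block_bytes / density_cpi
    return block_inches / (block_inches + gap_inches)


def reel_capacity_bytes(length_ft, block_bytes, density_cpi, gap_inches):
    """Bytes that fit on a reel when every block is followed by one gap."""
    tape_inches = length_ft * 12
    inches_per_block = block_bytes / density_cpi + gap_inches
    whole_blocks = int(tape_inches // inches_per_block)
    return whole_blocks * block_bytes


for block in (80, 800, 32_768):
    used = tape_utilization(block, 1600, 0.75)
    total = reel_capacity_bytes(2400, block, 1600, 0.75)
    print(f"{block:>6}-byte blocks: {used:6.1%} of tape used, {total / 1e6:5.1f} MB")
```

Under these assumptions, 80-byte blocks leave most of the tape as gap, while 32 KB blocks use nearly all of it, a difference of more than tenfold in capacity on the same reel.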
Sequential access to data
Tape is characterized by sequential access to data. While tape can provide a very high data transfer rate for streaming long contiguous sequences of data, it takes tens of seconds to reposition the tape head to an arbitrarily chosen place on the tape. In contrast, hard disk technology can perform the equivalent action in tens of milliseconds (three orders of magnitude faster) and can be thought of as offering random access to data.
Logical filesystems require data and metadata to be stored on the data storage medium. Storing metadata in one place and data in another requires a great deal of slow repositioning on most tape systems. As a result, most tape systems use a trivial filesystem in which files are addressed by number, not by filename. Metadata such as file name or modification time is typically not stored at all. Tape labels store such metadata, and they are used for interchanging data between systems. File archiver and backup tools have been created to pack multiple files along with the related metadata into a single "tape file". Serpentine tape drives (e.g. QIC) can improve access time by switching to the appropriate track; tape partitions were used for directory information. The Linear Tape File System is a method of storing file metadata on a separate part of the tape. This makes it possible to copy and paste files or directories to a tape as if it were just another disk, but does not change the fundamental sequential-access nature of tape.
Tape has quite a long latency for random accesses since the deck must wind an average of one-third the tape length to move from one arbitrary data block to another. Most tape systems attempt to alleviate the intrinsic long latency, either by indexing, where a separate lookup table (tape directory) is maintained which gives the physical tape location for a given data block number (a must for serpentine drives), or by marking blocks with a tape mark that can be detected while winding the tape at high speed.
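The one-third figure can be checked numerically. The sketch below assumes block positions spread evenly along a single linear track and equally likely start and end points; it averages the distance over every ordered pair of positions. The function name is illustrative.

```python
def mean_seek_fraction(positions):
    """Average winding distance between two block positions, as a fraction
    of the tape length, over all ordered pairs of evenly spaced positions."""
    if positions < 2:
        raise ValueError("need at least two positions")
    total = 0
    for a in range(positions):
        for b in range(positions):
            total += abs(a - b)
    pairs = positions * positions
    tape_length = positions - 1  # distance from the first to the last position
    return total / pairs / tape_length


print(mean_seek_fraction(200))
```

The result approaches one third as the number of positions grows. Serpentine layouts change the picture, since a block on another track may be reachable with less winding.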
Most tape drives now include some kind of data compression. There are several algorithms which provide similar results: LZ (most), IDRC (Exabyte), ALDC (IBM, QIC) and DLZ1 (DLT). Embedded in tape drive hardware, these compress a relatively small buffer of data at a time, so cannot achieve extremely high compression even of highly redundant data. A ratio of 2:1 is typical, with some vendors claiming 2.6:1 or 3:1. The ratio actually obtained with real data is often less than the stated figure; the compression ratio cannot be relied upon when specifying the capacity of equipment, e.g., a drive claiming a compressed capacity of 500 GB may not be adequate to back up 500 GB of real data. Data that is already stored efficiently may not allow any significant compression; a sparse database may offer much larger factors. Software compression can achieve much better results with sparse data, but uses the host computer's processor, and can slow the backup if it is unable to compress as fast as the data is written.
Some enterprise tape drives can encrypt data (this must be done after compression, as encrypted data cannot be compressed effectively). Symmetric streaming encryption algorithms are also implemented to provide high performance.
The compression algorithms used in low-end products are not the most effective known today, and better results can usually be obtained by turning off hardware compression, using software compression (and encryption if desired) instead.
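How strongly the achievable ratio depends on the data, and why compression has to come before encryption, can be seen with a small software experiment. This sketch uses Python's zlib (DEFLATE) purely as a stand-in; it is not one of the hardware algorithms named above, and the pseudo-random bytes only approximate what well-encrypted data looks like to a compressor.

```python
import random
import zlib


def compression_ratio(data):
    """Original size divided by compressed size (higher is better)."""
    return len(data) / len(zlib.compress(data))


rng = random.Random(0)  # fixed seed so the experiment is repeatable
redundant = b"tape " * 20_000
random_like = bytes(rng.getrandbits(8) for _ in range(100_000))

print(f"highly redundant data: {compression_ratio(redundant):.1f}:1")
print(f"random-looking data:   {compression_ratio(random_like):.3f}:1")
```

The redundant input shrinks dramatically, while the random-looking input does not shrink at all, which is why a drive's quoted compressed capacity cannot be assumed for arbitrary data.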
For decades tape storage has offered cost and storage density advantages over many other storage technologies, such as disk storage. And for decades medium-sized and large-sized data centers have deployed both tape and disk storage to complement each other, with tape the favorite choice for tertiary and archival data storage. Storage technologies continue to advance both functionally and economically, and storage vendors compete aggressively against each other. Analysts are lining up on both sides of the "disk versus tape" argument.
The cost of disk storage has decreased faster than that of tape. Until about the end of the twentieth century, prices and capacities allowed backing up a desktop hard drive to tape, such as inexpensive Travan, much more cheaply and more compactly than backing up to an additional external or removable drive. Later, drive prices dropped, drives with capacities of hundreds to a few thousands of megabytes started to be used on relatively inexpensive machines, and backing up to an external USB drive became cheaper, and the drive more compact, than tape for a non-networked machine used by a business or serious user.
As a basic comparison, mainframe-class tape drives, such as Oracle's Sun StorageTek T10000B, are priced at approximately US$37,000 each, excluding tape libraries. (IBM's TS1130 is also representative of this storage class.) At any single moment in time, each T10000C tape drive can read from or write to one tape cartridge, which can contain up to 5 TB of uncompressed data. Real-world sequential data transfer speeds are high (sustained 240 MB/s for the T10000C and 160 MB/s for the TS1130) compared to disk. However, PC-class hard disks are priced below $200 for 3 TB. One mainframe-class hard disk still has a much lower price than one mainframe-class tape drive, so the economics might favor disk.
However, the key difference is that tape drives can exchange their magnetic media (the cartridges) frequently, while the magnetic media installed inside each hard disk is fixed and cannot be swapped. (The drives themselves could be moved if installed in swappable caddies at extra cost, with extra-cost hot-swappable infrastructure.) Mainframe-class tape drives are almost always installed in robotic tape libraries, which are often quite large and can hold thousands of cartridges. The StorageTek SL8500 library is one representative example. The smallest SL8500 library holds up to 1,448 tape cartridges, for 1.4 petabytes of online uncompressed storage. An equivalent amount of PC-class hard disk storage would be priced at $100,000 or less for the drives. The tape library would likely deliver a higher sustained sequential write speed, the media would be more rugged (for off-site storage), the media would meet or exceed long-term archival storage requirements (for reliable retrieval decades into the future), and the data center power and cooling requirements would be considerably lower. The economics of this comparison are more complicated than a comparison of a single disk spindle with a single tape drive.
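The disk-side figure in this comparison can be reproduced with a short calculation. The sketch assumes 1 TB per cartridge (inferred from 1,448 cartridges giving about 1.4 petabytes), treats tape and disk terabytes as the same unit, and uses the $200 per 3 TB disk price from the text as an upper bound; it counts drives only, not enclosures, controllers, power, or redundancy.

```python
import math


def disks_needed(total_tb, disk_tb):
    """Smallest number of disks whose combined capacity covers total_tb."""
    return math.ceil(total_tb / disk_tb)


cartridges = 1_448
cartridge_tb = 1.0        # assumption inferred from the 1.4 PB library figure
disk_tb = 3
disk_price_usd = 200      # upper bound quoted in the text

library_tb = cartridges * cartridge_tb
disk_count = disks_needed(library_tb, disk_tb)
disk_cost = disk_count * disk_price_usd

print(f"{library_tb:.0f} TB needs {disk_count} disks, at most ${disk_cost:,}")
```

The total lands just under the $100,000 quoted above, which shows that the figure covers the bare drives and nothing else.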
Whether tape's characteristics versus disk are useful or not will depend on the particular data center and its data storage requirements. What has tended to happen in recent years is that the amount of data has grown exponentially, with both disk (especially) and tape participating in the growth. In the early twenty-first century solid state storage encroached on disk's previous near-monopoly in random access non-volatile data storage, while disk pushed into tape's territory to some extent, particularly in situations where sequential data access is only a relatively small part of a particular data center's storage requirements.
Chronological list of tape formats
- Computer data storage
- Magnetic storage
- Tape drive
- Information repository
- Data proliferation
- Tape mark
- Wangtek Corporation, OEM Manual, Series 5099ES/5125ES/5150ES SCSI Interface Streaming 1/4 Inch Tape Cartridge Drive, Rev D, 1991. QFA (Quick File Access) Partition, page 4-29–4-31.
- "In the Tape vs. Disk War, Think Tape AND Disk - Enterprise Systems". Esj.com. 2009-02-17. Retrieved 2012-01-31.
- "HP article on backup for home users, recommending several methods, but not tape, 2011". H71036.www7.hp.com. 2010-03-25. Retrieved 2012-01-31.
- Mainframe-class tape vendors specify terabytes of capacity, while hard disk vendors typically specify trillions of bytes of capacity. Two trillion bytes is approximately 1.8 terabytes.