ISO 9660

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 201.212.44.26 (talk) at 23:50, 14 December 2007 (The 2 GiB (or 4 GiB depending on implementation) file size limit: split infinitive fix). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

ISO 9660, a standard published by the International Organization for Standardization (ISO), defines a file system for CD-ROM media. It aims at supporting different computer operating systems such as Unix, Windows and Mac OS, so that data may be exchanged.

An extension to ISO 9660, the Joliet format, adds support for longer file names and non-ASCII character sets.

DVDs may also use the ISO 9660 file system. However, the UDF file system is more appropriate on DVDs as it has better support for the larger media and is better suited for modern operating system needs.

History

A CD-ROM may be mastered with any kind of data on it. Sun Microsystems, for example, uses the Berkeley UNIX UFS file systems on many CD-ROMs. Silicon Graphics' IRIX installation media uses EFS. Mac OS uses HFS Plus. This restricts them to the producer's operating environment, which, while beneficial in the case of platform-specific software distributions, is not appropriate for widely distributing content. Hence, the need for one volume format that would be accessible on a variety of equipment arose.

Before a standard existed, some producers used the High Sierra format on CD-ROM, which arranged file information in a dense, sequential layout to minimise nonsequential access. The High Sierra file system format uses a hierarchical tree arrangement (up to eight levels of directories deep), similar to UNIX and FAT. High Sierra has a minimal set of file attributes (directory or ordinary file, and time of recording) and name attributes (name, extension, and version). The designers realised they could never get people to agree on a unified definition of file attributes, so only the minimum common information was encoded, and a place for future optional extensions (the system use area) was defined for each file.

High Sierra was adopted (with changes) in December 1986 as an international standard by Ecma International as ECMA-119 [1] and submitted for fast-tracking to the International Organization for Standardization, where it was eventually accepted as ISO 9660:1988. The ISO 9660 file system format is now used throughout the industry.

Specifications

CD-ROM Specifications

The smallest entity in the CD format is called a frame, and holds 24 bytes. Data in a CD-ROM is organized in frames and sectors. A CD-ROM sector contains 98 frames, and holds 2352 bytes.

CD-ROM Mode 1, usually used for computer data, divides the 2352-byte data area defined by the Red Book standards into 12 bytes of synchronisation information, 4 bytes of header data, 2048 bytes of user data and 288 bytes of error correction and detection codes. These codes allow read errors to be detected and corrected, which matters for data such as executables, where even a small error can make the data unusable.
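The figures above can be checked with a little arithmetic. The following is a minimal sketch; the constant and function names are illustrative and do not come from any standard.

```python
# Sizes quoted in the text: a frame holds 24 bytes, a sector holds 98 frames.
FRAME_BYTES = 24
FRAMES_PER_SECTOR = 98
SECTOR_BYTES = FRAME_BYTES * FRAMES_PER_SECTOR

# Mode 1 layout of one sector, in bytes, as described above.
MODE1_LAYOUT = {
    "sync": 12,
    "header": 4,
    "user_data": 2048,
    "error_codes": 288,
}


def layout_total(layout):
    """Return the total number of bytes covered by a sector layout."""
    return sum(layout.values())


print(SECTOR_BYTES, layout_total(MODE1_LAYOUT))
```

Both values come to 2352, so the Mode 1 fields account for every byte of the sector.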

CD-ROM Mode 2 Form 1, also used for computer data, carries the same 2048 bytes of user data per sector as Mode 1, with error correction, but adds an 8-byte subheader. Its use is not recommended for compatibility reasons. [2]

CD-ROM Mode 2 is intended for error-tolerant data such as audio and video. Plain Mode 2 divides the 2352 bytes into 12 bytes of synchronisation information, 4 bytes of header data and 2336 bytes of user data, which is 14% more user data space than Mode 1. Mode 2 Form 2 takes an 8-byte subheader and 4 bytes of error detection code out of those 2336 bytes, leaving 2324 bytes of user data. The extra space is gained by omitting error correction, since a read error in audio or video will only cause a small flaw which may not even be detectable to humans. Video CDs are classified as Mode 2 Form 2.

ISO 9660 Specifications

The first 32768 bytes of the disk are not used by ISO 9660 data structures and are therefore available for other uses. For example, a CD-ROM may contain an alternative file system descriptor in this area; hybrid CDs often use it to offer Mac OS-specific content.

Immediately afterwards, a series of volume descriptors details the contents and kind of information contained on the disk (similar to the BIOS parameter block used by FAT and NTFS formatted disks).

A volume descriptor describes the characteristics of the file system information present on a given CD-ROM, or volume. It is divided into two parts: the type of volume descriptor, and the characteristics of the descriptor.

The volume descriptor is constructed in this manner so that if a program reading the disk does not understand a particular descriptor, it can just skip over it until it finds one it recognises, thus allowing the use of many different types of information on one CD-ROM. Also, if an error were to render a descriptor unreadable, a subsequent redundant copy of a descriptor could then allow for fault recovery.

An ISO 9660 compliant disk contains at least a primary descriptor describing the ISO 9660 file system and a terminating descriptor for indicating the end of the descriptor sequence. Joliet and UDF are examples of file systems adding more descriptors to this sequence.

The primary volume descriptor acts much like the superblock of the Unix File System, providing details on the ISO 9660 compliant portion of the disk. Contained within the primary volume descriptor is the root directory record, which describes the location of the contiguous root directory. (As in UNIX, directories appear as files reserved for the operating system's special use.) Directory entries are stored successively within this region, and evaluation of ISO 9660 filenames begins at this location. The root directory is stored as an extent, or sequential series of sectors, that contains each of the directory entries appearing in the root. In addition, since ISO 9660 works by segmenting the CD-ROM into logical blocks, the size of these blocks is found in the primary volume descriptor as well.

The first field in a Volume Descriptor is the Volume Descriptor Type (type), which can have the following values:

  • Number 0: shall mean that the Volume Descriptor is a Boot Record
  • Number 1: shall mean that the Volume Descriptor is a Primary Volume Descriptor
  • Number 2: shall mean that the Volume Descriptor is a Supplementary Volume Descriptor
  • Number 3: shall mean that the Volume Descriptor is a Volume Partition Descriptor
  • Number 255: shall mean that the Volume Descriptor is a Volume Descriptor Set Terminator.

The second field is called the Standard Identifier and is set to CD001 for a CD-ROM compliant to the ISO 9660 standard.
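The two fields described above are enough to walk the descriptor sequence. The sketch below is illustrative only: it assumes that each descriptor occupies one 2048-byte sector, that the type is a single byte followed directly by the five-byte Standard Identifier, and it runs on a synthetic image built by a hypothetical helper, `make_descriptor`.

```python
SECTOR_SIZE = 2048        # assumed size of one descriptor (one sector)
SYSTEM_AREA_SECTORS = 16  # 16 * 2048 = 32768 bytes unused by ISO 9660

DESCRIPTOR_TYPES = {
    0: "Boot Record",
    1: "Primary Volume Descriptor",
    2: "Supplementary Volume Descriptor",
    3: "Volume Partition Descriptor",
    255: "Volume Descriptor Set Terminator",
}


def list_volume_descriptors(image):
    """Return the names of the volume descriptors found in an image."""
    found = []
    offset = SYSTEM_AREA_SECTORS * SECTOR_SIZE
    while offset + SECTOR_SIZE <= len(image):
        descriptor = image[offset:offset + SECTOR_SIZE]
        vd_type = descriptor[0]          # first field: descriptor type
        if descriptor[1:6] != b"CD001":  # second field: Standard Identifier
            break
        found.append(DESCRIPTOR_TYPES.get(vd_type, "Unknown type %d" % vd_type))
        if vd_type == 255:               # the terminator ends the sequence
            break
        offset += SECTOR_SIZE
    return found


def make_descriptor(vd_type):
    """Build an otherwise empty descriptor (hypothetical demo helper)."""
    return bytes([vd_type]) + b"CD001" + bytes(SECTOR_SIZE - 6)


image = bytes(32768) + make_descriptor(1) + make_descriptor(255)
print(list_volume_descriptors(image))
```

A reader that does not recognise a type can still skip to the next sector, which is the behaviour the descriptor design is meant to allow.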

Another notable field is the Volume Space Size, which records the size of the volume as a number of logical blocks.
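As a rough illustration, the size of a volume in bytes can be estimated by multiplying the Volume Space Size by the logical block size. The sketch assumes that the Volume Space Size counts logical blocks and that the two fields sit at byte offsets 80 and 128 of the primary volume descriptor, as laid out in ECMA-119; these offsets are not given in this article and should be checked against the specification. Only the little-endian copy of each field is read.

```python
import struct

# Assumed byte offsets within the primary volume descriptor (per ECMA-119).
VOLUME_SPACE_SIZE_OFFSET = 80    # 32-bit value, little-endian copy first
LOGICAL_BLOCK_SIZE_OFFSET = 128  # 16-bit value, little-endian copy first


def volume_size_in_bytes(pvd):
    """Return block count times block size for a primary volume descriptor."""
    (block_count,) = struct.unpack_from("<I", pvd, VOLUME_SPACE_SIZE_OFFSET)
    (block_size,) = struct.unpack_from("<H", pvd, LOGICAL_BLOCK_SIZE_OFFSET)
    return block_count * block_size


# Synthetic descriptor for demonstration: 330000 blocks of 2048 bytes.
pvd = bytearray(2048)
struct.pack_into("<I", pvd, VOLUME_SPACE_SIZE_OFFSET, 330000)
struct.pack_into("<H", pvd, LOGICAL_BLOCK_SIZE_OFFSET, 2048)
print(volume_size_in_bytes(pvd))
```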

File attributes are very simple in ISO 9660. The most important attribute indicates whether an entry is a directory or an ordinary file. Attributes for the file described by a directory entry are stored in the directory entry and, optionally, in the extended attribute record.

Overview of the ISO 9660 Directory Structure

There are two ways to locate a file on an ISO 9660 file system. One way is to interpret the directory names successively and look through each directory file structure to find the file (much the way MS-DOS and UNIX find a file). The other way is to use a precompiled table of paths, in which every directory is listed in sequence together with a reference to its parent. Some systems have no mechanism for wandering through directories and obtain a match by consulting the table.

While a large linear table seems a bit arcane, it can be of great value, as one can quickly search without wandering across the disk (thus reducing seek time).
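A table lookup of this kind can be sketched as follows. The model is deliberately simplified and the record layout is an assumption made for illustration: each record holds only a directory name and the number of its parent, records are numbered from 1, and the root is record 1 and is its own parent.

```python
# Simplified, hypothetical path table: (directory name, parent number).
PATH_TABLE = [
    ("", 1),      # 1: root directory
    ("DOCS", 1),  # 2: /DOCS
    ("SRC", 1),   # 3: /SRC
    ("LIB", 3),   # 4: /SRC/LIB
]


def find_directory(path_table, path):
    """Return the record number of a directory path, or None if not found."""
    current = 1  # start at the root
    for part in (p for p in path.split("/") if p):
        for number, (name, parent) in enumerate(path_table, start=1):
            if number != 1 and name == part and parent == current:
                current = number
                break
        else:
            return None  # no record matched this path component
    return current


print(find_directory(PATH_TABLE, "SRC/LIB"))
```

The whole search runs over one contiguous table, without visiting the directory extents themselves, which is where the saving in seek time comes from.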

All multi-byte numeric values are stored twice, in little-endian and big-endian format, either one after another in what the specification calls "both-endian format", or in duplicated data structures such as the path table. It is therefore theoretically possible to author an ISO 9660 image which delivers different content on different architectures.
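A 32-bit both-endian field can be sketched as an eight-byte value holding the little-endian copy followed by the big-endian copy. The function names below are illustrative, and the consistency check is a design choice of this sketch rather than something the format requires of readers.

```python
import struct


def encode_both_endian_32(value):
    """Encode a 32-bit unsigned value as little-endian then big-endian."""
    return struct.pack("<I", value) + struct.pack(">I", value)


def decode_both_endian_32(field):
    """Decode an 8-byte both-endian field, checking that both copies agree."""
    (little,) = struct.unpack("<I", field[:4])
    (big,) = struct.unpack(">I", field[4:8])
    if little != big:
        raise ValueError("little-endian and big-endian copies differ")
    return little


print(encode_both_endian_32(1).hex())
```

A reader that trusts only the copy matching its own byte order would not notice a mismatch, which is how an image could present different content on different architectures.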

Restrictions

File and directory name restrictions

There are different levels to this standard.

  • Level 1: File names are restricted to eight characters with a three-character extension, upper case letters, numbers and underscore; maximum depth of directories is eight.
  • Level 2: File names are not limited to the 8.3 format, but may be as long as the one-byte length counter of the directory entry and the one-byte filename length counter allow. Typically this is close to 180 characters, depending on how many extended attributes are present.
  • Level 3: Files allowed to be fragmented (mainly to allow packet writing, or incremental CD recording).

Other name restrictions:

  • All levels restrict names to upper case letters, digits, underscores ("_") and a dot. Linux converts uppercase letters to lower case while mounting ISO filesystems.
  • File names cannot start or end with the dot character.
  • File names cannot have more than one dot.
  • Directory names cannot use dots at all.
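The rules listed above for Level 1 names can be expressed as a pair of regular expressions. This is a sketch of those rules only: it ignores details the list does not mention, such as version numbers appended to file names, and the function names are illustrative.

```python
import re

# Up to eight characters, optionally followed by a dot and up to three more.
_FILE_NAME = re.compile(r"[A-Z0-9_]{1,8}(\.[A-Z0-9_]{1,3})?")
# Directory names follow the same character rules but may not contain a dot.
_DIRECTORY_NAME = re.compile(r"[A-Z0-9_]{1,8}")


def is_level1_file_name(name):
    """Check a file name against the Level 1 rules listed above."""
    return _FILE_NAME.fullmatch(name) is not None


def is_level1_directory_name(name):
    """Check a directory name against the Level 1 rules listed above."""
    return _DIRECTORY_NAME.fullmatch(name) is not None


for candidate in ("README.TXT", "readme.txt", "LONGFILENAME.TXT", "A.B.C"):
    print(candidate, is_level1_file_name(candidate))
```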

Some CD authoring applications allow the user to use almost any character. While this does not strictly conform to the ISO 9660 standard, most operating systems that can read ISO 9660 file systems have no problem with out-of-spec names. The names may, however, appear wrong to the user.

Directory depth limit

The restrictions on filename length and directory depth (eight levels, including the main directory) are seen by many as the more serious limitations of the file system. Many CD authoring applications attempt to work around them by truncating filenames automatically, but at the risk of breaking applications that rely on a specific file structure.

The 2 GiB (or 4 GiB depending on implementation) file size limit

All numbers in ISO 9660 filesystems, except the single-byte value used for the GMT offset, are unsigned. As the length of a file's extent on disk is stored in a 32-bit value [3], it allows for a maximum length of 4 GiB. (Note: Some older operating systems may handle such values incorrectly, i.e. as signed instead of unsigned, which makes it impossible to access files larger than 2 GiB.)

Based on this, it is often assumed that a file on an ISO 9660 formatted disc cannot be larger than 2^32 - 1 bytes, as the file's size is stored in an unsigned 32-bit value, for which 2^32 - 1 is the maximum.
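The difference between the two limits comes down to how the same four bytes are interpreted. A minimal sketch, using a 3 GiB length as an arbitrary example value:

```python
import struct

MAX_UNSIGNED_32 = 2**32 - 1  # largest length an unsigned 32-bit field holds
MAX_SIGNED_32 = 2**31 - 1    # largest length if the field is read as signed

# A 3 GiB extent length, stored as an unsigned 32-bit little-endian value.
raw = struct.pack("<I", 3 * 2**30)

(as_unsigned,) = struct.unpack("<I", raw)  # correct interpretation
(as_signed,) = struct.unpack("<i", raw)    # faulty signed interpretation

print(as_unsigned, as_signed)
```

Read as signed, the 3 GiB length turns into a negative number, which is why such implementations cannot handle files above 2 GiB.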

It is, however, possible to circumvent this limitation by using the multi-extent (fragmentation) feature of ISO 9660 Level 3. With this, files larger than 4 GiB can be split up into multiple extents (sequential series of sectors), each not exceeding the 4 GiB limit. For example, the free software mkisofs as well as Roxio Toast are able to create ISO 9660 filesystems that use multi-extent files to store files larger than 4 GiB on appropriate media such as recordable DVDs.
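The splitting step can be sketched as below. The extent size is an assumption of this sketch: every extent except the last is capped at the largest multiple of 2048 bytes that fits in an unsigned 32-bit value, so that each one ends on a sector boundary. Real authoring tools may choose a different size.

```python
SECTOR_SIZE = 2048
# Largest sector-aligned length that fits in an unsigned 32-bit value
# (an assumption of this sketch, not a value taken from any tool).
MAX_EXTENT = (2**32 - 1) // SECTOR_SIZE * SECTOR_SIZE


def split_into_extents(file_size, max_extent=MAX_EXTENT):
    """Return the lengths of the extents needed to hold file_size bytes."""
    extents = []
    remaining = file_size
    while remaining > 0:
        length = min(remaining, max_extent)
        extents.append(length)
        remaining -= length
    return extents


print(split_into_extents(5 * 2**30))  # a 5 GiB file needs two extents
```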

Empirical tests with a 4.2 GiB fragmented file on DVD media have shown that Microsoft Windows XP supports this, while Mac OS X (as of 10.4.8) does not handle this case properly. In the case of Mac OS X, the driver appears not to support file fragmentation at all (i.e. it supports ISO 9660 Level 2 but not Level 3). Linux supports multiple extents [4]; FreeBSD only shows and reads the last extent of a multi-extent file.

Limit of number of directories

There is another, less well-known limitation. The ISO image contains a structure called the "path table". For each directory in the image, the path table provides the identifier of its parent directory. The problem is that the directory identifier is a 16-bit number, which limits its range to 1 through 65535 [5]. The content of each directory is also recorded elsewhere, so the path table is redundant and intended only for fast searching. Some operating systems (Windows) use it, while others (Linux) do not. If an ISO image or disk contains more than 65535 directories, it will be readable in Linux, while in Windows all files from the additional directories will be visible but empty (zero length). mkisofs, a popular application for creating ISO images, aborts if there is a path table overflow. Nero Burning ROM (Windows) does not check whether the problem occurs, and produces an invalid ISO file or disk without warning. isovfy cannot easily report this problem. No other 16-bit number in the ISO format causes a similar limitation.
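The overflow can be illustrated by trying to store a parent directory number in 16 bits. This is a sketch with an illustrative function name, not the check that any particular authoring tool performs.

```python
import struct

MAX_DIRECTORY_NUMBER = 2**16 - 1  # 65535, the largest 16-bit value


def encode_parent_number(number):
    """Encode a parent directory number as a 16-bit little-endian value."""
    if not 1 <= number <= MAX_DIRECTORY_NUMBER:
        raise OverflowError("directory number %d does not fit in 16 bits" % number)
    return struct.pack("<H", number)


print(encode_parent_number(65535).hex())
```

A tool that skips such a check and silently truncates the number would record the wrong parent for every directory past the limit.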

Multisession support

ISO 9660 is by design a read-only, pre-mastered file system. This means that all the data has to be collected and then written to the medium in one go. Once written, there is no provision for altering the stored content. ISO 9660 is therefore not suitable for random-writable media such as hard disks.

Recordable CD media (CD-R) provides for multiple session writing, which means that data can be written to disc and made accessible, then later more data can be added to the disc as long as there is unused space left on the disc. (CD-Rs do not support erasing or overwriting once-written data.)

The Multisession extension to ISO 9660 makes use of this feature, mainly by defining a rule for how operating systems read an ISO 9660 volume from a CD-R:

Instead of looking for the volume descriptors at offset 32768 (block number 16 on a CD) from the start of the disc, the operating system shall start reading from the 16th block of the first track of the last session. Block numbers form a contiguous sequence starting at the first session and continuing over added sessions and their gaps. Hence, if a CD mastering program wants to add a single file to a CD-R containing an ISO 9660 volume, it has to append a session containing at least an updated copy of the entire directory tree, plus the new file. The duplicated directory entries can still reference the data files in the previous session(s). In a similar way, file data can be updated or even removed. Removal is, however, only virtual: the removed content no longer appears in the directory shown to the user, but it can still be recovered, as it is still present on the disc.
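The rule amounts to a small offset calculation. The sketch below assumes 2048-byte blocks and takes the starting block number of the first track of the last session as an input; how that number is obtained from the drive is outside its scope, and the example value 11400 is purely hypothetical.

```python
SECTOR_SIZE = 2048        # assumed block size in bytes
DESCRIPTOR_START_BLOCK = 16


def volume_descriptor_offset(last_session_start_block=0):
    """Return the byte offset at which to look for the volume descriptors.

    For a single-session disc the last session starts at block 0, which
    gives the usual offset of 32768 bytes.
    """
    return (last_session_start_block + DESCRIPTOR_START_BLOCK) * SECTOR_SIZE


print(volume_descriptor_offset())       # single-session disc
print(volume_descriptor_offset(11400))  # hypothetical second session
```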

ISO 9660:1999

ISO 9660:1999 is the latest update to the ISO 9660 standard. It improves on various restrictions imposed by the old standard, such as extending the maximum path length to 207 characters, removing the eight-level maximum directory nesting limit, and removing the special meaning of the dot character in filenames.

Disc images

ISO 9660 file system images (ISO images) are a common way to electronically transfer the contents of CD-ROMs. They often have the filename extension .iso (.iso9660 is less common, but also in use) and are commonly referred to as "ISOs". Note that an .iso file may be either:

  1. A single ISO 9660 file system image
  2. A multi-track disc image with a table of contents

Extensions

There are common extensions to ISO 9660 that deal with its limitations. Rock Ridge supports the preservation of Unix-style permissions and longer ASCII-coded names; Joliet supports names stored in Unicode, thus allowing almost any character to be used, even from non-Latin scripts; El Torito enables CDs to be bootable on PCs; Apple ISO 9660 Extensions add support for Mac OS-specific file properties such as resource forks, file backup date and more.

ISO 13490 is basically ISO 9660 with multisession support.

For operating systems which do not support any extensions, there is a name translation file, TRANS.TBL. It should be located in each directory, including the root directory. This mechanism is now obsolete.

Operating system support

Most operating systems support reading of ISO 9660 formatted discs, and most new versions support the extensions such as Rock Ridge and Joliet. Operating systems that do not support the extensions usually show the basic (non-extended) features of a plain ISO 9660 disc.

Here are some operating systems and their support for ISO 9660 and extensions:

  • DOS: access with extensions, such as MSCDEX.EXE (Microsoft CDROM Extension) or CORELCD.EXE
  • Microsoft Windows 95, Windows 98, Windows ME: can read ISO 9660 Level 1, 2, 3, and Joliet
  • Microsoft Windows NT 4, Windows 2000
  • Windows XP can read ISO 9660 Level 1, 2, 3, Joliet, and ISO 9660:1999
  • Linux and BSD: ISO 9660 Level 1, 2, 3, Joliet, Rock Ridge, and ISO 9660:1999
  • Mac OS 7 to 9: ISO Level 1, 2. Optional free software supports Rock Ridge and Joliet (including ISO Level 3): Joke Ridge and Joliet Volume Access.
  • Mac OS X 10.2 Jaguar, 10.3 Panther, 10.4 Tiger: ISO Level 1, 2, Joliet and Rock Ridge Extensions. Level 3 is not currently supported, although users have been able to mount these disks.
  • AmigaOS supports the "AS" extensions (which preserve the Amiga protection bits and file comments)

References

  1. "Volume and File Structure of CDROM for Information Interchange". Ecma International. December 1987.
  2. Media Sciences - Mode and Form differences
  3. ECMA-119 9.1.4
  4. http://lists.freebsd.org/pipermail/freebsd-bugs/2006-April/017786.html
  5. ECMA-119 6.9