End-of-file

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Comp.arch (talk | contribs) at 18:50, 31 March 2023. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In computing, end-of-file (EOF)[1] is a condition in a computer operating system where no more data can be read from a data source. The data source is usually called a file or stream.

Details

In the C standard library, character-reading functions such as getchar return a value equal to the macro EOF to indicate that an end-of-file condition (or a read error) has occurred. The actual value of EOF is implementation-defined but must be a negative int; it is commonly −1, as in glibc[2]. Because EOF lies outside the range of unsigned char, these functions return int rather than char. Block-reading functions such as fread return the number of items read; if this is fewer than requested, either the end of file was reached or an error occurred, and the caller must check feof or ferror to determine which.
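The two reading styles described above can be sketched in C as follows. This is a minimal illustration, not a canonical implementation: the names count_chars, read_block, and enum read_status are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Count the bytes remaining on a stream, one character at a time.
   The result of getc is kept in an int, not a char, so that EOF
   (a negative int) stays distinct from every valid byte value. */
long count_chars(FILE *in)
{
    assert(in != NULL);
    long n = 0;
    int c;
    while ((c = getc(in)) != EOF)
        n++;
    return n;
}

/* Hypothetical status codes for this sketch. */
enum read_status { READ_FULL, READ_EOF, READ_ERROR };

/* Read up to `want` bytes with fread. A short count alone does not say
   why the read stopped, so ferror is consulted to tell an error apart
   from end-of-file. */
size_t read_block(FILE *in, unsigned char *buf, size_t want,
                  enum read_status *status)
{
    assert(in != NULL && buf != NULL && status != NULL);
    size_t got = fread(buf, 1, want, in);
    if (got == want)
        *status = READ_FULL;
    else if (ferror(in))
        *status = READ_ERROR;
    else
        *status = READ_EOF;   /* feof(in) is true here */
    return got;
}
```

A caller would typically loop on read_block until the status is no longer READ_FULL, then process the final partial block before reporting READ_ERROR or finishing normally.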

EOF character

Input from a terminal never really "ends" (unless the device is disconnected), but it is useful to enter more than one "file" at a terminal, so a key sequence is reserved to indicate end of input. In Unix, the translation of the keystroke to EOF is performed by the terminal driver, so a program does not need to distinguish terminals from other input files. By default, the driver converts a Control-D character at the start of a line into an end-of-file indicator: the program's pending read returns zero bytes, exactly as it would at the end of a regular file. To insert an actual Control-D (ASCII 04) character into the input stream, the user precedes it with a "quote" command character (usually Control-V). AmigaDOS is similar but uses Control-\ instead of Control-D.
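Because the terminal driver does the translation, a Unix program can treat every input source the same way. The sketch below assumes a POSIX system and uses the read system call, which returns 0 at end-of-file whether the descriptor refers to a regular file, a pipe, or a terminal on which the user typed Control-D at the start of a line. The function name drain_fd is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Read a descriptor until end-of-file.
   Returns the total number of bytes read, or -1 on a read error. */
long drain_fd(int fd)
{
    assert(fd >= 0);
    char buf[256];
    long total = 0;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0)
            total += n;          /* got data, keep going */
        else if (n == 0)
            return total;        /* zero bytes: end-of-file */
        else if (errno != EINTR)
            return -1;           /* real error (EINTR is simply retried) */
    }
}
```

Calling drain_fd(STDIN_FILENO) behaves the same whether standard input is redirected from a file or typed interactively; no terminal-specific check for the Control-D character appears in the program.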

In DOS and Windows (and in CP/M and many DEC operating systems such as the PDP-6 monitor,[3] RT-11, VMS or TOPS-10[4]), reading from the terminal will never produce an EOF. Instead, programs recognize that the source is a terminal (or other "character device") and interpret a given reserved character or sequence as an end-of-file indicator; most commonly this is an ASCII Control-Z, code 26. Some MS-DOS programs, including parts of the Microsoft MS-DOS shell (COMMAND.COM) and operating-system utility programs (such as EDLIN), treat a Control-Z in a text file as marking the end of meaningful data, and/or append a Control-Z to the end when writing a text file. This was done for two reasons:

  • Backward compatibility with CP/M. The CP/M file system (and also the original 8-bit FAT implemented in Microsoft BASIC) only recorded the lengths of files in multiples of 128-byte "records", so by convention a Control-Z character was used to mark the end of meaningful data if the data ended in the middle of a record. The FAT12 filesystem introduced with 86-DOS and MS-DOS has always recorded the exact byte-length of files, so this was never necessary on DOS.
  • It allows programs to use the same code to read input from both a terminal and a text file.
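The Control-Z convention can be illustrated with a short C sketch. It assumes, for illustration only, that the unused tail of the last 128-byte record is filled with Control-Z characters; the names meaningful_length and pad_to_records are hypothetical and do not come from any CP/M or MS-DOS API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CTRL_Z 0x1A            /* ASCII code 26 */
#define CPM_RECORD_SIZE 128    /* record granularity described above */

/* Length of the meaningful text in a buffer: everything before the
   first Control-Z, or the whole buffer if no Control-Z is present. */
size_t meaningful_length(const unsigned char *data, size_t len)
{
    assert(data != NULL || len == 0);
    const unsigned char *z = len ? memchr(data, CTRL_Z, len) : NULL;
    return z ? (size_t)(z - data) : len;
}

/* Round text_len up to a whole number of records and fill the tail of
   the buffer with Control-Z. Returns the stored size in bytes, or
   (size_t)-1 if the padded data would not fit in `cap` bytes. */
size_t pad_to_records(unsigned char *buf, size_t text_len, size_t cap)
{
    assert(buf != NULL);
    size_t stored = (text_len + CPM_RECORD_SIZE - 1)
                    / CPM_RECORD_SIZE * CPM_RECORD_SIZE;
    if (stored > cap)
        return (size_t)-1;
    memset(buf + text_len, CTRL_Z, stored - text_len);
    return stored;
}
```

A 7-byte text padded this way occupies one 128-byte record, and meaningful_length recovers the original 7 bytes on reading. The same scan works on terminal input, which is the second reason given above: one code path serves both sources.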

In the ANSI X3.27-1969 magnetic tape standard, the end of file was indicated by a tape mark, which consisted of a gap of approximately 3.5 inches of tape followed by a single byte containing the character 13 (hex) for nine-track tapes and 17 (octal) for seven-track tapes.[5] The end of tape, commonly abbreviated as EOT, was indicated by two tape marks. This was the standard used, for example, on the IBM System/360. The reflective strip that announced the impending physical end of the tape was also called an EOT marker.

References

  1. ^ Pollock, Wayne. "Shell Here Document Overview". hccfl.edu. Archived from the original on 2014-05-29. Retrieved 2014-05-28.
  2. ^ "The GNU C Library". www.gnu.org.
  3. ^ "Table of IO Device Characteristics - Console or Teletypewriters". PDP-6 Multiprogramming System Manual (PDF). Maynard, Massachusetts, USA: Digital Equipment Corporation (DEC). 1965. p. 43. DEC-6-0-EX-SYS-UM-IP-PRE00. Archived (PDF) from the original on 2014-07-14. Retrieved 2014-07-10. (1+84+10 pages)
  4. ^ "5.1.1.1. Device Dependent Functions - Data Modes - Full-Duplex Software A(ASCII) and AL(ASCII Line)". PDP-10 Reference Handbook: Communicating with the Monitor - Time-Sharing Monitors (PDF). Vol. 3. Digital Equipment Corporation (DEC). 1969. pp. 5-3 – 5-6 [5-5 (431)]. Archived (PDF) from the original on 2011-11-15. Retrieved 2014-07-10. (207 pages)
  5. ^ "Tape Transfer (Pre-1977): Exchange Media: MARC 21 Specifications for Record Structure, Character Sets, and Exchange Media (Library of Congress)". www.loc.gov.