Underscore

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 140.163.254.158 (talk) at 16:49, 10 May 2012. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

The underscore [ _ ] (also called understrike, underbar, low line, underdash, or low dash) is a character that originally appeared on the typewriter and was primarily used to underline words. To produce an underlined word, the typist typed the word, moved the typewriter carriage back to the beginning of the word, and overtyped the word with the underscore character.

The underscore is sometimes used to create visual spacing within a sequence of characters where a whitespace character is not permitted, e.g., in computer filenames, e-mail addresses, and World Wide Web URLs. Some computer applications automatically underline text surrounded by underscores: _underlined_ will render underlined. This convention is common in ASCII-only media (e-mail, IRC, instant messaging). When the underscore is used for emphasis in this fashion, it is usually interpreted as indicating that the enclosed text is underlined or italicized (as opposed to bold, which is indicated by *asterisks*).
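The rendering convention described above can be sketched in a few lines of Python. This is only an illustration: the function name render_emphasis and the choice of <u> and <b> as output markup are assumptions made for the example, not the behaviour of any particular application. The sketch treats a marker as emphasis only when it is not attached to a neighbouring word character, so that names such as snake_case are left alone.

```python
import re

def render_emphasis(text: str) -> str:
    """Hypothetical helper: turn _word_ into <u>word</u> and *word* into <b>word</b>."""
    # The lookbehind/lookahead reject markers that touch a word character on the
    # outside, so an underscore inside an identifier is not treated as emphasis.
    text = re.sub(r"(?<!\w)_(\S(?:.*?\S)?)_(?!\w)", r"<u>\1</u>", text)
    text = re.sub(r"(?<!\w)\*(\S(?:.*?\S)?)\*(?!\w)", r"<b>\1</b>", text)
    return text

print(render_emphasis("this is _underlined_ and *bold*"))
print(render_emphasis("keep snake_case alone"))
```

Real applications differ in how they treat nested or unbalanced markers; the sketch makes no attempt to handle those cases.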

The underscore is not the same character as the dash character, although one convention for text news wires is to use an underscore when an em-dash or en-dash is desired, or when other non-standard characters such as bullets would be appropriate. A series of underscores (like [ _________ ]) may be used to create a blank to be filled in on a form. It is also sometimes used to create a horizontal line, if no other method is available; hyphens and dashes are often used for a similar purpose.

The ASCII value of this character is 95. On the standard US or UK 101/102 computer keyboard it shares a key with the hyphen on the top row, to the right of the 0 key.
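The code value can be checked directly. The following short Python sketch (variable names are chosen only for illustration) shows the decimal value 95, its hexadecimal form 5F, and the single byte the character occupies when encoded as ASCII.

```python
# The underscore's position in ASCII: decimal 95, hexadecimal 5F.
underscore = "_"
code_point = ord(underscore)               # integer code of the character
as_hex = format(code_point, "02X")         # the same value written in hexadecimal
ascii_bytes = underscore.encode("ascii")   # a single byte in ASCII

print(code_point)
print(as_hex)
print(ascii_bytes)
```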

Diacritic

The underscore is used as a diacritic mark, "combining low line", in some African languages (some languages using the Orthography of Gabon languages or Rapidolangue in Gabon, Izere in Nigeria) and Native American languages (Shoshoni).

The combining low line should not be confused with the combining macron below.
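The distinction between these characters can be made concrete with Python's standard unicodedata module. The code points used here (U+0332 for the combining low line and U+0331 for the combining macron below) come from the Unicode character database rather than from the text above, and the helper name underlined is hypothetical.

```python
import unicodedata

LOW_LINE = "_"                     # the ordinary spacing underscore
COMBINING_LOW_LINE = "\u0332"      # diacritic attached to the preceding letter
COMBINING_MACRON_BELOW = "\u0331"  # a different, shorter diacritic

def underlined(base: str) -> str:
    """Hypothetical helper: attach a combining low line to a base letter."""
    return base + COMBINING_LOW_LINE

for ch in (LOW_LINE, COMBINING_LOW_LINE, COMBINING_MACRON_BELOW):
    # Print the code point and the official Unicode character name.
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

marked = underlined("a")
print(len(marked))  # two code points: the letter plus the combining mark
```

Whether the marked letter is displayed with a joined underline depends on the font and the rendering software.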

Usage in computing

Origins in identifiers

In programs of any significant size, there is a need for descriptive (hence multi-word) identifiers, like "previous balance" or "end of file". However, spaces are not typically permitted inside identifiers, as they are treated as delimiters between tokens. Writing the words together as in "endoffile" is not satisfactory because the names often become unreadable. Therefore, the programming language COBOL allowed a hyphen ("-") to be used between words of compound identifiers, as in "END-OF-FILE". LISP also allowed the hyphen in names, treating the subtraction operator as an identifier.

Most programming languages, however, interpret the hyphen as a subtraction operator and do not allow the character in identifier names. The common punched card character sets of the early 1960s had no lower-case letters and no special character that would be adequate as a word separator in identifiers. IBM's EBCDIC character coding system, introduced in 1964 at the same time as the IBM System/360 computer series, uses 8 bits per byte. A modest increase in the character set size over earlier character sets added a few punctuation characters, including the underscore, which IBM referred to as the break character, but not lower case (later editions of EBCDIC added lower case). IBM's report on NPL (the early name of what is now called PL/I) leaves the character set undefined, but specifically mentions the break character, and gives RATE_OF_PAY as an example identifier.[1] By 1967, the underscore had spread to ASCII,[2] replacing the similarly shaped left-arrow character (←) previously residing at code point 95 (5F hex) in ASCII-1963 (see also: PIP). C, developed at Bell Labs in the early 1970s, allowed the underscore as an alphabetic character.[3]
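The difference between the hyphen and the underscore in identifiers can be illustrated in Python, used here only as a convenient example of a language that treats the hyphen as subtraction. The variable names are illustrative; str.isidentifier is the standard string method that reports whether a string is a valid Python identifier.

```python
# Which spellings of a multi-word name are accepted as a single identifier?
candidates = ["end_of_file", "end-of-file", "end of file", "endoffile"]
validity = {name: name.isidentifier() for name in candidates}
for name, ok in validity.items():
    print(f"{name!r}: {ok}")

# With a hyphen, the same text is read as three names joined by subtraction.
rate, of, pay = 10, 3, 2
difference = rate-of-pay   # parsed as rate - of - pay
print(difference)

# With underscores it is one name, as in the NPL example from the text.
rate_of_pay = 40
print(rate_of_pay)
```

The run-together form "endoffile" is also a valid identifier, which shows that the objection to it is readability rather than syntax.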

Use in other languages

Ruby and Perl use $_ as a special variable, described as the “default input and pattern matching space”: many input, output, and pattern-matching operations act on that variable when no argument is given, so it may be omitted from the code. In Perl, @_ is a special array variable that holds the arguments to a function.

In some languages with pattern matching, such as Standard ML, OCaml, and Haskell, the pattern _ matches any value, but does not perform binding.

References

  1. NPL Technical Report (PDF). IBM. 1964. p. 23. Retrieved 9 June 2011.
  2. Fischer, Eric. "The Evolution of Character Codes, 1874-1968" (PDF). Retrieved 9 June 2011.
  3. Ritchie, Dennis (1975?). "C Reference Manual" (PDF). Retrieved 9 June 2011.