Non-breaking space

From Wikipedia, the free encyclopedia

In word processing and digital typesetting, a non-breaking space, also known as a no-break space or non-breakable space (NBSP), is a variant of the space character that prevents an automatic line break (line wrap) at its position.

In certain formats (such as HTML), it also prevents the “collapsing” of multiple consecutive whitespace characters into a single space. The non-breaking space is also known as a hard space or fixed space. In Unicode, the "common" non-breaking space is encoded as U+00A0 (decimal 160, HTML &nbsp;), but there are other width variations.

Uses and variations

Although a non-breaking space looks like an ordinary space and is used in similar ways, it behaves differently in context.[1][2]

Non-breaking behavior

Text-processing software typically assumes that an automatic line break may be inserted anywhere a space character occurs; a non-breaking space prevents this, provided the software recognizes the character. For example, if the text "100 km" does not quite fit at the end of a line, the software may insert a line break between "100" and "km". Where the style guide in use treats this as undesirable, the editor can place a non-breaking space between "100" and "km". The text "100 km" is then never broken: if it does not fit at the end of a line, it is moved in its entirety to the next line.
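The "100 km" case can be reproduced with a small Python sketch. It assumes the standard-library textwrap module, which treats only ASCII whitespace as a break opportunity; the sample sentence and the width of 25 columns are illustrative choices, not part of any style guide.

```python
import textwrap

NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE

# Same sentence twice: once with an ordinary space before "km",
# once with a non-breaking space.
plain = "The distance is about 100 km"
glued = f"The distance is about 100{NBSP}km"

# With an ordinary space, "100" fits on the first line and "km" is orphaned.
wrapped_plain = textwrap.wrap(plain, width=25)

# With a non-breaking space, "100 km" moves to the next line as a unit.
wrapped_glued = textwrap.wrap(glued, width=25)

print(wrapped_plain)
print(wrapped_glued)
```

Other wrapping code may behave differently: a routine that splits on every character its language classifies as whitespace would still break at U+00A0.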

Non-collapsing behavior

A second common application of non-breaking spaces is in plain text file formats such as SGML, HTML, TeX and LaTeX, which treat sequences of whitespace characters (space, newline, tab, form feed, etc.) as if they were a single character. Such "collapsing" of whitespace allows the author to neatly arrange the source text using line breaks, indentation and other forms of spacing without affecting the final typeset result.[3][4]

In contrast, non-breaking spaces are not merged with neighboring whitespace characters when displayed, and can therefore be used by an author to insert additional visible space in the resulting output. Conversely, a non-breaking space placed indiscriminately next to a normal space (see the recommended use in style guides) produces extraneous space in the output.
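The difference can be sketched in Python with a simplified model of whitespace collapsing. The function name collapse_whitespace is hypothetical, and the character class below is an assumption that covers only space, tab, newline, form feed and carriage return; it is not a full implementation of HTML or CSS whitespace processing.

```python
import re

NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE

# Explicit class of collapsible characters. Python's \s is deliberately
# avoided here because, for str patterns, it also matches U+00A0.
_COLLAPSIBLE = re.compile(r"[ \t\n\f\r]+")


def collapse_whitespace(text: str) -> str:
    """Replace each run of collapsible whitespace with a single space."""
    return _COLLAPSIBLE.sub(" ", text)


source = "one   \n\t two"
padded = f"one {NBSP}{NBSP} two"

print(repr(collapse_whitespace(source)))  # the run shrinks to one space
print(repr(collapse_whitespace(padded)))  # the non-breaking spaces survive
```

In the second string the two ordinary spaces and the two non-breaking spaces all remain, which is the "extraneous space" effect described above.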

Width variations

Spaces of other widths are also available; of those listed here, only the narrow no-break space and the word joiner prevent a line break (compare the examples with this standard non-breaking space: " "):

  • en space: HTML &ensp;, U+2002 (decimal 8194). Example: " "
  • em space: HTML &emsp;, U+2003 (decimal 8195). Example: " "
  • thin space: HTML &thinsp;, U+2009 (decimal 8201). Example: " "
    Its non-breaking counterpart is known in Unicode as “Narrow No-Break Space” (U+202F, HTML &#8239;). It was introduced in Unicode 3.0 for Mongolian, to separate a suffix from the word stem without indicating a word boundary. It is also required in French (before ?, ! or ;) and in Russian (before a dash).
  • zero-width non-breaking space (word joiner): encoded in Unicode 3.2 and above as U+2060, and in HTML as &#8288;. The word joiner does not normally produce any space but prohibits a line break on either side of it.
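The code points in this list can be inspected with Python's standard unicodedata module. The sketch below is only a rough check: the helper name describe is hypothetical, and the test for a `<noBreak>` decomposition tag is an assumption that identifies only those characters defined as non-breaking variants of another character. The word joiner has no decomposition, so it reports False even though it prohibits line breaks.

```python
import unicodedata

# Code points mentioned in the list above, plus U+00A0 for comparison.
SPACES = ["\u00a0", "\u2002", "\u2003", "\u2009", "\u202f", "\u2060"]


def describe(ch: str) -> tuple[str, str, bool]:
    """Return (code point, Unicode name, has a <noBreak> decomposition)."""
    code_point = f"U+{ord(ch):04X}"
    name = unicodedata.name(ch)
    no_break = unicodedata.decomposition(ch).startswith("<noBreak>")
    return code_point, name, no_break


for ch in SPACES:
    print(*describe(ch))
```

For a definitive answer on break behavior, the line-breaking class assigned by the Unicode line breaking algorithm is the authoritative source; unicodedata does not expose it.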

Encodings

Format                      Representation of non-breaking space
Unicode and ISO/IEC 10646   U+00A0 no-break space (HTML: &#160; or &nbsp;)
                            Can be encoded in UTF-8 as C2 A0
ISO/IEC 8859                A0
CP1252                      A0
  (MS Windows default in most countries using Germanic or Romance languages)
KOI8-R                      9A
EBCDIC                      41 – RSP, Required Space
CP437, CP850, CP866         FF
HTML (including Wikitext)   Character entity reference: &nbsp;
                            Numeric character references: &#160; or &#xA0;
TeX                         tilde (~)
ASCII                       Not available
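The byte values in the table can be checked against Python's bundled codecs. This is a sketch under two assumptions: that "latin-1" (ISO/IEC 8859-1) stands in for the ISO/IEC 8859 family, and that "cp037" stands in for EBCDIC. The helper names nbsp_bytes and can_encode are hypothetical.

```python
NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE

# Python codec names chosen to match the rows of the table above.
CODECS = ["utf-8", "latin-1", "cp1252", "koi8_r", "cp037", "cp437", "cp850", "cp866"]


def nbsp_bytes(codec: str) -> str:
    """Return the encoded non-breaking space as upper-case hex digits."""
    return NBSP.encode(codec).hex().upper()


def can_encode(codec: str) -> bool:
    """Report whether the codec has any representation for U+00A0."""
    try:
        NBSP.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


for codec in CODECS:
    print(f"{codec:8} {nbsp_bytes(codec)}")

print("ascii   ", "available" if can_encode("ascii") else "not available")
```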

Unicode defines several other non-break space characters; see Width variations above. Encoding remarks:

  • Word joiner, encoded in Unicode 3.2 and above as U+2060, and in HTML as &#x2060; or &#8288;.
  • The Byte Order Mark, U+FEFF, officially named "Zero Width No-Break Space", can also be used with the same meaning as the word joiner, but in current documents this use is deprecated. See also Zero-width non-breaking space.
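The HTML forms can be decoded with the standard-library html module to confirm which code point each one denotes. The sketch assumes the usual references for these characters (&nbsp;, &#160;, &#xA0;, &#x2060; and &#8288;); the helper name decode_reference is hypothetical.

```python
import html

# Character references and the code point each is expected to denote.
REFERENCES = {
    "&nbsp;": "\u00a0",    # named entity for the non-breaking space
    "&#160;": "\u00a0",    # decimal numeric reference
    "&#xA0;": "\u00a0",    # hexadecimal numeric reference
    "&#x2060;": "\u2060",  # word joiner, hexadecimal
    "&#8288;": "\u2060",   # word joiner, decimal
}


def decode_reference(reference: str) -> str:
    """Decode one HTML character reference to the character it denotes."""
    return html.unescape(reference)


for reference, expected in REFERENCES.items():
    decoded = decode_reference(reference)
    print(reference, f"U+{ord(decoded):04X}", decoded == expected)
```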

Keyboard entry methods

It is rare for national or international standards on keyboard layouts to define an input method for the non-breaking space. An exception is the Finnish multilingual keyboard, accepted as the national standard SFS 5966 in 2008. According to the SFS setting, the non-breaking space can be entered with the key combination AltGr + Space.[5]

Typically, authors of keyboard drivers and application programs (e.g., word processors) have devised their own keyboard shortcuts for the non-breaking space. For example:

System/application                              Entry method
Microsoft Windows                               Alt+0+1+6+0
Apple Mac OS X                                  Opt+Space
Linux or Unix using X11                         Compose, Space, Space
GNU Emacs                                       Ctrl+X 8 Space
Vim                                             Ctrl+K, Space, Space; or Ctrl+K, Shift+N, Shift+S
Dreamweaver, LibreOffice, Microsoft Word,       Ctrl+Shift+Space
  OpenOffice.org (since 3.0)
FrameMaker, LyX, OpenOffice.org (before 3.0),   Ctrl+Space
  WordPerfect
Mac Adobe InDesign                              Opt+Cmd+X

Apart from this, applications and environments often provide ways to enter Unicode characters directly by code point, e.g. via the Alt Numpad input method. (The non-breaking space has code point 255 decimal (FF hex) in code pages 437 and 850, and code point 160 decimal (A0 hex) in code page 1252.)
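The code-page values quoted in the parenthesis can be verified by decoding the corresponding single bytes. A minimal Python sketch, assuming Python's bundled cp437, cp850 and cp1252 codecs match the code pages meant here; the helper name char_at is hypothetical.

```python
NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE


def char_at(code: int, codepage: str) -> str:
    """Return the character a single-byte code page assigns to `code`."""
    return bytes([code]).decode(codepage)


# (decimal code, code page) pairs taken from the paragraph above.
for code, codepage in [(255, "cp437"), (255, "cp850"), (160, "cp1252")]:
    is_nbsp = char_at(code, codepage) == NBSP
    print(f"{code} (hex {code:02X}) in {codepage}: non-breaking space = {is_nbsp}")
```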

References

  1. "Justify Just or Just Justify", M. Elyaakoubi and A. Lazrek. Journal of Electronic Publishing, vol. 13, issue 1, 2010. DOI 10.3998/3336451.0013.105.
  2. http://www.chicagomanualofstyle.org/qanda/data/faq/topics/SpecialCharacters.html
  3. "Structure", HTML 4.01, W3, 1999-12-24.
  4. "Text", CSS 2.1, W3.
  5. Kotoistus (2006-12-28), Uusi näppäinasettelu [Status of the new keyboard layout] (presentation) (in Finnish, English), CSC – IT Center for Science. Drafts of the Finnish multilingual keyboard.