Character entity reference

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Ligand (talk | contribs) at 10:09, 8 April 2012. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In the markup languages SGML and HTML, a character entity reference is a reference to a particular kind of named entity that has been predefined or explicitly declared in a Document Type Definition (DTD). The "replacement text" of the entity consists of a single character from the Universal Character Set/Unicode.

The purpose of a character entity reference is to provide a way to refer to a character of the Universal Character Set in a document whose character encoding is limited, such as ASCII.
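
As a rough illustration of this purpose, the following Python sketch rewrites a string so that it contains only ASCII characters. It uses a named reference where the HTML 4.01 name table in Python's standard html.entities module has one, and a numeric reference otherwise. The function name to_ascii_references is hypothetical, chosen only for this example.

```python
import html
from html.entities import codepoint2name  # HTML 4.01 names keyed by code point


def to_ascii_references(text):
    """Replace every non-ASCII character with a character reference.

    Illustrative only: ASCII markup characters such as "&" and "<" are
    left untouched, so this is not a general escaping function.
    """
    pieces = []
    for ch in text:
        code_point = ord(ch)
        if code_point < 128:
            pieces.append(ch)  # already representable in ASCII
        elif code_point in codepoint2name:
            pieces.append(f"&{codepoint2name[code_point]};")  # named reference
        else:
            pieces.append(f"&#{code_point};")  # numeric fallback
    return "".join(pieces)


original = "caf\u00e9 \u00a9 \u4e2d"
encoded = to_ascii_references(original)
decoded = html.unescape(encoded)  # a conforming decoder restores the text
```

The encoded string can be stored or transmitted as pure ASCII, and decoding it recovers the original characters.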

Although character references are often called "entity references" or even "entities" in popular usage, this usage is wrong.[citation needed] A character reference refers to a character, not to an entity. An entity reference refers to the content of a named entity. An entity is declared with the <!ENTITY name "value"> syntax in a document type definition (DTD). The name defined in the entity declaration can then be used in the XML document, where it is called an entity reference.
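
The distinction can be sketched with Python's standard xml.etree.ElementTree parser. In this minimal example, which assumes an internal DTD subset and an entity name (copy) chosen only for illustration, the entity reference works only because the entity has been declared, while the numeric character reference needs no declaration.

```python
import xml.etree.ElementTree as ET

# Internal DTD subset declaring an entity named "copy" (an assumption made
# for this example) whose replacement text is the single character U+00A9.
WITH_ENTITY = """<!DOCTYPE p [
  <!ENTITY copy "&#169;">
]>
<p>&copy; Example</p>"""

# The same content written with a numeric character reference; no DTD needed.
WITH_CHAR_REF = "<p>&#169; Example</p>"

entity_text = ET.fromstring(WITH_ENTITY).text
char_ref_text = ET.fromstring(WITH_CHAR_REF).text

# Referencing an entity that was never declared is a parse error.
try:
    ET.fromstring("<p>&copy; Example</p>")
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False
```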

Concepts

HTML and SGML have two relevant concepts:

Predefined entity

A "predefined entity reference" is a reference to one of the special characters denoted by:

name   value   character   code, hex (dec)   meaning
quot   &#34;   "           x22 (34)          (double) quotation mark
amp    &#38;   &           x26 (38)          ampersand
apos   &#39;   '           x27 (39)          apostrophe (= apostrophe-quote)
lt     &#60;   <           x3C (60)          less-than sign
gt     &#62;   >           x3E (62)          greater-than sign
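
The table can be checked with a short Python sketch based on the standard html module. The dictionary PREDEFINED simply restates the name and decimal code columns above, and decode_pair is a hypothetical helper name.

```python
import html

# Name -> decimal code point, restating the table above.
PREDEFINED = {"quot": 34, "amp": 38, "apos": 39, "lt": 60, "gt": 62}


def decode_pair(name, code_point):
    """Decode the named and the numeric form of the same character."""
    named = html.unescape(f"&{name};")
    numeric = html.unescape(f"&#{code_point};")
    return named, numeric


decoded = {name: decode_pair(name, cp) for name, cp in PREDEFINED.items()}

# Going the other way, html.escape produces references for these characters
# (it emits the numeric form &#x27; for the apostrophe).
escaped = html.escape("\"&'<>")
```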

Character coding

A "character reference" is a construct such as &#xa0; or, equivalently, &#160; that refers to a character by its numeric Unicode code point. Here the character code 160 (xA0 in hexadecimal) refers to the non-breaking space, the character that the named reference &nbsp; also denotes.
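
A minimal Python sketch of the same equivalence follows. The helper name numeric_references is hypothetical, and html.unescape from the standard library stands in for an HTML parser's decoding step.

```python
import html


def numeric_references(ch):
    """Return the decimal and hexadecimal character references for ch."""
    code_point = ord(ch)
    return f"&#{code_point};", f"&#x{code_point:x};"


NBSP = "\u00a0"  # non-breaking space, code point 160 (hexadecimal A0)

decimal_ref, hex_ref = numeric_references(NBSP)

# All three spellings decode to the same single character.
from_decimal = html.unescape(decimal_ref)
from_hex = html.unescape(hex_ref)
from_named = html.unescape("&nbsp;")
```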
