Tag omission

From Wikipedia, the free encyclopedia

Tag omission is an optional SGML feature for minimizing the markup of a document. Whenever the parser can infer a tag from the structure of the document, that tag can be omitted.[1] The feature is enabled or disabled for the whole document in the SGML declaration. The Document Type Definition (DTD) then specifies, for each element, whether its start tag and end tag may be omitted.

Tag omission is one of the main SGML features that XML dropped to simplify parsing.


In the following example, the content model of the <document> element fixes the order of its children: a <title> must come first, followed by one or more <p> elements. Because the parser knows that the title comes first, both the start tag and the end tag of <title> can be omitted. The end tag of <p> can also be omitted, because each paragraph is terminated by the next <p> start tag or by the end of the document.

   <!ELEMENT document - O  (title, p+)>
   <!ELEMENT title    O O  (#PCDATA)>
   <!ELEMENT p        - O  (#PCDATA)>

In this DTD, tag omission is specified for each element by the two characters that follow the element name. Each character is either - (a hyphen, meaning the tag is required) or O (the letter O, meaning the tag may be omitted). The first character applies to the start tag and the second to the end tag.
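The two omission indicators can be read mechanically from declarations of this shape. The following Python sketch is illustrative only: the names `ELEMENT_DECL` and `omission_rules` are invented here, and the regular expression assumes the simple one-name, parenthesized-model form used in this example rather than full SGML declaration syntax.

```python
import re

# Matches only the simple form: <!ELEMENT name S E (model)>,
# where S and E are "-" (required) or "O" (omissible).
ELEMENT_DECL = re.compile(
    r"<!ELEMENT\s+(?P<name>[A-Za-z][\w.-]*)"
    r"\s+(?P<start>[-O])\s+(?P<end>[-O])"
    r"\s+(?P<model>\(.*?\))\s*>"
)


def omission_rules(dtd_text):
    """Return {element name: (start_tag_omissible, end_tag_omissible)}."""
    rules = {}
    for m in ELEMENT_DECL.finditer(dtd_text):
        rules[m.group("name")] = (m.group("start") == "O", m.group("end") == "O")
    return rules


DTD = """
<!ELEMENT document - O  (title, p+) >
<!ELEMENT title O O  (#PCDATA)>
<!ELEMENT p     - O  (#PCDATA)>
"""

rules = omission_rules(DTD)
for name, (start_ok, end_ok) in rules.items():
    print(name, "start omissible:", start_ok, "end omissible:", end_ok)
```

For the DTD above, this reports that only <title> may lose its start tag, while all three elements may lose their end tag.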

A valid document not using tag omission:

   <document>
      <title>Tag Omission</title>
      <p>first paragraph</p>
      <p>second paragraph</p>
      <p>third paragraph</p>
   </document>

A valid document simplified by using tag omission (the <document> start tag stays, because its declaration marks it as required):

   <document>
      Tag Omission
      <p>first paragraph
      <p>second paragraph
      <p>third paragraph
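To make the inference concrete, the sketch below reinserts the omitted tags for this one DTD. It is a toy under stated assumptions, not how a real SGML parser works. The rules for document, title and p are hard-coded, the function name `restore_tags` is invented here, and whitespace around data is simply stripped. It also assumes the <document> start tag is present, as its `- O` declaration requires.

```python
import re

# A token is a start tag, an end tag, or a run of character data.
TOKEN = re.compile(r"<(/?)(\w+)>|([^<]+)")


def restore_tags(text):
    """Reinsert tags omitted under: document = (title, p+); title, p = #PCDATA."""
    out = []
    open_elems = []  # stack of currently open element names

    def open_elem(name):
        out.append(f"<{name}>")
        open_elems.append(name)

    def close_through(name):
        # Close open elements until `name` itself has been closed.
        while open_elems:
            top = open_elems.pop()
            out.append(f"</{top}>")
            if top == name:
                break

    for m in TOKEN.finditer(text):
        closing, name, data = m.groups()
        if data is not None:
            data = data.strip()
            if not data:
                continue  # whitespace between elements (simplification)
            if open_elems and open_elems[-1] == "document":
                open_elem("title")  # data directly in document implies <title>
            out.append(data)
        elif closing:
            if name in open_elems:
                close_through(name)
        else:
            if name == "p" and open_elems and open_elems[-1] in ("title", "p"):
                close_through(open_elems[-1])  # <p> ends an open title or p
            open_elem(name)

    while open_elems:  # end of input closes whatever is still open
        out.append(f"</{open_elems.pop()}>")
    return "".join(out)


minimized = """<document>
   Tag Omission
   <p>first paragraph
   <p>second paragraph
   <p>third paragraph
"""

full = """<document>
   <title>Tag Omission</title>
   <p>first paragraph</p>
   <p>second paragraph</p>
   <p>third paragraph</p>
</document>
"""

print(restore_tags(minimized))
print(restore_tags(minimized) == restore_tags(full))  # True
```

Both inputs normalize to the same fully tagged string, which is the sense in which the minimized document carries the same information as the explicit one.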