Document type declaration
A document type declaration, or DOCTYPE, is an instruction that associates a particular SGML or XML document (for example, a webpage) with a document type definition (DTD) (for example, the formal definition of a particular version of HTML). In the serialized form of the document, it manifests as a short string of markup that conforms to a particular syntax.
The HTML layout engines in modern web browsers perform DOCTYPE "sniffing" or "switching", wherein the DOCTYPE in a document served as text/html determines a layout mode, such as "quirks mode" or "standards mode". The text/html serialization of HTML5, which is not SGML-based, uses the DOCTYPE only for mode selection. Since web browsers are implemented with special-purpose HTML parsers, rather than general-purpose DTD-based parsers, they do not use DTDs and never access them, even if a URL is provided. The DOCTYPE is retained in HTML5 as a "mostly useless, but required" header only to trigger "standards mode" in common browsers.
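The mode selection described above can be illustrated with a short Python sketch. It assumes the DOCTYPE has already been parsed into its public and system identifiers, and it covers only a small subset of the rules browsers actually apply (the full list of special-cased identifiers is much longer). The function name select_mode and the intermediate "limited-quirks" label are choices made for this illustration.

```python
from typing import Optional

# Lower-cased public-identifier prefixes that receive special treatment.
# Illustrative subset only; real browsers check many more identifiers.
HTML401_LOOSE = (
    "-//w3c//dtd html 4.01 transitional//",
    "-//w3c//dtd html 4.01 frameset//",
)
XHTML10_LOOSE = (
    "-//w3c//dtd xhtml 1.0 transitional//",
    "-//w3c//dtd xhtml 1.0 frameset//",
)


def select_mode(has_doctype: bool,
                public_id: Optional[str] = None,
                system_id: Optional[str] = None) -> str:
    """Return a layout mode for a text/html document (simplified)."""
    if not has_doctype:
        # A document without any DOCTYPE is rendered in quirks mode.
        return "quirks"
    pid = (public_id or "").lower()
    if pid.startswith(HTML401_LOOSE):
        # The loose HTML 4.01 DTDs depend on whether a system identifier is present.
        return "quirks" if system_id is None else "limited-quirks"
    if pid.startswith(XHTML10_LOOSE):
        return "limited-quirks"
    # Everything else, including the bare <!DOCTYPE html>, selects standards mode.
    return "standards"
```

Note that the function never dereferences the system identifier: the identifiers are compared as plain strings, which matches the point that browsers do not fetch DTDs.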
The general syntax for a document type declaration is:
<!DOCTYPE root-element PUBLIC "FPI" ["URI"] [ <!-- internal subset declarations --> ]>
<!DOCTYPE root-element SYSTEM "URI" [ <!-- internal subset declarations --> ]>
In XML, the root element that represents the document is the first element in the document. For example, in XHTML, the root element is <html>, being the first element opened (after the doctype declaration) and the last closed. The keywords SYSTEM and PUBLIC indicate what kind of Document Type Definition (DTD) is referenced (one that is on a private system or one that is open to the public). If the PUBLIC keyword is chosen, it is followed by a restricted form of "public identifier" called a Formal Public Identifier (FPI), enclosed in double quotation marks. The FPI must then be followed by a "system identifier", likewise enclosed in double quotation marks. For example, the FPI for XHTML 1.1 is "-//W3C//DTD XHTML 1.1//EN", and there are three possible system identifiers available for XHTML 1.1, depending on the needs; one of them is the URI reference "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd". If, instead, the SYSTEM keyword is chosen, only a system identifier is given. This means that the XML parser must locate the DTD in a system-specific fashion, in this case by means of a URI reference to the DTD enclosed in double quotation marks. The last part, surrounded by literal square brackets ([ ]), is called an internal subset, which can be used to add or edit entities or to add or edit PUBLIC keyword behaviors. The internal subset is always optional (and sometimes even forbidden within simple SGML profiles, notably those for basic HTML parsers that do not implement a full SGML parser).
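As an illustration of the syntax just described, the following Python sketch splits a document type declaration into its parts with a regular expression. It is a simplification, not a conforming XML or SGML parser: it assumes double-quoted identifiers, an upper-case DOCTYPE keyword, and an internal subset that contains no "]" character. The names Doctype and parse_doctype, and the "note.dtd" example used further below, are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Doctype(NamedTuple):
    root: str                       # declared root element name
    keyword: Optional[str]          # "PUBLIC", "SYSTEM", or None
    public_id: Optional[str]        # the FPI, if PUBLIC was used
    system_id: Optional[str]        # the URI reference, if given
    internal_subset: Optional[str]  # raw text between [ and ], if present


DOCTYPE_RE = re.compile(
    r'''<!DOCTYPE\s+(?P<root>[^\s\[>]+)
        (?:\s+(?:
            PUBLIC\s+"(?P<fpi>[^"]*)"(?:\s+"(?P<pub_uri>[^"]*)")?
          | SYSTEM\s+"(?P<sys_uri>[^"]*)"
        ))?
        \s*(?:\[(?P<subset>.*?)\]\s*)?>''',
    re.VERBOSE | re.DOTALL,
)


def parse_doctype(text: str) -> Optional[Doctype]:
    """Return the first document type declaration found in text, or None."""
    m = DOCTYPE_RE.search(text)
    if m is None:
        return None
    if m.group("fpi") is not None:
        keyword, system_id = "PUBLIC", m.group("pub_uri")
    elif m.group("sys_uri") is not None:
        keyword, system_id = "SYSTEM", m.group("sys_uri")
    else:
        keyword, system_id = None, None
    return Doctype(m.group("root"), keyword, m.group("fpi"),
                   system_id, m.group("subset"))
```

For example, parse_doctype('<!DOCTYPE note SYSTEM "note.dtd" [ <!ENTITY x "y"> ]>') yields the root element "note", the keyword "SYSTEM", the system identifier "note.dtd", and the entity declaration as the internal subset.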
On the other hand, document type declarations are slightly different in SGML-based documents such as HTML, where the public identifier may be associated with the system identifier. This association might be performed, e.g., by means of a catalog file that resolves the FPI to a system identifier.
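The idea of a catalog can be sketched in Python as a simple lookup table. This is only a model of the concept: real catalog files have their own syntax and precedence rules, which are not reproduced here. The CATALOG contents reuse the HTML 4.01 identifiers listed later in this article, and the function name resolve is hypothetical.

```python
from typing import Dict, Optional

# A toy "catalog": maps a Formal Public Identifier to a system identifier.
CATALOG: Dict[str, str] = {
    "-//W3C//DTD HTML 4.01//EN": "http://www.w3.org/TR/html4/strict.dtd",
    "-//W3C//DTD HTML 4.01 Transitional//EN": "http://www.w3.org/TR/html4/loose.dtd",
    "-//W3C//DTD HTML 4.01 Frameset//EN": "http://www.w3.org/TR/html4/frameset.dtd",
}


def resolve(public_id: str, system_id: Optional[str] = None) -> Optional[str]:
    """Resolve an FPI through the catalog.

    Falls back to the system identifier given in the declaration (if any)
    when the catalog has no entry for the FPI.
    """
    return CATALOG.get(public_id, system_id)
```

With such a table, a declaration that gives only the FPI "-//W3C//DTD HTML 4.01//EN" can still be resolved to a concrete DTD location.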
The first line of many World Wide Web pages reads as follows:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html lang="ar" dir="ltr" xmlns="http://www.w3.org/1999/xhtml">
This document type declaration for XHTML includes by reference a DTD, whose public identifier is -//W3C//DTD XHTML 1.0 Transitional//EN and whose system identifier is http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd. An entity resolver may use either identifier for locating the referenced external entity. No internal subset has been indicated in this example or the next ones. The root element is declared to be html and, therefore, it is the first tag to be opened after the end of the doctype declaration in this example and the next ones, too. The html tag is not part of the doctype declaration but has been included in the examples for orientation purposes.
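The identifiers in the example above can also be read programmatically. The sketch below uses Python's standard xml.dom.minidom module, which exposes the declaration as the document's doctype node; it records the identifiers but does not fetch the DTD. The helper name read_doctype is hypothetical, and an empty html element is used so that the input is a complete, well-formed document.

```python
from typing import Tuple
from xml.dom import minidom

SOURCE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'
    '<html lang="ar" dir="ltr" xmlns="http://www.w3.org/1999/xhtml"></html>'
)


def read_doctype(source: str) -> Tuple[str, str, str, str]:
    """Return (declared root, public id, system id, actual root element)."""
    doc = minidom.parseString(source)
    doctype = doc.doctype
    return (doctype.name, doctype.publicId, doctype.systemId,
            doc.documentElement.tagName)


if __name__ == "__main__":
    for part in read_doctype(SOURCE):
        print(part)
```

The first and last values returned are both "html", reflecting that the root element named in the declaration is also the first element opened in the document.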
HTML 4.01 DTDs
The Strict DTD does not allow presentational markup, on the argument that Cascading Style Sheets should be used for that instead. The DOCTYPE that references the Strict DTD looks like this:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html>
The Transitional DTD allows some older elements and attributes that have been deprecated:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> <html>
If frames are used, the Frameset DTD must be used instead, like this:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd"> <html>
XHTML 1.0 DTDs
XHTML's DTDs are also Strict, Transitional and Frameset.
XHTML Strict DTD. No deprecated tags are supported and the code must be written correctly.
<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
XHTML Transitional DTD is like the XHTML Strict DTD, but deprecated tags are allowed.
<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
XHTML Frameset DTD is the only XHTML DTD that supports framesets. Its DOCTYPE is shown below.
<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
XHTML 1.1 DTD
XHTML 1.1 is the most current finalized revision of XHTML, introducing support for XHTML Modularization. XHTML 1.1 has the stringency of XHTML 1.0 Strict.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
XHTML Basic DTDs
XHTML Basic 1.0
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.0//EN" "http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd">
XHTML Basic 1.1
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.1//EN" "http://www.w3.org/TR/xhtml-basic/xhtml-basic11.dtd">
XHTML Mobile Profile DTDs
XHTML Mobile Profile 1.0
<!DOCTYPE html PUBLIC "-//WAPFORUM//DTD XHTML Mobile 1.0//EN" "http://www.wapforum.org/DTD/xhtml-mobile10.dtd">
XHTML Mobile Profile 1.1
<!DOCTYPE html PUBLIC "-//WAPFORUM//DTD XHTML Mobile 1.1//EN" "http://www.openmobilealliance.org/tech/DTD/xhtml-mobile11.dtd">
XHTML Mobile Profile 1.2
<!DOCTYPE html PUBLIC "-//WAPFORUM//DTD XHTML Mobile 1.2//EN" "http://www.openmobilealliance.org/tech/DTD/xhtml-mobile12.dtd">
XHTML + RDFa DTD
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd"> <html lang="ar" dir="rtl" xmlns="http://www.w3.org/1999/xhtml">
HTML5 DTD-less DOCTYPE
HTML5 uses a DOCTYPE declaration which is very short, due to its lack of references to a DTD in the form of a URL or FPI. All it contains is the tag name of the root element of the document, HTML. In the words of the specification draft itself: "In other words, <!DOCTYPE html>, case-insensitively."
With the exception of the lack of a URI or FPI string (the FPI string is treated case-sensitively by validators), this format (a case-insensitive match of the string !DOCTYPE HTML) is the same as that found in the syntax of the SGML-based HTML 4.01 DOCTYPE. In both HTML4 and HTML5, the formal syntax is defined in upper-case letters, although lower case and mixtures of lower and upper case are also treated as valid.
In XHTML5 the DOCTYPE must be a case-sensitive match of the string "<!DOCTYPE html>". This is because in XHTML syntax all HTML element names are required to be in lower case, including the root element referenced inside the HTML5 DOCTYPE. In addition, XHTML accepts only the upper-case form of the DOCTYPE keyword. These rules are not defined by the HTML5 specification itself but by XML and the syntax rules for XHTML DTDs. In the XHTML5 syntax, DTDs are permitted as well.
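The difference between the two matching rules can be shown with a brief Python sketch. It is deliberately simplified: it ignores the optional legacy strings and DTD references that the respective syntaxes also allow, and the function names are hypothetical.

```python
import re

# HTML syntax: the keyword and the root element name match case-insensitively.
HTML5_DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+html\s*>", re.IGNORECASE)


def is_html5_doctype(declaration: str) -> bool:
    """Case-insensitive check, as for the text/html serialization."""
    return HTML5_DOCTYPE_RE.fullmatch(declaration.strip()) is not None


def is_xhtml5_doctype(declaration: str) -> bool:
    """Case-sensitive check: upper-case DOCTYPE, lower-case html."""
    return declaration.strip() == "<!DOCTYPE html>"
```

Under these assumptions, "<!doctype HTML>" is accepted by is_html5_doctype but rejected by is_xhtml5_doctype, whereas "<!DOCTYPE html>" is accepted by both.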