ISO/IEC 8859-11
ISO/IEC 8859-11:2001, Information technology — 8-bit single-byte coded graphic character sets — Part 11: Latin/Thai alphabet, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 2001. It is informally referred to as Latin/Thai. It is nearly identical to the national Thai standard TIS-620 (1990). The sole difference is that ISO/IEC 8859-11 allocates non-breaking space to code 0xA0, while TIS-620 leaves it undefined. (In practice, this small distinction is usually ignored.)
ISO-8859-11 is not a primary registered IANA charset name, even though it follows the usual naming pattern for IANA charsets based on the ISO 8859 series. It is, however, defined as an alias[1] of the close equivalent TIS-620. Because the no-break space occupies a code that TIS-620 leaves unallocated, the TIS-620 label can be used for ISO/IEC 8859-11 data without problems. Microsoft has assigned code page 28601 (also known as Windows-28601) to ISO-8859-11 in Windows.[2] An earlier draft placed the Thai letters at different code positions.[3]
As with all parts of ISO/IEC 8859, the lower 128 codes are identical to ASCII. The additional characters, apart from the no-break space, appear in Unicode in the same order: byte 0xA1 maps to U+0E01, 0xA2 to U+0E02, and so on, which is a constant offset of 0x0D60.
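The constant shift described above can be sketched as a small decoder. This is a minimal illustration rather than a reference implementation: the function name `decode_latin_thai` and the constant names are hypothetical, and the set of unassigned positions (0xDB to 0xDE and 0xFC to 0xFF) is taken as an assumption from the published ISO/IEC 8859-11 to Unicode mapping, not from the text above.

```python
# Byte 0xA1 lands on U+0E01, so the shift is 0x0E00 - 0xA0 = 0x0D60.
THAI_OFFSET = 0x0E00 - 0xA0

# Assumption: these byte values are unassigned in ISO/IEC 8859-11.
UNASSIGNED = frozenset(range(0xDB, 0xDF)) | frozenset(range(0xFC, 0x100))


def decode_latin_thai(data: bytes) -> str:
    """Decode ISO/IEC 8859-11 bytes using the constant-offset rule."""
    chars = []
    for byte in data:
        if byte <= 0xA0:
            # ASCII, the C1 control range and the no-break space map to
            # the code point with the same numeric value.
            chars.append(chr(byte))
        elif byte in UNASSIGNED:
            raise ValueError(f"byte 0x{byte:02X} is unassigned")
        else:
            # Thai range: shift the byte value into the Thai block.
            chars.append(chr(byte + THAI_OFFSET))
    return "".join(chars)


# Usage: show the code points produced for a short byte sequence.
sample = bytes([0xCA, 0xC7, 0xD1, 0xCA, 0xB4, 0xD5])
print([f"U+{ord(c):04X}" for c in decode_latin_thai(sample)])
```

In practice a library codec would normally be used instead; the sketch only makes the arithmetic of the mapping explicit.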
Microsoft Windows code page 874 and MacThai, the code page used in the Thai version of the Apple Macintosh, are both variants of TIS-620, but they are incompatible with each other.
Character set
Code values 0xD1, 0xD4–0xDA and 0xE7–0xEE are combining characters.
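The combining positions can be cross-checked against the Unicode character database. The sketch below assumes the constant-offset mapping into the Thai block (byte value plus 0x0D60) and treats Unicode general category `Mn` (nonspacing mark) as the test for "combining"; the function name `combining_bytes` is hypothetical.

```python
import unicodedata

# Assumption: bytes 0xA1..0xFB map to Unicode by adding 0x0D60.
THAI_OFFSET = 0x0E00 - 0xA0


def combining_bytes() -> list:
    """Return byte values whose mapped character is a nonspacing mark (Mn)."""
    return [
        byte
        for byte in range(0xA1, 0xFC)
        if unicodedata.category(chr(byte + THAI_OFFSET)) == "Mn"
    ]


# Usage: print the combining positions as hexadecimal byte values.
print(" ".join(f"{byte:02X}" for byte in combining_bytes()))
```

The general category is used instead of `unicodedata.combining()` because several of these Thai marks have a canonical combining class of 0 even though they are nonspacing marks.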
Vendor extensions
Code page 874 (IBM) / 9066
IBM code page 874 (CP874, IBM-874, x-IBM874), also known as Code page 9066 (IBM-9066),[5] differs from ISO/IEC 8859-11 in only nine symbols shown boxed in the following table:[6][7][8]
Code page 1161
Code page 1161 (CP1161, IBM-1161) is a variant of IBM code page 874. The only difference is the euro sign (€) at position 0xDE (decimal 222).[11][12]
Code page 874 (Microsoft) / 1162
Windows code page 874 (windows-874, MS874, x-windows-874), known as Code page 1162 (CP1162, IBM-1162) by IBM,[13][14] is used by Microsoft Windows. It differs from ISO/IEC 8859-11 by only nine symbols as shown in the following table:
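One way to enumerate these differences is to compare two codecs byte by byte. The sketch below assumes that Python's bundled `cp874` codec follows Microsoft's Windows-874 mapping and that `iso8859_11` follows ISO/IEC 8859-11; the helper names `decode_or_none` and `windows_874_additions` are hypothetical.

```python
import unicodedata


def decode_or_none(byte: int, encoding: str):
    """Decode a single byte, returning None if the codec leaves it undefined."""
    try:
        return bytes([byte]).decode(encoding)
    except UnicodeDecodeError:
        return None


def windows_874_additions() -> dict:
    """Map each byte that cp874 defines differently from iso8859_11 to its character."""
    additions = {}
    for byte in range(0x80, 0x100):
        win = decode_or_none(byte, "cp874")
        iso = decode_or_none(byte, "iso8859_11")
        # Skip bytes that cp874 leaves undefined; keep only real differences.
        if win is not None and win != iso:
            additions[byte] = win
    return additions


# Usage: list the differing positions with their Unicode names.
for byte, char in sorted(windows_874_additions().items()):
    print(f"0x{byte:02X} -> U+{ord(char):04X} {unicodedata.name(char)}")
```

Bytes that `cp874` leaves undefined are skipped on purpose, so the result lists only positions where Windows-874 assigns a graphic character of its own.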
Mac OS Thai
This is the variant used on the Classic Mac OS.
Footnotes
- ^ The otherwise-duplicate diacritical marks in this line are intended to display in a "low left position" (0x83–87), "low position" (0x88–8C) or "left position" (0x8F), and are followed in Apple's round-trip mapping by an appended Private Use Area character U+F875, U+F873 or U+F874 respectively.
- ^ The otherwise-duplicate diacritical marks in this line are intended to display in a "left position", and are followed by an appended Private Use Area character U+F874 in Apple's round-trip mapping.
References
- ^ "IANA Character Sets".
- ^ "js-codepage, Getting codepages".
- ^ Everson, Michael. "Proposed ISO 8859-11".
- ^ Whistler, Ken (2002-10-07), ISO/IEC 8859-11:2001 to Unicode, Unicode Consortium
- ^ IBM; Unicode Consortium. "convrtrs.txt". International Components for Unicode. v. 59180.0.1. "Yes ibm-874 == ibm-9066. ibm-1161 has the euro update."
- ^ "Code page 874 information document". Archived from the original on 2017-01-16.
- ^ "CCSID 874 information document". Archived from the original on 2016-03-27.
- ^ "CCSID 9066 information document". Archived from the original on 2016-03-27.
- ^ IBM. "Code Page CPGID 00874" (PDF). REGISTRY: Graphic Character Sets and Code Pages.
- ^ Code Page CPGID 00874 (txt), IBM
- ^ "Code Page 01161" (PDF).
- ^ "CCSID 1161 information document". Archived from the original on 2016-03-27.
- ^ "Code page 1162 information document". Archived from the original on 2017-01-17.
- ^ "CCSID 1162 information document". Archived from the original on 2016-03-31.
- ^ "Code Page 01162" (PDF).
- ^ Steele, Shawn (1998-02-28). "cp874 to Unicode table". Unicode Consortium, Microsoft.
- ^ Code Page CPGID 01162 (txt), IBM
- ^ International Components for Unicode (ICU), ibm-1162_P100-1999.ucm, 2002-12-03
- ^ Apple (2005-04-05). "Map (external version) from Mac OS Thai character set to Unicode 3.2 and later". Unicode Consortium.
External links
- ISO/IEC 8859-11:2001
- ISO/IEC 8859-11:1999 - 8-bit single-byte coded graphic character sets, Part 11: Latin/Thai character set (draft dated June 22, 1999; superseded by ISO/IEC 8859-11:2001, published December 15, 2001)
- Windows code page 874
- ISO-IR 166 Thai character set (July 13, 1992, from Thai Standard TIS 620-2533 (1990))
- Standardization and Implementations of Thai Language (PDF, 175 KB)