ISO-IR-111
Alias(es) | ISO-IR-111 |
---|---|
Language(s) | Russian, Belarusian, Macedonian, Serbian, Ukrainian (partial) |
Standard | ECMA-113:1986 |
Classification | Extended ASCII, KOI |
Extends | KOI8-B |
Succeeded by | ECMA-113:1988 (ISO-8859-5) |
Other related encoding(s) | KOI8-F |
ISO-IR-111[1] or KOI8-E[2] is an 8-bit character set. It is a multinational extension of KOI-8 that adds the letters needed for Belarusian, Macedonian, Serbian, and Ukrainian (except Ґ/ґ, which was added in KOI8-F). The name "ISO-IR-111" refers to its registration number in the ISO-IR registry, and denotes it as a set usable with ISO/IEC 2022.
It was defined by the first (1986) edition of ECMA-113,[3] which is the Ecma International standard corresponding to ISO/IEC 8859-5, and as such also corresponds to a 1987 draft version of ISO-8859-5.[4] The published editions of ISO/IEC 8859-5 instead correspond to subsequent editions of ECMA-113, which define a different encoding.[5]
Naming confusion
ISO-IR-111, the 1985 edition of ECMA-113 (also called "ECMA-Cyrillic" or "KOI8-E"), was based on the 1974 edition of GOST 19768 (i.e. KOI-8). In 1987 ECMA-113 was redesigned.[5] These newer editions of ECMA-113 are equivalent to ISO-8859-5,[5][6] and do not follow the KOI layout. This confusion has led to a common misconception that ISO-8859-5 was defined in or based on GOST 19768-74.[6]
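To illustrate what following the KOI layout means in practice, the sketch below uses Python's built-in `koi8_r` and `iso8859_5` codecs. Python ships no codec for ISO-IR-111 itself, so `koi8_r` stands in here on the assumption that it keeps the basic Russian letters at the byte positions inherited from KOI-8; the helper name `strip_high_bit` is purely illustrative.

```python
def strip_high_bit(data: bytes) -> str:
    """Clear bit 7 of every byte and read the result as ASCII."""
    return bytes(b & 0x7F for b in data).decode("ascii")


word = "Привет"

# KOI-style layout: clearing the high bit leaves a case-swapped
# Latin transliteration of the Cyrillic text.
koi_fallback = strip_high_bit(word.encode("koi8_r"))

# ISO-8859-5 layout: letters sit in alphabetical order, so the first
# three letters of the alphabet occupy consecutive bytes.
iso_abc = list("абв".encode("iso8859_5"))

# KOI-style layout: the same letters follow Latin order (a, b, w).
koi_abc = list("абв".encode("koi8_r"))

print(koi_fallback)
print([hex(b) for b in iso_abc])
print([hex(b) for b in koi_abc])
```

The KOI arrangement trades alphabetical byte order for legibility on 7-bit channels, whereas ISO-8859-5 keeps the alphabet contiguous; this is one way to see why the two layouts are not interchangeable.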
Possibly as another consequence of this, RFC 1345 erroneously lists a different codepage under the names "ISO-IR-111" and "ECMA-Cyrillic", resembling ISO-8859-5 with re-ordered rows, and partially compatible with Windows-1251.[7][6] Due to concerns that existing implementations might use the RFC 1345 definition for those two labels, it was proposed that the IANA additionally recognise KOI8-E as a label for ECMA-113:1985 content,[7] and the IANA presently lists that label as an alias.[2]
Character set
The following table shows the ISO-IR-111 encoding. Each character is shown with its equivalent Unicode code point.
[Code chart not reproduced here. Legend: Letter, Number, Punctuation, Symbol, Other, Undefined.]
Extended and modified versions
A modified version named KOI8 Unified or KOI8-F was used in software produced by Fingertip Software, adding the Ґ in its KOI8-U location (replacing the soft hyphen and displacing the universal currency sign), and adding some graphical characters in the C1 control codes area, mainly from KOI8-R and Windows-1251.[4][6][8][9]
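As a small illustration of the "KOI8-U location" mentioned above, the following sketch asks Python's built-in `koi8_u` codec where it places Ґ and ґ. It shows only the KOI8-U positions; it does not model KOI8-F, for which Python has no built-in codec, and the variable name `ghe_positions` is just for this example.

```python
# Look up the byte that KOI8-U assigns to each form of the letter.
ghe_positions = {ch: ch.encode("koi8_u")[0] for ch in "ґҐ"}

for ch, byte in ghe_positions.items():
    print(f"{ch} U+{ord(ch):04X} -> 0x{byte:02X}")
```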
Incorrect RFC 1345 code page
Language(s) | Russian, Belarusian, Macedonian, Serbian |
---|---|
Standard | RFC 1345 |
Classification | Extended ASCII |
Transforms / Encodes | ISO-IR-111 |
Other related encoding(s) | ISO-8859-5, Windows-1251 |
RFC 1345 erroneously lists a different code page under the name ISO-IR-111, encoding the same Cyrillic characters but with a different layout. It resembles a mixture of Windows-1251 and ISO-8859-5.[7] Specifically, line A_ corresponds to line A_ of ISO-8859-5, lines C_ through F_ correspond to Windows-1251[6] (equivalent to lines B_ through E_ of ISO-8859-5), and line B_ nearly corresponds to line F_ of ISO-8859-5, except that § is replaced with ¤.
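The row-by-row description can be turned into a small sketch that assembles the upper half of this code page from Python's built-in `iso8859_5` and `cp1251` codecs. This follows only the description in this section and has not been checked against the text of RFC 1345; the function name `rfc1345_upper_half` is hypothetical.

```python
def rfc1345_upper_half() -> dict[int, str]:
    """Build bytes 0xA0-0xFF of the RFC 1345 layout as described above."""
    table = {}
    for byte in range(0xA0, 0x100):
        row, col = byte >> 4, byte & 0x0F
        if row == 0xA:
            # Line A_: same as line A_ of ISO-8859-5.
            ch = bytes([byte]).decode("iso8859_5")
        elif row == 0xB:
            # Line B_: line F_ of ISO-8859-5, with the section sign
            # replaced by the currency sign.
            ch = bytes([0xF0 | col]).decode("iso8859_5")
            if ch == "§":
                ch = "¤"
        else:
            # Lines C_ to F_: same as Windows-1251.
            ch = bytes([byte]).decode("cp1251")
        table[byte] = ch
    return table


table = rfc1345_upper_half()

# Lines C_ to F_ of Windows-1251 equal lines B_ to E_ of ISO-8859-5.
rows_match = all(
    table[b] == bytes([b - 0x10]).decode("iso8859_5")
    for b in range(0xC0, 0x100)
)

print(rows_match)
print(" ".join(table[b] for b in range(0xB0, 0xC0)))
```

Building the table from existing codecs rather than typing it out by hand keeps the sketch short and makes the relationship to ISO-8859-5 and Windows-1251 explicit.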
Certain codes resemble ISO-IR-111 with flipped letter case, which may have contributed to the confusion. The majority differ and are shown below with a heavy border.
[Code chart not reproduced here. Legend: Letter, Number, Punctuation, Symbol, Other, Undefined, Deviating from ISO-IR-111 (excluding deviations in case only).]
References
- ECMA (1 August 1985). Right-hand Part of the Cyrillic Alphabet (PDF). ITSCJ/IPSJ. ISO-IR-111.
- "Character Sets". IANA.
- ECMA-113. 8-Bit Single-Byte Coded Graphic Character Sets - Latin/Cyrillic Alphabet (1st ed., June 1986).
- Czyborra, Roman (1998-11-30) [1998-05-25]. "The Cyrillic Charset Soup". Archived from the original on 2016-12-03. Retrieved 2016-12-03.
- ECMA-113. 8-Bit Single-Byte Coded Graphic Character Sets - Latin/Cyrillic Alphabet (2nd ed., June 1988).
- Nechayev, Valentin (2013) [2001]. "Review of 8-bit Cyrillic encodings universe". Archived from the original on 2016-12-05. Retrieved 2016-12-05.
- Sokolov, Michael (2003-04-05). "ECMA-cyrillic alias iso-ir-111 sore". IETF Charsets Mailing List.
- "KOI8 Unified". Fingertip Software. Archived from the original on 1998-01-09. Retrieved 2020-02-11.
- Leisher, Mark (2008) [1998-03-05]. "KOI8 Unified Cyrillic to Unicode 2.1 mapping table". Department of Mathematical Sciences, New Mexico State University. Retrieved 2020-05-02.