ISO/IEC 8859-9

ISO/IEC 8859-9
MIME / IANA: ISO-8859-9
Alias(es): iso-ir-148, latin5, l5, csISOLatin5[1]
Standard: ECMA-128, ISO/IEC 8859
Classification: ISO 8859 (extended ASCII, ISO 4873 level 1)
Extends: US-ASCII
Based on: ISO/IEC 8859-1
Preceded by: ISO/IEC 8859-3
Other related encoding(s): Windows-1254

ISO/IEC 8859-9:1999, Information technology — 8-bit single-byte coded graphic character sets — Part 9: Latin alphabet No. 5, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings. Its first edition was published in 1989. It is informally referred to as Latin-5 or Turkish. It was designed to cover the Turkish language and was intended to be of more use than the ISO/IEC 8859-3 encoding. It is identical to ISO/IEC 8859-1 except that six Icelandic characters are replaced with characters unique to the Turkish alphabet:

Position  0xD0  0xDD  0xDE  0xF0  0xFD  0xFE
8859-9    Ğ     İ     Ş     ğ     ı     ş
8859-1    Ð     Ý     Þ     ð     ý     þ
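
The six replacements can be checked with any library that ships both code tables. The following is a minimal sketch using Python's standard codecs, assuming the codec names `iso8859_9` and `iso8859_1` that Python uses for these encodings. The variable names are illustrative only.

```python
# Bytes at the six positions where ISO/IEC 8859-9 differs from ISO/IEC 8859-1.
POSITIONS = bytes([0xD0, 0xDD, 0xDE, 0xF0, 0xFD, 0xFE])

# Decode the same six bytes under each encoding.
latin5 = POSITIONS.decode("iso8859_9")
latin1 = POSITIONS.decode("iso8859_1")

# Scan the whole upper graphic range (0xA0-0xFF) and collect every byte
# value that the two codecs decode to different characters.
differing = [
    b for b in range(0xA0, 0x100)
    if bytes([b]).decode("iso8859_9") != bytes([b]).decode("iso8859_1")
]

print(latin5)
print(latin1)
print([hex(b) for b in differing])
```

The scan over 0xA0 to 0xFF is there to confirm that no other position in the upper range differs between the two codecs.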

ISO-8859-9 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429. In modern applications Unicode and UTF-8 are preferred; authors of new web pages and the designers of new protocols are instructed to use UTF-8 instead.[2] As of August 2019, 0.1% of all web pages use ISO-8859-9,[3][4] while 3.1% of web pages located in Turkey use it.[5] However, the WHATWG Encoding Standard, which specifies the character encodings that are permitted in HTML5 and that compliant browsers must support,[6] requires that web pages marked as ISO-8859-9 be handled as Windows-1254.[2] Windows-1254 differs from ISO-8859-9 in that it uses the range 0x80–0x9F, which ISO-8859-9 reserves for C1 control codes, for additional graphical characters instead (analogous to the relationship between ISO-8859-1 and Windows-1252).
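
The practical effect of that difference can be seen by decoding the same bytes both ways. This is a rough sketch using Python's `iso8859_9` and `cp1254` codecs; the sample byte string is made up for illustration. Python's `cp1254` table is not necessarily identical to the WHATWG index in every position, so treat this as an approximation of browser behaviour rather than a reference.

```python
# Hypothetical sample: curly quotes (0x93, 0x94) and a euro sign (0x80)
# around Turkish text, as commonly produced by Windows software.
sample = b"\x93T\xfcrk\xe7e\x94 \x80"

# Strict ISO-8859-9: bytes 0x80-0x9F decode to invisible C1 control characters.
as_iso = sample.decode("iso8859_9")

# Windows-1254: the same bytes decode to printable punctuation and symbols.
as_win = sample.decode("cp1254")

print(repr(as_iso))
print(as_win)

# Outside 0x80-0x9F the two codecs agree on every byte value.
same_elsewhere = all(
    bytes([b]).decode("iso8859_9") == bytes([b]).decode("cp1254")
    for b in list(range(0x00, 0x80)) + list(range(0xA0, 0x100))
)
print(same_elsewhere)
```

Because the two tables only disagree in 0x80 to 0x9F, treating an ISO-8859-9 label as Windows-1254 changes nothing for text that stays within the graphic characters of the standard.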

Microsoft has assigned code page 28599 a.k.a. Windows-28599 to ISO-8859-9 in Windows. IBM has assigned code page 920 (CCSID 920) to ISO-8859-9.[7][8] It is published by Ecma International as ECMA-128.[9]

Codepage layout

ISO/IEC 8859-9[10][11][12]
    0    1    2    3    4    5    6    7    8    9    A    B    C    D    E    F
0x
1x
2x  SP   !    "    #    $    %    &    '    (    )    *    +    ,    -    .    /
3x  0    1    2    3    4    5    6    7    8    9    :    ;    <    =    >    ?
4x  @    A    B    C    D    E    F    G    H    I    J    K    L    M    N    O
5x  P    Q    R    S    T    U    V    W    X    Y    Z    [    \    ]    ^    _
6x  `    a    b    c    d    e    f    g    h    i    j    k    l    m    n    o
7x  p    q    r    s    t    u    v    w    x    y    z    {    |    }    ~
8x
9x
Ax  NBSP ¡    ¢    £    ¤    ¥    ¦    §    ¨    ©    ª    «    ¬    SHY  ®    ¯
Bx  °    ±    ²    ³    ´    µ    ¶    ·    ¸    ¹    º    »    ¼    ½    ¾    ¿
Cx  À    Á    Â    Ã    Ä    Å    Æ    Ç    È    É    Ê    Ë    Ì    Í    Î    Ï
Dx  Ğ    Ñ    Ò    Ó    Ô    Õ    Ö    ×    Ø    Ù    Ú    Û    Ü    İ    Ş    ß
Ex  à    á    â    ã    ä    å    æ    ç    è    é    ê    ë    ì    í    î    ï
Fx  ğ    ñ    ò    ó    ô    õ    ö    ÷    ø    ù    ú    û    ü    ı    ş    ÿ
  Differences from ISO-8859-1 are at 0xD0, 0xDD, 0xDE, 0xF0, 0xFD and 0xFE.
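
The upper half of the chart can be regenerated from a codec table instead of being typed by hand. The sketch below uses Python's `iso8859_9` codec; the helper names `row` and `render` and the `LABELS` mapping are hypothetical and exist only for this illustration.

```python
# Printable stand-ins for the two invisible characters in rows Ax-Fx.
LABELS = {0xA0: "NBSP", 0xAD: "SHY"}

def row(high_nibble):
    """Return the 16 cells of one chart row, e.g. row(0xD) for the Dx row."""
    cells = []
    for low in range(16):
        byte_value = high_nibble * 16 + low
        char = bytes([byte_value]).decode("iso8859_9")
        cells.append(LABELS.get(byte_value, char))
    return cells

def render():
    """Build the Ax-Fx rows as fixed-width text lines."""
    lines = ["    " + "".join(f"{low:X}".ljust(5) for low in range(16)).rstrip()]
    for high in range(0xA, 0x10):
        cells = "".join(cell.ljust(5) for cell in row(high)).rstrip()
        lines.append(f"{high:X}x  {cells}")
    return lines

print("\n".join(render()))
```

Only rows Ax to Fx are generated here, since the lower rows are plain ASCII and the 8x and 9x rows hold no graphic characters in this encoding.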

References

  1. ^ Character Sets, Internet Assigned Numbers Authority (IANA), 2018-12-12
  2. ^ a b van Kesteren, Anne. "Names and labels". Encoding Standard. WHATWG.
  3. ^ "Historical trends in the usage of character encodings for websites". w3techs.com.
  4. ^ "Frequently Asked Questions". w3techs.com.
  5. ^ "Distribution of character encodings among websites that use Turkey". w3techs.com.
  6. ^ "8.2.2.3. Character encodings". HTML 5.1 2nd Edition. W3C. User agents must support the encodings defined in the WHATWG Encoding standard, including, but not limited to […]
  7. ^ "Code page 920 information document". Archived from the original on 2017-01-16.
  8. ^ "CCSID 920 information document". Archived from the original on 2016-03-27.
  9. ^ Standard ECMA-128: 8-Bit Single-Byte Coded Graphic Character Sets - Latin Alphabet No. 5 (2nd ed.). 1999. This Ecma publication is also approved as ISO 8859-9.
  10. ^ Code Page CPGID 00920 (PDF), IBM
  11. ^ Code Page CPGID 00920 (txt), IBM
  12. ^ International Components for Unicode (ICU), ibm-920_P100-1995.ucm, 2002-12-03
