ISO/IEC 646

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Richardw (talk | contribs) at 08:03, 9 February 2009. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

ISO 646 is an ISO standard that since 1972 has specified a 7-bit character code from which several national standards are derived.

Since the portion of ISO 646 shared by all countries (the "invariant set") specified only those letters used in the basic modern Latin alphabet, countries whose languages use the Latin alphabet with additional letters needed to create national variants of ISO 646 in order to write their native languages. Because the 8-bit byte was not universally accepted at that time, the national characters had to fit within the constraints of 7 bits, so some characters that appear in ASCII do not appear in other national variants of ISO 646.
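As a rough illustration of what the invariant set means in practice, the Python sketch below checks whether a byte string uses only code points on which every variant agrees. The twelve excluded positions are an assumption read off the blank cells of the INV column in the variant table later in this article; the names VARIANT_POSITIONS and is_invariant are hypothetical and not part of any standard or library.

```python
# Positions left blank in the INV column of the variant table, i.e. the codes
# that show # $ @ [ \ ] ^ ` { | } ~ in the US variant (assumed list).
VARIANT_POSITIONS = frozenset(
    [0x23, 0x24, 0x40, 0x5B, 0x5C, 0x5D, 0x5E, 0x60, 0x7B, 0x7C, 0x7D, 0x7E]
)


def is_invariant(data: bytes) -> bool:
    """Return True if every byte is a 7-bit code outside the variant positions."""
    return all(b < 0x80 and b not in VARIANT_POSITIONS for b in data)


print(is_invariant(b"Hello, world"))  # True
print(is_invariant(b"user@example"))  # False: 0x40 differs between variants
```

Text that passes this check should display the same way under any of the national variants; text that fails it may show different glyphs depending on the variant in use.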

ISO/IEC 646 was also ratified by ECMA as ECMA-6.

History

ISO/IEC 646 and its predecessor ASCII (ANSI X3.4) largely endorse existing practice regarding character encodings in the telecommunications industry.

During the 1960s, there was debate about whether character encoding standards (at either the national or international levels) for computers should follow 1) existing practice in the telecommunications industry (which was largely paper-tape based, but which was commonly transmitted on-line digitally over wires), or conversely, 2) existing practice in the punched-card portion of the computer industry, whose heritage was especially the off-line storage of World War II-era electro-mechanical punched-card machines predating electronic computers. For corporate-history reasons regarding Hollerith punched cards, IBM sided with the punched-card character encodings, embodied by EBCDIC, whereas many other computer manufacturers sided with the telecommunications industry's character encodings.

Due to the incompatibility of the various national variants, an International Reference Version (IRV) of ISO/IEC 646 was introduced. The original version (ISO 646 IRV) differed from ASCII only in that in code point 0024, ASCII's dollar sign ($) was replaced by the international currency symbol (¤). The final 1991 version of the code is identical to ASCII.[1]

The ISO 8859 series of standards governing 8-bit character encodings supersedes the ISO 646 international standard and its national variants. The ISO 10646 standard, directly related to Unicode, supersedes all of the ISO 646 and ISO 8859 national-variant character encodings with what is arguably one unified character set.

Codepage layout

The following table shows the ISO/IEC 646 character set. Each character is shown above its Unicode equivalent and its decimal code. Code points marked with † have glyphs that vary from region to region. These are discussed in detail below.

ISO/IEC 646
    —0    —1    —2    —3    —4    —5    —6    —7    —8    —9    —A    —B    —C    —D    —E    —F
0_  NUL   SOH   STX   ETX   EOT   ENQ   ACK   BEL   BS    HT    LF    VT    FF    CR    SO    SI
    0000  0001  0002  0003  0004  0005  0006  0007  0008  0009  000A  000B  000C  000D  000E  000F
    0     1     2     3     4     5     6     7     8     9     10    11    12    13    14    15
1_  DLE   DC1   DC2   DC3   DC4   NAK   SYN   ETB   CAN   EM    SUB   ESC   FS    GS    RS    US
    0010  0011  0012  0013  0014  0015  0016  0017  0018  0019  001A  001B  001C  001D  001E  001F
    16    17    18    19    20    21    22    23    24    25    26    27    28    29    30    31
2_  SP    !     "†    #†    $†    %     &     '†    (     )     *     +     ,†    -†    .     /†
    0020  0021  0022  0023  0024  0025  0026  0027  0028  0029  002A  002B  002C  002D  002E  002F
    32    33    34    35    36    37    38    39    40    41    42    43    44    45    46    47
3_  0     1     2     3     4     5     6     7     8     9     :     ;     <     =     >     ?
    0030  0031  0032  0033  0034  0035  0036  0037  0038  0039  003A  003B  003C  003D  003E  003F
    48    49    50    51    52    53    54    55    56    57    58    59    60    61    62    63
4_  @†    A     B     C     D     E     F     G     H     I     J     K     L     M     N     O
    0040  0041  0042  0043  0044  0045  0046  0047  0048  0049  004A  004B  004C  004D  004E  004F
    64    65    66    67    68    69    70    71    72    73    74    75    76    77    78    79
5_  P     Q     R     S     T     U     V     W     X     Y     Z     [†    \†    ]†    ^†    _†
    0050  0051  0052  0053  0054  0055  0056  0057  0058  0059  005A  005B  005C  005D  005E  005F
    80    81    82    83    84    85    86    87    88    89    90    91    92    93    94    95
6_  `†    a     b     c     d     e     f     g     h     i     j     k     l     m     n     o
    0060  0061  0062  0063  0064  0065  0066  0067  0068  0069  006A  006B  006C  006D  006E  006F
    96    97    98    99    100   101   102   103   104   105   106   107   108   109   110   111
7_  p     q     r     s     t     u     v     w     x     y     z     {†    |†    }†    ~†    DEL
    0070  0071  0072  0073  0074  0075  0076  0077  0078  0079  007A  007B  007C  007D  007E  007F
    112   113   114   115   116   117   118   119   120   121   122   123   124   125   126   127
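The row and column labels of the table are the high and low hexadecimal digits of each code, so a character's position can be computed directly from its code. The short Python sketch below shows the arithmetic; the function name table_position is hypothetical and chosen only for this illustration.

```python
def table_position(ch: str):
    """Return (row, column, Unicode hex, decimal code) for a 7-bit character."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a 7-bit code")
    row, col = divmod(code, 16)  # row = high hex digit, col = low hex digit
    return row, col, f"{code:04X}", code


print(table_position("A"))  # (4, 1, '0041', 65)
```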

National variants

Some national variants of ISO 646 are:

Code ISO-IR Standard Used in
CA-1 121 CSA Z243.4-1985 Canada (nr. 1 alternative, with “î”) (French, classical)
CA-2 122 CSA Z243.4-1985 Canada (nr. 2 alternative, with “É”) (French, reformed orthography)
CN 057 GB/T 1988-80 People's Republic of China (Basic Latin)
CU 151 NC 99-10:81 Cuba (Spanish)
DE 021 DIN 66003 Germany (German)
DK DS 2089 Denmark (Danish)
FR 069 AFNOR NF Z 62010-1982 France (French)
FR-0 025 AFNOR NF Z 62010-1973 France (obsolete since April 1985)
GB 004 BSI 4730 United Kingdom (English)
GR 088 HOS ELOT Greece (obsolete)
HU 086 MSZ 7795/3 Hungary (Hungarian)
IE 207 NSAI 433:1996 Ireland (Irish Goidelic)
INV 170 ISO 646:1983 International (invariant subset)
IRV 002 ISO 646:1983 International Reference Version
JA 014 JIS C 6220-1969 Japan (Romaji)
JA-O 092 JIS C 6229-1984 Japan (OCR-B)
KR ? South Korea
MT ? Malta (Maltese, English)
NO 060 NS 4551 version 1 Norway
NO-2 061 NS 4551 version 2 Norway (obsolete since June 1987)
SE 010 SEN 85 02 00 Annex B Sweden (basic Swedish)
SE-C 011 SEN 85 02 00 Annex C Sweden (extended Swedish for names)
T.61 102 ITU/CCITT T.61 Recommendation International (Teletex)
US 006 ANSI X3.4-1968 United States (ASCII)
YU 141 JUS I.B1.002 (YUSCII) former Yugoslavia (Croatian, Slovenian, Serbian, Bosnian)

Other proprietary standards approved later for international use by some standard committees:

Code ISO-IR Approved by Origin Used in
ES 085 ECMA IBM Spain (Basque, Castilian, Catalan, Galician)
esp 017 ECMA Olivetti Spanish (international)
DK-SE 009-1 SIS NATS, main set Sweden and Denmark (journalistic texts)
FI-SE 008-1 SIS NATS, main set Sweden and Finland (journalistic texts)
ita 015 ECMA Olivetti Italian
PT 084 ECMA IBM Portugal (Portuguese, Spanish)
por 016 ECMA Olivetti Portuguese (international)

The specifics of the changes for some of these variants are given in this table:

Codes Characters for each ISO 646 compatible charset
binary decimal hexa INV US T.61 JA JA-O KR CN IRV GB DK NO NO-2 SE SE-C DE HU FR FR-0 CA-1 CA-2 IE IS ita por PT esp ES CU MT YU
010 0010 34 22 " " " " " " " " " " " " " " " " " " " " " " " " " " " " " "
010 0011 35 23   # # # # # # # £ # # § # # # # £ £ # # £ # £ # £ # # # # #
010 0100 36 24   $ ¤ $ $ $ ¥ $ $ $ $ $ ¤ ¤ $ ¤ $ $ $ $ $ $ $ $ $ $ $ ¤ $ $
010 0111 39 27 ' ' ' ' ' ' ' '
010 1100 44 2C , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,
010 1101 45 2D - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
010 1111 47 2F / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
100 0000 64 40   @ @ @ @ @ @ @ @ @ @ @ @ É § Á à à à à Ó Ð § § ´ § · @ @ Ž
101 1011 91 5B   [ [ [ [ [ [ [ [ Æ Æ Æ Ä Ä Ä É ° ° â â É Þ ° Ã Ã ¡ ¡ ¡ ġ Š
101 1100 92 5C   \   ¥ ¥ \ \ \ Ø Ø Ø Ö Ö Ö Ö ç ç ç ç Í \ ç Ç Ç Ñ Ñ Ñ ż Đ
101 1101 93 5D   ] ] ] ] ] ] ] ] Å Å Å Å Å Ü Ü § § ê ê Ú Æ é Õ Õ ¿ Ç ] ħ Ć
101 1110 94 5E   ^   ^ ^ ^ ^ ˆ ˆ ˆ ˆ ˆ ˆ Ü ˆ ˆ ^ ˆ î É Á Ö ˆ ˆ ˆ ˆ ¿ ¿ ˆ Č
101 1111 95 5F _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
110 0000 96 60   `   `   ` ` ` ` ` ` ` ` é ` á µ µ ô ô ó ð ù ` ` ` ` ` ċ ž
111 1011 123 7B   {   { { { { { { æ æ æ ä ä ä é é é é é é þ à ã ã ° ´ ´ Ġ š
111 1100 124 7C   | | | | | | | | ø ø ø ö ö ö ö ù ù ù ù í | ò ç ç ñ ñ ñ Ż đ
111 1101 125 7D   }   } } } } } } å å å å å ü ü è è è è ú æ è õ õ ç ç [ Ħ ć
111 1110 126 7E   ~     ~ ˜ ˜ ˜ ¯ | ˜ ü ß ˝ ¨ ¨ û û á ö ì ° ˜ ˜ ¨ ¨ Ċ č

In the table above, the cells with non-white background emphasize the differences from the US variant used in the Basic Latin subset of ISO/IEC 10646 and Unicode.
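To make the effect of one variant concrete, the Python sketch below decodes 7-bit bytes as the German variant (code DE, DIN 66003) by decoding them as ASCII and then swapping the eight glyphs that the DE column of the table replaces. The names DE_FROM_US and decode_de are hypothetical; Python's standard codecs are not assumed to include this variant, which is why the mapping is spelled out by hand.

```python
# Replacements of the DE variant relative to the US variant, read from the
# DE column of the table above.
DE_FROM_US = str.maketrans({
    "@": "§", "[": "Ä", "\\": "Ö", "]": "Ü",
    "{": "ä", "|": "ö", "}": "ü", "~": "ß",
})


def decode_de(data: bytes) -> str:
    """Decode 7-bit bytes as ISO 646-DE: decode as ASCII, then swap glyphs."""
    return data.decode("ascii").translate(DE_FROM_US)


print(decode_de(b"Gr}~e aus K|ln"))  # Grüße aus Köln
print(decode_de(b"a[i]"))            # aÄiÜ
```

The second call shows the other side of the trade-off: bytes that spell `a[i]` under the US variant read as `aÄiÜ` under the German one.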

The characters displayed in cells with red background could be used as combining diacritics when preceded or followed by a backspace (BS) C0 control character. This encoding method is deprecated or not recommended, as it was part of some withdrawn national standards. Without such complex encoding, they are no different from the symbols used in the US variant (although glyph variants are still possible, especially for the quotation marks and the circumflex and tilde symbols).
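The backspace technique can be sketched in Python as follows. This is only an illustration under stated assumptions: it handles the order "letter, BS, diacritic" and not the reverse, and the pairing of each spacing character with a Unicode combining mark in COMBINING is an assumption rather than something taken from a particular national standard. The function name compose_overstrikes is likewise hypothetical.

```python
import unicodedata

BS = "\b"
# Assumed pairing of spacing characters with Unicode combining marks.
COMBINING = {
    "'": "\u0301",  # acute accent
    "`": "\u0300",  # grave accent
    "^": "\u0302",  # circumflex
    "~": "\u0303",  # tilde
    '"': "\u0308",  # diaeresis
    ",": "\u0327",  # cedilla
}


def compose_overstrikes(text: str) -> str:
    """Replace 'letter BS mark' sequences with precomposed characters."""
    out = []
    i = 0
    while i < len(text):
        is_overstrike = (
            i + 2 < len(text)
            and text[i].isalpha()
            and text[i + 1] == BS
            and text[i + 2] in COMBINING
        )
        if is_overstrike:
            out.append(text[i] + COMBINING[text[i + 2]])
            i += 3
        else:
            out.append(text[i])
            i += 1
    # NFC merges base letter + combining mark where a precomposed form exists.
    return unicodedata.normalize("NFC", "".join(out))


print(compose_overstrikes("cafe\b'"))  # café
```

Characters that are not part of such a sequence pass through unchanged, which matches the remark above that without the backspace they are ordinary symbols.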

Later, when 8-bit character sets gained more acceptance, ISO 8859-1, ISO 8859-2, and ISO 8859-3 became the preferred method of coding most of these variants.

Variants of ASCII that are not ISO 646

There are also some 7-bit character sets that are not officially part of the ISO 646 standard. Examples include:

  • 7-bit Greek, ELOT 927. The Greek alphabet is mapped to positions 0x61–0x71 and 0x73–0x79, on top of the Latin lowercase letters. This mapping with the high bit set is ISO 8859-7.
  • 7-bit Cyrillic, KOI-7 or Short KOI. The Cyrillic characters are mapped to positions 0x60–0x7E, on top of the Latin lowercase letters. Superseded by the KOI-8 variants.
  • 7-bit Hebrew, SI 960. The Hebrew alphabet is mapped to positions 0x60–0x7A, on top of the lowercase Latin letters (and grave accent for aleph). 7-bit Hebrew was always stored in visual order. This mapping with the high bit set, i.e. with the Hebrew letters in 0xE0–0xFA, is ISO 8859-8.
  • 7-bit Arabic, ASMO 449. The Arabic alphabet is mapped to positions 0x41–0x5A and 0x60–0x6A, on top of both uppercase and lowercase Latin letters. This mapping with the high bit set is ISO 8859-6.
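The relationship "this mapping with the high bit set is ISO 8859-8" can be sketched in Python for the 7-bit Hebrew case. The example sets the high bit on bytes in the range 0x60–0x7A given above and then decodes the result with Python's standard `iso8859_8` codec. The function name si960_to_unicode is hypothetical, and the sketch leaves the characters in the order in which they are stored, so it does not convert visual order to logical order.

```python
def si960_to_unicode(data: bytes) -> str:
    """Decode 7-bit Hebrew by moving 0x60-0x7A up to 0xE0-0xFA (ISO 8859-8)."""
    shifted = bytes(b | 0x80 if 0x60 <= b <= 0x7A else b for b in data)
    return shifted.decode("iso8859_8")


decoded = si960_to_unicode(b"ylem")
print([hex(ord(c)) for c in decoded])  # ['0x5e9', '0x5dc', '0x5d5', '0x5dd']
```

Bytes outside 0x60–0x7A, such as digits, punctuation and uppercase Latin letters, are decoded unchanged.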
