ISO/IEC 646

This article is about a character encoding standard. For the ISO C header file, see iso646.h.

ISO/IEC 646:1991, Information technology — ISO 7-bit coded character set for information interchange, is an ISO standard that since its first edition in 1972 has specified a 7-bit character code from which several national standards are derived. ISO/IEC 646 was also ratified by ECMA as ECMA-6.

Characters in the ISO/IEC 646 Basic Character Set are invariant characters.[1] Since that portion of ISO/IEC 646, the invariant character set shared by all countries, specified only the letters of the ISO basic Latin alphabet, countries using additional letters needed to create national variants of ISO 646 to be able to use their native scripts. Because the 8-bit byte was not yet universally accepted at that time, the national characters had to fit within the constraints of 7 bits, which means that some characters that appear in ASCII do not appear in other national variants of ISO 646.

History

ISO/IEC 646 and its predecessor ASCII (ANSI X3.4) largely endorsed existing practice regarding character encodings in the telecommunications industry.

As ASCII did not provide a number of characters needed for languages other than English, a number of national variants were made that replaced some less-used characters with needed ones. Because the national variants were incompatible with one another, an International Reference Version (IRV) of ISO/IEC 646 was introduced in an attempt to at least restrict the replaced set to the same characters in all variants. The original version (ISO 646 IRV) differed from ASCII only in that at code point 0024, ASCII's dollar sign ($) was replaced by the international currency symbol (¤). The final 1991 version of the code, ISO 646:1991, is also known as ITU T.50, International Reference Alphabet or IRA, formerly International Alphabet No. 5 (IA5). This standard lets users choose how to assign the 12 variable characters (i.e., 2 alternative graphic characters and 10 nationally defined characters). Among the possible assignments, the ISO 646:1991 IRV (International Reference Version) is explicitly defined and is identical to ASCII.[2]

The ISO 8859 series of standards governing 8-bit character encodings supersedes the ISO 646 international standard and its national variants by providing 96 additional characters with the additional bit, thus avoiding any substitution of ASCII codes. The ISO 10646 standard, directly related to Unicode, supersedes all of the ISO 646 and ISO 8859 sets with one unified set of character encodings using a larger 21-bit value.

A legacy of ISO/IEC 646 is visible on Windows, where in some fonts or locales, the backslash character used in filenames is rendered as ¥ or other characters. Despite the fact that a different code for ¥ was available even on the original IBM PC, so much text was created with the backslash code used for ¥ that even modern Windows fonts have found it necessary to render the code that way. Another legacy is the existence of trigraphs in the C programming language.
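
The trigraph legacy can be made concrete with a small sketch. The nine trigraph sequences below come from the C language definition rather than from ISO/IEC 646 itself, and the names `TRIGRAPHS` and `replace_trigraphs` are hypothetical. Every replacement character falls on one of the code points that national variants of ISO 646 were allowed to reassign, which is presumably why C offered a spelling for them that uses only invariant characters.

```python
import re

# Third character of each C trigraph ("??x") mapped to the character it stands for.
TRIGRAPHS = {
    "=": "#", "(": "[", "/": "\\", ")": "]", "'": "^",
    "<": "{", "!": "|", ">": "}", "-": "~",
}

# Matches "??" followed by one of the nine trigraph selector characters.
_TRIGRAPH_RE = re.compile(r"\?\?([=(/)'<!>\-])")


def replace_trigraphs(source: str) -> str:
    """Replace C trigraph sequences in a single left-to-right pass."""
    return _TRIGRAPH_RE.sub(lambda m: TRIGRAPHS[m.group(1)], source)


if __name__ == "__main__":
    print(replace_trigraphs("??=define FIRST(a) a??(0??)"))
```

A source file written this way survives transfer through any national variant, because `?`, `=`, `(`, `/` and the other selector characters are in the invariant set.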

Codepage layout

The following table shows the ISO/IEC 646 character set. The row label gives the first hexadecimal digit of the code and the column label the second, so "A" in row 4_, column _1 has the Unicode equivalent 0041 and the ISO/IEC 646 decimal value 65. Cells marked "var" are code points with character glyphs that vary from region to region. These are discussed in detail below.

ISO/IEC 646
     _0   _1   _2   _3   _4   _5   _6   _7   _8   _9   _A   _B   _C   _D   _E   _F
0_   NUL  SOH  STX  ETX  EOT  ENQ  ACK  BEL  BS   HT   LF   VT   FF   CR   SO   SI
1_   DLE  DC1  DC2  DC3  DC4  NAK  SYN  ETB  CAN  EM   SUB  ESC  FS   GS   RS   US
2_   SP   !    "    var  var  %    &    '    (    )    *    +    ,    -    .    /
3_   0    1    2    3    4    5    6    7    8    9    :    ;    <    =    >    ?
4_   var  A    B    C    D    E    F    G    H    I    J    K    L    M    N    O
5_   P    Q    R    S    T    U    V    W    X    Y    Z    var  var  var  var  _
6_   var  a    b    c    d    e    f    g    h    i    j    k    l    m    n    o
7_   p    q    r    s    t    u    v    w    x    y    z    var  var  var  var  DEL
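
As a rough illustration of what the chart implies, the twelve positions that carry no fixed character in it (decimal 35, 36, 64, 91 to 94, 96 and 123 to 126) can be treated as the variable ones, and everything else as safe across national variants. The sketch below checks a string against that set; the names `VARIABLE_CODES` and `uses_only_invariant_codes` are hypothetical, and treating control characters as acceptable is an assumption of this example.

```python
# Code positions with no fixed character in the chart above:
# 0x23, 0x24, 0x40, 0x5B-0x5E, 0x60 and 0x7B-0x7E (twelve in total).
VARIABLE_CODES = (
    frozenset([0x23, 0x24, 0x40, 0x60])
    | frozenset(range(0x5B, 0x5F))
    | frozenset(range(0x7B, 0x7F))
)


def uses_only_invariant_codes(text: str) -> bool:
    """Return True if every character is a 7-bit code outside the variable positions."""
    return all(ord(ch) < 0x80 and ord(ch) not in VARIABLE_CODES for ch in text)


if __name__ == "__main__":
    for sample in ["Hello, world!", "a[0]", "user@example"]:
        print(sample, uses_only_invariant_codes(sample))
```

Text that passes this check should display the same way under any of the national variants; text that fails it may show different glyphs depending on the variant in use.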

National variants

Some national variants of ISO 646 are:

Code  ISO-IR  Standard                       Used in
CA-1  121     CSA Z243.4-1985                Canada (nr. 1 alternative, with “î”) (French, classical)
CA-2  122     CSA Z243.4-1985                Canada (nr. 2 alternative, with “É”) (French, reformed orthography)
CN    057     GB/T 1988-80                   People's Republic of China (Basic Latin)
CU    151     NC 99-10:81                    Cuba (Spanish)
DE    021     DIN 66003                      Germany (German)
DK            DS 2089                        Denmark (Danish)
FI    010     SFS 4017                       Finland (basic version)
FR    069     AFNOR NF Z 62010-1982          France (French)
FR-0  025     AFNOR NF Z 62010-1973          France (obsolete since April 1985)
GB    004     BS 4730                        United Kingdom (English)
GR    088     HOS ELOT                       Greece (obsolete)
HU    086     MSZ 7795/3                     Hungary (Hungarian)
IE    207     NSAI 433:1996                  Ireland (Irish)
INV   170     ISO 646:1983                   Invariant subset
IRV   002     ISO 646:1983                   International Reference Variant
JA    014     JIS C 6220-1969                Japan (Romaji)
JA-O  092     JIS C 6229-1984                Japan (OCR-B)
KR            KS C 5636-1989                 South Korea
MT            ?                              Malta (Maltese, English)
NO    060     NS 4551 version 1              Norway
NO-2  061     NS 4551 version 2              Norway (obsolete since June 1987)
SE    010     SEN 85 02 00 Annex B           Sweden (basic Swedish)
SE-C  011     SEN 85 02 00 Annex C           Sweden (extended Swedish for names)
T.61  102     ITU/CCITT T.61 Recommendation  International (Teletex)
TW            CNS 5205-1996                  Republic of China (Taiwan)
US    006     ANSI X3.4-1968                 United States (ASCII)
YU    141     JUS I.B1.002 (YUSCII)          former Yugoslavia (Croatian, Slovene, Serbian, Bosnian)

Other proprietary standards approved later for international use by some standard committees:

Code   ISO-IR  Approved by  Origin          Used in
ES     085     ECMA         IBM             Spain (Basque, Castilian, Catalan, Galician)
esp    017     ECMA         Olivetti        Spanish (international)
DK-SE  009-1   SIS          NATS, main set  Sweden and Denmark (journalistic texts)
FI-SE  008-1   SIS          NATS, main set  Sweden and Finland (journalistic texts)
ita    015     ECMA         Olivetti        Italian
PT     084     ECMA         IBM             Portugal (Portuguese, Spanish)
por    016     ECMA         Olivetti        Portuguese (international)

The specifics of the changes for some of these variants are given in this table:

Codes Characters for each ISO 646 compatible charset
binary dec hex INV T.61 US JA JA-O KR CN TW IRV GB DK NO NO-2 FI,SE SE-C DE HU FR FR-0 CA-1 CA-2 IE IS ita por PT esp ES CU MT YU
010 0010 34 22 " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " "
010 0011 35 23   # # # # # # # # £ # # § # # # # £ £ # # £ # £ # £ # # # # #
010 0100 36 24   ¤ $ $ $ $ ¥ $ $ $ $ $ $ ¤ ¤ $ ¤ $ $ $ $ $ $ $ $ $ $ $ ¤ $ $
010 0111 39 27 ' ' ' ' ' ' ' ' '
010 1100 44 2C , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,
010 1101 45 2D - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
010 1111 47 2F / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
100 0000 64 40   @ @ @ @ @ @ @ @ @ @ @ @ @ É § Á à à à à Ó Ð § § ´ § · @ @ Ž
101 1011 91 5B   [ [ [ [ [ [ [ [ [ Æ Æ Æ Ä Ä Ä É ° ° â â É Þ ° Ã Ã ¡ ¡ ¡ ġ Š
101 1100 92 5C     \ ¥ ¥ \ \ \ \ Ø Ø Ø Ö Ö Ö Ö ç ç ç ç Í \ ç Ç Ç Ñ Ñ Ñ ż Đ
101 1101 93 5D   ] ] ] ] ] ] ] ] ] Å Å Å Å Å Ü Ü § § ê ê Ú Æ é Õ Õ ¿ Ç ] ħ Ć
101 1110 94 5E     ^ ^ ^ ^ ^ ^ ˆ ˆ ˆ ˆ ˆ ˆ Ü ˆ ˆ ^ ˆ î É Á Ö ˆ ˆ ˆ ˆ ¿ ¿ ˆ Č
101 1111 95 5F _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
110 0000 96 60     ` `   ` ` ` ` ` ` ` ` ` é ` á µ µ ô ô ó ð ù ` ` ` ` ` ċ ž
111 1011 123 7B     { { { { { { { { æ æ æ ä ä ä é é é é é é þ à ã ã ° ´ ´ Ġ š
111 1100 124 7C   | | | | | | | | | ø ø ø ö ö ö ö ù ù ù ù í | ò ç ç ñ ñ ñ Ż đ
111 1101 125 7D     } } } } } } } } å å å å å ü ü è è è è ú æ è õ õ ç ç [ Ħ ć
111 1110 126 7E     ~   ˜ ˜ ˜ ¯ | ˜ ü ß ˝ ¨ ¨ û û á ö ì ° ˜ ˜ ¨ ¨ Ċ č

In the table above, the cells with non-white background emphasize the differences from the US variant used in the Basic Latin subset of ISO/IEC 10646 and Unicode.

The characters displayed in cells with red background could be used as combining characters when preceded or followed by a backspace (a C0 control character). This encoding method may be considered deprecated.

Later, when wider character sets gained more acceptance, ISO 8859, vendor-specific character sets and eventually Unicode became the preferred methods of coding most of these variants.
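
The substitutions in the table above can be applied mechanically when reading old 7-bit data. The following is a minimal sketch, assuming the DE column of the table (§ at 0x40, Ä Ö Ü at 0x5B to 0x5D, ä ö ü ß at 0x7B to 0x7E); the names `DE_SUBSTITUTIONS` and `decode_iso646` are hypothetical and not part of any standard library.

```python
from typing import Dict

# Code points where the DE variant differs from the US variant,
# read from the DE column of the table above.
DE_SUBSTITUTIONS: Dict[int, str] = {
    0x40: "§",
    0x5B: "Ä", 0x5C: "Ö", 0x5D: "Ü",
    0x7B: "ä", 0x7C: "ö", 0x7D: "ü", 0x7E: "ß",
}


def decode_iso646(data: bytes, substitutions: Dict[int, str]) -> str:
    """Decode 7-bit data as ASCII, then swap in the variant's characters.

    Raises UnicodeDecodeError if any byte is above 0x7F.
    """
    return data.decode("ascii").translate(substitutions)


if __name__ == "__main__":
    print(decode_iso646(b"Gr}~e", DE_SUBSTITUTIONS))
    print(decode_iso646(b"int a[] = {0};", DE_SUBSTITUTIONS))
```

The second call shows the practical cost of the scheme: program text written with the US variant becomes hard to read when displayed under a national variant, because brackets and braces share code points with accented letters.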

Variants of ASCII that are not ISO 646

There are also some 7-bit character sets that are not officially part of the ISO 646 standard. Examples include:

  • 7-bit Greek, ELOT 927. The Greek alphabet is mapped to positions 0x61–0x71 and 0x73–0x79, on top of the Latin lowercase letters.
  • 7-bit Cyrillic, KOI-7 or Short KOI. The Cyrillic characters are mapped to positions 0x60–0x7E, on top of the Latin lowercase letters. Superseded by the KOI-8 variants.
  • 7-bit Hebrew, SI 960. The Hebrew alphabet is mapped to positions 0x60–0x7A, on top of the lowercase Latin letters (and grave accent for aleph). 7-bit Hebrew was always stored in visual order. This mapping with the high bit set, i.e. with the Hebrew letters in 0xE0–0xFA, is ISO 8859-8.
  • 7-bit Arabic, ASMO 449. The Arabic alphabet is mapped to positions 0x41–0x5A and 0x60–0x6A, on top of both uppercase and lowercase Latin letters. This mapping with the high bit set is ISO 8859-6.
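
The relationship between 7-bit Hebrew and ISO 8859-8 described in the list above can be sketched as follows: setting the high bit on bytes in 0x60–0x7A moves them to 0xE0–0xFA, after which Python's standard `iso8859_8` codec can decode them. The function name `hebrew7_to_text` is hypothetical, and the sketch assumes that all positions outside 0x60–0x7A are plain ASCII. It does not reorder the text, so visual-order input stays in visual order.

```python
def hebrew7_to_text(data: bytes) -> str:
    """Decode 7-bit Hebrew (SI 960) by mapping it onto ISO 8859-8."""
    if any(b > 0x7F for b in data):
        raise ValueError("expected 7-bit input")
    # Set the high bit only on the positions that hold Hebrew letters.
    shifted = bytes(b | 0x80 if 0x60 <= b <= 0x7A else b for b in data)
    return shifted.decode("iso8859_8")


if __name__ == "__main__":
    # 0x60 is aleph and 0x7A is the last letter of the alphabet.
    print(hebrew7_to_text(b"\x60\x7a"))
```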

References

  1. "Invariant Character Handling". NISO Circulation Interchange Protocol. NCIP Standing Committee (NCIP-SC).
  2. Yuri Demchenko. "Section 4. INTERNATIONAL STANDARDIZATION OF 7-BIT CODES, ISO 646". Terena.org. Retrieved 2012-08-13.
