EBCDIC

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Rwwww at 17:02, 14 April 2010 (punch card -> punched card). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Extended Binary Coded Decimal Interchange Code (EBCDIC) is an 8-bit character encoding (code page) used on IBM mainframe operating systems such as z/OS, OS/390, VM and VSE, as well as IBM midrange computer operating systems such as OS/400 and i5/OS (see also Binary Coded Decimal). It is also employed on various non-IBM platforms such as Fujitsu-Siemens' BS2000/OSD, HP MPE/iX, and Unisys MCP. EBCDIC descended from the code used with punched cards and the corresponding six-bit binary-coded decimal code used with most of IBM's computer peripherals of the late 1950s and early 1960s.

History

EBCDIC (pronounced /ˈɛbsəˌdɪk/) was devised in 1963 and 1964 by IBM and was announced with the release of the IBM System/360 line of mainframe computers. It was created to extend the Binary-Coded Decimal encoding that existed at the time. It is an 8-bit character encoding, in contrast to, and developed separately from, the 7-bit ASCII encoding scheme.

While IBM was a chief proponent of the ASCII standardization committee, the company did not have time to prepare ASCII peripherals (such as card punch machines) to ship with its System/360 computers, so it settled on EBCDIC. The System/360 became wildly successful, and thus so did EBCDIC.

All IBM mainframe peripherals and operating systems (except Linux on zSeries or iSeries) use EBCDIC as their inherent encoding,[1] but software can translate to and from other encodings. Many hardware peripherals provide translation as well, and modern mainframes (such as IBM zSeries) include processor instructions that accelerate translation between character sets at the hardware level.
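As a rough illustration of translation done in software, the sketch below uses Python's standard codecs. It assumes that the `cp500` codec (Python's name for code page 500) is an acceptable stand-in for whichever EBCDIC variant a given system uses; the helper names are hypothetical and chosen only for this example.

```python
# Sketch: translating text between EBCDIC and an ASCII-based encoding in software.
# Assumption: code page 500 (Python codec "cp500") stands in for the EBCDIC
# variant actually in use; other systems may need a different code page.

def to_ebcdic(text: str) -> bytes:
    """Encode a Unicode string as EBCDIC (code page 500) bytes."""
    return text.encode("cp500")

def from_ebcdic(data: bytes) -> str:
    """Decode EBCDIC (code page 500) bytes into a Unicode string."""
    return data.decode("cp500")

def ebcdic_to_ascii(data: bytes) -> bytes:
    """Re-encode EBCDIC bytes as ASCII.

    Raises UnicodeEncodeError if the text contains non-ASCII characters.
    """
    return from_ebcdic(data).encode("ascii")

ebcdic_hello = to_ebcdic("HELLO")
print(ebcdic_hello.hex())              # c8c5d3d3d6
print(ebcdic_to_ascii(ebcdic_hello))   # b'HELLO'
```

Going through a Unicode string in the middle keeps the two byte encodings independent, at the cost of an extra conversion step.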

At the time it was devised, EBCDIC made it relatively easy to enter data into a computer with punched cards. Since punched cards are no longer used on mainframes, EBCDIC is used in modern mainframes primarily for backwards compatibility. It does have the advantage of limiting the number of holes punched per column to two for uppercase letters and digits, which increases the durability of punched cards as they are handled by a card reader. The punched card encoding is also known as Hollerith code.[2]

EBCDIC has no modern technical advantage over ASCII-based code pages such as the ISO-8859 series or Unicode. Each has some technical niceties; for example, ASCII and EBCDIC both have a single bit that indicates upper or lower case. But some aspects of EBCDIC make it much less pleasant to work with than ASCII, such as its non-contiguous alphabet. As with single-byte extended ASCII code pages, most EBCDIC code pages allow only two languages (English and one other language) to be used in a database or text file.
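The case bit can be sketched as follows. The example assumes Python's `cp500` codec as the EBCDIC variant; the constant and function names are hypothetical. In EBCDIC the bit with value 0x40 is set for uppercase letters, while in ASCII the bit with value 0x20 is set for lowercase letters.

```python
# Sketch: the single bit that distinguishes upper and lower case.
# Assumption: code page 500 (Python codec "cp500") is the EBCDIC variant.

EBCDIC_CASE_BIT = 0x40  # set for uppercase letters in EBCDIC
ASCII_CASE_BIT = 0x20   # set for lowercase letters in ASCII

def ebcdic_upper(byte: int) -> int:
    """Uppercase one EBCDIC letter byte; only meaningful for letter codes."""
    return byte | EBCDIC_CASE_BIT

def ascii_upper(byte: int) -> int:
    """Uppercase one ASCII letter byte; only meaningful for letter codes."""
    return byte & ~ASCII_CASE_BIT

e = "a".encode("cp500")[0]
a = "a".encode("ascii")[0]
print(f"EBCDIC: {e:#04x} -> {ebcdic_upper(e):#04x}")  # EBCDIC: 0x81 -> 0xc1
print(f"ASCII:  {a:#04x} -> {ascii_upper(a):#04x}")   # ASCII:  0x61 -> 0x41
```

Neither helper checks that its argument is actually a letter, so applying them to digits or punctuation gives meaningless results.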

Where true support for multilingual text is desired, a system supporting far more characters is needed. Generally this is done with some form of Unicode support. There is an EBCDIC Unicode Transformation Format called UTF-EBCDIC, proposed by the Unicode Consortium, but it is not intended to be used in open interchange environments and, even on EBCDIC-based systems, it is almost never used. IBM mainframes support UTF-16, but they do not support UTF-EBCDIC natively.

Arabic EBCDIC versions are typically in presentation order, in left to right order as displayed by an older mainframe or line printer, rather than in the right to left logical order used by modern encodings such as Unicode.

Codepage layout

The table below is derived from CCSID 500, one of the code page variants of EBCDIC, showing only the basic (English) EBCDIC characters. Characters 00–3F and FF are controls, 40 is space, 41 is no-break space (RSP: "Required Space"), E1 is numeric space (NSP: "Numeric Space"), and CA is soft hyphen. Characters are shown with their equivalent Unicode codes. Invariant alphanumeric, punctuation, and control characters common to all EBCDIC code pages are shown in color. Unassigned codes are typically filled with international or region-specific characters in the various EBCDIC code page variants.

EBCDIC (each cell gives the character over its Unicode code point; "·" marks an empty entry; row labels include the decimal range)

            _0      _1      _2      _3      _4      _5      _6      _7      _8      _9      _A      _B      _C      _D      _E      _F
0_ 0–15     NUL     SOH     STX     ETX     SEL     HT      RNL     DEL     GE      SPS     RPT     VT      FF      CR      SO      SI
            0000    0001    0002    0003    ·       0009    ·       007F    ·       ·       ·       000B    000C    000D    000E    000F
1_ 16–31    DLE     DC1     DC2     DC3     RES/ENP NL      BS      POC     CAN     EM      UBS     CU1     IFS     IGS     IRS     IUS/ITB
            0010    0011    0012    0013    ·       0085    0008    ·       0018    0019    ·       ·       001C    001D    001E    001F
2_ 32–47    DS      SOS     FS      WUS     BYP/INP LF      ETB     ESC     SA      SFE     SM/SW   CSP     MFA     ENQ     ACK     BEL
            ·       ·       ·       ·       ·       000A    0017    001B    ·       ·       ·       ·       ·       0005    0006    0007
3_ 48–63    ·       ·       SYN     IR      PP      TRN     NBS     EOT     SBS     IT      RFF     CU3     DC4     NAK     ·       SUB
            ·       ·       0016    ·       ·       ·       ·       0004    ·       ·       ·       ·       0014    0015    ·       001A
4_ 64–79    SP      RSP     ·       ·       ·       ·       ·       ·       ·       ·       ·       .       <       (       +       |
            0020    00A0    ·       ·       ·       ·       ·       ·       ·       ·       ·       002E    003C    0028    002B    007C
5_ 80–95    &       ·       ·       ·       ·       ·       ·       ·       ·       ·       !       $       *       )       ;       ¬
            0026    ·       ·       ·       ·       ·       ·       ·       ·       ·       0021    0024    002A    0029    003B    00AC
6_ 96–111   -       /       ·       ·       ·       ·       ·       ·       ·       ·       ¦       ,       %       _       >       ?
            002D    002F    ·       ·       ·       ·       ·       ·       ·       ·       00A6    002C    0025    005F    003E    003F
7_ 112–127  ·       ·       ·       ·       ·       ·       ·       ·       ·       `       :       #       @       '       =       "
            ·       ·       ·       ·       ·       ·       ·       ·       ·       0060    003A    0023    0040    0027    003D    0022
8_ 128–143  ·       a       b       c       d       e       f       g       h       i       ·       ·       ·       ·       ·       ±
            ·       0061    0062    0063    0064    0065    0066    0067    0068    0069    ·       ·       ·       ·       ·       00B1
9_ 144–159  ·       j       k       l       m       n       o       p       q       r       ·       ·       ·       ·       ·       ·
            ·       006A    006B    006C    006D    006E    006F    0070    0071    0072    ·       ·       ·       ·       ·       ·
A_ 160–175  ·       ~       s       t       u       v       w       x       y       z       ·       ·       ·       ·       ·       ·
            ·       007E    0073    0074    0075    0076    0077    0078    0079    007A    ·       ·       ·       ·       ·       ·
B_ 176–191  ^       ·       ·       ·       ·       ·       ·       ·       ·       ·       [       ]       ·       ·       ·       ·
            005E    ·       ·       ·       ·       ·       ·       ·       ·       ·       005B    005D    ·       ·       ·       ·
C_ 192–207  {       A       B       C       D       E       F       G       H       I       SHY     ·       ·       ·       ·       ·
            007B    0041    0042    0043    0044    0045    0046    0047    0048    0049    00AD    ·       ·       ·       ·       ·
D_ 208–223  }       J       K       L       M       N       O       P       Q       R       ·       ·       ·       ·       ·       ·
            007D    004A    004B    004C    004D    004E    004F    0050    0051    0052    ·       ·       ·       ·       ·       ·
E_ 224–239  \       ·       S       T       U       V       W       X       Y       Z       ·       ·       ·       ·       ·       ·
            005C    ·       0053    0054    0055    0056    0057    0058    0059    005A    ·       ·       ·       ·       ·       ·
F_ 240–255  0       1       2       3       4       5       6       7       8       9       ·       ·       ·       ·       ·       EO
            0030    0031    0032    0033    0034    0035    0036    0037    0038    0039    ·       ·       ·       ·       ·       ·

Characters shown without invariant coloring: | (4F), ! (5A), $ (5B), ¬ (5F), / (61), ` (79), # (7B), ± (8F), ~ (A1), ^ (B0), [ (BA), ] (BB), { (C0), } (D0), \ (E0).
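To make the table layout concrete, the following sketch maps a byte value to its row and column (the high and low hexadecimal digits) and decodes it. It assumes Python's `cp500` codec corresponds to the CCSID 500 layout that the table is derived from; `table_position` and `describe` are hypothetical helper names.

```python
# Sketch: locating a byte in the 16x16 table and decoding it.
# Assumption: Python's "cp500" codec matches the CCSID 500 layout of the table.

def table_position(byte: int) -> tuple:
    """Return (row, column) hex digits for a byte, e.g. 0xC1 -> ('C', '1')."""
    return format(byte >> 4, "X"), format(byte & 0x0F, "X")

def describe(byte: int) -> str:
    """Summarise one table cell: position, decimal value, Unicode code point."""
    row, col = table_position(byte)
    char = bytes([byte]).decode("cp500")
    return f"row {row}_, column {col}: decimal {byte}, U+{ord(char):04X}"

print(describe(0xC1))  # row C_, column 1: decimal 193, U+0041
print(describe(0x40))  # row 4_, column 0: decimal 64, U+0020
```

The codec fills in code points for every byte, including positions the table above leaves empty, so its output for those positions reflects code page 500 specifically rather than the invariant subset.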

Criticism and humor

Open-source-software advocate and hacker Eric S. Raymond writes in his Jargon File that EBCDIC was almost universally loathed by early hackers and programmers because of its multitude of different versions, none of which resembled the other versions, and that IBM produced it in direct competition with the already-established ASCII.

The Jargon File, version 4.4.7, gives the following definition:

EBCDIC: /eb´s@·dik/, /eb´see`dik/, /eb´k@·dik/, n. [abbreviation, Extended Binary Coded Decimal Interchange Code] An alleged character set used on IBM dinosaurs. It exists in at least six mutually incompatible versions, all featuring such delights as non-contiguous letter sequences and the absence of several ASCII punctuation characters fairly important for modern computer languages (exactly which characters are absent varies according to which version of EBCDIC you're looking at). IBM adapted EBCDIC from punched card code in the early 1960s and promulgated it as a customer-control tactic (see connector conspiracy), spurning the already established ASCII standard. Today, IBM claims to be an open-systems company, but IBM's own description of the EBCDIC variants and how to convert between them is still internally classified top-secret, burn-before-reading. Hackers blanch at the very name of EBCDIC and consider it a manifestation of purest evil.

— The Jargon file 4.4.7

Another popular complaint is that the EBCDIC alphabetic characters follow an archaic punched card encoding rather than a linear ordering like ASCII. One consequence of this is that incrementing the character code for "I" does not produce the code for "J", and likewise there is a gap between the codes for "R" and "S". Thus programming a simple control loop to cycle through only the alphabetic characters is problematic.
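The gaps can be seen directly in a short sketch. It assumes Python's `cp500` codec as the EBCDIC variant; the run boundaries come from the code page table (A–I at C1–C9, J–R at D1–D9, S–Z at E2–E9), and the names `UPPERCASE_RUNS` and `ebcdic_uppercase_codes` are hypothetical.

```python
# Sketch: why "add one to the character code" fails in EBCDIC, and one workaround.
# Assumption: code page 500 (Python codec "cp500") is the EBCDIC variant.

# Uppercase letters occupy three separate runs of code points.
UPPERCASE_RUNS = [(0xC1, 0xC9), (0xD1, 0xD9), (0xE2, 0xE9)]  # A-I, J-R, S-Z

def ebcdic_uppercase_codes():
    """Yield the code points of A..Z in order, skipping the gaps between runs."""
    for first, last in UPPERCASE_RUNS:
        yield from range(first, last + 1)

code_i = "I".encode("cp500")[0]
code_j = "J".encode("cp500")[0]
print(hex(code_i), hex(code_j))  # 0xc9 0xd1

# Iterating over the explicit runs recovers the full alphabet.
print(bytes(ebcdic_uppercase_codes()).decode("cp500"))  # ABCDEFGHIJKLMNOPQRSTUVWXYZ
```

A loop of the form `for c in range(code_of_A, code_of_Z + 1)` would also visit the non-letter codes that lie between the runs, which is the problem described above.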

These incompatibilities were also the source of many jokes. A popular one went:

Professor: "So the American government went to IBM to come up with an encryption standard, and they came up with—"
Student: "EBCDIC!"


A reference to the EBCDIC character set is made in the classic Infocom adventure game Zork II. In the "Machine Room", there is a collection of ancient computers and other machines of uncertain purpose. The following is the description of the room, with EBCDIC used to imply an incomprehensible language:

This is a large room full of assorted heavy machinery, whirring noisily. The room smells of burned resistors. Along one wall are three buttons which are, respectively, round, triangular, and square. Naturally, above these buttons are instructions written in EBCDIC...

References

  1. ^ IBM (2008). "IBM confirms the use of EBCDIC in their mainframes as a default practice". Retrieved 2008-06-16.
  2. ^ Haralambous, Yannis; Horne, P. Scott (2007). Fonts & Encodings. O'Reilly Media, Inc. pp. 33–34. ISBN 978-0596102425.