Chen–Ho encoding

Chen–Ho encoding is a memory-efficient alternative system of binary encoding for decimal digits.

The traditional system of binary encoding for decimal digits, known as binary-coded decimal (BCD), uses four bits to encode each digit. This wastes storage and bandwidth, since four bits can represent 16 states but only 10 of them are used.^[1]

The encoding reduces the storage requirements of two decimal digits (100 states) from 8 to 7 bits, and those of three decimal digits (1000 states) from 12 to 10 bits, using only simple Boolean transformations and avoiding complex arithmetic operations such as a base conversion.

History

In what appears to have been a multiple discovery, some of the concepts behind what later became known as Chen–Ho encoding were independently developed by Theodore M. Hertz in 1969^[2] and by Tien Chi Chen in 1971.^[3]

Hertz of Rockwell filed a patent for his encoding in 1969, which was granted in 1971.^[2]

Chen first discussed his ideas with Irving Tze Ho^[4] in 1971. Chen and Ho were both working for IBM at the time, although in different locations.^[5]^[6] Chen also consulted with Frank C. Tung^[7] to verify the results of his theories independently.^[6] IBM filed a patent in their name in 1973, which was granted in 1974.^[8] At least by 1973 Hertz's earlier work must have been known to them, as the patent cites his patent as prior art.^[8]

The final version of the Chen–Ho encoding was circulated inside IBM in 1974^[9] and published in 1975 in the journal Communications of the Association for Computing Machinery (CACM).^[10]^[11] This version included several refinements, primarily related to the application of the encoding system. It constitutes a Huffman-like prefix code.

The encoding has been referred to as Chen–Ho encoding or the Chen–Ho algorithm only since 2000.^[11] After filing a patent in 2001,^[12] Michael F. Cowlishaw published a further refinement of Chen–Ho encoding, known as Densely Packed Decimal (DPD) encoding, in IEE Proceedings – Computers and Digital Techniques in 2002.^[13]^[14] Densely Packed Decimal has subsequently been adopted as the decimal encoding used in the IEEE 754-2008 and ISO/IEC/IEEE 60559:2011 floating-point standards.

Application

Chen noted that the digits zero through seven were simply encoded using three binary digits of the corresponding octal group. He also postulated that one could use a flag to identify a different encoding for the digits eight and nine, which would be encoded using a single bit.

In practice, a series of Boolean transformations is applied to the stream of input bits, compressing BCD-encoded digits from 12 bits per three digits to 10 bits per three digits. The reverse transformations are used to decode the resulting coded stream back to BCD. Equivalent results can also be achieved with a look-up table.

Chen–Ho encoding is limited to encoding sets of three decimal digits into groups of 10 bits (so-called declets).^[1] Of the 1024 states possible with 10 bits, it leaves only 24 states unused^[1] (with don't-care bits typically set to 0 on write and ignored on read). With only 2.34% wastage, it gives a 20% more efficient encoding than BCD with one digit in 4 bits.^[6]^[11]
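
The figures in this paragraph can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the constant names are made up for the example, and the "20%" figure is read here as the ratio of BCD bits to declet bits for the same three digits.

```python
from fractions import Fraction

DECLET_BITS = 10          # bits per Chen–Ho group
DIGITS_PER_DECLET = 3     # decimal digits per group
BCD_BITS_PER_DIGIT = 4    # plain BCD

code_space = 2 ** DECLET_BITS                 # 1024 possible bit patterns
used_states = 10 ** DIGITS_PER_DECLET         # 1000 three-digit values
unused_states = code_space - used_states      # patterns that encode nothing

wastage = Fraction(unused_states, code_space)
bcd_bits = BCD_BITS_PER_DIGIT * DIGITS_PER_DECLET
gain = Fraction(bcd_bits, DECLET_BITS) - 1    # extra digits per bit versus BCD

print(f"unused states: {unused_states}")
print(f"wastage: {float(wastage):.2%}")
print(f"gain over BCD: {float(gain):.0%}")
```

Under that reading, 12 BCD bits against 10 declet bits gives the stated 20% gain, and 24 unused patterns out of 1024 give the wastage figure.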

Both Hertz and Chen also proposed similar, but less efficient, encoding schemes to compress sets of two decimal digits (requiring 8 bits in BCD) into groups of 7 bits.^[2]^[6]

Larger sets of decimal digits could be divided into three- and two-digit groups.^[2]

The patents also discuss the possibility of adapting the scheme to digits encoded in decimal codes other than BCD, such as Excess-3.^[2] The same principles could also be applied to other bases.

In 1973, some form of Chen–Ho encoding appears to have been utilized in the address conversion hardware of the optional IBM 7070/7074 emulation feature for the IBM System/370 Model 165 and 370 Model 168 computers.^[15]^[16]

One prominent application uses a 128-bit register to store 33 decimal digits with a three-digit exponent, which is effectively no less than what could be achieved using binary encoding (whereas BCD encoding would need 144 bits to store the same number of digits).

Storage efficiency
Digits	States	BCD bits	Binary code space	Binary encoding bits [A]	2-digit encoding bits [B]	3-digit encoding bits [C]	Mixed encoding bits	Mixed vs. binary (bits)	Mixed vs. BCD (bits)
1	10	4	16	4	(7)	(10)	4 [1×A]	0	0
2	100	8	128	7	7	(10)	7 [1×B]	0	−1
3	1000	12	1024	10	(14)	10	10 [1×C]	0	−2
4	10000	16	16384	14	14	(20)	14 [2×B]	0	−2
5	100000	20	131072	17	(21)	(20)	17 [1×C+1×B]	0	−3
6	1000000	24	1048576	20	21	20	20 [2×C]	0	−4
7	10000000	28	16777216	24	(28)	(30)	24 [2×C+1×A]	0	−4
8	100000000	32	134217728	27	28	(30)	27 [2×C+1×B]	0	−5
9	1000000000	36	1073741824	30	(35)	30	30 [3×C]	0	−6
10	10000000000	40	17179869184	34	35	(40)	34 [3×C+1×A]	0	−6
11	100000000000	44	137438953472	37	(42)	(40)	37 [3×C+1×B]	0	−7
12	1000000000000	48	1099511627776	40	42	40	40 [4×C]	0	−8
13	10000000000000	52	17592186044416	44	(49)	(50)	44 [4×C+1×A]	0	−8
14	100000000000000	56	140737488355328	47	49	(50)	47 [4×C+1×B]	0	−9
15	1000000000000000	60	1125899906842624	50	(56)	50	50 [5×C]	0	−10
16	10000000000000000	64	18014398509481984	54	56	(60)	54 [5×C+1×A]	0	−10
17	100000000000000000	68	144115188075855872	57	(63)	(60)	57 [5×C+1×B]	0	−11
18	1000000000000000000	72	1152921504606846976	60	63	60	60 [6×C]	0	−12
19	10000000000000000000	76	18446744073709551616	64	(70)	(70)	64 [6×C+1×A]	0	−12
20	…	80	…	67	70	(70)	67 [6×C+1×B]	0	−13
21	…	84	…	70	(77)	70	70 [7×C]	0	−14
22	…	88	…	74	77	(80)	74 [7×C+1×A]	0	−14
23	…	92	…	77	(84)	(80)	77 [7×C+1×B]	0	−15
24	…	96	…	80	84	80	80 [8×C]	0	−16
25	…	100	…	84	(91)	(90)	84 [8×C+1×A]	0	−16
26	…	104	…	87	91	(90)	87 [8×C+1×B]	0	−17
27	…	108	…	90	(98)	90	90 [9×C]	0	−18
28	…	112	…	94	98	(100)	94 [9×C+1×A]	0	−18
29	…	116	…	97	(105)	(100)	97 [9×C+1×B]	0	−19
30	…	120	…	100	105	100	100 [10×C]	0	−20
31	…	124	…	103	(112)	(110)	104 [10×C+1×A]	+1	−20
32	…	128	…	107	112	(110)	107 [10×C+1×B]	0	−21
33	…	132	…	110	(119)	110	110 [11×C]	0	−22
34	…	136	…	113	119	(120)	114 [11×C+1×A]	+1	−22
35	…	140	…	117	(126)	(120)	117 [11×C+1×B]	0	−23
36	…	144	…	120	126	120	120 [12×C]	0	−24
37	…	148	…	123	(133)	(130)	124 [12×C+1×A]	+1	−24
38	…	152	…	127	133	(130)	127 [12×C+1×B]	0	−25
…	…	…	…	…	…	…	…	…	…
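
The bit counts in this table follow from simple rules, sketched below. The function names are illustrative only. The mixed scheme is modelled as one 10-bit group per three digits, plus 7 bits for two leftover digits or 4 bits for one; for four digits the table lists two 7-bit groups instead, which costs the same 14 bits.

```python
def bcd_bits(digits: int) -> int:
    """Plain BCD: four bits per decimal digit."""
    return 4 * digits

def binary_bits(digits: int) -> int:
    """Fewest bits that can hold every value from 0 to 10**digits - 1."""
    return (10 ** digits - 1).bit_length()

def mixed_bits(digits: int) -> int:
    """10 bits per full three-digit group, then 4 or 7 bits for 1 or 2 leftover digits."""
    groups, rest = divmod(digits, 3)
    return 10 * groups + (0, 4, 7)[rest]

# Columns: digits, BCD bits, binary bits, mixed bits, mixed vs. binary, mixed vs. BCD
for n in (1, 2, 3, 31, 36, 38):
    m = mixed_bits(n)
    print(n, bcd_bits(n), binary_bits(n), m, m - binary_bits(n), m - bcd_bits(n))
```

Up to 30 digits the mixed encoding needs exactly as many bits as a pure binary integer; the first one-bit penalty appears at 31 digits.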

Encodings for three decimal digits

Hertz encoding

Hertz decimal data encoding for a single declet (1969 form)^[2]
Code space (1024 states)	b9	b8	b7	b6	b5	b4	b3	b2	b1	b0	d2	d1	d0	Values encoded	Description	Possibilities (1000 states)
Binary encoding											Decimal digits
50.0% (512 states)	0	a	b	c	d	e	f	g	h	i	0abc	0def	0ghi	(0–7) (0–7) (0–7)	Three lower digits	51.2% (512 states)
37.5% (384 states)	1	0	0	c	d	e	f	g	h	i	100c	0def	0ghi	(8–9) (0–7) (0–7)	Two lower digits, one higher digit	38.4% (384 states)
	1	0	1	f	a	b	c	g	h	i	0abc	100f	0ghi	(0–7) (8–9) (0–7)
	1	1	0	i	a	b	c	d	e	f	0abc	0def	100i	(0–7) (0–7) (8–9)
9.375% (96 states)	1	1	1	f	0	0	i	a	b	c	0abc	100f	100i	(0–7) (8–9) (8–9)	One lower digit, two higher digits	9.6% (96 states)
	1	1	1	c	0	1	i	d	e	f	100c	0def	100i	(8–9) (0–7) (8–9)
	1	1	1	c	1	0	f	g	h	i	100c	100f	0ghi	(8–9) (8–9) (0–7)
3.125% (32 states, 8 used)	1	1	1	c	1	1	f	(0)	(0)	i	100c	100f	100i	(8–9) (8–9) (8–9)	Three higher digits, bits b2 and b1 are don't care	0.8% (8 states)
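
The state counts in the first and last columns can be checked by counting free bit positions in each row. The sketch below transcribes the rows of the table above as strings (b9 first); the string notation and the helper names are assumptions made for this example, with "-" standing for a don't-care bit.

```python
# Rows of the Hertz table, bits b9..b0. Letters carry digit bits,
# "0"/"1" are fixed selector bits, "-" marks a don't-care bit.
HERTZ_ROWS = [
    "0abcdefghi",
    "100cdefghi",
    "101fabcghi",
    "110iabcdef",
    "111f00iabc",
    "111c01idef",
    "111c10fghi",
    "111c11f--i",
]

def code_space(pattern: str) -> int:
    """Bit patterns claimed by a row: every non-selector position is free."""
    return 2 ** sum(ch not in "01" for ch in pattern)

def used_states(pattern: str) -> int:
    """Digit triples a row encodes: only lettered positions carry data."""
    return 2 ** sum(ch.isalpha() for ch in pattern)

for p in HERTZ_ROWS:
    print(p, code_space(p), used_states(p))

total_codes = sum(code_space(p) for p in HERTZ_ROWS)
total_used = sum(used_states(p) for p in HERTZ_ROWS)
print(total_codes, total_used)
```

The rows together claim all 1024 patterns and encode exactly 1000 digit triples; the 24 unused patterns all come from the two don't-care bits of the last row.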

Early Chen–Ho encoding

Decimal data encoding for a single declet (early 1971 form)^[6]
Code space (1024 states)	b9	b8	b7	b6	b5	b4	b3	b2	b1	b0	d2	d1	d0	Values encoded	Description	Possibilities (1000 states)
Binary encoding											Decimal digits
50.0% (512 states)	0	a	b	c	d	e	f	g	h	i	0abc	0def	0ghi	(0–7) (0–7) (0–7)	Three lower digits	51.2% (512 states)
37.5% (384 states)	1	0	0	c	d	e	f	g	h	i	100c	0def	0ghi	(8–9) (0–7) (0–7)	Two lower digits, one higher digit	38.4% (384 states)
	1	0	1	f	g	h	i	a	b	c	0abc	100f	0ghi	(0–7) (8–9) (0–7)
	1	1	0	i	a	b	c	d	e	f	0abc	0def	100i	(0–7) (0–7) (8–9)
9.375% (96 states)	1	1	1	0	0	f	i	a	b	c	0abc	100f	100i	(0–7) (8–9) (8–9)	One lower digit, two higher digits	9.6% (96 states)
	1	1	1	0	1	i	c	d	e	f	100c	0def	100i	(8–9) (0–7) (8–9)
	1	1	1	1	0	c	f	g	h	i	100c	100f	0ghi	(8–9) (8–9) (0–7)
3.125% (32 states, 8 used)	1	1	1	1	1	c	f	i	(0)	(0)	100c	100f	100i	(8–9) (8–9) (8–9)	Three higher digits, bits b1 and b0 are don't care	0.8% (8 states)

Patented Chen–Ho encoding

Decimal data encoding for a single declet (patented 1973 form)^[8]
Code space (1024 states)	b9	b8	b7	b6	b5	b4	b3	b2	b1	b0	d2	d1	d0	Values encoded	Description	Possibilities (1000 states)
Binary encoding											Decimal digits
50.0% (512 states)	0	a	b	d	e	g	h	c	f	i	0abc	0def	0ghi	(0–7) (0–7) (0–7)	Three lower digits	51.2% (512 states)
37.5% (384 states)	1	0	0	d	e	g	h	c	f	i	100c	0def	0ghi	(8–9) (0–7) (0–7)	Two lower digits, one higher digit	38.4% (384 states)
	1	0	1	a	b	g	h	c	f	i	0abc	100f	0ghi	(0–7) (8–9) (0–7)
	1	1	0	d	e	a	b	c	f	i	0abc	0def	100i	(0–7) (0–7) (8–9)
9.375% (96 states)	1	1	1	1	0	a	b	c	f	i	0abc	100f	100i	(0–7) (8–9) (8–9)	One lower digit, two higher digits	9.6% (96 states)
	1	1	1	0	1	d	e	c	f	i	100c	0def	100i	(8–9) (0–7) (8–9)
	1	1	1	0	0	g	h	c	f	i	100c	100f	0ghi	(8–9) (8–9) (0–7)
3.125% (32 states, 8 used)	1	1	1	1	1	(0)	(0)	c	f	i	100c	100f	100i	(8–9) (8–9) (8–9)	Three higher digits, bits b4 and b3 are don't care	0.8% (8 states)
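
In this form the lowest bit of each digit (c, f, i) always lands in b2, b1 and b0, so an encoder only has to rearrange the remaining bits. The following is a minimal sketch using shifts and masks, transcribed from the rows above; the function name is hypothetical and the code is not taken from the patent.

```python
def encode_patented(d2: int, d1: int, d0: int) -> int:
    """Pack three decimal digits into 10 bits following the 1973 table."""
    if not all(0 <= d <= 9 for d in (d2, d1, d0)):
        raise ValueError("each digit must be in 0..9")
    # c, f, i: the lowest bit of each digit goes straight to b2, b1, b0.
    low = ((d2 & 1) << 2) | ((d1 & 1) << 1) | (d0 & 1)
    # ab, de, gh: the two upper bits of a digit in 0..7 (unused for 8 and 9).
    ab, de, gh = (d2 >> 1) & 3, (d1 >> 1) & 3, (d0 >> 1) & 3
    big = (d2 >= 8, d1 >= 8, d0 >= 8)
    if big == (False, False, False):
        high = (ab << 4) | (de << 2) | gh      # 0 ab de gh
    elif big == (True, False, False):
        high = 0b1000000 | (de << 2) | gh      # 100 de gh
    elif big == (False, True, False):
        high = 0b1010000 | (ab << 2) | gh      # 101 ab gh
    elif big == (False, False, True):
        high = 0b1100000 | (de << 2) | ab      # 110 de ab
    elif big == (False, True, True):
        high = 0b1111000 | ab                  # 11110 ab
    elif big == (True, False, True):
        high = 0b1110100 | de                  # 11101 de
    elif big == (True, True, False):
        high = 0b1110000 | gh                  # 11100 gh
    else:
        high = 0b1111100                       # 11111, b4 and b3 written as 0
    return (high << 3) | low

print(format(encode_patented(1, 2, 3), "010b"))
print(format(encode_patented(8, 0, 0), "010b"))
```

Every one of the 1000 digit triples maps to a distinct 10-bit code, which is what makes the transformation reversible.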

Final Chen–Ho encoding

Chen–Ho decimal data encoding for a single declet (final 1975 form)^[10]^[11]
Code space (1024 states)	b9	b8	b7	b6	b5	b4	b3	b2	b1	b0	d2	d1	d0	Values encoded	Description	Possibilities (1000 states)
Binary encoding											Decimal digits
50.0% (512 states)	0	a	b	c	d	e	f	g	h	i	0abc	0def	0ghi	(0–7) (0–7) (0–7)	Three lower digits	51.2% (512 states)
37.5% (384 states)	1	0	0	c	d	e	f	g	h	i	100c	0def	0ghi	(8–9) (0–7) (0–7)	Two lower digits, one higher digit	38.4% (384 states)
	1	0	1	c	a	b	f	g	h	i	0abc	100f	0ghi	(0–7) (8–9) (0–7)
	1	1	0	c	d	e	f	a	b	i	0abc	0def	100i	(0–7) (0–7) (8–9)
9.375% (96 states)	1	1	1	c	0	0	f	a	b	i	0abc	100f	100i	(0–7) (8–9) (8–9)	One lower digit, two higher digits	9.6% (96 states)
	1	1	1	c	0	1	f	d	e	i	100c	0def	100i	(8–9) (0–7) (8–9)
	1	1	1	c	1	0	f	g	h	i	100c	100f	0ghi	(8–9) (8–9) (0–7)
3.125% (32 states, 8 used)	1	1	1	c	1	1	f	(0)	(0)	i	100c	100f	100i	(8–9) (8–9) (8–9)	Three higher digits, bits b2 and b1 are don't care	0.8% (8 states)
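
The table can be turned into an encoder and decoder almost mechanically. The sketch below is a table-driven illustration, not the Boolean circuit from the original publications: each row is transcribed as a string (b9 first), keyed by which of d2, d1, d0 is an 8 or 9. The names ROWS, NAMES, encode and decode are made up for this example, and "-" stands for a don't-care bit.

```python
# Letters a..i are digit bits, "0"/"1" are fixed selector bits,
# "-" is a don't-care bit (written as 0, ignored on read).
ROWS = {
    (0, 0, 0): "0abcdefghi",
    (1, 0, 0): "100cdefghi",
    (0, 1, 0): "101cabfghi",
    (0, 0, 1): "110cdefabi",
    (0, 1, 1): "111c00fabi",
    (1, 0, 1): "111c01fdei",
    (1, 1, 0): "111c10fghi",
    (1, 1, 1): "111c11f--i",
}
NAMES = ("abc", "def", "ghi")  # bit names of d2, d1, d0, high bit first

def encode(d2: int, d1: int, d0: int) -> int:
    digits = (d2, d1, d0)
    if not all(0 <= d <= 9 for d in digits):
        raise ValueError("each digit must be in 0..9")
    bits = {}
    for digit, names in zip(digits, NAMES):
        for shift, name in zip((2, 1, 0), names):
            bits[name] = (digit >> shift) & 1   # low three bits of the digit
    declet = 0
    for ch in ROWS[tuple(int(d >= 8) for d in digits)]:
        declet = (declet << 1) | (bits[ch] if ch in bits else int(ch == "1"))
    return declet

def decode(declet: int) -> tuple:
    if not 0 <= declet < 1024:
        raise ValueError("a declet has 10 bits")
    word = format(declet, "010b")
    for flags, pattern in ROWS.items():
        # A row applies when all of its fixed selector bits match.
        if all(w == p for w, p in zip(word, pattern) if p in "01"):
            bits = {p: int(w) for w, p in zip(word, pattern) if p.isalpha()}
            return tuple(
                8 + bits[n[2]] if big
                else 4 * bits[n[0]] + 2 * bits[n[1]] + bits[n[2]]
                for big, n in zip(flags, NAMES)
            )
    raise AssertionError("unreachable: the rows cover all 1024 codes")

print(format(encode(1, 2, 3), "010b"), decode(encode(1, 2, 3)))
```

Because the selector bits of the eight rows are mutually exclusive, exactly one row matches any 10-bit input, and patterns that differ only in the don't-care bits decode to the same digits.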

Encodings for two decimal digits

Hertz encoding

Hertz decimal data encoding for a single heptad (1969 form)^[2]
Code space (128 states)	b6	b5	b4	b3	b2	b1	b0	d1	d0	Values encoded	Description	Possibilities (100 states)
Binary encoding								Decimal digits
50.0% (64 states)	0	a	b	c	d	e	f	0abc	0def	(0–7) (0–7)	Two lower digits	64.0% (64 states)
12.5% (16 states)	1	1	0	c	d	e	f	100c	0def	(8–9) (0–7)	One lower digit, one higher digit	16.0% (16 states)
12.5% (16 states)	1	0	1	f	a	b	c	0abc	100f	(0–7) (8–9)	One lower digit, one higher digit	16.0% (16 states)
12.5% (16 states, 4 used)	1	1	1	c	x	x	f	100c	100f	(8–9) (8–9)	Two higher digits	4.0% (4 states)
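
The two-digit scheme is small enough to write out with plain bit operations. The following sketch follows the four rows of the table above; the function names are hypothetical, and the two don't-care bits are written as 0 and ignored when decoding.

```python
def encode_heptad(d1: int, d0: int) -> int:
    """Pack two decimal digits into 7 bits (b6..b0) following the Hertz table."""
    if not (0 <= d1 <= 9 and 0 <= d0 <= 9):
        raise ValueError("each digit must be in 0..9")
    if d1 < 8 and d0 < 8:
        return (d1 << 3) | d0                          # 0 abc def
    if d1 >= 8 and d0 < 8:
        return 0b1100000 | ((d1 & 1) << 3) | d0        # 110 c def
    if d1 < 8:
        return 0b1010000 | ((d0 & 1) << 3) | d1        # 101 f abc
    return 0b1110000 | ((d1 & 1) << 3) | (d0 & 1)      # 111 c xx f

def decode_heptad(heptad: int) -> tuple:
    if not 0 <= heptad < 128:
        raise ValueError("a heptad has 7 bits")
    top = heptad >> 4            # selector bits b6 b5 b4
    b3 = (heptad >> 3) & 1
    if top < 0b100:              # b6 = 0: two digits in 0..7
        return (heptad >> 3) & 7, heptad & 7
    if top == 0b110:
        return 8 + b3, heptad & 7
    if top == 0b101:
        return heptad & 7, 8 + b3
    if top == 0b111:
        return 8 + b3, 8 + (heptad & 1)
    raise ValueError("selector 100 is not assigned in this table")

print(format(encode_heptad(9, 9), "07b"), decode_heptad(encode_heptad(9, 9)))
```

All 100 digit pairs round-trip through these two functions; codes beginning with 100 are rejected because the table assigns no meaning to them.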

Early Chen–Ho encoding, method A

Decimal data encoding for a single heptad (early 1971 form, method A)^[6]
Code space (128 states)	b6	b5	b4	b3	b2	b1	b0	d1	d0	Values encoded	Description	Possibilities (100 states)
Binary encoding								Decimal digits
50.0% (64 states)	0	a	b	c	d	e	f	0abc	0def	(0–7) (0–7)	Two lower digits	64.0% (64 states)
25.0% (32 states, 16 used)	1	0	x	c	d	e	f	100c	0def	(8–9) (0–7)	One lower digit, one higher digit	16.0% (16 states)
12.5% (16 states)	1	1	0	f	a	b	c	0abc	100f	(0–7) (8–9)	One lower digit, one higher digit	16.0% (16 states)
12.5% (16 states, 4 used)	1	1	1	c	x	x	f	100c	100f	(8–9) (8–9)	Two higher digits	4.0% (4 states)

Patented Chen–Ho encoding

Decimal data encoding for a single heptad (patented 1973 form)^[8]
Code space (128 states)	b6	b5	b4	b3	b2	b1	b0	d1	d0	Values encoded	Description	Possibilities (100 states)
Binary encoding								Decimal digits
50.0% (64 states)	0	a	b	c	d	e	f	0abc	0def	(0–7) (0–7)	Two lower digits	64.0% (64 states)
25.0% (32 states, 16 used)	1	0	x	c	d	e	f	100c	0def	(8–9) (0–7)	One lower digit, one higher digit	16.0% (16 states)
12.5% (16 states)	1	1	1	c	a	b	f	0abc	100f	(0–7) (8–9)	One lower digit, one higher digit	16.0% (16 states)
12.5% (16 states, 4 used)	1	1	0	c	x	x	f	100c	100f	(8–9) (8–9)	Two higher digits	4.0% (4 states)

References

1. Muller, Jean-Michel; Brisebarre, Nicolas; de Dinechin, Florent; Jeannerod, Claude-Pierre; Lefèvre, Vincent; Melquiond, Guillaume; Revol, Nathalie; Stehlé, Damien; Torres, Serge (2010). Handbook of Floating-Point Arithmetic (1 ed.). Birkhäuser. doi:10.1007/978-0-8176-4705-6. ISBN 978-0-8176-4704-9. LCCN 2009939668.
2. Hertz, Theodore M. (1971-11-02) [1969-12-15]. "System for the compact storage of decimal numbers" (US Patent). Whittier, CA, USA: North American Rockwell Corporation. US3618047A. Retrieved 2018-07-18. (NB. A coding system very similar to Chen–Ho, also cited as prior art in the Chen–Ho patent.)
3. "CHEN Tien Chi". Archived from the original on 2015-10-23. Retrieved 2016-02-07.
4. Tseng, Li-Ling (1988-04-01). "High-Tech Leadership: Irving T. Ho". Taiwan Info. Archived from the original on 2016-01-01. Retrieved 2016-02-08.
5. Chen, Tien Chi (1971-03-12). "Decimal-binary integer conversion scheme" (Internal memo to Irving Tze Ho). San Jose Research Laboratory: IBM.
6. Chen, Tien Chi (1971-03-29). "Decimal Number Compression" (PDF) (Internal memo to Irving Tze Ho). San Jose Research Laboratory: IBM: 1–4. Archived from the original (PDF) on 2012-10-17. Retrieved 2016-02-07.
7. "IBM资深专家Frank Tung博士8月4日来我校演讲". Archived from the original on 2004-12-08. Retrieved 2016-02-06.
8. Chen, Tien Chi; Ho, Irving Tze (1974-10-15) [1973-06-18]. Written at San Jose, CA, USA & Poughkeepsie, NY, USA. "Binary coded decimal conversion apparatus" (US Patent). Armonk, NY, USA: International Business Machines Corporation (IBM). US3842414A. Retrieved 2018-07-18. (NB. This patent is about the Chen–Ho algorithm.)
9. Chen, Tien Chi; Ho, Irving Tze (1974-06-25). "Storage-Efficient Representation of Decimal Data". Research Report RJ 1420 (Technical report). IBM Research Lab, San Jose, USA: IBM.
10. Chen, Tien Chi; Ho, Irving Tze (January 1975). "Storage-Efficient Representation of Decimal Data". Communications of the Association for Computing Machinery (CACM). 18 (1): 49–52. doi:10.1145/360569.360660. Retrieved 2016-02-07.
11. Cowlishaw, Michael Frederic (2014) [2000]. "A Summary of Chen-Ho Decimal Data encoding". IBM. Archived from the original on 2015-09-24. Retrieved 2016-02-07.
12. Cowlishaw, Michael Frederic (2003-02-25) [2002-05-20, 2001-01-27]. Written at Coventry, UK. "Decimal to binary coder/decoder" (US Patent). Armonk, NY, USA: International Business Machines Corporation (IBM). US6525679B1. Retrieved 2018-07-18. And Cowlishaw, Michael Frederic (2007-11-07) [2004-01-14, 2002-08-14, 2001-09-24, 2001-01-27]. Written at Winchester, Hampshire, UK. "Decimal to binary coder/decoder" (European Patent). Armonk, NY, USA: International Business Machines Corporation (IBM). EP1231716A2. Retrieved 2018-07-18. (NB. This patent about DPD also discusses the Chen–Ho algorithm.)
13. Cowlishaw, Michael Frederic (May 2002). "Densely Packed Decimal Encoding". IEE Proceedings – Computers and Digital Techniques. 149 (3). London: Institution of Electrical Engineers (IEE): 102–104. doi:10.1049/ip-cdt:20020407. ISSN 1350-2387. Retrieved 2016-02-07.
14. Cowlishaw, Michael Frederic (2007-02-13) [2000]. "A Summary of Densely Packed Decimal encoding". IBM. Archived from the original on 2015-09-24. Retrieved 2016-02-07.
15. Savard, John J. G. (2018) [2007]. "Chen-Ho Encoding and Densely Packed Decimal". quadibloc. Archived from the original on 2018-07-16. Retrieved 2018-07-16.
16. 7070/7074 Compatibility Feature for IBM System/370 Models 165, 165 II, and 168 (PDF) (2 ed.). IBM. June 1973 [1970]. GA22-6958-1 (File No. 5/370-13). Archived from the original (PDF) on 2018-07-22. Retrieved 2018-07-21.
