IBM Floating Point Architecture
In comparison to IEEE 754 floating-point, the IBM floating-point format has a longer significand and a shorter exponent. All IBM floating-point formats have 7 bits of exponent with a bias of 64. The normalized range of representable numbers is from 16^-65 to 16^63 (approx. 5.39761 × 10^-79 to 7.237005 × 10^75).
The number is represented by the following formula: (-1)^sign × 0.significand × 16^(exponent - 64)
Single-precision 32-bit
A single-precision hexadecimal floating-point number is stored in a 32-bit word:
Field      Width (bits)   Bit index*
S          1              31
Exp        7              30 ... 24
Fraction   24             23 ... 0
* Note that IBM documentation numbers the bits from left to right, so that the most significant bit is designated as bit number 0.
Note that in this format the leading bit is not suppressed (there is no hidden bit), and the radix point sits to the left of the fraction; changing the exponent by one moves it by 4 bits, that is, by one hexadecimal digit.
Since the base is 16, the exponent range in this format is about twice as large as that of IEEE 754 single precision. A binary format would need 9 exponent bits to cover a similar range.
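The layout and formula above can be turned into a small decoder. The following is a minimal sketch rather than a reference implementation; the function name ibm32_to_float is hypothetical, and it assumes the 32-bit word is already available as an unsigned integer with the sign in the most significant bit.

```python
def ibm32_to_float(word: int) -> float:
    """Decode a 32-bit IBM hexadecimal floating-point word (hypothetical helper)."""
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("expected an unsigned 32-bit integer")
    sign = -1.0 if word >> 31 else 1.0
    exponent = (word >> 24) & 0x7F   # 7-bit exponent, biased by 64
    fraction = word & 0x00FFFFFF     # 24-bit fraction, no hidden bit
    # value = (-1)^sign * (fraction / 16^6) * 16^(exponent - 64)
    return sign * fraction * 16.0 ** (exponent - 64 - 6)


print(ibm32_to_float(0x41100000))  # 0.1 (hex) * 16^1
```

Every single-precision value fits exactly in a Python float (an IEEE 754 double), so this conversion should not lose precision.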
Let us encode the number −118.625 using the IBM floating-point system.
We need to get the sign, the exponent and the fraction.
Because it is a negative number, the sign is "1". Let's find the others.
First, we write the number (without the sign) in binary notation (see binary numeral system for how to do it). The result is 1110110.101
Now, let's move the radix point left, four bits at a time (because IBM exponents are powers of 16, not 2 as in IEEE): 1110110.101_2 = 0.01110110101_2 × 16^2
The fraction is the part at the right of the radix point, filled with 0 on the right until we get all 24 bits. That is 011101101010000000000000.
The exponent is 2, but we need to bias it and convert it to binary (so that the most negative exponent is 0 and all stored exponents are non-negative binary numbers). For the System/360 format, the bias is 64, so 2 + 64 = 66. In binary, this is written as 1000010.
Putting them all together:
S   Exp        Fraction
1   100 0010   0111 0110 1010 0000 0000 0000
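The encoding steps above can be sketched in code. This is an illustrative sketch under stated assumptions, not IBM's conversion routine: the name float_to_ibm32 is hypothetical, the input is a finite Python float, and excess fraction digits are truncated rather than rounded.

```python
from fractions import Fraction


def float_to_ibm32(value: float) -> int:
    """Encode a finite float as a 32-bit IBM hexadecimal float (hypothetical helper)."""
    if value == 0:
        return 0
    sign = 1 if value < 0 else 0
    magnitude = Fraction(abs(value))      # exact rational value of the float
    exponent = 0
    # Move the radix point one hex digit at a time until 1/16 <= magnitude < 1.
    while magnitude >= 1:
        magnitude /= 16
        exponent += 1
    while magnitude < Fraction(1, 16):
        magnitude *= 16
        exponent -= 1
    fraction = int(magnitude * 16**6)     # keep six hex digits, truncating the rest
    biased = exponent + 64                # excess-64 exponent
    if not 0 <= biased <= 127:
        raise OverflowError("exponent does not fit in 7 bits")
    return (sign << 31) | (biased << 24) | fraction


print(hex(float_to_ibm32(-118.625)))  # 0xc276a000
```

The printed word matches the bit pattern assembled by hand in the example: sign 1, exponent 100 0010, fraction 0111 0110 1010 followed by zeros.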
Largest representable number
S   Exp        Fraction
0   111 1111   1111 1111 1111 1111 1111 1111
The number represented is 0.FFFFFF_16 × 16^(127 - 64) = (1 - 16^-6) × 16^63 ≈ 7.2370051 × 10^75
Smallest positive normalized number
S   Exp        Fraction
0   000 0000   0001 0000 0000 0000 0000 0000
The number represented is 0.1_16 × 16^(0 - 64) = 16^-1 × 16^-64 ≈ 5.397605 × 10^-79
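As a quick check of the two limits, the expressions above can be evaluated directly. This is only a sketch; the variable names are illustrative, and it relies on both values lying well inside the range of a Python float.

```python
# Largest single-precision value: fraction 0.FFFFFF (hex), stored exponent 127.
largest = (1 - 16.0**-6) * 16.0**63

# Smallest positive normalized value: fraction 0.1 (hex), stored exponent 0.
smallest = 16.0**-1 * 16.0**-64

print(f"{largest:.7e}")   # 7.2370051e+75
print(f"{smallest:.6e}")  # 5.397605e-79
```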
Since the base is 16, there can be up to three leading zero bits in the binary significand. When the number is converted into binary, there can therefore be as few as 21 bits of precision. This "wobbling precision" effect can make some calculations very inaccurate.
A good example of the inaccuracy is the representation of the decimal value 0.1. It has no exact binary or hexadecimal representation. In hexadecimal format, it is represented as 0.19999999..._16 or 0.0001 1001 1001 1001 1001 1001 1001..._2, that is:
S   Exp        Fraction
0   100 0000   0001 1001 1001 1001 1001 1010
This has only 21 significant bits, whereas the IEEE 754 binary single-precision representation has 24 bits of precision.
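The loss of precision for 0.1 can be measured with exact rational arithmetic. The sketch below takes the fraction field from the bit pattern shown above and compares it with the nearest IEEE 754 single-precision value, obtained here through the standard struct module; the variable names are illustrative.

```python
import struct
from fractions import Fraction

fraction_field = 0x19999A                 # fraction bits of the word shown above
print(fraction_field.bit_length())        # 21 significant bits

# The stored exponent is 64, so the scale factor is 16^0 = 1.
hfp_value = Fraction(fraction_field, 16**6)

# Nearest IEEE 754 single-precision value to 0.1, widened back to an exact rational.
ieee_value = Fraction(struct.unpack(">f", struct.pack(">f", 0.1))[0])

tenth = Fraction(1, 10)
hfp_error = abs(hfp_value - tenth)
ieee_error = abs(ieee_value - tenth)
print(float(hfp_error), float(ieee_error))
```

For this particular value the hexadecimal representation should come out noticeably further from 0.1 than the binary one.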
Six hexadecimal digits of precision are roughly equivalent to six decimal digits (i.e. (6 - 1) log_10(16) ≈ 6.02). Converting a single-precision hexadecimal float to a decimal string requires at least 9 significant digits (i.e. 6 log_10(16) + 1 ≈ 8.22) for the string to convert back to the same hexadecimal float value.
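The two digit-count formulas can be evaluated for any number of hexadecimal digits. The helper names below are hypothetical; the 14- and 28-digit cases correspond to the double- and extended-precision formats.

```python
import math

LOG10_16 = math.log10(16)  # decimal digits carried by one hexadecimal digit


def guaranteed_decimal_digits(hex_digits: int) -> float:
    """(n - 1) * log10(16): the leading hex digit may hold only one significant bit."""
    return (hex_digits - 1) * LOG10_16


def round_trip_decimal_digits(hex_digits: int) -> int:
    """Smallest whole number of decimal digits not below n * log10(16) + 1."""
    return math.ceil(hex_digits * LOG10_16 + 1)


for name, digits in (("single", 6), ("double", 14), ("extended", 28)):
    print(name, round(guaranteed_decimal_digits(digits), 2),
          round_trip_decimal_digits(digits))
# single 6.02 9
# double 15.65 18
# extended 32.51 35
```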
Double-precision 64-bit
Double-precision is the same except that the mantissa (fraction) field is wider and the double-precision number is stored in a double word (8 bytes):
Field      Width (bits)   Bit index*
S          1              63
Exp        7              62 ... 56
Fraction   56             55 ... 0
* Note that IBM documentation numbers the bits from left to right, so that the most significant bit is designated as bit number 0.
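Note that the exponent range for this format is only about a quarter as large as that of the corresponding IEEE binary format.
Fourteen hexadecimal digits of precision are roughly equivalent to 17 decimal digits (14 log_10(16) ≈ 16.86), although only about 15 are guaranteed once possible leading zero bits are taken into account ((14 - 1) log_10(16) ≈ 15.65). Converting a double-precision hexadecimal float to a decimal string requires at least 18 significant digits (14 log_10(16) + 1 ≈ 17.86) for the string to convert back to the same hexadecimal float value.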
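A double-precision word can be decoded in the same way as a single-precision one. The sketch below, with the hypothetical name ibm64_to_fraction, returns an exact Fraction instead of a float, because the 56-bit fraction is wider than the 53-bit significand of an IEEE 754 double.

```python
from fractions import Fraction


def ibm64_to_fraction(word: int) -> Fraction:
    """Decode a 64-bit IBM hexadecimal float exactly (hypothetical helper)."""
    if not 0 <= word < 1 << 64:
        raise ValueError("expected an unsigned 64-bit integer")
    sign = -1 if word >> 63 else 1
    exponent = (word >> 56) & 0x7F        # 7-bit exponent, biased by 64
    fraction = word & ((1 << 56) - 1)     # 56-bit fraction = 14 hex digits
    return sign * Fraction(fraction, 16**14) * Fraction(16) ** (exponent - 64)


print(float(ibm64_to_fraction(0xC276A00000000000)))  # -118.625

# A fraction of all ones needs 56 bits, so converting it to a float rounds it.
all_ones = ibm64_to_fraction(0x40FFFFFFFFFFFFFF)
print(all_ones == 1, float(all_ones) == 1.0)         # False True
```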
Extended-precision 128-bit
Extended-precision (quadruple-precision) was added to the System/370 series and was available on some S/360 models (S/360-85, -195, and others by special request or simulated by OS software). The extended-precision mantissa (fraction) field is wider, and the extended-precision number is stored as two double words (16 bytes):
High-order part
Field                             Width (bits)   Bit index*
S                                 1              127
Exp                               7              126 ... 120
Fraction (high-order 14 digits)   56             119 ... 64
Low-order part
Field                             Width (bits)   Bit index*
Unused                            8              63 ... 56
Fraction (low-order 14 digits)    56             55 ... 0
* Note that IBM documentation numbers the bits from left to right, so that the most significant bit is designated as bit number 0.
Twenty-eight hexadecimal digits of precision are roughly equivalent to 32 decimal digits ((28 - 1) log_10(16) ≈ 32.51). Converting an extended-precision hexadecimal float to a decimal string requires at least 35 significant digits (28 log_10(16) + 1 ≈ 34.72) for the string to convert back to the same hexadecimal float value.
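The two-part layout can be decoded by joining the two 56-bit fraction fields into one 28-digit fraction. This is a sketch only; the name ibm128_to_fraction is hypothetical, and it treats the top 8 bits of the low-order double word as unused, as described in the layout above.

```python
from fractions import Fraction

FRACTION_MASK = (1 << 56) - 1  # low 56 bits of each double word


def ibm128_to_fraction(high: int, low: int) -> Fraction:
    """Decode an extended-precision value from its two 64-bit halves (hypothetical helper)."""
    sign = -1 if high >> 63 else 1
    exponent = (high >> 56) & 0x7F                      # taken from the high-order part
    fraction = ((high & FRACTION_MASK) << 56) | (low & FRACTION_MASK)
    return sign * Fraction(fraction, 16**28) * Fraction(16) ** (exponent - 64)


print(float(ibm128_to_fraction(0xC276A00000000000, 0)))  # -118.625
```

Setting only the lowest bit of the low-order part changes the value by 16^-28 times the exponent scale, which illustrates the 28-digit precision.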
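Arithmetic Operations
Most arithmetic operations truncate rather than round, like simple pocket calculators. Therefore, in single precision, 1 - 16^-7 = 1: the smaller operand is lost to truncation, so in this case the result is effectively rounded away from zero.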
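The effect can be reproduced with a simplified model of single-precision subtraction. The sketch below is not the actual hardware algorithm: it assumes positive normalized operands, six fraction digits plus one guard digit during alignment, and unbiased exponents; the function name truncating_subtract is hypothetical.

```python
HEX_DIGITS = 6     # fraction digits in single precision
GUARD_DIGITS = 1   # assumption: one extra digit is kept during alignment


def truncating_subtract(frac_a, exp_a, frac_b, exp_b):
    """Return (fraction, exponent) of a - b for normalized a >= b > 0.

    Fractions are 24-bit integers (six hex digits) and exponents are unbiased
    powers of 16. Digits of b shifted out during alignment are discarded.
    """
    shift = exp_a - exp_b                                # hex digits b moves right
    a = frac_a << (4 * GUARD_DIGITS)
    b = (frac_b << (4 * GUARD_DIGITS)) >> (4 * shift)    # truncation happens here
    diff, exp = a - b, exp_a
    if diff == 0:
        return 0, 0
    # Normalize: shift left until the leading hex digit is non-zero.
    leading_digit_unit = 1 << (4 * (HEX_DIGITS + GUARD_DIGITS - 1))
    while diff < leading_digit_unit:
        diff <<= 4
        exp -= 1
    return diff >> (4 * GUARD_DIGITS), exp


# 1 is 0.1 (hex) * 16^1 and 16^-7 is 0.1 (hex) * 16^-6.
print(truncating_subtract(0x100000, 1, 0x100000, -6) == (0x100000, 1))  # True: result is still 1
```

In this model, subtracting 16^-6 instead gives the fraction FFFFFF with exponent 0, that is, 1 - 16^-6, because that operand still fits within the guard digit.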
IEEE 754 on IBM mainframes
Starting with the S/390 G5 in 1998, IBM mainframes have also included IEEE binary floating-point units which conform to the IEEE 754 Standard for Floating-Point Arithmetic. IEEE decimal floating-point was added to IBM System z9 GA2 in 2007 using millicode and in 2008 to the IBM System z10 in hardware.
Modern IBM mainframes support three floating-point radices, with three hexadecimal (HFP) formats, three binary (BFP) formats, and three decimal (DFP) formats. There are two floating-point units per core: one supporting HFP and BFP, and one supporting DFP. A single register file, the FPRs, holds all three formats.
Special uses
The IBM floating-point format is used in:
- SAS 5 Transport files (.XPT) as required by the Food and Drug Administration (FDA) for New Drug Application (NDA) study submissions,
- GRIB (GRIdded Binary) data files to exchange the output of weather prediction models,
- GDS II (Graphic Database System II) format files, and
- SEG Y (Society of Exploration Geophysicists Y) format files.
Systems which use Base-16 Excess-64 Floating-Point format
- IBM System/360
- RCA Spectra
- English Electric System 4
- GEC 4000 series minicomputers
- Interdata 16- and 32-bit computers.