decimal32 floating-point format
In computing, decimal32 is a decimal floating-point computer numbering format that occupies 4 bytes ( 32 bits ) in computer memory. Like the binary16 and binary32 formats, it is intended for memory-saving storage.
In contrast to the binaryxxx data formats, the decimalxxx formats provide exact representation of decimal fractions, exact calculations with them, and the 'ties away from zero' rounding that humans commonly use ( in some range, to some precision, to some degree ), at the cost of reduced performance. They are intended for applications that require exact 'schoolhouse' arithmetic, such as financial and tax computations. ( In short, they avoid many problems such as 0.2 + 0.1 -> 0.30000000000000004 that occur with binary datatypes. )
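The difference can be illustrated with a minimal sketch that uses Python's `decimal` module as a stand-in for a decimal format. This is an assumption made for illustration: the module is arbitrary-precision software arithmetic, not a decimal32 implementation, but it shows the same exactness for decimal fractions.

```python
from decimal import Decimal, ROUND_HALF_UP

# Binary floating point cannot represent 0.1 or 0.2 exactly.
binary_sum = 0.1 + 0.2

# Decimal arithmetic keeps these decimal fractions exact.
decimal_sum = Decimal("0.1") + Decimal("0.2")

# 'Ties away from zero' rounding applied to an exactly stored tie.
rounded = Decimal("2.675").quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(binary_sum)   # 0.30000000000000004
print(decimal_sum)  # 0.3
print(rounded)      # 2.68
```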
Decimal32 floating point is a relatively new decimal floating-point format, formally introduced in the 2008 version[1] of IEEE 754 as well as with ISO/IEC/IEEE 60559:2011.[2]
Decimal32 supports 7 decimal digits of significand and an exponent range of −95 to +96 for 'normal' values, which can have 7 digits of significance. 'Denormal' values with fewer significant digits reach further down to 1E-101. The representable values are thus ±0.000001×10^−95 to ±9.999999×10^96, plus zero, infinities and NaNs. ( Equivalently, ±0000001×10^−101 to ±9999999×10^90. )
( The stored bits can be interpreted in two ways: as 10 raised to 'stored value for the exponent minus a bias of 95', times a significand read as d_0.d_-1 d_-2 d_-3 d_-4 d_-5 d_-6 ( radix point after the first digit, significand fractional ), or as 10 raised to 'stored value for the exponent minus a bias of 101', times a significand read as d_6 d_5 d_4 d_3 d_2 d_1 d_0 ( no radix point, significand integral ). Both lead to the same result [ 2019 version[3] of IEEE 754, clause 3.3, page 18 ]. The second view is more common for decimal datatypes, the first for binary datatypes. )
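These limits can be checked with a small sketch. It assumes that a Python `decimal.Context` with 7 digits of precision, `Emin=-95` and `Emax=96` is an adequate model of the decimal32 value range; the name `DECIMAL32_LIKE` is made up for this example, and the context models only the numeric range, not the bit encoding.

```python
from decimal import Context, Decimal

# Assumed stand-in for decimal32: 7 digits, exponent range -95 .. +96.
# traps=[] keeps the sketch free of exceptions on rounding or underflow.
DECIMAL32_LIKE = Context(prec=7, Emin=-95, Emax=96, traps=[])

# Exponent limits when the significand is read as a 7-digit integer.
smallest_exponent = DECIMAL32_LIKE.Etiny()   # Emin - prec + 1
largest_exponent = DECIMAL32_LIKE.Etop()     # Emax - prec + 1

# Values with more than 7 digits are rounded on conversion.
rounded = DECIMAL32_LIKE.create_decimal("1.23456789")

# The extreme finite magnitudes named in the text.
largest = DECIMAL32_LIKE.create_decimal("9999999E90")
smallest = DECIMAL32_LIKE.create_decimal("1E-101")

print(smallest_exponent, largest_exponent)  # -101 90
print(rounded)                              # 1.234568
```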
The corresponding binary format binary32 has an approximate range from ±1×10^−45 to ±3.4028235×10^38.
Because the significand is not normalized ( the leading digit is allowed to be "0" ), most values with fewer than 7 significant digits have multiple possible representations; 1000000 × 10^−2 = 100000 × 10^−1 = 10000 × 10^0 all have the value 10000. The set of representations of the same value is called a cohort; the different members can be used to denote how many digits of the value are known precisely. Zero has 192 possible representations ( 384 when both signed zeros are included ).
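A cohort can be enumerated with a short sketch. The function name `cohort` and the constants are hypothetical; the limits assume the integral-significand view with exponents from −101 to 90, and the input is assumed to be a non-negative coefficient of at most 7 digits.

```python
MAX_COEFFICIENT = 9_999_999             # 7 decimal digits
MIN_EXPONENT, MAX_EXPONENT = -101, 90   # exponents for an integral significand


def cohort(coefficient, exponent):
    """Return all (coefficient, exponent) pairs that share one value."""
    if coefficient == 0:
        # Zero can carry every possible exponent.
        return [(0, q) for q in range(MIN_EXPONENT, MAX_EXPONENT + 1)]
    # Strip trailing zeros to reach the member with the largest exponent.
    while coefficient % 10 == 0:
        coefficient //= 10
        exponent += 1
    members = []
    # Append zeros again for as long as the coefficient fits in 7 digits.
    while coefficient <= MAX_COEFFICIENT and exponent >= MIN_EXPONENT:
        if exponent <= MAX_EXPONENT:
            members.append((coefficient, exponent))
        coefficient *= 10
        exponent -= 1
    return members


print(len(cohort(10000, 0)))  # 7
print(len(cohort(0, 0)))      # 192
```

Under these assumptions the value 10000 has seven members, from 1 × 10^4 down to 1000000 × 10^−2, while a full 7-digit coefficient such as 1234567 has only one.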
Representation of decimal32 values
decimal32 values are represented in a 'not normalized' form, and some bits of the exponent are combined with the leading bits of the significand in a 'combination field'.
Sign | Combination | Trailing significand field |
---|---|---|
1 bit | 11 bits | 20 bits |
s | ggggggggggg | tttttttttttttttttttt |
This combined encoding allows a little more precision and range. The trade-off is that some simple operations that are used very frequently in code, such as sorting and comparing, do not work directly on the bit pattern; they require computations to extract the exponent and significand and then to bring both operands to an exponent-aligned representation. This effort is partly offset by not having to normalize, but it contributes to the slower performance of the decimal datatypes.
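The exponent alignment needed for a comparison can be sketched as follows. This is an illustration only: `compare_finite` is a hypothetical helper that works on already extracted (sign, coefficient, exponent) triples and relies on Python's unbounded integers, whereas real implementations have to work within fixed-width registers.

```python
def compare_finite(x, y):
    """Compare two finite values given as (sign, coefficient, exponent).

    The exponent is the unbiased exponent for an integral coefficient.
    Returns -1, 0 or 1.
    """
    (sign_x, coeff_x, exp_x), (sign_y, coeff_y, exp_y) = x, y
    common = min(exp_x, exp_y)
    # Scale both coefficients to the common (smaller) exponent.
    value_x = (-1) ** sign_x * coeff_x * 10 ** (exp_x - common)
    value_y = (-1) ** sign_y * coeff_y * 10 ** (exp_y - common)
    return (value_x > value_y) - (value_x < value_y)


# Two members of the same cohort compare equal although their fields differ.
print(compare_finite((0, 1000000, -2), (0, 10000, 0)))  # 0
```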
IEEE 754 allows two alternative representation methods for decimal32 values. The standard does not specify how to signify which representation is used, for instance in a situation where decimal32 values are communicated between systems.
In one representation method, based on binary integer decimal ( BID ), the significand is represented as a binary-coded positive integer.
The other, alternative, representation method is based on densely packed decimal ( DPD ) for most of the significand ( except the most significant digit ).
Both alternatives provide exactly the same range of representable numbers: 7 digits of significand and 3 × 2^6 = 192 possible exponent values.
The position of the relevant bits in the combination field differs between BID and DPD, but the functionality is similar; see the tables below. The combination saves some bits because the 2 MSBs of the exponent only use three states, and the 4 MSBs of the significand stay within 0000 .. 1001 ( 10 states ). In total there are 3 × 10 = 30 possible values when both are combined in one encoding, which is representable in 5 bits ( 2^5 = 32 ).
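The count of 30 can be verified by enumerating all 32 five-bit prefixes. The sketch below assumes the DPD bit order, in which the five leading combination bits a, b, c, d, e carry both pieces of information; `classify_prefix` is a hypothetical name. BID arranges the bits differently but uses the same number of states.

```python
def classify_prefix(prefix):
    """Classify a 5-bit prefix abcde of the combination field (DPD order).

    Returns (exponent_msbs, leading_digit) for finite numbers,
    or the string "infinity" or "nan".
    """
    a_b = prefix >> 3                       # two leading bits
    if a_b != 0b11:
        return (a_b, prefix & 0b111)        # leading digit 0 .. 7
    c_d = (prefix >> 1) & 0b11
    if c_d != 0b11:
        return (c_d, 8 + (prefix & 1))      # leading digit 8 or 9
    return "nan" if prefix & 1 else "infinity"


# Collect the distinct finite combinations over all 32 prefixes.
finite = {classify_prefix(p) for p in range(32)} - {"infinity", "nan"}
print(len(finite))  # 30
```

The two remaining prefixes, 11110 and 11111, are the ones used for infinities and NaNs.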
In all cases, the value represented is
- (−1)^sign × 10^(exponent−101) × significand
Alternatively it can be understood as (−1)^sign × 10^(exponent−95) × significand, with the significand digits read as d_0.d_-1 d_-2 d_-3 d_-4 d_-5 d_-6; note the radix point, which makes the significand a fraction.
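That both readings give the same value can be checked with exact rational arithmetic. The two function names below are hypothetical; `biased_exponent` stands for the stored exponent and `coefficient` for the 7-digit significand read as an integer.

```python
from fractions import Fraction


def value_integral_view(sign, biased_exponent, coefficient):
    """(-1)^sign * 10^(exponent - 101) * significand, significand integral."""
    return (-1) ** sign * Fraction(10) ** (biased_exponent - 101) * coefficient


def value_fractional_view(sign, biased_exponent, coefficient):
    """(-1)^sign * 10^(exponent - 95) * significand, significand fractional."""
    significand = Fraction(coefficient, 10 ** 6)  # radix point after first digit
    return (-1) ** sign * Fraction(10) ** (biased_exponent - 95) * significand


print(value_integral_view(0, 101, 1))        # 1
print(value_fractional_view(1, 99, 12345))   # -2469/20
```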
For ±Infinity, besides the sign bit, all the remaining bits are ignored ( i.e., both the exponent and significand fields have no effect ). For NaNs the sign bit has no meaning in the standard and is ignored. Therefore, signed and unsigned NaNs are equivalent, even though some programs will show NaNs as signed. The bit g5 determines whether the NaN is quiet ( 0 ) or signaling ( 1 ). The bits of the significand are the NaN's payload and can hold user-defined data ( e.g., to distinguish how NaNs were generated ). As for normal significands, the payload of NaNs can be in either BID or DPD encoding.
Be aware that the bit numbering used here, e.g. g10 .. g0, runs in the opposite direction to the numbering g0 .. g10 used in the IEEE 754 standard document.
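The special values can be recognised from the leading combination bits alone. The sketch below assumes that the 32-bit pattern is held in an unsigned integer with the sign in bit 31 and g10 .. g0 in bits 30 .. 20; `classify` is a hypothetical helper, and the test applies to BID and DPD alike.

```python
def classify(bits):
    """Classify a 32-bit decimal32 pattern by its leading combination bits."""
    sign = bits >> 31
    if ((bits >> 27) & 0b1111) != 0b1111:   # g10 .. g7 are not all ones
        return "finite"
    if ((bits >> 26) & 1) == 0:             # g6 = 0: infinity
        return "-infinity" if sign else "+infinity"
    # g6 = 1: NaN, with g5 selecting quiet (0) or signaling (1).
    return "signaling NaN" if (bits >> 25) & 1 else "quiet NaN"


print(classify(0x78000000))  # +infinity
print(classify(0x7C000000))  # quiet NaN
print(classify(0x7E000000))  # signaling NaN
```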
Combination field in BID encoding:
g10 | g9 | g8 | g7 | g6 | g5 | g4 | g3 | g2 | g1 | g0 | Exponent | Significand | Description |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
a | b | c | d | m | m | m | m | e | f | g | abcdmmmm | (0)efgtttttttttttttttttttt | Combination field not starting with '11' ( bits ab = 00, 01 or 10 ): finite number with small first digit of significand ( 0 .. 7 ). This includes subnormal numbers where the leading significand digit is 0 and the exponent minimal. |
1 | 1 | c | d | m | m | m | m | e | f | g | cdmmmmef | 100gtttttttttttttttttttt | Combination field starting with '11' but not '1111' ( bits ab = 11, bits cd = 00, 01 or 10 ): finite number with big first digit of significand ( 8 or 9 ). |
1 | 1 | 1 | 1 | 0 | | | | | | | | | Combination field starting with '1111' ( bits abcd = 1111 ): ±Infinity |
1 | 1 | 1 | 1 | 1 | | | | | | | | | Combination field starting with '1111' ( bits abcd = 1111 ): NaN ( with payload in Significand ) |
The resulting exponent is an 8-bit binary integer whose leading bits are not '11', so it takes the values 0 .. 10111111b = 191d. The resulting significand could be a positive binary integer of 24 bits up to 1001 1111111111 1111111111b = 10485759d, but values above 10^7 − 1 = 9999999 = 98967F_16 = 100110001001011001111111_2 are 'illegal' and have to be treated as zeroes. To obtain the individual decimal digits, the significand has to be divided by 10 repeatedly.
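A decoder for finite BID values can be sketched as below. It assumes the pattern is held in a 32-bit unsigned integer with the sign in bit 31; `decode_bid` and `digits` are hypothetical helpers, and the returned exponent is the unbiased one for an integral significand.

```python
def decode_bid(bits):
    """Decode a finite BID decimal32 pattern to (sign, coefficient, exponent)."""
    sign = bits >> 31
    if ((bits >> 27) & 0b1111) == 0b1111:
        raise ValueError("infinity or NaN, not a finite number")
    if ((bits >> 29) & 0b11) != 0b11:
        # Exponent in the 8 bits after the sign, significand in the low 23 bits.
        biased_exponent = (bits >> 23) & 0xFF
        coefficient = bits & 0x7FFFFF
    else:
        # Exponent shifted right by two bits, implicit leading '100'.
        biased_exponent = (bits >> 21) & 0xFF
        coefficient = 0x800000 | (bits & 0x1FFFFF)
    if coefficient > 9_999_999:
        coefficient = 0            # 'illegal' significands are treated as zero
    return sign, coefficient, biased_exponent - 101


def digits(coefficient):
    """Split a coefficient into 7 decimal digits by repeated division by 10."""
    out = []
    for _ in range(7):
        coefficient, digit = divmod(coefficient, 10)
        out.append(digit)
    return out[::-1]


print(decode_bid(0x32800001))  # (0, 1, 0)
print(decode_bid(0x6CB8967F))  # (0, 9999999, 0)
```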
Combination field in DPD encoding:
g10 | g9 | g8 | g7 | g6 | g5 | g4 | g3 | g2 | g1 | g0 | Exponent | Significand | Description |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
a | b | c | d | e | m | m | m | m | m | m | abmmmmmm | (0)cde tttttttttt tttttttttt | Combination field not starting with '11' ( bits ab = 00, 01 or 10 ): finite number with small first digit of significand ( 0 .. 7 ). This includes subnormal numbers where the leading significand digit is 0 and the exponent minimal. |
1 | 1 | c | d | e | m | m | m | m | m | m | cdmmmmmm | 100e tttttttttt tttttttttt | Combination field starting with '11' but not '1111' ( bits ab = 11, bits cd = 00, 01 or 10 ): finite number with big first digit of significand ( 8 or 9 ). |
1 | 1 | 1 | 1 | 0 | | | | | | | | | Combination field starting with '1111' ( bits abcd = 1111 ): ±Infinity |
1 | 1 | 1 | 1 | 1 | | | | | | | | | Combination field starting with '1111' ( bits abcd = 1111 ): NaN ( with payload in Significand ) |
The resulting exponent is an 8-bit binary integer whose leading bits are not '11', so it takes the values 0 .. 10111111b = 191d. The significand's leading decimal digit is formed from the ( 0 )cde or 100e bits read as a binary integer. To obtain the trailing significand decimal digits, the declet fields 'tttttttttt' have to be decoded according to the DPD rules ( see below ). The full decimal significand is then obtained by concatenating the leading and trailing decimal digits.
The 10-bit DPD to 3-digit BCD transcoding for the declets is given by the following table. b9 .. b0 are the bits of the DPD, and d2 .. d0 are the three BCD digits. Be aware that the bit numbering used here, e.g. b9 .. b0, runs in the opposite direction to the numbering b0 .. b9 used in the IEEE 754 standard document. In addition, the digits are numbered 0-based here, while the IEEE 754 document numbers them 1-based and in the opposite direction.
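The field extraction for DPD can be sketched as follows. The sketch assumes a 32-bit unsigned integer with the sign in bit 31 and only splits the pattern into its parts, leaving the two declets undecoded; `split_dpd_fields` is a hypothetical helper.

```python
def split_dpd_fields(bits):
    """Split a finite DPD decimal32 pattern into its fields.

    Returns (sign, biased_exponent, leading_digit, (high_declet, low_declet)).
    """
    sign = bits >> 31
    if ((bits >> 27) & 0b1111) == 0b1111:
        raise ValueError("infinity or NaN, not a finite number")
    continuation = (bits >> 20) & 0x3F           # six 'm' bits of the exponent
    if ((bits >> 29) & 0b11) != 0b11:
        exponent_msbs = (bits >> 29) & 0b11      # bits a, b
        leading_digit = (bits >> 26) & 0b111     # (0)cde, digit 0 .. 7
    else:
        exponent_msbs = (bits >> 27) & 0b11      # bits c, d
        leading_digit = 8 + ((bits >> 26) & 1)   # 100e, digit 8 or 9
    biased_exponent = (exponent_msbs << 6) | continuation
    declets = ((bits >> 10) & 0x3FF, bits & 0x3FF)
    return sign, biased_exponent, leading_digit, declets


print(split_dpd_fields(0x22500001))  # (0, 101, 0, (0, 1))
```

With a bias of 101, the pattern 0x22500001 therefore has exponent 0, leading digit 0 and a low declet of 1, which corresponds to the value 1.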
DPD encoded value ( b9 .. b0 ) and decimal digits ( d2 .. d0 ). The last two columns give totals per group of rows.
b9 | b8 | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 | d2 | d1 | d0 | Values encoded | Description | Code space per group (1024 states) | Occurrences per group (1000 states) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
a | b | c | d | e | f | 0 | g | h | i | 0abc | 0def | 0ghi | (0–7) (0–7) (0–7) | 3 small digits | 50.0% (512 states) | 51.2% (512 states) |
a | b | c | d | e | f | 1 | 0 | 0 | i | 0abc | 0def | 100i | (0–7) (0–7) (8–9) | 2 small digits, 1 large digit | 37.5% (384 states) | 38.4% (384 states) |
a | b | c | g | h | f | 1 | 0 | 1 | i | 0abc | 100f | 0ghi | (0–7) (8–9) (0–7) | 2 small digits, 1 large digit | | |
g | h | c | d | e | f | 1 | 1 | 0 | i | 100c | 0def | 0ghi | (8–9) (0–7) (0–7) | 2 small digits, 1 large digit | | |
g | h | c | 0 | 0 | f | 1 | 1 | 1 | i | 100c | 100f | 0ghi | (8–9) (8–9) (0–7) | 1 small digit, 2 large digits | 9.375% (96 states) | 9.6% (96 states) |
d | e | c | 0 | 1 | f | 1 | 1 | 1 | i | 100c | 0def | 100i | (8–9) (0–7) (8–9) | 1 small digit, 2 large digits | | |
a | b | c | 1 | 0 | f | 1 | 1 | 1 | i | 0abc | 100f | 100i | (0–7) (8–9) (8–9) | 1 small digit, 2 large digits | | |
x | x | c | 1 | 1 | f | 1 | 1 | 1 | i | 100c | 100f | 100i | (8–9) (8–9) (8–9) | 3 large digits; b9, b8: don't care | 3.125% (32 states, 8 used) | 0.8% (8 states) |
The 8 decimal values whose digits are all 8s or 9s have four codings each. The bits marked x in the table above are ignored on input, but will always be 0 in computed results. ( The 8 × 3 = 24 non-standard encodings fill in the gap between 10^3 = 1000 and 2^10 = 1024. )
The benefit of this encoding is that individual digits can be accessed by decoding or encoding only 10 bits. The disadvantage is that some simple operations that are used very frequently in code, such as sorting and comparing, do not work on the bit pattern but first require decoding to decimal digits ( and possibly re-encoding to binary integers ).
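The table can be turned into a decoder row by row. The following sketch is one possible transcription, with `decode_declet` as a hypothetical name; it returns the three digits as a tuple and then counts how many of the 1024 codes map to each 3-digit value.

```python
from collections import Counter


def decode_declet(declet):
    """Decode a 10-bit DPD declet into three decimal digits (d2, d1, d0)."""
    b = [(declet >> i) & 1 for i in range(10)]   # b[i] is bit b_i
    high = 4 * b[9] + 2 * b[8] + b[7]            # value of b9 b8 b7
    mid = 4 * b[6] + 2 * b[5] + b[4]             # value of b6 b5 b4
    if b[3] == 0:                                # 3 small digits
        return high, mid, 4 * b[2] + 2 * b[1] + b[0]
    selector = (b[2], b[1])
    if selector == (0, 0):                       # d0 is large
        return high, mid, 8 + b[0]
    if selector == (0, 1):                       # d1 is large
        return high, 8 + b[4], 4 * b[6] + 2 * b[5] + b[0]
    if selector == (1, 0):                       # d2 is large
        return 8 + b[7], mid, 4 * b[9] + 2 * b[8] + b[0]
    # selector == (1, 1): b6 b5 select among the rows with 2 or 3 large digits
    inner = (b[6], b[5])
    if inner == (0, 0):
        return 8 + b[7], 8 + b[4], 4 * b[9] + 2 * b[8] + b[0]
    if inner == (0, 1):
        return 8 + b[7], 4 * b[9] + 2 * b[8] + b[4], 8 + b[0]
    if inner == (1, 0):
        return high, 8 + b[4], 8 + b[0]
    return 8 + b[7], 8 + b[4], 8 + b[0]          # b9, b8 are ignored


# Count how many of the 1024 codes decode to each 3-digit value.
codings = Counter(decode_declet(d) for d in range(1024))
print(len(codings))              # 1000
print(decode_declet(0b0011111111))  # (9, 9, 9)
```

Counting the codes this way reproduces the statement above: 1000 distinct values, of which exactly 8 are reachable through four codings.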
References
1. IEEE Computer Society (2008-08-29). IEEE Standard for Floating-Point Arithmetic. IEEE. doi:10.1109/IEEESTD.2008.4610935. ISBN 978-0-7381-5753-5. IEEE Std 754-2008. Retrieved 2016-02-08.
2. "ISO/IEC/IEEE 60559:2011". 2011. Retrieved 2016-02-08.
3. "754-2019 - IEEE Standard for Floating-Point Arithmetic ( caution: paywall )". 2019. Retrieved 2019-10-24.
4. Cowlishaw, Michael Frederic (2007-02-13) [2000-10-03]. "A Summary of Densely Packed Decimal encoding". IBM. Archived from the original on 2015-09-24. Retrieved 2016-02-07.