Unum (number format)


Unums (universal numbers[1]) are a family of formats and arithmetic, similar to floating point, proposed by John L. Gustafson in 2015.[2] They are designed as an alternative to the ubiquitous IEEE 754 floating-point standard. The latest version (known as posits[3]) can be used as a drop-in replacement for programs that do not depend on specific features of IEEE 754.

Type I Unum

The first version of unums, formally known as Type I unum, was introduced in Gustafson's book The End of Error as a superset of the IEEE-754 floating-point format.[2] The defining features of the Type I unum format are:

  • a variable-width storage format for both the significand and exponent, and
  • a u-bit, which determines whether the unum corresponds to an exact number (u = 0), or an interval between consecutive exact unums (u = 1). In this way, the unums cover the entire extended real number line [−∞,+∞].

For computation with the format, Gustafson proposed using interval arithmetic with a pair of unums, which he called a ubound, providing the guarantee that the resulting interval contains the exact solution.
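The containment guarantee can be illustrated with a toy model. The following is a minimal sketch, not the Type I unum bit format: it assumes a hypothetical grid whose exact values are multiples of 2^-4, and the names `ubound_sketch`, `ub_encode` and `ub_add` are invented for illustration. A value that falls on the grid is stored with equal endpoints (the role of u = 0); any other value is stored as the pair of neighbouring grid points (the role of u = 1).

```c
#include <stdint.h>

#define FRAC_BITS 4   /* hypothetical grid: exact values are multiples of 2^-4 */

/* Closed interval [lo, hi] with endpoints in units of 2^-FRAC_BITS.
   lo == hi plays the role of u = 0 (exact value); hi == lo + 1 plays the
   role of u = 1 (between two consecutive exact values). */
typedef struct { int64_t lo, hi; } ubound_sketch;

/* Enclose the rational num/den (den > 0) between grid points. */
static ubound_sketch ub_encode(int64_t num, int64_t den) {
    int64_t scaled = num * ((int64_t)1 << FRAC_BITS);
    int64_t q = scaled / den;          /* C division truncates toward zero */
    int64_t r = scaled % den;
    if (r != 0 && scaled < 0) q--;     /* turn truncation into floor */
    ubound_sketch u = { q, r != 0 ? q + 1 : q };
    return u;
}

/* Interval addition: add lower endpoints and upper endpoints separately,
   so the exact sum always lies inside the result. */
static ubound_sketch ub_add(ubound_sketch a, ubound_sketch b) {
    ubound_sketch s = { a.lo + b.lo, a.hi + b.hi };
    return s;
}
```

With this grid, `ub_encode(1, 3)` yields [5/16, 6/16], and adding it to itself yields [10/16, 12/16], which contains the exact sum 2/3.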

William M. Kahan and Gustafson debated unums at the Arith23 conference.[4][5][6][7]

Type II Unum

Type II Unums were introduced in 2016[8] as a redesign of Unums that broke IEEE-754 compatibility.

Type III Unum (posit and valid)

In February 2017, Gustafson officially introduced Type III unums: posits for fixed floating-point-like values and valids for interval arithmetic.[3]


Posits[3][9][10] are a hardware-friendly version of unum that resolves the difficulties the original Type I unum faced because of its variable size. Compared with floats of the same size, posits offer a larger dynamic range and more fraction bits for accuracy. In an independent study, Lindstrom, Lloyd and Hittinger of Lawrence Livermore National Laboratory[11] reported that posits outperform floats in accuracy.[dubious] Posits are most accurate in the range near one, where most computations occur. This makes them attractive for deep learning, where the current trend is to minimise the number of bits used. Because more fraction bits are available for accuracy, an application may be able to use fewer bits per value, which reduces network and memory bandwidth and power requirements.

Posits have variable-sized index and mantissa bitfields, with the split being specified by a "regime" indicator. Gustafson claimed that they offer better precision than standard floating-point numbers while taking up fewer bits.[12][13]

Posits have a different format than IEEE 754 floats. They consist of four parts: sign, regime, exponent, and fraction (also known as significand/mantissa). For an n-bit posit, the regime can be 2 to (n − 1) bits long. The regime is a run of same-sign bits terminated by a different-sign bit.


Example 1          Example 2
000000000000001    1110

Example 1 shows a regime with 14 same-sign bits (bit 0), terminated by a different-sign bit (bit 1). As there are 14 same-sign bits, the runlength of the regime is 14.

Example 2 shows a regime with 3 same-sign bits (bit 1), terminated by a different-sign bit (bit 0). As there are 3 same-sign bits, the runlength of the regime is 3.
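The run length in Examples 1 and 2 can be computed by scanning from the most significant bit until the bit value changes. The sketch below is illustrative only; the function names are hypothetical, and the mapping from run length to a regime value k (k = −m for a run of m zeros, k = m − 1 for a run of m ones) is taken from the posit definition in [3] rather than from the text above.

```c
#include <stdint.h>

/* Count the run of identical bits starting at the most significant of the
   low `nbits` bits of `bits`. The scan stops at the first differing bit
   (the terminator) or when all `nbits` bits have been consumed. */
static int regime_runlength(uint32_t bits, int nbits) {
    int first = (int)((bits >> (nbits - 1)) & 1u);
    int run = 1;
    while (run < nbits && (int)((bits >> (nbits - 1 - run)) & 1u) == first)
        run++;
    return run;
}

/* Regime value k: a run of m zeros gives -m, a run of m ones gives m - 1. */
static int regime_value(uint32_t bits, int nbits) {
    int run = regime_runlength(bits, nbits);
    int first = (int)((bits >> (nbits - 1)) & 1u);
    return first ? run - 1 : -run;
}
```

For Example 1 (the 15 bits 000000000000001) the run length is 14 and k = −14; for Example 2 (the 4 bits 1110) the run length is 3 and k = 2.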

Sign, exponent and fraction bits are very similar to IEEE 754; however, posits may omit either or both of the exponent and fraction bits, leaving a posit that consists of only sign and regime bits. Example 3 shows the longest possible regime runlength for a 16-bit posit, where the regime terminating bit, exponent bit and fraction bits fall beyond the 16 bits of the posit. Example 4 illustrates the shortest possible runlength of 1 for a 16-bit posit with one exponent bit (bit value = 1) and 12 fraction bits (bit value = 100000000001).

Example 3: Regime runlength = 15    Example 4: Regime runlength = 1
0111111111111111                    0101100000000001
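The field layout of Examples 3 and 4 can be checked with a small decoder. This is a sketch for 16-bit posits with one exponent bit only; the struct and function names are hypothetical, and the value formula (1 + fraction) × 2^(k·2^es + exponent), together with the two's-complement treatment of negative posits, is assumed from the posit definition in [3].

```c
#include <stdint.h>

#define ES 1  /* exponent bits of the 16-bit posit used in the examples */

typedef struct {
    int nar;           /* 1 for the NaR pattern 0x8000 */
    int sign;          /* 1 if negative */
    int runlength;     /* regime run length */
    int k;             /* regime value */
    int exponent;      /* exponent field (missing low bits read as 0) */
    int frac_bits;     /* number of fraction bits actually present */
    uint32_t fraction; /* fraction field as an integer */
    double value;      /* decoded value (0.0 for zero and NaR) */
} posit16_fields;

static posit16_fields decode_posit16(uint16_t bits) {
    posit16_fields f = {0};
    if (bits == 0x0000) return f;                    /* zero */
    if (bits == 0x8000) { f.nar = 1; return f; }     /* Not a Real */

    f.sign = bits >> 15;
    if (f.sign) bits = (uint16_t)(0u - bits);        /* two's complement */

    uint32_t body = bits & 0x7FFFu;                  /* 15 bits after the sign */
    int first = (int)((body >> 14) & 1u);
    int run = 1;
    while (run < 15 && (int)((body >> (14 - run)) & 1u) == first) run++;
    f.runlength = run;
    f.k = first ? run - 1 : -run;

    int remaining = 15 - run - 1;                    /* bits after the terminator */
    if (remaining < 0) remaining = 0;
    int ebits = remaining < ES ? remaining : ES;
    f.frac_bits = remaining - ebits;
    uint32_t tail = body & ((1u << remaining) - 1u);
    f.exponent = (int)(tail >> f.frac_bits) << (ES - ebits);
    f.fraction = tail & ((1u << f.frac_bits) - 1u);

    /* Scale by repeated doubling or halving; exact in double for this range. */
    int scale = f.k * (1 << ES) + f.exponent;
    double v = 1.0 + (double)f.fraction / (double)(1u << f.frac_bits);
    for (int i = 0; i < scale; i++) v *= 2.0;
    for (int i = 0; i > scale; i--) v *= 0.5;
    f.value = f.sign ? -v : v;
    return f;
}
```

Under these assumptions, `decode_posit16(0x7FFF)` (Example 3) reports a run length of 15 with no exponent or fraction bits, and `decode_posit16(0x5801)` (Example 4) reports a run length of 1, exponent 1 and 12 fraction bits.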

The recommended posit sizes and corresponding exponent bits and quire sizes:

Posit size (bits)    Number of exponent bits    Quire size (bits)
8                    0                          32
16                   1                          128
32                   2                          512
64                   3                          2048

Note: 32-bit posits are expected to be sufficient for almost all classes of applications.[citation needed]
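A few properties follow from the sizes in the table above. The sketch below uses hypothetical helper names and assumes the posit definition in [3]: the shortest regime takes 2 bits, and the largest value (maxpos) has a regime run length of n − 1. The n²/2 expression for the quire size is simply an observation that matches the four rows of the table.

```c
/* Most fraction bits available (values near one): n minus the sign bit,
   the shortest regime (2 bits) and the exponent bits. */
static int max_fraction_bits(int nbits, int es) {
    return nbits - 1 - 2 - es;
}

/* log2 of the largest posit: the longest regime has run length n - 1,
   so k = n - 2 and maxpos = 2^((n - 2) * 2^es). */
static int log2_maxpos(int nbits, int es) {
    return (nbits - 2) * (1 << es);
}

/* Quire size as listed in the table, which matches n * n / 2. */
static int table_quire_bits(int nbits) {
    return nbits * nbits / 2;
}
```

For a 16-bit posit with one exponent bit this gives 12 fraction bits near one (IEEE 754 half precision has 10) and maxpos = 2^28; for a 32-bit posit with two exponent bits it gives 27 fraction bits (IEEE 754 single precision has 23) and maxpos = 2^120.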


The quire is one of the most useful features of posits. It is a special data type that gives posits a "near-infinite" number of bits to accumulate dot products. It is based on the work of Ulrich W. Kulisch and Willard L. Miranker.[14]
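The idea of accumulating products without intermediate rounding can be sketched with a wide fixed-point integer. The code below is an illustration, not the quire bit layout: the type `quire8_sketch` and the function `quire8_fdp_add` are hypothetical, and the code assumes 8-bit posits with zero exponent bits, whose values are all multiples of 2^-6 with magnitude at most 2^6, so every product is a multiple of 2^-12.

```c
#include <stdint.h>

/* Fixed-point accumulator counting units of 2^-12. A 32-bit integer is a
   hypothetical stand-in for the 32-bit quire listed for 8-bit posits. */
typedef int32_t quire8_sketch;

/* Add the exact product a * b to the accumulator. Inputs are given in units
   of 2^-6 (so 64 is passed as 4096 and 1/64 as 1); the product is then an
   integer count of 2^-12 and no rounding occurs. */
static quire8_sketch quire8_fdp_add(quire8_sketch q, int32_t a_scaled, int32_t b_scaled) {
    return q + a_scaled * b_scaled;
}
```

Accumulating 64 × 64, then (1/64) × (1/64), then −64 × 64 leaves exactly one unit of 2^-12 in the accumulator. Rounding to an 8-bit posit after each step would lose the small middle term, since 4096 + 2^-12 is far outside what 8 bits can represent.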


Valids are described as a Type III Unum mode that bounds results in a given range.[3]


Several software and hardware solutions implement posits.[11][15][16][17][18] The first complete parameterized posit arithmetic hardware generator was proposed in 2018.[19]

Unum implementations have been explored in Julia[20][21][22][23][24][25] and MATLAB.[26][27] A C++ version[28] with support for any posit sizes combined with any number of exponent bits is available. A fast implementation in C, SoftPosit,[29] provided by the NGA research team based on Berkeley SoftFloat adds to the available software implementations.



Each entry below lists the project and author (where named), type, supported precisions, quire support, speed, testing status, and notes.

Type: World's First FPGA GP-GPU
Precisions: 32
Quire: Yes
Speed: ~3.2 Tpops
Testing: Exhaustive. No known bugs.
Notes: RacEr GP-GPU has 512 cores.

Type: C library based on Berkeley SoftFloat; C++ wrapper to override operators; Python wrapper using SWIG of SoftPosit
Precisions: 8, 16, 32 published and complete
Quire: Yes
Speed: ~60 to 110 Mpops/s on x86 core (Broadwell)
Testing: 8: exhaustive; 16: exhaustive except FMA, quire; 32: exhaustive test is still in progress. No known bugs.
Notes: Open source license. Fastest and most comprehensive C library for posits presently. Designed for plug-in comparison of IEEE floats and posits.

Type: Mathematica notebook
Precisions: All
Quire: Yes
Speed: < 80 kpops/s
Testing: Exhaustive for low precisions. No known bugs.
Notes: Open source (MIT license). Original definition and prototype. Most complete environment for comparing IEEE floats and posits. Many examples of use, including linear solvers.

Type: JavaScript widget
Precisions: Convert decimal to posit 6, 8, 16, 32; generate tables 2–17 with es 1–4
Quire / Speed: not applicable; interactive widget
Testing: Fully tested
Notes: Table generator and conversion.

Author: Stillwater Supercomputing, Inc
Type: C++ template library; C library; Python wrapper; Golang library
Precisions: Arbitrary precision posit, float, valid (p); Unum type 1 (p); Unum type 2 (p)
Quire: Arbitrary quire configurations with programmable capacity
Speed: posit<4,0> 1 GPOPS; posit<8,0> 130 MPOPS; posit<16,1> 115 MPOPS; posit<32,2> 105 MPOPS; posit<64,3> 50 MPOPS; posit<128,4> 1 MPOPS; posit<256,5> 800 kPOPS
Testing: Complete validation suite for arbitrary posits. Randoms for large posit configs. Uses induction to prove nbits+1 is correct. No known bugs.
Notes: Open source (MIT license). Fully integrated with C/C++ types and automatic conversions. Supports full C++ math library (native and conversion to/from IEEE). Runtime integrations: MTL4/MTL5, Eigen, Trilinos, HPR-BLAS. Application integrations: G+SMO, FDBB, FEniCS, ODEintV2, TVM.ai. Hardware accelerator integration (Xilinx, Intel, Achronix).

Author: Chung Shin Yee
Type: Python library
Precisions: All
Quire: No
Speed: ~20 Mpops/s
Testing: Extensive; no known bugs
Notes: Open source (MIT license).

Author: David Thien
Type: SoftPosit bindings for Racket
Precisions: All
Quire: Yes
Speed: Unknown
Testing: Unknown

Author: Bill Zorn
Type: SoftPosit bindings for Python
Precisions: All
Quire: Yes
Speed: ~20–45 Mpops/s on 4.9 GHz Skylake core
Testing: Unknown

Author: Diego Coelho
Type: Octave implementation
Precisions: All
Quire: No
Speed: Unknown
Testing: Limited testing; no known bugs
Notes: GNU GPL.

Project: Sigmoid Numbers
Author: Isaac Yonemoto
Type: Julia library
Precisions: All <32, all ES
Quire: Yes
Speed: Unknown
Testing: No known bugs (posits). Division bugs (valids).
Notes: Leverages Julia's templated mathematics standard library; can natively do matrix and tensor operations, complex numbers, FFT, DiffEQ. Support for valids.

Author: Isaac Yonemoto
Type: Julia and C/C++ library
Precisions: 8, 16, 32, all ES
Quire: No
Speed: Unknown
Testing: Known bug in 32-bit multiplication
Notes: Used by LLNL in shock studies.

Author: Milan Klöwer
Type: Julia library
Precisions: Based on softposit; 8-bit (es=0..2), 16-bit (es=0..2), 24-bit (es=1..2), 32-bit (es=2)
Quire: Yes
Speed: Similar to A*STAR "SoftPosit" (Cerlane Leong)
Testing: Posit (8,0), Posit (16,1), Posit (32,2). Other formats lack full functionality.
Notes: Open source. Issues and suggestions on GitHub. This project was developed because SigmoidNumbers and FastSigmoid by Isaac Yonemoto are not currently maintained. Supports basic linear algebra functions in Julia (matrix multiplication, matrix solve, eigen decomposition, etc.).

Author: Ken Mercado
Type: Python library
Precisions: All
Quire: Yes
Speed: < 20 Mpops/s
Testing: Unknown
Notes: Open source (MIT license). Easy-to-use interface. Neural net example. Comprehensive functions support.

Author: Emanuele Ruffaldi
Type: C++ library
Precisions: 4 to 64 (any es value); "Template version is 2 to 63 bits"
Quire: No
Speed: Unknown
Testing: A few basic tests
Notes: 4 levels of operations working with posits. Special support for NaN types (nonstandard).

Project: bfp: Beyond Floating Point
Author: Clément Guérin
Type: C++ library
Precisions: Any
Quire: No
Speed: Unknown
Testing: Bugs found; status of fixes unknown
Notes: Supports + – × ÷ √ reciprocal, negate, compare.

Author: Isaac Yonemoto
Type: Julia and Verilog
Precisions: 8, 16, 32, ES=0
Quire: No
Speed: Unknown
Testing: Comprehensively tested for 8-bit, no known bugs
Notes: Intended for deep learning applications. Addition, subtraction and multiplication only. A proof-of-concept matrix multiplier has been built, but is off-spec in its precision.

Project: Lombiq Arithmetics
Author: Lombiq Technologies
Type: C# with Hastlayer for hardware generation
Precisions: 8, 16, 32 (64 bits in progress)
Quire: Yes
Speed: 10 Mpops/s
Testing: Partial
Notes: Requires Microsoft .Net APIs.

Project: Deepfloat
Author: Jeff Johnson, Facebook
Type: SystemVerilog
Precisions: Any (parameterized SystemVerilog)
Quire: Yes
Speed: (RTL for FPGA/ASIC designs)
Testing: Limited
Notes: Does not strictly conform to posit spec. Supports +, -, /, *. Implements both logarithmic posits and normal, "linear" posits. License: CC-BY-NC 4.0 at present.

Author: Tokyo Tech
Type: FPGA
Precisions: 16, 32, extendable
Quire: No
Speed: "2 GHz", not translated to Mpops/s
Testing: Partial; known rounding bugs
Notes: Yet to be open-source.

Project: PACoGen: Posit Arithmetic Core Generator
Author: Manish Kumar Jaiswal
Type: Verilog HDL for posit arithmetic
Precisions: Any precision. Able to generate any combination of word-size (N) and exponent-size (ES).
Quire: No
Speed: Speed of design is based on the underlying hardware platform (ASIC/FPGA)
Testing: Exhaustive tests for 8-bit posits. Multi-million random tests are performed for up to 32-bit posits with various ES combinations.
Notes: Supports the round-to-nearest rounding method.

Author: Vinay Saxena, Research and Technology Centre, Robert Bosch, India (RTC-IN) and Farhad Merchant, RWTH Aachen University
Type: Verilog generator for VLSI, FPGA
Precisions: All
Quire: No
Speed: Similar to floats of same bit size
Testing: N=8; ES=2 | N=7,8,9,10,11,12; selective (20000*65536) combinations for ES=1 | N=16
Notes: To be used in commercial products. To the best of our knowledge, first ever integration of posits in RISC-V.

Project: Posit Enabled RISC-V Core
Author: Sugandha Tiwari, Neel Gala, Chester Rebeiro, V. Kamakoti, IIT Madras
Type: BSV (Bluespec System Verilog) implementation
Precisions: 32-bit posit with (es=2) and (es=3)
Quire: No
Testing: Verified against SoftPosit for (es=2) and tested with several applications for (es=2) and (es=3). No known bugs.
Notes: First complete posit capable RISC-V core. Supports dynamic switching between (es=2) and (es=3).

Author: David Mallasén
Type: Open-source posit RISC-V core with quire capability
Precisions: Posit<32,2> with 512-bit quire
Quire: Yes
Speed: Speed of design is based on the underlying hardware platform (ASIC/FPGA)
Testing: Functionality testing of each posit instruction.
Notes: Application-level posit capable RISC-V core based on CVA6 that can execute all posit instructions, including the quire fused operations. PERCIVAL is the first work that integrates the complete posit ISA and quire in hardware. It allows the native execution of posit instructions as well as the standard floating-point ones simultaneously.

Author: REX Computing
Type: FPGA version of the "Neo" VLIW processor with posit numeric unit
Precisions: 32
Quire: No
Speed: ~1.2 Gpops/s
Testing: Extensive; no known bugs
Notes: No divide or square root. First full processor design to replace floats with posits.

Project: PNU: Posit Numeric Unit
Author: Calligo Tech
Type: FPGA; first working posit hardware
Precisions: 32
Quire: Claimed, not yet tested
Speed: ~0.5 Mpops/s
Testing: Extensive tests, not exhaustive. No known bugs.
Notes: Single-op accelerator approach; allows direct execution of C codes written for floats. + – × tested; ÷ √ claimed.

Author: Jianyu Chen
Type: Specific-purpose FPGA
Precisions: 32
Quire: Yes
Speed: 16–64 Gpops/s
Testing: Only one known case tested
Notes: Does 128-by-128 matrix-matrix multiplication (SGEMM) using quire.

Project: Deep PeNSieve
Author: Raul Murillo
Type: Python library (software)
Precisions: 8, 16, 32
Quire: Yes
Speed: Unknown
Testing: Unknown
Notes: A DNN framework using posits.

Author: Jaap Aarts
Type: Pure Go library
Precisions: 16/1, 32/2 (included is a generic 32/ES for ES<32)[clarification needed]
Quire: No
Speed: 80 Mop/s for div32/2 and similar linear functions. Much higher for truncate and much lower for exp.
Testing: Fuzzing against C SoftPosit with a lot of iterations for 16/1 and 32/2. Explicitly testing edge cases found.
Notes: MIT license. For implementations where ES is constant, the code is generated. The generator should be able to generate code for all sizes {8,16,32} and ES below the size. However, the ones not included in the library by default are not tested, fuzzed, or supported. Users can run the generator themselves, report bugs, and supply patches. For some operations on 32/ES, mixing and matching ES is possible. However, this is not tested.


SoftPosit[29] is a software implementation of posits based on Berkeley SoftFloat.[30] It allows software comparison between posits and floats. It currently supports

  • Add
  • Subtract
  • Multiply
  • Divide
  • Fused-multiply-add
  • Fused-dot-product (with quire)
  • Square root
  • Convert posit to signed and unsigned integer
  • Convert signed and unsigned integer to posit
  • Convert posit to another posit size
  • Less than, equal, less than or equal comparison
  • Round to nearest integer

Helper functions

  • convert double to posit
  • convert posit to double
  • cast unsigned integer to posit

It works for 16-bit posits with one exponent bit and 8-bit posits with zero exponent bits. Support for 32-bit posits and a flexible type (2 to 32 bits with two exponent bits) is pending validation. It supports x86_64 systems. It has been tested on GNU gcc (SUSE Linux) 4.8.5 and Apple LLVM version 9.1.0 (clang-902.0.39.2).


Add with posit8_t

#include <stdint.h>
#include <stdio.h>
#include "softposit.h"

int main(void) {

    posit8_t pA, pB, pZ;
    pA = castP8(0xF2);
    pB = castP8(0x23);

    pZ = p8_add(pA, pB);

    // To check the answer, convert it to double
    double dZ = convertP8ToDouble(pZ);
    printf("dZ: %.15f\n", dZ);

    // To print the result in binary, widen the bits to 64 bits first:
    // printBinary takes a uint64_t*, so passing the address of a uint8_t
    // would read past the end of the object
    uint64_t uiZ = castUI8(pZ);
    printBinary(&uiZ, 8);

    return 0;
}

Fused dot product with quire16_t

#include <stdio.h>
#include "softposit.h"

int main(void) {
    // Convert double to posit
    posit16_t pA = convertDoubleToP16(1.02783203125);
    posit16_t pB = convertDoubleToP16(0.987060546875);
    posit16_t pC = convertDoubleToP16(0.4998779296875);
    posit16_t pD = convertDoubleToP16(0.8797607421875);

    quire16_t qZ;

    // Set quire to 0
    qZ = q16_clr(qZ);

    // Accumulate products without rounding
    qZ = q16_fdp_add(qZ, pA, pB);
    qZ = q16_fdp_add(qZ, pC, pD);

    // Convert back to posit (the only rounding step)
    posit16_t pZ = q16_to_p16(qZ);

    // To check the answer, convert it to double
    double dZ = convertP16ToDouble(pZ);
    printf("dZ: %.15f\n", dZ);

    return 0;
}


William M. Kahan, the principal architect of IEEE 754-1985, criticizes Type I unums on the following grounds (some are addressed in the Type II and Type III designs):[6][31]

  • The description of unums sidesteps using calculus for solving physics problems.
  • Unums can be expensive in terms of time and power consumption.
  • Each computation in unum space is likely to change the bit length of the structure. This requires either unpacking them into a fixed-size space, or data allocation, deallocation, and garbage collection during unum operations, similar to the issues for dealing with variable-length records in mass storage.
  • Unums provide only two kinds of numerical exception, quiet and signaling NaN (Not-a-Number).
  • Unum computation may deliver overly loose bounds from the selection of an algebraically correct but numerically unstable algorithm.
  • The benefits of unum over short precision floating point for problems requiring low precision are not obvious.
  • Solving differential equations and evaluating integrals with unums guarantee correct answers but may not be as fast as methods that usually work.
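The point about overly loose bounds is a general property of interval arithmetic and can be shown without unums. The sketch below uses plain integer intervals as a hypothetical stand-in for ubounds (all names are invented for illustration) and evaluates x(1 − x) in two algebraically identical ways.

```c
/* Closed integer interval [lo, hi]; a hypothetical stand-in for a ubound. */
typedef struct { int lo, hi; } ival;

static int imin(int a, int b) { return a < b ? a : b; }
static int imax(int a, int b) { return a > b ? a : b; }

/* [a.lo, a.hi] - [b.lo, b.hi] */
static ival ival_sub(ival a, ival b) {
    ival r = { a.lo - b.hi, a.hi - b.lo };
    return r;
}

/* Interval product: the extremes are among the four endpoint products. */
static ival ival_mul(ival a, ival b) {
    int p1 = a.lo * b.lo, p2 = a.lo * b.hi, p3 = a.hi * b.lo, p4 = a.hi * b.hi;
    ival r = { imin(imin(p1, p2), imin(p3, p4)),
               imax(imax(p1, p2), imax(p3, p4)) };
    return r;
}

/* Factored form: x * (1 - x) */
static ival form_factored(ival x) {
    ival one = { 1, 1 };
    return ival_mul(x, ival_sub(one, x));
}

/* Expanded form: x - x * x */
static ival form_expanded(ival x) {
    return ival_sub(x, ival_mul(x, x));
}
```

For x = [0, 1] the true range of x(1 − x) is [0, 1/4]. The factored form returns [0, 1] and the expanded form returns [−1, 1]: both bounds are correct, but the choice of formula changes how loose they are.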

References


  1. Tichy, Walter F. (April 2016). "The End of (Numeric) Error: An interview with John L. Gustafson". Ubiquity – Information Everywhere. Association for Computing Machinery (ACM). 2016 (April): 1–14. doi:10.1145/2913029. Archived from the original on 2016-07-10. Retrieved 2016-07-10. JG: The word "unum" is short for "universal number," the same way the word "bit" is short for "binary digit."
  2. Gustafson, John L. (2016-02-04) [2015-02-05]. The End of Error: Unum Computing. Chapman & Hall / CRC Computational Science. Vol. 24 (2nd corrected printing, 1st ed.). CRC Press. ISBN 978-1-4822-3986-7. Retrieved 2016-05-30.
  3. Gustafson, John Leroy; Yonemoto, Isaac (2017). "Beating Floating Point at its Own Game: Posit Arithmetic". Supercomputing Frontiers and Innovations. Publishing Center of South Ural State University, Chelyabinsk, Russia. 4 (2). doi:10.14529/jsfi170206. Archived from the original on 2017-11-04. Retrieved 2017-11-04.
  4. "Program: Special Session: The Great Debate: John Gustafson and William Kahan". Arith23: 23rd IEEE Symposium on Computer Arithmetic. Silicon Valley, USA. 2016-07-12. Archived from the original on 2016-05-30. Retrieved 2016-05-30.
  5. Gustafson, John L.; Kahan, William M. (2016-07-12). The Great Debate @ARITH23: John Gustafson and William Kahan (1:34:41) (video). Retrieved 2016-07-20.
  6. Kahan, William M. (2016-07-16) [2016-07-12]. "A Critique of John L. Gustafson's THE END of ERROR — Unum Computation and his A Radical Approach to Computation with Real Numbers" (PDF). Santa Clara, CA, USA: IEEE Symposium on Computer Arithmetic, ARITH 23. Archived (PDF) from the original on 2016-07-25. Retrieved 2016-07-25.
  7. Gustafson, John L. (2016-07-12). ""The Great Debate": Unum arithmetic position paper" (PDF). Santa Clara, CA, USA: IEEE Symposium on Computer Arithmetic, ARITH 23. Retrieved 2016-07-20.
  8. Tichy, Walter F. (September 2016). "Unums 2.0: An Interview with John L. Gustafson". Ubiquity.ACM.org. Retrieved 2017-01-30. I started out calling them "unums 2.0," which seemed to be as good a name for the concept as any, but it is really not a "latest release" so much as it is an alternative.
  9. Gustafson, John L.; Yonemoto, I. (February 2017). "Beyond Floating Point: Next Generation Computer Arithmetic". Available: https://www.youtube.com/watch?v=aP0Y1uAA-2Y
  10. Gustafson, John Leroy (2017-10-10). "Posit Arithmetic" (PDF). Archived (PDF) from the original on 2017-11-05. Retrieved 2017-11-04.
  11. Lindstrom, P.; Lloyd, S.; Hittinger, J. (2018). "Universal Coding of the Reals: Alternatives to IEEE Floating Point". Conference for Next Generation Arithmetic. ACM.
  12. Feldman, Michael (2019-07-08). "New Approach Could Sink Floating Point Computation". www.nextplatform.com. Retrieved 2019-07-09.
  13. Byrne, Michael (2016-04-24). "A New Number Format for Computers Could Nuke Approximation Errors for Good". Vice. Retrieved 2019-07-09.
  14. Kulisch, Ulrich W.; Miranker, Willard L. (March 1986). "The Arithmetic of the Digital Computer: A New Approach". SIAM Rev. SIAM. 28 (1): 1–40. doi:10.1137/1028001.
  15. Chung, S. (2018). "Provably Correct Posit Arithmetic with Fixed-Point Big Integer". ACM.
  16. Chen, J.; Al-Ars, Z.; Hofstee, H. (2018). "A Matrix-Multiply Unit for Posits in Reconfigurable Logic Using (Open)CAPI". ACM.
  17. Lehoczky, Z.; Szabo, A.; Farkas, B. (2018). "High-level .NET Software Implementations of Unum Type I and Posit with Simultaneous FPGA Implementation Using Hastlayer". ACM.
  18. Langroudi, S.; Pandit, T.; Kudithipudi, D. (2018). "Deep Learning Inference on Embedded Devices: Fixed-Point vs Posit". Energy Efficiency Machine Learning and Cognitive Computing for Embedded Applications (EMC). Available: https://sites.google.com/view/asplos-emc2/program
  19. Chaurasiya, Rohit; Gustafson, John; Shrestha, Rahul; Neudorfer, Jonathan; Nambiar, Sangeeth; Niyogi, Kaustav; Merchant, Farhad; Leupers, Rainer (2018). "Parameterized Posit Arithmetic Hardware Generator". ICCD 2018: 334–341.
  20. Byrne, Simon (2016-03-29). "Implementing Unums in Julia". Retrieved 2016-05-30.
  21. "Unum arithmetic in Julia: Unums.jl". Retrieved 2016-05-30.
  22. "Julia Implementation of Unums: README". Retrieved 2016-05-30.
  23. "Unum (Universal Number) types and operations: Unums". Retrieved 2016-05-30.
  24. "jwmerrill/Pnums.jl". Github.com. Retrieved 2017-01-30.
  25. "GitHub - ityonemo/Unum2: Pivot Unums". 2019-04-29.
  26. Ingole, Deepak; Kvasnica, Michal; De Silva, Himeshi; Gustafson, John L. "Reducing Memory Footprints in Explicit Model Predictive Control using Universal Numbers. Submitted to the IFAC World Congress 2017". Retrieved 2016-11-15.
  27. Ingole, Deepak; Kvasnica, Michal; De Silva, Himeshi; Gustafson, John L. "MATLAB Prototype of unum (munum)". Retrieved 2016-11-15.
  28. "GitHub - stillwater-sc/Universal: Universal Number Arithmetic". 2019-06-16.
  29. "Cerlane Leong / SoftPosit · GitLab". GitLab.
  30. "Berkeley SoftFloat". www.jhauser.us.
  31. Kahan, William M. (2016-07-15). "Prof. W. Kahan's Commentary on "THE END of ERROR — Unum Computing" by John L. Gustafson, (2015) CRC Press" (PDF). Archived (PDF) from the original on 2016-08-01. Retrieved 2016-08-01.
