= Intel 8231/8232 =

The Intel 8231 and 8232 were early floating-point math coprocessors (FPUs), marketed by Intel for use with its i8080 line of CPUs. They were licensed versions of AMD's Am9511 and Am9512 FPUs, introduced in 1977 and 1979 respectively, which AMD claimed were the world's first single-chip FPU solutions.

== Adoption ==
Although the i8231/i8232 (and their AMD-branded cousins) were primarily intended as partners for the i8080 (or AMD's clone, the Am9080), their design offered multiple interface options, ranging from simple wait-state insertion and status-polling routines to interrupt-driven and DMA-controller-driven methods suitable for a peripheral processor or add-in board. With a small amount of glue logic, the chips were therefore usable in almost any microprocessor system that had a DMA subsystem or a spare interrupt input and interrupt vector available, and AMD's original documentation provided several different examples. This was a valuable feature for one of the first commercially available single-chip FPUs, greatly broadening its potential market, and it stood in stark contrast to Intel's succeeding, in-house designed 8087 (and other x87 family) FPUs, which were tightly bound to the x86 CPU line. For example, the i8231A was used in the Applied Analytics MicroSPEED II and II+ accelerator cards for the 6502-based Apple II line, and interfacing examples were also given for the Z80, MC6800, i8085, and even the 16-bit Z8000. Additionally, prior to the introduction of the 8087, Intel's own preliminary datasheets suggested the chips as suitable companions for the then-new 8086.

== Capacity ==
The Intel 8231 (and the revised 8231A) was named the Arithmetic Processing Unit (APU). It offered 32-bit floating-point calculation, described at the time as "double" precision (a term now more commonly used for 64-bit floating-point numbers, with 32-bit considered "single" precision), together with 16-bit or 32-bit ("single" or "double" precision) fixed-point calculation. It supported 14 different arithmetic and trigonometric functions, implemented to a proprietary standard. The APU used Chebyshev polynomials. It was available in a 4 MHz version priced at US$235.00 and a 2 MHz version priced at US$149.00, in quantities of 100 or more.

The later Intel 8232 was named the Floating-point Processor Unit (FPU). It performed 32-bit or 64-bit (true single- and double-precision) floating-point calculations compliant with the draft IEEE-754 standard (as used by the i8087 and other later FPUs), but only for the four primary arithmetic functions (addition, subtraction, multiplication and division). The FPU was available in a 4 MHz version priced at US$235.00 and a 2 MHz version priced at US$149.00, in quantities of 100 or more.

All three chips used an 8-bit data bus, in line with the i8080 and most other contemporary microprocessors. The 8231 could run at up to 3 MHz, and the 8231A and 8232 at up to 4 MHz (an improvement on the Am9512, which was limited to 3 MHz), either in sync with the CPU or (in the case of the 8231A and 8232) asynchronously, depending on the degree of bus separation in the host system. Asynchronous operation was a major addition to the feature set. It allowed a roughly 1 MHz Apple II system to be expanded with a 4 MHz 8231A and gain much faster numeric processing, and it allowed a 5 MHz i8085-based system to host an 8231A or 8232 without itself slowing to 4 MHz or less for compatibility. Combined with the interrupt-driven peripheral design, asynchronous operation also provided a degree of parallel processing between the CPU and FPU. The CPU would pass commands and data to the essentially "offboard" coprocessor and resume its own normal processing, switching back to the floating-point subtask (to receive results and optionally issue further commands) only when the coprocessor signalled that processing was complete. There was no need for busy waiting or polling. This parallelism was vital to overall system throughput, because some of the more complex functions could still take the FPU several milliseconds to complete, an eternity in computing terms.
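To put "several milliseconds" in perspective, the following minimal Python sketch estimates how long a worst-case coprocessor operation lasts and how many host CPU cycles elapse in the meantime. It uses the 12032-period worst-case figure quoted for the 8231(A) "power" function and assumes a 4 MHz coprocessor beside a host CPU rounded to exactly 1 MHz. The function names are illustrative only and do not come from any Intel or AMD documentation.

```python
def operation_time_ms(clock_periods: int, fpu_clock_hz: int) -> float:
    """Duration of one coprocessor operation, in milliseconds."""
    return clock_periods / fpu_clock_hz * 1_000.0


def host_cycles_available(clock_periods: int, fpu_clock_hz: int,
                          cpu_clock_hz: int) -> int:
    """Whole host CPU clock cycles that elapse while the coprocessor is busy.

    Integer arithmetic is used so the result is not affected by
    floating-point rounding.
    """
    return clock_periods * cpu_clock_hz // fpu_clock_hz


FPU_CLOCK_HZ = 4_000_000        # 8231A at its maximum rated clock
HOST_CLOCK_HZ = 1_000_000       # assumption: "roughly 1 MHz" rounded to 1 MHz
WORST_CASE_PERIODS = 12_032     # worst-case "power" calculation on the 8231(A)

busy_ms = operation_time_ms(WORST_CASE_PERIODS, FPU_CLOCK_HZ)
free_cycles = host_cycles_available(WORST_CASE_PERIODS, FPU_CLOCK_HZ,
                                    HOST_CLOCK_HZ)

print(f"Coprocessor busy for about {busy_ms:.3f} ms")
print(f"Host CPU cycles available in the meantime: {free_cycles}")
```

Under these assumptions the host has on the order of a few thousand clock cycles available for other work per worst-case operation, which is the time that interrupt-driven operation lets the CPU reclaim instead of spending it in a wait loop.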

Instruction execution times varied widely and, as is typical of an early-generation design, were much longer than those of later, more mature FPUs. For example, ignoring data and stack handling instructions on the 8232, execution times ranged from 56 clock periods for a single-precision (32-bit) subtraction to as many as 4560 periods for a double-precision (64-bit) divide. At 4 MHz, this corresponds to an effective processing speed of between 71429 and 877 FLOPS respectively. The 8231(A)'s instructions ranged from 17 periods for a 16-bit fixed-point addition, through 98 to 378 periods for common 32-bit floating-point operations (heavily dependent not only on the function itself, but also on the actual magnitude of the operands and result, and even the number of "1" bits in each number), to as many as 12032 periods for the most complex "power" calculation. At 4 MHz, this corresponds to about 235.3k operations per second for the fastest fixed-point addition, between 40.8k and 10.6k FLOPS for common floating-point operations, and only 332 FLOPS for the slowest power calculation, depending on the instruction and data mix. While these numbers may seem low from a modern perspective, they compare reasonably well with the successor i8087 (whose bigger advantages were a wider data bus and address range, which provided faster transfers in and out of a larger memory space, greater numeric precision, an expanded instruction and function set, and near-IEEE-754 compliance). They are also radically faster than performing the same calculations in software on a regular CPU: even a relatively sophisticated 16-bit 8086 running at a full 8 MHz can achieve only somewhere between a few dozen and around 1000 FLOPS without a coprocessor. Its slower-clocked, 8-bit predecessors and rivals fared even worse.
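The throughput figures above follow from dividing the clock frequency by the number of clock periods per instruction. The short Python sketch below reproduces that arithmetic for the cycle counts quoted in this section, assuming a 4 MHz clock and ignoring any time spent transferring operands and results over the 8-bit bus. The labels and function name are illustrative only.

```python
def ops_per_second(clock_periods: int, clock_hz: int = 4_000_000) -> float:
    """Operations per second for an instruction taking `clock_periods` cycles.

    Ignores operand and result transfer time, so real throughput is lower.
    """
    return clock_hz / clock_periods


# Cycle counts as quoted in the text; the labels are descriptive only.
CYCLE_COUNTS = {
    "8232 single-precision subtract (fastest)": 56,
    "8232 double-precision divide (slowest)": 4_560,
    "8231 16-bit fixed-point add": 17,
    "8231 32-bit float, fast case": 98,
    "8231 32-bit float, slow case": 378,
    "8231 power function (slowest)": 12_032,
}

for label, periods in CYCLE_COUNTS.items():
    print(f"{label}: {periods} periods -> "
          f"{ops_per_second(periods):,.0f} ops/s")
```

Because the calculation is a simple reciprocal, the shortest instruction gives the highest rate and the longest gives the lowest, which is why the 17-period addition pairs with the 235.3k figure and the 12032-period power calculation with 332.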
