Charles Babbage recognized the performance penalty imposed by ripple-carry and developed mechanisms for anticipating carriage in his computing engines. Gerald B. Rosenberger of IBM filed for a patent on a modern binary carry-lookahead adder in 1957.

## Theory of operation

A ripple-carry adder works in the same way as pencil-and-paper methods of addition. Starting at the rightmost (least significant) digit position, the two corresponding digits are added and a result obtained. It is also possible that there may be a carry out of this digit position (for example, in pencil-and-paper methods, "9 + 5 = 4, carry 1"). Accordingly, all digit positions other than the rightmost one need to take into account the possibility of having to add an extra 1 from a carry that has come in from the next position to the right.

This means that no digit position can have an absolutely final value until it has been established whether or not a carry is coming in from the right. Moreover, if the sum without a carry is 9 (in pencil-and-paper methods) or 1 (in binary arithmetic), it is not even possible to tell whether or not a given digit position is going to pass on a carry to the position on its left. At worst, when a whole sequence of sums comes to …99999999… (in decimal) or …11111111… (in binary), nothing can be deduced at all until the value of the carry coming in from the right is known, and that carry is then propagated to the left, one step at a time, as each digit position evaluated "9 + 1 = 0, carry 1" or "1 + 1 = 0, carry 1". It is the "rippling" of the carry from right to left that gives a ripple-carry adder its name, and its slowness. When adding 32-bit integers, for instance, allowance has to be made for the possibility that a carry could have to ripple through every one of the 32 one-bit adders.

1. Calculating for each digit position whether that position is going to propagate a carry if one comes in from the right.
2. Combining these calculated values to be able to deduce quickly whether, for each group of digits, that group is going to propagate a carry that comes in from the right.

Supposing that groups of four digits are chosen the sequence of events goes something like this:

1. All 1-bit adders calculate their results. Simultaneously, the lookahead units perform their calculations.
2. Assuming that a carry arises in a particular group, that carry will emerge at the left-hand end of the group within at most five gate delays and start propagating through the group to its left.
3. If that carry is going to propagate all the way through the next group, the lookahead unit will already have deduced this. Accordingly, before the carry emerges from the next group, the lookahead unit is immediately (within one gate delay) able to tell the next group to the left that it is going to receive a carry – and, at the same time, to tell the next lookahead unit to the left that a carry is on its way.

The net effect is that the carries start by propagating slowly through each 4-bit group, just as in a ripple-carry system, but then move four times as fast, leaping from one lookahead-carry unit to the next. Finally, within each group that receives a carry, the carry propagates slowly within the digits in that group.

The more bits in a group, the more complex the lookahead carry logic becomes, and the more time is spent on the "slow roads" in each group rather than on the "fast road" between the groups (provided by the lookahead carry logic). On the other hand, the fewer bits there are in a group, the more groups have to be traversed to get from one end of a number to the other, and the less acceleration is obtained as a result.

Deciding the group size to be governed by lookahead carry logic requires a detailed analysis of gate and propagation delays for the particular technology being used.

Again, the group sizes to be chosen depend on the exact details of how fast signals propagate within logic gates and from one logic gate to another.

For very large numbers (hundreds or even thousands of bits), lookahead-carry logic does not become any more complex, because more layers of supergroups and supersupergroups can be added as necessary. The increase in the number of gates is also moderate: if all the group sizes are four, one would end up with one third as many lookahead carry units as there are adders. However, the "slow roads" on the way to the faster levels begin to impose a drag on the whole system (for instance, a 256-bit adder could have up to 24 gate delays in its carry processing), and the mere physical transmission of signals from one end of a long number to the other begins to be a problem. At these sizes, carry-save adders are preferable, since they spend no time on carry propagation at all.

Carry-lookahead logic uses the concepts of generating and propagating carries. Although in the context of a carry-lookahead adder, it is most natural to think of generating and propagating in the context of binary addition, the concepts can be used more generally than this. In the descriptions below, the word digit can be replaced by bit when referring to binary addition of 2.

The addition of two 1-digit inputs A and B is said to generate if the addition will always carry, regardless of whether there is an input-carry (equivalently, regardless of whether any less significant digits in the sum carry). For example, in the decimal addition 52 + 67, the addition of the tens digits 5 and 6 generates because the result carries to the hundreds digit regardless of whether the ones digit carries (in the example, the ones digit does not carry (2 + 7 = 9)).

In the case of binary addition, $A+B$ generates if and only if both A and B are 1. If we write $G(A,B)$ to represent the binary predicate that is true if and only if $A+B$ generates, we have

$G(A,B)=A\cdot B$ where $A\cdot B$ is a logical conjunction (i.e., an and).

The addition of two 1-digit inputs A and B is said to propagate if the addition will carry whenever there is an input carry (equivalently, when the next less significant digit in the sum carries). For example, in the decimal addition 37 + 62, the addition of the tens digits 3 and 6 propagate because the result would carry to the hundreds digit if the ones were to carry (which in this example, it does not). Note that propagate and generate are defined with respect to a single digit of addition and do not depend on any other digits in the sum.

In the case of binary addition, $A+B$ propagates if and only if at least one of A or B is 1. If $P(A,B)$ is written to represent the binary predicate that is true if and only if $A+B$ propagates, one has

$P(A,B)=A+B$ where $A+B$ on the right-hand side of the equation is a logical disjunction (i.e., an or).

Sometimes a slightly different definition of propagate is used. By this definition A + B is said to propagate if the addition will carry whenever there is an input carry, but will not carry if there is no input carry. Due to the way generate and propagate bits are used by the carry-lookahead logic, it doesn't matter which definition is used. In the case of binary addition, this definition is expressed by

$P'(A,B)=A\oplus B$ where $A\oplus B$ is an exclusive or (i.e., an xor).

For binary arithmetic, or is faster than xor and takes fewer transistors to implement. However, for a multiple-level carry-lookahead adder, it is simpler to use $P'(A,B)$ .

Given these concepts of generate and propagate, a digit of addition carries precisely when either the addition generates or the next less significant bit carries and the addition propagates. Written in boolean algebra, with $C_{i}$ the carry bit of digit i, and $P_{i}$ and $G_{i}$ the propagate and generate bits of digit i respectively,

$C_{i+1}=G_{i}+(P_{i}\cdot C_{i}).$ ## Implementation details

For each bit in a binary sequence to be added, the carry-lookahead logic will determine whether that bit pair will generate a carry or propagate a carry. This allows the circuit to "pre-process" the two numbers being added to determine the carry ahead of time. Then, when the actual addition is performed, there is no delay from waiting for the ripple-carry effect (or time it takes for the carry from the first full adder to be passed down to the last full adder). Below is a simple 4-bit generalized carry-lookahead circuit that combines with the 4-bit ripple-carry adder we used above with some slight adjustments:

For the example provided, the logic for the generate (g) and propagate (p) values are given below. The numeric value determines the signal from the circuit above, starting from 0 on the far right to 3 on the far left:

$C_{1}=G_{0}+P_{0}\cdot C_{0},$ $C_{2}=G_{1}+P_{1}\cdot C_{1},$ $C_{3}=G_{2}+P_{2}\cdot C_{2},$ $C_{4}=G_{3}+P_{3}\cdot C_{3}.$ Substituting $C_{1}$ into $C_{2}$ , then $C_{2}$ into $C_{3}$ , then $C_{3}$ into $C_{4}$ yields the expanded equations:

$C_{1}=G_{0}+P_{0}\cdot C_{0},$ $C_{2}=G_{1}+G_{0}\cdot P_{1}+C_{0}\cdot P_{0}\cdot P_{1},$ $C_{3}=G_{2}+G_{1}\cdot P_{2}+G_{0}\cdot P_{1}\cdot P_{2}+C_{0}\cdot P_{0}\cdot P_{1}\cdot P_{2},$ $C_{4}=G_{3}+G_{2}\cdot P_{3}+G_{1}\cdot P_{2}\cdot P_{3}+G_{0}\cdot P_{1}\cdot P_{2}\cdot P_{3}+C_{0}\cdot P_{0}\cdot P_{1}\cdot P_{2}\cdot P_{3}.$ To determine whether a bit pair will generate a carry, the following logic works:

$G_{i}=A_{i}\cdot B_{i}.$ To determine whether a bit pair will propagate a carry, either of the following logic statements work:

$P_{i}=A_{i}\oplus B_{i},$ $P_{i}=A_{i}+B_{i}.$ The reason why this works is based on evaluation of $C_{1}=G_{0}+P_{0}\cdot C_{0}$ . The only difference in the truth tables between ($A\oplus B$ ) and ($A+B$ ) is when both $A$ and $B$ are 1. However, if both $A$ and $B$ are 1, then the $G_{0}$ term is 1 (since its equation is $A\cdot B$ ), and the $P_{0}\cdot C_{0}$ term becomes irrelevant. The XOR is used normally within a basic full adder circuit; the OR is an alternative option (for a carry-lookahead only), which is far simpler in transistor-count terms.

The carry-lookahead 4-bit adder can also be used in a higher-level circuit by having each CLA logic circuit produce a propagate and generate signal to a higher-level CLA logic circuit. The group propagate ($PG$ ) and group generate ($GG$ ) for a 4-bit CLA are:

$PG=P_{0}\cdot P_{1}\cdot P_{2}\cdot P_{3},$ $GG=G_{3}+G_{2}\cdot P_{3}+G_{1}\cdot P_{3}\cdot P_{2}+G_{0}\cdot P_{3}\cdot P_{2}\cdot P_{1}.$ They can then be used to create a carry-out for that particular 4-bit group:

$CG=GG+PG\cdot C_{in}.$ It can be seen that this is equivalent to $C_{4}$ in previous equations.

Putting four 4-bit CLAs together yields four group propagates and four group generates. A lookahead-carry unit (LCU) takes these 8 values and uses identical logic to calculate $C_{i}$ in the CLAs. The LCU then generates the carry input for each of the 4 CLAs and a fifth equal to $C_{16}$ .

The calculation of the gate delay of a 16-bit adder (using 4 CLAs and 1 LCU) is not as straight forward as the ripple carry adder.

Starting at time of zero:

• calculation of $P_{i}$ and $G_{i}$ is done at time 1,
• calculation of $C_{i}$ is done at time 3,
• calculation of the $PG$ is done at time 2,
• calculation of the $GG$ is done at time 3,
• calculation of the inputs for the CLAs from the LCU are done at:
• time 0 for the first CLA,
• time 5 for the second, third and fourth CLA,
• calculation of the $S_{i}$ are done at:
• time 4 for the first CLA,
• time 8 for the second, third & fourth CLA,
• calculation of the final carry bit ($C_{16}$ ) is done at time 5.

The maximal time is 8 gate delays (for $S_{[8-15]}$ ).

A standard 16-bit ripple-carry adder would take 16 × 3 − 1 = 47 gate delays.

## Manchester carry chain

The Manchester carry chain is a variation of the carry-lookahead adder that uses shared logic to lower the transistor count. As can be seen above in the implementation section, the logic for generating each carry contains all of the logic used to generate the previous carries. A Manchester carry chain generates the intermediate carries by tapping off nodes in the gate that calculates the most significant carry value. However, not all logic families have these internal nodes, CMOS being a major example. Dynamic logic can support shared logic, as can transmission gate logic. One of the major downsides of the Manchester carry chain is that the capacitive load of all of these outputs, together with the resistance of the transistors causes the propagation delay to increase much more quickly than a regular carry lookahead. A Manchester-carry-chain section generally doesn't exceed 4 bits.