Euclidean algorithm

In number theory, the Euclidean algorithm (also called Euclid's algorithm) is an algorithm that determines the greatest common divisor (GCD) of two elements of any Euclidean domain (for example, the integers). Its main significance is that it does not require factoring the two integers. It is also one of the oldest algorithms known, dating back to the ancient Greeks.

History of the Euclidean algorithm

The Euclidean algorithm is one of the oldest algorithms known: it appears in Euclid's Elements (Book VII, Proposition 2), written around 300 BC. Euclid originally formulated the problem geometrically, as the problem of finding the greatest common "measure" for two line lengths (a line that could be used to measure both lines without a remainder), and his algorithm proceeded by repeated subtraction of the shorter segment from the longer one. However, the algorithm was probably not discovered by Euclid, and it may have been known up to 200 years earlier. It was almost certainly known to Eudoxus of Cnidus (about 375 BC), and Aristotle (about 330 BC) hinted at it in his Topics, 158b, 29–35.

Description of the algorithm

Given two natural numbers a and b, check whether b is zero. If it is, a is the gcd. If not, repeat the process with b in place of a and the remainder after dividing a by b in place of b. The remainder after dividing a by b is usually written as a mod b.

The algorithm can be used in any context where division with remainder is possible. This includes rings of polynomials over a field as well as the ring of Gaussian integers, and in general all Euclidean domains. The generalization beyond the natural numbers is discussed in more detail later in the article.

Using recursion

Using recursion, the algorithm can be expressed:

 function gcd(a, b)
     if b = 0 return a
     else return gcd(b, a mod b)

or in C/C++ as

int gcd(int a, int b) 
{ 
   return ( b == 0 ? a : gcd(b, a % b) ); 
}

Using iteration

An efficient iterative version, useful with compilers that do not optimize tail recursion:

 function gcd(a, b)
     while b ≠ 0
         t := b
         b := a mod b
         a := t
     return a

The extended Euclidean algorithm

By keeping track of the quotients occurring during the algorithm, one can also determine integers p and q with ap + bq = gcd(a, b). This is known as the extended Euclidean algorithm.
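The bookkeeping described above can be sketched in C as follows. This is a minimal illustration rather than a reference implementation: the function name `extended_gcd` and the use of `long` are assumptions made here, and the sketch assumes non-negative inputs whose intermediate values fit in a `long`.

```c
#include <assert.h>

/* Extended Euclidean algorithm (iterative sketch).
 * Returns gcd(a, b) and stores integers p and q with a*p + b*q == gcd(a, b).
 * Assumption: a >= 0, b >= 0, and all intermediate values fit in a long. */
long extended_gcd(long a, long b, long *p, long *q)
{
    long old_r = a, r = b;          /* successive remainders */
    long old_p = 1, cur_p = 0;      /* coefficients of a */
    long old_q = 0, cur_q = 1;      /* coefficients of b */

    assert(a >= 0 && b >= 0);

    while (r != 0) {
        long quotient = old_r / r;
        long t;

        /* Invariant: old_r == a*old_p + b*old_q and r == a*cur_p + b*cur_q. */
        t = old_r - quotient * r;     old_r = r;     r = t;
        t = old_p - quotient * cur_p; old_p = cur_p; cur_p = t;
        t = old_q - quotient * cur_q; old_q = cur_q; cur_q = t;
    }

    *p = old_p;
    *q = old_q;
    return old_r;
}
```

For the inputs 1071 and 1029 this sketch yields p = −24 and q = 25, and indeed 1071 × (−24) + 1029 × 25 = 21.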

Original algorithm

The original algorithm as described by Euclid treated the problem geometrically, using repeated subtraction rather than mod (remainder). This algorithm is exponentially slower than the division-based algorithm in the worst case, but for small enough input values may provide a benefit on machines without a division instruction.

 function gcd(a, b)
     if a = 0 return b
     while b ≠ 0
         if a > b
             a := a − b
         else
             b := b − a
     return a

An example

As an example, consider computing the gcd of 1071 and 1029, which is 21. Recall that “mod” means “the remainder after dividing.”

With the recursive algorithm:

		a	b	Explanations
	gcd(	1071,	1029)	The initial arguments
=	gcd(	1029,	42)	The second argument is 1071 mod 1029
=	gcd(	42,	21)	The second argument is 1029 mod 42
=	gcd(	21,	0)	The second argument is 42 mod 21
=		21		Since `b=0`, we `return a`

With the iterative algorithm:

a	b	Explanation
1071	1029	Step 1: The initial inputs
1029	42	Step 2: The remainder of 1071 divided by 1029 is 42, which is put on the right, and the divisor 1029 is put on the left.
42	21	Step 3: We repeat the loop, dividing 1029 by 42, and get 21 as remainder.
21	0	Step 4: We repeat the loop again. Since 42 is divisible by 21, the remainder is 0 and the algorithm terminates. The number on the left, 21, is the gcd as required.

Observe that a ≥ b in each call. If b > a initially, there is no problem: a mod b is then equal to a, so the first iteration effectively swaps the two values.

Proof

Suppose a and b are the natural numbers whose gcd has to be determined. Now, suppose b > 0, and the remainder of the division of a by b is r. Therefore a = qb + r where q is the quotient of the division.

Any common divisor of a and b is also a divisor of r. To see why this is true, consider that r can be written as r = a − qb. Now, if there is a common divisor d of a and b such that a = sd and b = td, then r = (s−qt)d. Since all these numbers, including s−qt, are whole numbers, it can be seen that r is divisible by d. Also b is divisible by d (this was a premise) and therefore d is a common divisor of r and b.

By the same argument, any common divisor of r and b is also a common divisor of a and b (because a = qb + r). Thus the set of common divisors of a and b is identical to the set of common divisors of r and b, and therefore the maximum of each set (which is the gcd) is the same.

Therefore it suffices to continue searching for the greatest common divisor with the numbers b and r. Since 0 ≤ r < b, the second argument strictly decreases at each step, so we reach r = 0 after finitely many steps.

Running time

The inputs requiring the most divisions are two successive Fibonacci numbers, as proved by Gabriel Lamé: their ratios are the convergents of the continued fraction of the golden ratio, which is the slowest of all continued fraction expansions to converge. The worst case requires O(n) divisions, where n is the number of digits in the input. However, the divisions themselves are not constant-time operations; the actual time complexity of the algorithm is $O(n^{2})$. The reason is that division of two n-bit numbers takes time $O(n(m+1))$, where m is the length of the quotient. Consider the computation of gcd(a, b) where a and b have at most n bits. Let $a_{0},\dots ,a_{k}$ be the sequence of numbers produced by the algorithm, and let $n_{0},\dots ,n_{k}$ be their lengths. Then $k=O(n)$, and the running time is bounded by

$$O{\Big (}\sum _{i<k}n_{i}(n_{i}-n_{i+1}+2){\Big )}\subseteq O{\Big (}n\sum _{i<k}(n_{i}-n_{i+1}+2){\Big )}\subseteq O(n(n_{0}+2k))\subseteq O(n^{2}).$$

This is considerably better than Euclid's original algorithm, in which the modulus operation is effectively performed using repeated subtraction in $O(2^{n})$ steps. Consequently, that version of the algorithm requires $O(2^{n}n)$ time for n-digit numbers, or $O(m\log {m})\,$ time for the number m.
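The difference between the two variants can be observed by counting loop iterations. The following C sketch is illustrative only: the helper names `fibonacci`, `division_steps` and `subtraction_steps` are hypothetical and introduced here, and it counts steps rather than measuring bit-level cost.

```c
#include <assert.h>

/* n-th Fibonacci number with F(0) = 0, F(1) = 1.
 * Assumption: n <= 90 so that the value fits in 64 bits. */
unsigned long long fibonacci(unsigned n)
{
    unsigned long long a = 0, b = 1;
    unsigned i;

    assert(n <= 90);
    for (i = 0; i < n; i++) {
        unsigned long long t = a + b;
        a = b;
        b = t;
    }
    return a;
}

/* Number of remainder operations performed by the division-based algorithm. */
unsigned long long division_steps(unsigned long long a, unsigned long long b)
{
    unsigned long long steps = 0;

    while (b != 0) {
        unsigned long long t = b;
        b = a % b;
        a = t;
        steps++;
    }
    return steps;
}

/* Number of subtractions performed by the subtraction-based algorithm. */
unsigned long long subtraction_steps(unsigned long long a, unsigned long long b)
{
    unsigned long long steps = 0;

    if (a == 0)
        return 0;
    while (b != 0) {
        if (a > b)
            a -= b;
        else
            b -= a;
        steps++;
    }
    return steps;
}
```

With these definitions, the successive Fibonacci numbers 144 and 89 take 10 division steps, while the pair (1000000, 1) takes a single division step but 1000000 subtraction steps.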

Euclid's algorithm is widely used in practice, especially for small numbers, due to its simplicity. An alternative algorithm, the binary GCD algorithm, exploits the binary representation used by computers to avoid divisions and thereby increase efficiency, although it too is $O(n^{2})$; it merely shrinks the constant hidden by the big-O notation on many real machines.
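For comparison, one common formulation of the binary GCD algorithm can be sketched in C as below. The function name `binary_gcd` is chosen here for illustration, and the sketch uses only shifts, comparisons and subtraction; it is not tuned for any particular machine.

```c
#include <assert.h>

/* Binary GCD sketch: replaces division with shifts and subtraction. */
unsigned long binary_gcd(unsigned long a, unsigned long b)
{
    unsigned shift = 0;

    if (a == 0) return b;
    if (b == 0) return a;

    /* Remove the factors of two common to a and b; they are restored at the end. */
    while (((a | b) & 1UL) == 0) {
        a >>= 1;
        b >>= 1;
        shift++;
    }

    /* Remaining factors of two in a are not common and can be discarded. */
    while ((a & 1UL) == 0)
        a >>= 1;

    do {
        /* Discard factors of two in b; b is nonzero here, so this terminates. */
        while ((b & 1UL) == 0)
            b >>= 1;

        assert(a & 1UL);              /* invariant: a is odd */

        /* Keep a <= b, then subtract: odd minus odd is even (possibly zero). */
        if (a > b) {
            unsigned long t = a;
            a = b;
            b = t;
        }
        b -= a;
    } while (b != 0);

    return a << shift;
}
```

On the running example, `binary_gcd(1071, 1029)` returns 21, the same value as the division-based algorithm.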

There are more complex algorithms that can reduce the running time to $O(n(\log n)^{2}(\log \log n))$. See Computational complexity of mathematical operations for more details.

Relation with continued fractions

The quotients that appear when the Euclidean algorithm is applied to the inputs a and b are precisely the numbers occurring in the continued fraction representation of a/b. Take for instance the example of a = 1071 and b = 1029 used above. In the calculation below, the quotients are the second factors 1, 24 and 2:

1071 = 1029 × 1 + 42

1029 = 42 × 24 + 21

42 = 21 × 2 + 0

Consequently,

$${\frac {1071}{1029}}=\mathbf {1} +{\frac {1}{\mathbf {24} +{\frac {1}{\mathbf {2} }}}}.$$

This method applies to arbitrary real inputs a and nonzero b; if a/b is irrational, then the Euclidean algorithm does not terminate, but the computed sequence of quotients still represents the (now infinite) continued fraction representation of a/b.

The quotients 1, 24, 2 count certain squares nested within a rectangle R of length 1071 and width 1029, in the following manner:

(1) there is one 1029×1029 square in R whose removal leaves a 42×1029 rectangle, R₁;

(2) there are 24 squares of size 42×42 in R₁ whose removal leaves a 21×42 rectangle, R₂;

(3) there are 2 squares of size 21×21 in R₂ whose removal leaves nothing.

The "visual Euclidean algorithm" of nested squares applies to an arbitrary rectangle R. If the (length)/(width) of R is an irrational number, then the visual Euclidean algorithm extends to a visual continued fraction.
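For integer inputs, the quotients (and hence the square counts) can be collected with a small change to the iterative algorithm. The C sketch below is an illustration under stated assumptions: the function name `continued_fraction` and the caller-supplied output buffer are choices made here, and b is assumed to be nonzero.

```c
#include <assert.h>
#include <stddef.h>

/* Stores the continued fraction quotients of a/b in out[0..count-1]
 * and returns count. At most `capacity` quotients are written.
 * Assumption: b != 0 and out points to at least `capacity` elements. */
size_t continued_fraction(unsigned long a, unsigned long b,
                          unsigned long *out, size_t capacity)
{
    size_t count = 0;

    assert(b != 0);
    assert(out != NULL);

    while (b != 0 && count < capacity) {
        unsigned long t = b;
        out[count++] = a / b;   /* quotient of this division step */
        b = a % b;              /* remainder becomes the next divisor */
        a = t;
    }
    return count;
}
```

Called with a = 1071 and b = 1029, the sketch writes the three quotients 1, 24 and 2.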

Generalization to Euclidean domains

The Euclidean algorithm can be applied to some rings, not just the integers. The most general context in which the algorithm terminates with the greatest common divisor is in a Euclidean domain. For instance, the Gaussian integers and polynomial rings over a field are both Euclidean domains.
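To illustrate the Gaussian integer case, the C sketch below runs the same loop with the remainder defined by rounding the exact quotient to the nearest Gaussian integer, which makes the norm of the remainder at most half the norm of the divisor. The type `gauss` and the functions `div_round`, `gauss_norm` and `gauss_gcd` are hypothetical names introduced here, and the sketch assumes that all products fit in a `long`. The result is determined only up to multiplication by a unit (1, −1, i or −i).

```c
#include <assert.h>

typedef struct { long re, im; } gauss;   /* represents re + im*i */

/* Integer nearest to n/d for d > 0 (ties may round either way). */
long div_round(long n, long d)
{
    long q, r;

    assert(d > 0);
    q = n / d;                 /* C division truncates toward zero */
    r = n % d;                 /* r has the sign of n */
    if (2 * r >= d)
        q++;
    else if (2 * r < -d)
        q--;
    return q;
}

/* Norm N(z) = re^2 + im^2, the size function for Gaussian integers. */
long gauss_norm(gauss z)
{
    return z.re * z.re + z.im * z.im;
}

/* Euclidean algorithm on Gaussian integers; result is unique up to a unit. */
gauss gauss_gcd(gauss a, gauss b)
{
    while (b.re != 0 || b.im != 0) {
        long n = gauss_norm(b);
        gauss q, r;

        /* q = a * conj(b) / N(b), rounded componentwise. */
        q.re = div_round(a.re * b.re + a.im * b.im, n);
        q.im = div_round(a.im * b.re - a.re * b.im, n);

        /* r = a - q*b, so that N(r) <= N(b)/2. */
        r.re = a.re - (q.re * b.re - q.im * b.im);
        r.im = a.im - (q.re * b.im + q.im * b.re);

        a = b;
        b = r;
    }
    return a;
}
```

For example, applied to 11 + 3i and 1 + 8i the sketch returns a Gaussian integer of norm 5, and applied to the ordinary integers 1071 and 1029 it returns an associate of 21.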

As an example, consider the ring of polynomials with rational coefficients. In this ring, division with remainder is carried out using long division. The resulting polynomials are then made monic by factoring out the leading coefficient.

We calculate the greatest common divisor of

$$x^{4}-4x^{3}+4x^{2}-3x+14=(x^{2}-5x+7)(x^{2}+x+2)$$

and

$$x^{4}+8x^{3}+12x^{2}+17x+6=(x^{2}+7x+3)(x^{2}+x+2).$$

Following the algorithm gives these values:

a	b
$x^{4}+8x^{3}+12x^{2}+17x+6$	$x^{4}-4x^{3}+4x^{2}-3x+14$
$x^{4}-4x^{3}+4x^{2}-3x+14$	$x^{3}+{\tfrac {2}{3}}x^{2}+{\tfrac {5}{3}}x-{\tfrac {2}{3}}$
$x^{3}+{\tfrac {2}{3}}x^{2}+{\tfrac {5}{3}}x-{\tfrac {2}{3}}$	$x^{2}+x+2$
$x^{2}+x+2$	$0$

This agrees with the explicit factorization. For general Euclidean domains, the proof of correctness is by induction on some size function. For the integers, this size function is the absolute value (on the natural numbers, just the identity). For rings of polynomials over a field, it is the degree of the polynomial (note that each step in the above table reduces the degree by at least one).
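The rows of the table can be checked by writing out each division with remainder. The quotients below do not appear in the table; they are worked out here as an illustration, and each remainder is shown as a constant times the monic polynomial listed in the table:

```latex
\begin{align*}
x^{4}+8x^{3}+12x^{2}+17x+6
  &= 1\cdot\left(x^{4}-4x^{3}+4x^{2}-3x+14\right)
   + 12\left(x^{3}+\tfrac{2}{3}x^{2}+\tfrac{5}{3}x-\tfrac{2}{3}\right),\\
x^{4}-4x^{3}+4x^{2}-3x+14
  &= \left(x-\tfrac{14}{3}\right)\left(x^{3}+\tfrac{2}{3}x^{2}+\tfrac{5}{3}x-\tfrac{2}{3}\right)
   + \tfrac{49}{9}\left(x^{2}+x+2\right),\\
x^{3}+\tfrac{2}{3}x^{2}+\tfrac{5}{3}x-\tfrac{2}{3}
  &= \left(x-\tfrac{1}{3}\right)\left(x^{2}+x+2\right) + 0.
\end{align*}
```

The degrees of the remainders are 3, 2 and then the zero polynomial, so the size function strictly decreases at every step.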

References

Donald Knuth. The Art of Computer Programming, Volume 2: Seminumerical Algorithms, Third Edition. Addison-Wesley, 1997. ISBN 0-201-89684-2. Sections 4.5.2–4.5.3, pp. 333–379.
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0-262-03293-7. Section 31.2: Greatest common divisor, pp. 856–862.
Clark Kimberling. "A Visual Euclidean Algorithm," Mathematics Teacher 76 (1983), pp. 108–109.
