Sieve of Eratosthenes

In mathematics, the Sieve of Eratosthenes (Greek: κόσκινον Ἐρατοσθένους) is a simple, ancient algorithm for finding all prime numbers up to a specified integer.^[1] It works efficiently for the smaller primes (below 10 million).^[2] It was created by Eratosthenes, an ancient Greek mathematician; however, none of his mathematical works survived—the sieve was described and attributed to Eratosthenes in the Introduction to Arithmetic by Nicomachus.^[3]

Algorithm description

A prime number is a natural number which has exactly two distinct natural number divisors: 1 and itself.

To find all the prime numbers less than or equal to a given integer n by Eratosthenes' method:

Create a list of consecutive integers from two to n: (2, 3, 4, ..., n).
Initially, let p equal 2, the first prime number.
Starting from p, count up by p and cross out thus found numbers in the list (which will be 2p, 3p, 4p, etc.).
Find the first number not yet crossed out after p; let p now equal this number (which is the next prime).
Repeat steps 3 and 4 until p is greater than n.
All the numbers in the list which are not crossed out are prime.

As a refinement, it is sufficient to cross out the numbers in step 3 starting from p² because all the lesser ones will be already crossed out at that point. That means it's OK to stop in step 5 when p² reaches the top limit n, as well. This does not appear in the original algorithm.^[4]

A common improvement is to work with odd numbers only (3, 5, ..., n), and count up using 2p as a step (thus excluding all even numbers in advance). This actually appears in the original description.^[5]

Unbounded sieve

An unbounded formulation of the sieve^[6] states that primes are all the numbers above 1 with all the composites removed from them (defined as multiples of primes, generated by counting up from each prime as in the usual formulation). Generated segment-wise through use of fixed-size array it is known as the segmented sieve.

Example

To find all the prime numbers less than or equal to 30, proceed as follows.

First generate a list of integers from 2 to 30:

 2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

First number in the list is 2; cross out every 2nd number in the list after it (by counting up in steps of 2), i.e. all the multiples of 2:

 2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

Next number in the list after 2 is 3; cross out every 3-rd number in the list after it (by counting up in steps of 3), i.e. all the multiples of 3:

 2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

Next number not yet crossed out in the list after 3 is 5; cross out every 5-th number in the list after it (by counting up in steps of 5), i.e. all the multiples of 5:

 2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

Next number not yet crossed out in the list after 5 is 7 so we have to cross out every 7-th number in the list after it, but they are all already crossed out at this point, as they are also multiples of smaller primes (14, 21, 28); same with 11 and 13 (this is because 7*7 is greater than 30). The numbers left not crossed out in the list at this point are all the prime numbers below 30:

 2  3     5     7           11    13          17    19          23                29

Algorithm complexity

The bit complexity of the algorithm is $O(n(\log n)(\log \log n))$ bit operations with a memory requirement of $O(n)$ .^[7]

There are $O(n/p_{i})$ multiples to remove on each step for each prime $p_{i}<{\sqrt {n}}$ ; there are thus $O(n\log \log n)$ composites to be removed generated by the sieve, a direct consequence of the fact that the prime harmonic series asymptotically approaches $\log \log n$ .

Time complexity in RAM machine model is thus $O(n\log \log n)$ operations, predicated on $O(1)$ direct generation of each composite (i.e. by counting up in steps of $p_{i}$ ) and $O(1)$ removal of each directly generated composite (ensuring $O(n/p_{i})$ time complexity of each step in the algorithm), usually achieved using random access mutable array, where the conflation of value and address is the key to improving the theoretical complexity as it becomes essentially a duplicates-removing pigeonhole sort (a variety of integer sorting as opposed to comparison sorting).

The segmented version of the sieve of Eratosthenes, with basic optimizations, uses $O(n)$ operations and $O(n^{1/2}\log \log n/\log n)$ bits of memory.^[8]

Implementation

In pseudocode:

Input: an integer n > 1

Let A be an array of bool values, indexed by integers 2 to n,
initially all set to true.

count = n - 1
for i = 2, 3, 4, ..., while i^2 ≤ n:
  if A[i] is true:
    for j = i^2, i^2 + i, i^2 + 2i, ..., while j ≤ n:
      if A[j] is true:
        A[j] = false
        count = count - 1

Now all i such that A[i] is true are prime,
and count is the total count of primes found.

An unbounded sieve:

Input: none

primes = 2 : 3 : (odds(5) \\ mapEach(multiplesOf, tail(primes)))
multiplesOf(p) = generate q, q + 2p, q + 4p, ..., where q = p · p
odds(n) = generate n, n + 2, n + 4, ...
tail(a : b) = b

where
  a : b denotes a sequence b with an element a prefixed to it
  \\ is a set difference operation

Here primes is a sequence of prime numbers generated by the algorithm, one by one, and used by it in generating the sequences of each prime's multiples by conting up in equal spaces, as is characteristic of the sieve of Eratosthenes. Set difference operation will have to be repeated for each relevant multiples' sequence (those that started below the current odd under consideration).

The timing issues are left to be taken care of by a concrete implementation. With sequences represented segment-wise on a fixed size array, it becomes a segmented sieve.

Mnemonic

A short, albeit imprecise, description of the Sieve of Eratosthenes in verse:

Sift the Twos and sift the Threes,
The Sieve of Eratosthenes.
When the multiples sublime,
The numbers that remain are Prime.
— Traditional, ^[9]^[10]

Euler's sieve

Euler in his Proof of the Euler product formula for the Riemann zeta function came up with a version of the sieve of Eratosthenes, better in the sense that each number was eliminated exactly once. Unlike Eratosthenes' sieve which marks multiples of primes, as it finds them, in the same candidates sequence, Euler's sieve actually removes all the multiples of each prime it finds, creating progressively culled candidates sequences:

A) Start with all the natural numbers except '1' which is not a prime:

  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 ...
  ^

B) The leftmost number is prime. Multiply each number in the list by this prime to find all the multiples of this prime, and then discard the products:

 *2:
 (4  6  8 10 12 14 16 18 20 22 24 26 28 30 ... )

  New list:
  2  3     5     7     9    11    13    15    17    19    21    23    25    27    29 ...
     ^

C) The number after the previous prime is the next prime. Multiply by it each number in the new list starting from this prime and then discard the new products:

   *3:
    (9    15    21    27    33    39    45    51    57    63    69    75    81    87 ...)

  New list:
  2  3     5     7          11    13          17    19          23    25          29 ...
           ^
         *5:
         (25    35          55    65          85    95          115   125         145 ...)
       ........

Repeat C) indefinitely. On each repetition a new prime is identified (marked by the cursor, ^) until all the primes in the starting list have been found.

When generating a bounded sequence of primes, when we have exceeded the square root of the upper limit we already have the desired sequence of primes. In the example given above that will be achieved when we identify the prime 7, to give us a list of all primes less than or equal to 30.

Thus in the progression of steps, after the step k the sequence consists of all numbers co-prime with the first k primes (i.e. containing no multiples of them whatever), which is how the wheel factorization comes about. Note that numbers that will get discarded by some step are still being used while it's building its list of multiples, i.e. for 3 we have 3*3=9, 3*5=15, 3*7=21, 3*9=27, ..., 3*15=45, ..., thus we can't remove them as soon as they are found, so the timing issues on imperative implementations can be subtle... or we can just start from the high end of the list.

References

Notes

^ Horsley, Rev. Samuel, F. R. S., "Κόσκινον Ερατοσθένους or, The Sieve of Eratosthenes. Being an Account of His Method of Finding All the Prime Numbers," Philosophical Transactions (1683-1775), Vol. 62. (1772), pp. 327-347.
^ The Prime Glossary: "The Sieve of Eratosthenes", http://primes.utm.edu/glossary/page.php?sort=SieveOfEratosthenes, references 16. November 2008.
^ Nicomachus, Introduction to Arithmetic, I, 13. [1]
^ Nicomachus, ibid., p. 31, where e.g. 93 is marked as a multiple of 31.
^ Nicomachus, ibid., p. 31, where only odd numbers appear in the table.
^ the code attributed to Richard Bird in O'Neill, Melissa E., "The Genuine Sieve of Eratosthenes", Journal of Functional Programming, Published online by Cambridge University Press 9 October 2008 doi:10.1017/S0956796808007004, pp. 10, 11
^ Pritchard, Paul, "Linear prime-number sieves: a family tree," Sci. Comput. Programming 9:1 (1987), pp. 17–35.
^ A. O. L. Atkin and D. J. Bernstein, "Prime sieves using binary quadratic forms", Mathematics of Computation 73 (2004), pp. 1023–1030.
^ Merritt, Doug (December 14, 2008). "Sieve Of Eratosthenes". Retrieved 2009-03-26.
^ Nyk¨anen, Matti (October 26, 2007). "An Introduction to Functional Programming with the Programming Language Haskell" (PDF). Retrieved 2009-03-26.

External links

[1] Horsley, Rev. Samuel, F. R. S., "Κόσκινον Ερατοσθένους or, The Sieve of Eratosthenes. Being an Account of His Method of Finding All the Prime Numbers," Philosophical Transactions (1683-1775), Vol. 62. (1772), pp. 327-347.

[2] The Prime Glossary: "The Sieve of Eratosthenes", http://primes.utm.edu/glossary/page.php?sort=SieveOfEratosthenes, references 16. November 2008.

[3] Nicomachus, Introduction to Arithmetic, I, 13. [1]

[4] Nicomachus, ibid., p. 31, where e.g. 93 is marked as a multiple of 31.

[5] Nicomachus, ibid., p. 31, where only odd numbers appear in the table.

[6] the code attributed to Richard Bird in O'Neill, Melissa E., "The Genuine Sieve of Eratosthenes", Journal of Functional Programming, Published online by Cambridge University Press 9 October 2008 doi:10.1017/S0956796808007004, pp. 10, 11

[7] Pritchard, Paul, "Linear prime-number sieves: a family tree," Sci. Comput. Programming 9:1 (1987), pp. 17–35.

[8] A. O. L. Atkin and D. J. Bernstein, "Prime sieves using binary quadratic forms", Mathematics of Computation 73 (2004), pp. 1023–1030.

[9] Merritt, Doug (December 14, 2008). "Sieve Of Eratosthenes". Retrieved 2009-03-26.

[10] Nyk¨anen, Matti (October 26, 2007). "An Introduction to Functional Programming with the Programming Language Haskell" (PDF). Retrieved 2009-03-26.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

v t e Number-theoretic algorithms
Primality tests	AKS APR Baillie–PSW Elliptic curve Pocklington Fermat Lucas Lucas–Lehmer Lucas–Lehmer–Riesel Proth's theorem Pépin's Quadratic Frobenius Solovay–Strassen Miller–Rabin
Prime-generating	Sieve of Atkin Sieve of Eratosthenes Sieve of Pritchard Sieve of Sundaram Wheel factorization
Integer factorization	Continued fraction (CFRAC) Dixon's Lenstra elliptic curve (ECM) Euler's Pollard's rho p − 1 p + 1 Quadratic sieve (QS) General number field sieve (GNFS) Special number field sieve (SNFS) Rational sieve Fermat's Shanks's square forms Trial division Shor's
Multiplication	Ancient Egyptian Long Karatsuba Toom–Cook Schönhage–Strassen Fürer's
Euclidean division	Binary Chunking Fourier Goldschmidt Newton-Raphson Long Short SRT
Discrete logarithm	Baby-step giant-step Pollard rho Pollard kangaroo Pohlig–Hellman Index calculus Function field sieve
Greatest common divisor	Binary Euclidean Extended Euclidean Lehmer's
Modular square root	Cipolla Pocklington's Tonelli–Shanks Berlekamp Kunerth
Other algorithms	Chakravala Cornacchia Exponentiation by squaring Integer square root Integer relation (LLL; KZ) Modular exponentiation Montgomery reduction Schoof Trachtenberg system
Italics indicate that algorithm is for numbers of special forms