Talk:Quadratic sieve

From Wikipedia, the free encyclopedia
Jump to: navigation, search
WikiProject Mathematics (Rated B-class, Mid-priority)
WikiProject Mathematics
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Mathematics rating:
B Class
Mid Priority
 Field: Number theory

The paragraph named "Data processing" needs a serious rewrite. I just fixed an error that was introduced six(!) months ago, but even though it does not contain any more "logical errors", it does not show what happens at all. The 350^2 = (-90)^2 example is a very, very bad one, because actually 350 *is* -90 (mod 55). It only happens to prove the result because 350-90 = 2*350 has factor 5 common with 55, which itself is true only because 350 contains factor 5. This is nonsense! I really believe a less contrived example would be welcome, as it would decrease the probability of such errors and would help readers get a better grasp of the concept.

OK, I wrote a new example, with a bigger n and more strictly following the procedure outlined in the article (this example uses sieving). Decrypt3 23:44, Dec 13, 2004 (UTC)

New stuff[edit]

I tried to add a more approachable introduction to the ideas behind the algorithm, based roughly on the presentation from Prime Numbers: A Computational Perspective. Please scan my contributions for errors - number theory isn't quite my forte. Thanks. Deco 22:05, 8 October 2005 (UTC)


I'm not familiar with how to edit these pages, but with the example of 1817, you should also produce the term (5,392) with 392=2^3*7^2 being smooth.

Also the line ...


should read

I just made a change. The original article said it was an extension of Dixon's algorithm, but I don't think this is a good analogy for two reasons: (i) Pomerance himself described it as a simple change to Schroeppel's algorithm (not Dixon's), and (ii) Dixon's algorithm and Quadratic Sieve are in two different categories: the former is a provable algorithm, the latter is heuristic. Scott contini (talk) 11:39, 12 August 2010 (UTC)

Space complexity[edit]

I'm not yet sure whether the recent correction to the space complexity is correct or not. The optimal choice of the smoothness bound B is indeed the square root of the time complexity. On the other hand, if the bit matrix is represented explicitly, it requires B2 space. On the other other hand, we don't normally represent it explicitly - we use a sparse matrix representation that only stores a small number of factors per row, and reduce it using the Lanczos method for large sparse linear systems. We can't guarantee that the matrix will be sparse, but in practice it is. The cited reference referred to Number Field Sieve and so does not apply. Until someone finds an explicit reference describing the space complexity, I've removed any mention of it, since I'm uncertain myself. Deco 18:53, 26 November 2006 (UTC)

  • I searched high and low, but I couldn't find anything on the space complexity, either. Perhaps it's time to ask someone with more knowledge in number theory. On another note, I'd imagine that each main step of the quadratic sieve would have different complexities, with the sieving and matrix-solving steps being particularly large when compared to the other three. Personally, I think that there are still many things that can be added to the article. --Ixfd64 11:55, 27 November 2006 (UTC)
    • My computational number theory book (already referenced in the article) has the following to say in section 6.1.3:
      • There have been suggested at least three alternative sparse-matrix methods intended to replace Gaussian elimination [...] the space required is O(N ln B). Since our factorization matrices have at most O(ln n) nonzero entries per row, the space requirement for the matrix stage of the algorithm, using a sparse encoding, is O(B ln2 n).
    • If you put this together with the optimal choice of B being , this means the space in the matrix stage is:
    • On the other hand, my book also says that "there are hidden costs for increasing the list of primes p with which we sieve. One is that it is unlikely we can fit the entire sieve array into memory on a computer, so we segment it [...]". It seems difficult to establish which stage dominates in space, although in practice the space issues of the sieve are more easily overcome. As always, little is rigorous when it comes to QS. I suggest that we say that the matrix stage requires the amount of memory shown above, and leave it at that. Deco 12:21, 27 November 2006 (UTC)
It all becomes a lot simpler if you use L-notation. The matrix has rows each of which have at most (actually less than) entries. This implies the size of the matrix is at most -- the little o in the exponent of the swallows the that is multiplied by it. So the memory is indeed square root of the running time. Also, memory during sieve stage is dominated by storing the factor base which is again size. Scott contini (talk) 11:53, 12 August 2010 (UTC)

To do[edit]

Some stuff that needs to be done in this article:

  • Explain how the asymptotically optimal value of the smoothness bound B is chosen.
  • Explain better the "sign" bit of the vectors and why it's needed.
  • Talk about multiple polynomial variations.
  • Talk about sparse matrix representations and solution methods and their applicability.
  • Talk about how large a region needs to be sieved.
  • Talk about the use of approximate logarithms to speed up calculations.
  • Similarities and differences with number field sieve.
  • Lots more stuff.

Just some things to keep in mind for future expansion. There is plenty to add here. Deco 01:13, 29 November 2006 (UTC)

In the German Article some of thoose questions (like multiple polynomials, approximate logarithms) were discussed. An Example is also given. 11:42, 6 February 2007 (UTC)

Implementing a Quadratic Sieve[edit]

Hey, as part of my 3rd year project, one of the things I am doing is implementing this. I think I have found an error in the selection of the Factor Base, could someone please check that 13 should not be in the factor base? Also should 23 be? It's a factor of 1817. I will be further correcting this description if I find what I feel is a mistake, but could people possibly review my changes? Cheers, Tompsci 16:01, 18 January 2007 (UTC)

I don't see why 13 shouldn't be in the factor base. (1817/13) = (10/13) and 6^2 = 10 mod 13. Thus the Legendre symbol is 1. And you can't exclude 23 unless you know in advance that it is a factor of 1817. But you can't know that without factoring the number, which is the object of the exercise. In practice a quadratic sieve implementation will trial divide by all primes upto the largest prime in the factor base, it will check that the number being factored is not prime and it will check that the number being factored is not a perfect power. In all these situations the quadratic sieve will fail to find a factor otherwise. Someone ought to remove the thing about the article being factually incorrect (apart from my comments below), and perhaps the talk page shouldn't be used as a means of asking questions for a 3rd year project. :-) Anyhow, 23 is not mentioned as an element in the factor base in the article is it? wbhart 03:45, 11 February 2007 (UTC)
I think the error in question has been sufficiently addressed by wbhart, so I am going to remove the tag. Schneau 17:25, 25 June 2007 (UTC)

crossover with NFS[edit]

In various experiments using best-available-free implementations (ggnfs for the NFS, msieve for MPQS) I found the crossover was nearer 100 than 110 digits; a 102-digit number was 17hrs msieve and 8hrs ggnfs on the same hardware Fivemack 11:42, 7 February 2007 (UTC)

- This is a moving target dependent on:

* the architecture of the hardware used, memory throughput vs processor clock speed vs hard disk speed, etc
* the amount of time invested in optimizing the number field sieve and quadratic sieve
* the "goodness" of the polynomials used with the number field sieve (still an open research problem)
* the particular numbers being factored - wbhart 03:45, 11 February 2007 (UTC)

Fastest factoring method?[edit]

The statement that the quadratic sieve is the fastest factoring method for numbers of less than 100 digits and that the number field sieve is the fastest otherwise is factually incorrect. Many numbers of hundreds of decimal digits have been factored with the elliptic curve algorithm and yet the sun would die of heat death before the quadratic sieve or number field sieve would factor them. Also, below about 40 digits, the quadratic sieve is not usually the best algorithm. In practice a cocktail of one or more of SQUFOF, P-1, P+1, elliptic curve method, Pollard-Rho, trial factoring and the Brent variant are used.

The *only* correct statement is that the number field sieve is asymptotically the fastest known algorithm for factoring general integers. The quadratic sieve held this title until the NFS was invented. The elliptic curve method has the same asymptotics, but depends only on the size of the factor, not the number being factored. This makes the elliptic curve algorithm your only hope in certain situations.

For numbers of a special form, many faster algorithms exist, e.g: Zhang's sieve, hybrid QS-GNFS's, the special number field sieve, certain variants of Fermat's algorithm (even special numbers of thousands of digits can be factored with these), etc, etc. Thus the qualifier of "general integer" is required when talking about comparative times. - wbhart 04:02, 11 February 2007 (UTC)

Heat death refers to the death of the universe, and according to current models will take 101000 years, which is more than enough time to factor numbers with 100s of digits. I think you meant to say the Sun would die of gravitational collapse. Arvindn 06:15, 12 February 2007 (UTC)

It is the second fastest (randomized) algorithm for general inputs, as stated in the third sentence: "It is a general-purpose factorization algorithm, meaning that its running time depends solely on the size of the integer to be factored, and not on special structure or properties." No other algorithms can ensure a better running time on all possible inputs (except the NFS for big inputs). It is a Las Vegas algorithm, which can ensure the running time under certain mathematical conditions (which hold with a high probability). Of course for some specific Inputs there were faster algorithms. The running time of these algorithms depend on certain conditions of the number to factorize. The running time of the ECM depends on the (length) of the factors. -- 14:45, 20 February 2007 (UTC)

Right, but that is not what the article says in the first sentence. wbhart —The preceding unsigned comment was added by (talk) 12:17, 6 April 2007 (UTC).

Partial Relations & Cycles = Large Primes?[edit]

As I see it the section labeled "partial relations and cycles" is identical to the methods described in section "large primes." The latter seems to do a much more concise job with the description. Am I right or wrong on this?

I am curious about this statement:

11−1 modulo 91 is 58

Could someone explain how 1/11th can be 58 (mod 91)? Admittedly I may be missing something, but this statement makes no sense to me. (talk) 16:57, 14 September 2009 (UTC)

I know you asked a long time ago, but anyway the answer is that "1/11 is 58" (mod 91) because 58×11 is 1 (mod 91). See modular multiplicative inverse. (BTW, I sympathize with your suffering; the article as a whole seems pretty confusing to me.) --Quuxplusone (talk) 04:00, 10 February 2010 (UTC)

Numbers of primes and relations[edit]

The algorithm description talks about π(B) (the number of primes less than B) which is ok. However, the statement "use sieving to locate π(B) + 1 [relations]..." seems incorrect (i.e. misleading) to me for two reasons:

1) not every prime actually contributes to the factorization (only those which give quadratic residues)

2) because not every congruence of squares leads to a solution, it is common to gather a few more relations (not only 1)

So, something like "π(B)/2 + c" (for some constant c, e.g. 10) might be more accurate. —Preceding unsigned comment added by (talk) 20:57, 8 January 2009 (UTC)

I believe you're correct - rather than just substituting the formula this could probably afford to be explained in more detail. Dcoetzee 23:48, 8 January 2009 (UTC)

General Optimization to the Matrix Processing of QS-Algorithm[edit]

The matrix processing is not only a basic part of the Quadratic-Sieve-Algorithm but it is a basic part of all variants of the Dixon-Algorithm. For this reason the following general optimization to the matrix processing of QS-Algorithm is valid also for all the other variants of the Dixon-Algorithm. With this general optimization to the matrix processing it is possible to speed up the execution time of the QS-Algorithm nearly about the factor 3/4 to 2.

Let be N the odd and composite number which is to factorize, S the defined smoothness bound and B = {-1, p1, p2, p3, ... pK-1} the factor base. The factor base contains the factor p0 = -1 and all the primes p which are not greater than the smoothness bound S and where N is a quadratic residue of the prime p. Because the factor base B contains exactly K different factors, the binary quadratic matrix which is to process has K rows and K colums. It's true, to get any linear dependent combination of vectors of a K dimensional linear independent system you need at least K+1 vectors. It's also true that not each linear dependent combination of vectors leads us to a nontrivial factor of N. For this reason you will need some times significant more than K+1 vectors even in the case if K the number of factors in the factor base is relatively small.

If the two phases of the QS-Algorithm - the data collection phase (DCP) and the data processing phase (DPP) - are strictly separated and the DPP is executed after than the DCP is completed then there exist the following problem. Because at the DCP you don't know exactly how many S-smooth values you will really need in the DPP, you must generate in each case enough S-smooth values. That means in the DCP you have to generate not only K+1 but significant more than K+1 different S-smooth values. But it's very difficult to find enough S-smooth values even in the case if the values are not relatively small.

For this reason I would suggest a repeated pipelining of the DCP and the DPP. That means each time if a new S-smooth value was found in the DCP this new S-smooth value should be processed in the DPP at the same time. With other words: Now the DCP and the DPP are no more two different phases of the QS-Algorithm but now they are two different parts of the QS-Algorithm which can be executed as a pipeline, executable as often as needed. If the processing of any new S-smooth value leads us to a nontrivial factor of N, the QS-Algorithm can be terminated at this point, otherwise the next S-smooth value must be found an processed. By this way it is very often possible to terminate the QS-Algorithm successfully even if significant less than K different S-smooth values were found an processed.

A simple test implementation of the QS-Algorithm with a pipelining of the DCP and the DPP leads to the following results: Some times there were needed only about (75% of K) different S-smooth values, often there were needed only about (90% of K) different S-smooth values, and only in very rare cases there were needed more then (100% of K) different S-smooth values to get a nontrivial factor of N.

Because now is no more need in the QS-Algorithm to generate each time significant more then K different S-smooth values, you can successive increase S the smoothness bound. This increases K - the number of factors in the factor base, which leads to a lot more of relatively small S-smooth values, which further increases the execution time of the QS-Algorithm, ...

To achive such a pipelining of the DCP and the DPP you start with an empty binary quadratic Matrix M and you must fill it step by step. Each time if any new S-smooth value Yi was found, you determine its binary exponent vector Ei = (ei,0, ei,1, ..., ei,k-1) where all the ei,j are binary values {0,1}. While the new Ei is not completely zero, determine d the degree of this binary vector Ei. The degree of a binary vector is defined as the index of the most significant bit which is equal to 1. If the row with the index d in the matrix M is still empty - that means completely zero, save the new binary exponent vector Ei unchanged into the row with the index d of the matrix M. Otherwise xor the existing binary vector Ed of the Matrix M to the new binary vector Ei. Now as the result of the xor operation the changed binary vector Ei may be completely zero then try to find a nontrivial factor of N by determining of the gcd(Xi-Yi, N). If the changed binary vector Ei was not completely zero, it has a new degree definitely less than the old degree d. By this way you create step by step a binary triangle matrix where the right upper triangle part of the matrix is completely zero. Its very unlikely that all the binary values at the main diagonal of the matrix M are equal to 1. Even for this reason its possible to find a linear dependent combination of binary vectors and a nontrivial factor of N even in the case if the matrix M is not completely filled.

Please correct me if I am wrong and correct my terrible English, because I am not a native English speaker. Aragorn321 (talk) 10:28, 1 July 2013 (UTC)

How QS optimizes finding congruences[edit]

I'm not sure I'm right and my English is not very good, but I think there's something wrong in this section:

I think the second line should read

That's because

The Handbook of Applied Cryptography (Chapter 3, page 95, equation (3.1)) seems to agree. If I'm correct, then the conclusion

This implies that y is on the order of 2x[√n]. However, it also implies that y grows linearly with x times the square root of n.

is also wrong. —Preceding unsigned comment added by (talk) 21:50, 2 July 2009 (UTC)

Asymptotically you are right. The term only becomes sigificant if is comparable with . But in practice when n is large then x will always be much smaller than and hence the approximation in the article is ok. (talk) 09:32, 4 July 2009 (UTC)
Oh. You're right. Thanks for clarifying this. —Preceding unsigned comment added by (talk) 23:29, 4 July 2009 (UTC)

Example correction[edit]

Shouldn't the line about finding quadratic residues be 15347 (mod p) instead of √15347 (mod p)? (talk) 21:27, 26 November 2011 (UTC)

The line in question spoke of the existence of a square root of 15347 mod p, which is equivalent to 15347 being a quadratic residue mod p. I rewrote it to try to clarify things. Dcoetzee 22:48, 26 November 2011 (UTC)

Error in "the approach[edit]

The subheading "the approach" includes the following:

To factorise the integer n, Fermat's method entails a search for a single number a such that a2 mod n is a square. But these a are hard to find.

That's nonsense of course. Its very easy to find such a. For example -a will work in all cases as will 2 for n larger than 4 and so on. What is meant is something like "a single number a ... is a square of a number other than itself or -a". I'm not sure how to word this neatly and wondered if someone else could do so? Francis Davey (talk) 11:57, 14 April 2012 (UTC)


I added my new open source PSIQS package coming with basic QS, MPQS, SIQS and a few other algorithms. Cheers, Tilman Neumann (talk) 14:12, 15 January 2016 (UTC)