Talk:Quadratic sieve

From Wikipedia, the free encyclopedia
Jump to: navigation, search
WikiProject Mathematics     (Rated B-Class)
WikiProject Mathematics
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Mathematics rating: B Class Mid Priority Field: Number theory

Please update this rating as the article progresses, or if the rating is inaccurate.


The paragraph named "Data processing" needs a serious rewrite. I just fixed an error that was introduced six(!) months ago, but even though it does not contain any more "logical errors", it does not show what happens at all. The 350^2 = (-90)^2 example is a very, very bad one, because actually 350 *is* -90 (mod 55). It only happens to prove the result because 350-90 = 2*350 has factor 5 common with 55, which itself is true only because 350 contains factor 5. This is nonsense! I really believe a less contrived example would be welcome, as it would decrease the probability of such errors and would help readers get a better grasp of the concept.

OK, I wrote a new example, with a bigger n and more strictly following the procedure outlined in the article (this example uses sieving). Decrypt3 23:44, Dec 13, 2004 (UTC)

Contents

[edit] New stuff

I tried to add a more approachable introduction to the ideas behind the algorithm, based roughly on the presentation from Prime Numbers: A Computational Perspective. Please scan my contributions for errors - number theory isn't quite my forte. Thanks. Deco 22:05, 8 October 2005 (UTC)


[edit] Corrections

I'm not familiar with how to edit these pages, but with the example of 1817, you should also produce the term (5,392) with 392=2^3*7^2 being smooth.

Also the line ...

  x^2\equiv 1817\pmod{p}  

should read (x+42)^2\equiv 1817\pmod{p}


I just made a change. The original article said it was an extension of Dixon's algorithm, but I don't think this is a good analogy for two reasons: (i) Pomerance himself described it as a simple change to Schroeppel's algorithm (not Dixon's), and (ii) Dixon's algorithm and Quadratic Sieve are in two different categories: the former is a provable algorithm, the latter is heuristic. Scott contini (talk) 11:39, 12 August 2010 (UTC)

[edit] Space complexity

I'm not yet sure whether the recent correction to the space complexity is correct or not. The optimal choice of the smoothness bound B is indeed the square root of the time complexity. On the other hand, if the bit matrix is represented explicitly, it requires B2 space. On the other other hand, we don't normally represent it explicitly - we use a sparse matrix representation that only stores a small number of factors per row, and reduce it using the Lanczos method for large sparse linear systems. We can't guarantee that the matrix will be sparse, but in practice it is. The cited reference referred to Number Field Sieve and so does not apply. Until someone finds an explicit reference describing the space complexity, I've removed any mention of it, since I'm uncertain myself. Deco 18:53, 26 November 2006 (UTC)

  • I searched high and low, but I couldn't find anything on the space complexity, either. Perhaps it's time to ask someone with more knowledge in number theory. On another note, I'd imagine that each main step of the quadratic sieve would have different complexities, with the sieving and matrix-solving steps being particularly large when compared to the other three. Personally, I think that there are still many things that can be added to the article. --Ixfd64 11:55, 27 November 2006 (UTC)
    • My computational number theory book (already referenced in the article) has the following to say in section 6.1.3:
      • There have been suggested at least three alternative sparse-matrix methods intended to replace Gaussian elimination [...] the space required is O(N ln B). Since our factorization matrices have at most O(ln n) nonzero entries per row, the space requirement for the matrix stage of the algorithm, using a sparse encoding, is O(B ln2 n).
    • If you put this together with the optimal choice of B being O\left(\exp\frac{1}{2}\left(\sqrt{\log n \log\log n}\right)\right), this means the space in the matrix stage is:
      • O\left(\log^2 n \exp\frac{1}{2}\left(\sqrt{\log n \log\log n}\right)\right)
    • On the other hand, my book also says that "there are hidden costs for increasing the list of primes p with which we sieve. One is that it is unlikely we can fit the entire sieve array into memory on a computer, so we segment it [...]". It seems difficult to establish which stage dominates in space, although in practice the space issues of the sieve are more easily overcome. As always, little is rigorous when it comes to QS. I suggest that we say that the matrix stage requires the amount of memory shown above, and leave it at that. Deco 12:21, 27 November 2006 (UTC)
It all becomes a lot simpler if you use L-notation. The matrix has L_n[1/2,1/2] rows each of which have at most (actually less than) \ln n entries. This implies the size of the matrix is at most \left(\ln n\right) L_n[1/2,1/2] = L_n[1/2,1/2] -- the little o in the exponent of the L_n[1/2,1/2] swallows the \left(\ln n\right) = e^{\left(\ln\ln n\right)} that is multiplied by it. So the memory is indeed square root of the running time. Also, memory during sieve stage is dominated by storing the factor base which is again L_n[1/2,1/2] size. Scott contini (talk) 11:53, 12 August 2010 (UTC)

[edit] To do

Some stuff that needs to be done in this article:

  • Explain how the asymptotically optimal value of the smoothness bound B is chosen.
  • Explain better the "sign" bit of the vectors and why it's needed.
  • Talk about multiple polynomial variations.
  • Talk about sparse matrix representations and solution methods and their applicability.
  • Talk about how large a region needs to be sieved.
  • Talk about the use of approximate logarithms to speed up calculations.
  • Similarities and differences with number field sieve.
  • Lots more stuff.

Just some things to keep in mind for future expansion. There is plenty to add here. Deco 01:13, 29 November 2006 (UTC)

In the German Article http://de.wikipedia.org/wiki/Quadratisches_Sieb some of thoose questions (like multiple polynomials, approximate logarithms) were discussed. An Example is also given. 87.234.198.146 11:42, 6 February 2007 (UTC)

[edit] Implementing a Quadratic Sieve

Hey, as part of my 3rd year project, one of the things I am doing is implementing this. I think I have found an error in the selection of the Factor Base, could someone please check that 13 should not be in the factor base? Also should 23 be? It's a factor of 1817. I will be further correcting this description if I find what I feel is a mistake, but could people possibly review my changes? Cheers, Tompsci 16:01, 18 January 2007 (UTC)

I don't see why 13 shouldn't be in the factor base. (1817/13) = (10/13) and 6^2 = 10 mod 13. Thus the Legendre symbol is 1. And you can't exclude 23 unless you know in advance that it is a factor of 1817. But you can't know that without factoring the number, which is the object of the exercise. In practice a quadratic sieve implementation will trial divide by all primes upto the largest prime in the factor base, it will check that the number being factored is not prime and it will check that the number being factored is not a perfect power. In all these situations the quadratic sieve will fail to find a factor otherwise. Someone ought to remove the thing about the article being factually incorrect (apart from my comments below), and perhaps the talk page shouldn't be used as a means of asking questions for a 3rd year project. :-) Anyhow, 23 is not mentioned as an element in the factor base in the article is it? wbhart 86.20.38.38 03:45, 11 February 2007 (UTC)
I think the error in question has been sufficiently addressed by wbhart, so I am going to remove the tag. Schneau 17:25, 25 June 2007 (UTC)

[edit] crossover with NFS

In various experiments using best-available-free implementations (ggnfs for the NFS, msieve for MPQS) I found the crossover was nearer 100 than 110 digits; a 102-digit number was 17hrs msieve and 8hrs ggnfs on the same hardware Fivemack 11:42, 7 February 2007 (UTC)

- This is a moving target dependent on:

* the architecture of the hardware used, memory throughput vs processor clock speed vs hard disk speed, etc
* the amount of time invested in optimizing the number field sieve and quadratic sieve
* the "goodness" of the polynomials used with the number field sieve (still an open research problem)
* the particular numbers being factored - wbhart 86.20.38.38 03:45, 11 February 2007 (UTC)

[edit] Fastest factoring method?

The statement that the quadratic sieve is the fastest factoring method for numbers of less than 100 digits and that the number field sieve is the fastest otherwise is factually incorrect. Many numbers of hundreds of decimal digits have been factored with the elliptic curve algorithm and yet the sun would die of heat death before the quadratic sieve or number field sieve would factor them. Also, below about 40 digits, the quadratic sieve is not usually the best algorithm. In practice a cocktail of one or more of SQUFOF, P-1, P+1, elliptic curve method, Pollard-Rho, trial factoring and the Brent variant are used.

The *only* correct statement is that the number field sieve is asymptotically the fastest known algorithm for factoring general integers. The quadratic sieve held this title until the NFS was invented. The elliptic curve method has the same asymptotics, but depends only on the size of the factor, not the number being factored. This makes the elliptic curve algorithm your only hope in certain situations.

For numbers of a special form, many faster algorithms exist, e.g: Zhang's sieve, hybrid QS-GNFS's, the special number field sieve, certain variants of Fermat's algorithm (even special numbers of thousands of digits can be factored with these), etc, etc. Thus the qualifier of "general integer" is required when talking about comparative times. - wbhart 86.20.38.38 04:02, 11 February 2007 (UTC)

Heat death refers to the death of the universe, and according to current models will take 101000 years, which is more than enough time to factor numbers with 100s of digits. I think you meant to say the Sun would die of gravitational collapse. Arvindn 06:15, 12 February 2007 (UTC)

It is the second fastest (randomized) algorithm for general inputs, as stated in the third sentence: "It is a general-purpose factorization algorithm, meaning that its running time depends solely on the size of the integer to be factored, and not on special structure or properties." No other algorithms can ensure a better running time on all possible inputs (except the NFS for big inputs). It is a Las Vegas algorithm, which can ensure the running time under certain mathematical conditions (which hold with a high probability). Of course for some specific Inputs there were faster algorithms. The running time of these algorithms depend on certain conditions of the number to factorize. The running time of the ECM depends on the (length) of the factors. --87.234.198.146 14:45, 20 February 2007 (UTC)

Right, but that is not what the article says in the first sentence. wbhart —The preceding unsigned comment was added by 86.20.36.114 (talk) 12:17, 6 April 2007 (UTC).

[edit] Partial Relations & Cycles = Large Primes?

As I see it the section labeled "partial relations and cycles" is identical to the methods described in section "large primes." The latter seems to do a much more concise job with the description. Am I right or wrong on this?

I am curious about this statement:

11−1 modulo 91 is 58

Could someone explain how 1/11th can be 58 (mod 91)? Admittedly I may be missing something, but this statement makes no sense to me. 192.54.144.229 (talk) 16:57, 14 September 2009 (UTC)

I know you asked a long time ago, but anyway the answer is that "1/11 is 58" (mod 91) because 58×11 is 1 (mod 91). See modular multiplicative inverse. (BTW, I sympathize with your suffering; the article as a whole seems pretty confusing to me.) --Quuxplusone (talk) 04:00, 10 February 2010 (UTC)

[edit] Numbers of primes and relations

The algorithm description talks about π(B) (the number of primes less than B) which is ok. However, the statement "use sieving to locate π(B) + 1 [relations]..." seems incorrect (i.e. misleading) to me for two reasons:

1) not every prime actually contributes to the factorization (only those which give quadratic residues)

2) because not every congruence of squares leads to a solution, it is common to gather a few more relations (not only 1)

So, something like "π(B)/2 + c" (for some constant c, e.g. 10) might be more accurate. —Preceding unsigned comment added by 79.215.96.81 (talk) 20:57, 8 January 2009 (UTC)

I believe you're correct - rather than just substituting the formula this could probably afford to be explained in more detail. Dcoetzee 23:48, 8 January 2009 (UTC)


[edit] How QS optimizes finding congruences

I'm not sure I'm right and my English is not very good, but I think there's something wrong in this section:

y(x)=\left(\left\lfloor\sqrt{n}\right\rfloor+x\right)^2-n\hbox{ (where }x\hbox{ is a small integer)}
y(x)\approx 2x\left\lfloor\sqrt{n}\right\rfloor

I think the second line should read

y(x)\approx x^2 + 2x\left\lfloor\sqrt{n}\right\rfloor

That's because y(x)=\left(\left\lfloor\sqrt{n}\right\rfloor+x\right)^2 - n = x^2 + 2x\left\lfloor\sqrt{n}\right\rfloor + \left\lfloor\sqrt{n}\right\rfloor^2 - n \approx x^2 + 2x\left\lfloor\sqrt{n}\right\rfloor

The Handbook of Applied Cryptography (Chapter 3, page 95, equation (3.1)) seems to agree. If I'm correct, then the conclusion

This implies that y is on the order of 2x[√n]. However, it also implies that y grows linearly with x times the square root of n.

is also wrong. —Preceding unsigned comment added by 85.180.69.155 (talk) 21:50, 2 July 2009 (UTC)

Asymptotically you are right. The x^2 term only becomes sigificant if x is comparable with n^{1/4}. But in practice when n is large then x will always be much smaller than n^{1/4} and hence the approximation in the article is ok. 62.203.19.143 (talk) 09:32, 4 July 2009 (UTC)
Oh. You're right. Thanks for clarifying this. —Preceding unsigned comment added by 85.180.67.55 (talk) 23:29, 4 July 2009 (UTC)

[edit] Example correction

Shouldn't the line about finding quadratic residues be 15347 (mod p) instead of √15347 (mod p)? 66.245.72.198 (talk) 21:27, 26 November 2011 (UTC)

The line in question spoke of the existence of a square root of 15347 mod p, which is equivalent to 15347 being a quadratic residue mod p. I rewrote it to try to clarify things. Dcoetzee 22:48, 26 November 2011 (UTC)
Personal tools
Namespaces

Variants
Actions
Navigation
Interaction
Toolbox
Print/export