Pages:
Author

Topic: New PoW method using factorization of large numbers. (Read 907 times)

newbie
Activity: 29
Merit: 0
It's really exciting to me that this method could very well be the holy grail PoW method that we have been looking for.  Not only is it inefficient on GPU's, the best you could hope for is GPU's approaching CPU's in speed, but the optimal setup would be 1 GPU and 1 CPU working together.  Not only does this make GPU farms and standard ASIC farms unproductive, but it also makes the Server farms and Botnet's that have plagued CPU only coins unproductive as well.  The optimum setup would be similar to a desktop computer, one CPU and one GPU. Wink
member
Activity: 322
Merit: 54
Consensus is Constitution
Thanks for pointing out those quantum algorithms for further research, but can we really say that shor's will be faster than grover's?  But yes you seem to be right that shor's algorithm could be an "asic" of sorts for GNFS.

But quantum computing is still a hypothetical and we have ASIC's for hashing right now which is a real, pressing, and proven threat.  So I would rather make my bank vault impervious to explosives and torches and not worry so much about people teleporting in at this point.  I think this new PoW at a absolute minimum would give us 50 years of real fairness.  Bitcoin had what, a couple years?  Litecoin a few years?  We are talking drastic improvement here.

Lots of universities and researchers use regular personal computers for doing GNFS and they have been for decades.  There is nothing better.  They can use GPU's for some portions of the work but the heavy lifting is basic commodity CPU's.  Sure you can design a motherboard to be optimized but you will only get incremental gains, nothing like an asic.  That paper is the only sub billion dollar proposed ASIC for large number factoring.  That just shows how impractical it is and I would guess based on reading between the lines of the paper it would give less than an order of magnitude speedup vs the same price and power usage of commodity hardware, and they admit that is their competition.

Off topic but using liquid nitrogen for cooling could give pretty substantial efficiency gains to any hardware for any algorithm...
member
Activity: 322
Merit: 54
Consensus is Constitution
Interesting paper but it appears their chosen problem to solve is an orthogonal vector problem.  While that is a small vector field problem so it would take some memory, it is probably still very vulnerable to acceleration (GPU/ASIC).  Their main goal is to provide a useful outcome, whereas my goal is to provide parallelization resistance with useful work a secondary goal.
jr. member
Activity: 168
Merit: 3
#Please, read:Daniel Ellsberg,-The Doomsday *wk
recently I have seen more and more interest from the scientific community on this topic ... I had spend some of my spare time looking for papers about it ... I really find this one interesting ..

Quote
Proofs of Work (PoWs) were introduced [DN92] to enforce that a certain amount of energy was
expended for doing some task in an easily verifiable way. In most applications, PoWs force malicious
users to accrue a large workload, thus guarding against email spam, denial of service attacks, and,
most recently, double-spending in cryptocurrencies such as Bitcoin [Nak08]. Unfortunately, existing
PoW schemes are often disconnected from the task they are attached to, so that the work expended
is not actually useful in accomplishing that task. More importantly, the work and energy expended
is generally not useful for anything except proving that work had in fact been done.
To this end, PoWs are wasteful of real resources and energy and, in the massive use case
of Bitcoin, have even been called an ”environmental disaster” [And13]. Two early attempts to
combat this are Primecoin [Kin13] and Permacoin [MJS+14]. The former suggests a Proof of
Work system whose outcome enables the search for chains of prime numbers, whereas the latter
repurposes Bitcoin mining resources to achieve distributed storage of archival data, based on Proofs
of Retrievability (thus requiring clients to invest not just computational resources, but also storage).
Another line of work, studies Proofs of Space [ABFG14, DFKP15, KK14], where a user must
dedicate a significant amount of disk space, instead of computing power, to perform their task.
This accomplishes a similar constraint on malicious activity to PoWs since a group of users cannot
over-perform a task without having access to massive amounts of memory. However, as a typical
user has free disk space, such schemes will not similarly waste a resource like energy and pollute
the environment.
In this work we present an alternative to these: Proofs of Work whose work is actually useful
to solving practical problems.

wpaper source: https://eprint.iacr.org/2017/203.pdf

And if I am not mistake they talk about it on cource about cryptocurrency created by Princeton University at coursera (week 8, Alternative Mining Puzzles)   
member
Activity: 322
Merit: 54
Consensus is Constitution
Like I have been saying recently, this algorithm would require 1 CPU and 1 GPU for optimum mining.  This would be perfect for commodity devices like phones, laptops, and AI that have "APU" (combined cpu gpu) processors and would help protect against server farms and IoT botnet's.

http://www.mersenneforum.org/showthread.php?t=19312
member
Activity: 322
Merit: 54
Consensus is Constitution
It's a challenging proposal.
Prime numbers and numbers having small factors can be ruled out easily. Primality test is much faster than factoring and there are algorithms which are very fast in finding small factors (see elliptic curve factoring).

Factorization of large numbers usually means 1024 bit (obsolte) or 2048 bit numbers (4096 bit would be recommended to be on the safe side).
Those numbers are incredibly big (about 600+ digits long).



This could also lead to finding new factoring algorithms.

Quite a few organisations have an incentive to find factoring algorithms with a better runtime.
Until now there hasn't been made any considerable progress in solving this problem.


Yup right on.  I think for our purposes we will be working primarily on the 100-200 digit range.  Still gnfs sieve is the best known way to either look for factors or to determine if these numbers are prime.  Finding one factor of a given length is much easier to do (or prove you can't do) than to prove a mumber is prime.
member
Activity: 322
Merit: 54
Consensus is Constitution
Here is a whitepaper about how to define a new proof of work system that gives preference to CPU's forever theoretically and is immune to vulnerabilities that encryption algorithms are prone to.
(link is to wayback machine so you can trust visiting it)

https://web.archive.org/web/20171212224738/http://www.naturehackerproducts.com/2017/12/new-proof-of-work-pow-method-based-on.html?m=1

The take home message is to find a single factor (of a length defined by the difficulty) of a very large number (over 100 digits).

Since it is such a large number GPU's become very slow at attempting it.

Factoring big numbers can be highly accelerated with quantum computers and Shor's algorithm.

https://en.wikipedia.org/wiki/Shor%27s_algorithm

In my opinion new POW algorithms should not only be ASIC resistant, they should also be quantum computing resistant.

Bitcoin would already suffer if some day quantum computers with enough qubits become reality as the private/public keys could be cracked with Shor's algorithm.
The SHA-256 hashing of the public key would prevent this attack, but there are already lot of public keys revealed which still contain balances (very old transactions, reused bitcoin addresses). And with those known public keys the private keys could be calculated.

Also RSA encryption (used for example in HTTPS connections) could be cracked quite easly with Shor's algorithm which might be the best what could happen to our nice three letter agencies to globally spy on every online connection.

It might still take long time until quantum computers with enough qubits are available, but we should try to avoid not quantum computing proof algorithms when creating new stuff.

Well since every other algorithm suffers massively RIGHT NOW from GPU and/or ASIC speed up, if all we have to worry anout is quantum computing then I think we are way way above par.  For right now we don't even know if quantum computing the way it is envisioned will even work, let alone we have no idea what an actual application will be capable of.  So I say we cross that bridge when we get there.  For now we need gpu and asic and botnet and server farm resistance.
member
Activity: 322
Merit: 54
Consensus is Constitution
Originally I was worried about primes but one of the other posters alerted me that everyone can be working on different numbers.  I didn't realize this is how bitcoin works currently.  Everyone's number is determined by the hash of their public key along with other block details.  So if one person gets dealt a prime (or two or three or 100 people) then the others will be the ones to win that block.
If one miner gets stuck with a prime, then they're not "out" until the next block is solved - they can and will periodically generate a new number by changing some of the block details, and then hope this new number they're factorizing isn't prime.

On a side note it is possible that miners will be biased toward even numbers, since it is more likely that an even number will be composite than an odd number (marginally, but at this scale it could make a difference). As miners can change certain parts of the block details to whatever they want, they can keep trying until they get an even number, and then work on that one.

Good catch.  That is a glaring vulnerability since we want to ultimately test whether numbers are prime or not.  My intention is that people can datamine the blockchain to know what numbers to rule out in the search for new prime numbers.  If everyone is mining even numbers than this wouldn't work.  honestly I think we would have to eliminate numbers that have any factors up to say 1000.  This way people can't just divide their number by everything from 1-100 to make sure it isn't prime.  The point is to rule out prime candidates by proving they are composite.

I don't know, any other ideas on this vulnerability?

Let's say that, to solve a block, you need to find a factor that is at least 1000 digits of a number that is 2000 digits. We already know we can skip odd numbers, so keep trying to hash your block until you get an even number. This should happen nearly instantly. Now, to find a 1000+ digit factor of a 2000 digit even number...... you just divide by 2. Even on a number this size, that kind of operation would take only milliseconds on a CPU.

More digits in the factor meaning higher difficulty is just not accurate. Finding a 50 digit factor would be harder than finding either a 1 digit or 1999+ digit factor of a 2000 digit number. But ruling out prime candidates is not the hard part. You can already prove that a number is composite if it is both greater than 2 and even. We already know it's greater than 2, so you just have to look at the last one bit - if 0 then composite, if 1 then unknown, try again.

Disallowing numbers with any factors (other than 1) up to 1000 is an interesting idea. They certainly exist (here's one, ~100 digits: 1332040162761665637631613428888885580516735653612226801746764441547065802389532 208097762306977281), but I don't know if there's a way to look at a number and calculate if it meets that criteria before you go bruteforcing factors. If there is (I imagine there is) then we're back to looking for a large enough factor anyways.

What you could do instead is have a window of factor length that has to be met. Rather than >= 1000 digits, say 100 to 110 digits, for a factor of a 2000 digit number.

edit: The very big number above is getting a space in it for some reason - if you try to calculate the factors, make sure that you remove that space

Thanka for your great posts.  But yes my intention was that it would require a number with the exact number of digits to win the block, not just 70+ digit factor but an exactly 70 digit factor for example.  If we did the "no numbers with any factors up to 1000" thing then it would just require a teeny bit of brute forcing to verify the proposed target number is valid.  So you would broadcast to the network that here was my target number, notice that it isn't divisible by anything under 1000, and here is my proposed 70 digit factor, notice it works and is exactly 70 digits.
legendary
Activity: 1624
Merit: 2481
It's a challenging proposal.
Prime numbers and numbers having small factors can be ruled out easily. Primality test is much faster than factoring and there are algorithms which are very fast in finding small factors (see elliptic curve factoring).

Factorization of large numbers usually means 1024 bit (obsolte) or 2048 bit numbers (4096 bit would be recommended to be on the safe side).
Those numbers are incredibly big (about 600+ digits long).



This could also lead to finding new factoring algorithms.

Quite a few organisations have an incentive to find factoring algorithms with a better runtime.
Until now there hasn't been made any considerable progress in solving this problem.
member
Activity: 86
Merit: 26
Here is a whitepaper about how to define a new proof of work system that gives preference to CPU's forever theoretically and is immune to vulnerabilities that encryption algorithms are prone to.
(link is to wayback machine so you can trust visiting it)

https://web.archive.org/web/20171212224738/http://www.naturehackerproducts.com/2017/12/new-proof-of-work-pow-method-based-on.html?m=1

The take home message is to find a single factor (of a length defined by the difficulty) of a very large number (over 100 digits).

Since it is such a large number GPU's become very slow at attempting it.

Factoring big numbers can be highly accelerated with quantum computers and Shor's algorithm.

https://en.wikipedia.org/wiki/Shor%27s_algorithm

In my opinion new POW algorithms should not only be ASIC resistant, they should also be quantum computing resistant.

Bitcoin would already suffer if some day quantum computers with enough qubits become reality as the private/public keys could be cracked with Shor's algorithm.
The SHA-256 hashing of the public key would prevent this attack, but there are already lot of public keys revealed which still contain balances (very old transactions, reused bitcoin addresses). And with those known public keys the private keys could be calculated.

Also RSA encryption (used for example in HTTPS connections) could be cracked quite easly with Shor's algorithm which might be the best what could happen to our nice three letter agencies to globally spy on every online connection.

It might still take long time until quantum computers with enough qubits are available, but we should try to avoid not quantum computing proof algorithms when creating new stuff.
newbie
Activity: 17
Merit: 0
It's a challenging proposal.
Prime numbers and numbers having small factors can be ruled out easily. Primality test is much faster than factoring and there are algorithms which are very fast in finding small factors (see elliptic curve factoring). so, you could just pick numbers with specified factor sizes.
This could also lead to finding new factoring algorithms.
legendary
Activity: 1386
Merit: 1053
Please do not PM me loan requests!
Originally I was worried about primes but one of the other posters alerted me that everyone can be working on different numbers.  I didn't realize this is how bitcoin works currently.  Everyone's number is determined by the hash of their public key along with other block details.  So if one person gets dealt a prime (or two or three or 100 people) then the others will be the ones to win that block.
If one miner gets stuck with a prime, then they're not "out" until the next block is solved - they can and will periodically generate a new number by changing some of the block details, and then hope this new number they're factorizing isn't prime.

On a side note it is possible that miners will be biased toward even numbers, since it is more likely that an even number will be composite than an odd number (marginally, but at this scale it could make a difference). As miners can change certain parts of the block details to whatever they want, they can keep trying until they get an even number, and then work on that one.

Good catch.  That is a glaring vulnerability since we want to ultimately test whether numbers are prime or not.  My intention is that people can datamine the blockchain to know what numbers to rule out in the search for new prime numbers.  If everyone is mining even numbers than this wouldn't work.  honestly I think we would have to eliminate numbers that have any factors up to say 1000.  This way people can't just divide their number by everything from 1-100 to make sure it isn't prime.  The point is to rule out prime candidates by proving they are composite.

I don't know, any other ideas on this vulnerability?

Let's say that, to solve a block, you need to find a factor that is at least 1000 digits of a number that is 2000 digits. We already know we can skip odd numbers, so keep trying to hash your block until you get an even number. This should happen nearly instantly. Now, to find a 1000+ digit factor of a 2000 digit even number...... you just divide by 2. Even on a number this size, that kind of operation would take only milliseconds on a CPU.

More digits in the factor meaning higher difficulty is just not accurate. Finding a 50 digit factor would be harder than finding either a 1 digit or 1999+ digit factor of a 2000 digit number. But ruling out prime candidates is not the hard part. You can already prove that a number is composite if it is both greater than 2 and even. We already know it's greater than 2, so you just have to look at the last one bit - if 0 then composite, if 1 then unknown, try again.

Disallowing numbers with any factors (other than 1) up to 1000 is an interesting idea. They certainly exist (here's one, ~100 digits: 1332040162761665637631613428888885580516735653612226801746764441547065802389532 208097762306977281), but I don't know if there's a way to look at a number and calculate if it meets that criteria before you go bruteforcing factors. If there is (I imagine there is) then we're back to looking for a large enough factor anyways.

What you could do instead is have a window of factor length that has to be met. Rather than >= 1000 digits, say 100 to 110 digits, for a factor of a 2000 digit number.

edit: The very big number above is getting a space in it for some reason - if you try to calculate the factors, make sure that you remove that space
member
Activity: 322
Merit: 54
Consensus is Constitution
Originally I was worried about primes but one of the other posters alerted me that everyone can be working on different numbers.  I didn't realize this is how bitcoin works currently.  Everyone's number is determined by the hash of their public key along with other block details.  So if one person gets dealt a prime (or two or three or 100 people) then the others will be the ones to win that block.
If one miner gets stuck with a prime, then they're not "out" until the next block is solved - they can and will periodically generate a new number by changing some of the block details, and then hope this new number they're factorizing isn't prime.

On a side note it is possible that miners will be biased toward even numbers, since it is more likely that an even number will be composite than an odd number (marginally, but at this scale it could make a difference). As miners can change certain parts of the block details to whatever they want, they can keep trying until they get an even number, and then work on that one.

Good catch.  That is a glaring vulnerability since we want to ultimately test whether numbers are prime or not.  My intention is that people can datamine the blockchain to know what numbers to rule out in the search for new prime numbers.  If everyone is mining even numbers than this wouldn't work.  honestly I think we would have to eliminate numbers that have any factors up to say 1000.  This way people can't just divide their number by everything from 1-100 to make sure it isn't prime.  The point is to rule out prime candidates by proving they are composite.

I don't know, any other ideas on this vulnerability?
legendary
Activity: 1386
Merit: 1053
Please do not PM me loan requests!
Originally I was worried about primes but one of the other posters alerted me that everyone can be working on different numbers.  I didn't realize this is how bitcoin works currently.  Everyone's number is determined by the hash of their public key along with other block details.  So if one person gets dealt a prime (or two or three or 100 people) then the others will be the ones to win that block.
If one miner gets stuck with a prime, then they're not "out" until the next block is solved - they can and will periodically generate a new number by changing some of the block details, and then hope this new number they're factorizing isn't prime.

On a side note it is possible that miners will be biased toward even numbers, since it is more likely that an even number will be composite than an odd number (marginally, but at this scale it could make a difference). As miners can change certain parts of the block details to whatever they want, they can keep trying until they get an even number, and then work on that one.
member
Activity: 322
Merit: 54
Consensus is Constitution
Ok, so you just use that particular algorithm to generate random numbers. Have you studied how many % of large random numbers meet your expectations? And what will happen when you hit actual prime or composite that has only relatively small factors? There will be lot of them.

Great questions keep them coming!  Originally I was worried about primes but one of the other posters alerted me that everyone can be working on different numbers.  I didn't realize this is how bitcoin works currently.  Everyone's number is determined by the hash of their public key along with other block details.  So if one person gets dealt a prime (or two or three or 100 people) then the others will be the ones to win that block.  This is kind of good because the winning number we know are not prime, so if certain numbers aren't showing up in our list of winning numbers, we can say they may be prime.

So at this size, there will be primes I'm guessing 1 out of 5,000 numbers.  So not bad at all, pretty much inconsequential for the network as a whole.  Lets say we are looking for  20 digit factor of a 100 digit number.  This is like saying we are looking for a 2 digit factor (12 for example) of a 10 digit number (2,000,000,000 for example) so they should be very very common.  Really since everyone generates their own numbers it doesn't even matter how common a working number will be.  We just need one person to find a match.
member
Activity: 98
Merit: 26
What you have to understand is the network as a whole is not "parallelizing" ... they can all have different start points in their nonce attempts.  They can have 1 core start at 0, another core start at 1,000,000 another core start at 2,000,000 etc.  

I realize the difference. But it actually doesn't matter because the search space is so large (2256) that the probability that any random slice will overlap another random slice is effectively zero. Mining pools would not be any more efficient if they coordinated the search space than they are by simply randomizing it. This is an artifact of the size of the search space.

Quote
This way they can split up the workload into chunks.  The network is trying random attempts and not collaborating on which computers have checked which numbers already.  So for tasks that cannot be split up into chunks and one chunk given to each core, CPU's will benefit.  Another thing that makes CPU's benefit is their memory utilization, which in the case of working with number fields, these number fields tend to lead to randomly accessing the memory often, again benefiting CPU's.

Both the CPU and GPUs have access to RAM (random-access memory), so access time to any memory location is equivalent, maybe some slight advantage for the CPU if it is onboard and the GPU is on an expansion card.

I encourage you to keep working on your idea - I'm not pouring cold water on it. But I am a computer engineer and computer architecture is where my career experience is, so I'm just giving you some additional facts to work with. Best of luck.
member
Activity: 73
Merit: 10
Ok, so you just use that particular algorithm to generate random numbers. Have you studied how many % of large random numbers meet your expectations? And what will happen when you hit actual prime or composite that has only relatively small factors? There will be lot of them.
member
Activity: 322
Merit: 54
Consensus is Constitution
Thanks haltingprobability.  I think factoring large numbers may be a bit more complex than you realize.  Here is an excerpt from wikipedia for general number field sieve, the most efficient way to factor large numbers over 100 digits.  I just want to get this posted so we can dissect it and discuss:

Note: The results of complexity theory apply regardless of "best known algorithm" - a problem can be proved to much easier than the best known algorithm for solving that problem.

The computational complexity of GNFS is given in the first paragraph on Wiki, as of this writing. It is an exponential time algorithm. As long as SHA-256 is actually one-way, solving the hashcash PoW function is also exponential in the number of zero-bits. Let me try to explain what I'm saying. Even though solving Bitcoin's PoW (hashcash) is exponential time complexity, the pool of all Bitcoin mining (considered as an aggregate) is "parallelizing" this problem. That's why the hashrate is so high. For a given difficulty factor, the average number of attempts to find a solution is fixed. We can think of the entire mining pool as dividing up this solution time amongst themselves - parallelizing it. For the sake of discussion, let's say it takes 1 trillion hash attempts, on average, to mine a block. If there are 1,000 miners and each of them is operating 1,000 mining rigs and each mining rig can solve 1,000 hashes per second, the total hash rate of the network is 1 billion hash attempts per second. At this rate, it will take about 1,000 seconds (almost 17 minutes) to mine a block. In fact, these numbers will hold no matter whether we change hashcash to some other exponential problem (including GNFS), so long as it takes about 1 trillion attempts (on average) to find a solution. There's nothing in the nature of the problem (factoring versus finding a preimage whose hash solves a pre-specified equation) that matters because we only care about the difficulty of the problem (proof of work). In the case of hashcash, we don't care what the preimage looks like (we are indifferent to it), so long as its hash satisfies the required property. Any random number will do just as well as any other random number. In the case of GNFS, we don't care which factors you find, just as long as you find the required number of factors. This indifference means that parallelization is trivial because we can just utilize random search. Proof-of-work, after all, is not really about solving a problem, it's just a race.

What you have to understand is the network as a whole is not "parallelizing" like a GPU would parallelize.  The reason GPU's can beat CPU's in normal hashing is they can all have different start points in their nonce attempts.  They can have 1 core start at 0, another core start at 1,000,000 another core start at 2,000,000 etc.  This way they can split up the workload into chunks.  The network is trying random attempts and not collaborating on which computers have checked which numbers already.  So for tasks that cannot be split up into chunks and one chunk given to each core, CPU's will benefit.  Another thing that makes CPU's benefit is their memory utilization, which in the case of working with number fields, these number fields tend to lead to randomly accessing the memory often, again benefiting CPU's.
member
Activity: 98
Merit: 26
Thanks haltingprobability.  I think factoring large numbers may be a bit more complex than you realize.  Here is an excerpt from wikipedia for general number field sieve, the most efficient way to factor large numbers over 100 digits.  I just want to get this posted so we can dissect it and discuss:

Note: The results of complexity theory apply regardless of "best known algorithm" - a problem can be proved to much easier than the best known algorithm for solving that problem.

The computational complexity of GNFS is given in the first paragraph on Wiki, as of this writing. It is an exponential time algorithm. As long as SHA-256 is actually one-way, solving the hashcash PoW function is also exponential in the number of zero-bits. Let me try to explain what I'm saying. Even though solving Bitcoin's PoW (hashcash) is exponential time complexity, the pool of all Bitcoin mining (considered as an aggregate) is "parallelizing" this problem. That's why the hashrate is so high. For a given difficulty factor, the average number of attempts to find a solution is fixed. We can think of the entire mining pool as dividing up this solution time amongst themselves - parallelizing it. For the sake of discussion, let's say it takes 1 trillion hash attempts, on average, to mine a block. If there are 1,000 miners and each of them is operating 1,000 mining rigs and each mining rig can solve 1,000 hashes per second, the total hash rate of the network is 1 billion hash attempts per second. At this rate, it will take about 1,000 seconds (almost 17 minutes) to mine a block. In fact, these numbers will hold no matter whether we change hashcash to some other exponential problem (including GNFS), so long as it takes about 1 trillion attempts (on average) to find a solution. There's nothing in the nature of the problem (factoring versus finding a preimage whose hash solves a pre-specified equation) that matters because we only care about the difficulty of the problem (proof of work). In the case of hashcash, we don't care what the preimage looks like (we are indifferent to it), so long as its hash satisfies the required property. Any random number will do just as well as any other random number. In the case of GNFS, we don't care which factors you find, just as long as you find the required number of factors. This indifference means that parallelization is trivial because we can just utilize random search. Proof-of-work, after all, is not really about solving a problem, it's just a race.
member
Activity: 322
Merit: 54
Consensus is Constitution
Thanks haltingprobability.  I think factoring large numbers may be a bit more complex than you realize.  Here is an excerpt from wikipedia for general number field sieve, the most efficient way to factor large numbers over 100 digits.  I just want to get this posted so we can dissect it and discuss:

First off:
The general number field sieve, on the other hand, manages to search for smooth numbers that are subexponential in the size of n. Since these numbers are smaller, they are more likely to be smooth than the numbers inspected in previous algorithms. This is the key to the efficiency of the number field sieve. In order to achieve this speed-up, the number field sieve has to perform computations and factorizations in number fields. This results in many rather complicated aspects of the algorithm, as compared to the simpler rational sieve.

Method:
Two polynomials f(x) and g(x) of small degrees d and e are chosen, which have integer coefficients, which are irreducible over the rationals, and which, when interpreted mod n, have a common integer root m. An optimal strategy for choosing these polynomials is not known; one simple method is to pick a degree d for a polynomial, consider the expansion of n in base m (allowing digits between −m and m) for a number of different m of order n1/d, and pick f(x) as the polynomial with the smallest coefficients and g(x) as x − m.

Consider the number field rings Z[r1] and Z[r2], where r1 and r2 are roots of the polynomials f and g. Since f is of degree d with integer coefficients, if a and b are integers, then so will be bd·f(a/b), which we call r. Similarly, s = be·g(a/b) is an integer. The goal is to find integer values of a and b that simultaneously make r and s smooth relative to the chosen basis of primes. If a and b are small, then r and s will be small too, about the size of m, and we have a better chance for them to be smooth at the same time. The current best-known approach for this search is lattice sieving; to get acceptable yields, it is necessary to use a large factor base.

Having enough such pairs, using Gaussian elimination, one can get products of certain r and of the corresponding s to be squares at the same time. A slightly stronger condition is needed—that they are norms of squares in our number fields, but that condition can be achieved by this method too. Each r is a norm of a − r1b and hence that the product of the corresponding factors a − r1b is a square in Z[r1], with a "square root" which can be determined (as a product of known factors in Z[r1])—it will typically be represented as an irrational algebraic number. Similarly, the product of the factors a − r2b is a square in Z[r2], with a "square root" which also can be computed. It should be remarked that the use of Gaussian elimination does not give the optimal run time of the algorithm. Instead, sparse matrix solving algorithms such as Block Lanczos or Block Wiedemann are used.

Since m is a root of both f and g mod n, there are homomorphisms from the rings Z[r1] and Z[r2] to the ring Z/nZ (the integers mod n), which map r1 and r2 to m, and these homomorphisms will map each "square root" (typically not represented as a rational number) into its integer representative. Now the product of the factors a − mb mod n can be obtained as a square in two ways—one for each homomorphism. Thus, one can find two numbers x and y, with x2 − y2 divisible by n and again with probability at least one half we get a factor of n by finding the greatest common divisor of n and x − y.
Pages:
Jump to: