Pages:
Author

Topic: [XPM] So, why is porting primecoin mining to the GPU so difficult? (Read 18723 times)

legendary
Activity: 2492
Merit: 1473
LEALANA Bitcoin Grim Reaper
I know this is and old topic, but I just now read it.
I am a long time Boinc-er (using the Berkley distributed computing platform) and between the projects I participate, there is one called PrimeGrid and it is all about various prime number mathematical work, one of them is finding various kind of them.
The projects uses GPU in some of their calculation, so maybe it could interesting to look at what they are using it for and get some new ideas.

As far as I am aware there are a few Primecoin GPU mining code bases to use:

Claymore and PrimeGPU
newbie
Activity: 30
Merit: 0
I know this is and old topic, but I just now read it.
I am a long time Boinc-er (using the Berkley distributed computing platform) and between the projects I participate, there is one called PrimeGrid and it is all about various prime number mathematical work, one of them is finding various kind of them.
The projects uses GPU in some of their calculation, so maybe it could interesting to look at what they are using it for and get some new ideas.
hero member
Activity: 532
Merit: 500
Thanks, I didn't notice that he compared his speedup to a dual GPU. I wonder with what metric he compared it and if he factored in the new advances in hp-10 that brought a >2x speedup with optimisations to the sieve.

Still, the sieve looks quite easy to port to the GPU and I will try to do that if my time permits it. Let's better have a definite prove that GPU sieving is ineffective right now, if that is really the case. Although, a >10 difficulty could make the sieving part more important than now, that could change the current situation.
Since the openCL miner is in the wild now, have you had a chance too look at it and see if a CUDA miner is possible?
hero member
Activity: 574
Merit: 500
I would offer 5 BTC to the first to provide me with a working GPU that the public does not know about.  This will need to be atleast 10x faster than CPU.

WOW u are cooolzz
sr. member
Activity: 363
Merit: 250
Hey primedigger, now that mtrlt's miner is out, have you taken a look at the source? Was it what you expected?
member
Activity: 93
Merit: 10
He reports a 2x performance gain against a phenom x6, that's not much considering that a 6990 has 2 cores .
If the final results go on these lines, it will not be better to mine on GPU than on CPU.

In fact the price of a Radeon HD 6990 is more than twice that of a Phenom II X6. Probably also the power consumption is 2x.
newbie
Activity: 4
Merit: 0
From reading all of the information out there I think I understand the basics of primecoin. Is there a worked example, using an existing block which shows how they fit together ?
hero member
Activity: 532
Merit: 500
Thanks, I didn't notice that he compared his speedup to a dual GPU. I wonder with what metric he compared it and if he factored in the new advances in hp-10 that brought a >2x speedup with optimisations to the sieve.

hp10 didn't improve that much; the metric is wrong. We're getting pretty much the same results with hp9 than hp10. Others have confirmed this.
doubling the sieve speed would not equate to a 2x speedup overall.  The deeper you end up seiving the less increased result you get from it.  Imagine the normal sieve eliminating 99% of candidates.  The added seive could then amount to 99.1% or 99.2% of all candidates. 
hero member
Activity: 518
Merit: 502
Thanks, I didn't notice that he compared his speedup to a dual GPU. I wonder with what metric he compared it and if he factored in the new advances in hp-10 that brought a >2x speedup with optimisations to the sieve.

hp10 didn't improve that much; the metric is wrong. We're getting pretty much the same results with hp9 than hp10. Others have confirmed this.
member
Activity: 75
Merit: 10
Thanks, I didn't notice that he compared his speedup to a dual GPU. I wonder with what metric he compared it and if he factored in the new advances in hp-10 that brought a >2x speedup with optimisations to the sieve.

Still, the sieve looks quite easy to port to the GPU and I will try to do that if my time permits it. Let's better have a definite prove that GPU sieving is ineffective right now, if that is really the case. Although, a >10 difficulty could make the sieving part more important than now, that could change the current situation.
hero member
Activity: 637
Merit: 500
Mtrlt has a new status on his miner. 2x speedup for testnet, ? for mainnet. I hope that we will see the source soon, so that I can port it over to CUDA. If we can get this to 5x-10x, it will still be a game changer! But based on my analysis, we won't see the extreme speedups which we have seen for hash based coins (atleast not without some clever tricks I'm unaware of). Sunny may have unintentionally created the best CPU-coin there currently is! Wink
He reports a 2x performance gain against a phenom x6, that's not much considering that a 6990 has 2 cores .
If the final results go on these lines, it will not be better to mine on GPU than on CPU.
member
Activity: 75
Merit: 10
Mtrlt has a new status on his miner. 2x speedup for testnet, ? for mainnet. I hope that we will see the source soon, so that I can port it over to CUDA. If we can get this to 5x-10x, it will still be a game changer! But based on my analysis, we won't see the extreme speedups which we have seen for hash based coins (atleast not without some clever tricks I'm unaware of). Sunny may have unintentionally created the best CPU-coin there currently is! Wink
member
Activity: 114
Merit: 10
For primecoin mining it only needs modular multiplication of a thousand bits.

Unfortunately the only modular multiplication implementation on CUDA I saw involved 32-bit integers Sad 
member
Activity: 75
Merit: 10
Lucas–Lehmer–Riesel test is deterministic and requires that numbers are in a special form. This is both a good fit for Mersenne primes, which have that special form and deterministic means that the test is proof for primality. The class of probabilistic primality tests is usually faster, but doesn't prove primality. Fermat's test is a good fit for primecoin, because it is general purpose. Also Fermat's test should be the fastest test among all probabilistic primality tests with the price of a high false positive rate (it's still very small in practise). This is ignored in Primecoins, as all numbers passing a Fermat test in base 2 are threaded as primes.
legendary
Activity: 1205
Merit: 1010
Smiley Though I think the Lucas-Lehmer used for GIMPS is very different because Mersenne is dealing with multiplications of tens of millions of bits. I think the kernel is probably spending most of its time doing a special FFT (floating point) algorithm for big number multiplications.

For primecoin mining it only needs modular multiplication of a thousand bits.
hero member
Activity: 532
Merit: 500
The Lucas-Lehmer test currently available on GPU is very simple compared to the fermat testing needed for Primecoin.  A mersenne number is stated as 2p-1.  In order for a mersenne number to be prime, P must also be prime.  As stated on http://Mersenne.org/various/math.php, The Lucas-Lehmer primality test is remarkably simple. It states that for P > 2, 2P-1 is prime if and only if Sp-2 is zero in this sequence: S0 = 4, SN = (SN-12 - 2) mod (2P-1).  Trust me when I say the explanation is a lot harder than the actual math involved.  To prove 27-1 is prime take 6 steps, which is 1 less that the exponent of 7.  The current record Mersenne Prime, 257,885,161-1 took 57,885,160 calculations to prove as prime, requiring 15 days on my overclocked i5-2500k (4.3GHz) compared to ~4 days on an nVidia GTX 480 or ~3.5 days on a GTX 580.

Lucas-Lehmer test itself is computationally similar to Fermat test in the sense that they both rely on modular multiplication/square. But in the case of Mersenne the modulo is of a special form 2p-1, so the modulo operation can be done through only shift and addition.
I will admit that I don't undertand the fermat test well... my brain locks up at The triple bar, ≡, and the set membership symbol (∈).  :/
legendary
Activity: 1205
Merit: 1010
The Lucas-Lehmer test currently available on GPU is very simple compared to the fermat testing needed for Primecoin.  A mersenne number is stated as 2p-1.  In order for a mersenne number to be prime, P must also be prime.  As stated on http://Mersenne.org/various/math.php, The Lucas-Lehmer primality test is remarkably simple. It states that for P > 2, 2P-1 is prime if and only if Sp-2 is zero in this sequence: S0 = 4, SN = (SN-12 - 2) mod (2P-1).  Trust me when I say the explanation is a lot harder than the actual math involved.  To prove 27-1 is prime take 6 steps, which is 1 less that the exponent of 7.  The current record Mersenne Prime, 257,885,161-1 took 57,885,160 calculations to prove as prime, requiring 15 days on my overclocked i5-2500k (4.3GHz) compared to ~4 days on an nVidia GTX 480 or ~3.5 days on a GTX 580.

Lucas-Lehmer test itself is computationally similar to Fermat test in the sense that they both rely on modular multiplication/square. But in the case of Mersenne the modulo is of a special form 2p-1, so the modulo operation can be done through only shift and addition.
sr. member
Activity: 434
Merit: 250

The algorithm is kind of complicated to explain in lay terms....


Thank you. You explained very well.
hero member
Activity: 532
Merit: 500
Base 2 Fermat test is strong enough. It's known that pseudoprimes are extremely rare compared to primes. The term probable prime is misleading to layman because they are actually almost certain primes.

In fact I would be very surprised if you can find a pseudoprime in the block chain.
Pseudoprimes are not that rare! Here is an example:

341, 561, 645, 1105, 1387, 1729, 1905, 2047, 2465, 2701, 2821, 3277, 4033, 4369, 4371, 4681, 5461, 6601, 7957, 8321, 8481, 8911
22 Pseudoprimes by 8911 as compared to 1384 prime numbers.  Definitely very uncommon.  Plus, similar to Mersenne Primes, while there are a lot of psuedoprimes at the lower range, they become a lot more scarce:  22 mersenne primes had an exponent less than 10,000.  #48, found Jan this year, and has an exponent of 57,885,161
legendary
Activity: 1205
Merit: 1010
Pseudoprimes are not that rare! Here is an example:

341, 561, 645, 1105, 1387, 1729, 1905, 2047, 2465, 2701, 2821, 3277, 4033, 4369, 4371, 4681, 5461, 6601, 7957, 8321, 8481, 8911

Asymptotically they are extremely rare compared to primes. Please refer to my design paper.

Primecoin is dealing with numbers over 256 bits long, so you cannot compare the probability of the really small numbers.
Pages:
Jump to: