
Topic: [XPM] [ANN] Primecoin High Performance | HP14 released! - page 49. (Read 397616 times)

member
Activity: 98
Merit: 10
In case you are interested in the validity of the chains-per-day metric: with the following setup running on 9 identical machines, I managed 8 chains over 24 hours.

8/9 = 0.8888 chains per day, per machine.

I would say that's close enough for government work in this case.

Code:
    "blocks" : 105938,
    "chainspermin" : 3,
    "chainsperday" : 0.92945471,
    "currentblocksize" : 1000,
    "currentblocktx" : 0,
    "difficulty" : 9.59519291,
    "errors" : "",
    "generate" : true,
    "genproclimit" : -1,
    "roundsievepercentage" : 70,
    "primespersec" : 1707,
    "pooledtx" : 0,
    "sievepercentage" : 10,
    "sievesize" : 1100000,
    "testnet" : false
sr. member
Activity: 301
Merit: 250
The nL1CacheElements constant was not meant to be changed, and there's a good reason for that. nL1CacheElements needs to be a multiple of 64 on a 64-bit platform (or a multiple of 32 on a 32-bit platform). Otherwise the sieve code will start misbehaving, which seems to be what razibuzouzou is seeing. The sieve will start producing false candidates, which will distort the metrics. Tests/h counts the candidates produced by the sieve, so it is directly impacted. The chains/day estimate relies on tests/h being accurate, so that will also be heavily impacted.

With the latest version, there could also be a small negative impact on block rate if the sieve is misbehaving. The false candidates will always be marked as bi-twin candidates, so if a candidate is in reality of another type, it will not get recognized properly.

Summary: Don't touch nL1CacheElements unless you know exactly what you're doing.

Mikael, thank you for the clarification. Indeed, after checking the code I found the reason why it has to be a multiple of 64/32. It's a shame I didn't check this before.
Anyway, I've run some more tests. With the default settings of the primecoin client there was no significant gain from different nL1CacheElements values, but as soon as the sieve size and sieve percentage were raised, the nL1CacheElements size had more effect. For example, with sievesize=2000000 and sievepercentage=50, a higher nL1CacheElements (1056000) speeds up the Wave by ~30%. (My CPU is an i7.)
In any case, because 50% is probably not the optimal sievepercentage, this measurement is less relevant.

BTW: I'm compiling the 64-bit version of jhPrimeminer under Windows with Visual Studio, where the size of 'unsigned long' is 32 bits, not 64 bits like on Linux (GNU). I've tried replacing "unsigned long" with uint64_t (unsigned long long), but it makes the Wave function run slower (~15-25%). I'm thinking it might be worth a try to replace "unsigned long" with uint32 in the Wave() function of the primecoin (HP9) client, where nL1CacheElements is also used.
Does what I'm talking about make sense?

Well, it's good to know that long is always 32 bits in Visual Studio. I haven't really worked with that compiler. One unfortunate side effect of that is being unable to pass 64-bit values directly to GMP/MPIR, because the interface uses unsigned longs.

I think 64-bit integers are better for this part of the code because it performs logical operations on bits.
Thus, you process twice as many bits with 64-bit integers as with 32-bit ones.
I think you may be seeing a performance loss because Visual Studio does not vectorize 64-bit integer operations very well.

Yup, I also suspect that the Visual Studio compiler isn't optimizing properly there. But I haven't looked at the generated code.

Back to the nL1CacheElements constant. I am seeing a small (>3%) but clear performance improvement with 256000 (chainsperday increases and primesperday is preserved).
Tested and confirmed on two different architectures (AMD & Intel).
I know it's micro-optimization garbage (sorry!), but Mikaelh, does this value sound good to you?

Well, I did check that most Intel processors have a 32 kB L1 data cache while AMD processors typically have a 64 kB data cache. 256000 is pretty much pushing the L1 cache to its limits on Intel CPUs. If it also shows good performance in my testing, I will probably incorporate that change. The annoying issue with 256000 is that it doesn't divide 1M (sieve size) evenly.

I have also considered discovering the L1 cache size automatically and adjusting the parameters based on that. The problem with that is that I don't have a lot of CPUs to test on.
sr. member
Activity: 291
Merit: 250
The nL1CacheElements constant was not meant to be changed, and there's a good reason for that. nL1CacheElements needs to be a multiple of 64 on a 64-bit platform (or a multiple of 32 on a 32-bit platform). Otherwise the sieve code will start misbehaving, which seems to be what razibuzouzou is seeing. The sieve will start producing false candidates, which will distort the metrics. Tests/h counts the candidates produced by the sieve, so it is directly impacted. The chains/day estimate relies on tests/h being accurate, so that will also be heavily impacted.

With the latest version, there could also be a small negative impact on block rate if the sieve is misbehaving. The false candidates will always be marked as bi-twin candidates, so if a candidate is in reality of another type, it will not get recognized properly.

Summary: Don't touch nL1CacheElements unless you know exactly what you're doing.

Mikael, thank you for the clarification. Indeed, after checking the code I found the reason why it has to be a multiple of 64/32. It's a shame I didn't check this before.
Anyway, I've run some more tests. With the default settings of the primecoin client there was no significant gain from different nL1CacheElements values, but as soon as the sieve size and sieve percentage were raised, the nL1CacheElements size had more effect. For example, with sievesize=2000000 and sievepercentage=50, a higher nL1CacheElements (1056000) speeds up the Wave by ~30%. (My CPU is an i7.)
In any case, because 50% is probably not the optimal sievepercentage, this measurement is less relevant.

BTW: I'm compiling the 64-bit version of jhPrimeminer under Windows with Visual Studio, where the size of 'unsigned long' is 32 bits, not 64 bits like on Linux (GNU). I've tried replacing "unsigned long" with uint64_t (unsigned long long), but it makes the Wave function run slower (~15-25%). I'm thinking it might be worth a try to replace "unsigned long" with uint32 in the Wave() function of the primecoin (HP9) client, where nL1CacheElements is also used.
Does what I'm talking about make sense?
sr. member
Activity: 291
Merit: 250
The nL1CacheElements constant was not meant to be changed, and there's a good reason for that. nL1CacheElements needs to be a multiple of 64 on a 64-bit platform (or a multiple of 32 on a 32-bit platform). Otherwise the sieve code will start misbehaving, which seems to be what razibuzouzou is seeing. The sieve will start producing false candidates, which will distort the metrics. Tests/h counts the candidates produced by the sieve, so it is directly impacted. The chains/day estimate relies on tests/h being accurate, so that will also be heavily impacted.

With the latest version, there could also be a small negative impact on block rate if the sieve is misbehaving. The false candidates will always be marked as bi-twin candidates, so if a candidate is in reality of another type, it will not get recognized properly.

Summary: Don't touch nL1CacheElements unless you know exactly what you're doing.

Mikael, thank you for the clarification. Indeed, after checking the code I found the reason why it has to be a multiple of 64/32. It's a shame I didn't check this before.
Anyway, I've run some more tests. With the default settings of the primecoin client there was no significant gain from different nL1CacheElements values, but as soon as the sieve size and sieve percentage were raised, the nL1CacheElements size had more effect. For example, with sievesize=2000000 and sievepercentage=50, a higher nL1CacheElements (1056000) speeds up the Wave by ~30%. (My CPU is an i7.)
In any case, because 50% is probably not the optimal sievepercentage, this measurement is less relevant.
hero member
Activity: 820
Merit: 1000
The nL1CacheElements constant was not meant to be changed, and there's a good reason for that. nL1CacheElements needs to be a multiple of 64 on a 64-bit platform (or a multiple of 32 on a 32-bit platform). Otherwise the sieve code will start misbehaving, which seems to be what razibuzouzou is seeing. The sieve will start producing false candidates, which will distort the metrics. Tests/h counts the candidates produced by the sieve, so it is directly impacted. The chains/day estimate relies on tests/h being accurate, so that will also be heavily impacted.

With the latest version, there could also be a small negative impact on block rate if the sieve is misbehaving. The false candidates will always be marked as bi-twin candidates, so if a candidate is in reality of another type, it will not get recognized properly.

Summary: Don't touch nL1CacheElements unless you know exactly what you're doing.
Mikaelh, what about using values that ARE a multiple of both 64 and 32 then? Is that safe, and will it yield improved performance if done correctly?

If you use multiples of 64 (or 32), then that should work in theory. I haven't really tested that exhaustively though. I would expect minimal gains from adjusting it in most cases.
Thanks Mikaelh. I tried values ~100000 higher and lower (but multiples of 64/32) and saw decreased performance both ways, so 200000 seems optimal for now. Shame, I was starting to think I could get 3x performance there. :)
sr. member
Activity: 301
Merit: 250
The nL1CacheElements constant was not meant to be changed, and there's a good reason for that. nL1CacheElements needs to be a multiple of 64 on a 64-bit platform (or a multiple of 32 on a 32-bit platform). Otherwise the sieve code will start misbehaving, which seems to be what razibuzouzou is seeing. The sieve will start producing false candidates, which will distort the metrics. Tests/h counts the candidates produced by the sieve, so it is directly impacted. The chains/day estimate relies on tests/h being accurate, so that will also be heavily impacted.

With the latest version, there could also be a small negative impact on block rate if the sieve is misbehaving. The false candidates will always be marked as bi-twin candidates, so if a candidate is in reality of another type, it will not get recognized properly.

Summary: Don't touch nL1CacheElements unless you know exactly what you're doing.
Mikaelh, what about using values that ARE a multiple of both 64 and 32 then? Is that safe, and will it yield improved performance if done correctly?

If you use multiples of 64 (or 32), then that should work in theory. I haven't really tested that exhaustively though. I would expect minimal gains from adjusting it in most cases.
hero member
Activity: 820
Merit: 1000
The nL1CacheElements constant was not meant to be changed, and there's a good reason for that. nL1CacheElements needs to be a multiple of 64 on a 64-bit platform (or a multiple of 32 on a 32-bit platform). Otherwise the sieve code will start misbehaving, which seems to be what razibuzouzou is seeing. The sieve will start producing false candidates, which will distort the metrics. Tests/h counts the candidates produced by the sieve, so it is directly impacted. The chains/day estimate relies on tests/h being accurate, so that will also be heavily impacted.

With the latest version, there could also be a small negative impact on block rate if the sieve is misbehaving. The false candidates will always be marked as bi-twin candidates, so if a candidate is in reality of another type, it will not get recognized properly.

Summary: Don't touch nL1CacheElements unless you know exactly what you're doing.
Mikaelh, what about using values that ARE a multiple of both 64 and 32 then? Is that safe, and will it yield improved performance if done correctly?
sr. member
Activity: 301
Merit: 250
The nL1CacheElements constant was not meant to be changed, and there's a good reason for that. nL1CacheElements needs to be a multiple of 64 on a 64-bit platform (or a multiple of 32 on a 32-bit platform). Otherwise the sieve code will start misbehaving, which seems to be what razibuzouzou is seeing. The sieve will start producing false candidates, which will distort the metrics. Tests/h counts the candidates produced by the sieve, so it is directly impacted. The chains/day estimate relies on tests/h being accurate, so that will also be heavily impacted.

With the latest version, there could also be a small negative impact on block rate if the sieve is misbehaving. The false candidates will always be marked as bi-twin candidates, so if a candidate is in reality of another type, it will not get recognized properly.

Summary: Don't touch nL1CacheElements unless you know exactly what you're doing.
sr. member
Activity: 354
Merit: 251
coinorama.net
Hi,
Has anyone tried playing with the nL1CacheElements constant in prime.cpp?

Yesterday, I tried a set of values and was a bit puzzled by the results.
It did not seem to affect the number of blocks found, but it had a strong impact on primesperday, chainspermin and chainsperday.
I am using the default mining settings: sievesize=1000000, etc.

With the default nL1CacheElements=200000, let's assume I measure a chainsperday of 1:
With nL1CacheElements=10000, chainsperday is multiplied by 6, but primesperday is multiplied by 0.1
With nL1CacheElements=20000, chainsperday is multiplied by 2, but primesperday is multiplied by 0.8
With nL1CacheElements=65536, chainsperday is multiplied by 0.5
With nL1CacheElements=90000, chainsperday is multiplied by 0.6666
With nL1CacheElements=100000, chainsperday is multiplied by 1.6
With nL1CacheElements=400000, chainsperday is multiplied by 0.5555

I am guessing that the chainsperday metric is somehow affected by the number of loops performed to combine the candidate arrays, but I couldn't understand how.
I know that primesperday is not an accurate performance metric, but I am wondering whether chainsperday is a reliable efficiency measurement.
Maybe getting the best of both values indicates maximal efficiency, or not...

Hi,
In the modified version of jhPrimeminer that is used for ypool.net, I did some auto-tuning of nL1CacheElements by measuring the time it takes to execute the Wave() function. Indeed, the default is not optimal most of the time. I also found that for some settings an even higher nL1CacheElements (1,000,000+) performs better. Based on tests with profiling tools, I concluded that the most time-consuming code is the writing to memory, and that these writes are done less linearly, to more random positions. So I think the CPU cache plays a smaller role in this case. Maybe I'm not right; I have never done this kind of tuning before.

Just tried removing the loop which processes blocks of nL1CacheElements.
Performance seems a bit affected; chainsperday was hit by a 0.82 factor.

What do you think? Are chainspermin variations due to the nL1CacheElements value an indicator of performance improvement/degradation?
sr. member
Activity: 476
Merit: 250
8 blocks found during the night, 2 of them orphans; that's 25%...
Could this be related to connection count? My current VPSes all have 8 connections, while on DO I could get up to 30 connections per instance. Could this be the reason?
Any way to increase the connection count?
sr. member
Activity: 291
Merit: 250
Hi,
Has anyone tried playing with the nL1CacheElements constant in prime.cpp?

Yesterday, I tried a set of values and was a bit puzzled by the results.
It did not seem to affect the number of blocks found, but it had a strong impact on primesperday, chainspermin and chainsperday.
I am using the default mining settings: sievesize=1000000, etc.

With the default nL1CacheElements=200000, let's assume I measure a chainsperday of 1:
With nL1CacheElements=10000, chainsperday is multiplied by 6, but primesperday is multiplied by 0.1
With nL1CacheElements=20000, chainsperday is multiplied by 2, but primesperday is multiplied by 0.8
With nL1CacheElements=65536, chainsperday is multiplied by 0.5
With nL1CacheElements=90000, chainsperday is multiplied by 0.6666
With nL1CacheElements=100000, chainsperday is multiplied by 1.6
With nL1CacheElements=400000, chainsperday is multiplied by 0.5555

I am guessing that the chainsperday metric is somehow affected by the number of loops performed to combine the candidate arrays, but I couldn't understand how.
I know that primesperday is not an accurate performance metric, but I am wondering whether chainsperday is a reliable efficiency measurement.
Maybe getting the best of both values indicates maximal efficiency, or not...

Hi,
In the modified version of jhPrimeminer that is used for ypool.net, I did some auto-tuning of nL1CacheElements by measuring the time it takes to execute the Wave() function. Indeed, the default is not optimal most of the time. I also found that for some settings an even higher nL1CacheElements (1,000,000+) performs better. Based on tests with profiling tools, I concluded that the most time-consuming code is the writing to memory, and that these writes are done less linearly, to more random positions. So I think the CPU cache plays a smaller role in this case. Maybe I'm not right; I have never done this kind of tuning before.
member
Activity: 75
Merit: 10
I've only been mining for about a week now, and I occasionally get the following error/crash:

Assertion failed!

Program: D:\Primecoin\primecoin-qt.exe
File: src/checkqueue.h, line 171

Expression: pqueue->nTotal == pqueue->nIdle


primecoin-0.1.2-hp9-winx64.zip

It's a bug in the primecoin client. It's been partially fixed, but everyone gets it occasionally. From what I have heard, systems with more cores (6+) tend to get it more often. All we can do for now is restart the primecoin client when it happens.
newbie
Activity: 18
Merit: 0
I've only been mining for about a week now, and I occasionally get the following error/crash:

Assertion failed!

Program: D:\Primecoin\primecoin-qt.exe
File: src/checkqueue.h, line 171

Expression: pqueue->nTotal == pqueue->nIdle


primecoin-0.1.2-hp9-winx64.zip
sr. member
Activity: 363
Merit: 250
I don't think we will ever reach the 1000-digit zone, not in 100 years at least.

Has fontas sent a future tweet yet for the pump?  I want to make sure I have my coins ready.

newbie
Activity: 54
Merit: 0
Well, block payments just dropped below 11 and difficulty passed 9.54

It's amazing how much a software update and better parameters can get you.
On that note, I'm starting to think the chains/day measurement is becoming less accurate.  Either that or I've been quite unlucky.

It doesn't seem to be accurate for me. I haven't gotten any blocks since the 25th (other than an orphan on the 25th) on my desktop, so I feel it's hard to tell. Then again, my sister got three blocks on Saturday with the same CPU... In my experience it seems like a rough average that was accurate when it was added but is slowly becoming less useful for judging how often you might get a block.
One major misconception about this coin is that people think it acts like any other coin out there... in one way it does, but for the most part it doesn't. Bitcoin, Litecoin, etc. all use a hash. Since the hash needed is random, it takes time to find. This is the only area where primecoin is like the others: there is a random chance of getting the right variable to find a prime. Where primecoin diverges is that it can never reuse a number already found. Bitcoin and Namecoin can use the same hash and be 'merge mined'; primecoin alone can never be merge mined.

The basis behind blocks is finding Cunningham chains of length N, where N is the difficulty rating. What does this mean? The smallest known length-9 chains start at 85864769 and 857095381 (1st and 2nd kind). The proof of work on this coin is such that once a prime chain has been found, it cannot be reused. So each chain found means one less chain that can be found. Look at the records primecoin has found in length-9 chains: 203079313818447426942005216333859268821258845418530474979400658085878842442579969*179#-1 is 151 digits long! In a way, the nice thing about this is that as time goes by and the length of these primes gets up into the 1000+ digit range, difficulty will start coming down as they become harder to find, and we'll start finding larger 8-chains and larger 7-chains.

Therefore the metric for finding blocks is ever changing; you can't count on what works today still being the best thing to do in 2 weeks' time. This is an ever-changing coin, and people really need to start thinking outside the box.

Prime chains are plentiful. I don't think we will ever reach the 1000-digit zone, not in 100 years at least.
sr. member
Activity: 434
Merit: 250
Well, it's less a model to predict blocks/day and more a way to measure performance. For instance, it could help someone tweak their setup toward finding longer chain lengths instead of shorter ones. Or really, just help people understand what their tweaks are doing to performance.

Exactly. Even if the prediction is way off, that isn't a big problem as long as something half the speed is also equally off with half the prediction. We need some way to compare hardware and know which machines are faster miners than others, etc.
member
Activity: 105
Merit: 10
Well, block payments just dropped below 11 and difficulty passed 9.54

It's amazing how much a software update and better parameters can get you.
On that note, I'm starting to think the chains/day measurement is becoming less accurate.  Either that or I've been quite unlucky.

It doesn't seem to be accurate for me. I haven't gotten any blocks since the 25th (other than an orphan on the 25th) on my desktop, so I feel it's hard to tell. Then again, my sister got three blocks on Saturday with the same CPU... In my experience it seems like a rough average that was accurate when it was added but is slowly becoming less useful for judging how often you might get a block.
One major misconception about this coin is that people think it acts like any other coin out there... in one way it does, but for the most part it doesn't. Bitcoin, Litecoin, etc. all use a hash. Since the hash needed is random, it takes time to find. This is the only area where primecoin is like the others: there is a random chance of getting the right variable to find a prime. Where primecoin diverges is that it can never reuse a number already found. Bitcoin and Namecoin can use the same hash and be 'merge mined'; primecoin alone can never be merge mined.

The basis behind blocks is finding Cunningham chains of length N, where N is the difficulty rating. What does this mean? The smallest known length-9 chains start at 85864769 and 857095381 (1st and 2nd kind). The proof of work on this coin is such that once a prime chain has been found, it cannot be reused. So each chain found means one less chain that can be found. Look at the records primecoin has found in length-9 chains: 203079313818447426942005216333859268821258845418530474979400658085878842442579969*179#-1 is 151 digits long! In a way, the nice thing about this is that as time goes by and the length of these primes gets up into the 1000+ digit range, difficulty will start coming down as they become harder to find, and we'll start finding larger 8-chains and larger 7-chains.

Therefore the metric for finding blocks is ever changing; you can't count on what works today still being the best thing to do in 2 weeks' time. This is an ever-changing coin, and people really need to start thinking outside the box.

That's really interesting info; I couldn't figure out how the block hashes were determined in Primecoin. This shows Primecoin to have a really bright future!
newbie
Activity: 54
Merit: 0
Well, block payments just dropped below 11 and difficulty passed 9.54

It's amazing how much a software update and better parameters can get you.
On that note, I'm starting to think the chains/day measurement is becoming less accurate.  Either that or I've been quite unlucky.

It doesn't seem to be accurate for me. I haven't gotten any blocks since the 25th (other than an orphan on the 25th) on my desktop, so I feel it's hard to tell. Then again, my sister got three blocks on Saturday with the same CPU... In my experience it seems like a rough average that was accurate when it was added but is slowly becoming less useful for judging how often you might get a block.

I wonder if there's a way to use stats like jhPrimeminer (the ypool client) does, but in this build. Basically, it counts how many chains of each length are found per hour. For instance, my AMD Phenom II X4 is currently finding 17 6-chains per hour, 193 5-chains/hour, and 2046 4-chains/hour. There are more stats of course, but that gives you an idea. If you let it run for a day or so, it is quite accurate.

You are proposing to build a model to predict how many blocks can be found per day. As I understand it, that was basically why Sunny added the chains-per-day gauge. However, Sunny pointed out somewhere that chains/day is not blocks/day; they are just closely related.

Well, it's less a model to predict blocks/day and more a way to measure performance. For instance, it could help someone tweak their setup toward finding longer chain lengths instead of shorter ones. Or really, just help people understand what their tweaks are doing to performance.
hero member
Activity: 516
Merit: 500
CAT.EX Exchange
Well, block payments just dropped below 11 and difficulty passed 9.54

It's amazing how much a software update and better parameters can get you.
On that note, I'm starting to think the chains/day measurement is becoming less accurate.  Either that or I've been quite unlucky.

It doesn't seem to be accurate for me. I haven't gotten any blocks since the 25th (other than an orphan on the 25th) on my desktop, so I feel it's hard to tell. Then again, my sister got three blocks on Saturday with the same CPU... In my experience it seems like a rough average that was accurate when it was added but is slowly becoming less useful for judging how often you might get a block.

I wonder if there's a way to use stats like jhPrimeminer (the ypool client) does, but in this build. Basically, it counts how many chains of each length are found per hour. For instance, my AMD Phenom II X4 is currently finding 17 6-chains per hour, 193 5-chains/hour, and 2046 4-chains/hour. There are more stats of course, but that gives you an idea. If you let it run for a day or so, it is quite accurate.

You are proposing to build a model to predict how many blocks can be found per day. As I understand it, that was basically why Sunny added the chains-per-day gauge. However, Sunny pointed out somewhere that chains/day is not blocks/day; they are just closely related.
hero member
Activity: 532
Merit: 500
Well, block payments just dropped below 11 and difficulty passed 9.54

It's amazing how much a software update and better parameters can get you.
On that note, I'm starting to think the chains/day measurement is becoming less accurate.  Either that or I've been quite unlucky.

It doesn't seem to be accurate for me. I haven't gotten any blocks since the 25th (other than an orphan on the 25th) on my desktop, so I feel it's hard to tell. Then again, my sister got three blocks on Saturday with the same CPU... In my experience it seems like a rough average that was accurate when it was added but is slowly becoming less useful for judging how often you might get a block.
One major misconception about this coin is that people think it acts like any other coin out there... in one way it does, but for the most part it doesn't. Bitcoin, Litecoin, etc. all use a hash. Since the hash needed is random, it takes time to find. This is the only area where primecoin is like the others: there is a random chance of getting the right variable to find a prime. Where primecoin diverges is that it can never reuse a number already found. Bitcoin and Namecoin can use the same hash and be 'merge mined'; primecoin alone can never be merge mined.

The basis behind blocks is finding Cunningham chains of length N, where N is the difficulty rating. What does this mean? The smallest known length-9 chains start at 85864769 and 857095381 (1st and 2nd kind). The proof of work on this coin is such that once a prime chain has been found, it cannot be reused. So each chain found means one less chain that can be found. Look at the records primecoin has found in length-9 chains: 203079313818447426942005216333859268821258845418530474979400658085878842442579969*179#-1 is 151 digits long! In a way, the nice thing about this is that as time goes by and the length of these primes gets up into the 1000+ digit range, difficulty will start coming down as they become harder to find, and we'll start finding larger 8-chains and larger 7-chains.

Therefore the metric for finding blocks is ever changing; you can't count on what works today still being the best thing to do in 2 weeks' time. This is an ever-changing coin, and people really need to start thinking outside the box.