Looking at the code, unless there is a bug in the algorithm that is say rejecting 10.5 length chains at the moment, it *should* behave linearly. Diff 9.999 should be same as 10, but but from the charts it is obviously not.
I think the distribution of the Fermat's remainder is skewed. Sunny mentioned that briefly in his paper too. The skewed distribution is probably caused by the repeated exponentiation done during the Fermat's test.
I wonder if this is because all the current miners are optimized to sieve for 9 chains at the moment? I wonder if the the difficulty would look more linear if miners switch to sieving for 10 chains now that most 9 chains won't solve a block. I would love for a -sievelength param to be added to the build that would sieve for chains of a set length. I would do it myself and experiment but I am having some difficulty building the project on windows, and I suspect it is at least as hard as bitcoin to build.
I can build the ypool miner easily and from those experiments the sieve length is quite effective at optimizing for chains of a desired length.
I'm a bit doubtful that switching the target difficulty to 10 would be beneficial at this point. It might become beneficial when difficulty is around 9.9 but I haven't really investigated that.
Also it's good to remember that optimal mining parameters are probably not the same for ypool and solo mining. Ypool miners are concerned with maximizing the value of their shares. Previously that system was exploited using shorter chains. Now that the system was changed radically it could be possible to exploit it with longer chains.
The building procedure for the stock Primecoin client should be the same as for Bitcoin. For my version you also need libgmp on top of that.
Ah ok, if the distribution is heavily skewed, and the number of miners was increasing rapidly, that could explain some of it. Well, if the parameter is added I would try it out now even if it is too early and see what happens.
I did eventually get it to build on mingw64 with the gmp 5.1.0 source. For fun I collected stats on the decimal part of the distribution of all 5+ chains, separated into 1% bins. It looks fairly flat:
66 54 53 55 52 56 44 70 68 53 53 57 54 33 53 46 61 68 66 67 55 69 59 58 39 60 73 58 51 61 56 56 53 56 66 50 72 76 58 57 55 55 61 55 62 67 53 58 46 44 59 60 53 62 39 46 47 55 55 48 69 77 47 56 75 52 65 74 41 49 67 53 74 47 49 63 55 57 47 61 59 63 45 59 54 55 61 62 48 61 55 62 63 55 70 62 50 81 55 64
Would I be correct in saying the chance to find a prime in a set that is sieved is about 0.095 and the chance of finding a prime just randomly is 1/ ln(2^255) = 0.00568 ?
Yeah, I actually did some testing myself too and observed similar results. The distribution of the remainder is indeed not (significantly) skewed. So I was wrong there. I originally thought that the difficulty curve looked logarithmic which led to me guessing that the distribution would be skewed. I looked at the difficulty curve again and it's not logarithmic. The amount of chains that pass the Fermat's remainder test is simply inversely proportional to the decimal part of the difficulty. That means 20% of 9-chains pass at difficulty 9.8 and only 10% pass at difficulty 9.9.
The probability of finding a prime around 2^255 is indeed about 1 / ln(2^255). Sieving improves the odds significantly. According to my miner's statistics, about 4.3% of candidates are primes on mainnet (7% on testnet).
And SlyWax's earlier formula block rate estimate based on difficulty was indeed correct. I was wrong there too. But please remember that chains/day is not an accurate estimate of actual performance.
Ah ok, so EstimateCandidatePrimeProbability() returns around 9%, but in reality it is more like 4% when you actually divide primes found by the number of tests. Interesting. My calculations also showed it would be a terrible idea to start search for 10 chains already.
Edit: actually the number of tests isn't a good stat for this.. it is the number of chains tested and is incremented even when the sieve is empty and no prime tests are done. I ran some numbers on real tests at each chain length, and it looks like it will always the same, around 0.094 of candidates are prime. I'll run it for longer though to get better results for the longer chains.
0.09442 0.094065 0.094624 0.094806 0.048345 0.099099 0.18182 0 -1.#IND -1.#IND -1.#IND -1.#IND -1.#IND -1.#IND -1.#IND -1.#IND -1.#IND -1.#IND -1.#IND -1.#IND
This would also make a good block estimator:
double nineFraction = (currently around 0.17)
double blocksPerDay = 24*(d7ChainsPerHour*0.094^2*nineFraction + d7ChainsPerHour*0.094^3);