I was asked to comment on this issue. My claim is that Vlad's analysis is incorrect. Since many of the people reading this thread are statistical laymen, I'm going to walk through this step by step.
First off, approximating the distribution with one huge encompassing Poisson distribution is not the best choice in this scenario. A much better choice would be to use the central limit theorem to approximate the binomial distribution, separated by difficulty. The Poisson approximation is only valid under certain criteria (large n together with small p), while the central limit theorem approximation is pretty much good whenever n is large.
I will show my work for one iteration of this procedure and produce the results for the rest. Taking difficulty 434877 as the example to go off:
// Notes - bolded letters are estimates of what the true value should be (estimates taken from Vlad's sheet)
// n - sample size
// p - estimated probability of success
// p = x/n, where x is the number of successes (estimated blocks found)
Y ~ Binomial(n, p)
n = 2016
p = 135.705/2016 = 0.0673
Y ~ Binomial(2016,0.0673)
From the central limit theorem, we know that a Binomial of sufficient sample size will approximately follow a normal with mean n*p and variance n*p*q, where q = 1 - p. Therefore in our example our distribution becomes the following
Y ≈ Normal(n*p, n*p*q)
Y ≈ Normal(135.677, 126.546)
To calculate the probability of getting no more than the observed Y (actual blocks found), all that's left is to convert our normal distribution to a standard normal and look up the p-value.
P(Y ≤ 134)
= P(Z ≤ (134-135.677)/sqrt(126.546))
= P(Z ≤ -0.149)
= 0.440
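For anyone who wants to reproduce the calculation above without a z-table, here is a minimal Python sketch. The function names are my own for illustration and are not taken from the spreadsheet; the inputs (n = 2016, p = 0.0673, 134 observed blocks) are the ones used in the worked example. It should land at about 0.44, with small differences in the third decimal depending on rounding.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_approx_p_value(n: int, p: float, observed: float) -> float:
    """P(Y <= observed) for Y ~ Binomial(n, p), via the normal approximation."""
    mean = n * p                      # n*p
    variance = n * p * (1.0 - p)      # n*p*q with q = 1 - p
    z = (observed - mean) / math.sqrt(variance)
    return normal_cdf(z)

# Inputs from the worked example for difficulty 434877.
p_value = normal_approx_p_value(2016, 0.0673, 134)
print(f"P(Y <= 134) is approximately {p_value:.3f}")
```

The same function can be pointed at any other difficulty period by swapping in that period's estimated p and observed block count.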
What does this value mean? It means there is roughly a 44% chance of seeing a value this low or lower at this difficulty (434877), which is completely acceptable.
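As a sanity check on the approximation itself, the exact binomial probability can be summed directly and compared with the normal figure. This is only a sketch under the same assumptions as the worked example (n = 2016, p = 0.0673, 134 observed blocks); the helper names are hypothetical, and the sum is done in log space purely to keep the large binomial coefficients from overflowing.

```python
import math

def binomial_cdf(n: int, p: float, y: int) -> float:
    """Exact P(Y <= y) for Y ~ Binomial(n, p), summed term by term in log space."""
    log_p = math.log(p)
    log_q = math.log(1.0 - p)
    total = 0.0
    for k in range(y + 1):
        # log of "n choose k" via the log-gamma function
        log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        total += math.exp(log_choose + k * log_p + (n - k) * log_q)
    return total

def normal_approx_cdf(n: int, p: float, y: float) -> float:
    """Normal approximation to the same probability (no continuity correction)."""
    z = (y - n * p) / math.sqrt(n * p * (1.0 - p))
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

exact = binomial_cdf(2016, 0.0673, 134)
approx = normal_approx_cdf(2016, 0.0673, 134)
print(f"exact binomial: {exact:.3f}, normal approximation: {approx:.3f}")
```

If the two numbers are close, the normal approximation is doing its job for this sample size; a large gap would be a reason to use the exact sum instead.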
Some things to take with a grain of salt: we gave a single estimate for p, when in fact p changes quite a lot within each difficulty period as hashing power changes. I've also made a note on the spreadsheet of occurrences that might indicate something odd happening, which could be explained by DDoS attacks or other systematic errors fixed by patches later on.
If the original value that Vlad stated were true, I would be concerned. However, thankfully this is not the case, and I hope everyone can see the sense and reasoning posted here.
The remaining p-values are below for convenience and the sheet that I used to calculate said values is linked:
https://spreadsheets.google.com/spreadsheet/ccc?key=0AoAyWRmssbLKdHduLURqdENHckw0SzRNX3JhN3ZKV2c&hl=en_US
Difficulty | P-value | Verdict
434877.04 | 0.439 | Nothing wrong at all
567269.53 | 0.000 | Check For DDoS/Other Systematic Errors
876954.49 | 0.449 | Nothing wrong at all
1379192.28 | 0.341 | Nothing wrong at all
1563027.99 | 0.040 | Statistical Anomaly 4% chance?
1690895.8 | 0.331 | Nothing wrong at all
1888786.7 | 0.001 | Check For DDoS/Other Systematic Errors
1805700.83 | 0.720 | Nothing wrong at all