Pages:
Author

Topic: extremely long blocks - what are the chances? (Read 2854 times)

legendary
Activity: 1512
Merit: 1036
January 14, 2013, 11:36:50 AM
#25
This seems like the best place to make this available, a CSV dump of block times and durations.
Blocks 0 - 216538, in this format:

Code:
blocknum,epochtime,blocksec,datetime
0,1231006505,0,2009-01-03T18:15:05Z
1,1231469665,0,2009-01-09T02:54:25Z
2,1231469744,79,2009-01-09T02:55:44Z
3,1231470173,429,2009-01-09T03:02:53Z
4,1231470988,815,2009-01-09T03:16:28Z
5,1231471428,440,2009-01-09T03:23:48Z
6,1231471789,361,2009-01-09T03:29:49Z
7,1231472369,580,2009-01-09T03:39:29Z
8,1231472743,374,2009-01-09T03:45:43Z
9,1231473279,536,2009-01-09T03:54:39Z
10,1231473952,673,2009-01-09T04:05:52Z
11,1231474360,408,2009-01-09T04:12:40Z
12,1231474888,528,2009-01-09T04:21:28Z
13,1231475020,132,2009-01-09T04:23:40Z
14,1231475589,569,2009-01-09T04:33:09Z

http://we.lovebitco.in/blocktimes.zip

(space reserved if I make some histograms or conclusions with data...)
legendary
Activity: 1512
Merit: 1036
....
The average time would be 10 minutes, but not the median as 50% would imply.  The long tail at the high end pulls the average above the median.

1-exp(-1) is c. 63%, a little less than your number, as one would expect given the history of increasing network speed.

These two statements appear to be in dissonance, but are both true:
  • The average time to solve a block is 10 minutes
  • Half the blocks will be solved in under 7 minutes

Obviously easy to confuse the two when calculating things.

So the chance a block won't be found after 10, 20 minutes, etc. (and the 1 in x chance)
Code:
>>> for x in range(10,131,10) :
print x, '%2.5f%%' % (100*math.exp(-x/10)), '%1.2f' % (math.exp(x/10))
10 36.78794% 2.72 (average block length)
20 13.53353% 7.39
30 4.97871% 20.09
40 1.83156% 54.60
50 0.67379% 148.41
60 0.24788% 403.43
70 0.09119% 1096.63
80 0.03355% 2980.96
90 0.01234% 8103.08
100 0.00454% 22026.47
110 0.00167% 59874.14
120 0.00061% 162754.79
130 0.00023% 442413.39


And after how many minutes will half of blocks, 25% of the blocks, etc. not be solved:
Code:
>>> for x in [ 10*math.log(2), 10*math.log(4), 10*math.log(8), 10*math.log(16), 10*math.log(32), 10*math.log(64)]:
print x, '%2.5f%%' % (100*math.exp(-x/10)), '%1.2f' % (math.exp(x/10))

   
6.9314718056 50.00000% 2.00 (median block length)
13.8629436112 25.00000% 4.00
20.7944154168 12.50000% 8.00
27.7258872224 6.25000% 16.00
34.657359028 3.12500% 32.00
41.5888308336 1.56250% 64.00



And every day we can reasonably expect a block longer than 49 minutes, every week one longer than 69 minutes, every month 83 minutes:
Code:
>>> for x in [ 10*math.log(144), 10*math.log(144*7), 10*math.log(144*365.25/12)]:
print x, '%2.5f%%' % (100*math.exp(-x/10)), '%1.2f' % (math.exp(x/10))

   
49.6981329958 0.69444% 144.00
69.1572344863 0.09921% 1008.00
83.8548870042 0.02282% 4383.00


I'm curious to see how an integrated histogram of actual block data matches up with this.

Of more significance, if the hash rate temporarily halves (miners quit, pool DDOS, etc) all times double.
donator
Activity: 2772
Merit: 1019
The probability of a block being solved after 10 minutes: 50%, or 1/2

I might be doing something wrong, but it seems my data shows that 65% of block have been solved in less than 10 minutes. The difference seems a bit too large to be explained by our "faster mining", no?

No, the probability after 10 minutes at constant hashrate would approach, IIRC, 1 - exp(-1), not 50%.  The average time would be 10 minutes, but not the median as 50% would imply.  The long tail at the high end pulls the average above the median.

1-exp(-1) is c. 63%, a little less than your number, as one would expect given the history of increasing network speed.


Yep. Isn't it nice when empirical data matches up with the mathematical model so nicely? It might not be a big deal for most, but it never fails to make me feel deeply content.

Thanks for chipping in John and thanks again for bitcoin-abe.
hero member
Activity: 481
Merit: 529
The probability of a block being solved after 10 minutes: 50%, or 1/2

I might be doing something wrong, but it seems my data shows that 65% of block have been solved in less than 10 minutes. The difference seems a bit too large to be explained by our "faster mining", no?

No, the probability after 10 minutes at constant hashrate would approach, IIRC, 1 - exp(-1), not 50%.  The average time would be 10 minutes, but not the median as 50% would imply.  The long tail at the high end pulls the average above the median.

1-exp(-1) is c. 63%, a little less than your number, as one would expect given the history of increasing network speed.
donator
Activity: 2058
Merit: 1007
Poor impulse control.
I think the three and four block sequence probabilities are negative binomial, and if so the median values will be ~ 25 minutes for 3 blocks and ~ 34 minutes for four blocks solved, which also look about right.
If we treat time as continuous it will be the continuous version of negative binomial, which is the Erlang distribution.

I'm fairly certain we should treat time as continuous Wink

With that correction the median time for three blocks to be solved is 26.7 minutes and for four blocks to be solved 36.7 minutes.
donator
Activity: 2058
Merit: 1054
I think the three and four block sequence probabilities are negative binomial, and if so the median values will be ~ 25 minutes for 3 blocks and ~ 34 minutes for four blocks solved, which also look about right.
If we treat time as continuous it will be the continuous version of negative binomial, which is the Erlang distribution.
donator
Activity: 2058
Merit: 1007
Poor impulse control.
  • I'm looking at overlapping sequences, so a block that takes 127 minutes to calculate would result in multiple sequences being counted

So it's like a sliding window - for each block you report the amount of time taken for the next three or four blocks?

Without having your dataset to check, your data appears to be what I'd expect - with the exception of the errors due to the incorrect block reporting times, the top histogram looks exponentially distributed. If it surprises you, think of the time taken for a block to be solved like the number of shares taken for a pool to solve a block. On average blocks are solved in 10 minutes, 63.2% should be solved in under 10 minutes.

I think the three and four block sequence probabilities are negative binomial, and if so the median values will be ~ 25 minutes for 3 blocks and ~ 34 minutes for four blocks solved, which also look about right.

Excuse me now while I work through Meni's result Smiley
donator
Activity: 2058
Merit: 1007
Poor impulse control.
The probability of a block being solved after 10 minutes: 50%, or 1/2

I might be doing something wrong, but it seems my data shows that 65% of block have been solved in less than 10 minutes. The difference seems a bit too large to be explained by our "faster mining", no?

That's seems right. See the post above yours.
legendary
Activity: 1512
Merit: 1036
I will have to concede that my assumption about median solve time was wrong (the median time is 6.9314718056 minutes, not 10 minutes).
donator
Activity: 2772
Merit: 1019
The probability of a block being solved after 10 minutes: 50%, or 1/2

I might be doing something wrong, but it seems my data shows that 65% of block have been solved in less than 10 minutes. The difference seems a bit too large to be explained by our "faster mining", no?
donator
Activity: 2058
Merit: 1054
I misread the first post initially and did some incorrect math but still demonstrates this is a relatively likely event.

The probability of one block taking 127 minutes or more (simplified):
2^-12.7 = ~ 1/6654 = average one in 46 days worth of blocks

It's not 2^-12.7, it's e^-12.7 = 1 / 327748.

This is an example for an event that will probably not happen in 4 years, but still has a reasonable chance.
My result above is a binomial simplification of mining:

The probability of a block being solved after 10 minutes: 50%, or 1/2

The probability of a block not being solved after 10 minutes: 50%, (1/2), or 2 ^ -1
The probability of a block not being solved after 20 minutes: 25%, (1/2)*(1/2) = 1/4, or 2 ^ -2
The probability of a block not being solved after 30 minutes: 12.5%, or 1/8, or 2 ^ -3
....
The probability of a block not being solved after 127 minutes: 2 ^ -12.7, or ~0.015%
It's not a "simplification", it's just wrong. The probability of a block being solved within 10 minutes is not 50%, it's 63.21%. The probability that a block will take more than x*10 minutes to solve is exp(-x). 10 minutes is the mean, not the median.
donator
Activity: 2772
Merit: 1019
At this point it might be advisable to relax the presentation with some charts based on actual data Wink.

Unfortunately I'm using quite old data (missing some blocks so the longest chain ends at block 210060), as you can see by this query result.

Code:
     last_block_date     | blocks_in_db | avg_blocktime_seconds | avg_blocktime_minutes
------------------------+--------------+-----------------------+-----------------------
 2012-11-29 01:19:00+01 |       210060 |  586.2221984194991907 |    9.7769351613824622

So here we go: some histograms, click images for slightly larger versions.







Observations and clarifications/notes:

  • I'm looking at overlapping sequences, so a block that takes 127 minutes to calculate would result in multiple sequences being counted
  • The case of a 3-block sequence taking at least 127 minutes to find happened 759 out of 210,060 times (0.3613%)
  • The case of a 4-block sequence taking at least 127 minutes to find happened 1,551 out of 210,060 times (0.7383%)
  • 135,421 blocks (out of 210,060) have been solved in less than 10 minutes (64.47%)
  • There can be negative block times because miners clocks can be unsynced.
  • The block that took longest to calculate was block #2 (7719 minutes). It might've been block #1, but we don't know how long that took.
  • I put a confusing date on the upper right corner, data is from 2012-11-29
  • first and last "bins" include the rest of the data (for example last bin contains the number of blocks that took 127 minutes or more to find)
  • Surprisingly to me the "bin" with the most blocks is the 1-2 minutes bin (1:00 to 1:59.999) (bar labeled "1" in the charts")

queries I used are here: http://pastebin.com/tPg1RQtG

Does this stuff look like it could be correct to you guys?

EDIT: some corrections
legendary
Activity: 1512
Merit: 1036
I misread the first post initially and did some incorrect math but still demonstrates this is a relatively likely event.

The probability of one block taking 127 minutes or more (simplified):
2^-12.7 = ~ 1/6654 = average one in 46 days worth of blocks

It's not 2^-12.7, it's e^-12.7 = 1 / 327748.

This is an example for an event that will probably not happen in 4 years, but still has a reasonable chance.
My result above is a binomial simplification of mining:

The probability of a block being solved after 10 minutes: 50%, or 1/2

The probability of a block not being solved after 10 minutes: 50%, (1/2), or 2 ^ -1
The probability of a block not being solved after 20 minutes: 25%, (1/2)*(1/2) = 1/4, or 2 ^ -2
The probability of a block not being solved after 30 minutes: 12.5%, or 1/8, or 2 ^ -3
....
The probability of a block not being solved after 127 minutes: 2 ^ -12.7, or ~0.015%

donator
Activity: 2058
Merit: 1054
I misread the first post initially and did some incorrect math but still demonstrates this is a relatively likely event.

The probability of one block taking 127 minutes or more (simplified):
2^-12.7 = ~ 1/6654 = average one in 46 days worth of blocks

Of course if you want to do real discrete statistics:

At target 0000000000000529B10000000000000000000000000000000000000000000000
  • difficulty = 3249549.5844872
  • Probability per single hash = 0.000000000000000071649034700583570
  • Expected number of hashes per 10 minutes = 13956922157834111

All calculations that have been shown expect the hashrate actually corresponds to difficulty, however there's enough of a pattern when summing many blocks-per-day-of-the-week intervals to know some people turn off mining on the weekends.
It's not 2^-12.7, it's e^-12.7 = 1 / 327748.

This is an example for an event that will probably not happen in 4 years, but still has a reasonable chance.
legendary
Activity: 1512
Merit: 1036
I misread the first post initially and did some incorrect math but still demonstrates this is a relatively likely event.

The probability of one block taking 127 minutes or more (simplified):
2^-12.7 = ~ 1/6654 = average one in 46 days worth of blocks (edit: wrong)

Of course if you want to do real discrete statistics:

At target 0000000000000529B10000000000000000000000000000000000000000000000
  • difficulty = 3249549.5844872
  • Probability per single hash = 0.000000000000000071649034700583570
  • Expected number of hashes per 10 minutes = 13956922157834111

All calculations that have been shown expect the hashrate actually corresponds to difficulty, however there's enough of a pattern when summing many blocks-per-day-of-the-week intervals to know some people turn off mining on the weekends.
donator
Activity: 2772
Merit: 1019
Probability of this not happening. The probability of happening is roughly
0.99999999999999999999999999999999999999999999999999999999999999999999979340994 89681.

Thanks a lot everyone for doing this math.

Shown once again: probabilities are not intuitive. I actually thought I had observed something unlikely.

member
Activity: 112
Merit: 10
That's pretty messed up.
donator
Activity: 2058
Merit: 1054
@organofcorti - I don't think "4 blocks in 127 minutes" is the correct way to look at the event, because what we can see here is actually an interval of (127 minutes - epsilon) with only 3 blocks.

To simplify and clarify, let's look instead of the sequence of X1, X2, ... of time intervals between successive blocks. What we see here is a subsequence of intervals for which X1+X2+X3+X4 > 12.7 (I'm rescaling to have mean 1 for each interval). The probability for this with a specified subsequence is 0.13295%.

Now we ask what is the chance of this happening in 4 years. It's easier and not much different to ask instead about the chance of this happening in a sequence of 210,000 intervals.

If we denote Yi = X_{i}+X_{i+1}+X_{i+2}+X_{i+3} and u=12.7, the chance of this not happening is

Prob [For all 0 <= i < 200996  Yi < u] =
Product (i from 0 to 200996) of Prob [Yi < u | for all 0 <= j < i Yj < u]
Y_k and Y_m are independent for |k-m|>3. Approximating the first few terms (not needed but slightly simpler) this is equal to
Product (i from 0 to 200996) of Prob [Yi < u | Y_{i-1} < u, Y_{i-2} < u, Y_{i-3} < u]
All multiplicands are equal, and their value is
1 + (u^2 (6 + 2 E^u (-3 + u) + u (4 + u)))/(6 (-2 - 2 E^(2 u) - u (4 + u) + E^u (4 + u (4 + (-1 + u) u))))
(Slightly higher than the unconditional probability)
For u=12.7 this is
0.999202021115881015
So we're looking at
0.999202021115881015 ^ 200997 ~ 2 * 10^(-70)
Probability of this not happening. The probability of happening is roughly
0.99999999999999999999999999999999999999999999999999999999999999999999979340994 89681.

Edit: I'd be interested in hearing from anyone who can show me how to answer this question without having to assume the time is broken up into 16554 blocks of 127 minutes. I couldn't see how to do it.
I'm not sure I understand your question. The inter-event time for the Poisson process has http://en.wikipedia.org/wiki/Exponential_distribution . Are you talking about something else?
Yes, he talked about the probability of an event like this ever happening in a 4-year span.
legendary
Activity: 2128
Merit: 1074
Edit: I'd be interested in hearing from anyone who can show me how to answer this question without having to assume the time is broken up into 16554 blocks of 127 minutes. I couldn't see how to do it.
I'm not sure I understand your question. The inter-event time for the Poisson process has http://en.wikipedia.org/wiki/Exponential_distribution . Are you talking about something else?
donator
Activity: 2058
Merit: 1007
Poor impulse control.


so the last blocks took 22, 19, 40 and 46 minutes?

Wow, that's some pretty bad luck, right?

4 blocks in 127 minutes - the time you'd expect 12 to 13? That's pretty bad luck. You'd expect fewer blocks over the same duration only 0.46% of the time.

Thank you for giving me a number Wink.

The what's the probability of such bad luck happening at least once in 4 years? Probably not that unlikely.

There are ~ 16554 sets of 127 minute periods in four years. We would expect ~ 16554 * 0.0046... ~ 77 occurrences in that time. Not unlikely at all, I guess, just unlucky. But that doesn't answer your question and I don't want to just throw another figure out there, so I'll explain how I got the first result and how to answer your question.

The number of blocks solved in a given time period is a Poisson distributed random variable. If one block is expected in 10 minutes, then floor(1/10*127) = 12 blocks are expected in 127 minutes. The CDF can be calculated easily (using open source software such as R or the GNU Scientific Library, or you could use Wolfram Alpha ) and lower tail probability (the probability of 4 or fewer blocks being solved in 127 ) is 0.004636729 .

Although not entirely accurate, you could consider every 127 minute period as a Bernoulli trial where 4 or fewer blocks in the 127 minute period would be a success, and more than four in a 127 minute period is a failure, and p (the probability of success) = 0.0046. The number of successes in a given number of trials is a binomially distributed random variable.

Your question "what's the probability of four blocks in 127 minutes occurring at least once in four years?" can be reinterpreted as "what's the probability of at least one success resulting from 16554 trials where p = 0.0046?". Again this can be calculated using the software options above or Wolfram Alpha, and to the limit of my computer's accuracy:

Code:
P(X > 1) = 1


This means that four blocks in 127 minutes will almost certainly happen at least once in four years.

Edit: I'd be interested in hearing from anyone who can show me how to answer this question without having to assume the time is broken up into 16554 blocks sets of 127 minutes. I couldn't see how to do it.
Pages:
Jump to: