Author

Topic: Retargeting algorithm -- arithmetic or harmonic mean? (Read 2331 times)

donator
Activity: 2058
Merit: 1054
In principle, negative timespans which result from inaccurate timestamps can be solved by sorting the 2016 timestamps, taking differences between the sorted values, and setting a minimum value for the spans.

My comment (which is about timestamps coming from an exact, theoretical Poisson process) wasn't completely accurate. The fact that 1/E[1/X]=0 is only relevant if the number of timestamps is arbitrary. For the fixed number 2016, E[H] has a finite value, which is approximately 0.109*(10 minutes). For the geometric mean, E[G] = exp(E[log(X)]) = exp(-gamma)*E[X], independent on the number of intervals.
legendary
Activity: 905
Merit: 1012
Thanks gmaxwell, that's the kind of explanation I was looking for. I think I understand Meni's original post now--the harmonic mean is clearly a flawed choice in this case.
staff
Activity: 4284
Merit: 8808
But with the harmonic mean every miner would have to lie to achieve the same effect as just the miner of the first/last block being dishonest in the arithmetic mean scenario.

_any_ miner lying can lower the difficulty slightly, so everyone gets a marginal return for doing it. This makes larger drifts (which at still limited to half a percent one time reduction in difficulty at most) harder, yes, but it makes smaller drifts much more likely.


If I understand correctly you're proposing:

10./(2016./sum([1./GAPi for i in range(2016)])) = adj factor

So, if GAP is all 10 you have 1.  ... All 5 you have 2. Etc.

What happens when every third is a false tick that goes back 10 minutes, then the next tick is correct.

The gaps then look like [10, -10, 30, ...].   The arithmetic mean gives the right value.

10./(2016./sum([1./(10+(-20*(x%3==1))+(20*(x%3==2))) for x in range(2016)])) = 0.1111x   !

Shall we try for a geometric mean so that we can get an imaginary adjustment next? Smiley
legendary
Activity: 1232
Merit: 1076
PID controllers are hard to tune. I like bitcoin's simplicity
donator
Activity: 2058
Merit: 1054
This absolutely cannot work. The harmonic expectation of the exponential distribution is 0, so there is no way to retarget using it.

Even if you use a geometric mean or something else that does converge, it will have very high variance causing retargets to be all over the place. Not only that, but an attacker could release blocks with reported timestamp a second after the previous ones, causing a massive drop in the geometric mean and a spike in the difficulty.
Are you talking about continuous difficulty adjustment? Otherwise I'm really not sure where you're coming from. The question in the OP is about a drop-in replacement of the method for determining the average hashrate in bitcoin's once-every-2016th-block difficulty adjustment. I assure you it does work as I have live code doing it right now, and it retargets just fine; it even results in a more accurate adjustment. The harmonic mean, not the arithmetic mean, yields the actual average hashrate for the last N blocks, although the difference between the two is typically very small.
Maybe I'm not getting your proposed algorithm. Letting Ti = time between block i-1 and block i, the current algorithm is (disregarding the off-by-one bug):
Quote
A = (T_1 + T_2 + ... +T_2016) / 2016
D2 = D1 * 10 minutes / A
What I think you're suggesting:
Quote
H = 2016 / (1/T_1 + 1/T_2 + ... + 1/T_2016)
D2 = D1 * c / H
Is that right? If so, can you explain what c is, and how does this work? If not, what is it?
legendary
Activity: 905
Merit: 1012
This actually isn't desirable.  In the current system miners have no economic incentive to lie about their timestamps (no marginal gain from doing so), except for the window boundary block, whos permissible times are constrained by the surrounding blocks and the actual time, allowing at most a small one time difficulty shift (assuming the warping bug is fixed).  Including other blocks in the time computation would create an incentive to lie constantly for every miner, and for the same final response time a much larger allowable skew.
That's no different than the incentive in the current system, however. The existing constraint on block header timestamps vs. network time caps the amount of michief that can be done in either case. But with the harmonic mean every miner would have to lie to achieve the same effect as just the miner of the first/last block being dishonest in the arithmetic mean scenario.

This absolutely cannot work. The harmonic expectation of the exponential distribution is 0, so there is no way to retarget using it.

Even if you use a geometric mean or something else that does converge, it will have very high variance causing retargets to be all over the place. Not only that, but an attacker could release blocks with reported timestamp a second after the previous ones, causing a massive drop in the geometric mean and a spike in the difficulty.
Are you talking about continuous difficulty adjustment? Otherwise I'm really not sure where you're coming from. The question in the OP is about a drop-in replacement of the method for determining the average hashrate in bitcoin's once-every-2016th-block difficulty adjustment. I assure you it does work as I have live code doing it right now, and it retargets just fine; it even results in a more accurate adjustment. The harmonic mean, not the arithmetic mean, yields the actual average hashrate for the last N blocks, although the difference between the two is typically very small.

Also, messing with the timestamp of the last block may have some short-term effect, but not so much on the long-term because of the rules for validity of timestamps.
The question is more theoretical than anything. Such mischief would result in at most an increase or decrease in the difficulty by a few percent, and would likely be corrected in the next round anyway. Really I'm asking the question out of a desire to understand: I'm wondering if there is a reason Satoshi decided to use the arithmetic mean (such as preventing an attack or removing incentive for bad behavior), or if it was just a mistake (a very common mistake, to be fair).
donator
Activity: 2058
Merit: 1054
This absolutely cannot work. The harmonic expectation of the exponential distribution is 0, so there is no way to retarget using it.

Even if you use a geometric mean or something else that does converge, it will have very high variance causing retargets to be all over the place. Not only that, but an attacker could release blocks with reported timestamp a second after the previous ones, causing a massive drop in the geometric mean and a spike in the difficulty.

Also, messing with the timestamp of the last block may have some short-term effect, but not so much on the long-term because of the rules for validity of timestamps.

This isn't to say I think the retargeting algorithm is perfect. It is basically a P controller, where PID may have been better.
staff
Activity: 4284
Merit: 8808
such potential/risk would have been evenly distributed across all blocks.

This actually isn't desirable.  In the current system miners have no economic incentive to lie about their timestamps (no marginal gain from doing so), except for the window boundary block, whos permissible times are constrained by the surrounding blocks and the actual time, allowing at most a small one time difficulty shift (assuming the warping bug is fixed).  Including other blocks in the time computation would create an incentive to lie constantly for every miner, and for the same final response time a much larger allowable skew.

Thats not to say that such a change wouldn't have also been a viable approach— it probably would have been— I just don't see that it's clearly better.
legendary
Activity: 905
Merit: 1012
That's actually how I came to thinking about this. I was looking at the retargeting code and got to thinking about how even if the time-warp bug were fixed one might still go about causing mischief by coming up with acausal values for the block header timestamp. Because an arithmetic average is used, all that matters for the difficulty calculation are the endpoints of the retargeting interval--the miners that happen to find the first and last block of an interval are privileged in their ability to affect the difficulty retargeting algorithm by about +/- 2 hours (0.6%) each. If the harmonic mean were used instead, such potential/risk would have been evenly distributed across all blocks. (Additionally, a pool operator with a significant percentage of the global hash rate could further move the difficulty by a few percent by mucking around with all its blocks so as to skew the network time in one direction or the other, but this remains true in either case.)

By not fixing the time warp attack the problem is exacerbated, as otherwise (with the fix in place) an inflation of the perceived difficulty of one retarget interval would have a corresponding deflationary effect on the following round, and vice versa.
legendary
Activity: 873
Merit: 1000
Bitcoin uses a simple arithmetic mean for its retargeting algorithm. But since it's adapting to a change in rate, shouldn't it be using a harmonic mean instead? Of course it is probably way too late to change the algorithm, but wouldn't an arithmetic mean give mine pool operators who fudge timestamps a disproportionate effect?

read up on the time warp bug, it might be relevant: https://bitcointalksearch.org/topic/m.521772
legendary
Activity: 905
Merit: 1012
Bitcoin uses a simple arithmetic mean for its retargeting algorithm. But since it's adapting to a change in rate, shouldn't it be using a harmonic mean instead? Of course it is probably way too late to change the algorithm, but wouldn't an arithmetic mean give mine pool operators who fudge timestamps a disproportionate effect?

EDIT: As gmaxwell and Meni Rosenfeld point out, bitcoin is actually measuring the rate parameter of an exponential distribution, for which the arithmetic mean is the proper choice. Also, the harmonic mean could be manipulated by miners via acausal blocks or very short timestamp differences.
Jump to: