
Topic: Analysis of hashrate-based double-spending (Read 14748 times)

donator
Activity: 2058
Merit: 1054
June 15, 2014, 12:45:11 PM
#79
No, the security comes from the average hashpower needed to produce a block. If you adjust the difficulty down, you reduce the security one block provides.
That's the myth that I'm refuting.

Though to be accurate, reducing the time per block does reduce the security one confirmation provides, but not in the way you think.
sr. member
Activity: 378
Merit: 265
Due to similar flaws, even when taken into account in projections, I believe PoS coins are the future. Yes, Bitcoin will still stay the king, but if more people vote, why not turn BTC into PoS?
member
Activity: 175
Merit: 10
No, the security comes from the average hashpower needed to produce a block. If you adjust the difficulty down, you reduce the security one block provides.
donator
Activity: 2058
Merit: 1054
I am trying to give clarity to the facts that contradict the myth. You don't need 51% to successfully double-spend coins, but the more hashpower you have, the more effective this attack will be.
Well, clarifying the facts is what the titular paper is about...
sr. member
Activity: 448
Merit: 250
It's Money 2.0| It’s gold for nerds | It's Bitcoin
Quote
Myth. Double-spending requires having more than half of the network hashrate.

As long as the attacker can use their hashpower to confirm a block that contains the double spend, the attack will be successful. The success rate of executing this attack increases as your hashrate increases, reaching 100% success once you control more than half the network.
Are you agreeing this is a myth or disagreeing?

Anyway, it's not just "confirm a block", it's building an alternative, longer chain after the seller has waited for n confirmations.

Quote
Myth. The important factor is the amount of time spent waiting for confirmations, rather than the number of blocks. (Still a myth if instead of actual time you look at #blocks * average time per block).

This could be true if the seller was willing to accept a 0/unconfirmed transaction. A seller would want to wait some time for the transaction to propagate throughout the network prior to releasing goods, as an attacker could release a TX to a very well connected node and then, one second later, release a conflicting transaction to a poorly connected node that the "attackee" is supposed to believe will go through. For some period of time both transactions will appear valid on some parts of the network, but after a while one TX will likely fall out of the pending pool on most nodes.
Note that this is an analysis of hashrate-based double spending, not double-spends of 0-conf based on network topology.

I am trying to give clarity to the facts that contradict the myth. You don't need 51% to successfully double-spend coins, but the more hashpower you have, the more effective this attack will be.
donator
Activity: 2058
Merit: 1054
Quote
Myth. Double-spending requires having more than half of the network hashrate.

As long as the attacker can use their hashpower to confirm a block that contains the double spend, the attack will be successful. The success rate of executing this attack increases as your hashrate increases, reaching 100% success once you control more than half the network.
Are you agreeing this is a myth or disagreeing?

Anyway, it's not just "confirm a block", it's building an alternative, longer chain after the seller has waited for n confirmations.

Quote
Myth. The important factor is the amount of time spent waiting for confirmations, rather than the number of blocks. (Still a myth if instead of actual time you look at #blocks * average time per block).

This could be true if the seller was willing to accept a 0/unconfirmed transaction. A seller would want to wait some time for the transaction to propagate throughout the network prior to releasing goods, as an attacker could release a TX to a very well connected node and then, one second later, release a conflicting transaction to a poorly connected node that the "attackee" is supposed to believe will go through. For some period of time both transactions will appear valid on some parts of the network, but after a while one TX will likely fall out of the pending pool on most nodes.
Note that this is an analysis of hashrate-based double spending, not double-spends of 0-conf based on network topology.
sr. member
Activity: 448
Merit: 250
It's Money 2.0| It’s gold for nerds | It's Bitcoin
Quote
Myth. Double-spending requires having more than half of the network hashrate.

As long as the attacker can use their hashpower to confirm a block that contains the double spend, the attack will be successful. The success rate of executing this attack increases as your hashrate increases, reaching 100% success once you control more than half the network.

Quote
Myth. The important factor is the amount of time spent waiting for confirmations, rather than the number of blocks. (Still a myth if instead of actual time you look at #blocks * average time per block).

This could be true if the seller was willing to accept a 0/unconfirmed transaction. A seller would want to wait some time for the transaction to propagate throughout the network prior to releasing goods, as an attacker could release a TX to a very well connected node and then, one second later, release a conflicting transaction to a poorly connected node that the "attackee" is supposed to believe will go through. For some period of time both transactions will appear valid on some parts of the network, but after a while one TX will likely fall out of the pending pool on most nodes.
donator
Activity: 2058
Merit: 1054
If mining is memoryless, does it make another popular paper published last year less meaningful? (http://arxiv.org/abs/1311.0243) They argue that if a selfish miner finds a block, he can hide it and start to mine the next block based on it earlier than the others, so he has an advantage.

What's the advantage then? The only advantage is a slight chance to find the next block before others have a chance to work on it. If he publishes the block before finding the next one, then no matter how long he has worked on it, he doesn't have any advantage.
That paper correctly treats block finding as memoryless (it may or may not be correct about other things). You'd need to examine their analysis to understand why the attack works. The basic idea is that the attackers are attempting to build a long chain which is unknown to the network. If they manage to find 2 blocks in a row, they have the ability to evaporate a block found by the honest network, by releasing their longer, previously hidden chain.
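For illustration, here is a toy Python sketch of just that "evaporation" step. To be clear, this is not the strategy analyzed in the selfish-mining paper: ties are simply conceded here, there is no network-race parameter, and the attacker share q is an arbitrary assumption.

Code:
import random

def toy_withholding(q, blocks=1_000_000, seed=1):
    """Toy model of the 'hide and evaporate' idea only: the attacker
    withholds his blocks, and whenever he is >= 2 ahead and the honest
    network finds a block, he releases his longer hidden chain and
    orphans it. A lead of 1 is simply conceded (a simplification)."""
    rng = random.Random(seed)
    lead = 0                      # attacker's hidden lead over the public tip
    honest_kept = honest_orphaned = 0
    for _ in range(blocks):       # each block is the attacker's w.p. q
        if rng.random() < q:
            lead += 1             # extend the hidden chain
        elif lead >= 2:
            honest_orphaned += 1  # honest block evaporates against the
            lead -= 1             # previously hidden, longer chain
        else:
            lead = 0              # concede: honest block stands
            honest_kept += 1
    return honest_kept, honest_orphaned

print(toy_withholding(0.35))

Even with this crippled strategy, a noticeable fraction of honest blocks ends up orphaned whenever the attacker manages to get 2 blocks ahead in private.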
legendary
Activity: 882
Merit: 1000
When you find a block earlier, your opponents do not need to start from scratch. They keep working on their current block, on which they have already spent time, so the probability of them getting the next block earlier than you is much, much higher than p.

It seems that if all the words in this paper make sense, the author actually assumes that mining a block is a memoryless process. It means that no matter how much time you have spent mining the current block, the expected time from now until the moment you find the block is always the same (around 10 minutes). Beginning to compute early does not bring you any advantage. This is quite counter-intuitive.

Therefore, I now really doubt the correctness of this paper.
It's a well-known fact that block finding is memoryless. Miners "start from scratch" every fraction of a second: they try a hash, and whether it makes a valid block is completely random and independent of previous attempts. If a hash is not valid, that does not make the next hashes any more likely to be valid. If a miner mines for 20 minutes and finds no block, he is not any closer to finding one than when he began.


I have a question about this paper.

We will not use this assumption, but rather model m more accurately as a negative binomial variable; it is the number of successes (blocks found by the attacker) before n failures (blocks found by the honest network), with a probability q of success. The probability for a given value of m is

$P(m) = \binom{m+n-1}{m} p^n q^m$

I doubt this is correct. When the attacker is preparing a fork, he is not competing with the others. There are two completely independent mining processes. I don't know why this can be modelled as a negative binomial variable.

When others find a block, it's not the attacker's failure. He will not restart mining based on the others' block, but has to continue mining on top of his own previous block.


Moreover

"We will denote by az the probability that the attacker will be able to catch up when he is
currently z blocks behind. Clearly, if z < 0 then az = 1 as the attacker already has a longer
branch. To find az when z ≥ 0, we condition on the first step. If the next block found will
be by the honest network, which happens with probability p, the attacker will now be z + 1
blocks behind and his probability of success will be az+1. If the next block found will be by
the attacker, which happens with probability q, his probability of success will be az−1. It
follows that az satisfies the recurrence relation
   az = paz+1 + qaz−1."


This is also not so straightforward. As I said, there's no competition, so what's the meaning of 'next block'? They are working on two different chains.
"Failure" is the standard term used when discussing negative binomial distributions. It doesn't imply anything about the attacker actually failing.

If you look at continuous time these are two separate processes. However, we are certainly allowed to introduce an abstraction that puts all the blocks found in a sequence. Each such block will have probability p of being honest and q of being the attacker's (independently of the other blocks). The rest of the analysis follows.

Yes, if mining is really memoryless, then I agree that everything makes sense.

So it means that because the nonce space is almost infinite, excluding part of it does not help at all.

If mining is memoryless, does it make another popular paper published last year less meaningful? (http://arxiv.org/abs/1311.0243) They argue that if a selfish miner finds a block, he can hide it and start to mine the next block based on it earlier than the others, so he has an advantage.

What's the advantage then? The only advantage is a slight chance to find the next block before others have a chance to work on it. If he publishes the block before finding the next one, then no matter how long he has worked on it, he doesn't have any advantage.

donator
Activity: 2058
Merit: 1054
When you find a block earlier, your opponents do not need to start from scratch. They keep working on their current block, on which they have already spent time, so the probability of them getting the next block earlier than you is much, much higher than p.

It seems that if all the words in this paper make sense, the author actually assumes that mining a block is a memoryless process. It means that no matter how much time you have spent mining the current block, the expected time from now until the moment you find the block is always the same (around 10 minutes). Beginning to compute early does not bring you any advantage. This is quite counter-intuitive.

Therefore, I now really doubt the correctness of this paper.
It's a well-known fact that block finding is memoryless. Miners "start from scratch" every fraction of a second: they try a hash, and whether it makes a valid block is completely random and independent of previous attempts. If a hash is not valid, that does not make the next hashes any more likely to be valid. If a miner mines for 20 minutes and finds no block, he is not any closer to finding one than when he began.
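For illustration, a minimal Python sketch of this memorylessness. The per-hash success probability s is an arbitrary assumption here; real difficulty is astronomically higher.

Code:
import random

s = 0.01                   # assumed per-hash success probability (made up)
rng = random.Random(0)

def attempts_until_block():
    n = 1
    while rng.random() >= s:   # each attempt is an independent coin flip
        n += 1
    return n

samples = [attempts_until_block() for _ in range(50_000)]
mean_fresh = sum(samples) / len(samples)
late = [n - 100 for n in samples if n > 100]   # condition on 100 failures
mean_after_failures = sum(late) / len(late)
print(round(mean_fresh), round(mean_after_failures))  # both come out near 1/s = 100

Both printed means come out near 1/s: a miner who has already failed 100 times expects just as much remaining work as one who has just started.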


I have a question about this paper.

We will not use this assumption, but rather model m more accurately as a negative binomial variable; it is the number of successes (blocks found by the attacker) before n failures (blocks found by the honest network), with a probability q of success. The probability for a given value of m is

$P(m) = \binom{m+n-1}{m} p^n q^m$

I doubt this is correct. When the attacker is preparing a fork, he is not competing with the others. There are two completely independent mining processes. I don't know why this can be modelled as a negative binomial variable.

When others find a block, it's not the attacker's failure. He will not restart mining based on the others' block, but has to continue mining on top of his own previous block.


Moreover

"We will denote by az the probability that the attacker will be able to catch up when he is
currently z blocks behind. Clearly, if z < 0 then az = 1 as the attacker already has a longer
branch. To find az when z ≥ 0, we condition on the first step. If the next block found will
be by the honest network, which happens with probability p, the attacker will now be z + 1
blocks behind and his probability of success will be az+1. If the next block found will be by
the attacker, which happens with probability q, his probability of success will be az−1. It
follows that az satisfies the recurrence relation
   az = paz+1 + qaz−1."


This is also not so straightforward. As I said, there's no competition, so what's the meaning of 'next block'? They are working on two different chains.
"Failure" is the standard term used when discussing negative binomial distributions. It doesn't imply anything about the attacker actually failing.

If you look at continuous time these are two separate processes. However, we are certainly allowed to introduce an abstraction that puts all the blocks found in a sequence. Each such block will have probability p of being honest and q of being the attacker's (independently of the other blocks). The rest of the analysis follows.
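As a sanity check on that abstraction, here is a minimal Python simulation (q and n are chosen arbitrarily) comparing the empirical distribution of m with the negative binomial formula $\binom{m+n-1}{m} p^n q^m$ quoted above:

Code:
import random
from math import comb

def sample_m(q, n, rng):
    # Draw the block sequence as i.i.d. Bernoulli(q) events and count
    # attacker blocks found before the honest network's n-th block.
    attacker = honest = 0
    while honest < n:
        if rng.random() < q:
            attacker += 1
        else:
            honest += 1
    return attacker

q, n, trials = 0.3, 4, 200_000
p = 1 - q
rng = random.Random(0)
counts = {}
for _ in range(trials):
    m = sample_m(q, n, rng)
    counts[m] = counts.get(m, 0) + 1

for m in range(8):
    exact = comb(m + n - 1, m) * p**n * q**m
    print(m, round(counts.get(m, 0) / trials, 4), round(exact, 4))

The empirical and exact columns agree to within sampling noise.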
legendary
Activity: 882
Merit: 1000
Moreover

"We will denote by az the probability that the attacker will be able to catch up when he is
currently z blocks behind. Clearly, if z < 0 then az = 1 as the attacker already has a longer
branch. To find az when z ≥ 0, we condition on the first step. If the next block found will
be by the honest network, which happens with probability p, the attacker will now be z + 1
blocks behind and his probability of success will be az+1. If the next block found will be by
the attacker, which happens with probability q, his probability of success will be az−1. It
follows that az satisfies the recurrence relation
   az = paz+1 + qaz−1."


This is also not so straightforward. As I said, there's no competition, so what's the meaning of 'next block'? They are working on two different chains. When you find a block earlier, your opponents do not need to start from scratch. They keep working on their current block, on which they have already spent time, so the probability of them getting the next block earlier than you is much, much higher than p.

It seems that if all the words in this paper make sense, the author actually assumes that mining a block is a memoryless process. It means that no matter how much time you have spent mining the current block, the expected time from now until the moment you find the block is always the same (around 10 minutes). Beginning to compute early does not bring you any advantage. This is quite counter-intuitive.

Therefore, I now really doubt the correctness of this paper.
legendary
Activity: 882
Merit: 1000
I have a question about this paper.

We will not use this assumption, but rather model m more accurately as a negative binomial variable; it is the number of successes (blocks found by the attacker) before n failures (blocks found by the honest network), with a probability q of success. The probability for a given value of m is

$P(m) = \binom{m+n-1}{m} p^n q^m$

I doubt this is correct. When the attacker is preparing a fork, he is not competing with the others. There are two completely independent mining processes. I don't know why this can be modelled as a negative binomial variable.

When others find a block, it's not the attacker's failure. He will not restart mining based on the others' block, but has to continue mining on top of his own previous block.
donator
Activity: 2058
Merit: 1054
January 05, 2013, 12:20:16 PM
#67
Interesting paper for the modelling part, but rather weak in dealing with the economics of the double-spend.
I tend to agree; the economics part is not meant to be conclusive, but rather to act as a starting point for how such an analysis could be done.

For example
"An attacker with majority hashrate can not only double-spend any trans- action at no cost (above the cost of running the mining hardware normally)"

Wrong: the attacker can only double-spend his own coins, meaning he must earn or buy a stash of coins upfront. As a result of his attack, his cost includes the initial value of the coins (before the attack) minus whatever value remains after the attack is stopped by counter-measures (like new checkpoints in the blockchain or merchants temporarily holding deliveries on bitcoin payments).
This cost exists, but as I explicitly stated, I don't think it's significant, since other factors limit the possible scale of the attack.

More generally, you can include most "practical" costs as terms in the model. Future work will include formulating a model that can encompass most such things and be tractable to solve. Then it's just a matter of figuring out all the costs and plugging them in.

The same flaw (silo thinking) was found in the Microsoft paper that dealt with Bitcoin's endogenous incentives (algorithmic rewards) while completely ignoring exogenous incentives (contractual rewards).
There's something very basic about research papers that some people seem not to get. You can't pose and solve a model that 100% accurately reflects reality. You need to make some simplifying assumptions, and even if those assumptions aren't correct per se, they often have less effect on the final conclusions than it seems; even when the effect is significant, the more accurate model will rely on the intuitions of the simpler case. You don't formulate general relativity without first having a firm grasp of Newtonian mechanics. A big part of the trick is knowing when the assumptions are adequate, and how to refine them when they are not.
legendary
Activity: 1220
Merit: 1015
e-ducat.fr
January 04, 2013, 12:07:06 PM
#66
Interesting paper for the modelling part, but rather weak in dealing with the economics of the double-spend.
A thorough economic analysis of the double-spend cannot go very far without game theory (Nash equilibrium in particular).

For example
"An attacker with majority hashrate can not only double-spend any trans- action at no cost (above the cost of running the mining hardware normally)"

Wrong: the attacker can only double-spend his own coins, meaning he must earn or buy a stash of coins upfront. As a result of his attack, his cost includes the initial value of the coins (before the attack) minus whatever value remains after the attack is stopped by counter-measures (like new checkpoints in the blockchain or merchants temporarily holding deliveries on bitcoin payments).

Also, the cost of performing a criminal activity (theft from merchants) with serious legal consequences (financial damage to the stakeholders) must be factored in, unless your assumption is the perfect crime (the promoters of the attack remain perfectly anonymous: a fairly aggressive assumption).

In short, if Visa or a similar organisation (an incumbent stakeholder) were to engage in such an attack, it would not be long before a disgruntled employee aired the story. Harnessing enough power to perform a serious attack is not something that could go unnoticed.

The same flaw (silo thinking) was found in the Microsoft paper that dealt with Bitcoin's endogenous incentives (algorithmic rewards) while completely ignoring exogenous incentives (contractual rewards).

legendary
Activity: 1204
Merit: 1015
January 03, 2013, 06:01:53 PM
#65
DAG stands for Directed Acyclic Graph. The PoS article you linked to mentions this once, without going into detail. I think it refers to the structure of the block relations. The blockchain currently forms a tree, where different branches are not reconcilable, one of them having to be eventually forfeited. I guess the proposal is to make it possible for branches to join back. I don't know of any such proposals though.
Thanks. It's probably the blocktree thing then.
This is indeed what a DAG is and I think we're thinking about the same proposal. I might try linking to it sometime.
I guess this is Maged's proposal, isn't it? https://bitcointalksearch.org/topic/m.686497
That's the one. I haven't spent time thinking about it in a while, AFAIR it's not completely fleshed out yet, but I think these ideas are the key to preventing a majority attack that rejects all transactions.
The few major issues I've come to realize with that proposal are:
1) It will most likely require cementing at some level so that miners don't "save up" their mined blocks to guarantee themselves a block in the main chain that won't be kicked out by someone else by chance.
2) On a similar note, Finney attacks are guaranteed to be successful, unless some form of block discouraging is used. This wouldn't be that big of a deal if the main chain gets extended every few seconds, though.
3) Any currently-backwards-compatible change that only requires a majority of the miners to upgrade to go into effect would require a hard fork if this were used. This might not be that much of a problem once the Bitcoin network is more mature.

Speaking of #3, that directly relates to that whole contract thing theymos brought up recently: it would truly become impossible to make lost coins unspendable if ECDSA is ever broken, so that would have to be considered too.

Anyway, that's the update on that.
donator
Activity: 2058
Merit: 1054
December 20, 2012, 02:22:36 AM
#64
I read your paper and got most of it. I have to admit my knowledge of binomial distributions and such is a little rusty, but I get your points. Of course, rhetorically speaking, what is the optimal balance between the proof-of-work difficulty and the target time between blocks? Why not reduce the target time to 30 seconds instead of 10 minutes? Great paper, though, and thank you for your contribution, certainly worth a donation.
Finding out an optimal tradeoff would require more analysis. Essentially, about t/T of the honest network's total hashrate is lost due to forking, where T is the time between blocks and t is the typical time to propagate a block on the network. If we assume t = 10 sec, then for T=10 min 1.7% is wasted, and for T=30 sec 33% is wasted. How this compares with the ability to get more confirmations in a unit of time depends on the specific scenario. The additional resource cost to SPV clients (inversely proportional to T) should also be considered.
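For reference, the arithmetic above in code form (t = 10 sec is the assumed propagation time, as stated):

Code:
t = 10                      # assumed seconds to propagate a block
for T in (600, 180, 30):    # block interval in seconds
    print(f"T = {T:>3}s: ~{t / T:.1%} of honest hashrate wasted on forks")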

I had a suggestion once for self-adjusting dynamic block frequency but I don't think it's really going to work. Even with a static system 2-3 minutes is probably better than 10, but I doubt this will be changed for Bitcoin.

Thank you for the donation.
newbie
Activity: 58
Merit: 0
December 20, 2012, 12:05:11 AM
#63
Meni R.,

I read your paper and got most of it. I have to admit my knowledge of binomial distributions and such is a little rusty, but I get your points. Of course, rhetorically speaking, what is the optimal balance between the proof-of-work difficulty and the target time between blocks? Why not reduce the target time to 30 seconds instead of 10 minutes? Great paper, though, and thank you for your contribution, certainly worth a donation.

donator
Activity: 2058
Merit: 1054
December 15, 2012, 12:22:11 PM
#62
That is an excellent paper discussing the hashrate, with a very straightforward mathematical explanation. I appreciated it very much.

Only this part I could not understand completely:

Quote
A more accurate model might take into account that generating blocks costs more to the attacker than their reward, and that he would not have mined them at all (or procured the necessary computing resources) if he did not want to double-spend. Such a model could obviate the need to choose a value for q, by posing limits on the hashrate an attacker would obtain to perform attacks of a given value. However, once the focus of the security is the cost of neutrally mining, the number of confirmations required becomes linear, not logarithmic, in the transaction value; this is very poor security, hence in a situation where this is relevant, we have already lost anyway.

What did you mean by 'cost of neutrally mining' and how the 'number of confirmations required becomes linear'?
'Cost of neutrally mining' = how much it costs to mine each block without trying to double-spend.

The cost of double-spending is roughly inversely proportional to the probability of success. If the hashrate isn't too high, the probability is exponentially decreasing with the number of confirmations, hence the cost is exponentially increasing. So the needed number of confirmations is logarithmic in the value of the transaction. If the attacker has majority hashrate, the probability of success is essentially 1, so it cannot help us make the attack expensive. The cost to double-spend in this case is roughly proportional to n/(2q-1), where n is the number of confirmations and q is the attacker's hashrate. Thus the number of confirmations needed is linear in the value of the transaction.
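A rough numerical sketch of that contrast (the cost units and parameter values here are made up purely for illustration):

Code:
from math import ceil, log

def n_minority(value, q, unit_cost=1.0):
    # q < 1/2: success probability shrinks roughly like (q/p)^n, so the
    # attack cost grows exponentially in n, and the confirmations needed
    # grow only logarithmically in the transaction value.
    p = 1 - q
    return max(1, ceil(log(value / unit_cost) / log(p / q)))

def n_majority(value, q, block_cost=1.0):
    # q > 1/2: success is certain; the deterrent is the mining cost,
    # roughly proportional to n/(2q - 1), so n must grow linearly in value.
    return max(1, ceil(value * (2 * q - 1) / block_cost))

for value in (10, 100, 1_000, 10_000):
    print(value, n_minority(value, 0.2), n_majority(value, 0.6))

The output shows the required confirmations growing like log V against a minority attacker but like V itself against a majority attacker, which is why the latter regime is hopeless in practice.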
vip
Activity: 756
Merit: 503
December 15, 2012, 11:38:39 AM
#61
That is an excellent paper discussing the hashrate, with a very straightforward mathematical explanation. I appreciated it very much.

Only this part I could not understand completely:

Quote
A more accurate model might take into account that generating blocks costs more to the attacker than their reward, and that he would not have mined them at all (or procured the necessary computing resources) if he did not want to double-spend. Such a model could obviate the need to choose a value for q, by posing limits on the hashrate an attacker would obtain to perform attacks of a given value. However, once the focus of the security is the cost of neutrally mining, the number of confirmations required becomes linear, not logarithmic, in the transaction value; this is very poor security, hence in a situation where this is relevant, we have already lost anyway.

What did you mean by 'cost of neutrally mining' and how the 'number of confirmations required becomes linear'?
donator
Activity: 2058
Merit: 1054
December 13, 2012, 03:25:13 AM
#60
Quote from: Meni Rosenfeld
The time constant might be relevant, if we assume that the attacker cannot sustain his
hashrate for a prolonged period of time. In this case, even majority hashrate does not
guarantee success, as the attack could take more time than available. However, it does
not seem likely that an attacker would go through the trouble of obtaining enormous
computational resources and only maintain them for a time short enough to make this
consideration relevant.
I think the time-constraint assumption does have merit. The attacker can rent hash power on a hash market (or just rent GPUs/ASICs directly from a cloud provider); he does not necessarily need to exclusively own a large hashrate himself.
I don't think it's feasible. You don't just walk into a store and buy or rent this kind of hashrate; it still requires a lot of preparation. GPUs will soon be obsolete, and ASICs are Application Specific: a provider of SHA-256 ASICs is a hash market. And I still think that if it becomes possible, we lose anyway, since the security will be only linear in the number of confirmations. A fuller treatment of this falls within the scope of more accurate economic modeling.

It is. 6 confirmations is equivalent to 6 confirmations, and 60 confirmations is equivalent to 60 confirmations. The probability of success is uniquely determined by the number of confirmations and the attacker's percentage of the total hashrate. With these two fixed, the difficulty, the total hashrate, the average time between blocks, and the actual time it took to find those confirmations do not matter at all.
Yes, I see it now.  I believe it's a correct analysis (and not what I had previously thought).
Great.
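For anyone who wants to check this numerically, here is a short Python sketch of the success probability from the paper's analysis (my own transcription, combining the negative binomial distribution with the catch-up probabilities, under the pre-mined-block convention used there). Note that only q and n appear as inputs:

Code:
from math import comb

def double_spend_success(q, n):
    # Probability a hashrate-q attacker eventually overtakes after the
    # merchant sees n confirmations. Difficulty, total hashrate, and
    # block interval cancel out: only q and n remain.
    p = 1 - q
    if q >= p:
        return 1.0   # a majority attacker always catches up eventually
    return 1 - sum(comb(m + n - 1, m) * (p**n * q**m - p**m * q**n)
                   for m in range(n))

for q in (0.1, 0.25, 0.4):
    print(q, [round(double_spend_success(q, n), 6) for n in (1, 2, 6)])

The block time has nowhere to enter this computation, which is exactly the point.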

A lot of big thinking behind this paper. Thanks for doing it; as I see it, it seems to raise some serious thought, and may help Bitcoin become better in the future.

Meni Rosenfeld: You seem to have put a lot of time and effort into studying this, and into writing such a good paper.

Why have you done this? I mean, what was your motivation to study this so deeply?
Well, first it's worth pointing out exactly how much effort was involved. I have spent about 12 hours working on this paper. It is this low because the paper is something of a "v Pravde net izvestiy, v Izvestiyakh net pravdy" ("in Pravda there is no news, and in Izvestia there is no truth"). There is really nothing new in sections 1-5: 1-2 are just common knowledge (added for context), and 3-5 (which are what I really wanted to write) just drill down into the analysis in Satoshi's paper. The only differences are:

1. Instead of a Poisson distribution, I used a more accurate negative binomial distribution.
2. I assumed the attacker needs to be 1 block ahead to win, but also that he pre-mines a block a la Finney; these two cancel out. Also, there is some ambiguity about how Satoshi counts the confirmations.

Writing when you know what you want to write doesn't take that much time. (It does require a specific state of mind which is not always available.) Half the time was spent figuring out how to properly align the blockchain diagrams.

Section 6 is mostly novel and required some research, but it's not meant to be authoritative, more of a starting point for additional analysis. I included it because the paper wouldn't be complete without it.

My primary motivations for writing this were:

1. I hate misinformation and myths. When I see important mistakes constantly being repeated I want to do something about it. It might end up saving me time by sparing the need to correct the mistakes every time.
2. It can signal my skills and commitment to Bitcoin, which may help expand my business opportunities (aka indirect monetization).
3. While it is hard work, some of it can also be fun (the math part, not the figuring out xy-pic part).

Altruism plays variable roles in different things I do, but I can't honestly say it had more than a minor influence on this in particular. So all in all #1 is why I wanted to do this, #2 is why I could justify allocating time to it, and #3 is a perk.

DAG stands for Directed Acyclic Graph. The PoS article you linked to mentions this once, without going into detail. I think it refers to the structure of the block relations. The blockchain currently forms a tree, where different branches are not reconcilable, one of them having to be eventually forfeited. I guess the proposal is to make it possible for branches to join back. I don't know of any such proposals though.
Thanks. It's probably the blocktree thing then.
This is indeed what a DAG is and I think we're thinking about the same proposal. I might try linking to it sometime.
I guess this is Maged's proposal, isn't it? https://bitcointalksearch.org/topic/m.686497
That's the one. I haven't spent time thinking about it in a while, AFAIR it's not completely fleshed out yet, but I think these ideas are the key to preventing a majority attack that rejects all transactions.