Pages:
Author

Topic: New paper: Accelerating Bitcoin's Trasaction Processing - page 2. (Read 36368 times)

legendary
Activity: 905
Merit: 1012
Could someone explain to me how the nodes should select the heaviest subtree, when they don't have that information? It requires information from all(!) other nodes. How can this even make sense, because one node is connected to say 8 peers and not the entire network?

It's done based on local information.
member
Activity: 70
Merit: 10
Could someone explain to me how the nodes should select the heaviest subtree, when they don't have that information? It requires information from all(!) other nodes. How can this even make sense, because one node is connected to say 8 peers and not the entire network?
newbie
Activity: 23
Merit: 0
Hi,

I have post my suggestion for GHOST implementation, aiming to solve the centralization/unfairness of mining problem, and to reduce the vulnerability to convergence attacks (by making it much more difficult to effectively release old-mined blocks). Here is the link to the thread, and I will appreciate any of your comments:  https://bitcointalksearch.org/topic/m.4278073.
 
Lear.
newbie
Activity: 23
Merit: 0
Hi,

Let's reexamine proposition 8.1 and claim 8.2 of the paper, that say there will always eventually be a convergence ("collapse") to the next chain block and this convergence takes in expectation a finite time.

The proof is based on the fact that for any finite interval of time, there is a positive constant probability of the event where no new block is created anywhere in the network during a period of time of this interval's length. The problem is that you implicitly assume all miners immediately release any block they have found, and we know this is untrue. So you cannot simply derive that on some point all the nodes in the network will have exactly the same information without assuming that all nodes are honest!

If the average time of creating a new block in the tree is longer than the network diameter, your proof is nevertheless valid. Else, theoretically there is a positive probability that the network will never come to a point of sharing exactly the same information, since withholded blocks might be released once every d seconds where d is the network's diameter.
 
Now let me describe a possible DoS attack that tries to prevent convergence. We take the approach of the paper to have a simplified network of only two honest nodes, while noting that a similar attack is applicable to a more complex network. Say we have two honest nodes whose connection having a delay of d that is relatively big compared with the mean time of block creation.

Say we have two more nodes controlled by the same attacker, each one is adjacent to one of the two honest nodes, so that the delay between the pair is negligible compared to d. The two attacker's nodes may be connected by a special secret channel whose delay is lower than d, or may have no direct connection at all. Having such a special connection can only reduce the attacker's required fraction of total hash-power, but this is not crucial for the attack success. (Nevertheless I think it is most probable that such a faster channel may exist, because in reality the attacker can skip validation of directly transmitted data between his nodes, or even send only data about the number of created blocks instead of the block themselves. On the other hand those two honest nodes are actually two "centers" of the network and the delay between them is caused by having a path of at least several nodes between them…).

Let the two honest nodes have about the same hash-power, and the two attacker's nodes have relatively small hash-power compared with the honest nodes. Due to the delay, one honest node is informed only of his blocks and the released blocks of his adjacent attacker, as well as the blocks of the other honest node and the released blocks of the other attacker's node that were not released at the last d seconds. Therefore it is possible that the other honest node is working on a branch that is actually heavier that the branch the first honest node is working on, because first honest node wrongly thinks that his branch is the heavier. The attacker is trying to always fool the honest node that mines on the less heavy branch, so that the honest nodes will never converge to the same branch.

In order to analyze the required hash-power of the attacker needed so that with a positive probability there will never occur a "collapse" (in contradiction to proposition 8.1), we shall slightly modify our model so it will be easier to mathematically analyze it: instead of saying a honest node will continue to mine on his initial branch until a point where the competitive branch has been heavier already d seconds before, we say that he continues to mine on his initial branch unless the other branch is heavier by k blocks or more (ignoring any delays). In a sense having a delay can be shown to be equivalent of having such k > 0. Moreover, on reality there is no such fixed delay d, so I am not sure that the model of having a fixed corresponding k is less realistic than having a fixed delay d.

The attack goes as follows: each of the attacker's nodes keeps mining in secret on top of the branch of its adjacent honest node, and whenever the competitive second branch is reaching a tie from the point of view of the two nodes working on the first branch, meaning it is actually leading by k blocks) the attacker's node release a single block that breaks the tie in favor of its adjacent honest node. So the question is what is the minimal required hash-power of the attacker so that there is a positive probability that he will never run out of blocks when he needs to release a new block? (In fact the attacker might temporarily run out of blocks, so the network will temporarily collapse to one of the branches, but the attacker may later succeed mining the required block to reverse the collapse. We ignore that for simplicity).

Mathematically, we have a random walking on the 2k-1 numbers: 0, +-1, +-2, … +-(k-1). Each step is forward with probability 1/2 and backward with probability 1/2, where "forward" of (k-1) is regarded as (k-1) and backward of -(k-1) is regarded as -(k-1), as long as the attacker hasn’t run out of secret blocks. This simple Markov process can be shown to define a uniform distribution over the set of legal positions, therefore each of the attacker's nodes have to release a single block every 2(2k-1) blocks in expectation. So the minimum required hash rate of the attacker is 1/(2k-1) of the total honest hash-power, or 1/2k of the total network's hash-power.

The bigger is the delay d with respect to the average time it takes to create a new block, the bigger is k, and the closer to zero is the required hash-power of a successful attacker. Therefore I am afraid that 1 second per block-chain block (which is less than 1 second per a block in the tree!!) is just too much. Yet I am sure GHOST is a revolutionary proposal that can much improve the very low current rate of 10 minutes per block.

The last thing I would like to note is that vulnerability to the same kind of convergence attack isn’t unique to GHOST, but to any system whose delay is relatively high compared with the average time per block. If Bitcoin blocks will be *much* increased (a lot more than the current 1MB maximum) and as a result the delay will increase too, it will be possible to attack Bitcoin similarly.

Yet a major difference is that in Bitcoin the attacker cannot keep secret blocks and release them later, as this will have no effect. Therefore the attacker should always mine on top of the shorter chain, and hopes that the difference between the lengths of the competitive chains will not grow too much. Theoretically it can be shown that no matter how much hash-power the attacker has, eventually this gap will reach any natural number. As this might take a very long time, I still consider such an attack valid.


Lear
newbie
Activity: 21
Merit: 0
Aviv & Yonatan:  Could GHOST work in the situation that transaction & block propagation across the whole network happens much slower than block generation?  For instance, in inter-planetary distances?  We'd expect blocks to be made on Mars every few seconds but each would not make it to Earth for a few minutes.

It would work, in the sense that the 50% attack remains hard to carry out, but the number of transactions per second would be really low. Too low to be of any use.
We're working on an extension that would fix that issue as well. It's a bit too early to tell if we'll be able to close all the holes. I haven't thought of it as a solution for an interplanetary currency though  Smiley
newbie
Activity: 11
Merit: 0
Aviv & Yonatan:  Could GHOST work in the situation that transaction & block propagation across the whole network happens much slower than block generation?  For instance, in inter-planetary distances?  We'd expect blocks to be made on Mars every few seconds but each would not make it to Earth for a few minutes.

The only mining on Mars should be minerals Smiley
newbie
Activity: 24
Merit: 0
Aviv & Yonatan:  Could GHOST work in the situation that transaction & block propagation across the whole network happens much slower than block generation?  For instance, in inter-planetary distances?  We'd expect blocks to be made on Mars every few seconds but each would not make it to Earth for a few minutes.
legendary
Activity: 1176
Merit: 1036
Dash Developer
To anyone interested in helping implement ghost:

I'm going to start implementing the changes needed to traverse the tree in the reverse direction and then implement GHOST. If anyone wants to help, feel free to submit pull requests:

https://github.com/evan82/bitcoin-ghost
newbie
Activity: 24
Merit: 0
Regarding bloat: This is not a difficult issue. First, there is only potential bloat once propagation delays are significantly higher than block creation rates, but more importantly,
The contents of off-chain blocks can be discarded after a short while (e.g., after 1 day). Only the header is needed to verify the difficulty was met. If a switch is needed to some previously discarded subtree, then the node that triggers it, can supply the contents of the blocks as well.
 OK.  This works for me.
newbie
Activity: 21
Merit: 0
2) There is also a need to add hashes (or other identifiers) of newly discovered off-chain blocks into each block ("newly discovered" here means: not included inside any other previously reachable block). Nodes must then request such blocks if they do not already have them, and the propagation mechanisms / inv messages need to be adjusted accordingly.
Yes, this would also need to be done in order to implement your proof-of-work difficulty-adjustment algorithm, right?  But it's not required for GHOST?  I have not started looking into how to implement this.  I am worried about the bloat that this would cause if the entire tree must be stored though.  Have you looked into how your algorithm would fare if we ignored off-chain blocks for difficulty-adjustment?

This is actually required for GHOST, so that nodes have the same world-view and decide on the same leaf -- the off-chain blocks affect your choice of leaf. It is especially important for nodes that were offline for a while and want to catch up. You may in theory be able to do without it, but the convergence time to the same chain would take longer.

Regarding bloat: This is not a difficult issue. First, there is only potential bloat once propagation delays are significantly higher than block creation rates, but more importantly,
The contents of off-chain blocks can be discarded after a short while (e.g., after 1 day). Only the header is needed to verify the difficulty was met. If a switch is needed to some previously discarded subtree, then the node that triggers it, can supply the contents of the blocks as well.
newbie
Activity: 24
Merit: 0
1) I might be wrong about this, because I haven't looked at the implementation too carefully, but in principle, I think forward and backward traversal of the chain should already be possible: as far as I'm aware, the unspent transaction output set needs to be rolled back if a recent part of the chain is replaced. If I'm not mistaken (and somebody please correct me if I am) the unspent outputs are only maintained for the "best block" and if this is replaced by an alternate chain, it isn't built from scratch.
class CBlockIndex has pointers pprev and pnext that point to the parent block and the child along the main chain.  The main chain can be traversed forward and backward, but the entirety of the tree cannot.  I see no reason why this can't be changed.

2) There is also a need to add hashes (or other identifiers) of newly discovered off-chain blocks into each block ("newly discovered" here means: not included inside any other previously reachable block). Nodes must then request such blocks if they do not already have them, and the propagation mechanisms / inv messages need to be adjusted accordingly.
Yes, this would also need to be done in order to implement your proof-of-work difficulty-adjustment algorithm, right?  But it's not required for GHOST?  I have not started looking into how to implement this.  I am worried about the bloat that this would cause if the entire tree must be stored though.  Have you looked into how your algorithm would fare if we ignored off-chain blocks for difficulty-adjustment?

3) I'm not sure what you mean by "the code to invalidate and discard blocks". Can you explain in more detail?

I was originally under the impression that the parts of the blocktree that are not on the main chain get deleted from memory, but now I see that what I was looking at was code to invalidate blocks if the header is not correct.  I am not sure if the entire tree is kept in memory or if parts are eventually discarded.

By the way, I'm waiting to hear back from my friends before I email you back.

newbie
Activity: 21
Merit: 0
The way the client calculates the proof of work for each block is that it does the calculation only once.

Effectively, each block has a number which means "Proof of work for the chain, if this block was the leaf".

It also has a PoW for best chain variable.  If you add a block and it is higher than the best chain, then it has to set that chain as the best.

Under your system, when a new block is added, it would be more complex.

Code:
A <- B <- C  <- D
     \ <- C' <- D'
          \  <- D''

When comparing to D' to D, D' would get extra weight from D''.  However, when comparing D' to D'', it wouldn't get the extra weight.  There isn't one "work" number for each node.

You are right, it may not be as simple as just keeping the one number of "proof for the chain as if block was a leaf" but in principle the following approach works (and its refinement below is probably much more efficient):

Approach 1 (very naive):
1) For each block, save the total amount of work in its subtree. Let us denote this saved value by W(X)=total work of subtree rooted at block X.

In your example above: the subtree of block C includes blocks C,D. The subtree of block C' includes C',D',D''.
2) When a new block arrives, simply increment W(X) for all blocks X along the path to the genesis by the amount of work represented by this new block.
3) If the new block was off-chain, find its connection to the main chain and check if a change to the chain is needed.

(the 2nd step is computationally expensive so the following refinement can be done)

Approach 2 (Lazy evaluation):
The idea here is to still maintain values for total work in the subtree rooted at each block, but to do the updates only when required.
For each block on the main chain we maintain a lower bound on the work in its subtree, for each node off-chain we maintain an exact number. The idea being that off-chain branches die out after a relatively short while, and on-chain blocks that are sufficiently deep in the tree do not need to be examined often.

Invariants: W(X) for any block X that is off chain is exact, W(X) for any block X that is on the main chain is a lower-bound on the real value.

The updates:

When a new block is added:
1) If it extends the current chain, do not update W(X) for any block.
2) Otherwise, if it is off-chain then traverse its path (towards the genesis) until it meets the main chain, and increment the work W(X) of each subtree along the way accordingly. Also traverse the main chain up to this point and update W(X) for each node along the way.
3) check to see if a change of chain is needed. If it is, the off-chain values are exact and can be used to find the new best leaf by following the greedy path towards the leaves.

I hope I explained this well enough for you to at least get the basic idea... Please let me know if I didn't, or if you spot a mistake.
newbie
Activity: 21
Merit: 0
I've been reading through the bitcoin source code.  I'm interested in working with anyone who tries to implement this.  Here's what I've found:

All the relevant code seems to be in main.cpp and main.h.

As currently implemented, we can traverse the blocktree from leaves->root, but Aviv and Yonatan's algorithm requires us to be able to traverse the blocktree from root->leaves.  The first thing that needs to be done is that we need to rewrite the tree to make it traversable in both directions.

After this, the code to determine which block is the "BestBlock" to append a new child block to needs to be pretty much completely rewritten.  The code currently computes the ChainWork (the amount of work needed to compute the current block and all blocks above the current block on the tree) of each block in the tree.  It then searches through the entirety of (the still valid parts of) the tree for the block that has the largest ChainWork, and it assigns this block as the BestBlock.  Instead, we need to traverse the tree from root->leaves, and compare the work in each subtree whenever we come to a fork.

Hi, thanks for taking a look! I think it's a good idea to map out all the changes that need to take place.

Comments:

1) I might be wrong about this, because I haven't looked at the implementation too carefully, but in principle, I think forward and backward traversal of the chain should already be possible: as far as I'm aware, the unspent transaction output set needs to be rolled back if a recent part of the chain is replaced. If I'm not mistaken (and somebody please correct me if I am) the unspent outputs are only maintained for the "best block" and if this is replaced by an alternate chain, it isn't built from scratch.

2) There is also a need to add hashes (or other identifiers) of newly discovered off-chain blocks into each block ("newly discovered" here means: not included inside any other previously reachable block). Nodes must then request such blocks if they do not already have them, and the propagation mechanisms / inv messages need to be adjusted accordingly.

Also, the code to invalidate and discard blocks on the tree needs to be rewritten.  I'm lost on how this works and how it should be rewritten.

3) I'm not sure what you mean by "the code to invalidate and discard blocks". Can you explain in more detail?

legendary
Activity: 1232
Merit: 1094
The way the client calculates the proof of work for each block is that it does the calculation only once.

Effectively, each block has a number which means "Proof of work for the chain, if this block was the leaf".

It also has a PoW for best chain variable.  If you add a block and it is higher than the best chain, then it has to set that chain as the best.

Under your system, when a new block is added, it would be more complex.

Code:
A <- B <- C  <- D
     \ <- C' <- D'
          \  <- D''

When comparing to D' to D, D' would get extra weight from D''.  However, when comparing D' to D'', it wouldn't get the extra weight.  There isn't one "work" number for each node.
newbie
Activity: 24
Merit: 0
I've been reading through the bitcoin source code.  I'm interested in working with anyone who tries to implement this.  Here's what I've found:

All the relevant code seems to be in main.cpp and main.h.

As currently implemented, we can traverse the blocktree from leaves->root, but Aviv and Yonatan's algorithm requires us to be able to traverse the blocktree from root->leaves.  The first thing that needs to be done is that we need to rewrite the tree to make it traversable in both directions.

After this, the code to determine which block is the "BestBlock" to append a new child block to needs to be pretty much completely rewritten.  The code currently computes the ChainWork (the amount of work needed to compute the current block and all blocks above the current block on the tree) of each block in the tree.  It then searches through the entirety of (the still valid parts of) the tree for the block that has the largest ChainWork, and it assigns this block as the BestBlock.  Instead, we need to traverse the tree from root->leaves, and compare the work in each subtree whenever we come to a fork.

Also, the code to invalidate and discard blocks on the tree needs to be rewritten.  I'm lost on how this works and how it should be rewritten.
newbie
Activity: 21
Merit: 0
Is there an implementation of this that someone is working on? I'd like to help implement it

We'd be happy to cooperate with anyone who wants to have a go at an implementation. While we have poked around in the bitcoin source code a bit, I'm afraid we're not familiar enough with the code to pull this off easily ourselves.

I'm also very interested in establishing a testnet that will be stress tested with many transactions per second. If you have an idea on how to do this, I'd be happy to know.

Feel free to email me directly regarding any of these projects.
legendary
Activity: 1176
Merit: 1036
Dash Developer
Is there an implementation of this that someone is working on? I'd like to help implement it
hero member
Activity: 798
Merit: 1000
Your attitude is too cavalier... Just because you weren't supposed to know the non-public inputs doesn't mean that those inputs don't leak when you examine the SNARK proof, you need a zero-knowledge proof to guarantee that nothing leaks besides the output.
But for the last time, why do you care whether the proof is zero-knowledge or not in this context?

I think the distinction of must-not-have and not-necessarily-need of ZKP vs. SNARK is being lost here. Probably not necessary to keep asking this question.
sr. member
Activity: 360
Merit: 251
F(prev_utxo_tree_hash, transaction set for current block) = utxo_tree_hash

I meant that it'd probably have less complexity to do: F(prev_utxo_tree_hash, transaction set for current block, utxo_tree_hash) == 1

It's a great idea, but what does it got to do with zero-knowledge?

I suppose it's partially zero-knowledge as you wouldn't need all of the inputs to the program. I did read somewhere that the SNARK programs can have public inputs as well as arbitrary inputs which are unknown to verifiers. Is that right? Some details would be hidden to the verifier, such as the transactions.

Your attitude is too cavalier... Just because you weren't supposed to know the non-public inputs doesn't mean that those inputs don't leak when you examine the SNARK proof, you need a zero-knowledge proof to guarantee that nothing leaks besides the output.
But for the last time, why do you care whether the proof is zero-knowledge or not in this context?
newbie
Activity: 43
Merit: 0
Yes, very sad

I would love to find out what's holding up bitcoin coders from integrating this double spending alert which they talk about in section 8.5 ?
http://eprint.iacr.org/2012/248.pdf
Not that it would resolve the issue of compromised mining pool, but still, it will significantly reduce the issue.


We need a moderator to step in and remove all this offtopic.

Or at least split the topic.

gmaxwell, do it please.
Pages:
Jump to: