Hi all,
I'm a researcher at the Hebrew University in Israel, and I've been working with a student (Yonatan Sompolinsky) on a Bitcoin related paper. We have really exciting results that we are eager to share with the Bitcoin community. We would really like to get constructive feedback on our work from the many smart people in the community, which is why I am posting here.
Here is a link to the full paper:
http://www.cs.huji.ac.il/~avivz/pubs/13/btc_scalability_full.pdfTitle: Accelerating Bitcoin's Transaction Processing (Fast Money Grows on Trees, Not Chains)
Edit: Thanks for all the good feedback! Recap of main issues and questions added below.As the paper is quite lengthy, and is written for an academic audience (which, sadly, is not usually familiar with the protocol) we thought it best to also provide a quick explanation of our results aimed at Bitcoiners:
tl;dr: We suggest a protocol modification to the block chain that securely allows blocks to be generated around once per second, can handle over 200 transactions per second at these rates, and consumes under 0.5 MBps in terms of bandwidth (less at lower rates than 200 TPS). All of this with no increased susceptability to 50% attacks. This essentially solves the problem that caused Satoshi to set the 10 minute target for the block creation rate. We also analyze the number of transactions per second Bitcoin can handle with and without our modification. We note that block propagation times are the primary obstacle for scalability.
A more detailed explanation follows bellow. The primary goal of our research is to address Bitcoin's ability to process transactions quickly, and in large volumes. Here are our main findings:
Scalibility, Delays & Security:
We begin our paper by examining the exact effects of high transaction rates on Bitcoin's security (Following in the footsteps of previous work by Decker and Wattenhofer). The number of transactions per second (TPS) that Bitcoin can handle is limited by two main things: 1) The block creation rate (of 1 block every 10 minutes) and 2) the block size limit (currently at a 1MB default). These two parameters combine to limit the number of transactions per second that Bitcoin can process. The straightforward way to increase the TPS is to either increase the block size, or to increase the block creation rate. Both of these changes are controversial, and for good reason: both may affect the security guarantees of the protocol. First, let us consider an increase in the number of blocks per second (e.g., Litecoin's blocks that are created every 2.5 minutes, or even Fastcoin with its extreme 12 second blocks). Because blocks are created quickly, many conflicting blocks are created. Most will end up off the blockchain. The same symptom occurs when the block size is increased: large blocks take longer to propagate through the network (due to bandwidth limitations) and blocks that are created in the meantime are likely built on top of older blocks, i.e., they will be wasted.
The fact that many blocks are wasted lowers the security of the network and makes it more susceptible to 50% attacks. For example, if half the blocks are wasted in this manner, the network essentially wastes half its hash power on blocks that do not contribute confirmations to transactions. An attacker which is centralized and has no delays, can execute a so-called 50% attack with only slightly more than 33% of the hash power. This is because it can easily create longer chains than the rest of the network (botnets that still suffer from the effect of internal delays are less effective than centralized attackers).
Using different techniques, we analyze how many blocks end up in the chain, and how many are discarded, and use this to estimate the change in security for different parameters. Among other results, we show that transmitting blocks that contain only transaction hashes (instead of full transaction records) will greatly help scalability (i.e., this is not just a 2-fold saving in bandwidth, but rather a 16-fold increase in the number of transactions per second!).
Our suggested protocol modification (which appears in section 8 of our paper):Since high transaction rates imply many conflicting blocks are created, it would be quite useful if these blocks were not really lost. In fact, each block can be seen as supporting not just transactions inside it, but also those embedded in previous blocks. Even if a block is not in the main chain, we can still count the confirmations it gives previous blocks as valid. This is the basis of our proposed modification, which we call the "Greedy Heaviest-Observed Sub-Tree" chain selection rule.
Roughly speaking, since each block contains a hash of its predecessor, all blocks form a tree structure, rooted at the Genesis Block. Bitcoin currently selects the accepted history as the longest (or rather heaviest) chain in the tree. We suggest another approach: At each fork, pick the sub-tree containing the most blocks (or rather the blocks with greatest combined difficulty). Do this repeatedly until reaching a leaf. The path traversed is the chain that nodes should accept. But how does this help? Notice now, that an attacker that wishes to change the main-chain selected by the algorithm needs to make us change our decision in one of the forks. To do so, he needs to build more blocks than are contained in the entire sub-tree (and not just more blocks than are contained in the longest chain!).
Here is the pseudo-code of the GHOST chain selection rule:
1. SET B <- Genesis Block.
2. IF B has no successors: RETURN(B).
Else: SET B <- Child of B with heaviest sub-tree.
3. GOTO 2
The cost of such a modification: At low block creation rates, and small block sizes there is almost no difference between the longest-chain rule and the GHOST-rule. There is no cost. Both are almost identical since the longest chain in these cases is also the heaviest subtree. In high transaction rates, GHOST builds slightly less blocks in its main-chain (because it doesn't always extend the longest chain), thus slightly lowering the number of transactions accepted per second, but it does so more securely! Delays and many off-chain blocks no longer make it more susceptible to 50% attacks. This implies that we can increase the block creation rates and block size to levels that were previously too risky and easily make up for the loss in transaction volumes. In fact, we estimate that 1 second blocks can be easily combined with rates of over 200 TPS. This implies quite fast authorization times, but much more importantly, an increased granularity in confirmations. Even 1 confirmation gives some level of certainty regarding irreversibly, and it comes almost instantly when blocks are generated every second.
Since Bitcoin's security relies primarily on the number of confirmations received instead of on elapsed time, we end up getting irreversibility of transactions with very high probability in far less than 10 minutes.
Recap of main issues and questions raised:I'd like to clarify one thing: We do not claim to be able to solve every problem one might think of with regards to Bitcoin's various aspects (incentives, mining centralization, etc.) so I think we are criticized for problems that are already there in the protocol. Our main claim is that security gets very bad if high transaction rates occur, and the GHOST modification fixes that (and that alone). If you are uncomfortable with 1 second blocks, the modification is equally valuable at lower block rates. The need to address high transaction rates is still there.
Some specific comments:
1) Advantage of nodes with low latency. Nodes that can reach the rest of the network quickly have more blocks on the main chain and less orphans.
Answer: This is a serious problem that the current bitcoin protocol will face if high transaction rates (i.e., large block sizes) are sent through the system. Strong miners will be able to get more than their proportional share of the blocks. We don't improve things, but the GHOST modification doesn't hurt either. The only improvement that we offer, is that the 50% attack does not get worse in these scenarios.
2) DDOS attack: Malicious nodes can mine blocks above the genesis block at difficulty 1 and spam the network
Answer: This too as a problem that the current bitcoin protocol faces. It's why checkpoints are constantly put in place. (anyone can currently claim to have a chain with harder proofs of work, and just start sending difficulty 1 blocks in a long chain that never ends. See here for gmaxwell's explanation:
https://bitcointalksearch.org/topic/m.2014204.
Checkpoints can also be applied to our solution.
Edit: several people have noted that there are other solutions besides checkpoints (not currently implemented). We'll need to see if these can be adapted here as well. I currently have no reason to believe this won't be possible.
3) SPV clients need more bandwidth + storage if there are blocks every second (given that there will be 600 times as many block headers to download).
Answer: This is true. But the fix should probably be to find better mechanisms for the SPV nodes. For example, it should be possible to probabilistically verify a long chain of block headers without actually downloading all of them. Here's a basic idea: Have the full node build a merkle tree of all blocks +difficulties, and send it to the SPV client. The SPV client picks blocks at random from the chain (weighed by the difficulty the should have) and requests merkle branches to them (from the root it had sent) + checks their difficulty. just a few checks would be enough to verify that there is indeed enough work stored in the chain (with very high probability).
4) Implemenation: How would nodes know about orphaned blocks if they were offline for a while?
Answer: We suggest that each block will also contain hashes of off-chain (orphaned) blocks in its header (only those not written in blocks it built upon). This way, the main chain also contains a record of all off-chain blocks, so it knows what to ask for. Another reason we think it is a good idea: We think difficulty should be retargeted according to the total number of blocks built per second (incl. orphans), not based on the length of the longest chain (retargetting for that is harder at high rates).
5) Storage issue: a lot of space will be needed
Answer: Solutions like the mini-blockchain are still applicable, and allow you to keep only a record of recent events. Still there is no getting around the fact that high transaction rates will require a lot of storage (It's not easy scaling to the size of Paypal / VISA). Again, this is not caused by our modification. You do not need to keep the contents of orphaned blocks -- only their headers to verify difficulty. Having more blocks in the chain per second merely implies each one of them will have fewer transactions inside it. The only added cost is keeping more headers.