Pages:
Author

Topic: New paper: Accelerating Bitcoin's Trasaction Processing (Read 36374 times)

full member
Activity: 1092
Merit: 227
I am seeing the paper in layman’s view since I’m not that too technical. However I have been involved in one thread in regard to understand why the block size has been kept limited to 1MB or 3MB or something else. Now here in your paper I read that you can actually modify the block size and include higher number of transactions. Before this paper my understanding was the block was limited in the size as well as it can never be touched since the time Satoshi released it.

But is this happening any soon or does this paper suggest the scalability issue and how it can be resolved “if” this is implied.
newbie
Activity: 2
Merit: 0
i will be carefully monitoring this technology; if it proves to be successful on paper and test runs, and bitcoin developers fail to implement this, i'll be investing in alt-coin which uses it.
In the end, this proposal did not receive recognition from bit development. He now has an alternative version called Kaspa, which implements the extension of the BTC consensus. It is Kaspa, created by avivz78 and his student Yonatan Sompolinsky. A single block of 1s will quickly achieve 32bps, complete decentralization, and pow coins.
member
Activity: 114
Merit: 12

A fairly broken, over-complicated, brittle version of 2-of-2 signing.

Why not just use GA.it or something similar? Same benefits(required some trust), with no crazy systematic risks.

(I know why but it's rhetorical. Any decentralized system that claims secure 0-conf without trust is lying.)
sr. member
Activity: 433
Merit: 267
Is there a better description about what it solved exactly? Instant worldwide consensus is impossible. Running two nodes on the same computer and getting fast confirmation times by jacking up the block times gets a person a first class ticket to Facepalm Town.

Update: Alright, this is what he's talking about;
https://www.darkcoin.io/about/what-is-darkcoin/instant-transaction-technology/

So Darkcoin uses the idea of "locking" a transaction by "masternodes", and the masternodes can somehow give a user some kind of confidence that the transaction isn't being double spent.

The exact same problems that you have with consensus otherwise would exist with "locks", unless I'm missing something.

Under 4.3 I would like to see a better defense of "This would happen very quickly, in the matter of a few seconds in most cases."
Is this because of highly centralized "Masternodes"?
hero member
Activity: 530
Merit: 500
legendary
Activity: 1232
Merit: 1094
I think that re-requesting any transactions is too costly, so we should work out the prefix size so that there are "probably" no collisions, but only just. Someone needs to do the math.

Right.  The prefix length depends on the the size of the memory pool and the number of transactions in a block.

Assume there are 100k entries in the memory pool and 4k transactions per block.

If you have 4 bytes per transactions, then the odds of a transaction of a transaction matching one of the transactions in the memory pool is (100k * (1/pow(2, 32)) = 1 in 43,000.

If there are 4000 transactions in the block, then at least one of them will collide every 43000/4000 = 10.75 blocks, which is probably acceptable.

However, there is no clear definition of the size of the memory pool.  When a peer requests a block, it could specify how many bytes per transaction.  The reply would use at least that many bytes per entry.
hero member
Activity: 527
Merit: 500
Would you be including the prefixes of the merkle tree nodes too? These would serve as a checksum, so the receiver would at least know which transaction has the colliding prefix. In fact, with the tree nodes included, you could be much less conservative with the prefix length, sending 2 or event 1 byte(s).

Collisions could also be handled by having a re-send system.

Hashes were grouped together into 32 hash groups and the merkle root for those hashes.

The sender could estimate how much of a prefix is required.  If 4 bytes was required, then a group of 32 hashes would require (32 * 4 + 32) bytes = 5 bytes per hash.

The receiver could replace the 32 hashes with its best guess and then check the CRC/Merkle root.  If it doesn't match, then it could ask for those hashes in expanded form.

Assuming a block with 4000 transactions in a block.

Peer 1: Sends block with 5 bytes per hash.

This works out at 20kB

Peer 2: Checks block and finds 2 groups with a mismatch

Peer 2: sends tx request

byte[32]: Block hash
varInt: group count
int: index 1
int: index 2

This works out as 41 bytes

Peer 1: Sends full transaction hashes for those blocks

This works out at 64 * 32 = 2048 bytes

This is much less than 4000 * 32 = 128kB.  However, it is at the cost of an additional round trip delay.

If each compressed hash was prefixed by a length, then the sender could try to estimate which hashes require longer prefixes.

The birthday paradox may cause potential issues here.

This "groups" idea is essentially the same thing as the merkle tree, if you make the group size 2. With smaller group size, you have to resend less transaction hashes in the case of collision. Also, the birthday paradox is less of an issue; you could use smaller prefixes.

Also, consider that we can compress the merkle nodes. Collisions in the nodes can be checked with the parent nodes all the way up the tree, but it becomes exponentially less likely that you'll need to resend a branch as you move up the tree. The advantage of this hierarchical checksum is that you can make the prefixes, of the nodes and the transaction hashes, much smaller.

I think that re-requesting any transactions is too costly, so we should work out the prefix size so that there are "probably" no collisions, but only just. Someone needs to do the math.
legendary
Activity: 1232
Merit: 1094
Our proposal includes keeping the hashes of off-chain blocks in each block (new ones not included in any of the ancestors of the block). The chain, when read sequentially therefore contains all off-chain block hashes, and thus any node can know the entire tree (if a node is missing some block it simply requests it or its header).

Interesting.

That eliminates the need for each sub-tree to be evaluated.  The weight for each block can be directly calculated purely from within the block.

You would need to embed a sub-header into the coinbase (or OP_RETURN tx).

Each node would have to remember all headers from orphaned blocks that are referenced.  A block would be invalid if it re-references a header.
member
Activity: 70
Merit: 10
It would quickly lead to effect like I described above, which would break the system. Soon all mining nodes would be at one place and all other nodes have no incentive to participate or believe the information is correct. So its more than just connectivity. At least the hashing power is distributed in the sense of geolocation, as delays are not an issue. I believe a careful analysis would show that 10 minutes is a well chosen factor, depending on variance of propagation in relation to proof of work time. I'm not sure if somebody has done that analysis, but I think it's more than likely satoshi thought about this in these terms. The 10 minute treshold takes care of all eventualities.
sr. member
Activity: 360
Merit: 251
Most mining now is done in certain hot spots, such as the big mine of Iceland [1]. Well, now you have the effect that any other mining is more profitable in Iceland, and miners should move to Iceland because delays are shorter and the heaviest subtree is located there. Simply by being located there you are nearest to the heaviest subtree (assuming that this miner is the biggest). That would introduce the effect that mining would quickly would be located in one or two places, when in fact we want the exact opposite.

Yes, if some miner nodes have better network connectivity to large portions of the total hashpower, then a shorter blocktime will benefit them more. The analysis in the GHOST paper doesn't take into account scenarios in which network connectivity among the miner nodes is unequal.
member
Activity: 70
Merit: 10
Response to my critique of GHOST.

Quote
Huh? GHOST isn't different from Bitcoin in this regard.

No, not at all. The essence of bitcoin is: each node works on PoW independently. That's also why there are discrete blocks. As you said bitcoin ignores other nodes PoW - but why? The reason is that once a node requires information from other nodes (during working one block by default) you get an unstable network, because it takes time to transmit information. For example a node could run his own implementation and try to find nodes that do more PoW and disregard other weaker nodes. As soon as you introduce as an input the work of others, one node has to know about all(!) other nodes, so that the calculations are fair. One could argue it is a random distribution, but that assumption clearly doesn't hold. Ideally all peers are equal, but hashing power is not even. The bitcoin algorithm would be much better if it could somehow force nodes to be of equal hashing power. That's whats meant by the "proof-of-work is essentially one-CPU-one-vote" comment in the paper.

To be more clear, by way of practical example. If bitcoin would use a ghost like rule today.. Most mining now is done in certain hot spots, such as the big mine of Iceland [1]. Well, now you have the effect that any other mining is more profitable in Iceland, and miners should move to Iceland because delays are shorter and the heaviest subtree is located there. Simply by being located there you are nearest to the heaviest subtree (assuming that this miner is the biggest). That would introduce the effect that mining would quickly would be located in one or two places, when in fact we want the exact opposite. I suppose the reason that this entirely non-obvious is that people intuitively disregard the latency of speed of light. The other problem is how one looks at robustness of statistics. Traditional mathematics is very bad at understanding robustness, especially in economics. The Ghost paper claims to "prove" attacks are not possible. Note that the original bitcoin paper doesn't do that.

Why is the block time chosen to be 10 minutes? The reason is that it takes time to transmit information through a network. One has to really take into account that TCP/IP packets are not point to point connections. So the GHOST paper uses averages of delay times, but doesn't consider variance. The long 10 minutes is a safeguard against such geo-distribution effects. For example in theory people with bad interconnections can mine as well. The 0.5MB/sec throughput mentioned in the paper is not even achieved in developed countries, and so this leads to some biases the original design very thoughtfully avoids (which is why it is so stable).

[1]
http://dealbook.nytimes.com/2013/12/23/morning-agenda-the-bitcoin-mines-of-iceland/?_r=0
newbie
Activity: 21
Merit: 0
Could someone explain to me how the nodes should select the heaviest subtree, when they don't have that information? It requires information from all(!) other nodes. How can this even make sense, because one node is connected to say 8 peers and not the entire network?

Block headers could be broadcast for orphaned blocks.  However, it is possible that 2 nodes will have different headers accumulated.

With a linear blockchain, proof of work is always exactly agreed by all miners.

Since all block headers are linked.  All nodes can agree on the proof of work linked to a particular header.  If a node was missing any blocks on that chain, then it wouldn't be able to form the chain at all.

With GHOST, 2 nodes could disagree about the proof of work for a particular chain.  There is no way to be sure that you have all the orphans.

Very simple:

Our proposal includes keeping the hashes of off-chain blocks in each block (new ones not included in any of the ancestors of the block). The chain, when read sequentially therefore contains all off-chain block hashes, and thus any node can know the entire tree (if a node is missing some block it simply requests it or its header).
legendary
Activity: 1232
Merit: 1094
Would you be including the prefixes of the merkle tree nodes too? These would serve as a checksum, so the receiver would at least know which transaction has the colliding prefix. In fact, with the tree nodes included, you could be much less conservative with the prefix length, sending 2 or event 1 byte(s).

Collisions could also be handled by having a re-send system.

Hashes were grouped together into 32 hash groups and the merkle root for those hashes.

The sender could estimate how much of a prefix is required.  If 4 bytes was required, then a group of 32 hashes would require (32 * 4 + 32) bytes = 5 bytes per hash.

The receiver could replace the 32 hashes with its best guess and then check the CRC/Merkle root.  If it doesn't match, then it could ask for those hashes in expanded form.

Assuming a block with 4000 transactions in a block.

Peer 1: Sends block with 5 bytes per hash.

This works out at 20kB

Peer 2: Checks block and finds 2 groups with a mismatch

Peer 2: sends tx request

byte[32]: Block hash
varInt: group count
int: index 1
int: index 2

This works out as 41 bytes

Peer 1: Sends full transaction hashes for those blocks

This works out at 64 * 32 = 2048 bytes

This is much less than 4000 * 32 = 128kB.  However, it is at the cost of an additional round trip delay.

If each compressed hash was prefixed by a length, then the sender could try to estimate which hashes require longer prefixes.

The birthday paradox may cause potential issues here.
hero member
Activity: 527
Merit: 500

One thing I note is that there are lots of ways to optimise block wire representations. Sending blocks as lists of hashes, for example, would use 32 byte hashes in our current code just because it's lazily reusing the Bloom filtering path. But 32 bytes is massive overkill for this. You only need to distinguish between transactions in the mempool. You could easily drop to 16, 8 or even 4 byte hashes (hash prefixes) and still be able to disambiguate what the peers block message contains very reliably. Indeed they could be varint encoded so the hash is only as long as it needs to be given current traffic loads. Malicious actors might try to create collisions to force block propagation times up, but if peers negotiate random byte subsets upon connect for relaying of block contents then such an attack becomes impossible.


I thought of this also. How do you deal with collisions, though? What if the receiver of a hash list knows a transaction whose prefix collides with one in the list, but isn't in the list itself?

Would you be including the prefixes of the merkle tree nodes too? These would serve as a checksum, so the receiver would at least know which transaction has the colliding prefix. In fact, with the tree nodes included, you could be much less conservative with the prefix length, sending 2 or event 1 byte(s).

Perhaps that makes no sense. I'm pretty noobish with the protocol specifics.

I'm very interested in reducing block propagation times because I believe slow times will eventually drive transaction costs to uncompetitive levels. This is due to the inherent cost of "orphan risk" which comes from including transactions. see here https://bitcointalksearch.org/topic/m.3669703.



legendary
Activity: 1232
Merit: 1094
It was probably meant as they give the same answer as things are at the moment, but they wouldn't if the number of blocks per unit of time would be increased significantly.

Right, with a 10 minute block time, there are so few orphans that there is total agreement.

If the block time was 1 second, then there would be a huge number of orphans.

Some system for quickly finding the leaf block is important.  Each time a header is added, you have to update an entire tree.

An analysis of the optimal strategy for miners under GHOST would be important.

For bitcoin, broadcasting new blocks as fast as possible is optimal.  Is that still true for GHOST.

With GHOST, there would be lots of ties and near ties, so colluding to break ties in favour of a group may give those in the group an advantage.
sr. member
Activity: 430
Merit: 250
In practice, this isn't really a problem, because the 2 rules almost always give the same answer.

If this was true then using GHOST with all its added complexities would be pointless.
It was probably meant as they give the same answer as things are at the moment, but they wouldn't if the number of blocks per unit of time would be increased significantly.
sr. member
Activity: 360
Merit: 251
In practice, this isn't really a problem, because the 2 rules almost always give the same answer.

If this was true then using GHOST with all its added complexities would be pointless.
legendary
Activity: 1232
Merit: 1094
Could someone explain to me how the nodes should select the heaviest subtree, when they don't have that information? It requires information from all(!) other nodes. How can this even make sense, because one node is connected to say 8 peers and not the entire network?

Block headers could be broadcast for orphaned blocks.  However, it is possible that 2 nodes will have different headers accumulated.

With a linear blockchain, proof of work is always exactly agreed by all miners.

Since all block headers are linked.  All nodes can agree on the proof of work linked to a particular header.  If a node was missing any blocks on that chain, then it wouldn't be able to form the chain at all.

With GHOST, 2 nodes could disagree about the proof of work for a particular chain.  There is no way to be sure that you have all the orphans.

In practice, this isn't really a problem, because the 2 rules almost always give the same answer.

If you did push the the block rate to a point where network latency is a problem, you could have 2 chains where there is disagreement.

Ensuring that all headers are well distributed between nodes is critical.  Disagreements are not likely to last long anyway.
sr. member
Activity: 430
Merit: 250
Could someone explain to me how the nodes should select the heaviest subtree, when they don't have that information? It requires information from all(!) other nodes. How can this even make sense, because one node is connected to say 8 peers and not the entire network?

Blocks are propagated across the network just like transactions are, so where exactly do you see a problem?
member
Activity: 70
Merit: 10
It's done my magic. All the magic nodes transmit information faster than speed of light. They connect to parallel universes. Each universe is a uni-node, with probability of exactly 1/n. Due to quantum emission wormholes eliminate cost of friction, so miners can get the coinbase reward. IP address assignment by the federal reserve eliminates the so called -31.23% attack. I can prove this theorem using advanced stochastical analysis, but only for lambda smaller than 0.03.
Pages:
Jump to: