
Topic: BIP: Increasing the Network Hashing Power by reducing block propagation time (Read 6201 times)

newbie
Activity: 26
Merit: 9
What happened with this? This seems like a very good idea - and it also facilitates more power to node operators who, if they want, can more easily choose not to relay blocks containing very few transactions.
legendary
Activity: 1232
Merit: 1094
Bear in mind that nodes already tell each other about what transactions they have and their peers track that, via inv broadcasts and mapAlreadyHave.

If you have already sent a "template", then you can confirm a block by sending the 80 byte header and a 32 byte hash. 

To distribute a block with 2000 350 byte transactions requires

Full Block: 2000 * 350 + 80 = 700kB
Header + tx hashes: 2000 * 32 + 80 = 64kB + coinbase
Header + short hashes: 2000 * 4 + 80 = 8kB + coinbase
Template: 32 + 80 = 112 bytes + coinbase

If the coinbase was included in the template, then it is simply 112 bytes.

If the template is sent 16 times per block, this represents 16 * 32 * 2000 = 1MB of extra data.  This means roughly a 2.4X increase in network traffic.

If templates required 1/4 of the block difficulty, then only about 4 would be sent per block (≈ 256kB of extra data), roughly a 37% increase in network bandwidth.
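The arithmetic can be double-checked with a quick sketch (Python; figures as in the example above, coinbase ignored):

```python
# Size comparison for the propagation schemes above (bytes).
# Assumptions: 2000 transactions of 350 bytes, 80-byte header,
# 32-byte tx hashes, 4-byte short hashes; coinbase ignored.
N_TX, TX_SIZE, HEADER, HASH, SHORT = 2000, 350, 80, 32, 4

full_block   = N_TX * TX_SIZE + HEADER   # 700,080 B (~700 kB)
hash_list    = N_TX * HASH + HEADER      # 64,080 B  (~64 kB)
short_hashes = N_TX * SHORT + HEADER     # 8,080 B   (~8 kB)
template     = 32 + HEADER               # 112 B

# A 1/16 threshold means ~16 hash lists sent per solved block:
extra    = 16 * N_TX * HASH                    # ~1 MB of extra data
overhead = (full_block + extra) / full_block   # ~2.46x total traffic
```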

However, there is nothing preventing someone from reusing a template.  A miner who receives a template could verify it, and then just mine against it.
legendary
Activity: 1526
Merit: 1134
Bear in mind that nodes already tell each other about what transactions they have and their peers track that, via inv broadcasts and mapAlreadyHave.

The Bloom filtering infrastructure already uses this. If you set a no-op 0xFF filter on a connection, then as long as you've announced a transaction via an inv, it won't be sent to you again when a block is relayed. If you did NOT announce it, the peer will push the tx data to you automatically; you don't have to ask for it.

legendary
Activity: 1232
Merit: 1094
Not sure how best to do spam protection.  One way to do it would be to include a proof of work.  Effectively, prove that some hashing power was thrown away to produce the message.

After thinking about it, the same effect can be achieved by sending one of the messages in the OP.  If the PoW is at least 1/16 of the required PoW and the message is otherwise valid, forward it to any node that you haven't already sent a message with the same merkle root to.  This reduces spam, but still allows miners to propagate their previews.

The threshold could be tweaked, but if 1/16 is used, then 15 times out of every 16 times a miner wins a block, the miner will be able to send a preview of the transaction hashes in advance of actually solving the block.  1 time in 16, the first solution that is less than 16 times the target will also be less than the target.
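As a sketch (Python; hashes compared as integers against the target, with the 1/16 factor from above; names are illustrative, not actual protocol identifiers):

```python
# Sketch of the 1/16-difficulty preview rule described above.
# A message is relayable as a preview if its hash meets 1/16 of
# the block difficulty, i.e. is below 16x the full target.
PREVIEW_FACTOR = 16

def classify(block_hash: int, target: int) -> str:
    """Decide how a received header-plus-hash-list should be treated."""
    if block_hash <= target:
        return "full solution"           # a real block: relay as a block
    if block_hash <= PREVIEW_FACTOR * target:
        return "preview"                 # relay the tx-hash preview
    return "spam"                        # misses the relaxed target: drop

# 15 times out of 16, a hash meeting the relaxed target will not also
# meet the full target, so the miner usually announces the hash list
# before actually solving the block.
```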

The preview is 32 bytes per transaction.  If the average actual bytes per transaction is 350 bytes, then that is only a 10X improvement.

The short hashes idea would improve things.  If there are 100 million transactions in the memory pool, then you only need around 26 bits on average, so 4 bytes.  A miner could reduce the size of his pending block by using short hashes.  The rule could be to not propagate the message unless every short hash matches exactly one transaction in the memory pool.  The short-hash length would increase with the log of the number of entries in the transaction memory pool.
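The log scaling can be illustrated in a couple of lines (Python; a sketch of the sizing argument, not a collision-safe scheme):

```python
import math

# Sketch of the sizing argument above: the short-hash length grows
# with the log of the memory-pool size.  (A real deployment would
# want extra safety-margin bits and explicit collision handling.)
def short_hash_bytes(pool_size: int) -> int:
    bits = math.ceil(math.log2(pool_size))   # bits to index the pool
    return math.ceil(bits / 8)               # rounded up to whole bytes

# 100 million pool entries -> 27 bits -> 4 bytes per short hash
```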
legendary
Activity: 1232
Merit: 1094
A further improvement on this system would be to have miners pre-send their transaction lists.

When a miner decides which transactions to include in a block, he sends a "pending" message.  This is a list of hashes:

hash(coinbase)
hash(transaction1)
...
...
hash(last transaction)

When a client receives a pending message, it checks that it knows about all the transactions listed (except the coinbase).

It will then ask for any missing transactions in the list.  When it has all the transactions, it will then forward the pending message.

It won't forward the message if it contains unknown transactions.

The lookup table would map the merkle root to the transaction list.

Then when the miner solves the block he can send just the header.

When another client receives the header, it can pull the transaction list from the lookup table, using the merkle root as key.  It will also have asked for all missing transactions in advance.

This creates the incentive to propagate transactions before including them in blocks.  Otherwise your pending messages won't be forwarded.
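The forwarding rule above could be sketched as follows (Python; `request_txs` and `forward` are hypothetical stand-ins for the real network calls):

```python
# Sketch of the "pending" relay rule above: forward a hash list only
# once every listed tx (except the coinbase) is known locally.
def handle_pending(tx_hashes, merkle_root, known_txs, lookup,
                   request_txs, forward):
    # tx_hashes[0] is hash(coinbase), which nobody can know yet
    missing = [h for h in tx_hashes[1:] if h not in known_txs]
    if missing:
        request_txs(missing)           # fetch them, but do not relay yet
        return False
    lookup[merkle_root] = tx_hashes    # later: header + merkle root -> block
    forward(tx_hashes)                 # only complete lists are relayed
    return True
```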

Not sure how best to do spam protection.  One way to do it would be to include a proof of work.  Effectively, prove that some hashing power was thrown away to produce the message.
legendary
Activity: 1526
Merit: 1134
By the way, one possible elaboration of this idea is to have nodes send only the minimally required prefixes of the hashes in order to disambiguate transactions that it knows the peer has seen. For most blocks of current size you'd only need one or two bytes to distinguish the transaction that is included. It's probably not worth the additional complexity though.
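For illustration, the minimal disambiguating prefix could be computed like this (Python sketch; assumes the sender tracks exactly which txids the peer has seen, e.g. via inv exchange):

```python
# Sketch of the minimal-prefix idea above: send only enough leading
# bytes of a txid to make it unambiguous among the txids the peer
# is known to have seen.
def minimal_prefix(txid: bytes, peer_known: set) -> bytes:
    others = peer_known - {txid}
    for n in range(1, len(txid) + 1):
        prefix = txid[:n]
        if not any(o.startswith(prefix) for o in others):
            return prefix
    return txid   # a full hash is always unambiguous
```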
legendary
Activity: 1526
Merit: 1134
I don't remember what the issue was with Matt's code, just that there were some details that needed to be resolved in the code. It's one thing to say, another thing to do. But sure, it'd be a good improvement. Then we can tick it off the scalability page (it's listed there as a future improvement).

I hope you wouldn't submit a patch that adds a command only one letter different from an existing command? Wink
hero member
Activity: 555
Merit: 654
Writing BIPs like this is better done after the pullreq to go with it. Matt already prototyped this optimization some time ago but didn't finish it, there were some details that needed addressing to handle cases where the exchange stops half way through and so on.
My proposal is almost stateless. There are no unknown or intermediate states. A single decision is made when an "inv" arrives containing a block hash.

When somebody has implemented a patch to send blocks as hashes and it's passed code review, that's the time to write a BIP. For instance, you talk about a headers command, but that already exists and is used for something else (a response to getheaders).
Why write code if you know nobody will be using it? I prefer to know in advance.

The existing command is "headers", not "header".  Smiley
legendary
Activity: 1120
Merit: 1152

You have to be careful with transmitting transaction hash lists rather than the transactions themselves. While it definitely makes propagation faster in the average case, it also means that the worst case, a block entirely composed of transactions that have not been previously broadcast on the network, is significantly worse.

I don't think so. Since only the missing transactions are transmitted, the worst case is just like it is today. Maybe a little worse in the limit case where the decision to request individual txs instead of the full block was not a good one (many requests are made instead of a single one).


I don't mean worse than today, I mean worse than the average case.

Just implementing your idea is fine and probably should be done; the issue is purely in making assumptions, particularly security assumptions, about how fast blocks will propagate based on it.
hero member
Activity: 555
Merit: 654

You have to be careful with transmitting transaction hash lists rather than the transactions themselves. While it definitely makes propagation faster in the average case, it also means that the worst case, a block entirely composed of transactions that have not been previously broadcast on the network, is significantly worse.

I don't think so. Since only the missing transactions are transmitted, the worst case is just like it is today. Maybe a little worse in the limit case where the decision to request individual txs instead of the full block was not a good one (many requests are made instead of a single one).
legendary
Activity: 1232
Merit: 1094
What you are proposing is a relay rule, so the issue is really what % of hashing power doesn't get the block because of the rule.  But regardless, if only 25% of the hashing power ignores the violating block, the other 75% should include an unknown TX in every block they mine. They'll have 25% less competition - an obvious advantage.

It depends on how many miners use the rule.

If 75% won't include certain txs, then mining that tx type is a bad plan.

Also, even if only 25% refused to mine the txs, there is still a public goods problem for the majority coalition.  It might be in the interests of the coalition to include at least 1 discouraged tx, but it is still in the interests of each individual member to make his blocks acceptable to as many miners as possible.

Refusing to mine against certain blocks also won't cause a fork and the rule could be "soft" and only apply to the last block in the chain.
legendary
Activity: 1120
Merit: 1152
There needs to be some way for people to confirm that transactions are known.  Maybe the protocol rule could be changed to not propagate the message if any txs are unknown. This creates an incentive to only mine against well known transactions.

You need to be really careful with anything that discourages blocks. Any time a given rule for discouraging blocks is adopted by a minority, rather than a majority, the majority of hashing power that is not following that particular rule has an incentive to deliberately produce blocks that will be discouraged by the minority.

What you are proposing is a relay rule, so the issue is really what % of hashing power doesn't get the block because of the rule.  But regardless, if only 25% of the hashing power ignores the violating block, the other 75% should include an unknown TX in every block they mine. They'll have 25% less competition - an obvious advantage.
hero member
Activity: 555
Merit: 654
Quote
3. Compute the PoW and check if it's similar to the parent block PoW (+/- the possible expected variation). If it's not, then the command is ignored and the sender is blacklisted (DoS prevention).

The amount of work assigned to a header is a strict function of the prior chain. There is no permitted variation _at all_.

Ok, it's not clear. What I meant is that the parent block may have required a difficulty adjustment that should be taken into account.

Quote
1. Verify that the block parent is known. If it's not, then discard the command.
The parent must be fetched or the network will not converge.
No, that's not correct.

If the parent is known, then we keep propagating the "header" command. If it's unknown, then we just wait for the full "block" command to arrive.

A single decision is made whenever a new block X is advertised with "inv":

If I have a previous "header" description of block X, then evaluate whether it's better to download the full block or to download the missing transactions separately.
If you decide the former, proceed as normal.
If you decide the latter, then ignore this "inv" command forever, and store an entry in a memory table to reconstruct the block when all txs are received.


(If I have no "header" of X, keep as normal. )

This simplifies the implementation a lot. There is no race condition, no unstable state. Convergence is guaranteed.
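That single decision point could be sketched like this (Python; the names and the cost heuristic are illustrative, not the real client's logic):

```python
# Sketch of the single-decision rule above.  When an "inv" advertises
# block X: if we hold a prior "header" (hash-list) description of X,
# decide once between fetching the full block or only the missing txs.
def on_block_inv(block_hash, headers, mempool, pending,
                 fetch_block, fetch_txs):
    if block_hash in pending:
        return                        # decision already made; ignore inv
    desc = headers.get(block_hash)    # prior "header" hash list, if any
    if desc is None:
        fetch_block(block_hash)       # no preview: behave as today
        return
    missing = [h for h in desc if h not in mempool]
    # crude heuristic: if most txs are missing, one full-block request
    # beats many individual tx requests
    if len(missing) > len(desc) // 2:
        fetch_block(block_hash)
    else:
        pending[block_hash] = missing  # reconstruct once the txs arrive
        fetch_txs(missing)
```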



If you're going to go crazy with compression you could actually transmit only half the transaction IDs (16 bytes rather than 32) and half that, I suppose.
Yes, I had noticed that too.

legendary
Activity: 1232
Merit: 1094
Hmm, I have to say I don't really like this proposal. Orphaned blocks are not really wasted - if orphan rates were reduced, then the difficulty would be higher because more blocks would be solved, and that really doesn't change anything.

This means that the total proof of work increases, so the network is more secure.

The advantage is that it balances nodes that have slower propagation times.

However, it still means that mining a block that has a "hidden" transaction still gives the miner an edge.  Under the current system, they have to provide the entire transaction to get the network hashing power to switch.

There needs to be some way for people to confirm that transactions are known.  Maybe the protocol rule could be changed to not propagate the message if any txs are unknown. This creates an incentive to only mine against well known transactions.

I wonder if this could be combined with Bloom filtering from BIP-37 for notifying spent transaction outputs (TXOs).

Basically, with bloom filtering, each "object" sets a number of bits (say 4) based on a hash function.  Each TXO matching an input into the block would set 4 bits in the filter output.

Unless all 4 bits matching a TXO are set, you are guaranteed that TXO is not in the block.  However, false positives can occur.

If a block has 1000 transactions, each with an average of 3 inputs, then that is 3000 TXOs that are invalidated.  This means that up to 12000 bits would be set in the filter output.  The filter length could be decided based on the number of TXOs and bits.

If the filter was 24000 bits long (so double the expected number of set bits) and used 4 bits per TXO, then the odds of a false positive are (0.5) ^ 4 = 0.0625.

24000 bits is around 3kB.

However, Bloom filter length scales directly with the number of TXOs.

[Edit]

It is probably best to have a Bloom filter per tx.  That way individual txs can be checked.  With 4 bytes per filter, you get around a 6.25% chance of a false positive.

This allows nodes to verify individual txs.
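A minimal Bloom-filter sketch matching the arithmetic above (Python; the hash construction here is illustrative - BIP 37's real filters use MurmurHash, not SHA-256):

```python
import hashlib

# Minimal Bloom-filter sketch: 4 bit positions per item, so with
# half the filter bits set the false positive rate is (1/2)^4 = 6.25%.
K_BITS = 4

def bit_positions(item: bytes, filter_bits: int):
    # Derive K_BITS positions by hashing the item with a counter prefix.
    for i in range(K_BITS):
        digest = hashlib.sha256(bytes([i]) + item).digest()
        yield int.from_bytes(digest[:4], "big") % filter_bits

def add(bloom: bytearray, item: bytes) -> None:
    for pos in bit_positions(item, len(bloom) * 8):
        bloom[pos // 8] |= 1 << (pos % 8)

def might_contain(bloom: bytearray, item: bytes) -> bool:
    # All K_BITS positions set -> possibly present (false positives
    # occur); any bit clear -> guaranteed absent.
    return all(bloom[pos // 8] & (1 << (pos % 8))
               for pos in bit_positions(item, len(bloom) * 8))
```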
legendary
Activity: 1526
Merit: 1134
Is anything new required to find transactions accepted into the mempool? Any accepted transaction is announced via an inv. You can just connect to a local node you run yourself and then getdata any tx that is announced.

Matt probably has his half-finished code around somewhere. It'd be great to have it integrated into the core P2P protocol, even though I agree that there are lots of other protocols that might be useful ways to move data around. The one I'd want to do as a highest priority is a Bluetooth protocol for phones to swap transactions with each other, so if one is offline the other can broadcast the transaction for it. Perhaps even tunneling a real P2P connection via BT. We prototyped a very basic example of that at a Berlin hackathon using insecure Bluetooth (so there's no pairing UI overhead), it worked but wasn't robust enough to integrate.
member
Activity: 98
Merit: 10
Milkshake
Hmm, I have to say I don't really like this proposal. Orphaned blocks are not really wasted - if orphan rates were reduced, then the difficulty would be higher because more blocks would be solved, and that really doesn't change anything.
legendary
Activity: 1120
Merit: 1152
Two pull requests I would like to see, ones that make prototyping this stuff easier, would be "getrawblock/sendrawblock" RPC commands, and a "notifytransaction" mechanism.

The latter lets you find new incoming transactions that have been accepted to the mempool so you can inject them into your alternate distribution mechanism, with received transactions later added to the client mempool with sendrawtransaction (a flag that disables re-broadcasting would be nice).  It has already been attempted; I seem to remember there were some issues with locking that needed to be solved, but they looked tractable.

The former lets you do the same thing for entire blocks. (submitblock isn't quite what you want)

For instance a cool and potentially useful thing to do would be to operate a service that multicasts the blockchain via Amazon's Simple Messaging Service. You'd sign up, and get a blockchain feed for your EC2 node without having to actually connect to anyone else. EC2 bandwidth is fairly expensive, so costs would be significantly less, and it could scale to absolutely enormous numbers of clients. Of course, you are trusting the service, but for some applications it's not a big deal.

Another cool thing to do would be to implement multicast broadcasting of the blockchain and transactions. While not commonly available, multicast is available on a few networks, so there will be some people who can make use of it. Equally you could start up a satellite downlink service or radio service.

The point is, actually implementing those ideas will be much easier with those two pull requests.
staff
Activity: 4284
Merit: 8808
Writing BIPs like this is better done after the pullreq to go with it.
Doesn't even have to be a pull request. It could also be an alternative protocol and a little gateway daemon. Both as a proof of concept and as a rapid deployment tool.  We really could use some diversity in the P2P transports. As noted by Sergio Demian Lerner and others there are a lot of hairy DoS attack corner cases... we can't have really any diversity in the blockchain stuff, but we absolutely can and should have more in the node to node transport.
legendary
Activity: 1526
Merit: 1134
Writing BIPs like this is better done after the pullreq to go with it. Matt already prototyped this optimization some time ago but didn't finish it, there were some details that needed addressing to handle cases where the exchange stops half way through and so on.

When somebody has implemented a patch to send blocks as hashes and it's passed code review, that's the time to write a BIP. For instance, you talk about a headers command, but that already exists and is used for something else (a response to getheaders).
staff
Activity: 4284
Merit: 8808
You have to be careful with transmitting transaction hash lists rather than the transactions themselves. While it definitely makes propagation faster in the average case, it also means that the worst case, a block entirely composed of transactions that have not been previously broadcast on the network, is significantly worse.
The obvious thing to do in that case is to have the sender choose whatever representation it believes will be more efficient based on what it knows from the peer. (e.g. transactions it has previously offered the peer, or the peer has offered it).
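A sender-side cost comparison along these lines might look like the following (Python; the byte constants are the example figures used earlier in the thread, not measured values):

```python
# Sketch of the sender-side choice above: pick the cheaper encoding
# given which txs the peer is known to have (from prior inv exchange).
HEADER, TXID, AVG_TX = 80, 32, 350

def cheaper_as_hash_list(tx_ids, peer_known) -> bool:
    missing = [t for t in tx_ids if t not in peer_known]
    full_block = HEADER + AVG_TX * len(tx_ids)
    hash_list  = HEADER + TXID * len(tx_ids) + AVG_TX * len(missing)
    return hash_list < full_block
```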

In the p2pool case, p2pool miners are incentivized to pre-forward the transactions they are mining by the fact that their shares are punished otherwise (nodes attempt to build off their parent instead of the share). Though I don't think that really usefully applies to the Bitcoin network.