
Topic: O(1) block propagation (Read 5567 times)

legendary
Activity: 1078
Merit: 1006
100 satoshis -> ISO code
November 26, 2014, 02:16:16 PM
#30
Indeed. Your results look very good.

However, the first real-world IBLT implementation that Gavin wants to consider leverages Matt Corallo's relay node software, which already achieves nearly an order-of-magnitude reduction in propagated block sizes for nodes that use this service.

So the data is no longer raw transactions, but smaller transaction hashes.
https://github.com/TheBlueMatt/RelayNode/blob/master/c%2B%2B/server.cpp

newbie
Activity: 6
Merit: 0
November 24, 2014, 01:22:35 PM
#29
I have done some statistical tests on encoding/decoding blocks in IBLT. Apart from my previous tests that tried to find reasonable values for valueSize and hashFunctionCount, I have also done tests that plot cellCount vs failure probability and diffCount vs failure probability. Please have a look at https://github.com/kallerosenbaum/bitcoin-iblt/wiki.

Conclusions:

1. 64 bytes looks like a good valueSize
2. Space savings seem to increase as diffCount increases. This is of course based on the assumptions explained in the wiki.
3. k=3 seems to be the best hashFunctionCount no matter how you look at it.
legendary
Activity: 1078
Merit: 1006
100 satoshis -> ISO code
November 20, 2014, 07:56:35 PM
#28
Thanks for the info Gavin. I will learn about how the relay nodes work and see whether I can add value. There are, of course, a bunch of reasons why this is the best approach.
legendary
Activity: 1652
Merit: 2301
Chief Scientist
November 20, 2014, 03:45:23 PM
#27
If you want to work on IBLT stuff...

... start with Matt's fast-relay code: https://github.com/TheBlueMatt/RelayNode

That is an "I know what I've already told my peers, so I won't tell them again" optimization for transaction data. I haven't tried to figure out how far that already-written-and-running code lets us scale, but I think that would be the first step.

Once you understand what Matt is doing, then figure out how an IBLT can further optimize to eliminate sending even lists of transaction IDs. The first step there is to figure out what software miners are using to build their blocks, and how hard it would be to get that software to do the IBLT thing (have similar policies for selecting transactions, and identical policies for ordering transactions inside the block).
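
To make the ordering half concrete: if every miner derives the in-block order from the transaction data itself, then any two peers that agree on the transaction set automatically agree on the order, and nothing extra needs to be transmitted. A minimal sketch (my own illustration, not anything Gavin or Matt has specified):

Code:
import hashlib

def txid(raw_tx: bytes) -> str:
    # Double-SHA256 of the serialized transaction, shown big-endian as usual.
    return hashlib.sha256(hashlib.sha256(raw_tx).digest()).digest()[::-1].hex()

def canonical_order(raw_txs):
    # One possible deterministic rule: sort by txid. Any rule would do,
    # as long as every miner applies exactly the same one.
    return sorted(raw_txs, key=txid)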

member
Activity: 114
Merit: 12
November 20, 2014, 09:31:18 AM
#26

Since only miners create IBLT blocks, but all nodes would need to process them, it seems the best implementation method is to split the IBLT software into two parts:

I might be wrong, but in the short term it's not as big a deal that regular nodes get this IBLT stuff. Full nodes care much less about ~15 seconds of latency.

Long term it might make sense.
legendary
Activity: 1078
Merit: 1006
100 satoshis -> ISO code
November 19, 2014, 04:37:56 PM
#25
I can't get unexcited about IBLT, so I just want to jot down my recent thoughts here. Feedback always welcome :-)

Since only miners create IBLT blocks, but all nodes would need to process them, it seems the best implementation method is to split the IBLT software into two parts:

a) IBLT encode
b) IBLT decode (set reconciliation)

The idea is to get the decode software widely adopted first, by giving it a head start. It gets included with version 0.n, while the encode logic is delayed until version 0.n+x, whatever later version is deemed appropriate, i.e. when an arbitrarily large majority of nodes have the decode software.

A mining consensus would also be needed, probably requiring a new block version to indicate that decoding is supported, with encoded blocks only becoming acceptable once a super-majority exists for that block version.

If the decoding is as generic as possible, such that basic element dimensions (below) are parameters at the start of each IBLT, then the optimum values of these do not need to be agreed in advance, and may be in flux for a long time:
keySize
valueSize
keyHashSize
Number of cells
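
To make those dimensions concrete, here is a toy sketch of both halves, with every dimension carried as a parameter rather than hard-coded. All names, sizes and defaults are illustrative guesses, not a proposed format, and a real design would also have to handle transactions larger than one value slot.

Code:
import hashlib
from dataclasses import dataclass

@dataclass
class IbltParams:
    key_size: int = 8         # bytes of (truncated) txid stored per cell
    value_size: int = 64      # bytes of tx data per cell; 64 looked good in Kalle's tests
    key_hash_size: int = 4    # per-cell checksum over the key
    cell_count: int = 1024
    hash_count: int = 3       # k = 3

@dataclass
class Cell:
    count: int = 0
    key_xor: bytes = b""
    key_hash_xor: bytes = b""
    value_xor: bytes = b""

def _xor(a: bytes, b: bytes) -> bytes:
    n = max(len(a), len(b))
    a, b = a.ljust(n, b"\x00"), b.ljust(n, b"\x00")
    return bytes(x ^ y for x, y in zip(a, b))

def _indices(key: bytes, p: IbltParams):
    # Derive hash_count distinct cell indices from the key
    # (assumes cell_count >= hash_count).
    idxs, i = [], 0
    while len(idxs) < p.hash_count:
        h = int.from_bytes(hashlib.sha256(bytes([i]) + key).digest()[:4], "big") % p.cell_count
        if h not in idxs:
            idxs.append(h)
        i += 1
    return idxs

def new_table(p: IbltParams):
    return [Cell() for _ in range(p.cell_count)]

def insert(table, key: bytes, value: bytes, p: IbltParams, sign: int = 1):
    # (a) encode: add (sign=+1) or subtract (sign=-1) one transaction.
    # For set reconciliation, the receiver subtracts everything it already
    # has from the received table and then peels the remainder.
    key = key[:p.key_size]
    value = value[:p.value_size].ljust(p.value_size, b"\x00")
    kh = hashlib.sha256(key).digest()[:p.key_hash_size]
    for i in _indices(key, p):
        c = table[i]
        c.count += sign
        c.key_xor = _xor(c.key_xor, key)
        c.key_hash_xor = _xor(c.key_hash_xor, kh)
        c.value_xor = _xor(c.value_xor, value)

def peel(table, p: IbltParams):
    # (b) decode / set reconciliation: repeatedly find a "pure" cell (count of
    # +1 or -1 with a matching key checksum), record its entry, then cancel it
    # out of its other cells. Returns None if decoding fails.
    recovered, progress = {}, True
    while progress:
        progress = False
        for c in table:
            if c.count not in (1, -1):
                continue
            key = c.key_xor
            if hashlib.sha256(key).digest()[:p.key_hash_size] != c.key_hash_xor:
                continue
            recovered[key] = (c.count, c.value_xor)
            insert(table, key, c.value_xor, p, sign=-c.count)
            progress = True
    if any(c.count != 0 or c.key_xor.strip(b"\x00") for c in table):
        return None   # not enough cells for the number of differences
    return recovered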

The nice thing is that no miner is forced by that super-majority to change its own block propagation method. Hypothetically, if only 10% of the hashing power ever wanted to propagate new blocks via IBLT, that would work fine.
legendary
Activity: 1232
Merit: 1094
November 05, 2014, 04:01:53 PM
#24
I think that how the differences between mempools change over time is irrelevant. What is relevant is that the size of the current IBLT being encoded is not affected by how many transactions I put into it, given that the receiver has enough information to decode it.

The size of the IBLT is proportional to the differences between the 2 tables.  If there is 1% difference between the memory pools, then the IBLT size is at least 2% of the memory pool size.

This means that as memory pools get bigger (as blocks increase), the IBLT gets bigger too.

What Gavin is saying is that as blocks get bigger, more effort will be spent syncing the memory pools.

This could mean that the IBLT will grow at a slower rate than the blocks (but still grow).
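
A rough back-of-envelope of that scaling (the cells-per-difference and cell size below are illustrative assumptions, not measured values):

Code:
def iblt_size_estimate(mempool_txs: int, diff_fraction: float,
                       cells_per_diff: float = 2.0, cell_bytes: int = 80) -> int:
    # Estimated IBLT size in bytes, assuming the cell count scales with the
    # expected number of differing transactions, not with the block size.
    diffs = mempool_txs * diff_fraction
    return int(cells_per_diff * diffs * cell_bytes)

# ~4000 txs (roughly a full 1MB block of 250-byte txs) at a 1% difference:
#   iblt_size_estimate(4000, 0.01)  -> 6400 bytes
# Ten times the transactions at the same 1% difference needs ten times the IBLT:
#   iblt_size_estimate(40000, 0.01) -> 64000 bytes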
newbie
Activity: 6
Merit: 0
November 05, 2014, 03:49:04 PM
#23
Fair points.

Also, the starting point is not zero, it is 1MB blocks which can hold 1000 differences.

I don't understand what you mean.

What I mean there is that because the differences increase relatively slowly, in practice the O(1) can hold for a while. Fixed size 1MB IBLT blocks being normal while disk block volumes increase through 5MB, 10MB, 15MB, which may take quite a few years. So it is an approximation, a reasonable functional description of what is happening in that time period. Of course, mathematically, the differences mount until, at some point 2MB blocks are needed. Who knows, maybe sidechains are doing heavy lifting by then.

It's still a hell of a lot better than O(n)  Smiley
 

I think that how the differences between mempools change over time is irrelevant. What is relevant is that the size of the current IBLT being encoded is not affected by how many transactions I put into it, given that the receiver has enough information to decode it. When the new block is found, we'll start over and make new guesstimates on the differences between mempools. Basically, it's O(1) with respect to the transactions within the current block being worked on, regardless of what the next or previous block might look like.

Also on a side note, it's up to the sender to create a large enough IBLT. She might want to gamble and make the IBLT really small. That would make the propagation faster, but the risk of a decoding failure becomes higher.
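
That gamble could be exposed as a single sizing knob on the sender's side, something like the sketch below. The 1.5x overhead and the 30-cell floor are guesses for illustration only; the statistical tests in post #29 above are the right way to pick real numbers.

Code:
import math

def pick_cell_count(expected_diffs: int, overhead: float = 1.5, minimum: int = 30) -> int:
    # More cells per expected difference lowers the risk of an undecodable
    # IBLT but costs bandwidth; shrinking the overhead is exactly the gamble
    # described above.
    return max(minimum, math.ceil(overhead * expected_diffs))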
legendary
Activity: 1232
Merit: 1094
October 27, 2014, 11:03:52 AM
#22
Fixed size 1MB IBLT blocks being normal while disk block volumes increase through 5MB, 10MB, 15MB, which may take quite a few years.

Ahh, good point. 

As time passes, the odds of a decode failure increase exponentially, and then the IBLT size has to be stepped up.

Quote
It's still a hell of a lot better than O(n)  Smiley

It would still increase with the block size though, but in steps.

It just increases network efficiency and acts as an incentive to keep memory pool rules identical between miners.
legendary
Activity: 1652
Merit: 2301
Chief Scientist
October 27, 2014, 08:27:22 AM
#21
RE: O(1) versus O(some-function-of-total-number-of-transactions):

Yes, it will depend on whether or not the number of differences goes up as the number of transactions goes up.

The incentives align so it is in everybody's best interest to make the differences as small as possible. I wouldn't be surprised if that causes innovations to drive the actual size to O(1) minus an increasing constant, as code gets better at predicting which transactions our peers do or don't have.
legendary
Activity: 1078
Merit: 1006
100 satoshis -> ISO code
October 26, 2014, 11:02:42 PM
#20
Fair points.

Also, the starting point is not zero, it is 1MB blocks which can hold 1000 differences.

I don't understand what you mean.

What I mean there is that because the differences increase relatively slowly, in practice the O(1) can hold for a while. Fixed size 1MB IBLT blocks being normal while disk block volumes increase through 5MB, 10MB, 15MB, which may take quite a few years. So it is an approximation, a reasonable functional description of what is happening in that time period. Of course, mathematically, the differences mount until, at some point 2MB blocks are needed. Who knows, maybe sidechains are doing heavy lifting by then.

It's still a hell of a lot better than O(n)  Smiley
 
legendary
Activity: 1232
Merit: 1094
October 26, 2014, 06:28:27 PM
#19
No one else is chiming in so this is what I think: it depends.

Yeah.  The whole system assumes that all miners use similar rules.

Quote
If all nodes are 100% synchronized, then, yes, O(1) block propagation can occur, even when clearing the whole mempool.

With 100% syncing, you can just send the header.

O(1) is a claim that as blocks get larger, the differences between 2 blocks do not increase.

Quote
A 1% difference is a working number based upon allowing 6 seconds for a new transaction to fully propagate and 600 seconds between blocks. So, perhaps tx mempools are mostly non-synchronized at the margins (new tx incoming, and old tx being discarded without confirmation), and the middle 98% of the tx mempools have only a 0.1% difference. Maybe an IBLT would encounter a 0.01% difference by always selecting the "best" 50% of the unconfirmed transactions.

I think miners could include when their block was created as part of the info they give.  If each tx has a timestamp linked with it, then this would remove network latency as a problem.

If a block had only transactions that are at least 20 seconds old, then 5 seconds of latency wouldn't matter.

Getting this to work means that there needs to be a way to agree on what the timestamp is for each transaction though.
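
A hypothetical sketch of that rule, assuming each transaction somehow carries an agreed-upon timestamp (which is exactly the unsolved part):

Code:
import time

def block_candidates(tx_timestamps: dict, min_age: float = 20.0, now: float = None):
    # tx_timestamps maps txid -> agreed time the transaction entered the network.
    # Only transactions at least min_age seconds old are eligible, so a few
    # seconds of propagation latency no longer causes mempool differences
    # among the transactions that can appear in a block.
    now = time.time() if now is None else now
    return {txid for txid, ts in tx_timestamps.items() if now - ts >= min_age}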

Quote
Also, the starting point is not zero, it is 1MB blocks which can hold 1000 differences.

I don't understand what you mean.
legendary
Activity: 1078
Merit: 1006
100 satoshis -> ISO code
October 26, 2014, 05:24:18 PM
#18
Why is it O(1) for block propagation?

If 1% of the transactions were different between 2 blocks, then the size of the IBLT would have to be 10 times larger to handle blocks that are 10 times larger.

No one else is chiming in so this is what I think: it depends.

It depends upon how synchronized all the unconfirmed tx mempools are. If all nodes are 100% synchronized, then, yes, O(1) block propagation can occur, even when clearing the whole mempool. In reality mempools vary; by how much? Well, it would be nice if there were metrics on it.

A 1% difference is a working number based upon allowing 6 seconds for a new transaction to fully propagate and 600 seconds between blocks. So, perhaps tx mempools are mostly non-synchronized at the margins (new tx incoming, and old tx being discarded without confirmation), and the middle 98% of the tx mempools have only a 0.1% difference. Maybe an IBLT would encounter a 0.01% difference by always selecting the "best" 50% of the unconfirmed transactions.
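
For reference, the arithmetic behind that working number: if a transaction takes about 6 seconds to reach every node and blocks arrive roughly every 600 seconds, then at the moment a block is found about

unsynced fraction ≈ propagation time / block interval = 6 s / 600 s = 1%

of its candidate transactions are still in flight somewhere.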

Also, the starting point is not zero, it is 1MB blocks which can hold 1000 differences.
legendary
Activity: 1232
Merit: 1094
October 24, 2014, 12:34:17 PM
#17
Why is it O(1) for block propagation?

If 1% of the transactions were different between 2 blocks, then the size of the IBLT would have to be 10 times larger to handle blocks that are 10 times larger.
legendary
Activity: 1078
Merit: 1006
100 satoshis -> ISO code
October 21, 2014, 05:34:45 PM
#16
Does this mean that block selection method gets locked-in?

You tell the receiver how the txs in the block were selected.  However, it assumes priority size and fee per kb.

A block that doesn't use the standard transaction selection algorithm is at a disadvantage.  

This could be a good thing, since it would mean more consistency between miners.

I guess if block size is greater than demand for space, then it isn't a big deal, since the default rule is "all known transactions".

Pretty much locked-in by consensus. And the conclusion in bold is important.

IBLT makes the Bitcoin protocol function more closely to how all its users expect it to work. Right now there are several thousand users who expect their transaction(s) to get confirmed in the next block, and thousands of nodes who have a very similar view of those transactions. Usually, miners are good citizens and include many of the pending transactions.

However, a block could be mined where all this consensus is ignored and the new block is full of transactions which the miner has "pulled out of his ass" (real-world business or not) with fees payable to himself, a scenario which makes the existing 1MB limit look like a safety-blanket. Apart from providing PoW security to older blocks, the new block is unhelpful, and many of them together is an attack.

Transactions from the consensus pool are in an IBLT, but they are canonically ordered and XOR'd in an offset manner such that a small percentage do not have to be known in advance, but the rest do, because that is the only way to peel them off. This goes way beyond normal data compression because the receivers know most of the contents in advance, hence O(1) block propagation occurs, or at worst O(log n).

The block propagation delay cost of including all transactions will be low, so the incentive improves for miners to rake in as many fees as possible, getting a high percentage of transactions confirmed within the 10 minute average.

A rogue miner gaming IBLT by withholding a new block for a few seconds, and broadcasting his secret/spam transactions first, could be frustrated by requiring 20% of IBLT transactions to be earlier than the mean age of the new block and the previous one.
legendary
Activity: 1232
Merit: 1094
October 21, 2014, 06:25:55 AM
#15
Does this mean that block selection method gets locked-in?

You tell the receiver how the txs in the block were selected.  However, it assumes priority size and fee per kb.

A block that doesn't use the standard transaction selection algorithm is at a disadvantage. 

This could be a good thing, since it would mean more consistency between miners.

I guess if block size is greater than demand for space, then it isn't a big deal, since the default rule is "all known transactions".
legendary
Activity: 1078
Merit: 1006
100 satoshis -> ISO code
October 20, 2014, 05:55:08 PM
#14
Someone has picked up the torch  Smiley

Kalle Rosenbaum has written an IBLT test package in Java, utilizing bitcoinj.

http://www.reddit.com/r/Bitcoin/comments/2jszdl/encodingdecoding_blocks_in_iblt_experimets_on_o1/

Quote

I've been working on an IBLT written in Java, as well as a project to encode and decode Bitcoin blocks using this IBLT. The main inspiration comes from Gavin Andresen's (/u/gavinandresen) excellent writeup on O(1) block propagation, https://gist.github.com/gavinandresen/e20c3b5a1d4b97f79ac2.
The projects are called ibltj (https://github.com/kallerosenbaum/ibltj) and bitcoin-iblt (https://github.com/kallerosenbaum/bitcoin-iblt). In bitcoin-iblt I've run some experiments to find a good value size and a good number of hash functions to use. Have a look at the results at https://github.com/kallerosenbaum/bitcoin-iblt/wiki/BlockStatsTest
I'm very interested in discussing this and listening to your comments. I also need some help to specify other tests to perform. I'm thinking it would be nice to have some kind of "Given that there are no more than 100 differing transactions, I need 867 cells of size 270 B to have <0.1% chance that decoding fails." Any thoughts on this?

The test bench is pretty capable. I can perform tests on arbitrarily large fake blocks constructed from real world transactions. I can modify the following parameters:

Number of transactions in the block
Number of differences between the sender and the receiver, both extra and absent transactions.
Number of hash functions
keySize
valueSize
keyHashSize
Number of cells
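
For a quick feel of that kind of statement without running the Java test bench, here is a stripped-down stand-in (not the bitcoin-iblt code): it models only the peeling process on cell occupancy, which is what determines whether decoding can succeed, and estimates the failure rate for a given cell count, diff count and number of hash functions.

Code:
import random

def peel_succeeds(cell_count: int, diff_count: int, hash_count: int = 3, rng=random) -> bool:
    # Place each differing transaction into hash_count distinct random cells,
    # then peel: repeatedly remove an item that is alone in some cell.
    cells = [[] for _ in range(cell_count)]
    item_cells = []
    for item in range(diff_count):
        idxs = rng.sample(range(cell_count), hash_count)
        item_cells.append(idxs)
        for i in idxs:
            cells[i].append(item)
    removed, progress = [False] * diff_count, True
    while progress:
        progress = False
        for cell in cells:
            if len(cell) == 1:
                item = cell[0]
                for j in item_cells[item]:
                    cells[j].remove(item)
                removed[item] = True
                progress = True
    return all(removed)

def failure_rate(cell_count: int, diff_count: int, hash_count: int = 3, trials: int = 1000) -> float:
    fails = sum(not peel_succeeds(cell_count, diff_count, hash_count) for _ in range(trials))
    return fails / trials

# e.g. compare failure_rate(200, 100) with failure_rate(150, 100) to see how
# quickly failures rise as the table is squeezed relative to the differences.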

sr. member
Activity: 370
Merit: 250
October 03, 2014, 10:38:45 AM
#13
Not sure if this helps anyone but CCN has a writeup out on IBLTs

https://www.cryptocoinsnews.com/bitcoin-in-bloom-how-iblts-allow-bitcoin-scale/
member
Activity: 129
Merit: 14
August 21, 2014, 04:13:08 AM
#12
As many have said, this of course doesn't completely solve the blocksize problem (or feature, whichever way you look at it). Prior to this proposal I thought that block propagation times incentivising mining centralisation was potentially one of the network's biggest security issues, so this proposal could help significantly there. Maybe this does tip the balance slightly in favour of a moderate increase in the blocksize limit; however, there are still many reasons for a limit overall.

These reasons include:
1. Bandwidth limits, as the transactions still need to be relayed and downloaded at least once, although this could become less of an urgent network security problem.
2. Issues surrounding new nodes catching up with the network.
3. The need for scarcity in the blockchain to create a market price for transaction fees, which will prevent spam and eventually be required to incentivise miners.
4. Storage limits.
5. Fast propagation times in the event IBLT block reconstruction fails, due to a deliberate attack or pure accident.

What kind of blocksize limit could there be if this becomes implemented?
member
Activity: 129
Merit: 14
August 12, 2014, 02:55:11 PM
#11
Thanks Gavin

There could therefore be some cases of an old-fashioned block propagation race in the event that a re-org takes place and miners are unable to reconstruct the blocks. I can see how this could be rare enough not to incentivise mining centralisation or the creation of smaller blocks.

How exactly does the memory pool work?  Could one keep a record of what the memory pool was like in say the last n blocks, such that if the situation you describe above occurs, one can reconstruct the blocks more effectively?