Topic: O(1) block propagation (Read 5540 times)

legendary
Activity: 1078
Merit: 1002
100 satoshis -> ISO code
November 26, 2014, 03:16:16 PM
#30
Indeed. Your results look very good.

However, the first real-world IBLT implementation that Gavin wants to consider leverages Matt Corallo's relay node software, which already achieves nearly an order-of-magnitude reduction in propagated block sizes for nodes that use this service.

So the data is no longer raw transactions, but smaller transaction hashes.
https://github.com/TheBlueMatt/RelayNode/blob/master/c%2B%2B/server.cpp
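To make the idea concrete, here is a minimal sketch in Java of substituting short transaction identifiers for full transactions when a peer already has them. This is only an assumed illustration of the concept, not Matt's actual protocol or message format; the class name, the Base64 key encoding, and the 8-byte truncation are arbitrary choices.

Code:
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;
import java.util.Set;

// Illustrative only, not Matt's actual wire format: when relaying a block,
// substitute a short, truncated transaction hash for any transaction the peer
// is believed to already have, and send the raw bytes otherwise.
public class RelayIdeaSketch {

    static byte[] txid(byte[] rawTx) throws NoSuchAlgorithmException {
        MessageDigest d = MessageDigest.getInstance("SHA-256");
        return d.digest(d.digest(rawTx));       // Bitcoin-style double SHA-256
    }

    static List<byte[]> encodeForPeer(List<byte[]> blockTxs,
                                      Set<String> txidsPeerAlreadyHas)
            throws NoSuchAlgorithmException {
        List<byte[]> payload = new ArrayList<>();
        for (byte[] tx : blockTxs) {
            byte[] id = txid(tx);
            if (txidsPeerAlreadyHas.contains(Base64.getEncoder().encodeToString(id))) {
                // ~8 bytes instead of a few hundred: roughly the order-of-magnitude
                // saving mentioned above (8 is an arbitrary example length).
                payload.add(Arrays.copyOf(id, 8));
            } else {
                payload.add(tx);                // peer has not seen it: send in full
            }
        }
        return payload;
    }
}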

newbie
Activity: 6
Merit: 0
November 24, 2014, 02:22:35 PM
#29
I have done some statistical tests on encoding/decoding blocks in IBLT. Apart from my previous tests that tried to find reasonable values for valueSize and hashFunctionCount, I have also done tests that plot cellCount vs failure probability and diffCount vs failure probability. Please have a look at https://github.com/kallerosenbaum/bitcoin-iblt/wiki.

Conclusions:

1. 64 bytes looks like a good valueSize
2. Space savings seem to increase as diffCount increases. This is of course based on the assumptions explained in the wiki.
3. k=3 seems to be the best hashFunctionCount no matter how you look at it.
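For anyone who has not read the wiki, here is a rough Java sketch of what those parameters mean inside one table: each cell keeps a count, an XOR of keys, and an XOR of 64-byte value slices, and every entry is written into k = 3 cells. The hashing scheme and layout below are simplifying assumptions for illustration, not Kalle's actual code; decoding is omitted.

Code:
// Minimal IBLT sketch using the parameter values from the tests above
// (k = 3 hash functions, 64-byte value slices).
public class IbltCellSketch {
    static final int K = 3;            // hashFunctionCount
    static final int VALUE_SIZE = 64;  // valueSize in bytes

    static class Cell {
        int count;                                 // number of entries XORed in
        long keySum;                               // XOR of keys
        byte[] valueSum = new byte[VALUE_SIZE];    // XOR of 64-byte value slices
    }

    final Cell[] cells;

    IbltCellSketch(int cellCount) {
        cells = new Cell[cellCount];
        for (int i = 0; i < cells.length; i++) cells[i] = new Cell();
    }

    void insert(long key, byte[] value64) {
        int subTableSize = cells.length / K;
        for (int i = 0; i < K; i++) {
            // One position per sub-table, so an entry always lands in K distinct cells.
            long h = key ^ (0x9E3779B97F4A7C15L * (i + 1));
            h ^= (h >>> 31);
            int idx = i * subTableSize + (int) Math.floorMod(h, (long) subTableSize);
            Cell c = cells[idx];
            c.count += 1;
            c.keySum ^= key;
            for (int b = 0; b < VALUE_SIZE; b++) c.valueSum[b] ^= value64[b];
        }
    }
}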
legendary
Activity: 1078
Merit: 1002
100 satoshis -> ISO code
November 20, 2014, 08:56:35 PM
#28
Thanks for the info Gavin. I will learn about how the relay nodes work and see whether I can add value. There are, of course, a bunch of reasons why this is the best approach.
legendary
Activity: 1652
Merit: 2216
Chief Scientist
November 20, 2014, 04:45:23 PM
#27
If you want to work on IBLT stuff...

... start with Matt's fast-relay code: https://github.com/TheBlueMatt/RelayNode

That is an "I know what I've already told my peers, so I won't tell them again" optimization for transaction data. I haven't tried to figure out how far that already-written-and-running code lets us scale, but I think that would be the first step.

Once you understand what Matt is doing, then figure out how an IBLT can further optimize by eliminating the need to send even lists of transaction IDs. The first step there is to figure out what software miners are using to build their blocks, and to work out how hard it would be to get that software to do the IBLT thing (have similar policies for selecting transactions, and identical policies for ordering transactions inside the block).
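As a toy illustration of the "identical policies for ordering transactions inside the block" requirement, here is one possible canonical ordering rule: sort by the numeric value of the txid, so sender and receiver can independently reconstruct the same sequence. This is only an assumed example; no specific rule is prescribed in the thread.

Code:
import java.math.BigInteger;
import java.util.Comparator;
import java.util.List;

// Toy example of an "identical ordering" policy. The particular rule
// (sort by txid value) is an assumption for illustration only.
public class CanonicalOrderSketch {
    static void sortCanonically(List<byte[]> txids) {
        txids.sort(Comparator.comparing((byte[] id) -> new BigInteger(1, id)));
    }
}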

member
Activity: 114
Merit: 12
November 20, 2014, 10:31:18 AM
#26

Since only miners create IBLT blocks, but all nodes would need to process them, it seems the best implementation method is to split the IBLT software into two parts:

I might be wrong, but in the short term it's not as big a deal that regular nodes get this IBLT stuff. Full nodes care much less about ~15 seconds of latency.

Long term it might make sense.
legendary
Activity: 1078
Merit: 1002
100 satoshis -> ISO code
November 19, 2014, 05:37:56 PM
#25
I can't get unexcited about IBLT, so I just want to note down my recent thoughts here. Feedback always welcome :-)

Since only miners create IBLT blocks, but all nodes would need to process them, it seems the best implementation method is to split the IBLT software into two parts:

a) IBLT encode
b) IBLT decode (set reconciliation)

The idea is to get the decode software widely adopted first, by giving it a head-start. It gets included with version 0.n, while the encode logic is delayed until version 0.n+x, whatever later version is deemed appropriate, i.e. when an arbitrarily large majority of nodes have the decode software.

A mining consensus would also be needed, probably requiring a new block version to indicate that decoding is supported, with encoded blocks only becoming acceptable once a super-majority exists for that block version.

If the decoding is as generic as possible, such that the basic element dimensions (listed below; a rough sketch follows the list) are parameters at the start of each IBLT, then the optimum values of these do not need to be agreed in advance, and may be in flux for a long time:
keySize
valueSize
keyHashSize
Number of cells
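Here is a rough sketch of what "parameters at the start of each IBLT" could look like. The field widths and ordering are assumptions for illustration, not a proposed wire format; the point is only that a decoder reads the dimensions before the cells.

Code:
import java.nio.ByteBuffer;

// Sketch of the idea that each IBLT carries its own dimensions up front, so the
// optimum values never have to be agreed in advance.
public class IbltParamsSketch {
    final int keySize, valueSize, keyHashSize, cellCount;

    IbltParamsSketch(int keySize, int valueSize, int keyHashSize, int cellCount) {
        this.keySize = keySize;
        this.valueSize = valueSize;
        this.keyHashSize = keyHashSize;
        this.cellCount = cellCount;
    }

    byte[] serialize() {                 // written at the start of each IBLT
        return ByteBuffer.allocate(16)
                .putInt(keySize).putInt(valueSize)
                .putInt(keyHashSize).putInt(cellCount)
                .array();
    }

    static IbltParamsSketch parse(byte[] header) {   // read by any decoder
        ByteBuffer buf = ByteBuffer.wrap(header);
        return new IbltParamsSketch(buf.getInt(), buf.getInt(), buf.getInt(), buf.getInt());
    }
}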

The nice thing is that no miner is forced, by super-majority, to change its own block propagation method. Hypothetically, if, for example, only 10% of the hashing power ever wanted to propagate new blocks via IBLT, then that would work fine.
legendary
Activity: 1232
Merit: 1083
November 05, 2014, 05:01:53 PM
#24
I think that how differences between mempools change over time is irrelevant. What is relevant is that the size of the current IBLT being encoded is not affected by how many transactions I put into it, given that the receiver has enough information to decode it.

The size of the IBLT is proportional to the differences between the 2 tables.  If there is 1% difference between the memory pools, then the IBLT size is at least 2% of the memory pool size.

This means as memory pools get bigger (as blocks increase), the IBLT increases.

What Gavin is saying is that as blocks get bigger, more effort will be spent syncing the memory pools.

This could mean that the IBLT will grow at a slower rate than the blocks (but still grow).
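As a purely illustrative back-of-the-envelope calculation of "proportional to the differences": assuming roughly 1.5 cells per expected differing transaction and about 80 bytes per cell (both ballpark assumptions, not measured values), the estimated IBLT size tracks the divergence between mempools rather than the raw block size.

Code:
// Back-of-the-envelope sizing under stated assumptions: ~1.5 cells per expected
// differing transaction (an assumed overhead factor for k = 3), and
// 8 + 64 + 4 + 4 = 80 bytes per cell (keySum + valueSum + keyHashSum + count).
public class IbltSizingSketch {
    static long estimatedIbltBytes(long expectedDiffs) {
        long cells = (long) Math.ceil(1.5 * expectedDiffs);
        long bytesPerCell = 8 + 64 + 4 + 4;
        return cells * bytesPerCell;
    }

    public static void main(String[] args) {
        // Example: a 10 MB block of ~20,000 transactions with 1% mempool
        // divergence gives ~200 differences, so roughly 200 * 1.5 * 80 = 24 kB
        // of IBLT; the table tracks the differences, not the full block.
        System.out.println(estimatedIbltBytes(200));
    }
}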
newbie
Activity: 6
Merit: 0
November 05, 2014, 04:49:04 PM
#23
Fair points.

Also, the starting point is not zero, it is 1MB blocks which can hold 1000 differences.

I don't understand what you mean.

What I mean there is that because the differences increase relatively slowly, in practice the O(1) can hold for a while: fixed-size 1MB IBLT blocks would be the norm while disk block volumes increase through 5MB, 10MB, 15MB, which may take quite a few years. So it is an approximation, a reasonable functional description of what is happening in that time period. Of course, mathematically, the differences mount until, at some point, 2MB IBLT blocks are needed. Who knows, maybe sidechains are doing heavy lifting by then.

It's still a hell of a lot better than O(n)  Smiley
 

I think that how differences between mempools change over time is irrelevant. What is relevant is that the size of the current IBLT being encoded is not affected by how many transactions I put into it, given that the receiver has enough information to decode it. When the new block is found, then we'll start over and make new guesstimates on the differences between mempools. Basically, it's O(1) with respect to the transactions within the current block being worked on, regardless of what the next or previous block might look like.

Also on a side note, it's up to the sender to create a large enough IBLT. She might want to gamble and make the IBLT really small. That would make the propagation faster, but the risk of a decoding failure becomes higher.
legendary
Activity: 1232
Merit: 1083
October 27, 2014, 12:03:52 PM
#22
Fixed size 1MB IBLT blocks being normal while disk block volumes increase through 5MB, 10MB, 15MB, which may take quite a few years.

Ahh, good point. 

As time passes, the odds of a decode failure increase exponentially, and then the IBLT size has to be stepped up.

Quote
It's still a hell of a lot better than O(n)  Smiley

It would still increase with the block size though, but in steps.

It just increases network efficiency and acts as an incentive to keep memory pool rules identical between miners.
legendary
Activity: 1652
Merit: 2216
Chief Scientist
October 27, 2014, 09:27:22 AM
#21
RE: O(1) versus O(some-function-of-total-number-of-transactions):

Yes, it will depend on whether or not the number of differences goes up as the number of transactions goes up.

The incentives align so it is in everybody's best interest to make the differences as small as possible. I wouldn't be surprised if that causes innovations to drive the actual size to O(1) minus an increasing constant, as code gets better at predicting which transactions our peers do or don't have.
legendary
Activity: 1078
Merit: 1002
100 satoshis -> ISO code
October 27, 2014, 12:02:42 AM
#20
Fair points.

Also, the starting point is not zero, it is 1MB blocks which can hold 1000 differences.

I don't understand what you mean.

What I mean there is that because the differences increase relatively slowly, in practice the O(1) can hold for a while: fixed-size 1MB IBLT blocks would be the norm while disk block volumes increase through 5MB, 10MB, 15MB, which may take quite a few years. So it is an approximation, a reasonable functional description of what is happening in that time period. Of course, mathematically, the differences mount until, at some point, 2MB IBLT blocks are needed. Who knows, maybe sidechains are doing heavy lifting by then.

It's still a hell of a lot better than O(n)  Smiley
 
legendary
Activity: 1232
Merit: 1083
October 26, 2014, 07:28:27 PM
#19
No one else is chiming in so this is what I think: it depends.

Yeah.  The whole system assumes that all miners use similar rules.

Quote
If all nodes are 100% synchronized, then, yes, O(1) block propagation can occur, even when clearing the whole mempool.

With 100% syncing, you can just send the header.

O(1) is a claim that as blocks get larger, the differences between 2 blocks do not increase.

Quote
A 1% difference is a working number based upon allowing 6 seconds for a new transaction to fully propagate and 600 seconds between blocks (6/600 = 1%). So, perhaps tx mempools are mostly non-synchronized at the margins (new tx incoming, and old tx being discarded without confirmation), and the middle 98% of the tx mempools have only a 0.1% difference. Maybe an IBLT would encounter a 0.01% difference by always selecting the "best" 50% of the unconfirmed transactions.

I think miners could include when their block was created as part of the info they give.  If each tx has a timestamp linked with it, then this would remove network latency as a problem.

If a block had only transactions that are at least 20 seconds old, then 5 seconds of latency wouldn't matter.

Getting this to work means that there needs to be a way to agree on what the timestamp is for each transaction though.
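A minimal sketch of that minimum-age idea, assuming some agreed-upon "first seen" timestamp exists for each transaction; the map layout and method names are invented for illustration.

Code:
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Only include transactions first seen at least MIN_AGE_SECONDS before the
// block is assembled, so a few seconds of propagation latency cannot leave a
// peer missing a just-arrived transaction. Where the agreed "first seen"
// timestamp comes from is exactly the open question noted above.
public class MinAgeSelectionSketch {
    static final long MIN_AGE_SECONDS = 20;

    static List<String> selectTxids(Map<String, Long> firstSeenSecondsByTxid,
                                    long blockTimeSeconds) {
        return firstSeenSecondsByTxid.entrySet().stream()
                .filter(e -> blockTimeSeconds - e.getValue() >= MIN_AGE_SECONDS)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}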

Quote
Also, the starting point is not zero, it is 1MB blocks which can hold 1000 differences.

I don't understand what you mean.
legendary
Activity: 1078
Merit: 1002
100 satoshis -> ISO code
October 26, 2014, 06:24:18 PM
#18
Why is it O(1) for block propagation?

If 1% of the transactions were different between 2 blocks, then the size of the IBLT would have to be 10 times larger to handle blocks that are 10 times larger.

No one else is chiming in so this is what I think: it depends.

It depends upon how synchronized all the unconfirmed tx mempools are. If all nodes are 100% synchronized, then, yes, O(1) block propagation can occur, even when clearing the whole mempool. In reality, mempools vary. By how much? Well, it would be nice if there were metrics on it.

A 1% difference is a working number based upon allowing 6 seconds for a new transaction to fully propagate and 600 seconds between blocks (6/600 = 1%). So, perhaps tx mempools are mostly non-synchronized at the margins (new tx incoming, and old tx being discarded without confirmation), and the middle 98% of the tx mempools have only a 0.1% difference. Maybe an IBLT would encounter a 0.01% difference by always selecting the "best" 50% of the unconfirmed transactions.

Also, the starting point is not zero, it is 1MB blocks which can hold 1000 differences.
legendary
Activity: 1232
Merit: 1083
October 24, 2014, 01:34:17 PM
#17
Why is it O(1) for block propagation?

If 1% of the transactions were different between 2 blocks, then the size of the IBLT would have to be 10 times larger to handle blocks that are 10 times larger.
legendary
Activity: 1078
Merit: 1002
100 satoshis -> ISO code
October 21, 2014, 06:34:45 PM
#16
Does this mean that the block selection method gets locked-in?

You tell the receiver how the txs in the block were selected.  However, it assumes priority, size and fee per kb.

A block that doesn't use the standard transaction selection algorithm is at a disadvantage.  

This could be a good thing, since it would mean more consistency between miners.

I guess if block size is greater than demand for space, then it isn't a big deal, since the default rule is "all known transactions".

Pretty much locked-in by consensus. And the conclusion above, that it would mean more consistency between miners, is important.

IBLT makes the Bitcoin protocol function more closely to how all its users expect it to work. Right now there are several thousand users who expect their transaction(s) to get confirmed in the next block, and thousands of nodes who have a very similar view of those transactions. Usually, miners are good citizens and include many of the pending transactions.

However, a block could be mined where all this consensus is ignored and the new block is full of transactions which the miner has "pulled out of his ass" (real-world business or not) with fees payable to himself, a scenario which makes the existing 1MB limit look like a safety-blanket. Apart from providing PoW security to older blocks, such a block is unhelpful, and many of them together constitute an attack.

Transactions from the consensus pool are in an IBLT, but they are canonically ordered and XOR'd in an offset manner such that a small percentage do not have to be known in advance; the rest do, because that is the only way to peel them off. This goes way beyond normal data compression, because the receivers know most of the contents in advance; hence O(1) block propagation occurs, or at worst O(log n).
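For readers who have not seen how entries are "peeled off", here is a minimal sketch of the standard IBLT peeling loop, under the same kind of assumed cell layout as the earlier sketches. It is not taken from Gavin's gist or any of the linked implementations; value handling and the count = -1 case used in set reconciliation are omitted for brevity.

Code:
import java.util.ArrayList;
import java.util.List;

// Peeling: find a cell holding exactly one entry, recover it, XOR it back out
// of the other cells it maps to, and repeat until everything is recovered
// (success) or no pure cell remains (decode failure).
public class PeelSketch {
    static class Cell {
        int count;
        long keySum;
    }

    interface CellIndexer {             // maps a key to its k distinct cell positions
        int[] positions(long key);
    }

    static List<Long> peel(Cell[] cells, CellIndexer indexer) {
        List<Long> recoveredKeys = new ArrayList<>();
        boolean progress = true;
        while (progress) {
            progress = false;
            for (Cell c : cells) {
                if (c.count == 1) {               // "pure" cell: exactly one entry left
                    long key = c.keySum;
                    recoveredKeys.add(key);
                    for (int idx : indexer.positions(key)) {
                        cells[idx].count -= 1;    // remove the entry everywhere it lives,
                        cells[idx].keySum ^= key; // which may expose new pure cells
                    }
                    progress = true;
                }
            }
        }
        return recoveredKeys;                     // all-zero cells == complete decode
    }
}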

The block propagation delay cost of including all transactions will be low, so the incentive improves for miners to rake in as many fees as possible, getting a high percentage of transactions confirmed within the 10 minute average.

A rogue miner gaming IBLT, by withholding a new one for a few seconds and broadcasting his secret/spam transactions first, could be frustrated by requiring 20% of the IBLT transactions to be older than the mean transaction age of the new block and the previous one.
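One possible reading of that 20% rule, as a toy check. The interpretation (at least 20% of the new block's transactions older than the mean transaction age across the new and previous blocks) and the timestamp source are assumptions, not a specification.

Code:
// Toy check of the anti-gaming rule suggested above. Ages are in seconds from
// whatever timestamp source nodes agree on (see the earlier timestamp discussion).
public class AgeRuleSketch {
    static boolean passesAgeRule(long[] newBlockTxAges, long[] prevBlockTxAges) {
        double sum = 0;
        int n = 0;
        for (long a : newBlockTxAges)  { sum += a; n++; }
        for (long a : prevBlockTxAges) { sum += a; n++; }
        double meanAge = sum / n;

        long olderCount = 0;
        for (long a : newBlockTxAges) if (a > meanAge) olderCount++;
        return olderCount >= 0.20 * newBlockTxAges.length;
    }
}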
legendary
Activity: 1232
Merit: 1083
October 21, 2014, 07:25:55 AM
#15
Does this mean that the block selection method gets locked-in?

You tell the receiver how the txs in the block were selected.  However, it assumes priority, size and fee per kb.

A block that doesn't use the standard transaction selection algorithm is at a disadvantage. 

This could be a good thing, since it would mean more consistency between miners.

I guess if block size is greater than demand for space, then it isn't a big deal, since the default rule is "all known transactions".
legendary
Activity: 1078
Merit: 1002
100 satoshis -> ISO code
October 20, 2014, 06:55:08 PM
#14
Someone has picked up the torch  Smiley

Kalle Rosenbaum has written an IBLT test package in Java, utilizing bitcoinj.

http://www.reddit.com/r/Bitcoin/comments/2jszdl/encodingdecoding_blocks_in_iblt_experimets_on_o1/

Quote

I've been working on an IBLT written in Java, as well as a project to encode and decode Bitcoin blocks using this IBLT. The main inspiration comes from Gavin Andresen's (/u/gavinandresen) excellent writeup on O(1) block propagation, https://gist.github.com/gavinandresen/e20c3b5a1d4b97f79ac2.
The projects are called ibltj (https://github.com/kallerosenbaum/ibltj) and bitcoin-iblt (https://github.com/kallerosenbaum/bitcoin-iblt). In bitcoin-iblt I've run some experiments to find a good value size and a good number of hash functions to use. Have a look at the results at https://github.com/kallerosenbaum/bitcoin-iblt/wiki/BlockStatsTest
I'm very interested in discussing this and listening to your comments. I also need some help to specify other tests to perform. I'm thinking it would be nice to have some kind of statement like "Given that there are no more than 100 differing transactions, I need 867 cells of size 270 B to have a <0.1% chance that decoding fails." Any thoughts on this?

The test bench is pretty capable. I can perform tests on arbitrarily large fake blocks constructed from real world transactions. I can modify the following parameters:

Number of transactions in the block
Number of differences between the sender and the receiver, both extra and absent transactions.
Number of hash functions
keySize
valueSize
keyHashSize
Number of cells

sr. member
Activity: 370
Merit: 250
October 03, 2014, 11:38:45 AM
#13
Not sure if this helps anyone, but CCN has a writeup out on IBLTs:

https://www.cryptocoinsnews.com/bitcoin-in-bloom-how-iblts-allow-bitcoin-scale/
member
Activity: 129
Merit: 13
August 21, 2014, 05:13:08 AM
#12
As many have said, of course this doesn't completely solve the blocksize problem (or feature, whichever way you look at it). Prior to this proposal I thought that block propagation times incentivising mining centralisation was potentially one of the network's biggest security issues, so this proposal could help significantly there. Maybe this does tip the balance slightly in favour of a moderate increase in the blocksize limit; however, there are still many reasons for a limit overall.

These reasons include:
1. Bandwidth limits, as the transactions still need to be relayed and downloaded at least once, although this could become a less urgent network security problem.
2. Issues surrounding new nodes catching up with the network.
3. The need for scarcity in the blockchain to create a market price for transaction fees, which will prevent spam and eventually be required to incentivise miners.
4. Storage limits.
5. Fast propagation times in the event that IBLT block reconstruction fails, due to a deliberate attack or pure accident.

What kind of blocksize limit could there be if this is implemented?
member
Activity: 129
Merit: 13
August 12, 2014, 03:55:11 PM
#11
Thanks Gavin

There could therefore be some cases of an old-fashioned block propagation race in the event that a re-org takes place and miners are unable to reconstruct the blocks. I can see how this could be rare enough not to incentivise mining centralisation or the creation of smaller blocks.

How exactly does the memory pool work? Could one keep a record of what the memory pool was like in, say, the last n blocks, such that if the situation you describe above occurs, one can reconstruct the blocks more effectively?