
Topic: Ultimate blockchain compression w/ trust-free lite nodes - page 18.

vip
Activity: 1386
Merit: 1140
The Casascius 1oz 10BTC Silver Round (w/ Gold B)
I am the one who originally suggested the 100 block interval... but I don't think I said that updating the meta tree only every 100 blocks is what should be done.

Rather, the meta tree should be rebalanced every 100 blocks, and in between, nodes should be added and deleted using a methodology that avoids (procrastinates on) having to recalculate the hashes for most or all of the nodes in the tree any time there is a change.  Otherwise, every incoming transaction will carry a huge CPU time burden that's not sustainable.  Rebalancing the tree is much like rebuilding a database index.

The reason for 100 blocks is that there needs to be an agreement that everybody will do it at a certain time simultaneously, so that as hashes of the tree are exchanged, they will always refer to the same data set.  An arbitrary number must be chosen that strikes a balance between the resource burden of rebalancing the tree (which favors rebalancing less frequently) and the burden on someone getting themselves up to speed (which favors rebalancing more frequently).

And while the meta tree may in fact be large, no node ever needs to accumulate older copies of the meta tree, so it is not as though having a 1 GB meta tree is going to mean 1 GB for every 100 blocks.  There is value to having a few old revisions of the meta tree (e.g. so that a node can opt to formulate its view of the network starting a certain number of blocks back and protect itself from orphan blocks), but there is no reason, for example, for anyone to accumulate this metadata for historical purposes, as it is completely reconstructable from the normal block chain.
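A minimal sketch of the lazy-update-plus-periodic-rebalance idea described above. This is an illustration under assumptions, not the actual proposed design; the delta sets and the way the per-block root is derived are made up for the example.

Code:
import hashlib

def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

def merkle_root(leaves):
    """Plain Merkle root over the sorted leaves."""
    if not leaves:
        return h(b"empty")
    level = [h(x) for x in sorted(leaves)]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])          # duplicate the odd leaf
        level = [h(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

class MetaTree:
    REBALANCE_INTERVAL = 100                 # the agreed rebalance interval, in blocks

    def __init__(self):
        self.balanced = set()                # snapshot as of the last rebalance
        self.added = set()                   # outputs created since then
        self.removed = set()                 # outputs spent since then
        self.balanced_root = merkle_root(self.balanced)

    def apply_block(self, height, created, spent):
        self.added |= created
        self.removed |= spent
        if height % self.REBALANCE_INTERVAL == 0:
            # The expensive step, paid once per interval (like rebuilding a DB index):
            # fold the deltas in and recompute the balanced root from scratch.
            self.balanced = (self.balanced | self.added) - self.removed
            self.added, self.removed = set(), set()
            self.balanced_root = merkle_root(self.balanced)

    def root(self):
        # Cheap per-block commitment: balanced root plus digests of the small deltas,
        # so individual transactions never force a full-tree rehash.
        return h(self.balanced_root, merkle_root(self.added), merkle_root(self.removed))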
hero member
Activity: 815
Merit: 1000
The lightweight client that uses the meta-chain could still store a small portion of the regular blockchain, something like the latest 100 blocks (max. 100 MB); alternatively, this could be a configurable amount of storage space.

Only updating the meta-chain every 100 blocks/100 MB or so would IMHO reduce the load on the meta-chain and the lightweight clients using it.
Seems like another flaw in this design to me.

At VISA volumes I think each block would be one gigabyte. Even at half that or less, it would break the light nodes proposed here.
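A rough back-of-the-envelope check of that figure (the throughput and transaction-size numbers are assumptions, not measurements):

Code:
# Assumed figures: VISA averages on the order of 2,000 tx/s (peaks are much higher),
# a Bitcoin block covers ~600 seconds, and a typical transaction is ~400 bytes.
visa_tps = 2000
block_interval_s = 600
avg_tx_bytes = 400

block_bytes = visa_tps * block_interval_s * avg_tx_bytes
print(block_bytes / 1e9)   # ~0.48 GB per block at the average rate; peak rates push past 1 GB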

This solution is patchwork; a swarm client combined with the ledger system is the only way. It doesn't have to be my design, but we will benefit a lot from swarm principles at some point.
hero member
Activity: 668
Merit: 501
Your nice graphs hosted on Dropbox are no longer working. Maybe upload them to imgur.com?
legendary
Activity: 1145
Merit: 1001
You wouldn't have to update the meta-chain with every individual new block that comes out of the regular blockchain.

The lightweight client that uses the meta-chain could still store a small portion of the regular blockchain, something like the latest 100 blocks (max. 100 MB); alternatively, this could be a configurable amount of storage space.

Only updating the meta-chain every 100 blocks/100 MB or so would IMHO reduce the load on the meta-chain and the lightweight clients using it.

This also avoids any synchronization problems with the latest blocks out of the blockchain and the possibility of small forks (orphaned blocks) in the blockchain, since the meta-chain then would only synchronize with blocks that have a high number of confirmations.

I don't see how it would reduce the load any to only update the meta-chain every 100 blocks. It'd just concentrate the update load on certain blocks. The planned tree structure for this would allow O(M*log N) updates, where M is the number of updated transactions and N is the total number of transactions in the tree.

Because certain outputs could already have been respent within the last 100 blocks, so you could cut those out (sketched below).

Also there is the issue of avoiding blockchain forks that get orphaned, so the lightweight client would store a small portion of the regular blockchain anyway, say the last X blocks, where X is the number of blocks after which you can be relatively sure they won't get orphaned. I don't know if avoiding orphaned blocks is very important though, maybe it's not.
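A small sketch of the "cut out the short-lived outputs" point: when changes are batched per 100 blocks, anything created and spent inside the same window never has to touch the meta tree at all. The data representation here is an assumption made for illustration.

Code:
def net_changes(blocks):
    """Collapse one window's worth of UTXO changes before updating the meta tree.

    `blocks` is assumed to be a list of (created_outputs, spent_outputs) set pairs.
    Outputs that are both created and spent inside the window cancel out.
    """
    created, spent = set(), set()
    for block_created, block_spent in blocks:
        created |= block_created
        spent |= block_spent
    short_lived = created & spent          # born and respent within the window
    return created - short_lived, spent - short_lived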
member
Activity: 97
Merit: 10
You wouldn't have to update the meta-chain with every individual new block that comes out of the regular blockchain.

The lightweight client that uses the meta-chain could still store a small portion of the regular blockchain, something like the latest 100 blocks (max. 100 MB); alternatively, this could be a configurable amount of storage space.

Only updating the meta-chain every 100 blocks/100 MB or so would IMHO reduce the load on the meta-chain and the lightweight clients using it.

This also avoids any synchronization problems with the latest blocks out of the blockchain and the possibility of small forks (orphaned blocks) in the blockchain, since the meta-chain then would only synchronize with blocks that have a high number of confirmations.

I don't see how it would reduce the load any to only update the meta-chain every 100 blocks. It'd just concentrate the update load on certain blocks. The planned tree structure for this would allow O(M*log N) updates, where M is the number of updated transactions and N is the total number of transactions in the tree.
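For intuition on the O(M*log N) claim: each updated leaf only dirties the ~log2(N) internal nodes on its path to the root, and a batch recomputation visits each dirty node once. A toy count of the affected nodes follows; the flat heap indexing is an illustration only, not the proposed tree's actual layout.

Code:
def count_dirty_nodes(touched_leaf_indices, n_leaves):
    """Count internal nodes whose hashes need recomputing for one batch update."""
    dirty = set()
    for i in touched_leaf_indices:
        node = n_leaves + i            # leaves stored at [n, 2n) in a heap-style layout
        while node > 1:
            node //= 2                 # walk up toward the root
            dirty.add(node)
    return len(dirty)

# 1,000 updated leaves in a tree of 10 million entries:
# count_dirty_nodes(range(0, 10_000_000, 10_000), 10_000_000)
# comes out noticeably below 1,000 * 24 because ancestors near the top are shared.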
legendary
Activity: 1145
Merit: 1001
You wouldn't have to update the meta-chain with every individual new block that comes out of the regular blockchain.

The lightweight client that uses the meta-chain could still store a small portion of the regular blockchain, something like the latest 100 blocks (max. 100 MB); alternatively, this could be a configurable amount of storage space.

Only updating the meta-chain every 100 blocks/100 MB or so would IMHO reduce the load on the meta-chain and the lightweight clients using it.

This also avoids any synchronization problems with the latest blocks out of the blockchain and the possibility of small forks (orphaned blocks) in the blockchain, since the meta-chain then would only synchronize with blocks that have a high number of confirmations.
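"Only synchronize with highly confirmed blocks" reduces to a single number in practice; a minimal sketch (the depth of 100 is an assumption taken from the discussion above, not a recommendation):

Code:
CONFIRMATION_DEPTH = 100   # assumed depth after which reorgs are treated as negligible

def meta_chain_tip(chain_tip_height):
    """Height up to which the meta-chain commits; the lightweight client keeps
    the raw blocks above this height itself, so orphaned forks never reach the
    meta-chain."""
    return max(0, chain_tip_height - CONFIRMATION_DEPTH)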
member
Activity: 97
Merit: 10
Do we need a fork for block size? (sorry, I don't know much about this)

Yes, it will be a fork because all current nodes on the network will ignore any block bigger than 1 megabyte. The best way to actually do this is to set a date (well, a block number in practice), say a year or two in the future, and release a version that will keep rejecting blocks bigger than one megabyte until that block number is reached. After that it will accept blocks bigger than one megabyte.

Then, when a block that is bigger than one megabyte is created, the nodes with client versions after the update was made will accept the block and all the older versions will reject it. In practice, there's unlikely to be a real fork unless a significant portion of users refuse to upgrade.
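The schedule described above boils down to a height-gated validity rule; a minimal sketch (the new limit and activation height are hypothetical placeholders):

Code:
MAX_BLOCK_SIZE_OLD = 1_000_000     # current consensus limit, in bytes
MAX_BLOCK_SIZE_NEW = 10_000_000    # hypothetical post-fork limit
FORK_HEIGHT = 300_000              # hypothetical block number agreed on in advance

def block_size_ok(block_size, height):
    """The limit only relaxes once the scheduled block number is reached."""
    limit = MAX_BLOCK_SIZE_NEW if height >= FORK_HEIGHT else MAX_BLOCK_SIZE_OLD
    return block_size <= limit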

- Joel
hero member
Activity: 868
Merit: 1000
About rolling out new features and avoiding block chain splits, what we need is a good system to automatically and democratically add any feature. Just like the implementation schedule of p2sh, but more like my proposal: time-flexible, with an additional temporal sub-chain, and for any feature. It may be difficult and problematic to code it only for one feature, but IMHO it's worth it if it's a generic implementation-deprecation system for determining the validity of blocks.

Maybe I'm misunderstanding the change you're talking about. But I think this is dangerous. I use Bitcoin because I trust the protocol behind it to never change. If the majority or the devs can push a change to the protocol, then I'm out. A way to compress the blockchain is fine. A fork, hard or soft... mmmmm, seems dangerous to me.

There is at least one guaranteed hard fork that is going to happen eventually. Blocks are currently limited to a maximum of 1 megabyte in size. This could start hampering further growth of Bitcoin usage starting sometime next year.

However, this doesn't mean Bitcoin devs can just push any change they want. Users and miners can always opt to not use whatever new version they put out.

In short, getting any change out needs a massive majority support from Bitcoin users to happen.

Do we need a fork for block size? (sorry, I don't know much about this)
member
Activity: 97
Merit: 10
About rolling out new features and avoiding block chain splits, what we need is a good system to automatically and democratically add any feature. Just like the implementation schedule of p2sh, but more like my proposal: time-flexible, with an additional temporal sub-chain, and for any feature. It may be difficult and problematic to code it only for one feature, but IMHO it's worth it if it's a generic implementation-deprecation system for determining the validity of blocks.

Maybe I'm misunderstanding the change you're talking about. But I think this is dangerous. I use Bitcoin because I trust the protocol behind it to never change. If the majority or the devs can push a change to the protocol, then I'm out. A way to compress the blockchain is fine. A fork, hard or soft... mmmmm, seems dangerous to me.

There is at least one guaranteed hard fork that is going to happen eventually. Blocks are currently limited to a maximum of 1 megabyte in size. This could start hampering further growth of Bitcoin usage starting sometime next year.

However, this doesn't mean Bitcoin devs can just push any change they want. Users and miners can always opt to not use whatever new version they put out.

In short, getting any change out needs a massive majority support from Bitcoin users to happen.
legendary
Activity: 905
Merit: 1012
Can compression be used as an intermediate step along the way?  That is, is the blockchain stored on the disk in an efficient manner?  Also, why do noobs have to download the whole block chain from peers?  It's soooo slow.  Couldn't each release come along with a zipped copy of the blockchain up to the date it was released, along with a hard-coded hash of that block chain up till then with a built-in check?  That way the user can just download the whole shebang in one swoop.

These of course are not permanent measures, but maybe they could serve as interim fixes for now.
Most of the sync time is spent doing ECDSA verification and disk I/O. Compression would actually make the problem worse. Packaging the block chain up to the latest checkpoint isn't a bad idea, but won't improve the situation as much as you probably think. The client would still have to verify the block chain, which means hours on end of 100% CPU utilization.

The real solution is to develop a way to avoid verification of historical data without compromising the security that verification provides. That's what this thread is about.
newbie
Activity: 15
Merit: 0
Because most of the data in the block chain is hashes, it's not compressible at all.
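A quick, self-contained way to sanity-check that claim: hashes and signatures are effectively random bytes, and a general-purpose compressor gains nothing on random data (random bytes stand in for real chain data here):

Code:
import os, zlib

data = os.urandom(1_000_000)                   # stand-in for hash/signature-heavy data
ratio = len(zlib.compress(data, 9)) / len(data)
print(f"compressed/original = {ratio:.4f}")    # ~1.0003: no savings, just overhead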
sr. member
Activity: 364
Merit: 250
Can compression be used as an intermediate step along the way?  That is, is the blockchain stored on the disk in an efficient manner?  Also, why do noobs have to download the whole block chain from peers?  It's soooo slow.  Couldn't each release come along with a zipped copy of the blockchain up to the date it was released, along with a hard-coded hash of that block chain up till then with a built-in check?  That way the user can just download the whole shebang in one swoop.

These of course are not permanent measures, but maybe they could serve as interim fixes for now.
legendary
Activity: 2128
Merit: 1073
So as I understand it, the issue here is to build a meta-chain that would contain digests of the contents of the main blockchain,
This isn't a good summary.

The issues are:

1) saving local storage space by purging spent transaction info, but at the same time maintaining cryptographic verifiability of the stored info (see the sketch below).

2) augmenting the p2p protocol such that, in order to participate in the network, a client doesn't have to start from the genesis block and work all the way up to now, but can start at now (or the recent past) and go back in time only to the oldest coin dealt with in the transaction, not all the way back to the genesis block.

3) relaxing the original peer-to-peer protocol to allow at least partial parasite-to-peer operation, where a parasite is a pretend-peer that doesn't fully verify the relayed information but just repeats the latest rumors really fast. The goal is to limit the possible damage caused by such network rumormongers.

My description probably isn't clearer but it is closer to the truth.
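A minimal sketch of point (1): keep only the unspent outputs plus a commitment over them, so spent data can be purged while the survivors stay verifiable. The chained hash here is purely illustrative; a real design would use a Merkle/trie structure so membership proofs stay small.

Code:
import hashlib

def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

class PrunedUtxoSet:
    """Keep only unspent outputs, plus a root hash committing to them."""

    def __init__(self):
        self.utxos = {}                        # (txid, vout) -> serialized output

    def apply_tx(self, txid, spent_outpoints, new_outputs):
        for outpoint in spent_outpoints:       # purge spent outputs entirely
            self.utxos.pop(outpoint, None)
        for vout, data in enumerate(new_outputs):
            self.utxos[(txid, vout)] = data

    def root(self):
        acc = b""
        for (txid, vout) in sorted(self.utxos):
            acc = h(acc, txid, vout.to_bytes(4, "big"), self.utxos[(txid, vout)])
        return acc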
legendary
Activity: 1428
Merit: 1093
Core Armory Developer
So as I understand it, the issue here is to build a meta-chain that would contain digests of the contents of the main blockchain, in a way that would allow a lite-client to query a server for information stored into the blockchain, and use the meta-chain to verify the answer.  And even the meta-chain itself would be constructed in such a way that it can be partially queried and verified using only its root hash, meaning the lite-client would only need a bound amount of storage.

I hope I'm doing a good summary of what is being discussed in this thread, considering that it's pretty late at night here Tongue  At any rate, subscribing, this looks really interesting.

Yeah, fairly accurate.  I'll re-summarize here because my view of my own proposal has evolved over discussions of the last few days, so I figured it was a good time to restate it, anyway Smiley


The first goal is blockchain pruning:  the amount of information needed to store the entire state of the network at any point in time is much less than the entire blockchain history.  You can basically just save the "outer surface" of the transaction map instead of the entire history and still do full validation.  

So I propose a structure that achieves this compression, and further organizes it to accommodate a specific, common problem we want to solve anyway:  a new node gets on the network with its imported wallet and doesn't need the whole chain, but would like to get a complete history of its own transactions in a verifiable manner.  I argue that with a more-straightforward "snapshot" tree, there's still room for deception by malicious peers, albeit not a whole lot.
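From the lite client's side, the verification step is essentially a Merkle-branch check against the snapshot root it already trusts; a generic sketch follows (the proposed tree's exact hashing and ordering rules may differ):

Code:
import hashlib

def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

def verify_branch(leaf: bytes, branch, root: bytes) -> bool:
    """Check a claimed leaf against the trusted snapshot root.

    `branch` is a list of (sibling_hash, sibling_is_left) pairs supplied by an
    untrusted server; a fabricated answer cannot be made to hash up to `root`.
    """
    node = h(leaf)
    for sibling, sibling_is_left in branch:
        node = h(sibling, node) if sibling_is_left else h(node, sibling)
    return node == root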

Either way, I believe it's possible that new nodes can use a structure like this one to get up and running with full confidence in less than 50 MB of downloading, even in the far future.  And such a solution will be necessary, so let's hash it out now...

However, for lite-nodes to reliably use the new information, there must be some kind of enforcement that miners provide correct answers when updating the root node.  This could be done by hard-forking the network by changing the headers to require a valid root node, by soft-forking to require a specific tx or coinbase script to contain a valid root, or, as I propose, by creating a separate chain solely to allow mining power to "vote" on the correct snapshot root.  Such a solution would then be completely optional and transparent to anyone who doesn't know or care about it -- though I would expect most miners and developers would be anxious to leverage it.

As galambo brought up -- the alt-/meta-chain idea is a kind of "staging area" for this new piece of the protocol.  Once the community starts using it and becomes dependent on the information it provides, it could be integrated into the main chain (via hard- or soft-forking) as it would have super-majority support at that point.




newbie
Activity: 31
Merit: 0
So as I understand it, the issue here is to build a meta-chain that would contain digests of the contents of the main blockchain, in a way that would allow a lite-client to query a server for information stored into the blockchain, and use the meta-chain to verify the answer.  And even the meta-chain itself would be constructed in such a way that it can be partially queried and verified using only its root hash, meaning the lite-client would only need a bound amount of storage.

I hope I'm doing a good summary of what is being discussed in this thread, considering that it's pretty late at night here Tongue  At any rate, subscribing, this looks really interesting.
vip
Activity: 1386
Merit: 1140
The Casascius 1oz 10BTC Silver Round (w/ Gold B)
I actually disagree that a hard fork would be required to implement this.  A simple majority of mining power would be enough.  New blocks meeting the new requirements would still be valid blocks to the old clients, the only change being that the majority of miners would work to orphan blocks not containing the proper meta tree root, so miners mining with an old client would have an impossible time getting any blocks.
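A sketch of that majority-enforcement rule as a pure mining policy; the commitment-extraction helper is hypothetical, and old clients never see an invalid block, since non-committing blocks simply get orphaned:

Code:
def should_build_on(block, expected_meta_root, extract_commitment):
    """Policy for upgraded miners: only extend blocks whose coinbase commits to
    the meta tree root that this node computed independently.

    `extract_commitment` is a hypothetical helper that pulls the committed root
    out of the block's coinbase; blocks without a matching commitment stay valid
    under the old rules but lose the race against the upgraded majority.
    """
    return extract_commitment(block) == expected_meta_root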
sr. member
Activity: 966
Merit: 311
hero member
Activity: 868
Merit: 1000
About rolling out new features and avoiding block chain splits, what we need is a good system to automatically and democratically add any feature. Just like the implementation schedule of p2sh, but more like my proposal: time-flexible, with an additional temporal sub-chain, and for any feature. It may be difficult and problematic to code it only for one feature, but IMHO it's worth it if it's a generic implementation-deprecation system for determining the validity of blocks.

Maybe I'm misunderstanding the change you're talking about. But I think this is dangerous. I use Bitcoin because I trust the protocol behind it to never change. If the majority or the devs can push a change to the protocol, then I'm out. A way to compress the blockchain is fine. A fork, hard or soft... mmmmm, seems dangerous to me.
vip
Activity: 1386
Merit: 1140
The Casascius 1oz 10BTC Silver Round (w/ Gold B)
I would also be surprised if someone's upstream connection was stolen Wink

Perhaps you are unfamiliar with how the Internet works in places like Iran and China, where not only do they do MITM attacks on their citizens, they coerce SSL certificate providers to issue them bogus certificates so their citizens will be caught unaware.

Bitcoin needs to work there, too.

http://www.bgr.com/2011/08/30/iranian-government-said-to-be-using-mitm-hack-to-spy-on-gmail-other-google-services/
hero member
Activity: 815
Merit: 1000
It's called SSL, I think.

I would be pretty surprised if nodes started identifying themselves through SSL certificates.
I would also be surprised if someone's upstream connection was stolen Wink

At least with SSL known, it would only happen once before an SSL update was made.

Quote
That said, however, what it looks like you have proposed is tiers of nodes and a structure that includes supernodes.  I actually agree with you that such a structure will be critical to the scalability of the network.
No, my structure could theoretically operate entirely with swarm clients.

However, in the case of mining pools you might have one node orchestrating which hash will be worked on.

The guy who started this thread has super nodes; I don't. Maybe that's where the confusion came from.

I don't like super nodes; I think it's a bad, centralized design.

Quote
DiThi
Could you give me a link to your proposal?