Pages:
Author

Topic: Proposal to solve bitcoin scaling problem - page 2. (Read 4507 times)

hero member
Activity: 767
Merit: 500
yoir your proposal had merit but I think such a fundamental change to the basic bitcoin message is unlikely to occur with the protocol already at the maturity that it's at. I think solutions to this problem are more likely to build on the existing ideas in the bitcoin paper e.g. tree pruning rather than introducing the new elided block construct.

It's a pity this thread is in the newbie forum where not many people will read it. Hopefully the discussion here will promote you out of noob purgatory and you can move this discussion elsewhere.

Will
bji
member
Activity: 112
Merit: 10
To verify a transaction the client doesn't need the whole block chain. It merely stats listening for blocks and waits until it sees a block with the transaction in it. Then it merely has to watch for all subsequent blocks and ensure that those blocks when attached to the block containing the transaction remain the longest chain. An attacker would have to create 'valid' blocks on the block containing the invalid transaction at a faster rate than the rest of the network is generating board blocks on the valid chain in order for the transaction to remain valid.

Will

There will come a time when just listening for blocks is too expensive.  Right now the average block size is something like 10K and that's with very few transactions per second.  Can end users really expect to put their mouth over the bitcoin firehose when the average block size is megabytes?  Can end users ever reasonably be expected to have to watch every transaction that goes by just to pick out their own?

It's working now because the transaction counts are so low.  It will not work at all once bitcoin starts being used 'for real'.

This is why I propose a standardized protocol to allow clients to query for just the transactions, Merkle trees, and block headers that they need, iteratively, and allow for a way for clients to pay peers for delivering these costly pieces of data to them.
legendary
Activity: 1204
Merit: 1015
hero member
Activity: 767
Merit: 500
To verify a transaction the client doesn't need the whole block chain. It merely stats listening for blocks and waits until it sees a block with the transaction in it. Then it merely has to watch for all subsequent blocks and ensure that those blocks when attached to the block containing the transaction remain the longest chain. An attacker would have to create 'valid' blocks on the block containing the invalid transaction at a faster rate than the rest of the network is generating board blocks on the valid chain in order for the transaction to remain valid.

Will
bji
member
Activity: 112
Merit: 10
OP: go read the bitcoin paper. The block chain is intentionally stored as a merkle tree for the very reasons you describe. The paper also describes how a low resource client can trim the tree and yet atill maintain security.

It just happens that the official bitcoin client doesn't do this yet but I expect other clients e.g. bitcoinj that are specifically designed for for resource constrained platforms to implement these optimisations soon enough.

Will

Please read the paper yourself.  The paper doesn't say anything about the problem of distributing the Merkle tree to clients.  That is what my proposal is about.  With the current client and the current protocol, each client will have to download the entire block chain even if they only store the transactions, Merkle trees, and block headers they care about.  This is infeasable and is going to be a big problem within 6 months to 1 year.

Or perhaps there is something obvious I didn't see in the white paper?  Can you explain how clients are going to trim their tree without first downloading gigabytes of data once the block chain is that big?
hero member
Activity: 767
Merit: 500
OP: go read the bitcoin paper. The block chain is intentionally stored as a merkle tree for the very reasons you describe. The paper also describes how a low resource client can trim the tree and yet atill maintain security.

It just happens that the official bitcoin client doesn't do this yet but I expect other clients e.g. bitcoinj that are specifically designed for for resource constrained platforms to implement these optimisations soon enough.

Will
bji
member
Activity: 112
Merit: 10
not to be harsh, you sound like a idiot.

Contradiction?

Quote
the "early" in "early-adopters" was two years ago

OK, if you say so.

Quote
what the hell is an elid? don't be a troll Smiley

Look up the word 'elide' in a dictionary.

Quote
btw, I have a word in my brain called "Netflix"

I don't understand what you mean.
newbie
Activity: 14
Merit: 0
not to be harsh, you sound like a idiot.
the "early" in "early-adopters" was two years ago
what the hell is an elid? don't be a troll Smiley

btw, I have a word in my brain called "Netflix"
bji
member
Activity: 112
Merit: 10
EDIT: Hello, now that this was moved from the newbie forum, I have removed the section at the beginning in which I complained about some unrelated (and probably not interesting to anybody) aspects of bitcoin.  If anyone feels it is important to see these, please let me know and I will re-post them.END EDIT

Proposal: Bitcoin Elided Blocks

Introduction:

The bitcoin protocol in its most basic form does not scale for end users.  Downloading the entire block chain currently, as of June 16, 2011, requires several hundred megabytes of data transfer, and the rate of generation of new data is increasing.  Within a year it is likely that the block chain will be over 1 GB in size.  It is impractical to expect that end users will be able to download this much data to start using bitcoin, and additionally that they will have sufficient resources to download new blocks on an ongoing basis at this increasing rate.

Furthermore, there are some current clients (smartphones as an example) for which the current multi-hundred-megabyte block chain download is already impractical.  Therefore this problem is already relevant to the success of bitcoin (how can bitcoin succeed if new users are unable to practically join?) and will only become more relevant as time goes on.  Right Now is a criticial time for bitcoin: the barriers to entry must be as minimal as possible in order for the user base to grow to a self-sustaining level.

The Problem:

Currently the bitcoin client validates transactions locally by examining the entire block chain.  This is the reason that the client must download the entire block chain - to be able to validate transactions.  The client does not need to download blocks to initiate a transaction, but even then, it is likely that the user will want to maintain an up-to-date local block chain so that they can verify that any payments they send were received.

As of this writing (June 16 2011, 7:17 pm PDT), the total number of blocks is 131356, the average transaction size is 341 and the average block size is 2063, considering the entire block chain.  However, these average values are misleading because they incorporate a large number of blocks that were generated during the first year and a half or so of bitcoin's existence where the blocks and transactions were very small as they were exclusively coin generation blocks.  Using the last 1,000 blocks yields an average transaction size of 419 and an average block size of 22673.

Thus, currently the total size of all blocks is 270987428 bytes (258.433 MiB); if all blocks were as large as the average over the last 1,000 blocks, then the total size would be 2978234588 bytes, or 2.773 GiB.  Even at today's relatively low volume, less than two more years of blocks would occupy almost 3 GiB.  It is clearly impractical for all clients to download and store all of this data.

A Proposed Solution:

A proposed solution is for clients to store "elided blocks".  "Elided blocks" are blocks for which transactions not of interest have been stripped out, leaving only transactions interesting to the client.  For most blocks, this will be all transactions, reducing the block to 80 bytes.  Using the previous example of 131356 blocks, this reduces the storage to a lower bound of 10.021 MiB.  Of course, some blocks will have transactions of interest, but these should add an average only the sizes of the transactions themselves plus the Merkle hash tree entries for that transaction; 1000 bytes seems generous given a 419 byte average transaction size.  Thus a user with 10 active bitcoin addresses, each of which requires a chain of 100 transactions to verify their contents, might need only an extra 0.954 MiB of data.

So how do clients acquire "elided blocks"?  For each client, the transactions of interest are unique, so each client's set of elided blocks is different.  Therefore it is not possible to store elided blocks ahead of time in a shared pool; they must be fabricated on demand for each client.

One way to do this would be for each client to download the entire block chain and then elide transactions of interest.  However, this leaves some problems unsolved:

1. The client still must download the entire block chain as input to the eliding operation.

2. The client would need to re-download the entire block chain whenever a new bitcoin address became interesting to the client, in order to re-elide with the new set of interesting transactions.

This solution is clearly not appropriate as it does not solve one of the fundamental problems (downloading the entire block chain) and it adds an additional problem of requiring re-downloading of the entire block chain over and over again with each new bitcoin address of interest.

A better solution would be for a client to ask its peers for "elided blocks", which would allow the client to download elided blocks in a progressive manner.  The client would indicate bitcoin addresses of interest and its peer would provide the elided blocks.  The client could come back later and query for all new elided blocks since the last time that the client queried in order to acquire the most up-to-date transactional information for bitcoin addresses of interest.  The client could also come back later and query about an entirely new bitcoin address and receive only the elided blocks for that address.

Note that the client must always receive enough information in order to verify the transactions itself.  It is not asked to trust the peer's verification of a transaction chain; instead it must only trust that the peer is delivering accurately elided versions of the true block chain blocks, and it would do the verification itself.

What is to prevent a peer from fabricating elided blocks in response to client queries?  The client simply needs to validate each block header against the set of block headers that it acquired from the peer ahead of time (which themselves should be validated against other sources of the block headers), in order to ensure that the peer could not have fabricated any elided block on demand, as it would be impossible for the peer to fabricate a block and yet have the block header match the true bitcoin block header for that block.

Client Identification:

Because there are costs of storing the entire block chain and eliding blocks on demand, peers are given the option to charge for this service.  The means for accepting payment and verifying account balances are outside of the scope of the bitcoin protocol; but the protocol does include the key ingredient necessary to facilitate such a system: the requests may include an optional bitcoin address and be signed by that address, in order to associate the request with an account, that account being known to the peer by the bitcoin address that pays into it.  This is optional - peers can be completely free and never require signed requests, or can be completely for-hire and only accept signed requests, or accept both kinds of requests and give priority to paid accounts, or anything in between.

Postscript:

This last bit, where I talk about client signed requests, is part of the proposal.  I think that all client requests ought to be optionally signable so that bitcoin peers can decide who they want to talk to (i.e. who has paid them for the service and is carrying a positive balance) and optionally charge.  I think this should be some fundamental part of the protocol that can apply (at the client's discretion and the peer's acceptance) to any protocol message.

I was going to write up the protocol messages that would be added to the bitcoin protocol (to be added to the Bitcoin Protocol wiki page), but I'm not going to bother.  Like I said, maybe some developer will find what I have proposed useful, if not then ... please realize that the problem does need to be solved, even if what I have proposed is not the best way.  But please take my advice: you have to find a way now to solve the current bitcoin client data scalability problem.  In a matter of months, bitcoin clients as they currently stand will be impractical due to the ever-growing size of the block chain.
Pages:
Jump to: