Topic: A proposal for a scalable blockchain. - page 3. (Read 6215 times)

legendary
Activity: 1904
Merit: 1002
November 25, 2011, 05:06:26 PM
#14
To be clear, merkle pruning only applies to clients that need to conserve space.  You still need the entire chain to mine.
member
Activity: 84
Merit: 13
November 25, 2011, 04:53:01 PM
#13
Quote
3. Instead of storing the transactions, the balances are stored, which could cut back on data.

This would require quite a rework on the entire bitcoin network.
Right now it is based on transactions not just for the sake of it, but for the flexibility that scripts provide. Basically, a transaction can have many different inputs, many different outputs, and many different conditions that have to be met to claim the outputs. Throwing all this away and simply storing the balance of each address would not be compatible with that, and would mean we would have to rework the whole concept.
Cool idea though.
full member
Activity: 385
Merit: 110
November 25, 2011, 02:59:17 PM
#12
I do not fully understand your proposal. It seems enormously complex, and it relies on additional communication to provide data which might or might not be available. But reading your ideas gave me an idea of my own:
(Edit/Addition: OK, now I understand your proposal better; I didn't know what a ledger was, lol. I think our ideas are similar, except that my idea was to create a new protocol to exchange the ledger hashes, whereas your idea is apparently to embed them in the block chain, so no new protocol would be necessary. However, a drawback of your idea would be that only miners get to verify the ledger/balance sheet, an obvious flaw I think. It needs to be separated so everybody can have a vote on it! Then again, blocks are the way the network protocol and verification work. This could mean miners are now in control and decide which transactions are valid and which are not. Thus it seems bitcoin comes crashing down: it's no longer p2p, it's no longer everybody in control, only the miners are in control, which is probably a very dangerous situation. At least non-miners can still verify, but rejecting will be useless, it seems, since they can never win.)

How about this instead:

1. Instead of storing all transactions which ever occurred, a point in historic time is chosen at which the software works out the balances of all bitcoin addresses.

2. Bitcoin addresses which have turned into zero balances are thrown away.

3. Instead of storing the transactions, the balances are stored, which could cut back on data.

What kind of problems could this solution face and what could be the additional solutions:

1. Somebody could change his own balance in his own data, to give himself a million dollars.

This would then conflict with the datasets of others.

An idea could be to calculate a hash over all bitcoin address balances.

This hash is then broadcast throughout the system.

The number of confirmations is tracked.

If the majority agrees that the hash is indeed correct it is accepted into a "balance chain".

To make it a little bit more difficult to fake this balance chain, the hash could follow the principle of "difficulty", except that since there is no rush, the difficulty could be set to 100x the current difficulty.
(Perhaps the difficulty setting for the balance chain should match the number of "transaction/block collapses" * "current difficulty". In other words, blocks are calculated every 10 minutes and the balance block is calculated every 1000 blocks, so the difficulty for the balance chain is 1000x the block difficulty, which should keep both in sync.)
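
A rough sketch of the "hash over all balances" step, in case it helps the discussion. Everything here is an assumption made for illustration: the `(sender, receiver, amount)` tuples, the function names, and the text encoding that gets hashed are not part of any bitcoin client. Addresses are sorted before hashing so that two nodes with the same balances always get the same hash.

```python
import hashlib


def balance_snapshot(transfers):
    """Collapse (sender, receiver, amount) transfers into address balances.

    A sender of None stands for newly generated coins (hypothetical convention).
    Addresses that end up with a zero balance are thrown away.
    """
    balances = {}
    for sender, receiver, amount in transfers:
        if sender is not None:
            balances[sender] = balances.get(sender, 0) - amount
        balances[receiver] = balances.get(receiver, 0) + amount
    return {addr: bal for addr, bal in balances.items() if bal != 0}


def snapshot_hash(balances):
    """SHA-256 over the balances in sorted address order (encoding is made up)."""
    digest = hashlib.sha256()
    for addr in sorted(balances):
        digest.update(f"{addr}:{balances[addr]}\n".encode())
    return digest.hexdigest()
```

If somebody edits his own balance, `snapshot_hash` of his data no longer matches what everyone else computes, which is the conflict described above.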

Seems like a simple enough idea.

In principle there is probably no difference between "storing transactions" and "storing the sum of all those transactions"?

Except that those transactions are "hashed into a block chain".

Well the same can be done with a "balance chain".

Perhaps it should also become a "racing balance chain" where the longest balance chain wins.

The difference is however: the balance chain is much harder to calculate.

Also the balance chain lags behind the transaction chain, by for example 1000 or 2000 blocks.

So the transaction chain gets a chance to stabilize, so the balance chain can work on stabilized transaction/block data.
hero member
Activity: 910
Merit: 1005
November 25, 2011, 02:44:54 PM
#11
Fascinating idea - if I understand it. It sounds like this proposal is sort of like an "oral history" of a block chain - as long as enough recently-connected clients are around to testify as to the "recent" history. Is that an accurate description?

Sort of. Miners would be generating a hash that says "this is the result of all transactions at this block"; as long as enough hashing power agrees with this state, it is accepted.

a) when would it be "safe" to "forget" a genesis block from such a blockchain? As soon as all coins generated by it have been spent, and 51% of clients have learned of these spends (received the blocks they're contained in)?

There would be no 100% safe point. I guess the idea would be that most clients would hold blocks from the past few weeks (possibly less, a few days) but miners would hold blocks for a much longer period. If you're holding the blocks for the past year then an attacker would need to put in a year's worth of hashing power to beat it.

b) IF 51% of the network were forced offline for 2+ weeks, could a malevolent actor with 51% hash power step in and present a complete two-week false history?

Yes, assuming every node only had the past two weeks of blocks.

Whatever... waste your time on a solved problem. I'm tired of beating my head against this particular wall. I never said the reference client should have this as default, but there is room for many different clients. Personally, I'm going to focus on solving the lack of usefulness for bitcoin before worrying about what might happen when I and others work our butts off to get us to the point where this discussion even matters. And I'm pretty sure the Satoshi solution is the way to go. Have you read the whitepaper? I highly recommend it. http://www.bitcoin.org/bitcoin.pdf

Satoshi's paper is pretty vague on this. Where is it explained how you prove a transaction was in a block with the Merkle tree? With pruning you still have to hold all unspent outputs, including the transaction hash and scriptPubKey. This method clusters unspent txOutputs and no longer requires the transaction hash.

Edit: Even if you can prove a transaction is in a block with the Merkle tree, you cannot validate a block without all of its transactions. If you can't validate a block then how do you know you haven't been given false data?

If clients cannot validate a block with only the headers, then what is to stop a malicious attacker generating a set of block headers with fake hashes and difficulty targets? There would be no way for nodes to determine which chain is valid without downloading the transactions.
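
For reference, here is a minimal sketch of what a Merkle branch check looks like. It assumes bitcoin's tree rules (double SHA-256 of each pair, last hash duplicated when a level has an odd count) and ignores the byte-order details of real txids; the function names are mine, not the client's.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def _pad(level):
    # Duplicate the last hash when the level has an odd number of entries.
    return level + [level[-1]] if len(level) % 2 else level


def _parents(level):
    return [double_sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(txids):
    """Root of a non-empty list of 32-byte transaction hashes."""
    level = list(txids)
    while len(level) > 1:
        level = _parents(_pad(level))
    return level[0]


def merkle_branch(txids, index):
    """Sibling hashes needed to link txids[index] to the root."""
    branch, level = [], list(txids)
    while len(level) > 1:
        level = _pad(level)
        branch.append(level[index ^ 1])  # the other half of this node's pair
        level = _parents(level)
        index //= 2
    return branch


def verify_branch(txid, index, branch, root):
    """Recompute the root from one txid plus its branch."""
    node = txid
    for sibling in branch:
        node = double_sha256(sibling + node) if index & 1 else double_sha256(node + sibling)
        index >>= 1
    return node == root
```

Note that `verify_branch` only shows that a transaction hash is committed to by a header's Merkle root. It says nothing about whether the other transactions in the block are valid, which is the point being made above.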
full member
Activity: 154
Merit: 102
Bitcoin!
November 25, 2011, 02:00:43 PM
#10
Block contents for blocks not containing addresses you own can be pruned.  That's the whole purpose of the merkle tree structure the blockchain uses.  Future clients will do this and only some nodes will hold the entire chain.
This would mean the system would depend on centralized node(s) to always have the full block chain, since no clients can be expected to have any given block at any given time, right?  I don't know if centralization like that would be a good idea.

Anyone can still mirror the entire chain, most likely including myself.  This is not centralization.
But would there be any built-in incentive to mirror the block chain?  If the default client wouldn't, then that might not be a good direction to go.

Whatever... waste your time on a solved problem. I'm tired of beating my head against this particular wall. I never said the reference client should have this as default, but there is room for many different clients. Personally, I'm going to focus on solving the lack of usefulness for bitcoin before worrying about what might happen when I and others work our butts off to get us to the point where this discussion even matters. And I'm pretty sure the Satoshi solution is the way to go. Have you read the whitepaper? I highly recommend it. http://www.bitcoin.org/bitcoin.pdf
I'm not convinced it's solved.  But I agree with you that bitcoin needs to actually become useful and be adopted before we have to worry about these things.
legendary
Activity: 1904
Merit: 1002
November 25, 2011, 01:55:54 PM
#9
Block contents for blocks not containing addresses you own can be pruned.  That's the whole purpose of the merkle tree structure the blockchain uses.  Future clients will do this and only some nodes will hold the entire chain.
This would mean the system would depend on centralized node(s) to always have the full block chain, since no clients can be expected to have any given block at any given time, right?  I don't know if centralization like that would be a good idea.

Anyone can still mirror the entire chain, most likely including myself.  This is not centralization.
But would there be any built-in incentive to mirror the block chain?  If the default client wouldn't, then that might not be a good direction to go.

Whatever... waste your time on a solved problem. I'm tired of beating my head against this particular wall. I never said the reference client should have this as default, but there is room for many different clients. Personally, I'm going to focus on solving the lack of usefulness for bitcoin before worrying about what might happen when I and others work our butts off to get us to the point where this discussion even matters. And I'm pretty sure the Satoshi solution is the way to go. Have you read the whitepaper? I highly recommend it. http://www.bitcoin.org/bitcoin.pdf
bc
member
Activity: 72
Merit: 10
November 25, 2011, 01:47:47 PM
#8
Fascinating idea - if I understand it. It sounds like this proposal is sort of like an "oral history" of a block chain - as long as enough recently-connected clients are around to testify as to the "recent" history. Is that an accurate description?

So I'm sure that I understand this:

a) when would it be "safe" to "forget" a genesis block from such a blockchain? As soon as all coins generated by it have been spent, and 51% of clients have learned of these spends (received the blocks they're contained in)?

b) IF 51% of the network were forced offline for 2+ weeks, could a malevolent actor with 51% hash power step in and present a complete two-week false history?
full member
Activity: 154
Merit: 102
Bitcoin!
November 25, 2011, 01:44:37 PM
#7
Block contents for blocks not containing addresses you own can be pruned.  That's the whole purpose of the merkle tree structure the blockchain uses.  Future clients will do this and only some nodes will hold the entire chain.
This would mean the system would depend on centralized node(s) to always have the full block chain, since no clients can be expected to have any given block at any given time, right?  I don't know if centralization like that would be a good idea.

Anyone can still mirror the entire chain, most likely including myself.  This is not centralization.
But would there be any built-in incentive to mirror the block chain?  If the default client wouldn't, then that might not be a good direction to go.
legendary
Activity: 1904
Merit: 1002
November 25, 2011, 01:42:04 PM
#6
Block contents for blocks not containing addresses you own can be pruned.  That's the whole purpose of the merkle tree structure the blockchain uses.  Future clients will do this and only some nodes will hold the entire chain.
This would mean the system would depend on centralized node(s) to always have the full block chain, since no clients can be expected to have any given block at any given time, right?  I don't know if centralization like that would be a good idea.

Anyone can still mirror the entire chain, most likely including myself.  This is not centralization.
full member
Activity: 154
Merit: 102
Bitcoin!
November 25, 2011, 01:39:48 PM
#5
Block contents for blocks not containing addresses you own can be pruned.  That's the whole purpose of the merkle tree structure the blockchain uses.  Future clients will do this and only some nodes will hold the entire chain.
This would mean the system would depend on centralized node(s) to always have the full block chain, since no clients can be expected to have any given block at any given time, right?  I don't know if centralization like that would be a good idea.
hero member
Activity: 560
Merit: 501
November 25, 2011, 01:06:58 PM
#4
watching this
legendary
Activity: 1904
Merit: 1002
November 25, 2011, 12:54:40 PM
#3
Block contents for blocks not containing addresses you own can be pruned.  That's the whole purpose of the merkle tree structure the blockchain uses.  Future clients will do this and only some nodes will hold the entire chain.
full member
Activity: 154
Merit: 102
Bitcoin!
November 25, 2011, 12:44:53 PM
#2
Don't have any comments at the moment, but this looks like an interesting proposal.

EDIT: Upon second reading, I don't see any glaringly obvious flaws and it looks like it might work.  I am, however, very much a neophyte in terms of understanding the bitcoin protocol.
hero member
Activity: 910
Merit: 1005
November 25, 2011, 12:42:17 PM
#1
The problem:

The blockchain will not scale the way it is currently used. There is some mention of pruning on https://en.bitcoin.it/wiki/Scalability, however this method still requires storage of all block headers, meaning there is still no cap on the blockchain size. Merkle tree pruning will not help to any significant extent: you may be able to tell whether a transaction is in a block, but you cannot validate that block without all of its transactions. If lightweight clients cannot validate blocks then they cannot mine, relay blocks or relay transactions; there is almost no point in them validating anything at all, and they might as well use a centralised blockchain.

a) A smaller blockchain helps lower the barrier of entry for new users.
b) With less risk of blockchain bloat, transaction fees could be lowered.
c) The larger the blockchain, the fewer users will run the client and the more centralised the network becomes.

Proposed solution.

At certain points in time the client generates a snapshot of every unspent tx output in the chain. This snapshot encapsulates the state of the blockchain up to, but not including, that block. When a miner produces a block he generates a SHA256 hash of this ledger and includes the hash in the block's coinbase.
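
A minimal sketch of the ledger hash, under some assumptions of mine: each unspent output is serialised the way a txOut goes over the network (8-byte little-endian value, script length, script), outputs are fed in blockchain order, and the script length is written as a single byte, which only covers scripts shorter than 253 bytes. The function names are illustrative only.

```python
import hashlib
import struct


def serialize_txout(value_satoshi: int, script_pubkey: bytes) -> bytes:
    """Serialise one output: int64 value, 1-byte script length, script."""
    if len(script_pubkey) >= 0xFD:
        raise ValueError("sketch only handles scripts shorter than 253 bytes")
    return struct.pack("<q", value_satoshi) + bytes([len(script_pubkey)]) + script_pubkey


def ledger_hash(unspent_outputs) -> bytes:
    """SHA256 over all unspent outputs, in the order they are supplied."""
    digest = hashlib.sha256()
    for value_satoshi, script_pubkey in unspent_outputs:
        digest.update(serialize_txout(value_satoshi, script_pubkey))
    return digest.digest()
```

Because the hash depends on the order of the outputs, every client has to walk the unspent set in the same order to arrive at the same ledger hash.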

When a client begins the initial block chain download it starts from the chain head and works backwards. The client downloads a minimum of 2016 blocks before it will accept a ledger hash. If there is a fork in the chain the client will continue to download blocks until it finds a pair of blocks on which at least one ledger hash can be agreed. When an identical ledger is found the chain with the best proof of work wins. When the client accepts a hash it will ask the node to provide it with the full ledger corresponding to it, which the client can hash itself to verify. If the node doesn't have the ledger for that hash it may ask other nodes; if no nodes have a copy then it should continue to download past blocks until it can find a hash and a full copy of the ledger.

To validate a transaction the client locates each txIn outpoint in the unspent ledger and checks the corresponding script for validity. The client checks the validity of a transaction by looking at the txOutputs in its latest ledger and in the transactions included in blocks after it. Therefore nodes do not need to generate a balance sheet for every transaction; instead they would keep a balance sheet for approximately two weeks (2016 blocks) before regenerating. Two weeks has been chosen as a base value because it provides enough blocks to use for difficulty targeting, however nodes are free to keep more or fewer blocks depending on their storage capacity.
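
The lookup described above could be sketched roughly like this. The data shapes are hypothetical: the ledger is a dict keyed by `(txid, output_index)`, a block is a list of transactions, and a transaction is a dict with `txid`, `inputs` and `outputs`. Script checking is left out entirely.

```python
def apply_blocks(ledger, blocks):
    """Unspent set after applying blocks (oldest first) on top of a ledger snapshot."""
    unspent = dict(ledger)  # copy, so the stored snapshot is left untouched
    for block in blocks:
        for tx in block:
            for outpoint in tx["inputs"]:
                del unspent[outpoint]  # KeyError means the block spends a missing output
            for index, output in enumerate(tx["outputs"]):
                unspent[(tx["txid"], index)] = output
    return unspent


def inputs_are_unspent(tx, unspent):
    """True if every input refers to a distinct output that is still unspent."""
    seen = set()
    for outpoint in tx["inputs"]:
        if outpoint not in unspent or outpoint in seen:
            return False
        seen.add(outpoint)
    return True
```

So the client never needs the transactions from before its latest ledger; it only needs the ledger plus the blocks that came after it.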

When the client decides it is time to generate a new ledger it looks through the chain for a more recent block which has a ledger hash in its coinbase and is at least 2016 blocks behind the chain head. It then generates a new ledger sheet for that block and checks that the hash matches. If it matches then the client is free to purge all transactions/blocks before that time. If the hash does not match then it is important to note that the client does not reject the chain, as long as the proof of work is valid. The order of transactions is already decided by the order in the blockchain, so the client would simply wait until a miner produces a hash it can agree with; it should not purge transactions until such a hash is found. Miners may want to keep blocks for a longer period of time to ensure they have the necessary proof of work should it be needed.
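
A sketch of how a client might pick its purge point. The names are hypothetical: `announced` maps a block height to the ledger hash found in that block's coinbase, and `compute_ledger_hash` is whatever function the client uses to build its own ledger for a given height. Falling back to an older announced hash when the newest one does not match is my reading of the rule, not something spelled out above.

```python
MIN_DEPTH = 2016  # a ledger hash must be at least this far behind the chain head


def find_purge_height(announced, head_height, current_ledger_height, compute_ledger_hash):
    """Newest height whose announced ledger hash the client agrees with, else None."""
    newest_eligible = head_height - MIN_DEPTH
    for height in range(newest_eligible, current_ledger_height, -1):
        claimed = announced.get(height)
        if claimed is not None and claimed == compute_ledger_hash(height):
            return height  # safe to purge everything before this block
    return None  # nothing agreed yet: keep all blocks and wait
```

A mismatch never invalidates the chain here; the function just skips that block, and a `None` result means the client holds on to its blocks until a hash it agrees with shows up.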

Would this fork the chain?

No. Miners are free to insert whatever data they like into their block's coinbase. Clients that wish to hold an entire blockchain history can simply ignore it.

How much data would clients need to hold?

Quote
Approximately 4.5 million txOuts and 3.3 million txIns - so ~1.2 million unspent outputs.

At the present blockchain size, the ledger would consume at most:

(256 + 160 + 16 + 64) bits * 1.2 million = 71MB

+ Approximately two weeks' worth of blocks = 100 MB total

This is the initial estimate; with compression it may be possible to halve this value.
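
The arithmetic behind the 71MB figure, spelled out. Two assumptions on my part: the four field widths are in bits, and "MB" means 2^20 bytes. That is the reading under which the numbers come out as quoted.

```python
BITS_PER_ENTRY = 256 + 160 + 16 + 64   # field widths from the estimate, taken as bits
UNSPENT_OUTPUTS = 1_200_000            # ~4.5 million txOuts minus ~3.3 million txIns

LEDGER_BYTES = BITS_PER_ENTRY * UNSPENT_OUTPUTS // 8   # 74,400,000 bytes
LEDGER_MIB = LEDGER_BYTES / 2**20                      # about 70.95, i.e. ~71MB

print(f"{BITS_PER_ENTRY} bits per entry, {LEDGER_MIB:.2f} MiB for the ledger")
```

The remaining ~29MB of the 100 MB total is the allowance for roughly two weeks of blocks.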

Could you mine without the entire blockchain?
Yes. The network could operate fully without any node having the entire blockchain. It is possible that a chain fork could go so far into the past that no nodes have a copy of the chain long enough to resolve the split, however this is extremely unlikely without a malicious attacker having 51% hashing power for a significant period of time.

How would this be adopted, would all miners need to switch immediately?
There needs to be at least one miner producing a ledger hash around every two weeks. So initially this would be possible to implement with only a small pool adopting the scheme. The more frequently miners produce a ledger hash, the more efficiently clients will be able to prune old transactions.

File format
The initial proposed file format would simply be a dump of all unspent txOutputs, in the order they appeared in the blockchain, in the same format as they are serialised over the network. This has the advantage that any bitcoin client that participates on a network level can decode the file with minimum effort.

The file will probably need to be indexed after it is downloaded as it will not be suitable for locating a txOutput efficiently. The file format for the ledger will be included in the coinbase along with the hash. In the file there will be many duplicate scripts and transaction hashes giving the possibility of much greater compression in future.

Coinbase
Magic Value - File format - Ledger size - Hash
uint32_t, uint16_t, uint64_t, uint256_t

** magic value is a flag indicating this coinbase holds a ledger hash
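
A sketch of packing and parsing the coinbase fields listed above. The magic value used here is made up for the example, and little-endian byte order is an assumption (it matches how bitcoin serialises integers, but the layout above does not specify it). The uint256_t hash is carried as 32 raw bytes.

```python
import struct

LEDGER_MAGIC = 0x4C454447  # hypothetical flag value, chosen only for this example
_HEADER = struct.Struct("<IHQ")  # uint32 magic, uint16 file format, uint64 ledger size


def pack_ledger_commitment(file_format: int, ledger_size: int, ledger_hash: bytes) -> bytes:
    """Build the 46-byte payload: magic, file format, ledger size, hash."""
    if len(ledger_hash) != 32:
        raise ValueError("ledger hash must be 32 bytes")
    return _HEADER.pack(LEDGER_MAGIC, file_format, ledger_size) + ledger_hash


def unpack_ledger_commitment(data: bytes):
    """Return (file_format, ledger_size, ledger_hash), or None if this is not a ledger payload."""
    if len(data) != _HEADER.size + 32:
        return None
    magic, file_format, ledger_size = _HEADER.unpack_from(data)
    if magic != LEDGER_MAGIC:
        return None
    return file_format, ledger_size, data[_HEADER.size:]
```

Clients that do not know about the scheme never look for the magic value, so they simply treat the payload as arbitrary coinbase data.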


/Discuss. Feel free to point out any glaringly obvious flaws.
