
Topic: [Proposal] Merkle tree of unspent transactions (MTUT) (Read 1810 times)

kjj
legendary
Activity: 1302
Merit: 1026
If this was just pruning, then I'd agree with you that he had more important things to do.  But this is way beyond mere pruning.  This effectively moves all of the state and all of the history of the entire system into the current block.
full member
Activity: 156
Merit: 100
Firstbits: 1dithi
Blockchain size was not a problem for almost 3 years. Satoshi disappeared well before this. He didn't implement tree pruning because there were much more important issues to think about.
kjj
legendary
Activity: 1302
Merit: 1026
Why do you think Satoshi didn't choose to keep the entire current state present?
I don't understand the question.
I didn't either, at first, but I think what he's trying to get at is: Why didn't Satoshi do this from the beginning? And before you say anything, keep in mind that more often than not, Satoshi had an explicit reason for what he did. This is a valid question that all changes to Bitcoin should have thoroughly answered before they move forward.

Bingo.  And at this point, that is my only remaining big objection.  It could be that he just didn't think of it, but it seems more likely that he did think of it and rejected it for some reason unknown to us.

I still think that the logistics and hashing of the tree will be troublesome, but not unsolvable.  And I may be wrong about this and the exact implementation offered will be good enough as-is.

You would still need to fetch the block body for any block involved in a transaction that you care about.
Only the transactions used as inputs, plus the Merkle tree and the MTUT verification hashes. Unless you want to validate those transactions too; in that case you need the full blockchain, but that shouldn't be needed if this is implemented correctly.
I see your objection that you would then have to trust a third party to do the search to make sure that the transaction hasn't already been spent, but that really isn't a big deal.
It is a big deal. Unless new users are encouraged to download Electrum before trying the official client. Let's change bitcoin.org, bitcoin.it, weusecoins, Wikipedia, etc., to point to Electrum instead of the full client. I mean, the idea of thin clients is great, but such functionality is not built into the official client because it somewhat defeats the purpose of decentralization (and it's a different protocol/API, etc.).

This was for the case of a light client using a different modification of the system, not for MTUT.  The modifications to the protocol to allow searches by address or transaction ID would be pretty simple, and would support light clients in almost all situations.  Or, you could do it with the bitcoin protocol without changes.  With bitcoin + electrum you could search electrum for the blocks containing the transactions you need, then use bitcoin (the protocol) to fetch and verify them.
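The fetch-and-verify step described here amounts to checking a Merkle branch: given a transaction hash, the sibling hashes along its path, and the root committed in a block header, a client can confirm inclusion without downloading the full block. A minimal sketch, assuming Bitcoin-style double SHA-256 (the function names are illustrative, not from any client's API):

```python
import hashlib

def dsha256(data: bytes) -> bytes:
    """Bitcoin-style double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def verify_merkle_branch(tx_hash: bytes, branch: list, index: int, root: bytes) -> bool:
    """Recompute the root from a leaf hash and its sibling path.

    `branch` lists the sibling hash at each level, leaf to root;
    `index` is the leaf's position, used to pick concatenation order.
    """
    h = tx_hash
    for sibling in branch:
        if index % 2 == 0:   # current node is a left child
            h = dsha256(h + sibling)
        else:                # current node is a right child
            h = dsha256(sibling + h)
        index //= 2
    return h == root
```

A light client would fetch only the branch for the transaction it cares about, which is logarithmic in the number of transactions per block.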
legendary
Activity: 1204
Merit: 1015
Why do you think Satoshi didn't choose to keep the entire current state present?
I don't understand the question.
I didn't either, at first, but I think what he's trying to get at is: Why didn't Satoshi do this from the beginning? And before you say anything, keep in mind that more often than not, Satoshi had an explicit reason for what he did. This is a valid question that all changes to Bitcoin should have thoroughly answered before they move forward.
full member
Activity: 156
Merit: 100
Firstbits: 1dithi
I still think you are underestimating the number of hashes that will need to be done, and how often.
Please enlighten me. Show me the numbers.

You would still need to fetch the block body for any block involved in a transaction that you care about.
Only the transactions used as inputs, plus the Merkle tree and the MTUT verification hashes. Unless you want to validate those transactions too; in that case you need the full blockchain, but that shouldn't be needed if this is implemented correctly.

I see your objection that you would then have to trust a third party to do the search to make sure that the transaction hasn't already been spent, but that really isn't a big deal.
It is a big deal. Unless new users are encouraged to download Electrum before trying the official client. Let's change bitcoin.org, bitcoin.it, weusecoins, Wikipedia, etc., to point to Electrum instead of the full client. I mean, the idea of thin clients is great, but such functionality is not built into the official client because it somewhat defeats the purpose of decentralization (and it's a different protocol/API, etc.).

Why do you think Satoshi didn't choose to keep the entire current state present?
I don't understand the question.
kjj
legendary
Activity: 1302
Merit: 1026
Double that, since the standard hashing that we do is highly optimized in ways that aren't useful for hashing a tree.  But something that takes 6 minutes on a high end GPU is no longer a lightweight, or even middleweight client, since it will exclude 99.9% of all current nodes.  And you are talking about something that every node must do.
By the time we have a 10 TB blockchain (I repeat: 10 TERABYTES), I'm sure you'll need much less than 6 minutes. And again, that's an operation for the whole tree. A single change only needs approx 11 hashes (in my 10 year example).

I still think you are underestimating the number of hashes that will need to be done, and how often.

This isn't a threat to the network, this is a threat to one node.  Without the whole block chain, you can't add up the difficulty sum by tracing it all the way back to the genesis block.
You can; the block headers are all you need (40 MB for 10 years of headers). Satoshi already thought of that. Read his paper again if unsure.

Good point.

But with that in mind, does any light client need more than just the headers and the latest X blocks, where X is the reorg depth you are worried about?  You would still need to fetch the block body for any block involved in a transaction that you care about.  I see your objection that you would then have to trust a third party to do the search to make sure that the transaction hasn't already been spent, but that really isn't a big deal.

Why do you think Satoshi didn't choose to keep the entire current state present?  With your system, not even the most paranoid users would ever need to keep more than a few hundred blocks on hand.
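The headers-only argument above can be made concrete: summing chain work needs only each header's compact difficulty field (`bits`), no transaction data at all. A rough sketch of the arithmetic (the compact-target decoding is simplified and ignores the sign bit and overflow cases):

```python
def bits_to_target(bits: int) -> int:
    """Decode the compact 'bits' field into the full 256-bit target
    (simplified: assumes a well-formed, positive encoding)."""
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    return mantissa << (8 * (exponent - 3))

def block_work(bits: int) -> int:
    """Expected number of hashes to find a block meeting this target:
    2^256 / (target + 1)."""
    return (1 << 256) // (bits_to_target(bits) + 1)

def chain_work(header_bits: list) -> int:
    """Cumulative work of a chain, computed from headers alone."""
    return sum(block_work(b) for b in header_bits)
```

This is why a node holding only headers can still pick the chain with the most total work and detect the attack kjj describes, as long as the headers themselves are valid.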
full member
Activity: 156
Merit: 100
Firstbits: 1dithi
Double that, since the standard hashing that we do is highly optimized in ways that aren't useful for hashing a tree.  But something that takes 6 minutes on a high end GPU is no longer a lightweight, or even middleweight client, since it will exclude 99.9% of all current nodes.  And you are talking about something that every node must do.
By the time we have a 100 TB blockchain (I repeat: 100 TERABYTES), I'm sure you'll need much less than 6 minutes. And again, that's an operation for the whole tree. A single change only needs approx 11 hashes (in my 10 year example).

This isn't a threat to the network, this is a threat to one node.  Without the whole block chain, you can't add up the difficulty sum by tracing it all the way back to the genesis block.
You can; the block headers are all you need (40 MB for 10 years of headers). Satoshi already thought of that. Read his paper again if unsure.

Edit: s/10/100/g
kjj
legendary
Activity: 1302
Merit: 1026
Interesting, but I see a couple of things.

1.  All of the unspent transactions on the network since the beginning of time is a lot of hashing to have to do for every block.

2.  You can't really trust the blockchain unless you've verified it all yourself.  You are taking a big risk if you grab a recent block and just assume that it is good.  There are ways to mitigate this risk to some extent, but the only way to eliminate it entirely is to do all of the math yourself.

1. It's 105120000000*2 hashes for the theoretical 10-year period I've calculated. 6 minutes with a current GPU. That's almost nothing for a blockchain of almost 100 TB. According to this there are ~1.2 million unspent transactions (2 months ago); let's assume we have 2 million, so we need about 4 million hashes (assuming the same number of tx per block). How long does your GPU need to hash that? Typical mining GPUs do it in 0.01 seconds. My CPU maybe takes a couple of seconds. Much less than the time it needs to check all the ECDSA signatures.

Double that, since the standard hashing that we do is highly optimized in ways that aren't useful for hashing a tree.  But something that takes 6 minutes on a high end GPU is no longer a lightweight, or even middleweight client, since it will exclude 99.9% of all current nodes.  And you are talking about something that every node must do.

2. If the majority of hashing power rejects blocks with an invalid MTUT, it's impossible for a block with 6 confirmations to be wrong. Even if you accept a bogus MTUT, what are you risking? That a transaction you're doing is rejected? If you're paranoid, you would wait for a transaction to have 6 confirmations anyway.

Compare this solution with the current protocols for "thin" clients being in development (not one or two, but three!). MTUT doesn't require trusting a third party. The only issue with not downloading the whole chain is anonymity.

This isn't a threat to the network, this is a threat to one node.  Without the whole block chain, you can't add up the difficulty sum by tracing it all the way back to the genesis block.  What is stopping someone from making 6 chained blocks claiming to have more total difficulty than the rest of the network, but a lower current difficulty, and feeding them to you?
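The hashing-cost disagreement above comes down to rebuild-versus-update arithmetic: a binary Merkle tree over n leaves has roughly 2n - 1 nodes, so rebuilding it from scratch costs on that order of hash calls, while changing one leaf only dirties the log2(n) nodes on its path to the root. A rough sketch (exact constants depend on the proposal's actual tree layout):

```python
import math

def full_rebuild_hashes(n_leaves: int) -> int:
    """A binary Merkle tree over n leaves has about 2n - 1 nodes,
    so rebuilding it from scratch costs about that many hashes."""
    return 2 * n_leaves - 1

def single_update_hashes(n_leaves: int) -> int:
    """Updating one leaf recomputes only the internal nodes on its
    path to the root: ceil(log2(n)) hashes (plus the leaf itself)."""
    return math.ceil(math.log2(n_leaves))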
full member
Activity: 156
Merit: 100
Firstbits: 1dithi
Interesting, but I see a couple of things.

1.  All of the unspent transactions on the network since the beginning of time is a lot of hashing to have to do for every block.

2.  You can't really trust the blockchain unless you've verified it all yourself.  You are taking a big risk if you grab a recent block and just assume that it is good.  There are ways to mitigate this risk to some extent, but the only way to eliminate it entirely is to do all of the math yourself.

1. It's 105120000000*2 hashes for the theoretical 10-year period I've calculated. 6 minutes with a current GPU. That's almost nothing for a blockchain of almost 100 TB. According to this there are ~1.2 million unspent transactions (2 months ago); let's assume we have 2 million, so we need about 4 million hashes (assuming the same number of tx per block). How long does your GPU need to hash that? Typical mining GPUs do it in 0.01 seconds. My CPU maybe takes a couple of seconds. Much less than the time it needs to check all the ECDSA signatures.

2. If the majority of hashing power rejects blocks with an invalid MTUT, it's impossible for a block with 6 confirmations to be wrong. Even if you accept a bogus MTUT, what are you risking? That a transaction you're doing is rejected? If you're paranoid, you would wait for a transaction to have 6 confirmations anyway.

Compare this solution with the current protocols for "thin" clients being in development (not one or two, but three!). MTUT doesn't require trusting a third party. The only issue with not downloading the whole chain is anonymity.
kjj
legendary
Activity: 1302
Merit: 1026
Interesting, but I see a couple of things.

1.  All of the unspent transactions on the network since the beginning of time is a lot of hashing to have to do for every block.

2.  You can't really trust the blockchain unless you've verified it all yourself.  You are taking a big risk if you grab a recent block and just assume that it is good.  There are ways to mitigate this risk to some extent, but the only way to eliminate it entirely is to do all of the math yourself.
full member
Activity: 156
Merit: 100
Firstbits: 1dithi
Yep, that has been my motivation to write this proposal. But much worse than disk space is the problem of bandwidth, especially in this P2P fashion. I've written this because it is possible to have a fully functional and decentralized client that just downloads a couple of megabytes to start working. If you read my proposal you'll see I've done the math as if Bitcoin already had the volume of big payment processors.
legendary
Activity: 1890
Merit: 1086
Ian Knowles - CIYAM Lead Developer
Whether this method or some other I think that the issue of the growing size of the block chain is going to be one of the biggest problems down the track.

If the number of transactions grew tenfold in the next year (which would still be pretty much nothing compared to the big commercial transaction processors) then I'd be starting to run out of disk space and waiting *days* for the block chain to download would not make for a very good first impression with new users.


Cheers,

Ian.
full member
Activity: 156
Merit: 100
Firstbits: 1dithi
Hello,

I've written this proposal. C&P of the overview:

Quote
Satoshi's original paper describes a way of pruning spent transactions in the blockchain to save storage space while the chain remains consistent and verifiable, but it's useless for partial blockchain downloads: while you can know whether a given transaction is in the blockchain, you can't know whether it has been spent in a subsequent transaction.

This proposal describes how to add a hash-tree based check to the blockchain that allows one to verify whether a transaction is unspent without downloading and checking the whole blockchain. The idea is not new, but at the time of this writing there isn't any technical description of how this should be done. Additionally, this solution is rather simple.

Read the rest here: https://en.bitcoin.it/wiki/User:DiThi/MTUT
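As a rough illustration of the kind of commitment the overview describes, one could Merkle-ize the set of unspent outpoints into a single root per block. This is only a sketch; `mtut_root` and its leaf encoding are hypothetical, not the proposal's actual format (see the wiki page for that):

```python
import hashlib

def dsha256(b: bytes) -> bytes:
    """Bitcoin-style double SHA-256."""
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

def merkle_root(leaves: list) -> bytes:
    """Fold a list of leaf hashes up to a single root, duplicating the
    last node on odd-sized levels (as Bitcoin's transaction tree does)."""
    if not leaves:
        return b"\x00" * 32
    level = leaves[:]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [dsha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

def mtut_root(unspent_outpoints) -> bytes:
    """Hypothetical MTUT commitment: hash each (txid, vout) outpoint in
    sorted order and Merkle-ize, so every node computing it over the
    same unspent set gets the same root."""
    leaves = [dsha256(txid + vout.to_bytes(4, "little"))
              for txid, vout in sorted(unspent_outpoints)]
    return merkle_root(leaves)
```

Sorting the outpoints before hashing is what makes the root deterministic regardless of the order in which a node learned about the outputs.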

Cheers