
Topic: Is the feature "reclaiming disk space" really implemented in Bitcoin Core? (Read 2207 times)

hero member
Activity: 771
Merit: 528
I get your point.
Using my first example, if e.g. a malicious node pruned the tx "Bob->Charlie (25BTC)", then the tx "coinbase -> Alice (50BTC)" would be partially orphaned, since it was originally linked to the tx "Alice->Bob (25BTC)" and then became linked to "Bob->Charlie (25BTC)". Alice would effectively get all the coins back from Charlie.
So the only solution would be to enforce CoinJoin as gmaxwell suggested, however these coinjoin transactions are interactive and optional.
Maybe an incentive to the users to use such transactions (zero tx fees?) would help, which would in turn enable pruning the intermediate coinjoin transactions safely.
staff
Activity: 3374
Merit: 6530
Just writing some code
Ok let me rephrase my question: Suppose we change the security model of bitcoin, and enforce transaction pruning at the blockchain level (not at the client level) in the fashion described above, for blocks whose height is < (H - N), where H is the current block height and N is a set constant. Would that model be insecure? If not, why?
It is still insecure.

The case that you must consider is a new full node that is just coming online and is syncing the blockchain. It has to download it from its peers. So how does it know that a peer didn't just prune a bunch of transactions that have unspent outputs that are a couple thousand blocks deep and relay that version of the blockchain to them?
hero member
Activity: 771
Merit: 528
You need to either know where Bob got the output to verify that the block is valid or you need to have some other way to prove that a block is valid. You can't just assume a block is valid even if it is deep in the blockchain, that's not the security model of Bitcoin.

Ok let me rephrase my question: Suppose we change the security model of bitcoin, and enforce transaction pruning at the blockchain level (not at the client level) in the fashion described above, for blocks whose height is < (H - N), where H is the current block height and N is a set constant. Would that model be insecure? If not, why?
staff
Activity: 3374
Merit: 6530
Just writing some code
Why would you need to know where Bob got the output? If that transaction is included in a verified block, it means that it is valid. And if that block is like 1000 blocks behind the current block, it's impossible to change it.
(I am not talking about how the current bitcoin protocol works).
I was thinking of something like transaction cut-through https://bitcointalksearch.org/topic/transaction-cut-through-281848 which is already being implemented in Mimblewimble.
You need to either know where Bob got the output to verify that the block is valid or you need to have some other way to prove that a block is valid. You can't just assume a block is valid even if it is deep in the blockchain, that's not the security model of Bitcoin.
hero member
Activity: 771
Merit: 528
You can do that locally once you have downloaded and verified the blockchain. You cannot do that to the blockchain as a whole because I don't know whether the transaction "Bob->Charlie (25BTC)" is actually legitimate when I am syncing a new node. For me to check that it is legit, I need to know where Bob got the output to spend. Just because a transaction is in a block with a valid proof of work does not automatically mean that all transactions in the block are valid; that's not how Bitcoin works.

You can certainly do this locally as that is basically what Satoshi suggests in the whitepaper. But as gmaxwell pointed out above, what we do now for pruning locally is way more efficient than what Satoshi suggests. Satoshi suggests that we throw away parts of blocks as UTXOs are spent. But what we do is that we maintain a separate database with our UTXOs and chainstate data so we don't actually need to have the blocks themselves. So we just throw away old blocks entirely because we have validated them and taken the things from them that we need and stored them elsewhere in a more compact form.


Why would you need to know where Bob got the output? If that transaction is included in a verified block, it means that it is valid. And if that block is like 1000 blocks behind the current block, it's impossible to change it.
(I am not talking about how the current bitcoin protocol works).
I was thinking of something like transaction cut-through https://bitcointalksearch.org/topic/transaction-cut-through-281848 which is already being implemented in Mimblewimble.
staff
Activity: 3374
Merit: 6530
Just writing some code
Ok let me express a simple example:
Suppose Alice got 50BTC from a coinbase transaction on block #n. Alice then transfers 25BTC to Bob on block #(n+1), which results in Bob having 25BTC and Alice holding a 25BTC UTXO.
Up to that point, we need all blocks and transactions for blockchain validation.
Then on block #(n+2) Bob sends Charlie all of his funds, 25BTC, leaving Bob with 0 BTC.
Now the transaction "Alice->Bob (25BTC)" does not need to remain in block #(n+1), since Bob has 0 UTXOs and the transaction "Bob->Charlie (25BTC)" was verified on block #(n+2).
Also this improves privacy since it makes it harder to link transactions and taint coins.

I believe that's what Satoshi means in section 7 of his whitepaper by pruning Tx0-2 from the block on the right.
If this implementation requires a hard-fork, that's another story..
Correct me if I'm wrong.
You can do that locally once you have downloaded and verified the blockchain. You cannot do that to the blockchain as a whole because I don't know whether the transaction "Bob->Charlie (25BTC)" is actually legitimate when I am syncing a new node. For me to check that it is legit, I need to know where Bob got the output to spend. Just because a transaction is in a block with a valid proof of work does not automatically mean that all transactions in the block are valid; that's not how Bitcoin works.

You can certainly do this locally as that is basically what Satoshi suggests in the whitepaper. But as gmaxwell pointed out above, what we do now for pruning locally is way more efficient than what Satoshi suggests. Satoshi suggests that we throw away parts of blocks as UTXOs are spent. But what we do is that we maintain a separate database with our UTXOs and chainstate data so we don't actually need to have the blocks themselves. So we just throw away old blocks entirely because we have validated them and taken the things from them that we need and stored them elsewhere in a more compact form.
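
The local pruning described above can be illustrated with a minimal Python sketch. It assumes a toy transaction model: the `Tx` and `Node` classes and their fields are hypothetical and are not Bitcoin Core's data structures. Scripts, signatures, fees and the block subsidy are ignored. It only shows the idea of validating each block against a UTXO set, keeping that set, and deleting old blocks afterwards, using the Alice/Bob/Charlie example from this thread.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tx:
    txid: str
    inputs: tuple   # (txid, output_index) pairs being spent; empty for a coinbase
    outputs: tuple  # amounts in BTC, one per output


@dataclass
class Node:
    utxos: dict = field(default_factory=dict)   # (txid, index) -> amount (toy "chainstate")
    blocks: dict = field(default_factory=dict)  # height -> list of Tx (raw block store)

    def connect_block(self, height, txs):
        """Validate every transaction against the UTXO set, then store the block."""
        utxos = dict(self.utxos)  # work on a copy so a bad block changes nothing
        for tx in txs:
            if tx.inputs:  # coinbase transactions have no inputs in this toy model
                spent = 0
                for outpoint in tx.inputs:
                    if outpoint not in utxos:
                        raise ValueError(f"{tx.txid} spends an unknown or spent output")
                    spent += utxos.pop(outpoint)
                if sum(tx.outputs) > spent:
                    raise ValueError(f"{tx.txid} creates coins out of nothing")
            for index, amount in enumerate(tx.outputs):
                utxos[(tx.txid, index)] = amount
        self.utxos = utxos
        self.blocks[height] = list(txs)

    def prune(self, keep_from_height):
        """Delete already-validated blocks below the given height; keep the UTXO set."""
        for height in [h for h in self.blocks if h < keep_from_height]:
            del self.blocks[height]


node = Node()
node.connect_block(1, [Tx("coinbase", (), (50,))])
node.connect_block(2, [Tx("alice_to_bob", (("coinbase", 0),), (25, 25))])
node.connect_block(3, [Tx("bob_to_charlie", (("alice_to_bob", 0),), (25,))])
node.prune(keep_from_height=3)  # blocks 1 and 2 are gone, the UTXO set is intact
```

In this sketch, a fresh `Node()` that is handed only block 3 has no entry for `("alice_to_bob", 0)` in its UTXO set, so `connect_block` raises `ValueError`. That is the situation of a newly syncing node that receives a history with transactions removed.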
hero member
Activity: 771
Merit: 528

No. Pruning is working exactly as intended. Pruning and what Satoshi said in the whitepaper are two completely different things.
...
Because without the full transaction history, that data can be forged. You can't know whether a UTXO is legitimate without knowing the transaction that created it and what that transaction spent. You need the full transaction history to verify the validity of a UTXO. With UTXO commitments (which do not yet exist) we could do that, but we will need a fork to enable such functionality.

Ok let me express a simple example:
Suppose Alice got 50BTC from a coinbase transaction on block #n. Alice then transfers 25BTC to Bob on block #(n+1), which results in Bob having 25BTC and Alice holding a 25BTC UTXO.
Up to that point, we need all blocks and transactions for blockchain validation.
Then on block #(n+2) Bob sends Charlie all of his funds, 25BTC, leaving Bob with 0 BTC.
Now the transaction "Alice->Bob (25BTC)" does not need to remain in block #(n+1), since Bob has 0 UTXOs and the transaction "Bob->Charlie (25BTC)" was verified on block #(n+2).
Also this improves privacy since it makes it harder to link transactions and taint coins.

I believe that's what Satoshi means in section 7 of his whitepaper by pruning Tx0-2 from the block on the right.
If this implementation requires a hard-fork, that's another story..
Correct me if I'm wrong.
staff
Activity: 3374
Merit: 6530
Just writing some code
Ok as far as I understand, you have to download the whole ~150GB blockchain nevertheless, then enable pruning afterwards to be left with a couple of gigabytes of blockchain data.
No. You can enable pruning at any time and it will reduce the amount of space used on disk to a few GB at most. You do not, at any point in time, need to have the full blockchain on disk.

The other part that I understand is that not all nodes can have pruning enabled, some nodes must keep the whole blockchain anyway. All this makes pruning much less effective.
That is correct.

Reading section 7 of Satoshi's white paper, it seems that the current pruning functionality is not working as intended.
No. Pruning is working exactly as intended. Pruning and what Satoshi said in the whitepaper are two completely different things.

So why don't we just store coinbase transactions and UTXOs? Am I missing something?
Because without the full transaction history, that data can be forged. You can't know whether a UTXO is legitimate without knowing the transaction that created it and what that transaction spent. You need the full transaction history to verify the validity of a UTXO. With UTXO commitments (which do not yet exist) we could do that, but we will need a fork to enable such functionality.
hero member
Activity: 771
Merit: 528
Ok as far as I understand, you have to download the whole ~150GB blockchain nevertheless, then enable pruning afterwards to be left with a couple of gigabytes of blockchain data.
The other part that I understand is that not all nodes can have pruning enabled, some nodes must keep the whole blockchain anyway. All this makes pruning much less effective.
Reading section 7 of Satoshi's white paper, it seems that the current pruning functionality is not working as intended.
So why don't we just store coinbase transactions and UTXOs? Am I missing something?
legendary
Activity: 2618
Merit: 1252
You need a matching UTXO set for every block you want to start with. Practically it might be enough to have a UTXO snapshot every month or every year.

The identification of the correct chain is easy. Only the correct block header chain will lead to the already validated block hash. A different chain will lead to a different block hash unless there is a collision.
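
The chain-identification argument above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: a "header" here is just a `(previous_hash, payload)` pair hashed with double SHA-256, not a real 80-byte Bitcoin header, proof of work is not checked, and the function names are hypothetical.

```python
import hashlib

GENESIS_PREV = b"\x00" * 32  # placeholder "previous hash" for the first header


def header_hash(prev_hash: bytes, payload: bytes) -> bytes:
    """Double SHA-256 over the previous hash and the header payload."""
    return hashlib.sha256(hashlib.sha256(prev_hash + payload).digest()).digest()


def build_chain(payloads):
    """Build a list of (prev_hash, payload) headers, each linked to the one before."""
    headers, prev = [], GENESIS_PREV
    for payload in payloads:
        headers.append((prev, payload))
        prev = header_hash(prev, payload)
    return headers


def leads_to(headers, trusted_tip_hash: bytes) -> bool:
    """Check that the headers link together and end at the already validated hash."""
    prev = GENESIS_PREV
    for claimed_prev, payload in headers:
        if claimed_prev != prev:
            return False  # broken link: this header does not extend the previous one
        prev = header_hash(claimed_prev, payload)
    return prev == trusted_tip_hash


honest = build_chain([b"block 1", b"block 2", b"block 3"])
trusted_tip = header_hash(*honest[-1])  # the block hash the user already validated
other = build_chain([b"block 1", b"forged block 2", b"block 3"])
```

Here `leads_to(honest, trusted_tip)` is true, while `leads_to(other, trusted_tip)` is false: any change to an earlier header changes every later hash, so a different chain ends at a different tip unless a hash collision is found.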
legendary
Activity: 3388
Merit: 4615
The idea is to store a UTXO hash in the block header. If nodes or any other source provide a UTXO set for download, I can bootstrap from the last verified block (user-provided checkpoint). This should be safe as long as there is no easy way to create hash collisions.

So every node would need to keep track of two UTXO sets? The current set, and the confirmed set?

What if you receive two different valid blocks with two different UTXO hashes? How will your node know which is the correct hash?

legendary
Activity: 2618
Merit: 1252
It's enough to verify the blockchain once. If you need to bootstrap another node, you could start with the last verified block (hash) and the corresponding set of unspent outputs (hash).
Are you suggesting storing the complete UTXO list in every block? Or are you suggesting that nodes share their UTXO list with peers, and that the block just store a hash of the UTXO list?

The idea is to store a UTXO hash in the block header. If nodes or any other source provide a UTXO set for download, I can bootstrap from the last verified block (user-provided checkpoint). This should be safe as long as there is no easy way to create hash collisions.
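
A rough Python sketch of that bootstrap check follows. As noted elsewhere in this thread, UTXO commitments do not exist in Bitcoin, so everything here is an assumption made for illustration: the serialization format, the use of a plain SHA-256 over the sorted set, and the names `utxo_set_hash` and `snapshot_matches` are all hypothetical.

```python
import hashlib


def utxo_set_hash(utxos: dict) -> str:
    """Hash a UTXO set in a canonical (sorted) order so equal sets give equal hashes.

    `utxos` maps (txid, output_index) to an amount in satoshis. The text
    serialization below is invented for this sketch.
    """
    digest = hashlib.sha256()
    for (txid, index), amount in sorted(utxos.items()):
        digest.update(f"{txid}:{index}:{amount};".encode())
    return digest.hexdigest()


def snapshot_matches(snapshot: dict, committed_hash: str) -> bool:
    """Accept a downloaded snapshot only if it hashes to the committed value."""
    return utxo_set_hash(snapshot) == committed_hash


# The hash that would be committed in the header of the last verified block.
honest_set = {("aa", 0): 2_500_000_000, ("bb", 1): 2_500_000_000}
commitment = utxo_set_hash(honest_set)

# A peer that adds an output that never existed cannot match the commitment.
forged_set = dict(honest_set)
forged_set[("cc", 0)] = 5_000_000_000
```

With these values `snapshot_matches(honest_set, commitment)` is true and `snapshot_matches(forged_set, commitment)` is false. The check only ties the snapshot to the committed hash; whether that hash itself can be trusted still depends on how the block carrying it was verified, which is the concern raised in the replies.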
legendary
Activity: 3388
Merit: 4615
It's enough to verify the blockchain once. If you need to bootstrap another node, you could start with the last verified block (hash) and the corresponding set of unspent outputs (hash).

Are you suggesting storing the complete UTXO list in every block? Or are you suggesting that nodes share their UTXO list with peers, and that the block just store a hash of the UTXO list?

Either way seems to require a significant amount of trust in the list that you receive.
legendary
Activity: 2618
Merit: 1252
... This however does require still downloading all 110+ GB ...
With a UTXO commitment in the block header it would not be necessary to always download the complete blockchain to bootstrap a new node.
Does a node not always need to be completely synced? Meaning downloading the whole blockchain?

It's enough to verify the blockchain once. If you need to bootstrap another node, you could start with the last verified block (hash) and the corresponding set of unspent outputs (hash).
newbie
Activity: 41
Merit: 0
... This however does require still downloading all 110+ GB ...

With a UTXO commitment in the block header it would not be necessary to always download the complete blockchain to bootstrap a new node.

Does a node not always need to be completely synced? Meaning downloading the whole blockchain?

I think it is very interesting that Satoshi has some code in there that is not active.
No. That is not at all what this thread is about. Satoshi came up with an idea in the whitepaper, but his idea specifically was never implemented.

So to sum this up: the disk space is currently not checked in the Core GitHub source? Or did I understand it wrong?
Again, you are completely missing the point of this thread. This thread has nothing to do with Core checking disk space but rather whether a specific feature is implemented in Core. Anyways, Core does tell you if you are running out of space to store the blockchain. However it cannot tell you whether you have enough disk space because that would imply it knows the actual size of the blockchain, and the only way to do that is by downloading the whole thing.

Oh ok, thank you. Maybe Satoshi will return in a few years :)
legendary
Activity: 2618
Merit: 1252
... This however does require still downloading all 110+ GB ...

With a UTXO commitment in the block header it would not be necessary to always download the complete blockchain to bootstrap a new node.
staff
Activity: 3374
Merit: 6530
Just writing some code
I think it is very interesting that Satoshi has some code in there that is not active.
No. That is not at all what this thread is about. Satoshi came up with an idea in the whitepaper, but his idea specifically was never implemented.

So to sum this up: the disk space is currently not checked in the Core GitHub source? Or did I understand it wrong?
Again, you are completely missing the point of this thread. This thread has nothing to do with Core checking disk space but rather whether a specific feature is implemented in Core. Anyways, Core does tell you if you are running out of space to store the blockchain. However it cannot tell you whether you have enough disk space because that would imply it knows the actual size of the blockchain, and the only way to do that is by downloading the whole thing.
newbie
Activity: 41
Merit: 0
I think it is very interesting that Satoshi has some code in there that is not active.
So to sum this up: the disk space is currently not checked in the Core GitHub source? Or did I understand it wrong?
legendary
Activity: 4018
Merit: 1299
Sorry to hijack your thread OP, but I want to add a question here. In my limited knowledge of pruning I suspect that if EVERYONE runs a pruned version of the software, then those old tx's are lost forever? I have a basic understanding of pruning, and I have not taken the time to brush up on the research... so I am asking this out of pure laziness to do the research myself.

The short answer is yes, the old transactions would be lost.

The longer answer (which I'm sure you figured out) is that all backups would also have to be lost for the transactions to be lost forever. Likewise, without software modifications, new nodes wouldn't be able to start up if that occurred, and in all likelihood bitcoin would collapse. That is how the software works now; future changes could be made to mitigate some (maybe all) of these impacts.
legendary
Activity: 1904
Merit: 1073
Sorry to hijack your thread OP, but I want to add a question here. In my limited knowledge of pruning I suspect that if EVERYONE runs a pruned version of the software, then those old tx's are lost forever? I have a basic understanding of pruning, and I have not taken the time to brush up on the research... so I am asking this out of pure laziness to do the research myself.