Re: What are checkpoints in bitcoin code?

DumbFruit

Quote from: DannyHamilton on November 07, 2014, 12:10:19 PM

Quote from: DumbFruit on November 07, 2014, 12:00:48 PM

When block 100,000 is mined, drop the oldest 50,000 blocks. The remaining oldest block is the new "genesis block". Any remaining unspent outputs on the dropped blocks are given to the miner.

Bam, done. Grin

This is a horrible idea.

Thankfully it can't be implemented without consensus from all of the users of bitcoin, and I'm confident that you'll never be able to get consensus on such a poor proposal.

Yep, it's a pretty bad idea for Bitcoin.

Do you mean that users would never reach consensus to make that change? I totally agree with that. Or do you mean that the blockchain couldn't reach consensus with a sliding history like this? I don't agree with that (for the most part).

DannyHamilton

Quote from: DumbFruit on November 07, 2014, 12:00:48 PM

When block 100,000 is mined, drop the oldest 50,000 blocks. The remaining oldest block is the new "genesis block". Any remaining unspent outputs on the dropped blocks are given to the miner.

Bam, done. Grin

This is a horrible idea.

Thankfully it can't be implemented without consensus from all of the users of bitcoin, and I'm confident that you'll never be able to get consensus on such a poor proposal.

DumbFruit

Quote from: work2heat on November 01, 2014, 10:10:26 PM

Essentially what I'm trying to figure out is a mechanism for blockchain compression so that we can drop very old txs with minimal to no loss in security.

Hardcore "compression": implement a max blockchain height of 100,000. When block 100,000 is mined, drop the oldest 50,000 blocks. The remaining oldest block is the new "genesis block". Any remaining unspent outputs on the dropped blocks are given to the miner.

Bam, done. Grin

TierNolan

Quote from: work2heat on November 01, 2014, 09:23:21 PM

This is where I'm losing you. Yes there may be a checkpoint, but highest total difficulty still wins out. If the highest difficulty chain conflicts with a check-pointed one, surely the client should go with the higher difficulty one, as you say

Even with headers first, checkpoints still mean that old txs don't need to be validated (assuming you trust the reference client programmers).

You could leave the checkpoints in, and just say that all txs before a checkpoint are automatically valid.

If there was a conflict, I think going into some kind of emergency mode is better than saying nothing.

Better is detecting large forks. If there is a fork that is 1000 blocks long within the last 2000 blocks, then flag a warning and tell users that their balances could be wrong.
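The large-fork warning described above can be sketched roughly as follows. This is only an illustration, assuming the node already tracks each competing branch by its fork height and length; `Branch`, `large_fork_warning` and the parameter names are hypothetical, and the defaults of 1000 and 2000 are simply the figures from the post.

```python
from dataclasses import dataclass

@dataclass
class Branch:
    fork_height: int  # height of the last block shared with the best chain
    length: int       # number of blocks on the competing branch after the fork

def large_fork_warning(tip_height, branches, min_length=1000, window=2000):
    """Return the competing branches that are both long and recent."""
    return [
        b for b in branches
        if b.length >= min_length and tip_height - b.fork_height <= window
    ]
```

A non-empty result would be the cue to tell users that their balances could be wrong.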

Quote from: work2heat on November 01, 2014, 10:10:26 PM

Essentially what I'm trying to figure out is a mechanism for blockchain compression so that we can drop very old txs with minimal to no loss in security.

Pruning is not a big deal really. As long as at least 1 node keeps everything, then the network can recover from forks.

If 10,000 nodes each hold 1% of the data, then you are highly likely to have everything.

It is likely that at least a few nodes will be "archive" nodes that will store everything.
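As a rough check on the "10,000 nodes each hold 1%" estimate, here is a back-of-the-envelope calculation. It assumes (my assumption, not stated in the post) that the chain is split into 100 equal chunks and that every node independently stores one chunk chosen uniformly at random; the function name is hypothetical.

```python
import math

def prob_some_chunk_missing(nodes=10_000, chunks=100):
    """Union bound on the chance that at least one chunk has no holder."""
    # log of the probability that one particular chunk is stored by nobody
    log_p_single = nodes * math.log1p(-1.0 / chunks)
    # the union bound over all chunks can exceed 1, so cap it
    return min(1.0, chunks * math.exp(log_p_single))

print(prob_some_chunk_missing())
```

Under those assumptions the bound is on the order of 1e-42, which is consistent with "highly likely to have everything". Real nodes are not independent random samples, so treat this as an optimistic figure.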

work2heat

Quote

What you're describing is not a checkpoint then. A checkpoint forces the identity of the selected chain, regardless of whether it has the most work. So that would be the point of departure.

Indeed I see where the misunderstanding has been then. Perhaps check-point was the wrong term but it sure has a "check-point-like" feel to it. Glad we are more on the same page now.

Essentially what I'm trying to figure out is a mechanism for blockchain compression so that we can drop very old txs with minimal to no loss in security. Perhaps what I have proposed is not sufficient, I merely thought of it yesterday and thought we could explore something like it here. You're right, the complexity of such a protocol may simply not be worth it. But consider a new node in 50 years having to go back to genesis and start validating all those txs. Poor soul.

In a certain sense, it boils down to resetting the genesis block to something more recent (of course including the utxo hash) in a manner compatible with the network's consensus. Do you suppose there is any secure way to do this? Would it even be worth it?

Quote

and no, newly generated coins cannot be spent for 100 blocks

Right. I forgot about this.

gmaxwell

Quote from: work2heat on November 01, 2014, 09:23:21 PM

This is where I'm losing you. Yes there may be a checkpoint, but highest total difficulty still wins out. If the highest difficulty chain conflicts with a check-pointed one, surely the client should go with the higher difficulty one, as you say

Quote

chain will simply be unwound and replaced, so giving it that extra data is harmless; once the honest network is observed, it's like the node never saw the forgery

. The checkpoint is merely a mechanism to avoid verifying very old txs. But if I see a competing chain with higher difficulty, I ought to go with that one, whether it has a checkpoint or not.

What you're describing is not a checkpoint then. A checkpoint forces the identity of the selected chain, regardless of whether it has the most work. So that would be the point of departure.

What is the point of the rest of the complexity in what you're discussing then? The "propose", "earmark", millibits, etc.? (And no, newly generated coins cannot be spent for 100 blocks.) The blockchain itself is already the measurement of its history. If you're willing to trust the data in the blockchain, you just can; no extra information is required. (Though if you're willing to adopt that reduced security model, why are you not going all the way and using SPV (see section 8 of bitcoin.pdf)? Presumably you're aware that you still need to transfer the data and process it to be able to verify further blocks, right?)

[If you're talking about small numbers of blocks like that, what you're proposing would be a considerable reduction in the security model, since the reward for a miner to hop ahead of the network would basically be unbounded, so that the incentives arguments on behaviour are much weaker. Weaker has its place, but that's what SPV already accomplishes.]

work2heat

Quote

the tone you've taken here is irritating and is likely to cause experienced people to ignore your messages in the future if you continue with it.

My apologies. Did not intend to irritate. Just trying to understand this problem better. And many thanks to you for the time you're taking to go through it here - very much appreciated.

I hope you don't mind if I continue:

Quote

once the honest network is observed, it's like the node never saw the forgery. When you start talking about "check-pointing" based on that chain the situation changes and you get the attack

This is where I'm losing you. Yes there may be a checkpoint, but highest total difficulty still wins out. If the highest difficulty chain conflicts with a check-pointed one, surely the client should go with the higher difficulty one, as you say

Quote

chain will simply be unwound and replaced, so giving it that extra data is harmless; once the honest network is observed, it's like the node never saw the forgery

. The checkpoint is merely a mechanism to avoid verifying very old txs. But if I see a competing chain with higher difficulty, I ought to go with that one, whether it has a checkpoint or not.

Quote

I can mine 80 blocks in a row at height 100,000 trivially in a few seconds myself

Granted. But if you go back and do that, the chain you create will not have the difficulty of the canonical chain. So even if I see yours first, again, so long as I eventually see the real chain I will ignore yours. This check-pointing mechanism would have to start from the current head if it wants to stay valid. We could checkpoint block 100,000 by submitting today a tx with the hash of that block. If you try to actually create that checkpoint further back by forking around block 100,100, say, you will not be able to create a chain on par with the current difficulty. So despite your checkpoint, I will still ignore you, even if it means I have to hop on a chain that starts from Satoshi's genesis and has no checkpoints.

Is this not correct?

gmaxwell

Quote from: work2heat on November 01, 2014, 06:40:40 PM

pow is a form of validity.

It isn't. See the discussion about fraud proofs and probabilistic testing. Please also note that the tone you've taken here is irritating and is likely to cause experienced people to ignore your messages in the future if you continue with it.

Quote

Great, now I create a simulated history that sets a bogus 'checkpoint' back early in the chain, but any _new_ nodes that attach to me I give this simulated history to before they know there is a better chain elsewhere and they start enforcing that rule and they are now forked off onto this bogus alternative chain;

this argument applies to any blockchain. If I can get the node to think the chain I give it is the right one before it even sees any other, I win. But here, there is still a PoW element, so as soon as the node sees a chain with higher total diff it will know the one I sent was bogus.

That's not correct. If you give a bitcoin node a chain and then someone else gives it a mutually exclusive longer one, your chain will simply be unwound and replaced, so giving it that extra data is harmless; once the honest network is observed, it's like the node never saw the forgery. When you start talking about "check-pointing" based on that chain the situation changes and you get the attack (and not just that attack; there are several others, e.g. announcing two competing equally valid forks concurrently and leaving the network in a never-resolving perpetual consensus split).
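The "unwound and replaced" behaviour can be illustrated with a toy model. This is a minimal sketch, assuming each candidate chain is just a list of (block id, work) pairs and that "longer" is measured in total work; `best_chain` is a hypothetical helper, and real nodes also validate every block before switching.

```python
def best_chain(chains):
    """Pick the candidate with the most cumulative work.

    Each chain is a list of (block_id, work) pairs.
    Ties keep whichever chain was seen first.
    """
    best, best_work = None, -1
    for chain in chains:
        total = sum(work for _, work in chain)
        if total > best_work:
            best, best_work = chain, total
    return best

forged = [("g", 1), ("f1", 1), ("f2", 1)]
honest = [("g", 1), ("h1", 4), ("h2", 4)]
# The order of arrival does not matter: the forged chain is displaced either way.
assert best_chain([forged, honest]) is honest
assert best_chain([honest, forged]) is honest
```

The point of the sketch is that nothing learned from the forged chain sticks. A rule that pins state based on the first chain seen would break that property.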

Quote

So you can start your fork wherever you want, but so long as I haven't been partitioned off the internet completely, this isn't a problem (and if I have been, it's a problem for bitcoin proper too).

Every node begins its life partitioned; our security model also allows temporary partitioning. Assuming you will never be partitioned requires resolving the Sybil problem in a strong sense at a minimum and isn't generally compatible with the reality of computer networks today (they're just sometimes partitioned).

Quote

The result is that you give miners a new power: instead of just being able to reorder the history, they could also create arbitrary inflation just by adding new utxo to their updates (which, of course, would be in all of their short-term interests to do).

They can already do this by arbitrarily augmenting the coinbase reward. But they don't, because they know other nodes will drop the block and their efforts will go to waste.

Uh. You're pointing out precisely why they cannot. An invalid mined block is invalid, no less than if it didn't meet the target.
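To make the "invalid is invalid" point concrete, here is a minimal sketch of the coinbase limit. It assumes the standard schedule of 50 BTC halving every 210,000 blocks; the function names are illustrative and this is not Bitcoin Core's actual code.

```python
COIN = 100_000_000          # satoshis per bitcoin
HALVING_INTERVAL = 210_000  # blocks between subsidy halvings

def block_subsidy(height):
    """Subsidy in satoshis: 50 BTC, halved every 210,000 blocks."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        return 0
    return (50 * COIN) >> halvings

def coinbase_is_valid(height, coinbase_out, total_fees):
    """A coinbase may claim at most the subsidy plus the block's fees."""
    return coinbase_out <= block_subsidy(height) + total_fees
```

A block that fails this check is rejected by every validating node no matter how much work went into it, which is why an inflated coinbase is not an option miners actually have.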

Quote

Similarly here. My proposal involved X of Y consecutive blocks to include the same checkpoint for it to be valid. Set that to 70 and 80 say. So for a checkpoint to be valid, 70 of 80 blocks in a row must include it. It is very unlikely a single entity will control all that.

I can mine 80 blocks in a row at height 100,000 trivially in a few seconds myself (there are many thousand in a row mined by me in testnet, for example). I could also mine 80 blocks in a row forking from the current heights at considerable cost given a few months. A bitcoin node doesn't have an absolute synchronous ordered view of the world, it only learns what it learns from its peers... and it can't tell if a simulated blockchain took 6 months of computation or one day, it can't tell if it was created recently or long in the past, etc. If we could tell these things we wouldn't need a blockchain.
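For reference, the only thing a header tells a node about effort is its target. A rough sketch of how work is derived from the compact target follows; the expansion skips the sign-bit and overflow checks a real implementation needs, and the function names are illustrative.

```python
def target_from_compact(bits):
    """Expand the 32-bit compact target (nBits) into an integer."""
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))

def block_work(bits):
    """Expected number of hashes needed to meet the target."""
    return (1 << 256) // (target_from_compact(bits) + 1)
```

With the minimum-difficulty target `0x1d00ffff` this gives `0x100010001` units of work per block. Nothing in the number says when the hashing happened or how long it took, which is the limitation described above.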

work2heat

Quote

since with headers first it knows the amount of work on top of them and can perform the tests only probabilistically past a certain point.

Indeed, so contrary to andytoshi's assertion, pow is a form of validity. If you haven't verified every single sig yourself, can you really be called a full node?

Quote

Great, now I create a simulated history that sets a bogus 'checkpoint' back early in the chain, but any _new_ nodes that attach to me I give this simulated history to before they know there is a better chain elsewhere and they start enforcing that rule and they are now forked off onto this bogus alternative chain;

this argument applies to any blockchain. If I can get the node to think the chain I give it is the right one before it even sees any other, I win. But here, there is still a PoW element, so as soon as the node sees a chain with higher total diff it will know the one I sent was bogus.

Quote

Worse, because the forking off can be arbitrarily far back it becomes exponentially cheaper to do so long as hash-power is becoming exponentially cheaper.

The mechanism I proposed requires a tx that is much more recent than the block it is actually checkpointing. And there is still the normal difficulty calculation. The canonical chain as it stands and the canonical chain with a checkpoint back at block 10,000 will have heads with identical difficulty. So you can start your fork wherever you want, but so long as I haven't been partitioned off the internet completely, this isn't a problem (and if I have been, it's a problem for bitcoin proper too).

Quote

The result is that you give miners a new power: instead of just being able to reorder the history, they could also create arbitrary inflation just by adding new utxo to their updates (which, of course, would be in all of their short-term interests to do).

They can already do this by arbitrarily augmenting the coinbase reward. But they don't, because they know other nodes will drop the block and their efforts will go to waste. Similarly here. My proposal involved X of Y consecutive blocks to include the same checkpoint for it to be valid. Set that to 70 and 80 say. So for a checkpoint to be valid, 70 of 80 blocks in a row must include it. It is very unlikely a single entity will control all that. If they can, bitcoin is already screwed. Since they can't, they have the same incentive to be honest about the utxo set at a checkpoint as they do about following the coinbase reward schedule.

The honest proposal is the Schelling point. We can easily increase the X/Y ratio to be more secure. If one pool is mining 100 blocks in a row, we have much bigger problems on our hands...
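The X-of-Y rule in this post could be checked with a sliding window along these lines. This is a sketch of the proposal as worded here, assuming each block carries at most one checkpoint vote; `checkpoint_accepted` and its parameters are hypothetical. It only counts votes and says nothing about who mined the blocks or when.

```python
def checkpoint_accepted(votes, candidate, x=70, y=80):
    """True if some run of y consecutive blocks holds at least x votes.

    votes[i] is the checkpoint hash committed in block i, or None.
    """
    hits = [1 if v == candidate else 0 for v in votes]
    if len(hits) < y:
        return False
    window = sum(hits[:y])  # votes in the first y blocks
    if window >= x:
        return True
    for i in range(y, len(hits)):
        # slide the window one block forward
        window += hits[i] - hits[i - y]
        if window >= x:
            return True
    return False
```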

gmaxwell

Quote from: TierNolan on November 01, 2014, 05:22:40 PM

special about 60 days of work?

It's an arbitrary big number, hopefully macroscopic relative to whatever external process you'd hope would rescue the system if it failed, but still small enough to give a good speedup on the checking.

I only used a specific number because that's an example of the kind of numbers that were bandied about before... even given pretty healthy hashrate growth, far in excess of computer industry historic trends, numbers in that space should still be adequate to keep downloads bandwidth-bound instead of CPU-bound. (Also keep in mind that libsecp256k1 is >6x faster than OpenSSL.)

Quote

That number increasing means that the hash rate doubling time is increasing?

Yes, for many months: http://bitcoin.sipa.be/growth.png

Quote

This is a good idea anyway. A 60 block re-org would indicate that something major has happened.

Yep.

Quote

Fraud proofs would presumably be new messages, so they would be sent to an external process? Isn't that more complex?

Complexity by itself isn't really the engineering constraint (passing a message is no big deal, in any case); the concern would be complexity coupled with risk. If there is something complex but it's strongly isolated, you can reason about what kind of things can happen if it goes wrong. E.g. fraud proofs can initially run over a separate protocol, communicate to an isolated sandboxed process, and even if they've massively gone wrong the threat is likely limited to a nuisance of making all your transactions show as unconfirmed, rather than splitting the consensus or stealing keys or what have you. See Matt's relaynode network client (https://github.com/TheBlueMatt/RelayNode) for an example of how to develop and try out protocol features (in that case, relaying blocks more efficiently by taking advantage of transaction pre-forwarding) with reduced risk to the production Bitcoin system.

TierNolan

Quote from: gmaxwell on November 01, 2014, 04:47:55 PM

Pieter and I-- at least-- have been talking about it for a while as part of the motivation for headers first, even prior to the fraud proofs: past some limit (e.g. max(60 days of work at the highest difficulty ever observed in the best chain, 2016 blocks)) a huge reorganization at tip is a guaranteed system failure.

What is special about 60 days of work?

Quote

For a while this wasn't as interesting because the total time to replace the chain with the tip's hashrate was very low, but it's finally expanding nicely: http://bitcoin.sipa.be/powdays-50k.png

Right, you can replace the entire blockchain with two doubling times worth of hashrate.

That number increasing means that the hash rate doubling time is increasing?

Quote

Another potential softer safety mechanism is simply holding the confirmation count in the RPC at zero when there has been a 'system failure grade' reorganization to allow time for a "higher process" to sort out the mess. This also needs a couple other pieces to be completely useful, like the ability to manually invalidate a block over the RPC.

This is a good idea anyway. A 60 block re-org would indicate that something major has happened.

If headers were broadcast for invalid forks, then clients on both sides of the chain could detect the condition.

Quote

Obviously fraud proofs have been on my mind for a long time, but they have a lot of protocol surface area. The recent reorganizations of Bitcoin core (e.g. libscript work) will make it easier to work on them with confidence, e.g. be able to build an external process that receives and checks fraud proofs.

Fraud proofs would presumably be new messages, so they would be sent to an external process? Isn't that more complex?

gmaxwell

Quote from: TierNolan on November 01, 2014, 04:14:31 PM

Quote from: gmaxwell on November 01, 2014, 02:53:15 PM

No it doesn't. It can skip verifying very deeply buried signatures (as is the checkpoint behaviour), since with headers first it knows the amount of work on top of them and can perform the tests only probabilistically past a certain point.

Is this planned for the reference client (along with fraud proofs, presumably)?

Pieter and I-- at least-- have been talking about it for a while as part of the motivation for headers first, even prior to the fraud proofs: past some limit (e.g. max(60 days of work at the highest difficulty ever observed in the best chain, 2016 blocks)) a huge reorganization at tip is a guaranteed system failure. (Once you cross 100 blocks, a reorg starts forever invalidating an exponentially expanding cone of transactions.) For a while this wasn't as interesting because the total time to replace the chain with the tip's hashrate was very low, but it's finally expanding nicely: http://bitcoin.sipa.be/powdays-50k.png

Another potential softer safety mechanism is simply holding the confirmation count in the RPC at zero when there has been a 'system failure grade' reorganization to allow time for a "higher process" to sort out the mess. This also needs a couple other pieces to be completely useful, like the ability to manually invalidate a block over the RPC.
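One way to read the two mechanisms above together is sketched below. It is an interpretation only: it treats the limit as an amount of work, takes "2016 blocks" to mean the work in the most recent 2016 blocks, and assumes the 10-minute block target. Every name here is hypothetical and none of it describes an existing RPC.

```python
BLOCKS_PER_DAY = 144  # assumes the 10-minute block target

def failure_threshold_work(max_block_work, recent_2016_work, days=60):
    """max(60 days of work at peak difficulty, work of 2016 blocks)."""
    return max(days * BLOCKS_PER_DAY * max_block_work, recent_2016_work)

def reported_confirmations(depth, reorged_work, threshold, operator_cleared=False):
    """Hold confirmations at zero after a failure-grade reorg.

    The hold lasts until an operator (the "higher process") clears it.
    """
    if reorged_work >= threshold and not operator_cleared:
        return 0
    return depth
```

The operator flag stands in for the manual intervention mentioned in the post, such as invalidating a block by hand.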

First priority was getting headers first in, tested, and mature.

Obviously fraud proofs have been on my mind for a long time, but they have a lot of protocol surface area. The recent reorganizations of Bitcoin core (e.g. libscript work) will make it easier to work on them with confidence, e.g. be able to build an external process that receives and checks fraud proofs.

TierNolan

Quote from: gmaxwell on November 01, 2014, 02:53:15 PM

No it doesn't. It can skip verifying very deeply buried signatures (as is the checkpoint behaviour), since with headers first it knows the amount of work on top of them and can perform the tests only probabilistically past a certain point.

Is this planned for the reference client (along with fraud proofs, presumably)?

gmaxwell

Quote from: work2heat on November 01, 2014, 01:46:05 PM

Good point. Parallel downloading is awesome. But the CPU still has to crunch ALL those EC verifies. ::sigh::

No it doesn't. It can skip verifying very deeply buried signatures (as is the checkpoint behaviour), since with headers first it knows the amount of work on top of them and can perform the tests only probabilistically past a certain point.
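The post does not spell out a sampling scheme, so the following is only one hypothetical way "probabilistically past a certain point" might look. The threshold `full_check_work`, the `sample_rate` and the function name are all assumptions made for illustration.

```python
import random

def should_verify_signatures(work_on_top, full_check_work, sample_rate=0.01, rng=random):
    """Always verify shallow blocks; spot-check deeply buried ones."""
    if work_on_top < full_check_work:
        return True  # not buried deeply enough to skip anything
    # buried under enough work: verify only a random sample
    return rng.random() < sample_rate
```

Passing a seeded `random.Random` as `rng` makes the sampling reproducible for testing.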

gmaxwell

Quote from: work2heat on November 01, 2014, 02:32:46 PM

I still have not heard a reasonable argument as to why this can't/won't work.

Because it doesn't make any sense. Let's say you program nodes to enforce some criteria buried in a chain they're handed. Great, now I create a simulated history that sets a bogus 'checkpoint' back early in the chain, but any _new_ nodes that attach to me I give this simulated history to before they know there is a better chain elsewhere and they start enforcing that rule and they are now forked off onto this bogus alternative chain; so you've introduced a vulnerability. Worse, because the forking off can be arbitrarily far back it becomes exponentially cheaper to do so long as hash-power is becoming exponentially cheaper.

This is even _before_ getting to the argument that what you're suggesting weakens the security model even if it works fine: The result is that you give miners a new power: instead of just being able to reorder the history, they could also create arbitrary inflation just by adding new utxo to their updates (which, of course, would be in all of their short-term interests to do).

work2heat

Quote

PoW does not imply validity.

The point is, everyone trusts the genesis block, and all updates from there are made via pow. This is essentially a proposal for a consensus process to update the genesis block forward, and to attach a tree of utxos so that the txs between the new gen block and the old never need to be seen again (we can put them in a museum of bitcoin history, if you like, but spare the new full nodes!). I still have not heard a reasonable argument as to why this can't/won't work.

andytoshi

Quote from: work2heat on November 01, 2014, 01:46:05 PM

Exactly the point. Nodes already trust that blocks are valid because they have PoW on them.

PoW does not imply validity.

work2heat

Quote

Time-related words like "old", "new" and "long after" only make sense if you have an existing blockchain with which to tell time. So the circularity is still there.

You do have an existing blockchain. The bitcoin one, up to now. And you can tell time in number of blocks. The genesis block was the first checkpoint. We could have the hashing power vote to checkpoint block 10,000, including the patricia tree hash of the utxo set up to that point. Then anything from before the checkpoint can be ignored, since the checkpoint can be considered part of the PoW consensus mechanism - if you trust the PoW generally to make ledger updates, then (conceivably) you can trust it to checkpoint.
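To show what "a hash of the utxo set" would commit to, here is a deliberately simplified sketch. It uses a flat SHA-256 over the sorted entries as a stand-in for the Patricia tree mentioned above, so it supports none of the inclusion proofs a tree would. The serialization format and the function name are made up for illustration.

```python
import hashlib

def utxo_commitment(utxos):
    """Hash a UTXO set into one digest, independent of insertion order.

    utxos maps (txid_hex, output_index) -> (value_satoshis, script_hex).
    """
    h = hashlib.sha256()
    for (txid, index), (value, script) in sorted(utxos.items()):
        h.update(f"{txid}:{index}:{value}:{script}\n".encode())
    return h.hexdigest()
```

Two nodes with the same set get the same digest, and any change to an entry changes it. The digest says nothing about whether the set was derived by valid transactions.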

Quote

because any transaction data that you "compress out" is transaction data that can't be validated by new nodes.

Exactly the point. Nodes already trust that blocks are valid because they have PoW on them. The checkpoint will have PoW too, and hence be trusted in the same way, relieving the new client from having to validate anything before the checkpoint. That's the point, it's like a new genesis block, plus a utxo set.

Quote

From what I understand, headers first doesn't affect the new full node sync time at all. Please correct me if I'm wrong

It does, for two reasons:
- By downloading the headers first, you can quickly (in low bandwidth) eliminate stales, orphans and bad chains.
- Once you have the headers, you can download full blocks out of order from multiple peers (currently blocks are downloaded sequentially from a single peer, which if you get a bad one, can be extremely slow).

You're right that the time taken to validate the correct chain is unaffected.

Good point. Parallel downloading is awesome. But the CPU still has to crunch ALL those EC verifies. ::sigh::

andytoshi

Quote

Old checkpoints distort the selection of the chain, but there's no reason new checkpoints can't be done with network consensus on chain long after a previous checkpoint (thus it's more bootstrapping than circular).

Time-related words like "old", "new" and "long after" only make sense if you have an existing blockchain with which to tell time. So the circularity is still there.

Quote

It's potentially a powerful new way to compress the history and bring new nodes up to speed fast, especially since you can include a utxo patricia tree hash in there too.

Unfortunately no, because any transaction data that you "compress out" is transaction data that can't be validated by new nodes. You can get serious compression this way for SPV-security nodes (Appendix B of the sidechains whitepaper talks about a similar idea), but not for full-security ones.

Quote

From what I understand, headers first doesn't affect the new full node sync time at all. Please correct me if I'm wrong

It does, for two reasons:
- By downloading the headers first, you can quickly (in low bandwidth) eliminate stales, orphans and bad chains.
- Once you have the headers, you can download full blocks out of order from multiple peers (currently blocks are downloaded sequentially from a single peer, which if you get a bad one, can be extremely slow).

You're right that the time taken to validate the correct chain is unaffected.
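The two benefits listed above can be sketched as a download plan. This is a toy model, assuming the header chain has already been fetched and chosen; the round-robin assignment and both function names are illustrative, not how any particular client schedules requests.

```python
from collections import defaultdict

def assign_block_requests(header_hashes, peers):
    """Spread block downloads across peers round-robin."""
    if not peers:
        raise ValueError("need at least one peer")
    plan = defaultdict(list)
    for i, block_hash in enumerate(header_hashes):
        plan[peers[i % len(peers)]].append(block_hash)
    return dict(plan)

def connect_in_order(header_hashes, received):
    """Blocks may arrive out of order but are connected in header order."""
    connected = []
    for block_hash in header_hashes:
        if block_hash not in received:
            break  # wait for the missing block before connecting later ones
        connected.append(block_hash)
    return connected
```

Validation still walks the chain in order, which matches the closing remark that validation time itself is unaffected.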

work2heat

Quote from: gmaxwell on November 01, 2014, 03:28:52 AM

Far better to just get rid of them: Headers first makes most reasons obsolete. The circus above doesn't really help, since it's using the chain itself, which of course checkpoints distort the selection of, so it's just circular.

Old checkpoints distort the selection of the chain, but there's no reason new checkpoints can't be done with network consensus on chain long after a previous checkpoint (thus it's more bootstrapping than circular). It's potentially a powerful new way to compress the history and bring new nodes up to speed fast, especially since you can include a utxo patricia tree hash in there too.

From what I understand, headers first doesn't affect the new full node sync time at all. Please correct me if I'm wrong
