Topic: How do nodes validate blocks, then update their own versions of the blockchain? (Read 311 times)

staff
Activity: 4326
Merit: 8951
  • validates all the transactions contained in the block

If you're getting into implementation details,

It might also be worth mentioning that most of the validation is typically done *before* the block comes in. In Bitcoin a great effort has been made to design things so that most of the validation is a pure function of the transaction-- for almost all rules a given transaction is either valid or invalid depending only on its own content and nothing else.  The exceptions to this are checking that the inputs exist and aren't already spent, checking that the transactions' nLockTime values are valid according to the block time, and checking that various block-wide global limits are obeyed (primarily that the block weight isn't exceeded).

This means that in the common case of transactions that were received ahead of the block most of their validation (amounts, scripts, signatures, etc) is cached and doesn't need to be done again when the block shows up, greatly reducing latency.

I think it's important to be aware of this because I've seen a number of posts over the years that seem to think that every time a new block is accepted everything in it must be laboriously reprocessed or even that the whole blockchain must be reprocessed. This isn't true, and it gives the wrong impression about where bottlenecks are on average.

This is made more complicated in Bitcoin because since 2010 validity in relay uses somewhat different, more strict, rules than validity in blocks to improve forward-compatibility.

The idea is that consensus rules can be improved in a way which is completely compatible with old software by strictly reducing the set of things that are permitted and by purposefully setting aside extension space in the protocol to eventually get restricted (such as NOP op-codes). Old nodes then know something 'future' is going on and thus won't relay or mine it, which would risk including something invalid, but are happy to accept whatever new nodes mine using the extensions.  Along with that, the consensus rules for a given block might be more strict than the rules for the block before it, e.g. when a new rule activates.

So the caching has to handle the rules changing, and it does that by augmenting the key for the cache with the set of rules, so the cache can only have a hit when the right rules are being used. Also, when a node receives a transaction it validates it under both the stricter relay rules and the looser block rules. (Why both, when the relay rules are stricter? Because otherwise a bug in the relay rules could result in accepting an invalid block, which would make it much slower to develop code for relay rules.)
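To make the cache-key idea concrete, here is a minimal sketch in Python. It is not how Bitcoin Core implements it; the flag names, the `ScriptCache` class and `accept_to_mempool` are all made up for illustration, and `run_scripts` stands in for the expensive script and signature checks.

```python
import hashlib

# Hypothetical rule flags, invented for this sketch.
FLAG_P2SH = 1 << 0
FLAG_WITNESS = 1 << 1
FLAG_STRICT_RELAY = 1 << 2  # stand-in for the extra relay-only rules


class ScriptCache:
    """Remembers (transaction, rule set) pairs that already passed validation."""

    def __init__(self):
        self._valid = set()
        self.script_runs = 0  # counts how often the expensive check really ran

    def _key(self, raw_tx: bytes, flags: int) -> bytes:
        # The rule set is part of the key, so a result obtained under one
        # set of rules can never answer a query made under another.
        return hashlib.sha256(raw_tx + flags.to_bytes(4, "little")).digest()

    def check(self, raw_tx: bytes, flags: int, run_scripts) -> bool:
        key = self._key(raw_tx, flags)
        if key in self._valid:
            return True  # cache hit: nothing to recompute
        self.script_runs += 1
        ok = run_scripts(raw_tx, flags)
        if ok:
            self._valid.add(key)  # only successes are remembered
        return ok


def accept_to_mempool(cache, raw_tx, relay_flags, block_flags, run_scripts):
    # Validate under the stricter relay rules first, then again under the
    # block rules so that the block-rule result is cached for later.
    if not cache.check(raw_tx, relay_flags, run_scripts):
        return False
    return cache.check(raw_tx, block_flags, run_scripts)
```

When a block containing the transaction shows up later, calling `cache.check(raw_tx, block_flags, ...)` is a hit, while a query under a different set of flags is a miss and gets revalidated.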

[Aside, I removed a thread-derailing repetitive post by franky; and I'd strongly recommend everyone follow my lead and stick him on ignore to avoid wasting time and energy on his aggressive misinformation and disruption]

5. If a node receives a valid block which links cryptographically to a stale block, and this happens to create a chain (B) with a higher total difficulty, but which is still shorter than the active chain (A) in terms of number of blocks, then chain B will become the active chain, and all of the blocks in chain A after it splits (although not discarded) will now become stale.

Yes, in Bitcoin it is technically possible for the height to go down!  But it's not likely something we'll ever observe on mainnet, at least not near the tip of a synced node.  The reason is that the difficulty of blocks only changes (retargets) once every 2016 blocks and reflects the time that the last 2015 blocks took. So for fewer blocks to have more work there has to be a fork around a retarget, the difficulty on the two forks has to be different enough that a branch with N blocks can have more work than a branch with N+1 blocks, and the fork has to be at least N blocks long.  Commonly forks are only 1 block long, and for a branch with one block to have more work than a branch with two, the difficulty will have had to more than double in that one fork.  It would be really hard to get such a large difficulty difference because the two blocks will share most of the same history going into their difficulty calculation.  Having it happen with a longer fork would require less of a difficulty difference, but longer forks are themselves more rare.
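As a rough illustration of the arithmetic, the sketch below counts the work of a block as the expected number of hashes needed to meet its target, 2^256 / (target + 1). The targets are made-up round numbers, not real mainnet values.

```python
def block_work(target: int) -> int:
    """Expected number of hash attempts needed to find a block at this target."""
    return 2**256 // (target + 1)


def chain_work(targets) -> int:
    """Total work of a branch, given the target of each of its blocks."""
    return sum(block_work(t) for t in targets)


# Hypothetical targets chosen only to keep the numbers simple.
easy = 2**224
hard = easy // 3  # a third of the target, so roughly three times the difficulty

branch_a = [easy, easy]  # two blocks at the lower difficulty
branch_b = [hard]        # one block at the higher difficulty

print(len(branch_a), chain_work(branch_a))
print(len(branch_b), chain_work(branch_b))
```

Here the one-block branch carries more work than the two-block branch, so a node following most-work would pick it and its height would drop by one.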

So the only time with a mainnet node that I would expect to see height go down is during initial sync, if you connected initially to a malicious clown who intentionally forked the blockchain early on and purposefully kept the difficulty down and churned out a lot of blocks at low difficulty, then height might go down temporarily as you switch from his joke chain to the real chain. Smiley

Various altcoins have various psycho over-eager difficulty update rules where it's very easy for height to go down, either intentionally or by accident. In theory that's okay, but I bet a lot of software won't handle it well.

Fun fact: the Bitcoin whitepaper and the software up until IIRC early 2010 made the best valid chain decision on total blocks and ignored total work. Unfortunately, as I mentioned, it's quite possible for a fork that starts really early in the chain to win by number of blocks but not work-- even though it's something you'll never see in normal operation, so Satoshi's original way of handling it was vulnerable and easily attacked.
legendary
Activity: 2898
Merit: 1823
achow, I believe this thread should be stickied, for all newbies to read before the trolls get them with the wrong information first.

Like the first half of the original post?

That all non-mining full nodes validate? Point out the mistakes and let's discuss. It's always good to learn from one another, and learn which people are serious and which are the trolls.


oh windy. [facepalm]
if anything should be learned is now that new opcodes and transaction formats  can be slid in without hardforks as they please. it should actually be very much highlighted that not all nodes on the network are the newest up-to-date full archiving nodes able to understand all the new txformats.


You mean adding Schnorr and Taproot through soft forks?

Quote

this means
a. not all nodes actually fully validate. take people running nodes that are not segwit compatible: they do not validate segwit signatures/scripts (now called witnesses). instead they blindly see a tx format they do not understand and presume it's safe, as they received the block containing the tx from a peer so the peer must have validated it (trust, not full validation). and the same goes for even newer differing tx formats. again, for importance: without checking that the spending addresses' scripts/signatures validate.


Yes, I believe we already established that fact in this topic, https://bitcointalksearch.org/topic/are-non-segwit-nodes-full-nodes-5067738

Quote

b. also apart from the criteria of validation, to be a TRUE "full node" you need to be archival, thus able to seed data for your peers to get. this is also not true of all peers, as some run in pruned mode, etc.
just go check https://bitnodes.earn.com/nodes/?q=1037 and you will see, at time of writing this, that out of 9200 nodes listed only 6400 nodes are up to date with the latest rule-following standards of what would be the "full node" standard.


"True?" A node that's pruned has validated everything, and will validate everything; the only difference is that it doesn't seed the blockchain. Are you telling us that pruned nodes are not part of the network even if they do validate?

Quote

c. some nodes use checkpoints(min chain work) and when initially syncing the blockchain. they dont validate every transactions and every block from genesis, they just download the blockchain and only validate transactions/blocks after the milestoned point indicated. again presuming if the peers have the same then it must be good.
which from a security point presuming a new funky txformat not yet understood is 'valid' or early blocks are 'valid' purely due to presumption/trust that peers done their jobs for you. is bad bad security
specially if its only one group of devs that decide what chainwork to have as the default

Quote from: bitcoincore.org
If someone who starts a new full node for the first time knows about any valid blocks, they can then provide the highest-height one of those blocks to Bitcoin Core 0.14.0 and the software will skip verifying signatures in the blocks before the assumed valid block. Since verifying signatures consumes a lot of CPU during IBD, using assumed valid blocks can significantly speed up IBD. All blocks after the assumed valid block will still have their signatures checked normally
Quote from: bitcoincore.org
New users to Bitcoin probably won’t know about any valid blocks, but they probably also won’t know all the consensus rules—so they can simply use the copy of the full node software they download. Bitcoin Core 0.14.0 ships with a default assumed valid block that is set during the release process by having multiple well-known developers each confirm that a certain block is known to be valid by them
'trust core' [facepalm] (do you start seeing that core have more power than an open community want)


It's risky. But after that, and everything is "fine", they still validate.

Quote

d. even without you knowing the technicals. you seem to be trying to want to make this point for a popular social debate going on about presuming who has the real power in the network. EG trying to hide the fact that devs and the major exchanges have sway
however in social attack scenarios, if you as a node runner are told merchants and miners are being made to accept one version of a block, the home users react by upgrading their node to that version out of fear that they will get thrown off the network or be unable to spend their funds because the merchants don't recognise them if they end up on the wrong version.
yep thats right merchants and miners have more sway because people fear being left off a network that is not compatible with their favourite merchant.


What social debate? They are technical facts. Run a full node, you validate, you are part of the network.

Quote

take the min chainwork stuff by core, and merchants using nodes with core's preferred minchainwork. it means nodes not wanting to be core sheep don't get a voice. thus core and merchants have more power than home users to decide what the rules are
(as proven by the controversy of the NYA and august 1st 2017 drama )


achow, this is for you. Cool

Quote

anyway back to the technicals and point.
if a node is not 1037 its not doing all the full node requirements but is treated blindly as being a fullnode in respect of being on the network
not all nodes validate everything.


Yes, legacy nodes don't validate.
legendary
Activity: 4424
Merit: 4794
achow, I believe this thread should be stickied, for all newbies to read before the trolls get them with the wrong information first.

Like the first half of the original post?

That all non-mining full nodes validate? Point out the mistakes and let's discuss. It's always good to learn from one another, and learn which people are serious and which are the trolls.

oh windy. [facepalm]
if anything should be learned is now that new opcodes and transaction formats  can be slid in without hardforks as they please. it should actually be very much highlighted that not all nodes on the network are the newest up-to-date full archiving nodes able to understand all the new txformats.

this means
a. not all nodes actually fully validate. take people running nodes that are not segwit compatible: they do not validate segwit signatures/scripts (now called witnesses). instead they blindly see a tx format they do not understand and presume it's safe, as they received the block containing the tx from a peer so the peer must have validated it (trust, not full validation). and the same goes for even newer differing tx formats. again, for importance: without checking that the spending addresses' scripts/signatures validate.

b. also apart from the criteria of validation, to be a TRUE "full node" you need to be archival, thus able to seed data for your peers to get. this is also not true of all peers, as some run in pruned mode, etc.
just go check https://bitnodes.earn.com/nodes/?q=1037 and you will see, at time of writing this, that out of 9200 nodes listed only 6400 nodes are up to date with the latest rule-following standards of what would be the "full node" standard.

c. some nodes use checkpoints(min chain work) and when initially syncing the blockchain. they dont validate every transactions and every block from genesis, they just download the blockchain and only validate transactions/blocks after the milestoned point indicated. again presuming if the peers have the same then it must be good.
which from a security point presuming a new funky txformat not yet understood is 'valid' or early blocks are 'valid' purely due to presumption/trust that peers done their jobs for you. is bad bad security
specially if its only one group of devs that decide what chainwork to have as the default

It might also be worth mentioning that most of the validation is typically done *before* the block comes in. In Bitcoin a great effort has been made to design things so that most of the validation is a pure function of the transaction-- for almost all rules a given transaction is either valid or invalid depending only on its own content and nothing else.  The exceptions to this are checking that the inputs exist and aren't already spent, checking that the transactions' nLockTime values are valid according to the block time, and checking that various block-wide global limits are obeyed (primarily that the block weight isn't exceeded).
for instance some nodes are not 'native' segwit ready thus although they are network nodes they are not validating segwit signatures/scripts.
legendary
Activity: 2898
Merit: 1823
achow, I believe this thread should be stickied, for all newbies to read before the trolls get them with the wrong information first.

Like the first half of the original post?

That all non-mining full nodes validate? Point out the mistakes and let's discuss. It's always good to learn from one another, and learn which people are serious and which are the trolls.
newbie
Activity: 3
Merit: 8
Thanks for the additional insight!
Is my whole overview a better reflection now of what happens at this "block appending" stage? I've tried to take into consideration all the different feedback I received.
legendary
Activity: 1463
Merit: 1886
Your notes about the mempool seem fine. Just remember, the mempool is sort of an implementation detail to make the system work smoother.

The mempool just makes it easy for nodes to predict what transactions are going to be in future blocks, and it helps miners know what to put in blocks. But each node can have its own mempool policies if it wants, and in fact you can (and I regularly do) run a full node without a mempool at all.
newbie
Activity: 3
Merit: 8
Thanks for all your helpful and detailed replies! Having considered your comments and thought about it all some more, I've come up with the following updated overview. The bits in bold italics are additional assumptions I've made about the mempool, and definitely need reviewing... Wink

Further feedback and corrections, gratefully received! Grin

1. When a new block has been successfully mined, it is propagated throughout the network for nodes to validate.

2. When a node receives a newly mined block, it first validates the whole block (e.g. hash of block header < difficulty target; hash of previous block = hash of block which node wants to build on, etc.). If it fails, the block is immediately rejected. If it is validated, the node immediately propagates the block to other nodes in the network for them to validate, and then proceeds to validate the block's individual transactions (all inputs are valid UTXOs; sum of outputs <= sum of inputs, etc.). If it passes both of these validation steps, the block is accepted by the node, added to its version of the blockchain, and the block's transactions are removed from the node's mempool.

3. If the same node subsequently receives a valid competing block (with the same parent as the one added in 2. above), the initially-added block remains at the tip of the active chain, but the valid competing block is also kept, marked stale, but still monitored. If the node then receives a valid block which links cryptographically to the stale block, the block at the tip of what was the active chain becomes stale, and the now longer branch becomes the active chain instead.

4. Stale blocks are not discarded because they are still valid blocks with an ancestor within the active chain. Their transactions, however, remain in the node's mempool, or are returned to the mempool if previously removed. If a stale block becomes part of the active chain again, its transactions will be removed from the node's mempool.

5. If a node receives a valid block which links cryptographically to a stale block, and this happens to create a chain (B) with a higher total difficulty, but which is still shorter than the active chain (A) in terms of number of blocks, then chain B will become the active chain, and all of the blocks in chain A after it splits (although not discarded) will now become stale.
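To check my own understanding of the mempool bookkeeping in points 2 to 5, I wrote it down as a small Python sketch. It is only my mental model: the function name `reorganize` is made up, blocks are just lists of transaction ids, and I assume a real node would also re-validate returned transactions and drop any that conflict with the new chain.

```python
def reorganize(mempool: set, disconnected_blocks, connected_blocks) -> set:
    """Update a mempool (a set of txids) when the active chain changes.

    Each block is modelled as a list of txids; coinbase transactions
    are left out of the model.
    """
    # Transactions from blocks that just became stale return to the mempool.
    for block in disconnected_blocks:
        mempool.update(block)
    # Transactions confirmed on the new active chain are removed again.
    for block in connected_blocks:
        mempool.difference_update(block)
    return mempool


# Example: block X (t1, t2) goes stale; blocks Y (t1, t3) and Z (t9) become active.
mempool = {"t9"}
reorganize(mempool, [["t1", "t2"]], [["t1", "t3"], ["t9"]])
print(sorted(mempool))
```

In the example only t2 stays unconfirmed, because t1 is also included in the new branch.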
legendary
Activity: 3528
Merit: 4945
2. When a node receives a newly mined block, it:
  • validates all the transactions contained in the block by cross-checking them against its version of the blockchain;

You are correct that it validates all the transactions in the block.  However, it isn't clear what you mean by "cross-checking them against its version of the blockchain".  Everything about the transactions has to be valid.  The transaction inputs MUST all be valid UNSPENT outputs from earlier transactions. The sum of the value of the outputs can NOT exceed the sum of the value of the inputs. The scriptSig must generate a valid result. The in-counter MUST match the number of inputs. The out-counter MUST match the number of outputs. The lock_time (if any) MUST have expired. And so on.
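A minimal sketch of the first two of those checks (inputs must be unspent, outputs must not exceed inputs) might look like the Python below. The `Tx` and `TxIn` types and the dictionary used as a UTXO set are simplifications invented for this example; script evaluation, lock_time and the other checks are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TxIn:
    prev_txid: str   # transaction that created the output being spent
    prev_index: int  # position of that output in the earlier transaction


@dataclass(frozen=True)
class Tx:
    txid: str
    inputs: tuple         # tuple of TxIn
    output_values: tuple  # output amounts in satoshis


def check_tx_inputs(tx: Tx, utxos: dict) -> int:
    """utxos maps (txid, index) -> value. Returns the fee or raises ValueError."""
    seen = set()
    total_in = 0
    for txin in tx.inputs:
        outpoint = (txin.prev_txid, txin.prev_index)
        if outpoint in seen:
            raise ValueError("same output spent twice in one transaction")
        if outpoint not in utxos:
            raise ValueError("input does not exist or is already spent")
        seen.add(outpoint)
        total_in += utxos[outpoint]
    if any(value < 0 for value in tx.output_values):
        raise ValueError("negative output value")
    total_out = sum(tx.output_values)
    if total_out > total_in:
        raise ValueError("outputs exceed inputs")
    return total_in - total_out


def apply_tx(tx: Tx, utxos: dict) -> int:
    """Validate tx, then spend its inputs and create its outputs."""
    fee = check_tx_inputs(tx, utxos)
    for txin in tx.inputs:
        del utxos[(txin.prev_txid, txin.prev_index)]
    for index, value in enumerate(tx.output_values):
        utxos[(tx.txid, index)] = value
    return fee
```

Once `apply_tx` has removed an output from the set, any later transaction trying to spend it again fails the "already spent" check, which is the double-spend protection in its simplest form.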

  • validates the whole block by running the PoW algorithm to check that adding the block to its version of the blockchain generates the same hash value.

Again, you are correct that it validates the whole block. However, it isn't clear what you mean when you say "by running the PoW algorithm to check that adding the block to its version of the blockchain generates the same hash value".  Again, everything about the block is validated.  The merkle root calculated from the list of transactions MUST match the merkle root in the block header. The block MUST have a valid timestamp. It must have a valid previous block hash. The nBits (difficulty target) value must be valid.  The hash of the block header MUST be lower than the current target. And so on.
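For the merkle root and proof-of-work checks, a rough Python sketch is below. The function names are my own. It assumes the standard 80-byte header layout with nBits at byte offset 72, takes the target from the header itself rather than checking that nBits is the value the retarget rules require, and ignores the sign bit and overflow cases of the compact encoding.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, the hash used for block headers and merkle nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def bits_to_target(bits: int) -> int:
    """Expand the compact nBits encoding into the full target value."""
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))


def merkle_root(txids) -> bytes:
    """txids: non-empty list of 32-byte transaction hashes (internal byte order)."""
    if not txids:
        raise ValueError("a block must contain at least one transaction")
    level = list(txids)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # odd count: the last hash is paired with itself
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def header_meets_target(header: bytes) -> bool:
    """header: the 80-byte serialized block header."""
    if len(header) != 80:
        raise ValueError("header must be exactly 80 bytes")
    bits = int.from_bytes(header[72:76], "little")
    # The header hash is interpreted as a little-endian 256-bit integer.
    return int.from_bytes(dsha256(header), "little") <= bits_to_target(bits)
```

Note that nothing here "runs the PoW algorithm" again: checking the work is a single double hash of the header, compared against the target.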

4. Each node also continuously compares its growing version of the blockchain with the versions maintained by the other nodes it communicates with.

You've just repeated what you said in number 3.  You've basically said the same thing twice here, just phrasing it slightly differently each time.  The "continuously compares" that you are talking about here is a comparison to the valid blocks that are received from peers.
legendary
Activity: 2898
Merit: 1823
achow, I believe this thread should be stickied, for all newbies to read before the trolls get them with the wrong information first.
legendary
Activity: 1463
Merit: 1886
I'd really appreciate some more clarity on these points... I've been tying myself up in knots trying to reason it all through...  Wink

To simplify things, forget about the difficulty stuff for now. A node will always reject any blocks that are invalid (e.g. double spending money, or otherwise silly). Each block references a parent block, all the way back to the original block. This in effect creates a chain (hence: blockchain).  A node will also consider the longest chain of blocks to be the only one that matters. If two different chains have the same length, it's not really that important what it does, but nodes will stick with the one they saw first.

Bitcoin miners follow this procedure, but are also trying to find a new block to add on top of the best block they know of.

Basically the real important thing here is the rule: "the longest valid chain" results in nodes converging (i.e. if you have the longest chain, you can prove it to any node and they will accept it. Or if they have the longest chain, they can prove it to you).


Once that all makes intuitive sense, you need to realize that the difficulty (how hard it is to find a new block) is dynamic and the rule "the longest chain" would be trivially exploitable, so nodes use the metric "most work" instead (although it's generally the same as 'longest' under normal circumstances).


Hopefully that makes it a little clearer
staff
Activity: 3458
Merit: 6793
Just writing some code
3. If the same node subsequently receives a valid competing block (to the one just validated and added in 2. above), will it immediately replace the previously added block (which would then become a stale block) if the competing block has a higher difficulty?
Yes. Note that higher difficulty means that the target value for that block is lower than the target value for the block at the current tip, not that one block's hash is lower than the other's. Given two blocks that descend from the same parent, this is impossible, because both must use the same target.

Or will it also add the valid competing block and temporarily maintain a split version of the blockchain until it receives another valid block which links cryptographically to one of the two branches, discarding at this point the block on the "shorter" branch?
When two competing blocks are received and their PoWs are equal (as two blocks descending from the same ancestor would be), the first block received is the one that remains on the tip. The second block is kept, but marked as stale. It is still tracked, so if another block is received extending the stale block, then the node will mark its original tip as stale and switch to the new most-work chain (a block mined on top of the stale block makes that branch the one with the most work).
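As a simplified sketch of that tip selection (not the actual Bitcoin Core code; `BlockIndex`, `arrival` and `best_tip` are names invented here), the rule is "most cumulative work, ties broken by whichever block arrived first":

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BlockIndex:
    name: str
    parent: Optional["BlockIndex"]
    work: int     # work contributed by this block alone
    arrival: int  # order in which this node received the block

    @property
    def chain_work(self) -> int:
        """Total work from the first block up to and including this one."""
        total, node = 0, self
        while node is not None:
            total += node.work
            node = node.parent
        return total


def best_tip(tips):
    # Most cumulative work wins; on a tie the earliest-received block wins.
    return max(tips, key=lambda b: (b.chain_work, -b.arrival))


genesis = BlockIndex("genesis", None, 1, 0)
a = BlockIndex("a", genesis, 1, 1)  # received first, becomes the tip
b = BlockIndex("b", genesis, 1, 2)  # competing block, kept but stale
c = BlockIndex("c", b, 1, 3)        # extends the stale block

print(best_tip([a, b]).name)  # a: equal work, a arrived first
print(best_tip([a, c]).name)  # c: its branch now has more work
```

Blocks are never thrown away in this model; going stale just means no longer being on the path to the best tip.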

If a node discovers a version that is "longer" (in terms of total difficulty, not necessarily number of blocks), does the protocol require it to automatically add the other version's additional, or different, block(s), and to discard any blocks included in its own version which are excluded from the other (which would then become stale blocks)?
Yes.

Or does each node decide for itself if and when to amend its own version in this way? For example, could it decide to wait to make such amendments until a certain number of blocks have been added to the other node's "different" block(s)?
No. If you could, there would be a whole lot of forks.

Does this have anything to do with the process of synching?
Yes. Syncing is basically just this updating. The main thing is that while syncing, certain functionality is disabled to avoid operating with an old chain tip. There are a few optimizations to speed up syncing (e.g. assumevalid and minimumchainwork), but the process of receiving and validating blocks during sync is pretty much the same as receiving and validating them during normal operation.
newbie
Activity: 3
Merit: 8
Is the following an accurate summary of what happens?
(I've also included some questions)

1. When a new block has been successfully mined, it is propagated throughout the network for nodes to validate.

2. When a node receives a newly mined block, it:
  • validates all the transactions contained in the block by cross-checking them against its version of the blockchain;
  • validates the whole block by running the PoW algorithm to check that adding the block to its version of the blockchain generates the same hash value.
If either of the above validation steps fails, then the block is rejected by the node. If both steps pass, then the block is added to the node's own version of the blockchain, and propagated to other nodes in the network for them to validate.

3. If the same node subsequently receives a valid competing block (to the one just validated and added in 2. above), will it immediately replace the previously added block (which would then become a stale block) if the competing block has a higher difficulty? Or will it also add the valid competing block and temporarily maintain a split version of the blockchain until it receives another valid block which links cryptographically to one of the two branches, discarding at this point the block on the "shorter" branch?

4. Each node also continuously compares its growing version of the blockchain with the versions maintained by the other nodes it communicates with.
If a node discovers a version that is "longer" (in terms of total difficulty, not necessarily number of blocks), does the protocol require it to automatically add the other version's additional, or different, block(s), and to discard any blocks included in its own version which are excluded from the other (which would then become stale blocks)? Or does each node decide for itself if and when to amend its own version in this way? For example, could it decide to wait to make such amendments until a certain number of blocks have been added to the other node's "different" block(s)?
Does this have anything to do with the process of synching?

I'd really appreciate some more clarity on these points... I've been tying myself up in knots trying to reason it all through...  Wink