
Topic: How to manually verify blk*.dat and rev*.dat files? (Read 376 times)

full member
Activity: 179
Merit: 131
Existing network nodes do not ask for old blocks. So, your node will be fine.
Transaction validation does not require data from old blocks either.
So, if your UTXO database is fine - your node is also fine and robust.
The only thing you can not do is cloning.
Thanks for your confirmation.

Perhaps I am just being too paranoid about maintaining the integrity of my full nodes. That is perhaps because I have recently kept updating my criteria for banning misbehaving peers, on top of the ones automatically banned by bitcoind.

I think I will post another topic about this as I am not sure about something.
sr. member
Activity: 770
Merit: 305
But the peer which has corrupted blk*.dat files will eventually be isolated as a lot of peers ban it.
Existing network nodes do not ask for old blocks. So, your node will be fine.
Transaction validation does not require data from old blocks either.
So, if your UTXO database is fine - your node is also fine and robust.
The only thing you can not do is cloning.

Quote
mechanism to notify the peer which sends corrupted blocks
No reason to notify a daemon.
full member
Activity: 179
Merit: 131
It's my understanding that the blk*.dat files are append only, so everything but the current file is 100% immutable. (Until I started pruning on my oldest Bitcoin node, my low numbered blk*.dat files had a last-modified timestamp from 2011.) If this is correct, there's no need to repeatedly verify the entire blockchain with exhaustive checks, since the data in the large majority of the dat files will never be updated by the client. Something like this should suffice:
Indeed. Only the latest blk*.dat file is being updated. The rest of the blk*.dat files are left untouched and are only read when peers request the blocks located in those older blk*.dat files.

My question was: what happens when a peer requests old blocks that are located in a corrupted blk*.dat file (e.g. because bitcoind crashed or was not shut down properly), so that bitcoind sends corrupted blocks as well?

From the Bitcoin network's perspective, I think those peers will just reject the corrupted blocks (perhaps banning the peer which sent them) and request the same blocks from other peers. So there is no issue at the Bitcoin network level. But the peer which has corrupted blk*.dat files will eventually be isolated as a lot of peers ban it.
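One reason a receiving peer can reject a damaged block without trusting the sender is that the block header commits to the transactions through a Merkle root, and the header itself must satisfy proof of work. The following is only a minimal sketch of the Merkle root calculation, not the code bitcoind runs; the function names are made up for illustration, and txids are assumed to be given in the usual displayed (byte-reversed) hex form.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice, as used for Bitcoin txids and block hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(txids_hex):
    """Compute a block's Merkle root from its txids (displayed hex form).

    Hypothetical helper for illustration. At each level the hashes are
    paired and hashed together; an odd element is paired with itself.
    """
    if not txids_hex:
        raise ValueError("a block contains at least one transaction")
    # Displayed txids are byte-reversed relative to the internal order.
    level = [bytes.fromhex(txid)[::-1] for txid in txids_hex]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last hash on odd levels
        level = [
            double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0][::-1].hex()
```

If any byte of a transaction changes on disk, its txid changes, the recomputed root no longer matches the one in the header, and the peer can discard the block.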

So I was wondering if there could be a mechanism to notify the peer which sends corrupted blocks, so that it can update its blk*.dat files based on the data from its outbound peers. But the main problem is that nobody can know for sure whether a corrupted block was sent because the peer has a corrupted blk*.dat file or because the peer intentionally sends garbage blocks. So I think it might be difficult to implement such a mechanism. But perhaps the top Bitcoin programmers have some ideas for dealing with this kind of issue.
legendary
Activity: 2268
Merit: 1092

How about my other question? I am sorry for a basic question as I didn't realise that until now.

What will happen if my main full node (which has all blocks from blk00000.dat) fails to send valid data to its peers due to corrupted (for whatever reasons) blk*.dat and/or rev*.dat files? What kind of mechanism applies in bitcoind?

It's my understanding that the blk*.dat files are append only, so everything but the current file is 100% immutable. (Until I started pruning on my oldest Bitcoin node, my low numbered blk*.dat files had a last-modified timestamp from 2011.) If this is correct, there's no need to repeatedly verify the entire blockchain with exhaustive checks, since the data in the large majority of the dat files will never be updated by the client. Something like this should suffice:

1. A byte level checksum of all blk*.dat and rev*.dat files except the current file to confirm that previously verified data has not changed;
2. A complete verification of individual blocks and transactions in the current blk*.dat file (-checkblocks=x at startup, or the verifychain RPC call)

The value of x would need to be generous in order to ensure it covers the final block file. (Would be handy if there was a way to specify the height, blockhash, or blk*.dat file to start at.)
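Step 1 above could be sketched roughly as follows. This is an illustration under assumptions, not an existing tool: the function names and the JSON manifest file are invented here, SHA-256 is used in place of any particular checksum, and "the current file" is taken to mean the highest file number present. Undo data may still be appended to a rev*.dat file slightly behind the newest blk*.dat file, so excluding more than one file number may be safer in practice.

```python
import hashlib
import json
import re
from pathlib import Path

# Matches blk00000.dat, rev01513.dat, and so on.
BLOCK_FILE = re.compile(r"^(blk|rev)(\d{5})\.dat$")


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large block files are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def finished_files(blocks_dir):
    """Return blk/rev files except those carrying the highest file number."""
    found = []
    for path in Path(blocks_dir).iterdir():
        match = BLOCK_FILE.match(path.name)
        if match:
            found.append((int(match.group(2)), path))
    if not found:
        return []
    newest = max(number for number, _ in found)
    return sorted(path for number, path in found if number < newest)


def check_against_manifest(blocks_dir, manifest_path):
    """Compare finished files with stored checksums; record new files.

    Returns the names of files whose checksum differs from the manifest.
    The manifest (a hypothetical JSON file) keeps the first checksum seen.
    """
    manifest_path = Path(manifest_path)
    known = {}
    if manifest_path.exists():
        known = json.loads(manifest_path.read_text())
    changed = []
    for path in finished_files(blocks_dir):
        digest = sha256_of(path)
        if path.name in known and known[path.name] != digest:
            changed.append(path.name)
        known.setdefault(path.name, digest)
    manifest_path.write_text(json.dumps(known, indent=2, sort_keys=True))
    return changed
```

A call such as check_against_manifest("/path/to/blocks", "/path/to/manifest.json") (both paths are placeholders) would return an empty list on the first run and afterwards list any older file whose bytes have changed. It only shows that a file changed since the manifest was written, not whether its contents were valid to begin with.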

Back in 2013 it seems the client checked the last 2500 blocks by default! https://bitcointalksearch.org/topic/what-is-checkblocks-for-and-why-does-it-default-so-high-141200
full member
Activity: 179
Merit: 131
Next time I will ask you to write a small program which intercepts network traffic
from your node to port 8333 and puts some garbage in it.  Grin
Why do you think I would do that? Even if I were a top Bitcoin programmer, that would go against my own personal principles. I have already spent a lot of my own effort and (some) money supporting the Bitcoin network since version 0.9.x, as I like the idea and the objective. I hate the change of its name to "Bitcoin Core" though, made just because some stupid people tried to disrupt it.

Take a binary file editor, open one of the oldest files (for example blk00005.dat)
and change some data in it. I am quite sure that the bitcoind on your node will not
discover any problems.
I believe this would only happen with crappy in-house software maintained by just 1 or 2 programmers. A lot of companies are involved now that this has become an industry with a huge market valuation, supported by thousands of developers around the globe. So I really doubt that we (including me) are so stupid as to trust software that has no mechanism to maintain the integrity of its database.

What I asked in this topic is how to do a manual check that is better and faster than the "bitcoin-cli verifychain" command. Maybe there is no other way to do a manual check. But I believe there must be a mechanism within the bitcoind (and bitcoin-qt) software to make sure the integrity of the blockchain data is properly maintained. I asked about that mechanism so that I can understand the impact on my full node. A few years back, when the blockchain data on my full node was corrupted, I had to reindex, which resulted in re-downloading all the blocks from blk00000.dat onward.

But you pointed out something that I have to double-check myself. I think I will intentionally corrupt the second-to-last blk*.dat file and run my full node on the Bitcoin testnet to see what happens.
sr. member
Activity: 770
Merit: 305
I think that would be enough. But that means I assume that everything must be
valid as all block files are valid. I really don't like that kind of assumption though.

Take a binary file editor, open one of the oldest files (for example blk00005.dat)
and change some data in it. I am quite sure that the bitcoind on your node will not
discover any problems.

Next time I will ask you to write a small program which intercepts network traffic
from your node to port 8333 and puts some garbage in it.  Grin
full member
Activity: 179
Merit: 131
Out of curiosity, I just executed
Code:
anto@deeppurple:~$ bitcoin-cli verifychain 1 10000
true
anto@deeppurple:~$

And the result
Code:
2019-02-20T22:24:01Z Verifying last 10000 blocks at level 1
2019-02-20T22:24:01Z [0%]...[10%]...[20%]...[30%]...[40%]...[50%]...[60%]...[70%]...[80%]...[90%]...[DONE].
2019-02-20T22:32:38Z No coin database inconsistencies in last 10000 blocks (0 transactions)

It took 8 minutes and 37 seconds just to verify the validity of the last 10000 blocks. At that rate it would take about 8 hours to verify 565947 blocks. I think that is doable. So the next time I restart my bitcoind, I will use the following command to check the validity of all block files.
Code:
bitcoind -checkblocks=0 -checklevel=1

I think that would be enough. But that means I assume that everything must be valid as all block files are valid. I really don't like that kind of assumption though.

I am wondering how many people (who run full nodes obviously) check all block files (checkblocks=0) with checklevel=4. Is there anybody who does that regularly?

There must be a better way than that, as otherwise it will take a lot longer as the chain keeps growing toward 1 million blocks. For instance, some kind of process could notify a peer that a block it sent out is invalid, so that the peer can update its block file based on the data from its outbound peers.
full member
Activity: 179
Merit: 131
Your peers will disconnect/ban your misbehaving node and will try to get valid data from other sources.
This is exactly what I want to avoid, hence my questions.

If something outside my control went wrong and corrupted the blk*.dat and/or rev*.dat files, e.g. a power outage or glitch, my full node should not be categorised as a "misbehaving node". There must be a better mechanism to keep such full nodes from being banned. In my case, I want to be able to ensure the integrity of the blockchain files on my full node.
legendary
Activity: 2394
Merit: 1216
The revolution will be digital
Your files are not corrupted.
a) Bitcoin Core does not write blocks into blk*.dat files in chain order.
b) Bitcoin Core keeps orphan blocks in blk*.dat files.
This causes differences.

The databases are equal - but the files can be different
Why are the old orphans not removed? Is it a flaw in the architecture?
sr. member
Activity: 770
Merit: 305
What will happen if my main full node (which has all blocks from blk00000.dat) fails to send valid
data to its peers due to corrupted (for whatever reasons) blk*.dat and/or rev*.dat files?
Nothing will happen.
Your peers will disconnect/ban your misbehaving node and will try to get valid data from other sources.
(Nodes do not send rev*.dat information to each other. These files are local database files for
updating the current utxo set)

Quote
What kind of mechanism applies in bitcoind?
Trust nobody. Check everything yourself. Follow the white rabbit (that is, the longest chain).
full member
Activity: 179
Merit: 131
a) Bitcoin Core does not write blocks into blk*.dat files in chain order.
b) Bitcoin Core keeps orphan blocks in blk*.dat files.
Thanks a lot. That explains why my pruning full node and my main full node have different md5sums from blk01513.dat and rev01513.dat onward, as from those files onward the two nodes have been running independently.

How about my other question? I am sorry for a basic question as I didn't realise that until now.

What will happen if my main full node (which has all blocks from blk00000.dat) fails to send valid data to its peers due to corrupted (for whatever reasons) blk*.dat and/or rev*.dat files? What kind of mechanism applies in bitcoind?

I am running bitcoin core 0.17.1 by the way.

And does it make sense to run, for instance, "bitcoin-cli verifychain 4 563927" once in a while to ensure the integrity of my full node?

On my VPS that might take about 7 hours as it took about 4 seconds to verify 100 blocks.

Edited:
Sorry... wrong calculation on "4 seconds to verify 100 blocks". It took about 1 minute 14 seconds to verify 100 blocks. So verifying 563927 blocks will take about 5 days on my VPS.
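The two estimates above can be reproduced with a simple linear extrapolation. This is only a back-of-the-envelope sketch with an invented helper name; it assumes every block takes as long to verify as the 100 most recent ones, which probably overstates the total because early blocks are much smaller.

```python
def estimate_seconds(sample_seconds, sample_blocks, total_blocks):
    """Scale a measured verification time linearly to the whole chain."""
    return sample_seconds / sample_blocks * total_blocks


TOTAL_BLOCKS = 563927  # chain height used in the post above

# Mistaken figure: 4 seconds per 100 blocks.
wrong = estimate_seconds(4, 100, TOTAL_BLOCKS)
# Corrected figure: 1 minute 14 seconds (74 seconds) per 100 blocks.
corrected = estimate_seconds(74, 100, TOTAL_BLOCKS)

print(f"at 4 s per 100 blocks:  {wrong / 3600:.1f} hours")
print(f"at 74 s per 100 blocks: {corrected / 86400:.1f} days")
```

The first line comes out at roughly 6.3 hours and the second at roughly 4.8 days, in line with the "about 7 hours" and "about 5 days" quoted above.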
sr. member
Activity: 770
Merit: 305
Your files are not corrupted.
a) Bitcoin Core does not write blocks into blk*.dat files in chain order.
b) Bitcoin Core keeps orphan blocks in blk*.dat files.
This causes differences.

The databases are equal - but the files can be different
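Because the file layout can legitimately differ between nodes, a check that does not depend on layout may be more useful than comparing checksums across machines. The sketch below walks the records of one blk*.dat file and checks that each 80-byte header still satisfies its own proof-of-work target. It rests on several assumptions: the on-disk record format is taken to be a 4-byte network magic, a 4-byte little-endian length, then the serialized block, as in the 0.17-era client; the file is assumed not to be obfuscated on disk, which may not hold for newer releases; and all function names are invented here. It covers only headers, so damage inside the transaction data would go unnoticed.

```python
import hashlib
import struct

MAINNET_MAGIC = bytes.fromhex("f9beb4d9")  # mainnet message start bytes


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def target_from_bits(bits: int) -> int:
    """Expand the compact 'nBits' field of a header into the full target."""
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))


def header_has_valid_pow(header: bytes) -> bool:
    """True if the header hash is at or below the target it declares."""
    bits = struct.unpack_from("<I", header, 72)[0]  # nBits sits at offset 72
    hash_value = int.from_bytes(double_sha256(header), "little")
    return hash_value <= target_from_bits(bits)


def iter_block_headers(raw: bytes, magic: bytes = MAINNET_MAGIC):
    """Yield (offset, header) for each record found in raw blk*.dat bytes.

    Stops at the first position that does not start with the magic bytes,
    which is either trailing zero padding or a damaged record.
    """
    pos = 0
    while pos + 8 <= len(raw):
        if raw[pos:pos + 4] != magic:
            break
        size = struct.unpack_from("<I", raw, pos + 4)[0]
        if size < 80 or pos + 8 + size > len(raw):
            break
        yield pos, raw[pos + 8:pos + 88]
        pos += 8 + size


def bad_headers(raw: bytes):
    """Return the offsets of records whose header fails the check."""
    return [pos for pos, header in iter_block_headers(raw)
            if not header_has_valid_pow(header)]
```

Reading a file with open(path, "rb").read() and passing the bytes to bad_headers would give the offsets of suspicious records. The displayed block hash of a header is double_sha256(header)[::-1].hex(), which can be looked up with "bitcoin-cli getblockheader" to tell a stale (orphan) block from a damaged one.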
full member
Activity: 179
Merit: 131
I apologise if this has been asked and answered before, but I failed to find it on this forum.

Every time I find a new VPS provider offering cheaper and bigger resources, I move my full node to the new VPS. I verify the integrity of the blk*.dat and rev*.dat files by comparing their md5sums on the source and target VPS.

I recently decided not to renew the contract for one of my VPSes. Instead of letting it sit idle until its contract expires, I configured another full node on it with pruning, as it has only about 100 GB of storage space.

When I compared the pruning full node with the main one, the most recent blk*.dat and rev*.dat files had different md5sums, as shown below.

Code:
.
.
49d09f29e04dd8c91fea980a1978ee9c  blk01510.dat        49d09f29e04dd8c91fea980a1978ee9c  blk01510.dat
6bdeff13773a02b73b38a859b5f8a461  blk01511.dat        6bdeff13773a02b73b38a859b5f8a461  blk01511.dat
e419727a0b5a370256f623ccbac64e13  blk01512.dat        e419727a0b5a370256f623ccbac64e13  blk01512.dat
802fa5848a9cb9fa4adc0b5e2b584207  blk01513.dat      | 3edb80b29dbb6b80c9fbe477d26b5edd  blk01513.dat
bb8f563889804ebce134f371cdd61213  blk01514.dat      | eb7642fcc1de5ff0825a23e23d991e71  blk01514.dat
eeedb8d91fe9fa7cc19021d142744b0f  blk01515.dat      | 08786174285354d992c730c14fb50edb  blk01515.dat
.
.
2e3c8f5cf0bfa1e9147390d238d71001  blk01533.dat      | 07e3d7f105cf504830afae31be5c9574  blk01533.dat
caf153aaf6433814b6579f0a531ae6b4  blk01534.dat      | 5f3021e57cfd4a97204bf25f38ef56a8  blk01534.dat
9a7d8d931540f64a06c7fce73db791cf  blk01535.dat      | 49ae0a801b7fd36ff22f578e521141a8  blk01535.dat
.
.
3a70613f860c242201f4f84d27524992  rev01510.dat        3a70613f860c242201f4f84d27524992  rev01510.dat
7dc17fc34d5ffe5dca1f4fccf0591641  rev01511.dat        7dc17fc34d5ffe5dca1f4fccf0591641  rev01511.dat
ac77bbefd85305fce50dfcc63e41cd1d  rev01512.dat        ac77bbefd85305fce50dfcc63e41cd1d  rev01512.dat
95c0f5bc1b19ceee3ee7ba727e580fd3  rev01513.dat      | 0c861743a64dfa3b50002aee40e391b9  rev01513.dat
9d6d81f0429fc33f36ed7fb49cdfe1bd  rev01514.dat      | 74caa8ce3429ccb55e4fb50de0a23630  rev01514.dat
375cf8c1ffef33d6899981d09087205b  rev01515.dat      | 43c2a326b5c5c66da8172f924ba4357d  rev01515.dat
.
.
207ce7cc6b82f9dd0e514edfbfaa3332  rev01533.dat      | dd0dc8175770ea616dce3d7c4fb292fc  rev01533.dat
3a22716cecc8759d96545c8be7274c4a  rev01534.dat      | 167bce7ead5ee42fcdb7629fb86edb2e  rev01534.dat
befed144db33cd208be59bdedb360098  rev01535.dat      | 63fa05c946ad883797ed6053b3a87da5  rev01535.dat

That makes me wonder about the integrity of my blockchain files.
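The side-by-side comparison above could also be automated along these lines. This is a rough sketch with invented function names; it assumes each listing is the plain text output of md5sum (a 32-character hash, whitespace, then the file name), and it only reports which files differ, not why they differ.

```python
def parse_md5sum(text):
    """Map file name to hash for every well-formed md5sum output line."""
    sums = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and len(parts[0]) == 32:
            sums[parts[1]] = parts[0]
    return sums


def differing_files(listing_a, listing_b):
    """Return the sorted names present in both listings with unequal hashes."""
    a = parse_md5sum(listing_a)
    b = parse_md5sum(listing_b)
    return sorted(name for name in a.keys() & b.keys() if a[name] != b[name])


# Three of the lines from the comparison above, one listing per node.
main_node = """\
e419727a0b5a370256f623ccbac64e13  blk01512.dat
802fa5848a9cb9fa4adc0b5e2b584207  blk01513.dat
bb8f563889804ebce134f371cdd61213  blk01514.dat
"""
pruned_node = """\
e419727a0b5a370256f623ccbac64e13  blk01512.dat
3edb80b29dbb6b80c9fbe477d26b5edd  blk01513.dat
eb7642fcc1de5ff0825a23e23d991e71  blk01514.dat
"""

print(differing_files(main_node, pruned_node))
```

With the sample lines this prints the two file names from blk01513.dat onward, matching the point where the two columns above start to diverge.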

As far as I know, the only command to verify the blockchain database is "bitcoin-cli verifychain", as below:

bitcoin-cli shell
Code:
anto@deeppurple:~$ bitcoin-cli verifychain 4 100
true
anto@deeppurple:~$

debug.log
Code:
.
.
2019-02-20T16:46:18Z Verifying last 100 blocks at level 4
2019-02-20T16:46:18Z [0%]...[10%]...[20%]...[30%]...[40%]...[50%]...[DONE].
2019-02-20T16:47:32Z No coin database inconsistencies in last 100 blocks (236944 transactions)
.
.

But I don't think that will tell me which blk*.dat and rev*.dat files are corrupted (if any). And it will take quite a long time to verify all the blockchain files.

Is there any better and faster way to manually verify those blk*.dat and rev*.dat files?

What will happen if my full node sends corrupted data to peers that ask for very old blocks, because of corrupted blk*.dat and rev*.dat files? Will my full node get notified so that it can automatically repair the relevant corrupted blk*.dat and rev*.dat files?

Thanks a lot in advance for your help.