
Topic: Reading Block directory: Sequential write?

hero member
Activity: 714
Merit: 662
August 10, 2014, 08:20:29 PM
#5
After benchmarking on my machine:
One whole scan of the blk folder takes 7 minutes.
A full local download over the protocol (not RPC) takes between 3 and 6 hours, but I might improve that a bit with some multithreading.

I guess I will continue to scan the folder directly for now. I'll follow any changes to the block storage format on GitHub.
If I hit a problem, I'll write a custom process that maintains its own block directory through a protocol connection to the local, trusted node.
Thanks,
staff
Activity: 4326
Merit: 8951
Quote
Thanks, I understand this solution is fragile.
However, I don't yet see any solution that permits enumerating the blocks of bitcoind with high performance.
RPC is usable, but enumerating 300,000 blocks with RPC is 10,000 times slower than using the blk directory directly.
I don't want to implement a full node in NBitcoin either; this is serious business, and any subtle incompatibility with core would provoke a fork.
Is there another solution? If not, is it at least possible to expect, if the format were to change in the future, a flag telling bitcoind to always store full blocks in the directory (but not use it)?
Or a getblocks (with 's') in the RPC API?

You can speak the P2P protocol just to fetch blocks; right now this is the fastest way. Note that I'm not suggesting you implement a full node (you are wise to avoid that), but instead use bitcoind as a filter and fetch blocks over the P2P protocol.

An RPC getblock"s" would likely not be much faster, because much of the time is spent on JSON handling.
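
As an illustration of the P2P approach, blocks are requested with a `getdata` message whose payload is a list of inventory entries. The following is a minimal sketch of building that payload only. The names `Hash256` and `BuildGetDataPayload` are hypothetical, the message header (magic, command, length, checksum) and the version/verack handshake are left out, and hashes are assumed to already be in wire byte order.

```cpp
#include <array>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical alias: a block hash in wire (internal) byte order.
using Hash256 = std::array<unsigned char, 32>;

// Serialize the payload of a "getdata" message asking for full blocks.
// Layout: CompactSize count, then for each entry a 4-byte little-endian
// inventory type followed by the 32-byte hash.
inline std::vector<unsigned char> BuildGetDataPayload(const std::vector<Hash256>& hashes) {
    const std::uint32_t kMsgBlock = 2;  // inventory type for a block
    if (hashes.size() >= 0xFD) {
        // Assumption of this sketch: only the single-byte CompactSize form is handled.
        throw std::length_error("sketch supports fewer than 253 hashes per request");
    }
    std::vector<unsigned char> out;
    out.push_back(static_cast<unsigned char>(hashes.size()));
    for (const auto& h : hashes) {
        for (int i = 0; i < 4; ++i) {
            out.push_back(static_cast<unsigned char>((kMsgBlock >> (8 * i)) & 0xFF));
        }
        out.insert(out.end(), h.begin(), h.end());
    }
    return out;
}
```

The node answers each requested entry with a `block` message, so the client only parses blocks that the trusted local bitcoind has already validated.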
hero member
Activity: 714
Merit: 662
Thanks, I understand this solution is fragile.
However, I don't yet see any solution that permits enumerating the blocks of bitcoind with high performance.
RPC is usable, but enumerating 300,000 blocks with RPC is 10,000 times slower than using the blk directory directly.

I don't want to implement a full node in NBitcoin either; this is serious business, and any subtle incompatibility with core would provoke a fork.

Is there another solution? If not, is it at least possible to expect, if the format were to change in the future, a flag telling bitcoind to always store full blocks in the directory (but not use it)?

Or a getblocks (with 's') in the RPC API?
staff
Activity: 4326
Merit: 8951
That's certainly the case today, but we make no promise to maintain that in the future if changing it serves some useful end. The block files are not really a user-facing interface. Headers-first will make it write to them out of order (but still append-only), but pruning may delete whole blocks out from under you, and in the future we may also implement things like compression, which would change the format.
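
If blocks can land in the files out of chain order, a reader cannot treat file order as chain order. One way to cope, sketched below under stated assumptions, is to record each scanned block's hash and previous-block hash and then walk back from a known tip. `ScannedBlock` and `ChainOrder` are hypothetical names, hashes are plain strings for brevity, and the tip hash is assumed to come from the node itself.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical record kept for every block found while scanning.
struct ScannedBlock {
    std::string hash;      // hash of this block
    std::string prevHash;  // hash of its parent, taken from the block header
};

// Walk back from tipHash through the prevHash links and return the hashes in
// chain order (oldest first). The walk stops at the first block whose parent
// was not scanned; blocks not on the path to the tip are ignored.
inline std::vector<std::string> ChainOrder(const std::vector<ScannedBlock>& scanned,
                                           const std::string& tipHash) {
    std::unordered_map<std::string, std::string> prevOf;
    for (const auto& b : scanned) {
        prevOf[b.hash] = b.prevHash;
    }
    std::vector<std::string> chain;
    std::string cur = tipHash;
    auto it = prevOf.find(cur);
    // The size check guards against a malformed input that contains a cycle.
    while (it != prevOf.end() && chain.size() < prevOf.size()) {
        chain.push_back(cur);
        cur = it->second;
        it = prevOf.find(cur);
    }
    std::reverse(chain.begin(), chain.end());
    return chain;
}
```

This does not help with pruning or a changed on-disk format; it only removes the dependence on write order.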
hero member
Activity: 714
Merit: 662
In NBitcoin, I have coded a class that allows me to enumerate blocks in the Block Directory folder of bitcoind.

It works perfectly.
I am now creating an open source indexer like BlockChain.info with stealth and CC support, and here is how it works:
-I run bitcoind, which maintains the Block Directory.
-Every minute, the Indexer runs, traverses the Block Directory from its last position to the end, and saves the new position, indexing everything on the way.

It works fine, under the assumption that bitcoind will never append a block to blk5.dat if its last block file is blk10.dat.
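
The resume-from-last-position scan described above can be sketched as follows. This is a minimal illustration, not NBitcoin's implementation: it assumes the mainnet record layout of a 4-byte network magic, a 4-byte little-endian length, and then the raw block, and it works on bytes already loaded into memory. `BlockRecord`, `ReadLE32` and `ScanBlockRecords` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Location of one raw block inside the bytes of a blk file.
struct BlockRecord {
    std::size_t offset;  // offset of the first block byte (after magic + length)
    std::uint32_t size;  // block length in bytes
};

// Mainnet network magic as it appears on disk.
static const unsigned char kMainnetMagic[4] = {0xF9, 0xBE, 0xB4, 0xD9};

// Read a little-endian 32-bit integer.
inline std::uint32_t ReadLE32(const unsigned char* p) {
    return std::uint32_t(p[0]) | (std::uint32_t(p[1]) << 8) |
           (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);
}

// Scan data starting at start, append every complete record to out, and
// return the offset at which the next scan should resume.
inline std::size_t ScanBlockRecords(const std::vector<unsigned char>& data,
                                    std::size_t start,
                                    std::vector<BlockRecord>& out) {
    std::size_t pos = start;
    while (pos + 8 <= data.size()) {
        bool magicOk = true;
        for (int i = 0; i < 4; ++i) {
            if (data[pos + i] != kMainnetMagic[i]) magicOk = false;
        }
        if (!magicOk) break;  // not a record header (for example zero padding): stop here
        const std::uint32_t size = ReadLE32(&data[pos + 4]);
        if (size > data.size() - (pos + 8)) break;  // record not fully written yet: retry later
        out.push_back({pos + 8, size});
        pos += 8 + size;
    }
    return pos;
}
```

The returned offset, together with the file number, is the "last position" the indexer would persist between runs. The scheme is only correct while earlier files are never appended to, which is exactly the assumption in question.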

But one of my users seems to be telling me that my assumption is wrong, and hit a bug because of it.
So I looked at the bitcoind source code.

I noticed that the LoadBlockIndexDB() method, which is called at startup, retrieves the last position where it wrote a block into the nLastBlockFile and infoLastBlockFile global variables.
Quote
    pblocktree->ReadLastBlockFile(nLastBlockFile);
    LogPrintf("LoadBlockIndexDB(): last block file = %i\n", nLastBlockFile);
    if (pblocktree->ReadBlockFileInfo(nLastBlockFile, infoLastBlockFile))
        LogPrintf("LoadBlockIndexDB(): last block file info: %s\n", infoLastBlockFile.ToString());

Then I saw that when a new block is saved, the next free position in that file is found with FindBlockPos.
The search for free space starts from the nLastBlockFile position.
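
To make that reading concrete, here is a simplified model of such a position search. It is not bitcoind's code: `WriteCursor`, `BlockPos`, `ReservePos` and the `maxFileSize` parameter are hypothetical, and the model only captures the idea that the search starts at the last file and moves forward when that file is full.

```cpp
#include <cstdint>

// Hypothetical write state: the file being appended to and how full it is.
struct WriteCursor {
    int file;            // index N of the blkN.dat file currently in use
    std::uint64_t size;  // bytes already used in that file
};

// Hypothetical result: where a new record would be written.
struct BlockPos {
    int file;
    std::uint64_t offset;
};

// Reserve addSize bytes for a new record. If the current file cannot hold
// them, move to the next file. The file index never decreases in this model.
inline BlockPos ReservePos(WriteCursor& cur, std::uint64_t addSize, std::uint64_t maxFileSize) {
    if (cur.size > 0 && cur.size + addSize > maxFileSize) {
        cur.file += 1;
        cur.size = 0;
    }
    BlockPos pos{cur.file, cur.size};
    cur.size += addSize;
    return pos;
}
```

In this model a block can only go into the last file or a later one, which is the behaviour the indexer relies on.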

So with this information, I conclude that bitcoind writes sequentially to the Block Directory and can never go back to write into an earlier file.

Can a dev confirm my conclusion, or am I missing something?