Topic: How to read block file? (Read 2675 times)

legendary
Activity: 1428
Merit: 1093
Core Armory Developer
September 07, 2011, 12:57:10 AM
#5
Btw, one minor detail I left out: between the header and the TxData is a var_int giving the number of transactions included in that block.
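
For anyone who wants to read that transaction count, here is a minimal sketch of decoding a var_int from a byte buffer. The function name `readVarInt` and the use of `std::vector` are assumptions made for illustration and are not part of the code in this thread. The encoding is the standard Bitcoin one: a first byte below 0xfd is the value itself, while 0xfd, 0xfe and 0xff mean that a 2-, 4- or 8-byte little-endian integer follows.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not from the original post): decodes the var_int that
// starts at buf[pos] and advances pos past it.
std::uint64_t readVarInt(const std::vector<std::uint8_t>& buf, std::size_t& pos)
{
    if (pos >= buf.size())
        throw std::out_of_range("var_int: no data");

    const std::uint8_t first = buf[pos++];
    if (first < 0xfd)
        return first;                       // value fits in the prefix byte

    // 0xfd -> 2 bytes follow, 0xfe -> 4 bytes, 0xff -> 8 bytes
    const std::size_t width = (first == 0xfd) ? 2 : (first == 0xfe) ? 4 : 8;
    if (buf.size() - pos < width)
        throw std::out_of_range("var_int: truncated");

    // Assemble the little-endian integer byte by byte
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < width; ++i)
        value |= static_cast<std::uint64_t>(buf[pos + i]) << (8 * i);
    pos += width;
    return value;
}
```

Calling this on the bytes immediately after the 80-byte header should give numTx, and `pos` then points at the first transaction.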

If you are interested in examining other files, you might consider using my mysteryHex tool that will help you extract unknown file formats:  https://bitcointalksearch.org/topic/mysteryhexpy-10-figure-out-unidentified-binaryhex-data-38336

Check out the linked git repo via "git clone git://github.com/etotheipi/PyBtcEngine.git" and run something like:

Code:
python mysteryHex.py -b --byterange=0,1000 -f ~/.bitcoin/blk0001.dat

This will open the blk0001.dat file (-f)  as binary (-b), and read bytes 0-1000.  It will then find everything recognizable in that chunk of data and display the results visually.  It is quite useful for identifying random files/serialized fragments, or picking apart BTC data formats.
hero member
Activity: 589
Merit: 500
September 04, 2011, 02:43:31 AM
#4
etotheipi, thank you for your detailed explanation.

legendary
Activity: 1428
Merit: 1093
Core Armory Developer
September 03, 2011, 09:04:34 PM
#3
Here's the relevant code from my project. I pulled out a lot of validity checking and it uses my own data structures, so it's not directly usable and is for informational purposes only. But you should be able to adapt it to your project. The structure of blk0001.dat really is quite simple:

Code:
4 | 4 | 80 | TxData | 4 | 4 | 80 | TxData | 4 | 4 | 80 | TxData | ...

First 4 bytes - magic bytes (identifying which network you are on)
Second 4 bytes - NumBlockBytes, the number of bytes in the rest of the block (header plus transaction data)
Next 80 bytes - block header itself
Remaining NumBlockBytes-80 bytes - transaction data in this block [ numTx | Tx1 | Tx2 | Tx3 | ... ]


Code:
uint32_t importHeadersFromBlockFile(std::string filename)
{
      // NOTE: bsb is my buffered binary stream over `filename`; its setup was
      // trimmed from this excerpt along with the validity checks.

      BinaryData  thisHash(32);
      BinaryData  magicNum(4);
      BinaryData  thisHeaderSer(80);
      BlockHeader thisHeader;
      uint32_t    numBlockBytes = 0;
      uint32_t    numHeaders    = 0;   // assumed return value: headers read

      // While there is still data left in the stream (file)...
      while(!bsb.isEof())
      {
            // Get the magic bytes
            magicNum = bsb.reader().get_BinaryData(4);

            // Get total number of bytes in this block (including header)
            numBlockBytes = bsb.reader().get_uint32_t();

            // In case I want to retrieve block data from file later
            uint64_t blkByteOffset = bsb.getFileByteLocation();

            // Pull the header from the block data
            thisHeaderSer = bsb.reader().get_BinaryData(80);

            // Interpret header data and compute hash
            thisHeader.unserialize(thisHeaderSer);
            thisHash = thisHeaderSer.getHash256Digest();

            // Finally, skip the rest of the block data, since we are only pulling headers
            bsb.reader().advance(numBlockBytes-80);

            numHeaders++;
      }

      return numHeaders;
}
legendary
Activity: 1428
Merit: 1093
Core Armory Developer
September 02, 2011, 03:45:37 AM
#2
I don't have the code directly in front of me, and I'm very short on time.  But if it helps, I've done this before, and it is actually quite simple.  I just don't have time to go dig up my source code right now... perhaps tomorrow if you don't have it yet.

In blk0001.dat, the first four bytes of every block are the magic number (f9beb4d9), followed by 4 bytes giving the number of bytes in the block, N.  The next 80 bytes are the header.  The N-80 bytes after that are the block data, which can be ignored.

Rinse, repeat.
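
Once you have the 80 header bytes, splitting them into fields could look roughly like the sketch below. The struct and function names (`ParsedHeader`, `parseHeader`, `u32le`) are hypothetical. The layout assumed is the standard block header: 4-byte version, 32-byte previous block hash, 32-byte merkle root, then 4-byte time, bits and nonce, with all integers little-endian.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Hypothetical struct for illustration only
struct ParsedHeader
{
    std::int32_t                 version;
    std::array<std::uint8_t, 32> prevBlockHash;  // byte order as stored on disk
    std::array<std::uint8_t, 32> merkleRoot;     // byte order as stored on disk
    std::uint32_t                timestamp;      // Unix time
    std::uint32_t                bits;           // compact difficulty target
    std::uint32_t                nonce;
};

// Assembles a little-endian 32-bit integer from 4 bytes at p
static std::uint32_t u32le(const std::uint8_t* p)
{
    return static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Splits a raw 80-byte header into its six fields (offsets 0, 4, 36, 68, 72, 76)
ParsedHeader parseHeader(const std::array<std::uint8_t, 80>& raw)
{
    ParsedHeader h{};
    h.version = static_cast<std::int32_t>(u32le(raw.data()));
    std::copy(raw.begin() + 4,  raw.begin() + 36, h.prevBlockHash.begin());
    std::copy(raw.begin() + 36, raw.begin() + 68, h.merkleRoot.begin());
    h.timestamp = u32le(raw.data() + 68);
    h.bits      = u32le(raw.data() + 72);
    h.nonce     = u32le(raw.data() + 76);
    return h;
}
```

Note that the two hashes are left in on-disk byte order here; block explorers usually display them byte-reversed.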
hero member
Activity: 589
Merit: 500
September 01, 2011, 08:33:04 PM
#1
I just want to read block header information from the block file. Is there any simple C/C++ code for this? Then I could modify it for my purpose. Thanks!