Topic: parsing leveldb txindex

legendary
Activity: 1568
Merit: 6660
February 15, 2024, 06:59:05 AM
#3
Maybe this will help you parse the varints?

Code:
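def parse_varint_hex(data):
    # data is a hex string; returns (value, number of bytes consumed).
    varint_type = hex_to_int(data[0:2])
    if varint_type < 0xFD:
        return varint_type, 1              # value is the prefix byte itself
    elif varint_type == 0xFD:
        return hex_to_int(data[2:6]), 3    # 2-byte value follows the prefix
    elif varint_type == 0xFE:
        return hex_to_int(data[2:10]), 5   # 4-byte value follows the prefix
    elif varint_type == 0xFF:
        return hex_to_int(data[2:18]), 9   # 8-byte value follows the prefix

def hex_to_int(string):
    # Interpret a hex string as a little-endian unsigned integer.
    return int.from_bytes(bytes.fromhex(string), byteorder="little")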

This is what I use in my implementation.

Your code looks like it's reading something other than a varint.

I would advise you to test your function on some sample cases for which you know what the varint is supposed to decode to, because generally speaking, C and Python have slightly different semantics that might break ported code.
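
A handful of known cases for the function above could look like the sketch below. It repeats the two functions so that it runs on its own. The test vectors are hypothetical examples chosen for this prefix-based encoding (commonly called CompactSize), not values taken from the thread.

```python
def hex_to_int(string):
    # Interpret a hex string as a little-endian unsigned integer.
    return int.from_bytes(bytes.fromhex(string), byteorder="little")

def parse_varint_hex(data):
    # Returns (value, number of bytes consumed) for a hex-encoded input.
    varint_type = hex_to_int(data[0:2])
    if varint_type < 0xFD:
        return varint_type, 1
    elif varint_type == 0xFD:
        return hex_to_int(data[2:6]), 3
    elif varint_type == 0xFE:
        return hex_to_int(data[2:10]), 5
    elif varint_type == 0xFF:
        return hex_to_int(data[2:18]), 9

# Illustrative test vectors (assumed, one per branch): hex input -> (value, size)
cases = {
    "fc": (252, 1),
    "fdfd00": (253, 3),
    "fe00000100": (65536, 5),
    "ff0000000001000000": (4294967296, 9),
}

results = {hex_in: parse_varint_hex(hex_in) for hex_in in cases}
for hex_in, expected in cases.items():
    assert results[hex_in] == expected, (hex_in, results[hex_in], expected)
```

Running the same kind of table against the ported function, with inputs whose decoded values are known in advance, should show quickly whether the port or the expected format is at fault.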
newbie
Activity: 3
Merit: 0
February 15, 2024, 04:50:21 AM
#2
I've also opened a related thread at: https://bitcoin.stackexchange.com/questions/121888/what-is-the-data-format-layout-for-txindex-leveldb-values

I've made some progress in Python, but I must still have a bug in my code.
newbie
Activity: 5
Merit: 0
May 25, 2015, 06:41:14 PM
#1
I'm trying to parse the LevelDB txindex, i.e. fetch raw transaction data given a transaction hash.

I'm looking up the key 't' + hash_bytes in LevelDB. For example, for transaction 444b7ecbda319e184da1a3d68968e6e0ca9346ddcf7afd0e2b887a7949128805 the value is:

80 58 8f c4 c8 66 80 9b 24

Now, if I'm reading the bitcoin source correctly, this should be three varints:

  • nFile
  • nPos
  • nTxOffset

If I'm decoding them correctly, the three values are 216, 34694374 and 20004, which looks plausible. However, the data at offset 34694374 + 20004 in file 216 is unexpected:

version: 4293214589
inCount: 93952409796607

So either I'm decoding the varints incorrectly or I'm calculating the file offset incorrectly.
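
One thing worth checking on the offset side is whether the block header is being skipped. This assumes Bitcoin Core's layout, where nPos points at the start of the block's 80-byte header inside the blk file and nTxOffset is counted from the end of that header. Under that assumption the transaction starts at nPos + 80 + nTxOffset, not nPos + nTxOffset. The sketch below only does the arithmetic. The helper names are hypothetical, and the blkNNNNN.dat file naming is the usual Bitcoin Core convention.

```python
BLOCK_HEADER_SIZE = 80  # size in bytes of a serialized Bitcoin block header

def tx_file_offset(n_pos, n_tx_offset, header_size=BLOCK_HEADER_SIZE):
    """Absolute offset of the transaction inside the block file.

    Assumes n_pos is the offset of the block header and n_tx_offset
    is measured from the first byte after that header.
    """
    return n_pos + header_size + n_tx_offset

def block_file_name(n_file):
    """Name of the block file for a given nFile, e.g. 216 -> blk00216.dat."""
    return f"blk{n_file:05d}.dat"

# Values decoded from the LevelDB record quoted above.
n_file, n_pos, n_tx_offset = 216, 34694374, 20004

file_name = block_file_name(n_file)
offset = tx_file_offset(n_pos, n_tx_offset)
print(file_name, offset)
```

If this assumption holds, reading from that offset should give a plausible transaction version (1 or 2) in the first four bytes. If it does not, the offset calculation is probably not the only problem.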

The varint read+decode code is mostly a direct port from C:

Code:
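  def varint(s):
      # read1(s) (not shown) returns the next byte of s as an int
      n = 0
      while True:
          ch = read1(s)
          n = (n << 7) | (ch & 0x7f)  # append the low 7 bits
          if ch & 0x80:
              n += 1                  # continuation bit set: add 1, keep reading
          else:
              return n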
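
To check the decoder by itself, here is a minimal self-contained sketch that runs the same loop over the nine bytes quoted above. The name read_varint and the use of io.BytesIO in place of the read1 helper (which is not shown in the post) are assumptions made for illustration.

```python
import io

def read_varint(stream):
    """Decode one base-128 varint (most significant group first) from a binary stream."""
    n = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("stream ended in the middle of a varint")
        ch = byte[0]
        n = (n << 7) | (ch & 0x7F)  # append the low 7 bits
        if ch & 0x80:
            n += 1                  # continuation bit set: add 1 and keep reading
        else:
            return n

# The LevelDB value from the post: 80 58 8f c4 c8 66 80 9b 24
value = bytes.fromhex("80588fc4c866809b24")
stream = io.BytesIO(value)
fields = [read_varint(stream) for _ in range(3)]
print(fields)  # [216, 34694374, 20004]
```

The three numbers match the values quoted in the post, and all nine bytes are consumed. That suggests the decoding step is behaving as intended and the problem is more likely in how the file offset is derived.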