Topic: Pruning in the reference client: ultraprune mode

legendary
Activity: 1072
Merit: 1189
Quote
Re: the blocks you store to disk:
1) Doing something similar, I noticed that if you zip the later 500 KB blocks they go down to 300-350 KB.

Right, blocks themselves are certainly compressible as well. Even just a custom serializer could already remove a lot of redundancy: signatures in 64 instead of ~72 bytes, public keys compressed to 33 bytes, not wasting 4 bytes each on version, locktime, nSequence, ...
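
To illustrate the signature case: DER encoding wraps the two 32-byte values r and s in type/length framing plus sign padding, which is pure redundancy on disk. A rough sketch of stripping it (hypothetical helper names, not code from the branch):

Code:
// Hypothetical sketch: strip the DER framing from an ECDSA signature,
// keeping only the raw 64-byte (r, s) pair.
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Copy one DER INTEGER into a fixed 32-byte big-endian field.
static void DecodeDERInt(const std::vector<uint8_t>& sig, size_t& pos, uint8_t* out32)
{
    if (pos + 2 > sig.size() || sig[pos] != 0x02) throw std::runtime_error("bad DER");
    size_t len = sig[pos + 1];
    pos += 2;
    if (len == 0 || len > 33 || pos + len > sig.size()) throw std::runtime_error("bad DER");
    if (sig[pos] == 0x00) { ++pos; --len; }   // drop DER's sign-padding byte
    if (len > 32) throw std::runtime_error("bad DER");
    std::memset(out32, 0, 32);
    std::memcpy(out32 + (32 - len), &sig[pos], len);
    pos += len;
}

// DER: 0x30 <len> 0x02 <rlen> <r> 0x02 <slen> <s>  ->  raw r || s
void CompactSignature(const std::vector<uint8_t>& der, uint8_t out[64])
{
    if (der.size() < 8 || der[0] != 0x30) throw std::runtime_error("bad DER");
    size_t pos = 2;                    // skip the SEQUENCE header
    DecodeDERInt(der, pos, out);       // r: up to 33 bytes in DER, 32 raw
    DecodeDERInt(der, pos, out + 32);  // s: idem
}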

Quote
2) The early blocks are only 200 bytes or so, but (on my Mac) they occupy 4 KB on disk. This inflates the total size of all the blocks on disk by some 500 MB.

Sure, I chose one file per block for now simply because it was clean and easy to implement. Combining at least some blocks per file would also help efficiency. In fact it's currently two files per block, as the undo data is stored in a separate file.

Quote
3) For a hard disk, a good amount of that 22-minute full import is head seek time (5 ms × 200,000 = 1,000 s).

Not necessarily. There are caches at several levels, and when many files are created per second, I'm sure the OS batches them into a single write, or only a few.

Quote
For 2) and 3), I wondered about combining the small blocks as entries in zip files, to reduce both the disk wastage and the seek times. The tiny blocks up to, say, block 100k will never be reorged, so bundling them up would do no harm. You could record the containing zip file in your block index.

I was toying with a rule like 'add to the current zip if it contains fewer than 500 entries and is less than 1 MB'.

Sure, things like that are possible. Note that if they are literally zip files, fetching a block requires reading and parsing the ZIP central directory to locate the data. A plain binary concatenation of (potentially compressed) blocks, with each block's start byte position stored in the block index, is probably easier.
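
Something along these lines, say (the file name, length prefix and helper names are just illustration, not what the branch does):

Code:
// Sketch of the "flat file + byte offset" scheme; error handling
// omitted for brevity.
#include <cstdint>
#include <cstdio>
#include <vector>

// Append a raw block to the flat file; the returned offset is what the
// block index would store.
uint64_t AppendBlock(const std::vector<uint8_t>& block)
{
    FILE* f = std::fopen("blocks.dat", "ab");
    std::fseek(f, 0, SEEK_END);
    uint64_t offset = (uint64_t)std::ftell(f);
    uint32_t size = (uint32_t)block.size();
    std::fwrite(&size, sizeof(size), 1, f);   // length prefix
    std::fwrite(block.data(), 1, size, f);    // raw (or compressed) block
    std::fclose(f);
    return offset;
}

// One seek plus two reads; no ZIP central directory to parse.
std::vector<uint8_t> ReadBlock(uint64_t offset)
{
    FILE* f = std::fopen("blocks.dat", "rb");
    std::fseek(f, (long)offset, SEEK_SET);
    uint32_t size = 0;
    std::fread(&size, sizeof(size), 1, f);
    std::vector<uint8_t> block(size);
    std::fread(block.data(), 1, size, f);
    std::fclose(f);
    return block;
}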
legendary
Activity: 924
Merit: 1004
Firstbits: 1pirata
Great concept, Pieter. I will be watching this thread with interest.
legendary
Activity: 1708
Merit: 1066
Re: the blocks you store to disk:
1) Doing something similar, I noticed that if you zip the later 500 KB blocks they go down to 300-350 KB.
2) The early blocks are only 200 bytes or so, but (on my Mac) they occupy 4 KB on disk. This inflates the total size of all the blocks on disk by some 500 MB.
3) For a hard disk, a good amount of that 22-minute full import is head seek time (5 ms × 200,000 = 1,000 s).

For 2) and 3), I wondered about combining the small blocks as entries in zip files, to reduce both the disk wastage and the seek times. The tiny blocks up to, say, block 100k will never be reorged, so bundling them up would do no harm. You could record the containing zip file in your block index.

I was toying with a rule like 'add to the current zip if it contains fewer than 500 entries and is less than 1 MB'.

For 3), an SSD would speed things up immensely.
legendary
Activity: 1072
Merit: 1189
(cross-post from the mailing list)

Hello all,

I've implemented a new block/transaction validation system for the reference client, called "ultraprune".

Traditionally, pruning in Bitcoin is understood as the ability to delete full transactions from storage once all their outputs are spent and they are buried deeply enough in the chain. That is not what this is about.

Given that almost all operations performed on the blockchain do not require full previous transactions, but only their unspent outputs, it seemed wasteful to use the fully indexed blockchain for everything. Instead, we keep a database with only the unspent transaction outputs. After some effort to write custom compact serializations for these, I could reduce the storage required for such a dataset to less than 70 MB. It is kept in a BDB database file (coins.dat), which, with indexing and overhead, takes around 130 MB.
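
As an illustration of the kind of compact serialization this involves: small integers such as heights, output indices and compressed amounts can be stored in a base-128 variable-length encoding instead of fixed-width fields, so typical values take only 1-3 bytes. A sketch of such an encoding (illustrative only; not necessarily the exact coins.dat format):

Code:
// Sketch of an MSB-base-128 variable-length integer: 7 data bits per
// byte, high bit set on every byte except the last. The "-1" per extra
// byte gives each encoded length a unique, gap-free value range.
#include <cstdint>
#include <vector>

std::vector<uint8_t> WriteVarInt(uint64_t n)
{
    uint8_t tmp[10];
    int len = 0;
    while (true) {
        tmp[len] = (n & 0x7F) | (len ? 0x80 : 0x00);
        if (n <= 0x7F) break;
        n = (n >> 7) - 1;
        ++len;
    }
    // Bytes were produced least-significant group first; emit reversed.
    std::vector<uint8_t> out;
    for (int i = len; i >= 0; --i) out.push_back(tmp[i]);
    return out;
}

uint64_t ReadVarInt(const std::vector<uint8_t>& data)
{
    uint64_t n = 0;
    for (uint8_t b : data) {
        n = (n << 7) | (b & 0x7F);
        if (b & 0x80) ++n; else break;  // continuation bit
    }
    return n;
}

With this, 0 encodes as one byte, 128 as the two bytes {0x80, 0x00}, and a typical block height still fits in three bytes instead of a fixed four.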

Now, this is not enough: you cannot run a full node with just these outputs. In particular, such a node cannot undo block connections, cannot rescan for wallet transactions, and cannot serve the chain to other nodes. These are, however, quite infrequent operations. To compensate, we keep full but non-indexed blocks around in addition to coins.dat (for now, each block in a separate file), plus "undo" information for connected blocks. We also need a block index with metadata about all stored blocks, which currently takes around 40 MB (though that could easily be reduced too). Note that this means we lose the ability to locate an arbitrary transaction in the chain from its txid, but that is not necessary for normal operation. Such an index could be re-added later, if necessary.
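
For illustration, the per-block metadata such an index needs could look roughly like this (field names are made up, not the actual data structures):

Code:
// Hypothetical sketch of a block index entry once the fully indexed
// blockchain is gone: where the raw block and its undo data live, plus
// the block's chain position.
#include <cstdint>

struct CBlockIndexEntry {
    uint8_t  hashBlock[32];  // block hash (the index key)
    int32_t  nHeight;        // position in the chain
    uint32_t nFileBlock;     // which file holds the full block
    uint32_t nFileUndo;      // which file holds the undo data
    uint64_t nDataPos;       // byte offset of the block in its file
    uint64_t nUndoPos;       // byte offset of the undo data
    uint8_t  nStatus;        // validity / data-availability flags
};
// Note there is no txid -> transaction mapping here, which is why
// looking up arbitrary historical transactions is no longer possible.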

Once you have this, the step to pruning is easily made: simply delete the block files and undo information for old blocks. This isn't implemented yet, but it shouldn't be a problem. It just means you cannot rescan, reorganize past, or serve those old blocks, but once they are buried deeply enough (say a few thousand blocks), we can tolerate that.
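
Conceptually the pruning step would then be as simple as the following sketch (the one-file-per-block naming scheme is assumed for illustration):

Code:
// Conceptual sketch only; the file naming (blk%06u.dat / rev%06u.dat,
// one pair per block) is an assumption for illustration.
#include <cstdio>

static const unsigned int KEEP_DEPTH = 2000;  // "a few thousand blocks"

void PruneBlockFiles(unsigned int nHeight, unsigned int nTipHeight)
{
    if (nTipHeight - nHeight < KEEP_DEPTH) return;  // still reorg range: keep
    char name[32];
    std::snprintf(name, sizeof(name), "blk%06u.dat", nHeight);
    std::remove(name);  // can no longer serve or rescan this block
    std::snprintf(name, sizeof(name), "rev%06u.dat", nHeight);
    std::remove(name);  // can no longer disconnect it either
}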

So, in summary, it allows one to run a full node (now) with:
  • 130 MB coins.dat
  • 40 MB chain.dat
  • the retained blocks themselves, plus about 12% of that for undo information.

Oh, and it's also faster. I benchmarked a full import of the blockchain (187,800 blocks) on my laptop (2.2 GHz i7, 8 GiB RAM, 640 GB spinning hard disk) at 22 minutes. That was from a local disk, not over the network (which adds overhead and is limited by bandwidth). It does involve some tweaking, though.

If people want to experiment with it, see my "ultraprune" branch on GitHub: https://github.com/sipa/bitcoin/tree/ultraprune.

Note that this is experimental, and has some disadvantages:
  • you cannot look up a (full) transaction from just its txid.
  • if you have transactions that depend on unconfirmed transactions, those will not get rebroadcast.
  • only block download and reorganization are somewhat tested; use at your own risk.
  • fewer consistency checks are possible on the database, and even fewer are implemented.

Also note that this is not directly related to the recent pruning proposals that use an alt chain, merge-mined with the main chain, carrying an index of unspent coins (and addresses). This could be a step towards such a system, however.