Ultraprune merged in mainline - page 4.

casascius

vip

Activity: 1386

Merit: 1140

The Casascius 1oz 10BTC Silver Round (w/ Gold B)

Here's how it ought to work in my mind:

The user ought to have a simple way to decide what he wants to contribute to the network, with the default being something that ensures that the user remains a "full citizen node" but perhaps without automatically seeding large amounts of history without the user's consent. I imagine having four or five settings, but a real implementation will probably expound on the idea. (I realize that this is a thread about "ultraprune" and my examples mention "metatree", but please see past that - I am only presenting a 30,000-foot-level view of how I imagine this working)

What the other settings might be:

MINIMAL:
* Recommended for low-bandwidth or high-cost network connections.
* No incoming connections from peers allowed.
* Downloaded data set consists only of the minimum necessary to determine the latest block.
* Information about balances is queried from peers on an as-needed basis.
* Lowest possible security. Add trusted peers to the preferred peer list whenever possible.

LOW:
* No incoming connections from peers allowed.
* A pruned dataset is downloaded and maintained.

MEDIUM: (this would be the default setting)
* Incoming connections from peers allowed
* A pruned dataset is downloaded and maintained.
* Peers may download the dataset up to the configured upload limit

MEDIUM-HIGH: see image...

HIGH:
* Incoming connections from peers allowed
* Accepts metatree queries from peers, and seeds historical
   versions of metatree to assist in recovery/rollback if needed
* Full transaction history is maintained (requires XX GB,
   which increases over time)
* Allows peers to download the data set up to the
   configured bandwidth limit.
* Full network citizen/historian which assists in allowing other nodes
   to recover the entire network history in case recovery is needed
* Recommended setting for mining nodes wherever feasible

Ideally, if all of these modes were implemented, a new installation could start running in the "MINIMAL" mode regardless of user choice so it is instantly usable without a day of downloading, and then slowly upgrade itself to the level of the user's choice as objects are downloaded and verified.
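To make the idea concrete, here is a rough sketch of how the settings and the gradual upgrade could be modelled. It is only an illustration of the proposal above, not anything in the client: the names `ModePolicy`, `POLICIES`, `LEVELS` and `upgrade_path` are made up for this example, and MEDIUM-HIGH is left out because its details are not spelled out in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModePolicy:
    """Hypothetical description of one contribution level."""
    accept_incoming: bool   # are incoming peer connections allowed?
    dataset: str            # "minimal", "pruned" or "full"
    serves_peers: bool      # may peers download data from this node?


# Levels in increasing order of contribution (MEDIUM-HIGH omitted, see above).
LEVELS = ["MINIMAL", "LOW", "MEDIUM", "HIGH"]

POLICIES = {
    "MINIMAL": ModePolicy(accept_incoming=False, dataset="minimal", serves_peers=False),
    "LOW": ModePolicy(accept_incoming=False, dataset="pruned", serves_peers=False),
    "MEDIUM": ModePolicy(accept_incoming=True, dataset="pruned", serves_peers=True),
    "HIGH": ModePolicy(accept_incoming=True, dataset="full", serves_peers=True),
}


def upgrade_path(target: str) -> list:
    """Return the levels a fresh install passes through, starting at MINIMAL."""
    if target not in LEVELS:
        raise ValueError(f"unknown level: {target}")
    return LEVELS[: LEVELS.index(target) + 1]


if __name__ == "__main__":
    for level in upgrade_path("MEDIUM"):
        print(level, POLICIES[level])
```

A new installation would start with the MINIMAL policy and move one step along `upgrade_path(...)` each time the data needed for the next level has been downloaded and verified.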

Pieter Wuille

legendary

Activity: 1072

Merit: 1189

An answer to MysteryMiner, who asked in another thread:

Quote from: MysteryMiner on October 21, 2012, 09:38:34 AM

One more question - will the new database discard spent addresses? Some places say it will, some say it will not. I am confused. What will happen to clients that rely on downloading the complete transaction history and verifying all blocks and transactions in them on the way, like 0.3.xx does?

The current code does not prune anything - it uses a pruned copy (in addition to the blockchain itself) for validation. Since this copy is much smaller, far less data needs to be accessed during block and transaction validation (it's around 120 MB right now). This makes it faster to validate and to update the database.

Also, Bitcoin at the protocol level does not know anything about addresses or balances - those are client-side things provided by the wallet abstractions. What we're talking about is removing individual transaction outputs that have been spent.

At some later point in time we may add actual pruning, by removing blocks (but not their unspent outputs in the pruned copy) that are old enough. This will imply they cannot be served to other nodes, cannot be rescanned, and cannot be reorganised away. Clearly not everyone in the network can do this, as that would mean new nodes cannot bootstrap anymore. This is why I believe in a move towards validation nodes and archive nodes.

Also, Bitcoin is a zero-trust system (at least full nodes are). This means that no data ever received from the network is ever taken for granted, and needs validation. This implies you can't ever bootstrap a (zero trust) node without having it validate the entire block chain (although it is not necessary that everyone keeps that data around forever).

niko

hero member

Activity: 756

Merit: 501

There is more to Bitcoin than bitcoins.

Quote from: LightRider on October 21, 2012, 06:19:16 AM

Will there be win32 compiles of this any time soon?

Yes, please.

LightRider

legendary

Activity: 1500

Merit: 1022

I advocate the Zeitgeist Movement & Venus Project.

Will there be win32 compiles of this any time soon?

paraipan

legendary

Activity: 924

Merit: 1004

Firstbits: 1pirata

Quote from: Syke on October 20, 2012, 10:41:48 PM

Quote from: jgarzik on October 20, 2012, 09:35:13 PM

Yes. However, to save downloading, you may provide

Code:

-loadblock=DATA_DIR/blk0001.dat -loadblock=DATA_DIR/blk0002.dat

to import the old data files into the new bitcoin database backend (ultraprune/leveldb).

* "DATA_DIR" should be replaced with the directory where your blockchain was stored in <= 0.7.1.

Excellent work.

+1 Will test

kokjo

legendary

Activity: 1050

Merit: 1000

You are WRONG!

testing it now!!

Atlas

jr. member

Activity: 56

Merit: 1

Hey jgarzik, I noticed you lost your Bitcoins under this upgrade but they do appear on the current release. Pretty serious, huh?

http://bitcoinstats.com/irc/bitcoin-dev/logs/2012/10/21

jgarzik

legendary

Activity: 1596

Merit: 1100

Quote from: Stephen Gornick on October 20, 2012, 11:58:12 PM

So is BDB still used at all (e.g., for peers.dat and wallet.dat)? Or will that likely be changing with the upcoming release that includes ultraprune?

peers.dat is a flat file with a bitcoin-specific file format, unrelated to any database system.

wallet.dat remains BDB, though there are proposals on changing that.

Atlas

jr. member

Activity: 56

Merit: 1

I recommend not releasing this until it has been thoroughly tested and analysed for at least 6 months.

Stephen Gornick

legendary

Activity: 2506

Merit: 1010

Quote from: Pieter Wuille on October 20, 2012, 05:37:51 PM

Instead of a Berkeley DB blkindex.dat, we have a LevelDB directory blktree/. This only contains a block index, no transaction index.

So is BDB still used at all (e.g., for peers.dat and wallet.dat)? Or will that likely be changing with the upcoming release that includes ultraprune?

Syke

legendary

Activity: 3878

Merit: 1193

Quote from: jgarzik on October 20, 2012, 09:35:13 PM

Yes. However, to save downloading, you may provide

Code:

-loadblock=DATA_DIR/blk0001.dat -loadblock=DATA_DIR/blk0002.dat

to import the old data files into the new bitcoin database backend (ultraprune/leveldb).

* "DATA_DIR" should be replaced with the directory where your blockchain was stored in <= 0.7.1.

Excellent work.

jgarzik

legendary

Activity: 1596

Merit: 1100

Quote from: Syke on October 20, 2012, 09:25:44 PM

Quote from: Pieter Wuille on October 20, 2012, 05:37:51 PM

I've just merged my "ultraprune" branch into mainline (including Mike's LevelDB work).

Does this require downloading and re-processing the blockchain from the beginning?

Yes. However, to save downloading, you may provide

Code:

-loadblock=DATA_DIR/blk0001.dat -loadblock=DATA_DIR/blk0002.dat

to import the old data files into the new bitcoin database backend (ultraprune/leveldb).

* "DATA_DIR" should be replaced with the directory where your blockchain was stored in <= 0.7.1.
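If you have many old block files, the arguments can be generated instead of typed. The following is a small sketch that assumes the old files follow the four-digit blk0001.dat naming shown above; the helper name `loadblock_args` and the example data directory are made up for illustration.

```python
from pathlib import Path


def loadblock_args(data_dir):
    """Build one -loadblock=<file> argument per old block file, in file order."""
    # Zero-padded names sort correctly as plain strings (blk0001 < blk0002 ...).
    files = sorted(Path(data_dir).glob("blk[0-9][0-9][0-9][0-9].dat"))
    return [f"-loadblock={f}" for f in files]


if __name__ == "__main__":
    # Hypothetical location of the <= 0.7.1 data directory; adjust to your setup.
    old_dir = Path.home() / ".bitcoin"
    print(" ".join(loadblock_args(old_dir)))
```

The printed arguments can then be appended to whatever command line you normally use to start the client.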

Syke

legendary

Activity: 3878

Merit: 1193

Quote from: Pieter Wuille on October 20, 2012, 05:37:51 PM

I've just merged my "ultraprune" branch into mainline (including Mike's LevelDB work).

Does this require downloading and re-processing the blockchain from the beginning?

Pieter Wuille

legendary

Activity: 1072

Merit: 1189

(copy of mailinglist post)

I've just merged my "ultraprune" branch into mainline (including Mike's LevelDB work). This is a very significant change, and all testing is certainly welcome. As a result of this, many pull requests probably don't apply cleanly anymore. If you need help rebasing them on the new structure, ask me.

The idea behind ultraprune is to use an ultra-pruned copy (only unspent transaction outputs in a custom compact format) of the block chain for validation (as opposed to a transaction index into the block chain). It still keeps all blocks around for serving them to other nodes, for rescanning, and for reorganisations. As such, it is still a full node. So, despite the name, it does not implement any actual pruning yet, though pruning would be trivial to implement now. This would have profound effects on the network though, so may still need some discussion first.
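As a toy illustration of validating against the unspent output set rather than a transaction index, consider the sketch below. It is not the actual implementation: the classes `OutPoint`, `TxOut`, `Tx` and `UtxoSet` are invented for this example, scripts and signatures are not checked, and a transaction with no inputs is simply treated as a coinbase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutPoint:
    txid: str
    index: int


@dataclass(frozen=True)
class TxOut:
    value: int    # amount in base units
    script: str   # placeholder, never evaluated in this sketch


@dataclass
class Tx:
    txid: str
    inputs: list   # list of OutPoint
    outputs: list  # list of TxOut


class UtxoSet:
    """Only unspent outputs are stored; spent ones are deleted."""

    def __init__(self):
        self.coins = {}  # OutPoint -> TxOut

    def apply(self, tx):
        # Every input must refer to a currently unspent output, exactly once.
        if len(set(tx.inputs)) != len(tx.inputs):
            raise ValueError("duplicate input")
        for outpoint in tx.inputs:
            if outpoint not in self.coins:
                raise ValueError(f"missing or already spent: {outpoint}")
        # Non-coinbase transactions may not create value.
        if tx.inputs:
            spent = sum(self.coins[i].value for i in tx.inputs)
            created = sum(o.value for o in tx.outputs)
            if created > spent:
                raise ValueError("outputs exceed inputs")
        # Spend the inputs and add the new outputs.
        for outpoint in tx.inputs:
            del self.coins[outpoint]
        for n, out in enumerate(tx.outputs):
            self.coins[OutPoint(tx.txid, n)] = out
```

Validation only ever touches `coins`, which is why the working set stays small no matter how long the chain history grows.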

A small summary of the changes:

* Instead of blk000?.dat, we have blocks/blk000??.dat files of max 128 MiB, pre-allocated per 16 MiB.
* Instead of a Berkeley DB blkindex.dat, we have a LevelDB directory blktree/. This only contains a block index, no transaction index.
* A new LevelDB directory coins/, which contains data about the current unspent transaction output set.
* New files blocks/rev000??.dat contain undo data for blocks (necessary for reorganisation).
* More information is kept about blocks and block files, to facilitate pruning in the future, and to prepare for a headers-first mode.
* Two new RPC calls are added: gettxout and gettxoutsetinfo.
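To show why undo data is needed for reorganisation, here is a minimal sketch. It assumes a simplified model where the unspent set is a plain dict and a block is a list of `(txid, inputs, outputs)` tuples; `connect_block` and `disconnect_block` are names chosen for this example and do not describe the on-disk rev file format. No validation is done here.

```python
def connect_block(coins, block):
    """Apply a block to the unspent set and return its undo record."""
    undo = []  # (outpoint, spent_output) pairs, in the order they were spent
    for txid, inputs, outputs in block:
        for outpoint in inputs:
            # Remember the output being removed so it can be restored later.
            undo.append((outpoint, coins.pop(outpoint)))
        for n, out in enumerate(outputs):
            coins[(txid, n)] = out
    return undo


def disconnect_block(coins, block, undo):
    """Reverse connect_block using the undo record (used during a reorg)."""
    remaining = list(undo)
    # Walk the transactions backwards so in-block spends unwind correctly.
    for txid, inputs, outputs in reversed(block):
        for n in range(len(outputs)):
            del coins[(txid, n)]
        for _ in inputs:
            outpoint, out = remaining.pop()
            coins[outpoint] = out


if __name__ == "__main__":
    coins = {("cb", 0): 50}
    block = [("t1", [("cb", 0)], [30, 20]), ("t2", [("t1", 0)], [30])]
    undo = connect_block(coins, block)
    disconnect_block(coins, block, undo)
    print(coins)
```

Because spent outputs are deleted from the set, the block alone is not enough to step backwards; the undo record carries exactly the outputs that were removed.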

The most noticeable change should be performance: LevelDB deals much better with slow I/O than BDB does, and the working set size for validation is an order of magnitude smaller. In the longer run, I think it is an evolution towards separation between validation nodes and archive nodes, which is needed in my opinion.
