Topic: Ultraprune merged in mainline - page 4. (Read 25405 times)

vip
Activity: 1386
Merit: 1140
The Casascius 1oz 10BTC Silver Round (w/ Gold B)
October 21, 2012, 10:47:24 AM
#14
Here's how it ought to work in my mind:



The user ought to have a simple way to decide what he wants to contribute to the network, with the default being something that ensures that the user remains a "full citizen node" but perhaps without automatically seeding large amounts of history without the user's consent. I imagine having four or five settings, but a real implementation will probably expand on the idea. (I realize that this is a thread about "ultraprune" and my examples mention "metatree", but please see past that - I am only presenting a 30,000-foot-level view of how I imagine this working.)

What the settings might be:

MINIMAL:
 * Recommended for low-bandwidth or high-cost network connections.
 * No incoming connections from peers allowed.
 * Downloaded data set consists only of the minimum necessary to determine the latest block.
 * Information about balances queried from peers on an as-needed basis
 * Lowest possible security.  Add trusted peers to the preferred peer list whenever possible.

LOW:
 * No incoming connections from peers allowed.
 * A pruned dataset is downloaded and maintained.

MEDIUM: (this would be the default setting)
 * Incoming connections from peers allowed
 * A pruned dataset is downloaded and maintained.
 * Peers may download the dataset up to the configured upload limit

MEDIUM-HIGH: see image...

HIGH:
 * Incoming connections from peers allowed
 * Accepts metatree queries from peers, and seeds historical
    versions of metatree to assist in recovery/rollback if needed
 * Full transaction history is maintained (requires XX GB,
    which increases over time)
 * Allows peers to download the data set up to the
    configured bandwidth limit.
 * Full network citizen/historian which assists in allowing other nodes
    to recover the entire network history in case recovery is needed
 * Recommended setting for mining nodes wherever feasible

Ideally, if all of these modes were implemented, a new installation could start running in the "MINIMAL" mode regardless of user choice so it is instantly usable without a day of downloading, and then slowly upgrade itself to the level of the user's choice as objects are downloaded and verified.
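
To make the idea a bit more concrete, here is a rough sketch of how the modes above could be written down as data. Everything in it (the class names, the field names, the upgrade order) is made up for illustration and is not taken from any client. MEDIUM-HIGH is left out because it is only described in the image.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Dataset(Enum):
    MINIMUM = "only what is needed to determine the latest block"
    PRUNED = "pruned dataset"
    FULL_HISTORY = "full transaction history"


@dataclass(frozen=True)
class ContributionMode:
    name: str
    accept_incoming: bool  # are incoming peer connections allowed?
    dataset: Dataset       # what is downloaded and maintained
    serve_to_peers: bool   # may peers download the dataset (up to the limit)?


# Hypothetical encoding of the settings listed in the post.
MODES = {
    "MINIMAL": ContributionMode("MINIMAL", False, Dataset.MINIMUM, False),
    "LOW": ContributionMode("LOW", False, Dataset.PRUNED, False),
    "MEDIUM": ContributionMode("MEDIUM", True, Dataset.PRUNED, True),
    "HIGH": ContributionMode("HIGH", True, Dataset.FULL_HISTORY, True),
}

DEFAULT_MODE = "MEDIUM"
UPGRADE_ORDER = ["MINIMAL", "LOW", "MEDIUM", "HIGH"]


def upgrade_path(target: str) -> List[str]:
    """Modes a fresh install would pass through, always starting at MINIMAL."""
    return UPGRADE_ORDER[: UPGRADE_ORDER.index(target) + 1]


print(upgrade_path(DEFAULT_MODE))  # ['MINIMAL', 'LOW', 'MEDIUM']
```

The upgrade path is just a slice of an ordered list here; a real client would move to the next mode only once the data for it has been downloaded and verified.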
legendary
Activity: 1072
Merit: 1189
October 21, 2012, 10:01:46 AM
#13
An answer to MysteryMiner, who asked in another thread:

One more question - will the new database discard spent addresses? Some places say it will, some say it will not. I am confused. What will happen to clients that rely on downloading the complete transaction history and verifying all blocks and transactions in them along the way, like 0.3.xx does?

The current code does not prune anything - it uses a pruned copy (in addition to the blockchain itself) for validation. Since this copy is much smaller, far less data needs to be accessed during block and transaction validation (it's around 120 MB right now). This makes it faster to validate and to update the database.

Also, Bitcoin at the protocol level does not know anything about addresses or balances - those are client-side things provided by the wallet abstractions. What we're talking about is removing individual transaction outputs that have been spent.
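
As a rough illustration of that distinction, a "balance" is just something a wallet computes by summing the unspent outputs it can spend. The sketch below is a toy model: the dictionary layout, the script placeholders and the function name are assumptions for the example, not the client's actual data structures.

```python
from typing import Dict, Set, Tuple

# (txid, output index) -> (locking script, value in satoshis)
Outpoint = Tuple[str, int]
UtxoSet = Dict[Outpoint, Tuple[bytes, int]]


def wallet_balance(utxos: UtxoSet, wallet_scripts: Set[bytes]) -> int:
    """Sum the unspent outputs whose script belongs to the wallet."""
    return sum(value for script, value in utxos.values() if script in wallet_scripts)


# Toy data: the validation side only ever sees outputs, never "addresses".
utxos = {
    ("aa" * 32, 0): (b"script-alice", 50_000),
    ("bb" * 32, 1): (b"script-bob", 20_000),
    ("cc" * 32, 0): (b"script-alice", 5_000),
}
print(wallet_balance(utxos, {b"script-alice"}))  # 55000
```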

At some later point in time we may add actual pruning, by removing blocks (but not their unspent outputs in the pruned copy) that are old enough. This will imply they cannot be served to other nodes, cannot be rescanned, and cannot be reorganised away. Clearly not everyone in the network can do this, as that would mean new nodes cannot bootstrap anymore. This is why I believe in a move towards validation nodes and archive nodes.

Also, Bitcoin is a zero-trust system (at least full nodes are). This means that no data ever received from the network is ever taken for granted, and needs validation. This implies you can't ever bootstrap a (zero trust) node without having it validate the entire block chain (although it is not necessary that everyone keeps that data around forever).
hero member
Activity: 756
Merit: 501
There is more to Bitcoin than bitcoins.
October 21, 2012, 09:53:30 AM
#12
Will there be win32 compiles of this any time soon?
Yes, please.
legendary
Activity: 1500
Merit: 1022
I advocate the Zeitgeist Movement & Venus Project.
October 21, 2012, 06:19:16 AM
#11
Will there be win32 compiles of this any time soon?
legendary
Activity: 924
Merit: 1004
Firstbits: 1pirata
October 21, 2012, 06:18:32 AM
#10
Yes.  However, to save downloading, you may provide
Code:
-loadblock=DATA_DIR/blk0001.dat -loadblock=DATA_DIR/blk0002.dat

to import the old data files into the new bitcoin database backend (ultraprune/leveldb).

* "DATA_DIR" should be replaced with the directory where your blockchain was stored in <= 0.7.1.

Excellent work.

+1 Will test
legendary
Activity: 1050
Merit: 1000
You are WRONG!
October 21, 2012, 04:53:11 AM
#9
testing it now!!
jr. member
Activity: 56
Merit: 1
October 21, 2012, 03:09:00 AM
#8
Hey jgarzik, I noticed you lost your Bitcoins under this upgrade but they do appear on the current release. Pretty serious, huh?

http://bitcoinstats.com/irc/bitcoin-dev/logs/2012/10/21
legendary
Activity: 1596
Merit: 1100
October 21, 2012, 01:47:03 AM
#7
So is BDB still used at all (e.g., for peers.dat and wallet.dat)? Or will that likely be changing with the upcoming release that includes ultraprune?

peers.dat is a flat file with a bitcoin-specific file format, unrelated to any database system.

wallet.dat remains BDB, though there are proposals on changing that.

jr. member
Activity: 56
Merit: 1
October 21, 2012, 01:11:01 AM
#6
I recommend not releasing this until it has been thoroughly tested and analysed for at least 6 months.
legendary
Activity: 2506
Merit: 1010
October 20, 2012, 11:58:12 PM
#5
  • Instead of a Berkeley DB blkindex.dat, we have a LevelDB directory blktree/. This only contains a block index, no transaction index.

So is BDB still used at all (e.g., for peers.dat and wallet.dat)? Or will that likely be changing with the upcoming release that includes ultraprune?
legendary
Activity: 3878
Merit: 1193
October 20, 2012, 10:41:48 PM
#4
Yes.  However, to save downloading, you may provide
Code:
-loadblock=DATA_DIR/blk0001.dat -loadblock=DATA_DIR/blk0002.dat

to import the old data files into the new bitcoin database backend (ultraprune/leveldb).

* "DATA_DIR" should be replaced with the directory where your blockchain was stored in <= 0.7.1.

Excellent work.
legendary
Activity: 1596
Merit: 1100
October 20, 2012, 09:35:13 PM
#3
I've just merged my "ultraprune" branch into mainline (including Mike's LevelDB work).

Does this require downloading and re-processing the blockchain from the beginning?

Yes.  However, to save downloading, you may provide
Code:
-loadblock=DATA_DIR/blk0001.dat -loadblock=DATA_DIR/blk0002.dat

to import the old data files into the new bitcoin database backend (ultraprune/leveldb).

* "DATA_DIR" should be replaced with the directory where your blockchain was stored in <= 0.7.1.
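
If you have more than a couple of old block files, the argument list can be generated rather than typed. A small sketch in Python, assuming the old files sit directly in the data directory and follow the blk000?.dat naming; the helper name is made up for this example.

```python
from pathlib import Path
from typing import List


def loadblock_args(data_dir: str) -> List[str]:
    """Return one -loadblock=<path> argument per old-style block file, in order."""
    # Four digits only, so files in the new blocks/blk000??.dat layout are not matched.
    files = sorted(Path(data_dir).glob("blk[0-9][0-9][0-9][0-9].dat"))
    return [f"-loadblock={path}" for path in files]
```

The returned list can be appended to the client's command line. Sorting keeps blk0001.dat ahead of blk0002.dat, which matches the order used in the example above.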

legendary
Activity: 3878
Merit: 1193
October 20, 2012, 09:25:44 PM
#2
I've just merged my "ultraprune" branch into mainline (including Mike's LevelDB work).

Does this require downloading and re-processing the blockchain from the beginning?
legendary
Activity: 1072
Merit: 1189
October 20, 2012, 05:37:51 PM
#1
(copy of mailinglist post)

I've just merged my "ultraprune" branch into mainline (including Mike's LevelDB work). This is a very significant change, and all testing is certainly welcome. As a result of this, many pull requests probably don't apply cleanly anymore. If you need help rebasing them on the new structure, ask me.

The idea behind ultraprune is to use an ultra-pruned copy (only unspent transaction outputs in a custom compact format) of the block chain for validation (as opposed to a transaction index into the block chain). It still keeps all blocks around for serving them to other nodes, for rescanning, and for reorganisations. As such, it is still a full node. So, despite the name, it does not implement any actual pruning yet, though pruning would be trivial to implement now. This would have profound effects on the network though, so may still need some discussion first.
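
For anyone who wants the gist in code, here is a toy sketch of validating against an unspent-output set while keeping undo data for reorganisations. It is not the actual implementation: the types and names are made up, there is no script or signature checking, and the compact on-disk format is ignored entirely.

```python
from typing import Dict, List, Tuple

Outpoint = Tuple[str, int]  # (txid, output index)
Coin = Tuple[bytes, int]    # (locking script, value)
Tx = Tuple[str, List[Outpoint], List[Coin]]  # (txid, inputs spent, outputs created)
TxUndo = List[Tuple[Outpoint, Coin]]         # coins one transaction spent


class UtxoSet:
    def __init__(self) -> None:
        self.coins: Dict[Outpoint, Coin] = {}

    def connect_block(self, txs: List[Tx]) -> List[TxUndo]:
        """Apply a block and return its undo data (spent coins, per transaction)."""
        coins = dict(self.coins)  # work on a copy so a failed block changes nothing
        undo: List[TxUndo] = []
        for txid, inputs, outputs in txs:
            spent: TxUndo = []
            for outpoint in inputs:
                if outpoint not in coins:
                    raise ValueError(f"missing or already spent: {outpoint}")
                spent.append((outpoint, coins.pop(outpoint)))
            for index, coin in enumerate(outputs):
                coins[(txid, index)] = coin
            undo.append(spent)
        self.coins = coins
        return undo

    def disconnect_block(self, txs: List[Tx], undo: List[TxUndo]) -> None:
        """Reverse connect_block, walking the transactions backwards."""
        for (txid, _inputs, outputs), spent in zip(reversed(txs), reversed(undo)):
            for index in range(len(outputs)):
                self.coins.pop((txid, index), None)
            for outpoint, coin in spent:
                self.coins[outpoint] = coin
```

Spent outputs are dropped from the set as soon as a block is connected, which is why the set stays small. The undo data is what lets a block be rolled back without consulting a transaction index.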

A small summary of the changes:
  • Instead of blk000?.dat, we have blocks/blk000??.dat files of max 128 MiB, pre-allocated per 16 MiB
  • Instead of a Berkeley DB blkindex.dat, we have a LevelDB directory blktree/. This only contains a block index, no transaction index.
  • A new LevelDB directory coins/, which contains data about the current unspent transaction output set.
  • New files blocks/rev000??.dat contain undo data for blocks (necessary for reorganisation).
  • More information is kept about blocks and block files, to facilitate pruning in the future, and to prepare for a headers-first mode.
  • Two new RPC calls are added: gettxout and gettxoutsetinfo.
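
To put numbers on the block file sizes in the first item, here is a small arithmetic sketch. The two constants come from the summary above; the function names, and the assumption that space is always reserved in whole 16 MiB chunks as the file grows, are mine and may not match the code exactly.

```python
MIB = 1024 * 1024
MAX_BLOCKFILE_SIZE = 128 * MIB  # from the summary: files of max 128 MiB
PREALLOC_CHUNK = 16 * MIB       # from the summary: pre-allocated per 16 MiB


def allocated_size(bytes_used: int) -> int:
    """Disk space reserved for a block file holding bytes_used bytes (assumed model)."""
    if not 0 <= bytes_used <= MAX_BLOCKFILE_SIZE:
        raise ValueError("a single block file holds at most 128 MiB")
    chunks = -(-bytes_used // PREALLOC_CHUNK)  # ceiling division
    return chunks * PREALLOC_CHUNK


def fits_in_file(bytes_used: int, block_size: int) -> bool:
    """Would another block still fit, or is a new blk file needed?"""
    return bytes_used + block_size <= MAX_BLOCKFILE_SIZE


print(allocated_size(20 * MIB) // MIB)  # 32
```

So a file holding 20 MiB of blocks would occupy 32 MiB on disk under this model, and at most 8 chunks are ever reserved per file.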

The most noticeable change should be performance: LevelDB deals much better with slow I/O than BDB does, and the working set size for validation is an order of magnitude smaller. In the longer run, I think it is an evolution towards separation between validation nodes and archive nodes, which is needed in my opinion.