Pages:
Author

Topic: [Fundraising] Finish “Ultimate blockchain compression” - page 3. (Read 25512 times)

full member
Activity: 202
Merit: 100
maaku, are you aware that RSS feed isn't working on your website.
I can't conveniently keep abreast with your progress.
donator
Activity: 293
Merit: 250
Hi Maaku,
Do you have also a Freicoin donation address for this project?

I would love to donate some FRCs for this Project Wink

I wish you the best for your project,
Arcurus

UPDATE:

Oh I just saw at your blog, that also Freicoins are accepted at this address, so I can just donate Freicoins to the same address?

"13snZ4ZyCzaL7358SmgvHGC9AxskqumNxP (Bitcoin or Freicoin accepted)."
legendary
Activity: 1792
Merit: 1008
/dev/null
great to see you dont abandon it, keep it up Smiley
legendary
Activity: 905
Merit: 1011
I've posted a status update to the implementation blog, including a brief overview of the current development pathway:

http://utxo.tumblr.com/post/52638982528/pathway-to-freedom
legendary
Activity: 1806
Merit: 1090
Learning the troll avoidance button :)
This is quite high level stuff and I can see how significant these changes would be to the overall usability of bitcoin downloading that client is a significant barrier to its usage as the blockchain grows.
In addition the amount of gruntwork to make such a system work
Since this is a technical discussion thread I will refrain from further posting but I wanted to express my appreciation to developers in building up bitcoin and ensuring it's ongoing growth and commend your transparency
Thank you and I look forward to seeing the new implementation
legendary
Activity: 1078
Merit: 1002
100 satoshis -> ISO code
Maaku, instead of whining and complaining about the size of the blockchain, you have stepped up like a man to solve the situation.

You've just been sent a grant of 150 Bitcoins by myself, on behalf of SatoshiDICE, for the ongoing growth and development of this amazing technology we call Bitcoin. (Note to SD shareholders, this is a personal contribution, not coming out of SD earnings)

Further, if after three months, Alan (whom I respect very highly) recommends further development, I will commit to funding another three months (we'll revisit the btc amount based on the exchange rate then).

Please contact me on skype (evoorhees) and we can discuss further. And I agree with Alan that this needs a better name Smiley

Just saw this post, and want to say that this is an awesome expression of support for the future success of Bitcoin!
legendary
Activity: 905
Merit: 1011
I have started a blog where I will post periodic status updates and technical explanations as I begin work on this proposal. Please check it out:

http://utxo.tumblr.com/
legendary
Activity: 1400
Merit: 1009
legendary
Activity: 905
Merit: 1011
Impaler can get excited sometimes. I am preparing a blog post explaining in detail what I propose to accomplish. Here is a listing of the major points of order:

* The bitcoin Satoshi client (bitcoind and Bitcoin-Qt) will be updated to support two separate Merkle-tree indices: txid:n -> TxOut, and script -> TxOut.
* JSON-RPC commands will be added to query these indices, retrieving (1) the root hashes of the indices for a block; (2) proof of (non-)existence of an unspent transaction output; or (3) the unspent outputs associated with a script, with proof.
* New P2P messages will be created for making queries (2) and (3) against nodes exposing suitable service bits.
* Implement a merged-mined “meta-chain” which contains instead of transactions, checkpoints of the root hashes of the above indices.
* Modify the Satoshi client to start in security-level 2 (see above), and work its way to security-level 0 as a fully-validating node, allowing for quick startup times irregardless of blockchain size.

There are other related things which may be valuable to accomplish, such as pruned operation, and many, many small details left out. However any work on alternative currencies such as Freicoin is outside of scope, and will not be paid for by this stipend.
legendary
Activity: 1400
Merit: 1009
Occasionally though other coins will create something that can be ported back to BTC.  Over at FRC we have set our the goal for our next hard-fork to adding a block-chain compression feature originally proposed by etotheipi that will solve the notorious huge-download-barrier-for-new-user problem.  Our lead programmer has secured 3 months funding generously provided by evoorhees on behalf of SatoshiDICE on the promise the code will be back compatible with BTC.  Our motivation at FRC is that while our block-chain is small now we expect to be around a long time and don't want to have that download problem in the future so it's in our interest to solve this problem.
maaku, would you mind clarifing the scope of his project?
hero member
Activity: 924
Merit: 1000
Quote
Status: 0/unconfirmed, broadcast through 7 nodes
Date: 24/05/2013 17:59
To: Makku 13snZ4ZyCzaL7358SmgvHGC9AxskqumNxP
Debit: -0.099 BTC
Transaction fee: -0.0005 BTC
Net amount: -0.0995 BTC
Transaction ID: eae93c9d529c486cb679bf0e9f0a72f43bb4db92580ba21cc6a2d7a66847941e

COMPRESS IT!
legendary
Activity: 1400
Merit: 1009
Due to constructive feedback received at the conference, I have created a ReDonate campaign as an optional method for adding to the "Ultimate blockchain compression w/ trust-free lite nodes" campaign. ReDonate is a service for contributing a one-time donation, while receiving periodic reminders in the future to re-donate if you are happy with the progress being made.

http://redonate.net/dashboard/bitcoin-developer-maaku
That's the link for your personal dashboard. The rest of us need to use this link:

http://redonate.net/campaign/bitcoin-developer-maaku
legendary
Activity: 905
Merit: 1011
Due to constructive feedback received at the conference, I have created a ReDonate campaign as an optional method for adding to the "Ultimate blockchain compression w/ trust-free lite nodes" campaign. ReDonate is a service for contributing a one-time donation, while receiving periodic reminders in the future to re-donate if you are happy with the progress being made.

http://redonate.net/dashboard/bitcoin-developer-maaku

As noted in the OP, the 3-month funding goal has been achieved, but it is expected that it will take more than 3 months to implement this feature. Any further contributions now will go toward extending the amount of time I have to work on it.
legendary
Activity: 905
Merit: 1011
@Sukrim, that's essentially my thinking with regards to bittorrent. It's just about moving synchronization of new clients or clients which have been offline for a while off of the p2p bitcoin network, and making it easier for people to offer high-bandwidth “seed nodes” just for this purpose. If done correctly there should be no security implications as blockchain data is self-validating.

@jtimon, you are essentially correct although what you describe is only part of the story. I think “trust level” is an appropriate term. Here's how I would lay them out:

Level 4: Electrum-like client/server. Keys are stored on the client, the but the client trusts the server for information about the unspent-TxOut set. This is only marginally better than sticking your coins on a server-side wallet. I would never make a general recommendation to operate at this level unless you own both the server and the client and have a secure, authenticated connection between the two.

Level 3: Simplified payment verification. Client trusts the “longest” (most work) chain, but reads and processes every block. The client must scan the contents of every block more recent than the oldest address in its wallet in order to be sure that its record of unspent wallet outputs is correct. BitcoinJ has some optimizations not mentioned, but only because they make simplifying assumptions about the circumstances under which wallet transactions might be generated. It remains true that you must touch every transaction of every block that might have a wallet input or output within it.

Both of the above levels will be completely obsoleted by this proposal.

Level 2: The client downloads the “longest” (most work) meta-chain, and retrieves the block of data associated with the current head of the meta-chain. This data includes the Merkle hash of the unspent-TxOut trie and it's associated block height. The client then queries any node exposing the correct service bits about its wallet addresses, retrieving either the associated transaction outputs with proof-of-inclusion, or a negative proof showing that no such output exists in the trie. I call this enhanced simple payment verification, or SPV+, and operates trust-free at an economic security level equal to the hash power of the merged-mined meta-chain.

Level 1: The meta-chain data block probably will include other information, such as a deterministic bittorrent infohash of the serialized unspent-TxOut trie and blockchain checkpoint data. The client downloads the unspent-TxOut trie torrent and verifies that its root hash matches what was in the meta-chain. It then reconstructs the information necessary to do full-block validation from that point forward. The initial synchronization is at the meta-chain level of economic security, but after that it would take a 51% attack on bitcoin itself to subvert a client at this level.

Level 0: The client either verifies the entire chain forwards from the genesis block, or backwards up to the most recent checkpoint as @jtimon described. It is now a fully-validated client operating exactly as the Satoshi client does today, and with the same level of security.

There is also a “0.5” mode that might be good enough for most people: only verify backwards long enough to satisfy your own security requirements (a configurable number of blocks, perhaps as a command line parameter).

I've cross-posted most of this to the original proposal thread, since technical discussion should probably occur there.
full member
Activity: 238
Merit: 100
The Bitcoin Catalog ---> Get Started!
This will be an incredible step forward for bitcoin! I will also contribute and try to raise some funds for this project.
legendary
Activity: 2618
Merit: 1006
Nodes would have more room for specialization. I think there will be always "librarian nodes" ala block explorer. distributing the whole history. But I think their lack is what worries @Sukrim.

True, also there is no incentive to actually operate such a node and if they get rarer, they actually have to serve even more data which puts a stronger pressure to shut them down.

In the OP however there is also some BitTorrent distribution mechanism mentioned... even though torrent files (actually info hashes) for a given file are not 100% deterministic, they can be made deterministic, if there is an agreement on block size. My suggestion would be to include the info hash of a bootstrap.dat file that contains all blocks from (#currentblock-120-#currentblock mod 2016) - meaning that120 blocks after every difficulty change a new info hash gets posted to the UTXO chain. That way the torrent file stays static enough to be properly seeded but changes quickly enough for new clients to catch up.
As BitTorrent downloads are properly hashed, checked etc. and the blockchain can also be verified again upon importing, this should be safe enough, if one also trusts this UTXO chain.
That way full archival nodes would not be queried for older blocks that much, as the bootstrap procedure for a "Ultimate compression" client would be:

* Get current block height
* Get block height of the highest UTXO block X blocks below current height (I would stay with 120 as this is the magic number for spending coinbase transactions, not over 1000... but this canbe even user configurable)
* Download this UTXO block and all UTXO blocks after it, as well as all mainline blocks since then
* Verify that following UTXO blocks were correctly created and also that mainline blocks are OK.
* Get all mainline block headers since genesis (probably a big art of this data can be already come with the installer) and verify that the recent mainline blocks are part of this chain.
### Here you can already start mining your own blocks ###
* Get the full block torrent info hash from the header (?) of the UTXO block(s)
* Download the torrent via a magnet link over DHT (all you need is the info hash after all)
* Verify all full blocks since genesis
### Now you are at the same stage as current bitcoin-qt ###

Now you can either continue to seed the torrent (and maybe even do this on a dedicated machine!) or be happy that the UTXO chain was actually not lying to you, discard most older block contents (maybe here it might make sense to keep the most recent 1000 or so in case there are longer reorgs somewhere in the future?) and just work on UTXO/mainline chains.

With BitTorrent it would be possible to help distributing the block chain without running bitcoind, a downside is that any client that does not have the info hash of the most current check point will seed a relatively dead torrent...
legendary
Activity: 1372
Merit: 1002
I don't know all the details of the proposals, but let me explain it the way I see it and then you people tell me what I'm missing.
I propose "trust levels" as the name. For simplicity, I'm assuming the root of the UTXO tree is part of the headers, not just merged mined, but we can argue about that later.

Trust level 2: The current Simplified payment verification.

Trust level 1: We define a distance in blocks to the past which is secure for the node to assume won't suffer a reorg, say 1000 blocks ago and call it the "pseudo-checkpoint". He downloads the full UTXO tree that was in block (currentHeight-1000), the UTXO tree in the pseudo-checkpoint. He downloads 1000 blocks and reproduces the current UTXO tree from the the old one. For a given unspent output, he can provide its full merkle branch until coinbase OR the pseudo-checkpoint if the coinbase is older.

Trust level 0: The node downloads the rest of the chain, being able to verify that the UTXO tree from the pseudo-checkpoint he used was correct, and he will be able to always provide full output markle branches to their coinbases from now on.

You could download the whole chain from the top to the bottom and optimistically start assuming the last UTXO tree was legit, then validate it with the previous one and the transactions of last block, and so on to the genesis. Effectively decreasing the pseudo-checkpoint height from last block created to genesis.
Once you've achieved trust 0, nothing forces you to store the whole chain. You can then set another pseudo-checkpoint, only being afraid of reorgs and/or not being able to provide long enough output chains.
In fact, you can go to trust level 1 or even trust level 0 to operation level 2 if you want. You can download the whole chain but then only keep the branches of your own outputs.

Nodes would have more room for specialization. I think there will be always "librarian nodes" ala block explorer. distributing the whole history. But I think their lack is what worries @Sukrim.

After this I think you can only improve the storage of the UTXO (and posibly their full merkle brach) using caches.

Is this basically what we're talking about or am I lost and the name "trust levels" is awful?
legendary
Activity: 1428
Merit: 1093
Core Armory Developer
Ah, but you forget I was the proponent of a 2-3-4 BST Wink But no matter, you won me over. One of my first tasks would be finalizing the serialized form of the hybrid trie structure, while making sure that it is efficiently retrievable from an indexed key-value store and inexpensive to update.

The beauty of one the last revelations in that thread was that you can define it in terms of that hybrid tree, but implement it with a regular iteration over the natural sorting of the database.  In other words, if you iterate over the database and are careful about tracking where you were and where you are, you can perfectly simulate the hybrid PATRICIA tree without actually implementing all its complexity.  And at the same time saving all the space you would've spent storing pointers.  You simply define it in terms of the hybrid tree (skip strings, omitting NULL pointers), but the actual implementation will be much simpler.

I am confident that LevelDB will be a reasonable default choice and prove reliable even under high load, given the performance numbers I've seen and its heritage from Google's BigTable. However yes, the proper abstraction is “indexed key-value store”, for which there are many NoSQL (and SQL) solutions. I may not go to the extent of actually writing or using an intermediary database wrapper, but I will make sure that there is nothing LevelDB-specific about the solution.

I was just thinking that in the future and/or when developers decide they don't like LevelDB, we may need to switch.  For instance, I read that LevelDB had some problems above 200 million entries.  I don't know the exact parameters (if it's always 200 million, or if it's DB size, etc), but apparently the scalability and robustness may not be there for the long haul.  But it will work perfectly for a proof-of-concept.

Quote from: etotheipi
Also, I should mention that socrates/amiller had some interesting ideas that would be worth discussing.  He had an idea for lite-storage-but-full-verification nodes, which I'm still not confident it is possible, but he insists it is and we agreed to debate it in the future when I had time.  Maybe you can discuss it here before embarking on this journey, to make sure the design is extensible enough to accommodate it if it is feasible.

Yes, I do recall that discussion, and in fact amiller's description of trust-free lite-node operation is I think the primary long-term benefit. Actually, I confess I thought it was your original purpose the first time I read your proposal. Of course “full-verification” is I think a misuse of terms, as in such a case there is in fact no transaction validation but rather trust that the miners on the meta-chain are themselves properly validating, and then construction of a “balance-proof” from the unspent-TxOut trie.

To be clear, UBC (or whatever we call it) allows for two separate “lite” modes of operation: one where the entire unspent-TxOut set is downloaded from peers and new transactions/blocks are validated against it. This is the same as the pruned mode 0.8 is theoretically capable of, except that the unspent-TxOut set need not be built from scratch.

The second, lighter mode of operation is what amiller describes, where the lite client requests other nodes to provide self-validating proofs of address balances by providing paths through the unspent-TxOut Merkle tree. The client can then proceed to operate in an enhanced SPV mode while downloading first the unspent-TxOut set and then the full blockchain in the background.

I was actually talking about some discussions with socrates outside the thread, on IRC.  You said "I think it's a misuse of terms", but no, actually it's not.  Your recap is accurate concerning what I originally proposed.  But socrates took it one step further, saying that you could build on top of this to create a fully-validating node without storing the whole database.  I believe it involves downloading and verifying parts of the chain in real-time, as you are verifying new blocks.  But it can be done, because if you are confident in the state of the blockchain at block N, you can verify block N+1 efficiently given the availability of this new datastructure we are building.  The node wouldn't be very useful for seeding other nodes, but it would at least enforce the rules on new blocks, and theoretically could even solo mine without holding everything.  

To me, it didn't seem very possible that you could have a bunch of low-storage nodes doing full verification, but I also didn't have the patience at the time to flesh out the idea with him.  Perhaps given a system with a high density of full-storage nodes that can help the lite node do its verification, but it certainly wouldn't be feasible for the network to be made up of them, if no one is serving the full data set.

If he's right, that dramatically loosens the specs needed by future miners, when everyone is predicting that only super-computers/clusters will be able to verify-and-mine.  I have my doubts, but we should definitely invite him to expand on that, and even if it's not completely feasible, there's will probably still be useful ideas in there.
legendary
Activity: 905
Merit: 1011
Thank you everyone.

Let's take it from the top:

The topic/proposal should definitely be renamed, as it does not allow for any pruning beyond what is already possible (in theory) with the ultraprune branch that made it into 0.8. I kept the name in the OP only because that is what people know the proposal as, but now that the project is funded we should correct details like this. Suggestions are welcome; perhaps “unspent-TxOut queries for trust-free lite nodes and quick synchronization”? A bit clunky...

Quote from: etotheipi
I think the recent discussion where I realized that you can eliminate all the pointers, makes the overhead much more manageable, but it's still non-negligible.    I don't know your real name, but if you make this work, perhaps it would we could just name it the Reiner-Maaku tree Smiley

Ah, but you forget I was the proponent of a 2-3-4 BST Wink But no matter, you won me over. One of my first tasks would be finalizing the serialized form of the hybrid trie structure, while making sure that it is efficiently retrievable from an indexed key-value store and inexpensive to update.

I am confident that LevelDB will be a reasonable default choice and prove reliable even under high load, given the performance numbers I've seen and its heritage from Google's BigTable. However yes, the proper abstraction is “indexed key-value store”, for which there are many NoSQL (and SQL) solutions. I may not go to the extent of actually writing or using an intermediary database wrapper, but I will make sure that there is nothing LevelDB-specific about the solution.

Quote from: etotheipi
Also, I should mention that socrates/amiller had some interesting ideas that would be worth discussing.  He had an idea for lite-storage-but-full-verification nodes, which I'm still not confident it is possible, but he insists it is and we agreed to debate it in the future when I had time.  Maybe you can discuss it here before embarking on this journey, to make sure the design is extensible enough to accommodate it if it is feasible.

Yes, I do recall that discussion, and in fact amiller's description of trust-free lite-node operation is I think the primary long-term benefit. Actually, I confess I thought it was your original purpose the first time I read your proposal. Of course “full-verification” is I think a misuse of terms, as in such a case there is in fact no transaction validation but rather trust that the miners on the meta-chain are themselves properly validating, and then construction of a “balance-proof” from the unspent-TxOut trie.

To be clear, UBC (or whatever we call it) allows for two separate “lite” modes of operation: one where the entire unspent-TxOut set is downloaded from peers and new transactions/blocks are validated against it. This is the same as the pruned mode 0.8 is theoretically capable of, except that the unspent-TxOut set need not be built from scratch.

The second, lighter mode of operation is what amiller describes, where the lite client requests other nodes to provide self-validating proofs of address balances by providing paths through the unspent-TxOut Merkle tree. The client can then proceed to operate in an enhanced SPV mode while downloading first the unspent-TxOut set and then the full blockchain in the background.

I thought the controversy was over balanced BST vs trie structures, but I may be misremembering that discussion. I think I have a professional obligation now to take another look through the whole thread and evaluate all ideas contained within.

@evoorhees, I sent you a Skype invite and I'd be happy to talk with you that way or at the conference. My timezone is U.S. Pacific Daylight Time.

@jtimon, thank you for the reference and kind words. Yes, our “assets” colored-coin proposal is what drew me back to this proposal. If people think SatoshiDice brings a lot of traffic and bloats the block chain, wait until they see what a distributed exchange and transitive credit will do, especially once the block-size limit is relaxed.

Quote from: Sukrim
As far as I understand it, it might not be possible to do complete block chain analysis after implementing this change, so there might be some reservations from bitcoin-qt developers to this (as you delete parts of history).

This proposal in no way impacts block chain analysis / validation. It is strictly adding functionality to a fully-validating node. However that functionality would allow for non-validating clients to operate with significantly better security than current lite clients such as non-validating bitcoinj or electrum. Further, it becomes possible that the Satoshi client could start operation in one of these enhanced lite security modes (with a sync time of seconds or a few minutes), and then transition to full-validation once the initial block download is complete.


@CIYAM Open, thanks I do appreciate it, although I have my own project management tools configured the way I like them Wink


For everyone who has contributed, thank you. As there are still contributions coming in, I will treat it as a rolling fundraiser for continued development after the 3 months have elapsed. Between the conference and some other obligations I am winding up, my time will be split for the next couple of weeks. Therefore the 3 months will officially start June 1st.
legendary
Activity: 1890
Merit: 1072
Ian Knowles - CIYAM Lead Developer
Actually a Project had been created in CIYAM Open for this very thing a while back (was never actually "opened" though) - if this could be useful (in terms of creating Project Areas and then specific Project Tasks) then CIYAM Open would be happy to help manage this free of charge (if the current Project Owner is not interested to do that then I will volunteer my own time to help out).
Pages:
Jump to: