Pages:
Author

Topic: Ultimate blockchain compression w/ trust-free lite nodes - page 21. (Read 87932 times)

vip
Activity: 1386
Merit: 1140
The Casascius 1oz 10BTC Silver Round (w/ Gold B)
If the idea works and becomes essential to the way most people use Bitcoin, all developers could easily strategize and decide that a future version of clients will only relay/accept blocks when their coinbase contains a valid merged mining record for the other chain's most recent valid block.  It might then be properly called a "meta chain" rather than an "alt chain".
full member
Activity: 225
Merit: 101
If the altchain has some place to put a scriptPubKey or an array of them in every block, the chain could be funded with network assurance contracts as proposed by Mike Hearn for the main chain when the subsidy gets lower than fees.
legendary
Activity: 1428
Merit: 1093
Core Armory Developer
isn't it possible to merge mine the balance chain just like many pools do with namecoin?
that way there is not need to give any reward to miners. just include it in bitcoind as a "protocol requirement" for mining.

Yup.  The second chain doesn't need incentive due merged mining, unless it's particularly resource-consuming.  If the software is already written, and integrates into bitcoind or other mining software transparently, and doesn't impact mining performance, then miners would [likely] have no complaints about adopting it given the huge upside it offers.  It's basically free, from their perspective.

Though I don't agree it would be a "protocol requirement."  Miners have various reasons for doing what they do.  And this is being promoted as non-disruptive and optional.  But you don't need more than, probably 20% of mining power to engage this idea for it to be successful.  I'm not even sure what a 51% attack would look like on the alt-chain, but it wouldn't be very exciting -- the worst you could do is prevent some lite-nodes being able to verify their own balance -- but there would be no financial gain for it, and it would cost a fortune.  20% seems like a number high enough that there would always be a "checkpoint" nearby, and high enough that it would be vastly too expensive for a prankster to do anything to that chain.

legendary
Activity: 1428
Merit: 1000
isn't it possible to merge mine the balance chain just like many pools do with namecoin?
that way there is not need to give any reward to miners. just include it in bitcoind as a "protocol requirement" for mining.
legendary
Activity: 1792
Merit: 1111
I have a proposal as an "add-on" to this

When a miner sees a tx, he could either 1. grab the tx fee, or 2. "donate" the tx fee to the alt-chain.

Grabbing the tx free is essentially the current practice

If it is "donated", it will go to a jackpot pool for the alt-chain. A miner who find a valid block in the alt-chain will claim the jackpot.

By donating the tx fee, miner will get a "discount" in their mining difficulty. The discount will be calculated in a pay-per-share manner. For example, the current difficulty is 1583 177.847444 with block reward of 50BTC. A fair PPS would be 0.00003158. Therefore, a tx fee of 0.0005 is equivalent to 15.831778 shares. By donating the tx fee, the miner will only need to find a block with difficulty of 1583177.847444-15.831778 = 1583162.015666.

Miner will want to donate tx fee since this helps them to find a block easier and reduce  variation. It also provide a feedback mechanism for mining. In a bad luck turn where many tx accumulate, difficulty is reduced and the bad luck turn could be ended earlier.
The effective difficulty is no longer a constant throughout 2016 blocks. Instead, it will be a zig-zag function: highest at the beginning of a round, keep decreasing as unconfirmed tx accumulate, and jump back to the highest value when a block is found. The average block generation rate is still 10-minutes but the variation is reduced too.

Some miners may abuse this system by creating tx with high tx fee (e.g. 25BTC) and donating it to the alt-chain, so their difficulty is reduced by 50%. To prevent this, we may put a cap on the difficulty discount (e.g. 10% of difficulty) and/or calculate the discount with <100% PPS (So the expected return will be decreased).
ffe
sr. member
Activity: 308
Merit: 250
legendary
Activity: 2184
Merit: 1056
Affordable Physical Bitcoins - Denarium.com
Why do miners currently mine transactions that don't have a fee? The answer to both of these questions are that doing such things aren't terribly costly, but more importantly, it encourages further adoption of Bitcoin which will directly result in increased revenue for miners in the future.
This is changing faster than we've anticipated though. Due to the sheer volume of transactions that SatoshiDice produces, higher fees will start to get priority. The bad thing is that current Bitcoin clients don't have a very sophisticated fee system or fee options.
sr. member
Activity: 269
Merit: 250
I think that if you want to have the root value to be the same regardless of order in which a tree or any hierarchical structure was created, then you would need a hash function that has commutative and associative properties.
legendary
Activity: 1204
Merit: 1015
Why would people mine this other chain?
Why do miners currently mine transactions that don't have a fee? The answer to both of these questions are that doing such things aren't terribly costly, but more importantly, it encourages further adoption of Bitcoin which will directly result in increased revenue for miners in the future.
legendary
Activity: 905
Merit: 1012
Conceivably you could run an even lighter client that just implicitly trusts the head of the alt-chain. Such a client would rely upon honest miners doing the work of verifying the alt-chain, and not actually perform such checks itself.
legendary
Activity: 1246
Merit: 1016
Strength in numbers
Why would people mine this other chain?

If it has lower hashing power is everyone trusting it more susceptible to double spends somehow?
full member
Activity: 126
Merit: 110
Andrew Miller
I had a very good good reason for believing that you needed to aggregate the unspent-TxOuts (UTXOs) by address/script.  If you didn't do it, bad things would happen.  Unfortunately, I don't remember what those bad things were!

One of the neat things about a Merkle search structure, rather than just an arbitrarily-ordered Merkle tree, is that you can prove that a key is not in the database. Even with a typical Merkle tree, like the current blockchain, it would require a linear effort to prove that a transaction doesn't exist - assuming all you have is an O(1) root hash, and you don't trust anyone!

Even more generally, you can do a verified 'range query' for only O(M log N) effort (where M is the number of results, N is the size of the tree). If you store each unspent-coin in a binary search tree, ordered by the address, then you can ask an untrusted server to give you a snapshot of all the spendable coins associated with that address. There's no way for them to omit any.

Let me try to describe the scenario how I prefer, since it's hard to keep track of the terms otherwise. There are two parties, the Lite-Client (Client) and the Helper (Server). The goal is for the Client not to have to trust the Server, but for the Server to store all the data. The Client only ever needs to store (like, on disk) a constant O(1) amount of state - the root hash. In order to decide what root hash to use, the Client will have to rely on the proof-of-work to recognize the most recent block.

If the Client asks the Server for the list of unspent-coins matching a target address, he receives two or more O(log N) paths through the Merkle tree. The first one is path is to the element with the largest address (lexical ordering) that is smaller than the target address. The last one is a path to the smallest element with a larger address. If there were no transactions matching the target, then these two elements will be adjacent. In any case, the Client iterates through the paths he receives, at each step checking that the paths are adjacent in the tree (and, of course, that the hashes are consistent and lead to the root).

[edit]I hope this turns into a Merkle trees megathread![/edit]
legendary
Activity: 1428
Merit: 1093
Core Armory Developer
Quote
The counter argument is that you will never find yourself in this position:  you are either downloading and processing the whole chain, in which case you have no problem replaying the incremental updates.  Or you download from some checkpoint, in which case you can still replay the insertions and deletions from that checkpoint with minimal effort.
Correct. So why does it matter? I'm not sure why it's “inelegant” either, since as you point out there is no use case for recreating the tree from only an unsorted list of outputs anyway.

I didn't point that out, because I don't think I agree with it.  It's a counter-argument that, while I can't dispute it directly at the moment, makes me extremely uncomfortable.  What future use case haven't we considered that would be stunted by this decision?  I'm only agreeing that I don't have a direct counter the argument for it (and thus cannot fully defend my position). 

But I'd hardly believe I'm the only one who would be bothered by it:  I think it's extremely inelegant that if I have every node in the tree, that I still have to download GB more data just to know how to organize it (or rather, I might as well just re-download the tree directly, at that point).
legendary
Activity: 905
Merit: 1012
Quote
The counter argument is that you will never find yourself in this position:  you are either downloading and processing the whole chain, in which case you have no problem replaying the incremental updates.  Or you download from some checkpoint, in which case you can still replay the insertions and deletions from that checkpoint with minimal effort.
Correct. So why does it matter? I'm not sure why it's “inelegant” either, since as you point out there is no use case for recreating the tree from only an unsorted list of outputs anyway.

Quote
A couple months ago when I first theorized this idea, I had a very good good reason for believing that you needed to aggregate the unspent-TxOuts (UTXOs) by address/script.  If you didn't do it, bad things would happen.  Unfortunately, I don't remember what those bad things were!  It's entirely reasonable that I mis-applied some logic in my head and decided I needed an unnecesssarily-complex data structure to achieve security.
If nothing else, it is certainly convenient to pull in all the unspent outputs for an address without having to go all over the tree, as you can under your approach. That's a very common use case, so it would make sense to optimize for it.
donator
Activity: 853
Merit: 1000
listening... very good ideas here, similar to my balance chain concept Smiley but much more fleshed out
legendary
Activity: 1428
Merit: 1093
Core Armory Developer
Two big issues brought up so far, in outside discussion:


(1):  I am currently in respectful disagreement with Socrates about the ability to construct stateless unspent-txout-trees.   Recap:  If one were to use a red-black tree to store this info, then the exact structure of that tree will depend on the entire, exact, incremental history of additions and deletions to the tree starting from the genesis block.  To construct the tree from scratch, I must replay the entire history in the same order every time.

I am of the opinion that you should be able to start with the current list of unspent-TxOuts, however it is that you got them, and should be able to construct the correct tree without knowing anything about the history of how elements were added and deleted.  If using the vanilla red-black trees, if I start with just the current unspent TxOuts list, I have every node in the tree -- but I need the entire history of already-spent-TxOuts just to be able to replay that history to construct the tree correctly.  This seems inelegant and dangerous.  But I can't figure out why, other than gut instinct.

The counter argument is that you will never find yourself in this position:  you are either downloading and processing the whole chain, in which case you have no problem replaying the incremental updates.  Or you download from some checkpoint, in which case you can still replay the insertions and deletions from that checkpoint with minimal effort.


(2):  I misunderstood DiThi's original proposal, as a much simpler tree structure that could not provide the trustless lite-node behavior that my trees do.  However, as I re-read it, I'm realizing that I think he's right -- a simpler tree of unspent TxOuts does actually give you that capability.  Is this right?

A couple months ago when I first theorized this idea, I had a very good good reason for believing that you needed to aggregate the unspent-TxOuts (UTXOs) by address/script.  If you didn't do it, bad things would happen.  Unfortunately, I don't remember what those bad things were!  It's entirely reasonable that I mis-applied some logic in my head and decided I needed an unnecesssarily-complex data structure to achieve security.  

So, am I wrong?  I'm not sure.  What are the weaknesses and advantages of DiThi's tree structure vs. mine (his leaf nodes are UTXOs, mine are roots of UTXO sub trees).  One thing I can think of is:  how do you know that a peer gave you the complete list of UTXOs and is not hiding the rest?  Though, I've already made an assumption that you have at least one honest peer, so that probably not an issue... is it?

legendary
Activity: 905
Merit: 1012
As others have mentioned, a self-balancing binary search tree would solve the only real technical issue here. Red-black trees would work fine, or their generalized parent structure the 2-3-4 tree (B+tree of order 4), which would provide a conceptually cleaner implementation and serialization format.

Overall, great work. I can assist you in implementing it.

There is one problem, though it may be part of the materials you already referenced:  the tree must be constructed identically by all parties, and from any state.  And all binary-tree structures are insert-order dependent, unless you're storing them in some sort of trie.  Specifying an insert order doesn't work, because someone constructing from scratch doesn't know how someone updating from a previous tree will insert them.  But I wouldn't want to go to a basic trie, due to the space inefficiency.  Something like a Patricia tree/trie would probably work, but it's very difficult to implement (correctly) and there's not a lot of existing, trusted implementations out there.
(EDIT: Sorry, this is basically what socrates above. I should have read the whole thread first:)

This is a non-issue; simply specify the order of insertion/deletion. For example: “Process blocks in order; for each block process transactions in order; and for each transaction first delete all inputs (in order) from, then insert all outputs (in order) into the alt-chain Merkle tree”. You might have to throw a special case in there for transactions that have as input the output of a transaction that occurs later in the same block (is that even allowed?).

Why (and how) would you be creating a tree from scratch without either access to the blockchain or the tree in the last alt-chain checkpoint?

Quote from: etotheipi
Consider 10 unspent-TxOuts, {0,1,2,3,4,5,6,7,8,9}.

You and I are both constructing the same binary/red-back tree, except that you are starting from scratch and I am starting from a tree where elements {0,1,2,5,6,9,13} are part of the tree.  

I have to add {3,4,7,8} and remove {13} from my tree.   You just add all 10 elements in a specified order.  

We're going to get different roots.  I technically know how your tree would've been constructed, but I might have to start over and reinsert everything from scratch if I wanted to make sure my tree structure matches yours.  

That's what I mean about "commutative computation" -- we need to make sure that regardless of whether you're starting from scratch, or updating an existing tree from any point in the past, that you'll get the same answer.

No, that's going in the wrong direction. Updates to the blockchain occur in atomic operations: blocks. Simply mandate that trees are constructed/updated according to the canonical ordering provided by the blockchain. If you insist on creating the search tree from scratch, simply replay the blockchain inserting and removing in the order specified therein. Or you can start from the tree of the last found alt-chain checkpoint, and replay insertions & deletions from that point forward.

Yes, order of operations matters, so standardize an order of operations.
legendary
Activity: 1428
Merit: 1093
Core Armory Developer
Question: as users download this alt-chain, do they always download the alt-chain from its genesis block (imagine namecoin), or do they always only download a certain number of blocks back, after which point, all earlier blocks are discarded?  (imagine p2pool).

I would have to imagine it was the second.

If it was the first, the value of the alt-chain would tend to zero over time: it would only be useful for pruning spent transactions that existed at the time the alt-chain was created, as all the blocks representing diffs to the alt chain would be the same as (in proportion) to the diffs on the main chain.  That is, at least the way I understood it.

If it was the second, I would imagine that it would be more sensible to create a superblock (say, every 1000 blocks) that publishes a brand new tree, and then all non-superblocks that follow (up to the next superblock) would be updates to that tree.  Anyone downloading the alt-chain would never need more than 1000 blocks: one superblock and all the incremental blocks since the superblock was created.  Anything older than the superblock would be simply unnecessary.

The second blockchain actually would have no substance to it.  It would solely consist of headers which would contain the master-merkle roots.  The merkle-roots are created from the main chain data, so you go get the data from that network.   There would be no block reward -- your reward is on the main chain which you mining at the same time.

So yes, everyone downloads the entire alt-chain.  But the entirety of this extra chain is the headers themselves:  so it's only about 4-5 MB per year.

I am assuming it is necessary to have a verifiable source for pruned-headers -- a separate blockchain succeeds because correctness of data is required to be accepted.

I believe you could provide this by publishing both your "genesis block" as well as the source code for the one-off utility you make to produce it.  One holding a complete copy of the Bitcoin block chain should be able to run your program and create a bit-for-bit copy of your genesis block.  Get a few people to do this, and to GPG-sign the hash of the resulting output.  Voila.

See (1e) in the original post:  the point of this exercise is to not have to trust specific nodes, GPG/PGP signatures, add centralization, etc.  You trust the proof-of-work.  The merkle-root with the most work behind it on the alt-chain is the merkle-root that you trust to be correct, just like you do with the main-network headers.

We don't want users to have to setup a list of trusted authorities. Then you have revocation lists.  And keep the list updated.  And maintain GPG keys of such authorities.  And politics about who should be trusted authorities.  And of course, centralization.

Or you setup a alternate blockchain, and trust the data that has the most work behind it.  Voila.
full member
Activity: 156
Merit: 100
Firstbits: 1dithi
This idea has been scattered throughout some other threads, but there is no one place that fully explains the idea with pictures.

I did. Well, not exactly pictures, but ASCII drawings:

https://en.bitcoin.it/wiki/User:DiThi/MTUT

https://bitcointalksearch.org/topic/m.709737

I wanted to do a prototype but I have been very busy with other projects since then.
vip
Activity: 1386
Merit: 1140
The Casascius 1oz 10BTC Silver Round (w/ Gold B)
I am assuming it is necessary to have a verifiable source for pruned-headers -- a separate blockchain succeeds because correctness of data is required to be accepted.

I believe you could provide this by publishing both your "genesis block" as well as the source code for the one-off utility you make to produce it.  One holding a complete copy of the Bitcoin block chain should be able to run your program and create a bit-for-bit copy of your genesis block.  Get a few people to do this, and to GPG-sign the hash of the resulting output.  Voila.
Pages:
Jump to: