The High-Value-Hash Highway - page 2.

TheBitcoinChemist

member

Activity: 70

Merit: 10

Quote from: socrates1024 on August 08, 2012, 12:43:39 PM

Quote from: gmaxwell on August 08, 2012, 11:18:39 AM

I stopped reading at this point. Ten years of headers is about 40 megabytes. One hundred years is 400 megabytes. A boring computer _today_ could validate 100 years of headers in a second or two. And even this could be skipped up to a fixed reference point.

This proposal is primarily about ddos-resistance. A protocol's ddos-resistance is related to the largest plausible validation effort, not the expected (or actual) effort. So if I give you a 20GB set of headers and tell you it's the largest chain... you'd apparently reject it without even looking, since larger than 400MB is implausible.

Quote from: gmaxwell on August 08, 2012, 11:18:39 AM

There are interesting challenges in Bitcoin, reading headers isn't one of them.

My mission is to eliminate every last hard-coded global parameter in Bitcoin, so that it grows into an indisputably scalable and universal protocol. On the chopping block are "10 minutes," and "difficulty adjustment every 2016 blocks."

I assure you, that neither of those parameters are up for debate.

Quote

Two of the things I'm going to propose next (absorbing orphaned forks, and self-adjusted difficulty without timestamps) are going to potentially to create a much larger number of headers, so I wanted to explain my solution to that first, especially starting with efficient validation for lite-clients. If it's not interesting, then no need to read ahead - but save your place.

Orphanced forks don't need to be 'absorbed' into the main chain, and self-adjusting difficulty without timestamps is too complicated to impliment across a massive, distributed p2p network with connections that vary by orders of magnitude. This will not fly, and your current proposal isn't likely to be incorporated into the main chain, either; without some overriding gain. Incremental gains need not apply here, it's about two years too late for such things.

TheBitcoinChemist

member

Activity: 70

Merit: 10

Quote from: ben-abuya on August 08, 2012, 12:13:50 PM

Quote from: gmaxwell on August 08, 2012, 11:18:39 AM

Quote from: socrates1024 on August 07, 2012, 02:53:21 AM

How does a lite client know which block chain is the largest? By iterating through the whole chain from front to back? Ick!

I stopped reading at this point. Ten years of headers is about 40 megabytes. One hundred years is 400 megabytes. A boring computer _today_ could validate 100 years of headers in a second or two. And even this could be skipped up to a fixed reference point. There are interesting challenges in Bitcoin, reading headers isn't one of them.

Also, from this discussion it sounds like some people are confused wrt difficulty. The difficulty of a block is its target, not the hash value.

If you stopped reading, how do you know? But seriously, I believe the idea here isn't to address practical problems with the current Bitcoin parameters, but to explore concepts that will be needed if Bitcoin morphs in interesting ways. For instance, if average block solving time is reduced 60-fold to ten seconds, can your smartphone still keep up with all those headers?

Average block solving time will never be reduced to anywhere near 10 seconds.

socrates1024

full member

Activity: 126

Merit: 110

Andrew Miller

Quote from: gmaxwell on August 08, 2012, 11:18:39 AM

I stopped reading at this point. Ten years of headers is about 40 megabytes. One hundred years is 400 megabytes. A boring computer _today_ could validate 100 years of headers in a second or two. And even this could be skipped up to a fixed reference point.

This proposal is primarily about ddos-resistance. A protocol's ddos-resistance is related to the largest plausible validation effort, not the expected (or actual) effort. So if I give you a 20GB set of headers and tell you it's the largest chain... you'd apparently reject it without even looking, since larger than 400MB is implausible.

Quote from: gmaxwell on August 08, 2012, 11:18:39 AM

There are interesting challenges in Bitcoin, reading headers isn't one of them.

My mission is to eliminate every last hard-coded global parameter in Bitcoin, so that it grows into an indisputably scalable and universal protocol. On the chopping block are "10 minutes," and "difficulty adjustment every 2016 blocks."

Two of the things I'm going to propose next (absorbing orphaned forks, and self-adjusted difficulty without timestamps) are going to potentially to create a much larger number of headers, so I wanted to explain my solution to that first, especially starting with efficient validation for lite-clients. If it's not interesting, then no need to read ahead - but save your place.

ben-abuya

sr. member

Activity: 323

Merit: 250

Quote from: gmaxwell on August 08, 2012, 11:18:39 AM

Quote from: socrates1024 on August 07, 2012, 02:53:21 AM

How does a lite client know which block chain is the largest? By iterating through the whole chain from front to back? Ick!

I stopped reading at this point. Ten years of headers is about 40 megabytes. One hundred years is 400 megabytes. A boring computer _today_ could validate 100 years of headers in a second or two. And even this could be skipped up to a fixed reference point. There are interesting challenges in Bitcoin, reading headers isn't one of them.

Also, from this discussion it sounds like some people are confused wrt difficulty. The difficulty of a block is its target, not the hash value.

If you stopped reading, how do you know? But seriously, I believe the idea here isn't to address practical problems with the current Bitcoin parameters, but to explore concepts that will be needed if Bitcoin morphs in interesting ways. For instance, if average block solving time is reduced 60-fold to ten seconds, can your smartphone still keep up with all those headers?

gmaxwell

staff

Activity: 4326

Merit: 8951

Quote from: socrates1024 on August 07, 2012, 02:53:21 AM

How does a lite client know which block chain is the largest? By iterating through the whole chain from front to back? Ick!

I stopped reading at this point. Ten years of headers is about 40 megabytes. One hundred years is 400 megabytes. A boring computer _today_ could validate 100 years of headers in a second or two. And even this could be skipped up to a fixed reference point. There are interesting challenges in Bitcoin, reading headers isn't one of them.

Also, from this discussion it sounds like some people are confused wrt difficulty. The difficulty of a block is its target, not the hash value.

kjj

legendary

Activity: 1302

Merit: 1026

Quote from: TheBitcoinChemist on August 08, 2012, 10:33:26 AM

Quote from: kjj on August 08, 2012, 08:22:47 AM

Ok, so I read this thread again and figured out the part that I had been missing.

When creating a new block, the skiphash points to the most recent block with a value higher than the previous block, not the block with the previous highest value ever.

I'm working through the implications of this, but at first glance, it does not appear to produce a chain of links that leads to a high value block.

Quote from: socrates1024 on August 07, 2012, 01:58:16 PM

Suppose the back-link from the current head looks like 000xxx (value 3). We also include an 'up'-link to the parent of the most recent block that looks like 0000xxxx (value 4). If you skip to that parent block, then it will have an 'up'-link to the parent of a block that looks like 00000xxxxxx (value 5). And so on, until you get to the parent of the largest block for which there's no up-link.

Suppose you start at that largest block, which looks like 00000xxxxx (value 5). It contains an up-link, either to a previous block with value 00000xxx (value 5), or to one with lower value. You can skip your way to the highest-value hash in every sub-interval of history.

The problem is that when you follow that first "up" link, the block you find there does not have a link to a value higher than itself, it has a link to a value higher than its predecessor. If you want to find a block with a value higher than the block you are on, you have to go forward and then "up" from there, and forward links aren't possible.

Of course they are, just +1 on your block count, then "up". Or just change the higher last hash to one that is higher than both the last hash & the current one. The idea of choosing to be higher than the last hash is to avoid duplicating the existing liner structure of the blockchain.

You don't know the value of the current block's hash until after you hash it, and once you do that, you can't go back in and figure out which block you should have linked to, because that would change the hash and value.

Adding a block index to the block gives you the ability to move forward, but it also gives you full random access. Why bother with a second link when you can instantly seek by index?

With random access, the only job left for hashes is to preserve ordering, that is to prove that Z came after Y came after X. Someone creating a fake skipchain only has to make one trivial fake block per high value fake block, and make them in the right order, which is still billions and billions of times easier than creating a whole fake chain, at least over intervals long enough for this system to make any sense at all.

TheBitcoinChemist

member

Activity: 70

Merit: 10

Quote from: kjj on August 08, 2012, 08:22:47 AM

Ok, so I read this thread again and figured out the part that I had been missing.

When creating a new block, the skiphash points to the most recent block with a value higher than the previous block, not the block with the previous highest value ever.

I'm working through the implications of this, but at first glance, it does not appear to produce a chain of links that leads to a high value block.

Quote from: socrates1024 on August 07, 2012, 01:58:16 PM

Suppose the back-link from the current head looks like 000xxx (value 3). We also include an 'up'-link to the parent of the most recent block that looks like 0000xxxx (value 4). If you skip to that parent block, then it will have an 'up'-link to the parent of a block that looks like 00000xxxxxx (value 5). And so on, until you get to the parent of the largest block for which there's no up-link.

Suppose you start at that largest block, which looks like 00000xxxxx (value 5). It contains an up-link, either to a previous block with value 00000xxx (value 5), or to one with lower value. You can skip your way to the highest-value hash in every sub-interval of history.

The problem is that when you follow that first "up" link, the block you find there does not have a link to a value higher than itself, it has a link to a value higher than its predecessor. If you want to find a block with a value higher than the block you are on, you have to go forward and then "up" from there, and forward links aren't possible.

Of course they are, just +1 on your block count, then "up". Or just change the higher last hash to one that is higher than both the last hash & the current one. The idea of choosing to be higher than the last hash is to avoid duplicating the existing liner structure of the blockchain.

kjj

legendary

Activity: 1302

Merit: 1026

Ok, so I read this thread again and figured out the part that I had been missing.

When creating a new block, the skiphash points to the most recent block with a value higher than the previous block, not the block with the previous highest value ever.

I'm working through the implications of this, but at first glance, it does not appear to produce a chain of links that leads to a high value block.

Quote from: socrates1024 on August 07, 2012, 01:58:16 PM

Suppose the back-link from the current head looks like 000xxx (value 3). We also include an 'up'-link to the parent of the most recent block that looks like 0000xxxx (value 4). If you skip to that parent block, then it will have an 'up'-link to the parent of a block that looks like 00000xxxxxx (value 5). And so on, until you get to the parent of the largest block for which there's no up-link.

Suppose you start at that largest block, which looks like 00000xxxxx (value 5). It contains an up-link, either to a previous block with value 00000xxx (value 5), or to one with lower value. You can skip your way to the highest-value hash in every sub-interval of history.

The problem is that when you follow that first "up" link, the block you find there does not have a link to a value higher than itself, it has a link to a value higher than its predecessor. If you want to find a block with a value higher than the block you are on, you have to go forward and then "up" from there, and forward links aren't possible.

ben-abuya

sr. member

Activity: 323

Merit: 250

Quote from: kjj on August 07, 2012, 08:39:47 PM

Blocks don't currently have a concept of height. If you add that too, you have more information, but not that much more.

And backing up doesn't give a new walk, ever. Every block from X+1 to Y has a skiphash value of X, every block from Y+1 to Z has a skiphash value of Y. If you back up one block to Z-1 you don't get a new walk, you get the exact same one again.

kjj, you seem to be confusing two concepts here, the skip hash (most recent block with higher value than the previous) and the highest value block in the chain. In your example, X, Y, Z are skip hashes, but for the traversal stuff, you'd be working with highest value blocks. If I find the highest value block in the chain, clearly there are two intervals: blocks before the hvb in the chain and blocks after the hvb in the chain. If we look at each of these intervals, excluding the hvb, we will find a different hvb in each interval. We can choose to go either up or down. That's the second block in the path.

kjj

legendary

Activity: 1302

Merit: 1026

Blocks don't currently have a concept of height. If you add that too, you have more information, but not that much more.

And backing up doesn't give a new walk, ever. Every block from X+1 to Y has a skiphash value of X, every block from Y+1 to Z has a skiphash value of Y. If you back up one block to Z-1 you don't get a new walk, you get the exact same one again.

socrates1024

full member

Activity: 126

Merit: 110

Andrew Miller

Quote from: ben-abuya on August 07, 2012, 07:01:51 PM

A zero means "look low", a 1 means "look high". You send me that number and I now have to give you all the blocks I get from traversing the highway according to those instructions. If I wanted to feed you fake blocks, I'd have to either dream up all log(n) blocks on the path for you on the fly, or have all possible paths ready to go, which is basically the same as hashing the entire blockchain from scratch.

Yup, random traversals of this structure are very interesting. I suspect that at some later date, we will be able to make a non-gpu proof-of-work scheme based on traversals of the previous work-graph. It's basically a way of randomly sampling blocks from history, weighted according to difficulty.

ben-abuya

sr. member

Activity: 323

Merit: 250

Hmm, how about this:

Let's hypothetically assume current difficulty for all blocks for the sake of this illustration. Your client chooses a random number (k) from 0 to current block height (n). That number is a binary number with log(n) digits (zero-padded). A zero means "look low", a 1 means "look high". You send me that number and I now have to give you all the blocks I get from traversing the highway according to those instructions. If I wanted to feed you fake blocks, I'd have to either dream up all log(n) blocks on the path for you on the fly, or have all possible paths ready to go, which is basically the same as hashing the entire blockchain from scratch.

In order to hash the blocks on the path in a matter of seconds, I'd need 60 * log(n) more hashing power than the entire network.

The problem with this is that I can try to say that the difficulty was historically ridiculously low. But since I'm feeding you all the highest value blocks, if you're not happy with the overall difficulty you just won't deal with me, because I'm probably a crook. Since you can be off by at least two orders of magnitude and still feel safe dealing with me, there probably won't be a lot of false negatives.

socrates1024

full member

Activity: 126

Merit: 110

Andrew Miller

Quote from: kjj on August 07, 2012, 05:29:14 PM

What do you do with the subinterval exactly?

You don't know any blocks between X and Y or Y and Z. You don't know how many there are,

I'm sorry, I wasn't very clear on this part.

Remember that each skip list node allows you to travel either "backwards" or "up". If you traveled "up" several times from Z to get to to Y, then you can start at any of those nodes and travel "back" instead. This lets you subdivide/partition the entire history in a way that's roughly half-as-dense at each level. Since there are two links in each block, you can traverse to any point quickly.

A full chain iteration would consist of "back back back back back back ... back back [genesis]". An efficient way to check the work is to go "up up up up up up up [genesis]". then do "up up up up back up up up up up [genesis]". Mostly ups. At the beginning, you'll spend most of your time among the peaks, where the air is clearer and the signal is stronger.

socrates1024

full member

Activity: 126

Merit: 110

Andrew Miller

I imagine that in each block, you store a concise report of the sum of the difficulties so far. This way you are never trying to make an estimate from scratch, instead you are trying to evaluate whether the self-reported estimate is plausible.

Suppose I tell you that my chain has bajillion hashes with sum-difficulty of 5.

You say "Sweet! I bet you have some cool 9's and 10's in there. Show me those."

If I say "Nope actually my best is just an 8", you are immediately suspicious. "Well show me all your 8s and 7s then." If I actually did a fraction of a bajillion hashes, then the jig will be up quite quickly.

Quote from: TheBitcoinChemist on August 07, 2012, 05:38:14 PM

EDIT: but what happens if the difficulty drops or levels off ofr a long peroid of time, what prevents a uselessly large interval?

Good question! There's a difference between difficulty and hash-value. The difficulty is just the minimum hash-value needed for a block. The skip list indexes the work according to hash-value, which corresponds to the total number hashes drawn over time, regardless of what the minimum thresholds were. So if there's a period of time where the difficulty is only 1 (and perhaps there are tons of blocks like this), but most of the time the difficulty was 5, then you will be estimating mostly from the 7's 6's and 5's. You'd only get to the 1s if the total is plausible all the way down.

TheBitcoinChemist

member

Activity: 70

Merit: 10

Quote from: kjj on August 07, 2012, 05:29:14 PM

Quote from: socrates1024 on August 07, 2012, 04:41:47 PM

Quote from: kjj on August 07, 2012, 03:56:12 PM

Consider this, you get a block, block Z. You do not have block (Z-1), so you check the skiphash and ask for block Y. You also don't have block (Y-1), so you check the skiphash again and ask for block X. Now block X is one that you do have, so you stop there. How much should you trust blocks Y and Z at this point?

...

Unless I'm wrong, of course. It could be that the skiphash chain rules add up to a much stronger context than is immediately apparent to me.

Branch and Bound! You've now established that block X is at least a common point in both chains (assuming you already have a chain, it's the genesis block if you're starting from scratch). Block Y is the largest hash in between X and Z. You can now repeat the process on either sub interval. Each time you go down a level, there are more blocks to check, and you get an increasingly precise estimate. If you keep going, eventually you will process every block.

I imagine that there would be a lite client that says "Estimated work: 1 bajillion megahashes. Likelihood: 76%". The likelihood number increases, very rapidly at first, and slowing down at 99%, 99.99% and so on. If it hits 100, then you have validated the entire chain like a full node.

This also allows you to quickly find the true fork point between two chains, even ones that have had difficulty changes along the way.

What do you do with the subinterval exactly?

You don't know any blocks between X and Y or Y and Z. You don't know how many there are,

Actually, you would know how many there are, because the current block number is part of the header.

I'm starting to see value in this idea.

EDIT: but what happens if the difficulty drops or levels off ofr a long peroid of time, what prevents a uselessly large interval?

kjj

legendary

Activity: 1302

Merit: 1026

Quote from: socrates1024 on August 07, 2012, 04:41:47 PM

Quote from: kjj on August 07, 2012, 03:56:12 PM

Consider this, you get a block, block Z. You do not have block (Z-1), so you check the skiphash and ask for block Y. You also don't have block (Y-1), so you check the skiphash again and ask for block X. Now block X is one that you do have, so you stop there. How much should you trust blocks Y and Z at this point?

...

Unless I'm wrong, of course. It could be that the skiphash chain rules add up to a much stronger context than is immediately apparent to me.

Branch and Bound! You've now established that block X is at least a common point in both chains (assuming you already have a chain, it's the genesis block if you're starting from scratch). Block Y is the largest hash in between X and Z. You can now repeat the process on either sub interval. Each time you go down a level, there are more blocks to check, and you get an increasingly precise estimate. If you keep going, eventually you will process every block.

I imagine that there would be a lite client that says "Estimated work: 1 bajillion megahashes. Likelihood: 76%". The likelihood number increases, very rapidly at first, and slowing down at 99%, 99.99% and so on. If it hits 100, then you have validated the entire chain like a full node.

This also allows you to quickly find the true fork point between two chains, even ones that have had difficulty changes along the way.

What do you do with the subinterval exactly?

You don't know any blocks between X and Y or Y and Z. You don't know how many there are, you don't know what their difficulty is, you don't know what their values are, you don't know anything at all about those intervals, except that if you get block Y-1 (or Z-1) and the value is higher than Y (or Z), one of the blocks is lying. But which one?

And since you don't know anything at all about these intervals, your estimated work is going to be W>X+(y*Y)+(z*Z) where y and z are probably greater than 2. And I just don't see any way, using this method, to improve our estimates.

Again, I might be missing something, but from what I see, it looks like you are using knowledge of the blockchain in your estimates, which is a problem, since this is intended to be a system for finding out information about the blockchain that we assume that you do not actually possess.

socrates1024

full member

Activity: 126

Merit: 110

Andrew Miller

Quote from: kjj on August 07, 2012, 03:56:12 PM

Consider this, you get a block, block Z. You do not have block (Z-1), so you check the skiphash and ask for block Y. You also don't have block (Y-1), so you check the skiphash again and ask for block X. Now block X is one that you do have, so you stop there. How much should you trust blocks Y and Z at this point?

...

Unless I'm wrong, of course. It could be that the skiphash chain rules add up to a much stronger context than is immediately apparent to me.

Branch and Bound! You've now established that block X is at least a common point in both chains (assuming you already have a chain, it's the genesis block if you're starting from scratch). Block Y is the largest hash in between X and Z. You can now repeat the process on either sub interval. Each time you go down a level, there are more blocks to check, and you get an increasingly precise estimate. If you keep going, eventually you will process every block.

I imagine that there would be a lite client that says "Estimated work: 1 bajillion megahashes. Likelihood: 76%". The likelihood number increases, very rapidly at first, and slowing down at 99%, 99.99% and so on. If it hits 100, then you have validated the entire chain like a full node.

This also allows you to quickly find the true fork point between two chains, even ones that have had difficulty changes along the way.

BoardGameCoin

sr. member

Activity: 283

Merit: 250

Quote from: kjj on August 07, 2012, 03:56:12 PM

Unless I'm wrong, of course. It could be that the skiphash chain rules add up to a much stronger context than is immediately apparent to me.

I'm not sure that you're wrong, but I do think its a stronger validation than you realize. When mining, we send solutions to lower-difficulty problems to the server to let it know that we're working. Most mining pools call these 'shares', and there are typically many shares per valid block because the difficulty is much higher. The rules socrates1024 is proposing have the net effect of superimposing a nested set of higher difficulty blocks. Manufacturing a block for the next level would be roughly 16 times as difficult as manufacturing a single block at present difficulty.

I'm curious about this overlay, not sure about its usefulness but I'm at least intellectually interested in seeing it. It does seem to provide an interesting 'difficulty topology' on the blockchain...

kjj

legendary

Activity: 1302

Merit: 1026

Quote from: socrates1024 on August 07, 2012, 02:26:45 PM

Quote from: kjj on August 07, 2012, 02:20:16 PM

Notice that building a couple of phony blocks that look legit when viewed in this chain is trivial compared to building a couple of phony blocks to fool the real chain. And by trivial, I mean billions and billions of times easier.

Good question! That's exactly the concern that motivated me to work on this problem. If a chain is valid, you will eventually process it entirely. However if a chain is work-deficient, like you described, then you can reject it early. Think of it this way: it's a validation procedure that has zero false positives, but supports probabilistic false-negatives from early rejection.

It has neither false positives nor false negatives, because it simply doesn't do ANY verification whatsoever.

Consider this, you get a block, block Z. You do not have block (Z-1), so you check the skiphash and ask for block Y. You also don't have block (Y-1), so you check the skiphash again and ask for block X. Now block X is one that you do have, so you stop there. How much should you trust blocks Y and Z at this point?

The answer is "not at all" because you don't have ANY context for them. You have NO idea what their difficulty should be, you have NO idea if they actually connect back to the real chain or not, and you have NO idea what their timestamps should be. In short, you have no idea if these are real blocks, or if they were prepared just to attack you, and you won't have any idea until you verify the chain the hard way.

The skiphash chain means that when I create my attack chain, the difficulty values must follow some rules, but following those rules is still billions and billions of times easier than following the full network rules because you only need to make a couple of blocks.

Unless I'm wrong, of course. It could be that the skiphash chain rules add up to a much stronger context than is immediately apparent to me.

nedbert9

sr. member

Activity: 252

Merit: 250

Inactive

Interesting.

Topic: The High-Value-Hash Highway - page 2. (Read 9035 times)