Pages:
Author

Topic: How a floating blocksize limit inevitably leads towards centralization - page 5. (Read 71630 times)

legendary
Activity: 1708
Merit: 1011
The verification delays grow exponentially with the number of transactions simply because each node must perform it's own series of checks before forwarding said block.

Wrong, because the number of nodes getting involved in verification of specific block grows exponentially as well, so the relationship between number of transactions and propagation time is linear.

As an analogy think about how a bacteria multiplies, at every step a size of the colony increased by factor of 2 until it's reached lets say 1024, if suddenly it requires twice more time to multiple then it takes only twice longer to reach the size of 1024 because the number of multiplication steps is still the same.

If block verification were a distributable process, you'd be correct, but it isn't.  At least it's not now, and I don't know how it could be done.  One thing that could be altered to speed up propagation is for some nodes to have trusted peers, wherein if they receive a block from that peer, they re-broadcast that block first then do their checks.  But if this was the default behavior, the network could be DDOS'ed with false blocks.

Lets say you have 1025 full nodes on the network, to keep the example simple we will have those nodes connected in a form of binary tree where a block starts propagating from it's root. Lets say it takes 1 minutes for a node to verify a block and relay it to its children, then it will take 10 minutes for 1024 nodes ( I exclude the root from verification time) to verify the block, in another words it takes 10 minutes for a block to get propagated and verified by the network.

Now lets have a bigger block that takes twice more time to verify, then it will take 20 minutes for 1024 nodes to verify the block, in another words it will take twice more minutes for a block to get propagated and verified by the network. Therefore the relationship between verification time of block and propagation delay it's causing is linear, e.g. if it takes thrice more time to verify a block then, in the simplified example, it will take thrice more time to propagate it through the network.

Your example does not relate to the network.  It's not a binary tree, and doubling the size of the block does not simply double the verification times.  While it might actually be close enough to ignore in practice, the increase in the actual number of transactions in the myrkle tree makes the tree itself more complex, with more binary branches thus more layers to the myrkle tree.  This would imply a greater complexity to the verification process on it's own.  For example, a simple block with only four transactions will have the TxID hashes for those four transactions, plus two hashes for their paired myrkel tree branches, and a final hash that pairs those two together, and that is the myrkle root, which is then included into the block header.  Moving beyond four transactions, however, creates another layer in the binary tree; and then another after 8 transactions, and another after 16 transactions.  Once your into a block with several thousand transactions, your myrkel tree is going to have several layers, and only the bottom of the tree are actual transaction ID's; all the rest are artifacts of the myrkel tree, which every verification of a block must replicate.  The binary myrkel tree within a block is a very efficient way to verifiablely store the data in a way that it can be trimmed later, but it's most certainly not a linear progression.  More complex transaction types, such as contracts or send-to-many, has a similar effect; as the process for verifying transactions is not as simple as a single hash.  Portions of the entire transaction must be excluded from the signing hash that each input address must add to the end of each transaction, and then be re-included as the verification process marches down the list of inputs.  And that is just an example of what I, personally, know about the process.  Logically, the increase in clock cycles for larger, and more complex, blocks and transactions must increase the propagation times at least linearly; and it's very likely to be greater than linear.  Which is, by defintion, exponential.  It may or may not matter in practice, as that exponetial growth may be so small as to not really effect the outcomes, but it is very likely present; and if so, the larger and more complex that blocks are permitted to grow the more likely said growth will metasize to a noticable level.
sr. member
Activity: 269
Merit: 250
The verification delays grow exponentially with the number of transactions simply because each node must perform it's own series of checks before forwarding said block.

Wrong, because the number of nodes getting involved in verification of specific block grows exponentially as well, so the relationship between number of transactions and propagation time is linear.

As an analogy think about how a bacteria multiplies, at every step a size of the colony increased by factor of 2 until it's reached lets say 1024, if suddenly it requires twice more time to multiple then it takes only twice longer to reach the size of 1024 because the number of multiplication steps is still the same.

If block verification were a distributable process, you'd be correct, but it isn't.  At least it's not now, and I don't know how it could be done.  One thing that could be altered to speed up propagation is for some nodes to have trusted peers, wherein if they receive a block from that peer, they re-broadcast that block first then do their checks.  But if this was the default behavior, the network could be DDOS'ed with false blocks.

Lets say you have 1025 full nodes on the network, to keep the example simple we will have those nodes connected in a form of binary tree where a block starts propagating from it's root. Lets say it takes 1 minutes for a node to verify a block and relay it to its children, then it will take 10 minutes for 1024 nodes ( I exclude the root from verification time) to verify the block, in another words it takes 10 minutes for a block to get propagated and verified by the network.

Now lets have a bigger block that takes twice more time to verify, then it will take 20 minutes for 1024 nodes to verify the block, in another words it will take twice more minutes for a block to get propagated and verified by the network. Therefore the relationship between verification time of block and propagation delay it's causing is linear, e.g. if it takes thrice more time to verify a block then, in the simplified example, it will take thrice more time to propagate it through the network.
legendary
Activity: 1708
Merit: 1011
Still, the number of hops (nodes) a block has to be checked by in order for the network to be flooded / saturated with that block is linear to the average radius of the net, in hops, from the node that solves the block, isn't it?

-MarkM-


Hmm, more or less.  But my point is that, as the blocksize increases, the propagation delays increase due to more than one delay metric increasing.  We have the check times, we have the individual p2p transmission times, and we have the number of hops.

One thing that I didn't make clear is that I also suspect, but cannot prove, the number of hops in the network will also tend to increase in average number.  Partly because of an increase in active nodes, which is an artifact of a growing economy, but also somewhat due to the increase in blocksize.  I suspect that as network resource demand increases, some full nodes will deliberately choose to limit their peers and their dedicated bandwidth, functionally moving themselves towards the edge of the network.
legendary
Activity: 2940
Merit: 1090
Still, the number of hops (nodes) a block has to be checked by in order for the network to be flooded / saturated with that block is linear to the average radius of the net, in hops, from the node that solves the block, isn't it?

-MarkM-
legendary
Activity: 1708
Merit: 1011
The verification delays grow exponentially with the number of transactions simply because each node must perform it's own series of checks before forwarding said block.

Wrong, because the number of nodes getting involved in verification of specific block grows exponentially as well, so the relationship between number of transactions and propagation time is linear.

As an analogy think about how a bacteria multiplies, at every step a size of the colony increased by factor of 2 until it's reached lets say 1024, if suddenly it requires twice more time to multiple then it takes only twice longer to reach the size of 1024 because the number of multiplication steps is still the same.

If block verification were a distributable process, you'd be correct, but it isn't.  At least it's not now, and I don't know how it could be done.  One thing that could be altered to speed up propagation is for some nodes to have trusted peers, wherein if they receive a block from that peer, they re-broadcast that block first then do their checks.  But if this was the default behavior, the network could be DDOS'ed with false blocks.
legendary
Activity: 924
Merit: 1004
Firstbits: 1pirata
The verification delays grow exponentially with the number of transactions simply because each node must perform it's own series of checks before forwarding said block.

Wrong, because the number of nodes getting involved in verification of specific block grows exponentially as well, so the relationship between number of transactions and propagation time is linear.

As an analogy think about how a bacteria multiplies, at every step a size of the colony increased by factor of 2 until it's reached lets say 1024, if suddenly it requires twice more time to multiple then it takes only twice longer to reach the size of 1024 because the number of multiplication steps is still the same.

MoonShadow is right, every node performs the checks on its own before forwarding the block to other peers, please don't make useless analogies Serith.
sr. member
Activity: 269
Merit: 250
The verification delays grow exponentially with the number of transactions simply because each node must perform it's own series of checks before forwarding said block.

Wrong, because the number of nodes getting involved in verification of specific block grows exponentially as well, so the relationship between number of transactions and propagation time is linear.

As an analogy think about how a bacteria multiplies, at every step a size of the colony increased by factor of 2 until it's reached lets say 1024, if suddenly it requires twice more time to multiple then it takes only twice longer to reach the size of 1024 because the number of multiplication steps is still the same.
legendary
Activity: 1708
Merit: 1011
full member
Activity: 154
Merit: 100
legendary
Activity: 1708
Merit: 1011


One, is the fixed cost really directly linear to the max block size? Or is it really more like exponential?


It's definately exponential, even if you just consider network traffic.

Let me paint a strawman to burn...

The network topology, as it exists, is largely random.  A full client is expected to have at least 8 active connections, although three would suffice for the health of the network and only one works for the node itself.  Many can, and do, have more connections than 8.  Many of the 'fallback' nodes have hundreds, perhaps thousands, of connections.  So here is the problem; when a block is found, it's in every honest nodes' own interest that said block propagates to the edges of the network as quickly as possible.  A recently as last year, it wouldn't normally take more than 10 seconds to saturate the network, but Satoshi chose a 10 minute interval, in part, to reduce the number of network splits and orphaned blocks due to propagation times, because he presumed that the process that nodes go through (verify every transaction, verify the block, broadcast a copy to each peer that does not have it) would grow as blocks became larger.  The verification delays grow exponentially with the number of transactions simply because each node must perform it's own series of checks before forwarding said block. 

Now we get to the traffic problem...

Once the block has been found by a miner, that miner very much wants to send it to every peer he has as quickly as he can.  So he has to upload that block, whatever it's size, at least 8 times because it's impossible that any of his peers already has it.  Then, after each of those peers has performed the checks, he then proceeds to upload that block to all of his peers save one (the one it got it from) because at this point it is very unlikely that any of his other connected peers already has the block.  These nodes also have a vested interest in getting this block out quickly, once they have confirmed it's a valid block, if they are miners themselves because they don't want to end up mining on a minority block.  Thus, as you can see, the largest miners will always form the center of the network, because they all have a vested interest in being very well connected to each other.  So they also have a vested interest in peer connections with other miners of their same caliber, but not so much with the rest of the network, since sending to lesser miners or non-mining full clients will slow down their ability to transmit said block to those peers that increase the odds that said block will be the winner in a blockchain split.  This effect just gets worse as the size of the block increases, no matter how fast the connection that the mining nodes may have.

Well, this process continues across the p2p network until half of the network already has the block, at which point each node, on average, only finds that half of his peers still need the block.  The problem is that, for small miners, the propagation of a block that is not their own offers them somewhat less advantage than it does for the big miners at the center of the network, because those big miners have already been mining against that new block for many milliseconds to several seconds before the marginal miner even received it.  To make matters worse, the marginal miner tends to have a slower connection than the majors.  So even though it's increasingly likely that his peers don't need the block at all; as the size of the block increases, the incentives for the marginal miner to fail to forward that block, or otherwise pretend it doesn't have it, increase at a rate that is greater than linear.  This is actually true for all of the mining nodes, regardless of their position in the network relative to the center, but those who are closest to the center are always those with the greatest incentives to propagate, so the disincentives increase at a rate closer to linear the closer one is in the network to the center.

Now there is a catch, most of the nodes don't know where they are in the network.  It's not that they can't determine this, most owners simply don't bother to figure this out, nor do they deliberately establish peer connections with the centermost nodes.  However, those centermost nodes almost certainly do have favored connections directly to each other, and they were established deliberately by the ownership for these very reasons.
legendary
Activity: 2940
Merit: 1090
Fine then, lets look at how much it will cost to do specific upgrade scenarios, since all costs ultimately end up coming out of the pockets of users. (Right? Users ultimately pay for everything?)

One scenario is, we upgrade absolutely every full node that currently runs 24/7.

Already we are driving out all the people who like to freeload on the resources of decentralised networks without giving back. They fire up a node, download the movie they want or do the bitcoin payment they want - in general, get something they want from the system - then they shut down the app so that their machine doesn't participate in providing the service.

So for a totally ballpark, making handwaving assumptions back of a napkin guess, lets guess maybe ten times the block size would mean ten times the current costs of hardware and bandwidth. "We" (the users) provide upgrades to all the current 24/7 full nodes, increase the max block size by ten and we're done.

Of course the beancounters will point out "we" can save a lot of cost by building one global datacentre in Iceland, increasing the max block size to oh heck why quibble over how many gigabytes lets just do this right and say at least a terrabyte. Current nodes are just the fossils of a dead era, "we" do this once, do it right, and we're done. For redundance maybe "we" build actually seven centres, one per continent, instead of just one wherever electricity and cooling is cheapest.


Two details:

One, is the fixed cost really directly linear to the max block size? Or is it really more like exponential?

Two, experiment with various s/we/someoneelse/ substitutions? (Google? The government? Big Brother? Joe Monopolist? The Russian Mob? Etc.)

(I thought I had two details, but come time to type the second couldn't remember what I had had in mind, so made up the above detail Two as a spacefiller.)

-MarkM-
member
Activity: 87
Merit: 12

I think you're on the right track with that idea.  Here's a metaphor that might be related to this whole issue:

If demand for wheat increases, the price of wheat increases too because, of course, there's a limited supply of wheat.  The increased price encourages farmers to grow more wheat and make more investment in tractors and equipment.  Higher wheat prices encourage more people to become wheat farmers when they see that the wheat business has become more lucrative.  With the increased supply of wheat that results from its increased production, consumer prices begin to fall.  Overall, the economy ends up with a much larger supply of wheat at the most competitive price possible.  Production is maximized, farmers' profits are maximized, and consumer prices are minimized.

I know the block size issue is more complex due to things like hard drive space, but the relationships between supply, demand, production, and price seem important to the problem at hand.
legendary
Activity: 1708
Merit: 1011
While this number isn't remotely what it would have to be to handle the kind of transactions normally done in cash, we don't really want to handle every single transaction.

Why not? Because it would require 25GB blocks (to do 15,000,000,000 transactions per day) and that seems completely impossible right now?

No, but because of the network effects of raising that limit.  While not a certainty, raising the limit theoreticly permits the overall resource consumption of bitcoin's p2p network to increase exponentually.  And it's this increase in resource consumption that would drive out, not just the hobbyist with a full client, but small mining operations that are otherwise profitable but marginal.  I'm not afraid of some degree of this if I'm convinced that raising that limit is a net benefit for Bitcoin, but I'm not convinced that this is the case.  Transaction times, delay time and average transaction fees are far from the only consideration.  This is a very complex system, and even a small increase in the hard limit could cause some unintended consequences; but a small increase does not justify the potential risk of a hard code fork fight.
sr. member
Activity: 310
Merit: 250
While this number isn't remotely what it would have to be to handle the kind of transactions normally done in cash, we don't really want to handle every single transaction.

Why not? Because it would require 25GB blocks (to do 15,000,000,000 transactions per day) and that seems completely impossible right now? Computing resources are growing much faster than either the number of people on this earth, or their number of financial transactions. Think about how valuable Bitcoin (and the resources that protect it) would become if it ever got to that scale. I'm not saying this is going to happen tomorrow or even in 10 years. But in my mind, more (paying) transactions = more success for Bitcoin.
legendary
Activity: 1708
Merit: 1011
To make a point about this, I went onto BitcoinWatch and checked the transactions for the past 24 hours.

61,689

On a per second rate, that's about 0.714 transactions per second.  While it's true that we've hit some pretty high highs.  Has there ever been a case where we were maxing out the soft limit for a full day?  Or even for a full hour?

I'd like to see what a transactions per day chart might look like over the past month or so.  I'd guess that it's not likely to be over one transaction per second.  While 7 tps sounds like it's slow, it's a continuous process, 24 hours per day, 7 days per week.  That comes out to be approximately 4,233,600 transactions per week.  Think about how often you use a credit card.  Do you use it everyday?  While this number isn't remotely what it would have to be to handle the kind of transactions normally done in cash, we don't really want to handle every single transaction.  What if we look at it like it's a bank account.  Normally, people get paid and deposit into their accounts once each week, and then they widthdraw spending cash and sometimes use debit cards, etc. to buy things or pay regular bills.  So if we were to grant the average user 5 transactions per week, we'd currently be able to manage this kind of rate to just under a million users.  So if our goal is to have as many direct users of the blockchain as there are citizen of the United States, we'd have to have a blocksize limit of about 75 megabytes.  However, a blocksize limit of merely 10 MB would manage roughly 40 million users who directly access the blockchain on a regular basis.  This is the practical equivilant of 40 million banks, if out-of-band transactions are both practical and encouraged.  That does not sound like centralization to me.

Keep in mind that your full client functions like a bank for your single account, but an online wallet service does the same thing for an entire group of members.  We really want to encourage out-of-band transactions if we are ever to achieve the kind of success Bitcoin is actually capable of, not to mention that out-of-band transactions improve anonimity if done the right way.
legendary
Activity: 1078
Merit: 1006
100 satoshis -> ISO code
I'm pretty sure that the 250KB limit has never been broken to date.

block 187901 499,245 bytes
block 191652 499,254 bytes
block 182862 499,261 bytes
block 192961 499,262 bytes
block 194270 499,273 bytes

These are the biggest 5 blocks up to the checkpoint at block 216,116.  Interesting that they're all 499,2xx bytes long, as if whoever is mining them doesn't want to get too close to 500,000 bytes long.

I understand that at least one miner has their own soft-limit, probably Eligius and probably at 500kb.
legendary
Activity: 1078
Merit: 1003
legendary
Activity: 1708
Merit: 1011
When the competition is decimated all transactions can have compulsory fees.

I don't ever want to see compulsory fees.  That would undermine a great deal of transactions in meatspace that we don't want to see.

As the system is designed, there is no way for the receiver to pay for processing.  There are conditions wherein it's ideal that the burden of such be upon the merchant.  The Wal-Mart example is a general one. 

If Wal-Mart starts accepting BTC in their brick & morter stores, most of the time they are not going to need confirmations for their funds.  However, when it comes time to pay their employees, they need to have those free transactions processed.  So it's in Wal-Mart own interest to support a mining operation, if only to garrantee themselves timely transaction processing.  The same logic applies to Target or McDonalds.  This is what we want.  We desire that the big players be willing to support the network, even at a loss, much like a bank must buy a quality safe.  The safe is a cost center for a bank, which mostly deals in electronic forms of transactions anyway; but no one wants a bank branch without a safe.  Such a form of off-network compensation for mining would establish a baseline network infrastructure, and thus result in more 'sticky' difficulty.  Would more marginal miners be pushed beyond profitablility?  Probably so, but the network ends up with more reliable infrastructure and a lower security cost overall; and Wal-Mart wont do it unless they figure that it would be cheaper for them to sponsor miners to include all of the free transactions that they receive than to expect their customers to pay the minimum fee per transaction.
legendary
Activity: 2940
Merit: 1333
I'm pretty sure that the 250KB limit has never been broken to date.

block 187901 499,245 bytes
block 191652 499,254 bytes
block 182862 499,261 bytes
block 192961 499,262 bytes
block 194270 499,273 bytes

These are the biggest 5 blocks up to the checkpoint at block 216,116.  Interesting that they're all 499,2xx bytes long, as if whoever is mining them doesn't want to get too close to 500,000 bytes long.
legendary
Activity: 1078
Merit: 1006
100 satoshis -> ISO code
"Actually, free transactions do earn money. They encourage new users. More people will try bitcoin, accelerating its growth. By implication this contributes a proportion to its exchange rate against fiat. So, miners that include free transactions are indirectly benefiting from them as their block reward is higher in fiat terms."

I am not saying if this is good or bad. I am describing the reality of the situation that exists.

Has no one heard of a loss leader? http://en.wikipedia.org/wiki/Loss_leader
Bitcoin is exponentially growing against fiat currencies and established payment systems. By accident or design some/lots of this growth is via its loss leader of free transactions. Arguably, SD abuses this facility.

However, loss leading might be the best "winning" strategy that will have BTC kill off most of fiat in the shortest time (if that is the goal). When the competition is decimated all transactions can have compulsory fees (if that is the goal). Bitcoin does not exist in a vacuum and obtaining value in fiat during the growth phase is sensibly leveraging an external. The network benefits as most participant nodes would have a long-term BTC holding.

Hotmail and Yahoo mail, Google search, Facebook social network, Huffpost news: all are free, all are loss leading to get market-share! Once/if they decimate their respective competition their fees will rise. Their goal is transparent. In the vernacular: to monetize eyeballs. Loss leading is essential to get there.
Pages:
Jump to: