Topic: Maintaining the growing blockchain ledger size in local full nodes

full member
Activity: 182
Merit: 101
Storage space cost drops much faster than the ledger size grows. No problem. For instance, the cost of 4TB hard drives has dropped by approximately 75% in four years.

https://www.backblaze.com/blog/hard-drive-cost-per-gigabyte/
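As a back-of-the-envelope conversion of that figure into a yearly rate (using only the 75%-over-four-years number above, not the full Backblaze data):

Code:
# Annualized decline implied by a 75% price drop over 4 years
remaining = 0.25                      # 25% of the original price is left
annual_factor = remaining ** (1 / 4)  # ~0.707 of the previous year's price
print(f"~{100 * (1 - annual_factor):.0f}% cheaper per year")  # ~29%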
member
Activity: 301
Merit: 74
There are two separate options here. One is to download everything but keep only 10% for serving. That fully meets the "validate everything" criterion.

But the second option, the 10% download, I think might be secure too, though I'll let the crypto guys decide. Smiley If you trust the full header chain you've downloaded, and if all the blocks you store and serve validate internally and against the header hash, where's the attack vector? For wallet use it's definitely more secure than SPV, but yes, getting the whole UTXO set will require its own solution (haven't read in detail, but here's a related idea).
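Roughly, the internal consistency check I have in mind is just recomputing the Merkle root of a stored block's transactions and comparing it to the trusted header (a simplified Python sketch; the txids are assumed to already be in Bitcoin's internal byte order, and protocol-rule validation is a separate step):

Code:
import hashlib

def dhash(b: bytes) -> bytes:
    # Bitcoin-style double SHA-256
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

def merkle_root(txids):
    # Fold the txid list up to a single root, duplicating the last
    # entry on odd-length levels, as Bitcoin does.
    level = list(txids)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [dhash(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def block_matches_header(txids, header_merkle_root: bytes) -> bool:
    # A stored block is consistent with the (already trusted) header chain
    # if its transactions hash up to the header's Merkle root.
    return merkle_root(txids) == header_merkle_root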

You don't expect to connect to hundreds of nodes and for hundreds of nodes to connect to you, right? You are getting the blocks from several peers at the same time.
I don't understand where you see the problem. It's not much different from how it's done now. Just in the case of a 1/10 split, 1/10 of the nodes will tell you "I don't have this block", and will likely refer you to another peer they know that does have it. Just another hop or two.

Quote
So the nodes would be relying on a central server to be providing them with the info? What about the peer to peer in Bitcoin?
Kad and BT DHT don't rely on servers; they're distributed. And there you usually find a piece of data that's held by 1 in a million, not 1 in 10.
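To illustrate the kind of serverless lookup I mean (this is purely a hypothetical sketch, not an existing Bitcoin P2P message): if blocks map deterministically onto, say, ten buckets, any peer can compute which bucket to ask for without any coordinator.

Code:
import hashlib

NUM_SHARDS = 10  # the "1/10 split" from the example above

def shard_of_block(block_hash: bytes) -> int:
    # Deterministically map a block to one of NUM_SHARDS buckets,
    # so every peer computes the same answer with no central server.
    return int.from_bytes(hashlib.sha256(block_hash).digest()[:4], "big") % NUM_SHARDS

def peers_to_ask(block_hash: bytes, peers_by_shard: dict) -> list:
    # Peers advertising the right bucket; if none are connected,
    # you'd hop to whichever peer they refer you to.
    return peers_by_shard.get(shard_of_block(block_hash), [])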
legendary
Activity: 3038
Merit: 4418
Crypto Swap Exchange
Maybe it could be configurable, like pruned mode. Those who want to fully download and verify all blocks since the beginning can do so, others can rely on just the block headers.
So an SPV client? If we're talking about full nodes, you would be expected to download and verify all the blocks.

But if you trust them to be a linked list of the REAL chain, you also trust their hashes. And since you would fully validate some statistical percentage of the blocks, it would be hard to fake just some blocks in the middle which you didn't validate. Also, you could potentially validate some of these blocks later. For example, maybe the client could download one random block each hour that it didn't validate as part of its initial 10% download.
You can validate the blocks against the block headers, but you won't be able to verify whether the blocks follow the protocol rules. If you don't do that, there's no sense in running a full node at all. I get that you're solving the storage problem, but redundancy would be affected quite severely. If 1/2 of the nodes go down, the network would have only 500 nodes (or even fewer) holding a specific portion of the blockchain.

Why one peer? If it's 10%, 10% of the nodes can provide it. That's still hundreds or thousands.
I don't think currently you're connecting to 100% of the nodes to download the whole chain.
You don't expect to connect to hundreds of nodes and for hundreds of nodes to connect to you, right? You are getting the blocks from several peers at the same time. The current implementation works by the peers sending the blocks simultaneously. If your implementation were to become a reality, your client would download fragments of it and there would be a bottleneck.

Sure, but you can connect to a large enough portion of it to find what you need, like it already does. There are other examples of decentralized networks which work with far fewer nodes holding data, sometimes just a single one, like Kad and BitTorrent's DHT.
So the nodes would be relying on a central server to be providing them with the info? What about the peer to peer in Bitcoin?

That's one option, but it only solves storage; it can't help with serving blocks as a "split full node", and in the long term the initial download can also be a problem. As it is, bootstrapping a client from scratch already takes too long.
There's little to no point if you're downloading just parts of the blockchain. Either you download and verify everything or use an SPV client.
member
Activity: 301
Merit: 74
So you're just solving the storage problem?
Maybe it could be configurable, like pruned mode. Those who want to fully download and verify all blocks since the beginning can do so, others can rely on just the block headers.

Quote
Block headers don't show whether a block is valid.
But if you trust them to be a linked list of the REAL chain, you also trust their hashes. And since you would fully validate some statistical percentage of the blocks, it would be hard to fake just some blocks in the middle which you didn't validate. Also, you could potentially validate some of these blocks later. For example, maybe the client could download one random block each hour that it didn't validate as part of its initial 10% download.
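Concretely, something like this (fetch_block and validate_block are placeholders for whatever the client already does, not actual Bitcoin Core functions):

Code:
import random
import time

def spot_check_loop(unvalidated_heights, fetch_block, validate_block,
                    interval_seconds=3600):
    # Once an hour, fully validate one block this node skipped during its
    # partial download; over time the unchecked portion shrinks, so a
    # faked block in the middle becomes ever more likely to be caught.
    while unvalidated_heights:
        height = random.choice(tuple(unvalidated_heights))
        block = fetch_block(height)       # fetch the block from peers
        if not validate_block(block):     # full protocol-rule validation
            raise RuntimeError(f"invalid block at height {height}")
        unvalidated_heights.discard(height)
        time.sleep(interval_seconds)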

Quote
With your implementation, only one peer can provide the specific portion of blocks, making it a bottleneck.
Why one peer? If it's 10%, 10% of the nodes can provide it. That's still hundreds or thousands.
I don't think currently you're connecting to 100% of the nodes to download the whole chain.

Quote
How does a full node not being in sync relate to the problem?
It just means that the client already doesn't assume every full node holds the whole chain, and can query around and seek out another node if needed. That could be the basis to build on for the split-chain protocol.

Quote
Nodes cannot see the entire network and no one can connect to all of them.
Sure, but you can connect to a large enough portion of it to find what you need, like it already does. There are other examples of decentralized networks which work with far fewer nodes holding data, sometimes just a single one, like Kad and BitTorrent's DHT.

Quote
might as well just have a pruned node instead of going through the trouble.
That's one option, but it only solves storage; it can't help with serving blocks as a "split full node", and in the long term the initial download can also be a problem. As it is, bootstrapping a client from scratch already takes too long.
legendary
Activity: 3038
Merit: 4418
Crypto Swap Exchange
It could work like pruned mode: download everything but store only some of it. Or it could be selective downloading. In both cases, as is already done, the block headers can be downloaded and kept in full to provide some verification.
So you're just solving the storage problem? You'd probably end up connecting to hundreds of nodes just to download the whole blockchain. Block headers don't show whether a block is valid. That being said, synchronization could also end up taking a LOT longer. The current implementation allows several peers to provide the blocks. With your implementation, only one peer can provide a specific portion of blocks, making it a bottleneck.
I'm not sure of the protocol details, but that's relatively easy to solve. The client already finds the full nodes among the pruned and SPV ones. Also, current full nodes aren't always up to date with the latest block, e.g. during a startup resync. As for deciding what to store, maybe a probabilistic approach could work, or something fancier that involves probing the network.
It's not easy. How does a full node not being in sync relate to the problem? Nodes cannot see the entire network, and no one can connect to all of them. Unless you can somehow decide to centralise Bitcoin and have nodes connect to a central server, the distribution of blocks will be severely lopsided.
That's unlikely to be viable for long. Blocks are already mostly full today, and the network is congested. If SegWit takes hold it might reach 100GB/year or more within months (not sure if everyone needs to store the witness part?), and even that's not going to be enough for much longer if the network is to grow further.
Well, to be fair, not everyone has to run a full node. Since you're just solving the bandwidth part, you might as well just have a pruned node instead of going through the trouble.
member
Activity: 301
Merit: 74
Currently, full nodes verify and validate each and every block to ensure that they are valid and adhere to the protocol rules. If you were to store all the blocks separately, not every node would be verifying each of the blocks.
It could work like pruned mode: download everything but store only some of it. Or it could be selective downloading. In both cases, as is already done, the block headers can be downloaded and kept in full to provide some verification.

Quote
In addition, how would you know which nodes would store which "10%" of the blockchain?
I'm not sure of the protocol details, but that's relatively easy to solve. The client already finds the full nodes among the pruned and SPV ones. Also, current full nodes aren't always up to date with the latest block, e.g. during a startup resync. As for deciding what to store, maybe a probabilistic approach could work, or something fancier that involves probing the network.
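By a probabilistic approach I mean something as simple as this (names are made up, just to show the idea): each node keeps a deterministic pseudo-random ~10% slice, so different nodes cover different parts of the chain without any coordination.

Code:
import hashlib

KEEP_FRACTION = 0.10  # aim to keep roughly 10% of all blocks

def should_store(block_hash: bytes, node_seed: bytes) -> bool:
    # Hash the block together with a per-node seed; keep the block if the
    # result falls in the lowest 10% of the hash space. Different seeds
    # give different, roughly uniform slices across the network.
    digest = hashlib.sha256(node_seed + block_hash).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 < KEEP_FRACTION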

But I bet all of that was already considered in the past, at least tentatively. Maybe there are even some BIPs on the topic? Smiley

Quote
If the current growth remains the same, I'm estimating an additional ~52.6GB per year.
That's unlikely to be viable for long. Blocks are already mostly full today, and the network is congested. If SegWit takes hold it might reach 100GB/year or more within months (not sure if everyone needs to store the witness part?), and even that's not going to be enough for much longer if the network is to grow further.

legendary
Activity: 3038
Merit: 4418
Crypto Swap Exchange
The same could be said about how blocks are currently shared.

The only difference would be that instead of having 10,000 nodes with the full blockchain, you'd have 1,000 nodes for each 10% of it.
And as the network grows you'll likely have more nodes anyway, so it could be 10,000 nodes for each 10%.

But the UTXO set will require a more elaborate scheme.
Currently, full nodes verify and validate each and every block to ensure that they are valid and adhere to the protocol rules. If you were to store all the blocks separately, not every node would be verifying each of the blocks. This means that nodes will never know if the blocks are valid or not. It'll just be like an SPV client.

In addition, how would you know which nodes would store which "10%" of the blockchain?
I don't know. If the current growth continues, and the scaling problems are solved, perhaps a doubling each year is more likely. That works out to about 40TB in 8 years. Smiley But of course, the scaling solution might not involve having to store the full blockchain.
Normal users can just run a pruned node.

If the current growth remains the same, I'm estimating an additional ~52.6GB per year.
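For reference, that figure is just full 1MB blocks roughly every 10 minutes:

Code:
# ~52.6GB/year assuming full 1MB blocks every 10 minutes
blocks_per_year = 6 * 24 * 365            # 52,560 blocks
gb_per_year = blocks_per_year * 1 / 1000  # 1MB per block
print(f"~{gb_per_year:.1f} GB per year")  # ~52.6 GB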
member
Activity: 301
Merit: 74
nodes could share portions of the blockchain and unspent transactions
Sounds good, would never work. Bitcoin is built to be trustless.
The same could be said about how blocks are currently shared.

The only difference would be that instead of having 10,000 nodes with the full blockchain, you'd have 1,000 nodes for each 10% of it.
And as the network grows you'll likely have more nodes anyway, so it could be 10,000 nodes for each 10%.

But the UTXO set will require a more elaborate scheme.

looking at a blockchain no matter how you look at it under 1 Terabyte even in 8 more years time.
I don't know. If the current growth continues, and the scaling problems are solved, perhaps a doubling each year is more likely. That works out to about 40TB in 8 years. Smiley But of course, the scaling solution might not involve having to store the full blockchain.
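Rough arithmetic behind that (assuming the chain is around 150GB today, which is my guess, and doubles every year):

Code:
# "40TB in 8 years" ballpark: ~150GB today, doubling every year
size_gb = 150
for _ in range(8):
    size_gb *= 2
print(f"~{size_gb / 1000:.0f} TB after 8 years")  # ~38 TB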

newbie
Activity: 62
Merit: 0
Sounds good, would never work. Bitcoin is built to be trustless. Unless you can find an implementation in which nodes can store parts of the blockchain while keeping it verifiable, and distributed enough that a DoS attack on a specific node cannot hinder access to parts of the blockchain, it will never be possible.

So true. Any idea (with my limited understanding) to solve this involves placing more power/trust in a few systems, which would be against the vision of Bitcoin and might prove catastrophic if implemented.
member
Activity: 266
Merit: 42
The rising tide lifts all boats
nodes could share portions of the blockchain and unspent transactions
Sounds good, would never work. Bitcoin is built to be trustless. Unless you can find an implementation in which nodes can store parts of the blockchain while keeping it verifiable, and distributed enough that a DoS attack on a specific node cannot hinder access to parts of the blockchain, it will never be possible.
Exchanging portions of the UTXO set (the unspent output data set) with Merkle paths serving as proofs for them is perfectly possible, once some kind of UTXO commitment is added to the blocks (e.g. into the coinbase tx first).
Once that is done, even a miner could download the last few blocks, see their UTXO Merkle roots, then download the entire UTXO set from nodes and start mining.
There are something like 4 proposals and 3 Python proof-of-concept implementations for that.
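None of those proposals is reproduced here, but the proof check itself is an ordinary Merkle path verification against the committed root, roughly like this (hypothetical helper names, double-SHA256 assumed for the tree):

Code:
import hashlib

def dhash(b: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

def verify_utxo_proof(leaf: bytes, path, committed_root: bytes) -> bool:
    # Walk from a serialized UTXO entry up to the root committed in a block
    # (e.g. in the coinbase). path is a list of (sibling_hash, sibling_is_right).
    node = dhash(leaf)
    for sibling, sibling_is_right in path:
        node = dhash(node + sibling) if sibling_is_right else dhash(sibling + node)
    return node == committed_root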

After that, rather than the growing blockchain, we would have to deal with the growing UTXO set (OP_RETURN outputs, old amounts below the dust threshold that are too expensive to spend in fees, etc.).
legendary
Activity: 3038
Merit: 4418
Crypto Swap Exchange
Maybe I am reading the opposing arguments more, but isn't the small block size affecting transaction speeds?
The confirmation speed is certainly affected; the transaction propagation speed remains the same, at a few seconds.
nodes could share portions of the blockchain and unspent transactions
Sounds good, would never work. Bitcoin is built to be trustless. Unless you can find an implementation in which nodes can store parts of the blockchain while keeping it verifiable, and distributed enough that a DoS attack on a specific node cannot hinder access to parts of the blockchain, it will never be possible.
newbie
Activity: 1
Merit: 0
The blockchain will become larger, but I think there are some solutions for that.
newbie
Activity: 55
Merit: 0
I agree that storage capacity will outpace blockchain growth and remain affordable and efficient.
jr. member
Activity: 83
Merit: 1
nodes could share portions of the blockchain and unspent transactions
newbie
Activity: 62
Merit: 0
Maybe I am reading the opposing arguments more, but isn't the small block size affecting transaction speeds?
hero member
Activity: 938
Merit: 559
Did you see that ludicrous display last night?
This is the purpose of the block size limit.  It's intended to prevent full nodes from being impractical to run by limiting the pace at which the blockchain can grow.  So basically people have been thinking about it pretty intensely for a good 7 years now.

As for the fact that the blockchain inevitably continues to grow, typical costs for storage and bandwidth tend to gradually decrease alongside that.
Moore's law.
Moore's Law is about the number of transistors per square inch.  It doesn't affect storage or bandwidth.
legendary
Activity: 3038
Merit: 4418
Crypto Swap Exchange
I think this is what proof of stake (PoS) solves; instead of a computing-power battle,

you just prove you own a good amount of coin
No. PoS does not solve the storage problem, not even close.


The blockchain will continue to get larger and larger; that is inevitable. Not everyone needs a full node; in fact, I've mostly used an SPV client. If you'd like, you can prune the blockchain to save space, though that does not solve the bandwidth and time consumption.

The main challenge is to fit as many transactions as possible into the available space. Full nodes are important, but improvements in technology should already address the storage side directly.
member
Activity: 70
Merit: 10
Moore's law. We can assume Bitcoin's blockchain will be roughly double its current size in a few more years, and of course that will also be determined by block size changes/capacity, but no matter how you look at it we're essentially looking at a blockchain under 1 terabyte even in 8 more years. 1 terabyte in eight years should only cost roughly the price of a cheap little flash drive by then.

But that does not take into account bandwidth speed and costs. The larger the blockchain becomes, the harder it is for Bitcoin users to run a full node, opening the network up to more centralization.

I tried to run one at home this year, but every time I try I get discouraged by how large the blockchain is and how long it takes to download all of it.



That's true. A normal user does not get enough incentive to run a full node, even if storage becomes cheaper. That's why some people are against a block size that is too large.

Maybe it could also be resolved technically, e.g. by compacting early local blocks and keeping only compact hashes and the unspent transactions?
copper member
Activity: 81
Merit: 0
Look around you, nothing is secure
Being a Bitcoin/blockchain newbie, I am amazed by the technology, with one question coming to mind. As the number of transactions has skyrocketed, how will systems handle the growing ledger size? Will it be viable, say 5 years down the line, to run a full node? Huh I guess most senior members and early adopters will have thought about this.

I think this is what proof of stake (PoS) solves; instead of a computing-power battle,

you just prove you own a good amount of coin
legendary
Activity: 2898
Merit: 1823
Moore's law. We can assume Bitcoin's blockchain will be roughly double its current size in a few more years, and of course that will also be determined by block size changes/capacity, but no matter how you look at it we're essentially looking at a blockchain under 1 terabyte even in 8 more years. 1 terabyte in eight years should only cost roughly the price of a cheap little flash drive by then.

But that does not take into account bandwidth speed and costs. The larger the blockchain becomes, the harder it is for Bitcoin users to run a full node, opening the network up to more centralization.

I tried to run one at home this year, but every time I try I get discouraged by how large the blockchain is and how long it takes to download all of it.
