
Topic: Proposal for reduction of storage space on all clients (Read 1588 times)

full member
Activity: 168
Merit: 100
Here's another proposal (similar to the first link provided) for a "ledger". I think this guy actually went so far as to implement it, and it seemed to be very well thought out.
https://bitcointalksearch.org/topic/a-proposal-for-a-scalable-blockchain-52859

I think a major point that's missed about BTC is that, for all intents and purposes the only thing it's really doing is keeping track of account balances. All of the fluff and noise on top of that makes up the rules of how they are managed, but most of the work being done is extremely superfluous. Granted, some of that is on purpose in order to allow for various kinds of "contracts", but for the average tx all of the extra features just end up bloating the blockchain to a soon-to-be-unacceptable size.

One thing I noticed after digging through the protocol details is that the "balance" of any BTC address, rather than being stored as a single current value, is made up of every unspent tx output that was ever sent to that address. As an example, say I have a donation account and 100 people send me 0.01 BTC each. My donation account now has a balance of 1.00 BTC, but in order to spend it I have to make a tx with all 100 of the 0.01 BTC donations as inputs. As fees and blockchain size are directly related to the number of txIns and txOuts, half of my donation fund might end up in fees, and everyone who has a copy of the blockchain must now store a block containing a tx with 100 txIns, even though the total balance of my account was a single value of 1.00 BTC. If standard tx were highly simplified, it would save a whole lot of disk space and a whole lot of trouble.
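To put rough numbers on the donation example, here is a minimal sketch in Python. The per-input, per-output and overhead byte counts are commonly quoted approximations for simple pay-to-address transactions, not protocol constants, and `estimated_tx_size` is a made-up helper name used only for illustration.

```python
# Rough size estimate for spending many small outputs at once.
# The byte counts below are assumptions (typical approximate figures),
# not values taken from the protocol or from this thread.
INPUT_BYTES = 148     # assumed: previous output reference + signature + public key
OUTPUT_BYTES = 34     # assumed: value + simple pay-to-address script
OVERHEAD_BYTES = 10   # assumed: version, input/output counts, lock time


def estimated_tx_size(n_inputs, n_outputs):
    """Approximate serialized size in bytes of a simple transaction."""
    return OVERHEAD_BYTES + n_inputs * INPUT_BYTES + n_outputs * OUTPUT_BYTES


# 100 donations of 0.01 BTC each, expressed in satoshis (1 BTC = 100,000,000).
donations = [1_000_000] * 100
balance = sum(donations)

# Spending the whole balance needs one input per donation.
many_inputs_size = estimated_tx_size(len(donations), 1)

# If the same balance were held as a single value, one input would do.
single_input_size = estimated_tx_size(1, 1)

print(balance, many_inputs_size, single_input_size)
```

Under these assumed sizes, the 100-input spend is dozens of times larger than a single-input spend of the same 1.00 BTC, and every full copy of the blockchain has to carry that difference.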

Another thing that could be done, is that for p2pooling, rather than storing the whole blockchain each miner could store the latest ~4032 blocks (28 days), plus a random 4032 block "period" from the last 13 periods of 28 days (ie 364 days of blockchain total). When the extra period a miner is holding expires, they simply replace it with the latest 28 day period that would otherwise be discarded. Along with the ledger system, that would allow miners to run on less than 1/4 the size of the current/total blockchain, and they could still seed it to any clients requesting it torrent style. Really only a year of blockchain is needed for security reasons. Older sections only have historical value for those wanting to prove that the initial block rewards really are what the wiki says they are. As per the link above, a client would only need to store ~2016 blocks and a ledger, which amounts to even less total size.
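One way to read the storage rule above, as a sketch only: number the 4032-block periods from the genesis block, always keep the most recent 4032 blocks, and additionally keep one randomly chosen period out of the 13 completed periods before the current one. The function names and the exact period boundaries are my own assumptions, not part of any client.

```python
import random

BLOCKS_PER_PERIOD = 4032   # about 28 days at one block per 10 minutes
ARCHIVE_PERIODS = 13       # 13 periods of 28 days = 364 days


def period_of(height):
    """Index of the 4032-block period that contains this block height."""
    return height // BLOCKS_PER_PERIOD


def pick_archive_period(tip_height, rng):
    """Pick one completed period from the 13 periods before the current one."""
    current = period_of(tip_height)
    oldest = max(0, current - ARCHIVE_PERIODS)
    candidates = list(range(oldest, current))
    return rng.choice(candidates) if candidates else None


def is_stored(height, tip_height, archive_period):
    """True if a miner following this rule would hold the block at `height`."""
    is_recent = 0 <= tip_height - height < BLOCKS_PER_PERIOD
    return is_recent or period_of(height) == archive_period


rng = random.Random(0)          # seeded only to make the sketch repeatable
tip = 200_000                   # hypothetical current block height
archive = pick_archive_period(tip, rng)
print(archive, is_stored(tip, tip, archive), is_stored(0, tip, archive))
```

With enough miners each picking their extra period at random, every period in the last year should be held by someone, which is what would let them seed it torrent style.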

As for the "government attack", that can only be avoided by improvements to the protocol itself and tightening of the blockchain validity criteria. For example, replacing BTC's highly subjective "network time" with NTP, perhaps by merging with NTPool, should allow latency-adjusted network time to be kept accurate to within less than 1 second. With some heuristics to prevent attackers from easily inventing the current time, along with block timestamp restrictions of maybe 10 seconds, timejacking and block hoarding attacks would be nearly impossible, and certainly impractical, since the window for double spending would be only seconds.

Above and beyond timekeeping, Meni's Proof of Stake system allows for checkpointing to be done often such that blockchain reorgs have a maximum depth that can be kept to ~24h. It would probably be very difficult to implement from the current protocol, but it would make the whole network very "sticky" to the current most-accepted blockchain, giving even the most effective attacks a very limited ability to do damage. At most the network would be disorganized for the period of the attack, but older historical parts of the blockchain and their associated tx would remain firmly in place.

If some attacker were to spam a fake blockchain, on the other hand, it would only affect new users or machines where the BC had not been downloaded yet. Unless said attacker were prepared to completely blind said user to the internet at large, there would be some noticeable problems for the client in using the BTC network. For example, if the blockchain they downloaded was fake, and then they go to receive coins from a legit part of the network, the blockchain info they had would not match the transaction received to their account, and they would see it as invalid. Obviously, the only way around that sort of attack is to download the blockchain from an alternate source. Given the relative ease of supplying a reliable copy of the blockchain via various methods, and the difficulty of censoring all of them, the payoff vs resources required for such an attack would be mediocre at best regardless of the intent.
hero member
Activity: 1596
Merit: 502
If the clients drop 99% of the blocks and only keep the headers but ask for a block if they need it, how can anybody send a fake block?
If you change even 1 bit of any transaction, the block's transactions would hash to a different Merkle root and so would not match the header.
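A small sketch of that check, in Python. It builds a Merkle root with double SHA-256 and pairs hashes level by level, duplicating the last hash when a level has an odd count, which is how Bitcoin builds its tree. The transaction bytes are placeholders rather than real serialized transactions, and the byte-order details of real block headers are left out.

```python
import hashlib


def dsha256(data):
    """Double SHA-256, the hash used for Bitcoin's Merkle tree."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(txs):
    """Merkle root over a non-empty list of raw transaction byte strings."""
    if not txs:
        raise ValueError("a block has at least one transaction")
    level = [dsha256(tx) for tx in txs]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])   # odd count: pair the last hash with itself
        level = [dsha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


# Placeholder transactions (assumption: real ones would be serialized txs).
txs = [b"tx-a", b"tx-b", b"tx-c"]
root_in_header = merkle_root(txs)

# Flip a single bit in one transaction and recompute.
tampered = list(txs)
tampered[1] = bytes([tampered[1][0] ^ 0x01]) + tampered[1][1:]
tampered_root = merkle_root(tampered)

print(root_in_header.hex())
print(tampered_root.hex())
print(tampered_root == root_in_header)
```

A client that kept only the header can recompute the root from whatever block it is handed and reject the block if the two do not match.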
hero member
Activity: 755
Merit: 515
Matt Corallo, what if a malicious org or a government decides to build up dark capacity and suddenly has 60% of the network? Could they swap in a fake history and destroy the network (every government will plan this out, but maybe never execute it)? Normal operations are not the problem, I am thinking about large-scale malicious intent.

Put more clearly, we need to be able to recover from a large "wrong" chain of blocks in case of disaster. Say a buggy client is released or a government decides to not like bitcoin anymore. The buggy client or the government would potentially be in control for a week or so, outputting thousands of blocks. Could we recover with a pruned history? That's the question.
That is why fClient exists and thin clients don't sit around communicating with other thin clients. They talk to the full clients.
member
Activity: 91
Merit: 10
ByteCoin, thanks.

Matt Corallo, what if a malicious org or a government decides to build up dark capacity and suddenly has 60% of the network? Could they swap in a fake history and destroy the network (every government will plan this out, but maybe never execute it)? Normal operations are not the problem, I am thinking about large-scale malicious intent.

Put more clearly, we need to be able to recover from a large "wrong" chain of blocks in case of disaster. Say a buggy client is released or a government decides to not like bitcoin anymore. The buggy client or the government would potentially be in control for a week or so, outputting thousands of blocks. Could we recover with a pruned history? That's the question.
hero member
Activity: 755
Merit: 515
I think it is discomforting to place the whole history and therefore the whole trust into the hands of a small percentage of all bitcoin users. I don't have a solution unfortunately.
This has been discussed. It is really not as big a problem as you think, as anyone *can* download it if they wish.

In any case, as ByteCoin points out, there have been many discussions on this topic, no need to start a new thread.
http://forum.bitcoin.org/index.php?topic=10663.0 is one more realistic.
sr. member
Activity: 416
Merit: 277
I propose to insert into every 1000th block N a summary ...

Your proposal is similar to one presented in http://forum.bitcoin.org/index.php?topic=505.0 which outlines a scheme for having to store as little information as possible.

ByteCoin
member
Activity: 91
Merit: 10
I think it is discomforting to place the whole history and therefore the whole trust into the hands of a small percentage of all bitcoin users. I don't have a solution unfortunately.
hero member
Activity: 755
Merit: 515
Miners need to have a full (purged) block chain. They could maybe remove block headers after a certain age and just hold transactions, but that isn't nearly worth it, as block headers are very small.
End clients don't have to bother with any blocks except for recent ones. They can (and will) just search through new blocks for their transactions and then store those.
legendary
Activity: 1106
Merit: 1004
Somebody once suggested that only those concerned by a block would need to keep it entirely, meaning, those who have money on inputs of that block. Miners would only have to keep the headers and maybe the most recent blocks as cache.

I'm not sure if that could work easily with the current implementation, as I don't know if it is possible to ask the content of a block backwards until the request reaches the sender of a transaction. But with some improvements maybe that could be done.
member
Activity: 91
Merit: 10
Do miners really need the full chain? The summary would imho be enough. A few years from now, the size of the block chain will make it a problem to fit on any disk.
hero member
Activity: 755
Merit: 515
You can already prune transactions in blocks, which could provide a fairly similar gain. Also, thin clients. Why go halfway for every user, when you can go all the way for 90% of users and not for the miners, who need the full chain? Also, breaking backward compatibility is a bad idea.
member
Activity: 91
Merit: 10
When transaction load increases we need a system that allows most clients to function without all transaction history. The system must also still be attack-proof, so we cannot offload history onto a small number of supernodes, as they might come under the control of an attacker. History must be spread across all clients randomly.

I propose to insert into every 1000th block N a summary of all balance differences that resulted from transactions in the block range [N-20000, N-10000). Clients only need history to ensure that current transactions have enough balance on their inputs. We can store that balance data as a cache inside of the network. The important bit is that this data is validated through the block chain so clients can trust it (they would not be able to trust it if it was stored inside of a DHT in the P2P network). Only if clients can trust this cache can they skip downloading, verifying and storing the blocks in [N-20000, N-10000).

In order to still preserve all information in the network clients still download 1-10% of the ranges they don't actually need (amount depending on available disk space). That way we save 90-99% of storage space while preserving all data.
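To make the summary idea concrete, here is a rough sketch in Python. It models transactions as simple (sender, receiver, amount) transfers between accounts, which is a simplification of how Bitcoin transactions actually work, and all names in it are made up for illustration.

```python
from collections import defaultdict

SUMMARY_INTERVAL = 1000   # a summary goes into every 1000th block


def summary_range(n):
    """Half-open block range [N-20000, N-10000) covered by the summary in block N."""
    if n % SUMMARY_INTERVAL != 0:
        raise ValueError("summaries only go into every 1000th block")
    return max(0, n - 20000), max(0, n - 10000)


def balance_deltas(blocks, start, end):
    """Net balance change per account over block heights in [start, end).

    `blocks` maps height -> list of (sender, receiver, amount) tuples.
    A sender of None stands for newly generated coins (assumption of this model).
    """
    deltas = defaultdict(int)
    for height in range(start, end):
        for sender, receiver, amount in blocks.get(height, []):
            if sender is not None:
                deltas[sender] -= amount
            deltas[receiver] += amount
    return dict(deltas)


# Tiny hypothetical history: only three blocks contain transactions.
blocks = {
    10000: [(None, "a", 50)],
    15000: [("a", "b", 20)],
    20000: [("b", "c", 5)],    # outside the range summarized by block 30000
}

start, end = summary_range(30000)
summary = balance_deltas(blocks, start, end)
print(start, end, summary)
```

A client that trusts the summary in block 30000 would apply these deltas to its cached balances instead of fetching blocks 10000 through 19999.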