Posted the following to James A. Donald's blog (he was the first person to respond to Satoshi on the cryptography mailing list where Bitcoin was first announced).
Purging old records has the benefit of eliminating lost (or abandoned, e.g. through death) wallets, making the money supply more quantifiable. Owners can send new transactions to themselves to update their timestamps. Purging will not hold the block chain size constant, though, because the rate of transactions is growing (probably exponentially).
Although the block chain size is currently only ~8 GB (up from ~2 GB a year ago) and thus still fits easily on the 4 TB hard disks available to and affordable by the consumer market, it will not only eventually outpace Moore's Law applied to hard-disk space, but it is already too large for many consumer internet connections to download in any quick-start scenario. If non-hosted ISP connections provide 0.1 - 1 GB per 10 minutes, then (assuming a resumable download manager for dropped connections) 8 GB is a 1-hour to 1-day download; at 4 TB it becomes roughly a one-month to one-year download. Note that a mining peer could begin processing before downloading the entire blockchain, if it is downloaded from newest to oldest and all the transactions in the current block come from blocks already downloaded.
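As a back-of-envelope check (assuming the 0.1 - 1 GB per 10 minutes figure above; an illustrative sketch, not a measurement):

    # Back-of-envelope download times at the assumed consumer bandwidth
    # of 0.1 - 1 GB per 10 minutes (resumable download assumed).
    def download_days(size_gb, gb_per_10min):
        return size_gb / gb_per_10min * 10 / 60 / 24

    for size_gb in (8, 4000):
        fast, slow = download_days(size_gb, 1.0), download_days(size_gb, 0.1)
        print(f"{size_gb} GB: {fast:.2f} to {slow:.1f} days")
    # -> 8 GB: 0.06 to 0.6 days;  4000 GB: ~28 to ~278 days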
At Visa scale of 16 million transactions per 10-minute block, the blockchain would be growing at roughly 160 GB per day, or nearly 60 TB per year, at the per-transaction sizes assumed further below. However, some of this can be reduced by pruning the blockchain of private keys that have been entirely spent (and possibly also records beyond a certain age).
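The growth estimate spelled out, with the per-transaction size as an explicit assumption (the ~70 bytes per transaction used later in this post):

    # Rough blockchain growth at Visa scale, assuming ~70 bytes stored per
    # transaction (the 20-byte hash + 50-byte transaction used further below).
    tx_per_block = 16_000_000          # ~Visa scale per 10-minute block
    bytes_per_tx = 70                  # assumption
    blocks_per_day = 6 * 24

    per_block_gb = tx_per_block * bytes_per_tx / 1e9
    per_day_gb = per_block_gb * blocks_per_day
    per_year_tb = per_day_gb * 365 / 1000
    print(per_block_gb, per_day_gb, per_year_tb)   # ~1.1 GB, ~160 GB, ~59 TB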
I propose that although we need to broadcast the transactions, the blockchain should only need to store the balances of the private keys (perhaps after the current 100-block maturity cycle, to account for resolution of competing forks). There would be two Proofs-of-Work, i.e. two parallel blockchains: one containing the transaction data and the other only the private keys with updated balances, with the former provided first and then all peers competing to provide the latter. The reward would be split in half, and the difficulty for both blockchains would be set so each averages completion every 10 minutes. Or the latter blockchain could be a digest of, say, every 10 to 100 blocks, with its difficulty adjusted so it completes every 100 to 1000 minutes.
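A minimal sketch of what a block on the second (balance) chain might hold; the field names, the JSON serialization, and SHA-256 here are illustrative assumptions, not a specification:

    import hashlib, json

    def balance_block(prev_hash: str, balances: dict) -> dict:
        """Hypothetical block on the balance chain: a snapshot of key -> balance
        (taken after the maturity window), chained by hashing the previous block."""
        body = json.dumps({"prev": prev_hash, "balances": sorted(balances.items())},
                          separators=(",", ":"))
        return {"prev": prev_hash,
                "balances": balances,
                "hash": hashlib.sha256(body.encode()).hexdigest()}

    genesis = balance_block("00" * 32, {"key_a": 50, "key_b": 12})
    print(genesis["hash"])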
If the number of private keys in existence could be limited (by an automated free-market algorithm in the protocol that raised the price of new private keys while giving a simultaneous credit for spending all of, and thus deleting, a private key), then the size of the blockchain could be limited. Four billion private keys, each with a 4-byte balance, would require roughly 100 GB, thus a 12-hour to 12-day download. With perhaps 100 million Bitcoin users at most over the next several years, that is 40 private keys each. By the time the entire human population needs to use Bitcoin, the bandwidth of ISPs will probably have increased an order of magnitude, so the limit can be raised by up to an order of magnitude.
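The sizing, with the record layout stated as an assumption (a 20-byte key hash plus the 4-byte balance per record):

    # Ledger size with a capped number of keys, assuming each record is a
    # 20-byte key hash plus the 4-byte balance (an assumption, not a spec).
    num_keys = 4_000_000_000
    record_bytes = 20 + 4
    print(num_keys * record_bytes / 1e9)        # ~96 GB, i.e. roughly 100 GB
    print(num_keys / 100_000_000)               # 40 keys per user at 100M users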
For many reasons, including that mining is the only way to obtain Bitcoins truly anonymously, we don't want mining to be limited to only those with certain resources (and especially we don't want to exclude normal ISP accounts!).
Every mining peer has to have the evidence that supports a transaction, else peers could disagree on consensus about new blocks and forks could appear (see my conclusion that alternatives to Proof-of-Work must centralize to obtain consensus).
Assume the blockchain is partitioned into N sections, where each mining peer only has to hold one section, determined from its private key by partitioning the private-key space into N sections.
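A sketch of the partition rule; mapping a key to a section by interpreting its hash as an integer is my assumption about how the key space would be split:

    import hashlib

    def section_of(key_hash: bytes, n_sections: int) -> int:
        """Map a key to one of N sections by dividing the key space evenly
        (an illustrative assumption about the partitioning rule)."""
        as_int = int.from_bytes(key_hash, "big")
        key_space = 2 ** (8 * len(key_hash))
        return as_int * n_sections // key_space

    # A mining peer would store only the section its own key falls in:
    my_key_hash = hashlib.sha256(b"my public key").digest()[:20]
    print(section_of(my_key_hash, 16))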
If the blockchain evidence for each transaction is not sent to every mining peer, then transactions take a factor of N more time to be added to the blockchain (each must wait for the Proof-of-Work to be won by a mining peer that holds the section recording the sender's balance), and forks can appear, because on average (N-1)/N of the mining peers won't be able to verify (N-1)/N of the transactions in the current block before starting Proof-of-Work on the next block.
So if the blockchain is N-partitioned, the only viable design is for the evidence to be sent to all mining peers for each transaction. This increases the bandwidth required by Proof-of-Work while reducing the bandwidth required for new peers to download the entire blockchain: the number of peers that will request the evidence is N-1, and the size of the blockchain a new peer has to download is total/N.
I believe Jim is correct that the only evidence that needs to be sent is the branch of the Merkle tree within the block, up to the block hash. All mining peers would keep a complete history of block headers, since these are only 80 bytes x 6/hr x 24 hr x 365 = ~4 MB per year.
The Merkle tree is a perfectly balanced binary tree, so its depth is log2(T), where T is the number of transactions in a block. Thus the number of nodes (with 2 hashes of evidence per node) from the block hash down the tree is log2(T) - 1, and the Merkle-branch evidence bandwidth required in the limit N -> infinity is T_current x ((log2(T_old) - 1) x 2 x hashsize + transactionsize/2). Note this is in addition to the data for the current block, which is T_current x (hashsize + transactionsize) - hashsize.
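A minimal sketch of building and verifying such a branch, using SHA-256 truncated to 20 bytes as the assumed hash; this standard form carries one sibling hash per level, so the two-hashes-per-node count above is a safe upper bound:

    import hashlib

    def h(data: bytes) -> bytes:
        return hashlib.sha256(data).digest()[:20]     # 20-byte hash, as assumed below

    def merkle_branch(leaves, index):
        """Return (root, branch): one sibling hash per level, leaf to root."""
        branch, level = [], [h(x) for x in leaves]
        while len(level) > 1:
            if len(level) % 2:                        # duplicate last hash on odd levels
                level.append(level[-1])
            branch.append(level[index ^ 1])
            level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
            index //= 2
        return level[0], branch

    def verify(leaf, index, branch, root):
        node = h(leaf)
        for sibling in branch:
            node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
            index //= 2
        return node == root

    txs = [b"tx%d" % i for i in range(8)]
    root, branch = merkle_branch(txs, 5)
    print(len(branch), verify(txs[5], 5, branch, root))   # 3 True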
Visa scale is ~16 million transactions per 10 minutes. If hashsize is 20 bytes (instead of the current 32 bytes) and transactionsize is 50 bytes, then for ~16 million transactions per block, the 1.1 GB of block data grows to 15.8 GB per 10-minute block.
Non-hosted ISP connections are limited to an order of magnitude of 100 MB to 1 GB of bandwidth per 10 minutes, which equates to 1.4 - 14.3 million transactions per 10-minute block with Bitcoin's non-partitioned blockchain, or 118,000 to 1,046,000 transactions per 10-minute block with the partitioned blockchain proposed here.
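A sketch of the arithmetic behind those figures, evaluating the evidence formula above with the assumed hash and transaction sizes (the partitioned-case throughput additionally requires solving for T, since the branch depth depends on it):

    import math

    HASH, TX = 20, 50                      # assumed sizes in bytes, as above

    def bytes_per_block(t_current, t_old, partitioned):
        """Per-block bandwidth: current block data, plus Merkle-branch evidence
        from old blocks when the chain is partitioned (N -> infinity formula)."""
        block_data = t_current * (HASH + TX) - HASH
        if not partitioned:
            return block_data
        evidence = t_current * ((math.log2(t_old) - 1) * 2 * HASH + TX / 2)
        return block_data + evidence

    t = 16_000_000                         # ~Visa scale per 10-minute block
    print(bytes_per_block(t, t, False) / 1e9)   # ~1.1 GB of block data
    print(bytes_per_block(t, t, True) / 1e9)    # ~16 GB with evidence included

    # Transactions a connection can keep up with, non-partitioned case:
    for budget in (100e6, 1e9):                 # 100 MB or 1 GB per 10 minutes
        print(budget / (HASH + TX) / 1e6)       # ~1.4 and ~14.3 million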
Thus I conclude that the only way to scale to Visa scale while retaining freedom of mining for all (and thus anonymity for all) is to limit the number of private keys as I proposed above. This also has the advantages of keeping the required bandwidth, and thus the impact of unreliable connection hiccups, lower, and of discarding the history of transaction graphs, which increases anonymity w.r.t. private-sector attacks (although the NSA has the zettabyte-scale storage resources to retain the transaction graphs even at Visa scale).
Does anyone see a problem with that proposal?