The blockchain is constantly growing, so cryptocurrency holders have to store a large amount of data on their computers. For this reason I believe that minimizing the blockchain is a matter of current interest. Many users do not want to download the entire multi-gigabyte blockchain, which has accumulated over the years of the cryptocurrency's existence, for the sole purpose of sending a few transactions securely. Therefore, I propose to publish here various methods for cutting old blocks from the blockchain.
Of course, the entire blockchain is still necessary for the full verification and validation of all transactions performed since the first (GENESIS) block was mined, as well as for the analysis of coin transfers from one address to another. As an alternative to storing the entire blockchain, you can use PRUNE mode (which still downloads every block once before discarding the old ones) or online blockchain explorers, but explorers add new vulnerabilities due to the possible inaccessibility of the website, phishing and hacking of DNS servers. What I am talking about here is alternative methods for storing the current state of cryptocurrency addresses on the client's computer.
The first method for Bitcoin
In Bitcoin, to spend coins in a transaction you must know the SHA256D hash of the previous transaction and the index number of the unspent output in that transaction. Therefore, in order to minimize the blockchain size, it is only necessary to store the data of all unspent transaction outputs (UTXO), together with the hashes of the transactions that contain them and the index number of each output within its transaction.
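As an illustration, here is a minimal Python sketch of the reference described above: the SHA256D hash function and the 36-byte pointer to an unspent output that a spending transaction carries. The function names are my own and are only illustrative.

```python
import hashlib
import struct


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, the hash Bitcoin uses for transaction and block ids."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def outpoint(prev_txid_hex: str, output_index: int) -> bytes:
    """Serialize the pointer to an unspent output.

    The transaction hash is usually displayed as reversed hex, so it is
    flipped back to its internal byte order here. The output index is
    stored as a 4-byte little-endian integer.
    """
    txid = bytes.fromhex(prev_txid_hex)[::-1]
    if len(txid) != 32:
        raise ValueError("a transaction hash must be 32 bytes")
    return txid + struct.pack("<I", output_index)
```

Everything else in an old transaction (its inputs, its already spent outputs) is not needed to build this pointer, which is why only the unspent outputs have to be kept.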
A sample dump contains three tables:
1. The list of Bitcoin transactions which contain at least one unspent output:
1) Structure of the transaction 1:
- SHA256 hash of the transaction (32 bytes)
- transaction parameters (lock_time, SegWit data, etc)
2) Structure of the transaction 2
...
The table is sorted in ascending order by the "SHA256 hash of the transaction" field, which has the little-endian byte order.
2. The list of unspent outputs (UTXO):
1) Structure of the output 1:
- output script
- amount of BTC coins in the output
- index number of the transaction (according to the list above)
- index number of the output in this transaction
2) Structure of the output 2
...
The table is sorted in ascending order, first by the "index number of the transaction" field and then by the "index number of the output" field. Both fields have the big-endian byte order.
3. The list of block headers:
1) Structure of the header 1:
- index number of the block in the blockchain
- block header
2) Structure of the header 2
...
The table is sorted in ascending order by the "index number of the block" field, which has the big-endian byte order. In effect, this table is a chain of block headers.
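Because every Bitcoin block header contains the hash of the previous header, a client can check that the header table really forms a chain. Below is a rough Python sketch of that check. The function name is hypothetical, and the sketch checks the linkage only, not the Proof-of-Work of each header.

```python
import hashlib
from typing import List

HEADER_SIZE = 80  # a serialized Bitcoin block header is 80 bytes long


def sha256d(data: bytes) -> bytes:
    """Double SHA-256 of the given bytes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def headers_are_linked(headers: List[bytes]) -> bool:
    """Return True if every header points to the hash of the one before it."""
    if any(len(h) != HEADER_SIZE for h in headers):
        return False
    for prev, cur in zip(headers, headers[1:]):
        # Bytes 4..36 of a header hold the hash of the previous block header.
        if cur[4:36] != sha256d(prev):
            return False
    return True
```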
In addition, the dump contains the following fields:
- dump version
- cryptocurrency name
The "dump version" field has the big-endian byte order.
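The first two tables and their sort order could be built roughly as in the Python sketch below. The class and function names are hypothetical. I also assume that "ascending" for the transaction table means ascending by the numeric value of the hash read as a little-endian integer, and that amounts are kept in satoshis. Neither detail is fixed by the description above.

```python
from dataclasses import dataclass


@dataclass
class TxRecord:
    txid: bytes    # 32-byte transaction hash, stored little-endian
    params: bytes  # lock_time, SegWit data, etc. (kept opaque in this sketch)


@dataclass
class UtxoRecord:
    script: bytes   # output script
    amount: int     # amount of the output (satoshis assumed)
    tx_index: int   # position of the transaction in the first table
    out_index: int  # index number of the output in that transaction


def build_tables(unspent, tx_params):
    """Build the transaction table and the UTXO table.

    unspent:   {(txid, out_index): (script, amount)}
    tx_params: {txid: params}
    """
    # Table 1: one row per transaction that still has an unspent output.
    txids = sorted({txid for txid, _ in unspent},
                   key=lambda h: int.from_bytes(h, "little"))
    position = {txid: i for i, txid in enumerate(txids)}
    txs = [TxRecord(t, tx_params[t]) for t in txids]

    # Table 2: outputs refer to table 1 by row number instead of by hash.
    utxos = [UtxoRecord(script, amount, position[txid], n)
             for (txid, n), (script, amount) in unspent.items()]
    utxos.sort(key=lambda u: (u.tx_index, u.out_index))
    return txs, utxos
```

Storing the row number instead of the 32-byte hash in every output is what keeps the second table small when one transaction has many unspent outputs.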
The essence of this method is as follows. At regular intervals, for example every month, a trusted person with an excellent reputation in the cryptocurrency community creates the dump of unspent transaction outputs described above and calculates the SHA256 hash of the file. Then he signs a standardized comment, which contains the size and the hash of the created file, with his ECDSA secp256k1 key and publishes the dump on some well-known website. Users download the file, verify the digital signature and load the dump into a special version of Bitcoin Core. Afterwards they download only the new blocks, starting from the last block recorded in the dump.
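The user-side check of the file against the signed comment might look like the Python sketch below. The comment layout ("UTXO-DUMP size=... sha256=...") is purely my assumption, since no format has been standardized. Verification of the ECDSA secp256k1 signature on the comment itself needs a separate cryptographic library and is not shown here.

```python
import hashlib

TAG = "UTXO-DUMP"  # hypothetical marker at the start of the comment


def describe_dump(data: bytes) -> str:
    """Build the comment that the publisher would sign."""
    digest = hashlib.sha256(data).hexdigest()
    return f"{TAG} size={len(data)} sha256={digest}"


def dump_matches_comment(data: bytes, comment: str) -> bool:
    """Check a downloaded dump against the size and hash in the comment."""
    parts = comment.split()
    if len(parts) != 3 or parts[0] != TAG:
        return False
    # Turn "key=value" items into a dictionary.
    fields = dict(p.partition("=")[::2] for p in parts[1:])
    return (fields.get("size") == str(len(data))
            and fields.get("sha256") == hashlib.sha256(data).hexdigest())
```

A client would call dump_matches_comment only after the signature on the comment has been verified, and would refuse to load the dump if either check fails.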
Thus, users will not need to download the entire blockchain to their computers, and the correctness of the data will rest on trust in the person who created the list of unspent transaction outputs and signed it with his ECDSA key. Moreover, network synchronization will take minimal time.
The second method for Bitcoin
The scheme is the same, except that at a fixed interval, for example every 4320 blocks (i.e. approximately every month), Bitcoin miners add a standardized comment to the input script of the COINBASE transaction. This comment contains the size and the SHA256 hash of a file which holds the list of unspent transaction outputs (UTXO) and is published on some well-known website. By including it, the miner confirms that the published list is correct and can be safely used by users on their computers. If other miners do not agree that all the data in this file is correct, they will not continue to mine this branch, so the mined block will become an orphan.
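On the miner's side, the comment could be encoded roughly as in this Python sketch. The 4-byte "UTXO" marker and the field layout are my assumptions, not an existing standard. The result is 45 bytes with the push opcode, which fits into the 100 bytes allowed for the input script of the COINBASE transaction.

```python
import hashlib
import struct

COMMIT_INTERVAL = 4320  # blocks between commitments (about 30 days at 10 minutes per block)
TAG = b"UTXO"           # hypothetical 4-byte marker


def is_commitment_height(height: int) -> bool:
    """True for the blocks that are expected to carry the comment."""
    return height > 0 and height % COMMIT_INTERVAL == 0


def build_commitment(dump: bytes) -> bytes:
    """Marker + 8-byte big-endian file size + 32-byte SHA256 (44 bytes)."""
    return TAG + struct.pack(">Q", len(dump)) + hashlib.sha256(dump).digest()


def push_data(payload: bytes) -> bytes:
    """Script push for 1..75 bytes: one length byte followed by the data."""
    if not 1 <= len(payload) <= 75:
        raise ValueError("direct push supports 1..75 bytes only")
    return bytes([len(payload)]) + payload
```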
In this case, users who have downloaded and verified the list of unspent transaction outputs rely on the amount of Proof-of-Work performed after the miner confirmed that the data in this file is correct.
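The amount of Proof-of-Work behind a confirmation can be estimated from the "bits" field of the headers mined on top of the block with the comment. The Python sketch below uses the usual formula, work = 2^256 / (target + 1). The function names are my own, and the sign and overflow checks of the compact format are left out for brevity.

```python
def target_from_bits(bits: int) -> int:
    """Expand the compact "bits" field of a block header into the full target."""
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))


def block_work(bits: int) -> int:
    """Expected number of hash attempts needed to find one such block."""
    return (1 << 256) // (target_from_bits(bits) + 1)


def work_since(bits_of_later_blocks) -> int:
    """Total work of the blocks mined on top of the block with the comment."""
    return sum(block_work(b) for b in bits_of_later_blocks)
```

A user could, for example, accept a dump only when work_since for the headers following the comment exceeds a threshold of his own choice.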
Although the second method looks more convincing, it is practically unrealizable in Bitcoin, because all Bitcoin miners would have to agree on a change to the consensus rules. In the current situation miners are indifferent to the text written in the input of the COINBASE transaction, so they will mine on any valid branch and will not check the correctness of the published file which contains the list of unspent outputs. Therefore, in my opinion, the first method is preferable for Bitcoin.
However, the second method can be implemented in another cryptocurrency whose consensus rules require miners to check the size and the hash of the dump of unspent outputs written in the input script of the COINBASE transaction. In addition, a full node will not need to download the published dump, because it can compose such a file itself using the algorithm described above.
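In such a cryptocurrency the rule for full nodes could be sketched in Python as follows. The node builds the dump from its own data and rejects a block whose comment does not match. The marker, the layout (4-byte marker, 8-byte big-endian size, 32-byte SHA256) and the search for the marker inside the script are all simplifying assumptions for illustration.

```python
import hashlib
import struct

COMMIT_INTERVAL = 4320  # blocks between required comments
TAG = b"UTXO"           # hypothetical 4-byte marker


def find_commitment(coinbase_script: bytes):
    """Return (size, digest) from the first tagged comment, or None."""
    pos = coinbase_script.find(TAG)
    if pos < 0 or len(coinbase_script) < pos + 44:
        return None
    body = coinbase_script[pos + 4:pos + 44]
    (size,) = struct.unpack(">Q", body[:8])
    return size, body[8:]


def block_commitment_ok(height: int, coinbase_script: bytes,
                        local_dump: bytes) -> bool:
    """Consensus check: the comment must match the node's own dump."""
    if height == 0 or height % COMMIT_INTERVAL != 0:
        return True  # no comment is required at this height
    expected = (len(local_dump), hashlib.sha256(local_dump).digest())
    return find_commitment(coinbase_script) == expected
```

For this to work, every node must produce a byte-for-byte identical file, which is why the tables in the dump need a strictly defined sort order.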
If you wish, propose here other methods for cutting old blocks from the blockchain, with a full description of your algorithm, and not only for Bitcoin. I will edit the second post and add links to your ideas and solutions.