The network has just started and the size of the folder with blocks is already 2.64 GB.
This is a lot.
Will there be any work to optimize the size of stored blocks?
Hi, I am from the Kaspa research team.
All in all, we store three components: full header data above the pruning block, the UTXO set of the pruning block, and a proof of correctness for the UTXO set.
We have a fancy pruning mechanism (cf. https://research.kas.pa/t/some-of-the-intuition-behind-the-design-of-the-invalidation-rules-for-pruning/95) that allows us to remove old block data. At full capacity the size of a block payload is bounded by 100kB and the size of a block header is bounded by (100 + 32*log_2(past size))B. In the distant future where we have a trillion blocks in the network (this will take about 30 thousand years at one block per second) we will have log_2(past size) = 40, so let us assume that log_2(past size) <= 40. This means that the header size is bounded by (100 + 32*40)B, which is just below 1.5kB. For simplicity, assume for now that the entire block size is 100kB. We store three days' worth of full block data which, at a rate of one block per second, accumulates to about 26GB (note that this bound assumes that all blocks are at maximum capacity; it makes no assumptions about the average number of txns per block).
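For anyone who wants to check the arithmetic above, here is a minimal Python sketch. The constants are the ones quoted in this post; the function and variable names are made up for illustration and do not come from the node's code. Rounding the logarithm up is my assumption, so that the result stays an upper bound.

```python
import math

# Figures quoted in the post (upper bounds at full capacity).
MAX_BLOCK_BYTES = 100_000       # simplifying assumption: a whole block is 100 kB
HEADER_BASE_BYTES = 100         # constant part of the header bound
BYTES_PER_LOG_UNIT = 32         # 32 bytes per unit of log_2(past size)
BLOCKS_PER_SECOND = 1
RETENTION_DAYS = 3
SECONDS_PER_DAY = 24 * 60 * 60


def header_bound_bytes(past_size: int) -> int:
    """Upper bound on the header size of a block with `past_size` blocks in its past."""
    # Assumption: round the logarithm up so the bound stays conservative.
    return HEADER_BASE_BYTES + BYTES_PER_LOG_UNIT * math.ceil(math.log2(past_size))


def retained_block_data_bytes(block_bytes: int = MAX_BLOCK_BYTES) -> int:
    """Worst-case size of the full block data kept above the pruning block."""
    blocks = RETENTION_DAYS * SECONDS_PER_DAY * BLOCKS_PER_SECOND
    return blocks * block_bytes


def years_to_reach(blocks: int) -> float:
    """Years needed to produce `blocks` blocks at the assumed block rate."""
    return blocks / (BLOCKS_PER_SECOND * SECONDS_PER_DAY * 365)


if __name__ == "__main__":
    trillion = 10**12
    print("years to a trillion blocks:", round(years_to_reach(trillion)))
    print("header bound in bytes:", header_bound_bytes(trillion))
    print("three days of full blocks in GB:", retained_block_data_bytes() / 1e9)
```

With a trillion blocks the header bound comes out at 1380 bytes, and three days of full blocks at 25,920,000,000 bytes, which is where the "just below 1.5kB" and "about 26GB" figures come from.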
The UTXO correctness proof (cf. https://github.com/kaspanet/research/issues/3) requires that we keep an additional log_2(number of blocks in the network) headers (not full blocks). Using again the assumption log_2(past size) <= 40, this adds about 60kB of data, which is completely negligible. Currently we store all block headers, because it requires some care to remove them without accidentally removing headers required for the proof, and our dev team hasn't gotten around to this yet. This is a purely technical issue which will be resolved in the near future. (There is another detail I swept under the rug, which is that we also have to store the headers of all pruning blocks. This means one header per day. While this technically grows at a rate of O(n*log n), the constant is ridiculously small: it is bounded by 1.5kB/day, which is about 550kB a year.)
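The header overhead can be checked the same way. This is only a back-of-the-envelope sketch using the rounded 1.5kB header bound and the log_2(past size) <= 40 assumption from above; the names are illustrative, not taken from the node.

```python
# Rounded-up figures from the post.
HEADER_BOUND_BYTES = 1_500   # header bound, rounded up from (100 + 32*40) B
LOG2_PAST_SIZE_BOUND = 40    # assumed upper bound on log_2(past size)
PRUNING_BLOCKS_PER_DAY = 1   # one pruning block header is kept per day


def proof_headers_bytes() -> int:
    """Extra headers kept for the UTXO correctness proof."""
    return LOG2_PAST_SIZE_BOUND * HEADER_BOUND_BYTES


def pruning_headers_bytes_per_year(days: int = 365) -> int:
    """Yearly growth from storing the header of every pruning block."""
    return days * PRUNING_BLOCKS_PER_DAY * HEADER_BOUND_BYTES


if __name__ == "__main__":
    print("proof headers in bytes:", proof_headers_bytes())
    print("pruning headers per year in bytes:", pruning_headers_bytes_per_year())
```

That gives 60,000 bytes for the proof headers and 547,500 bytes per year for the pruning block headers, both negligible next to the roughly 26GB of full block data.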
The only thing that grows linearly is the pruning block UTXO set itself. It currently requires a field of a fixed size for every unspent output in the network. It is hard to predict how fast this set grows, as this heavily depends on user behavior. We will resolve this in the future by means of cryptographic accumulators (cf. https://en.wikipedia.org/wiki/Accumulator_(cryptography)). An accumulator is a way to represent a large set succinctly such that it is impossible to recover the set itself (due to information theoretic compression bounds), but it is possible to verify that an element is in the set given a proof. This means that every user will only need to store the (proofs of) their own unspent outputs, and the nodes will only have to verify these proofs against the accumulator, which is much smaller than the full set of unspent outputs. The sizes of the accumulator and the proofs depend on the exact solution we will choose.
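To make the idea concrete, here is a toy sketch that uses a Merkle root as a stand-in accumulator. This is purely my illustration: no accumulator scheme has been chosen, a Merkle tree is just the simplest construction with the "short commitment plus membership proof" shape, and all names and the UTXO encoding below are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Build every level of a Merkle tree; the last level holds only the root."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [h(b"\x00" + leaf) for leaf in leaves]   # 0x00 prefix marks leaf hashes
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]              # pad odd levels (toy choice)
            levels[-1] = level
        level = [h(b"\x01" + level[i] + level[i + 1])  # 0x01 prefix marks inner nodes
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Membership proof: sibling hashes from leaf to root, with their side."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))  # True if sibling is on the left
        index //= 2
    return proof


def verify(root: bytes, leaf: bytes, proof: list[tuple[bytes, bool]]) -> bool:
    """What a node would do: check a user's proof against the short commitment."""
    node = h(b"\x00" + leaf)
    for sibling, sibling_is_left in proof:
        node = h(b"\x01" + sibling + node) if sibling_is_left else h(b"\x01" + node + sibling)
    return node == root


if __name__ == "__main__":
    utxos = [f"txid{i}:0".encode() for i in range(5)]  # hypothetical UTXO encoding
    levels = build_levels(utxos)
    root = levels[-1][0]             # 32 bytes, no matter how many UTXOs there are
    proof = prove(levels, 3)         # kept by the owner of utxos[3]
    print(verify(root, utxos[3], proof))
    print(verify(root, b"not-a-utxo", proof))
```

The node side only needs the 32-byte root, while each proof grows with log_2 of the set size. A real design also has to handle updating the commitment and the proofs as outputs are created and spent, which this sketch ignores.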
Holding back from making announcements or keeping info back, like the Telegram bot which gives the network hashrate, didn't sit right with me, and the coin is no longer restricted to a few people on Discord.
I feel that I should clarify: the bot in question was written by a community member who is not a member of the core team and who does not want to share their code, and I am sure they have their reasons. This has nothing to do with Kaspa. All the bot does is issue a couple of commands to our (completely open source and publicly available) node and print the result to a Telegram channel. It seems a bit unfair to me to conclude from this that the core team is holding back on anything.
We strongly believe in openness, which is why we made the network publicly available and invited the community to get involved as soon as possible, and in particular, without any premining whatsoever.
The reason we wanted to delay the announcement is that we wanted the coin to be better tested, and the ecosystem better developed, before using our one chance to garner attention off the BCT board. But since this is a community coin, we can't (and don't want to) prevent anyone from making announcements. Anyway, now that the cat is out of the bag, feel free to ask me anything.