
Topic: (non-ultimate) blockchain compression? - page 2. (Read 1832 times)

member
Activity: 229
Merit: 13
Quote
I don't think common disk compression methods are efficient for the blockchain. Efficient compression means understanding the underlying structure.

I disagree.
You will get benefits from compressing big chunks of data (such as the blk files), not small pieces (individual transactions within blocks).

Quote
Speaking of cheap vs expensive - I think it's users' time that's more expensive than processor time or disk space. We can significantly cut the time needed to wait for another wallet sync, or for a new wallet to initialize.

The most time-expensive routine is verifying ECDSA signatures, not downloading.
But we cannot (?) eliminate this step on every node.

Hmm... maybe some checkpoints? Let's say we have bootstrap.dat and all index files up to the block of May 1, 2014,
and the client has a hardcoded hash of this data.
So a new user only has to download the bootstrap and indexes, check the hash, and... not verify every signature from the beginning of the Bitcoin era.
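A minimal sketch of what that client-side check could look like in Python, assuming a single SHA-256 over the whole bootstrap file and a hash hardcoded into the client at release time (the file name and the placeholder constant are illustrative, not an existing Bitcoin Core feature):

Code:
import hashlib

# Hypothetical hardcoded checkpoint: SHA-256 of bootstrap.dat + indexes up to May 1, 2014.
# The constant below is a placeholder, not a real hash.
EXPECTED_BOOTSTRAP_SHA256 = "0" * 64

def verify_bootstrap(path="bootstrap.dat"):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # hash in 1 MiB chunks
            h.update(chunk)
    return h.hexdigest() == EXPECTED_BOOTSTRAP_SHA256

if verify_bootstrap():
    print("bootstrap matches the hardcoded checkpoint; old signatures need not be re-verified")
else:
    print("hash mismatch: fall back to full verification from genesis")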
administrator
Activity: 5222
Merit: 13032
blocks/blk*.dat could probably benefit from some sort of compression. These files are reasonably compressible and very large, but they're not accessed very frequently. It might also be good to compress blocks before sending them to peers, especially when they're downloading many blocks. The other files are already compressed about as well as they can be without hurting performance, or small enough that it doesn't matter.
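A quick way to put a number on how compressible a raw block file is: run it through a stock compressor and compare sizes. A rough Python sketch (the path is an example from a default data directory; the ratio will vary with the file):

Code:
import zlib

# Quick compressibility check for one raw block file.
PATH = "blocks/blk00000.dat"   # example path; any blk*.dat will do

raw = open(PATH, "rb").read()
packed = zlib.compress(raw, 6)

print("original:   %d bytes" % len(raw))
print("compressed: %d bytes (%.1f%% of original)" % (len(packed), 100.0 * len(packed) / len(raw)))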
hero member
Activity: 836
Merit: 1030
bits of proof
Quote
I don't think common disk compression methods are efficient for the blockchain. Efficient compression means understanding the underlying structure.

Speaking of cheap vs expensive - I think it's users' time that's more expensive than processor time or disk space. We can significantly cut the time needed to wait for another wallet sync, or for a new wallet to initialize.

Block chain data does not compress impressively on a global scale, but indices on addresses and tx hashes do.

Bits of Proof stores both the block chain data and the supplementary indices in LevelDB and achieves high performance in retrieving transactions that refer to an arbitrary HD master key, which is why it powers the myTREZOR web wallet.
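For readers unfamiliar with why such indices behave well: LevelDB keeps keys sorted and compresses them block-wise, so an index whose key starts with the address hash keeps all entries for one address adjacent and cheap to scan. A toy sketch using the plyvel binding; the key layout and value encoding here are assumptions for illustration, not the actual Bits of Proof schema:

Code:
import plyvel

# Toy index: key = 20-byte address hash || 32-byte txid, value = opaque block location.
db = plyvel.DB("txindex", create_if_missing=True)

def index_output(addr_hash160, txid, location):
    db.put(addr_hash160 + txid, location)

def transactions_for_address(addr_hash160):
    # Keys are stored sorted, so all entries for one address form one contiguous range.
    for key, location in db.iterator(prefix=addr_hash160):
        yield key[20:], location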

I am sure it could be further optimized with your ideas, so let me know if you'd like to discuss them in that scope.
donator
Activity: 1218
Merit: 1015
Quote
I didn't do a detailed study yet, but speaking of low-hanging fruit, what about an address dictionary, typical script patterns, and database index compression?
Enable compression on the hard drive and place the bitcoin files on it.
And think about what is cheaper: disk space or processor work?
Disk space is fairly insignificant. Transmission is the real money/time sink. I bugged Jeff about compressing the bootstrap torrent he maintains. He's open to it... needs more prodding, I think. Smiley Even without anything built specifically for Bitcoin, >40% can be cut from the blockchain, and the time to decompress it is far less than the download time saved unless you have a higher-end cable connection or fiber.
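For anyone who wants to sanity-check the >40% figure, the trade-off is easy to measure with a stock LZMA compressor; a rough Python sketch (the file name is an example, any large slice of raw block data will do):

Code:
import lzma, time

PATH = "bootstrap.dat"   # example path

raw = open(PATH, "rb").read()
packed = lzma.compress(raw, preset=6)

t0 = time.time()
lzma.decompress(packed)
elapsed = time.time() - t0

saved = len(raw) - len(packed)
print("saved %d bytes (%.1f%% of original); decompression took %.1f s"
      % (saved, 100.0 * saved / len(raw), elapsed))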
newbie
Activity: 14
Merit: 0
...also, this would be an optional feature anyway
newbie
Activity: 14
Merit: 0
I don't think common disk compression methods are efficient for the blockchain. Efficient compression means understanding the underlying structure.

Speaking of cheap vs expensive - I think it's users' time that's more expensive than processor time or disk space. We can significantly cut the time needed to wait for another wallet sync, or for a new wallet to initialize.
member
Activity: 229
Merit: 13
Quote
I didn't do a detailed study yet, but speaking of low-hanging fruit, what about an address dictionary, typical script patterns, and database index compression?
Enable compression on the hard drive and place the bitcoin files on it.
And think about what is cheaper: disk space or processor work?
newbie
Activity: 14
Merit: 0
I didn't do a detailed study yet, but speaking of low-hanging fruit, what about an address dictionary, typical script patterns, and database index compression?

Once there's confirmed interest, I plan to start a detailed study and open a technical discussion. I'd also appreciate links to previous technical discussions (my googling skills apparently are not sufficient to find anything technical besides that "ultimate" thing, which I consider orthogonal to normal compression).
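To make the address-dictionary idea a bit more concrete, here is a toy sketch, assuming standard P2PKH outputs: each distinct 20-byte address hash is stored once per chunk and every reuse is replaced by a small index (names and encoding are purely illustrative):

Code:
# Toy address-dictionary coder for the repeated hash160 values in P2PKH scripts.
# A real encoding would use varints and ship the table with the chunk; this only
# shows the substitution idea.

def encode_chunk(hash160_list):
    table = {}      # hash160 -> index
    ordered = []    # table in first-seen order
    codes = []
    for h in hash160_list:
        if h not in table:
            table[h] = len(ordered)
            ordered.append(h)
        codes.append(table[h])
    return ordered, codes

def decode_chunk(ordered, codes):
    return [ordered[c] for c in codes]

# A reused address costs 20 bytes once, then only a small integer per reuse.
sample = [b"A" * 20, b"B" * 20, b"A" * 20, b"A" * 20]
ordered, codes = encode_chunk(sample)
assert decode_chunk(ordered, codes) == sample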
sr. member
Activity: 444
Merit: 250
Quote
The algorithms would require modest computational resources and be friendly to hardware implementation. My quick investigation shows that considerable compression ratios (like 50%) may be achievable.

Could you go into more detail? Blockchain compression is not exactly a new field of study around here, yet it seems you have identified considerable low-hanging fruit. I don't know much about compression myself, but given that hashes, signatures, etc. closely resemble random noise, it's hard to imagine how a 50% compression ratio could be achieved.
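One way to see that concern: a general-purpose compressor gains essentially nothing on hash-like data, so any large ratio has to come from the structured parts (headers, script templates, reused addresses) rather than from the hashes and signatures themselves. A small Python demonstration:

Code:
import os, zlib

# Random bytes stand in for hashes/signatures: zlib output is slightly LARGER than the input.
noise = os.urandom(1 << 20)
print(len(noise), len(zlib.compress(noise)))

# Repetitive structure (here, a repeated P2PKH script skeleton) compresses dramatically.
skeleton = bytes.fromhex("76a914") + b"\x00" * 20 + bytes.fromhex("88ac")
print(len(skeleton) * 40000, len(zlib.compress(skeleton * 40000)))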
newbie
Activity: 14
Merit: 0
Hi,

I'm a software engineer and I want to contribute to bitcoin[d] development. I'm a data compression expert, so I thought that a block compression library would be a good start.

I've read about the "ultimate" blockchain compression where lite nodes keep only the relevant leaves of the Merkle tree. However, as I understand it, we are still far from having this implemented; it also will not help full nodes, nodes with lots of transactions, and everyone else who needs to keep the full blockchain. Therefore my view is that blockchain compression, whilst not critical now, is a nice-to-have feature.

The library I'm thinking of would be usable for wallets and daemons, both for storage and transmission. The algorithms would require modest computational resources and be friendly to hardware implementation. My quick investigation shows that considerable compression ratios (like 50%) may be achievable.
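Purely as a strawman for discussion, the interface could be as small as a reversible pair of calls shared by the storage layer and the network code; everything below is hypothetical, not an existing API:

Code:
# Hypothetical interface sketch for the proposed block-compression library.
# Names and behaviour are placeholders to anchor discussion, not a real design.

class BlockCodec:
    def compress_block(self, raw_block: bytes) -> bytes:
        """Return a compressed encoding of one serialized block."""
        raise NotImplementedError

    def decompress_block(self, data: bytes) -> bytes:
        """Inverse of compress_block; must round-trip byte for byte."""
        raise NotImplementedError

# Storage: call compress_block() before appending to blk*.dat.
# Transmission: call compress_block() before relaying a block to a peer.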

Is this something worth doing?

Thanks!