Topic: (non-ultimate) blockchain compression? (Read 1832 times)

newbie
Activity: 14
Merit: 0
June 04, 2014, 01:59:40 AM
#30
I cannot request this transaction from N1, because N1 has already removed it from its memory pool

As pointed out 5 times now ... this would require a change in protocol.

Yes, the spec needs to be changed. Repeating my question from another reply: should I start a BIP if I want to implement this?

No. From this thread it is very clear you lack the basic knowledge and logic skills necessary to implement a BIP. I mean, we just played a 14-post game of "who's on first" because you couldn't grasp the concept that in a changed protocol the protocol would be changed, and thus referencing the current protocol would be pointless.

Of course that would be my opinion.

Assessment of a person's professional skills (positive or negative) is off-topic for this thread. I guess you misread the replies, but that is no excuse for venting negative emotions in a technical discussion.

It is clear that implementing the proposed modifications will likely require protocol changes. What is not clear (to me) is:

1. Are such changes welcomed by the core team in general (provided that they meet coding guidelines, quality standards, etc.)?
2. Is there a process for BIP review/approval by the core team?
donator
Activity: 1218
Merit: 1079
Gerald Davis
June 03, 2014, 09:09:20 AM
#29
I cannot request this transaction from N1, because N1 has already removed it from its memory pool

As pointed out 5 times now ... this would require a change in protocol.

Yes, the spec needs to be changed. Repeating my question from another reply: should I start a BIP if I want to implement this?

No. From this thread it is very clear you lack the basic knowledge and logic skills necessary to implement a BIP. I mean, we just played a 14-post game of "who's on first" because you couldn't grasp the concept that in a changed protocol the protocol would be changed, and thus referencing the current protocol would be pointless.

Of course that would be my opinion.  You don't need to ask permission to write up a BIP or make a pull request.
newbie
Activity: 14
Merit: 0
June 03, 2014, 06:07:41 AM
#28
Quote
could N1 keep this tx for some time after it's advertised in a block header?

N1 has this tx in a block and *can* provide it, of course.
But the current rules allow me to request only mempool ("wild", i.e. unconfirmed) transactions, not ones confirmed in blocks.
At the very least, some other code would have to be rewritten to support this behavior:

Quote
https://en.bitcoin.it/wiki/Protocol_specification#getdata
getdata is used in response to inv, to retrieve the content of a specific object, and is usually sent after receiving an inv packet, after filtering known elements. It can be used to retrieve transactions, but only if they are in the memory pool or relay set - arbitrary access to transactions in the chain is not allowed to avoid having clients start to depend on nodes having full transaction indexes (which modern nodes do not).


Yes, the spec needs to be changed. Repeating my question from another reply: should I start a BIP if I want to implement this?
member
Activity: 229
Merit: 13
June 03, 2014, 05:13:22 AM
#27
Quote
could N1 keep this tx for some time after it's advertised in a block header?

N1 has this tx in a block and *can* provide it, of course.
But the current rules allow me to request only mempool ("wild", i.e. unconfirmed) transactions, not ones confirmed in blocks.
At the very least, some other code would have to be rewritten to support this behavior:

Quote
https://en.bitcoin.it/wiki/Protocol_specification#getdata
getdata is used in response to inv, to retrieve the content of a specific object, and is usually sent after receiving an inv packet, after filtering known elements. It can be used to retrieve transactions, but only if they are in the memory pool or relay set - arbitrary access to transactions in the chain is not allowed to avoid having clients start to depend on nodes having full transaction indexes (which modern nodes do not).
newbie
Activity: 14
Merit: 0
June 03, 2014, 04:31:54 AM
#26
Quote
The point is that most of the time, most of the full nodes know about 90%+ of the txs.

Yes. The problem is the remaining 10% or less.

Quote
If a node doesn't know about a particular tx it will request that from its peers.

OK, let's imagine. My node is connected to nodes N1, N2, N3... Nx.
At some point I receive a "block template" from N1.
It contains a transaction that I do not know.
I cannot request this transaction from N1, because N1 has already removed it from its memory pool.
My other peers may not know about this tx either, or will reject/ignore my getdata packet.
So I have to drop the "block template" and ask N1 for the block itself.

This increases traffic (in some cases) and slows down block propagation.

This is not a very rare scenario; I can demonstrate it with my already working program (which checks for unknown transactions in incoming blocks).


Could N1 keep this tx for some time after it's advertised in a block header? Also, if the receiving node already has the block header, it only needs the transactions, so in the worst case the overhead would be small (significantly less than the size of the header). We might also consider sending shortened transaction hashes in the block header, which would make the header even smaller and the worst case less probable.
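A minimal sketch of the "keep this tx for some time" idea, assuming a node holds transactions in a separate time-limited cache after they are mined and drop out of the mempool. The class name `RecentlyConfirmedCache`, the one-hour default and the method names are hypothetical; nothing like this is part of the current protocol or of bitcoind, and answering such requests would still need the protocol change discussed in this thread.

```python
import time


class RecentlyConfirmedCache:
    """Hypothetical cache: keeps mined transactions available for a limited time."""

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self.ttl = ttl_seconds      # how long a mined tx stays retrievable (assumed value)
        self.clock = clock          # injectable clock, which makes expiry testable
        self._entries = {}          # txid -> (raw transaction bytes, expiry time)

    def on_block_connected(self, confirmed):
        """Store transactions that were just mined; `confirmed` maps txid -> raw tx."""
        expiry = self.clock() + self.ttl
        for txid, raw_tx in confirmed.items():
            self._entries[txid] = (raw_tx, expiry)

    def expire(self):
        """Drop every entry whose retention period has elapsed."""
        now = self.clock()
        stale = [txid for txid, (_, expiry) in self._entries.items() if expiry <= now]
        for txid in stale:
            del self._entries[txid]

    def get(self, txid):
        """Return the raw transaction if it is still retained, otherwise None."""
        self.expire()
        entry = self._entries.get(txid)
        return entry[0] if entry else None
```

A peer that receives a txid list it cannot fully resolve could then ask the announcing node for the missing transactions and be served from this cache instead of falling back to the full block.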
newbie
Activity: 14
Merit: 0
June 03, 2014, 04:22:56 AM
#25
But the protocol layer can fix that. A block that is just header + coinbase + txid list would be pretty short.

Yes, this is how I thought it could be done. This looks like a straightforward, self-contained feature that I could start working on. Is this a feature Bitcoin Core developers would be willing to accept into the main branch? Should I open a BIP?
newbie
Activity: 14
Merit: 0
June 03, 2014, 03:30:17 AM
#24
~144 blocks

Sorry, you're right, of course. Still, this is a considerable amount of time to wait.

Btw, is my understanding correct that it is possible to build and send a valid transaction without waiting for a full wallet sync, provided that your tx inputs don't depend on the unknown part of the blockchain?
member
Activity: 229
Merit: 13
June 03, 2014, 01:48:44 AM
#23
Quote
The point is that most of the time, most of the full nodes know about 90%+ of the txs.

Yes. The problem is the remaining 10% or less.

Quote
If a node doesn't know about a particular tx it will request that from its peers.

OK, let's imagine. My node is connected to nodes N1, N2, N3... Nx.
At some point I receive a "block template" from N1.
It contains a transaction that I do not know.
I cannot request this transaction from N1, because N1 has already removed it from its memory pool.
My other peers may not know about this tx either, or will reject/ignore my getdata packet.
So I have to drop the "block template" and ask N1 for the block itself.

This increases traffic (in some cases) and slows down block propagation.

This is not a very rare scenario; I can demonstrate it with my already working program (which checks for unknown transactions in incoming blocks).
donator
Activity: 1218
Merit: 1079
Gerald Davis
June 02, 2014, 11:35:45 PM
#22
But the protocol layer can fix that. A block that is just header + coinbase + txid list would be pretty short.
Yes, but what if one or more of the transactions needed to assemble a block from this template are not in my mempool?
This situation occurs when a transaction reaches a miner, the miner includes it in a block and solves the block immediately afterwards.
Or the miner takes a very old (a month or a year old) transaction from its mempool.
So no node on the network has this transaction in its mempool.

OK, the first node receives the "template" and has to ask its peer for the missing transaction.

The point is that most of the time, most of the full nodes know about 90%+ of the txs. The current protocol relays them txs they already know about. It is a very simplistic, unoptimized protocol. In time it will almost certainly be changed to header + coinbase + tx hashes. If a node doesn't know about a particular tx, it will request it from its peers. That is still far less bandwidth than all nodes relaying the full tx list to peers, most of which already know about most or all of them, who then relay the full list to their peers, most of which know about most or all of them.
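A rough sketch of the header + coinbase + tx hashes scheme described above, assuming the receiver looks each announced txid up in its own mempool and fetches only the ones it lacks. The function names and the `fetch` callback are hypothetical, no such message exists in the current protocol, and the byte count ignores varint length prefixes and message framing.

```python
import hashlib

HEADER_SIZE = 80   # bytes in a serialized block header
TXID_SIZE = 32     # bytes in a full transaction hash


def txid(raw_tx: bytes) -> bytes:
    """Double SHA-256 of the serialized transaction, as Bitcoin uses for tx hashes."""
    return hashlib.sha256(hashlib.sha256(raw_tx).digest()).digest()


def split_known_missing(announced_txids, mempool):
    """Partition announced txids into those already in `mempool` and those missing."""
    known = [t for t in announced_txids if t in mempool]
    missing = [t for t in announced_txids if t not in mempool]
    return known, missing


def thin_block_bytes(coinbase, announced_txids, mempool, fetch):
    """Approximate bytes received for a 'thin' block.

    `fetch(txid)` is a hypothetical callback that asks a peer for one missing
    transaction and returns its raw bytes. Returns (total_bytes, fetched_txs).
    """
    _, missing = split_known_missing(announced_txids, mempool)
    fetched = {t: fetch(t) for t in missing}
    total = (HEADER_SIZE + len(coinbase)
             + TXID_SIZE * len(announced_txids)
             + sum(len(tx) for tx in fetched.values()))
    return total, fetched


if __name__ == "__main__":
    # Toy example: ten 250-byte transactions, nine of them already in the mempool.
    txs = [bytes([i]) * 250 for i in range(10)]
    ids = [txid(t) for t in txs]
    mempool = dict(zip(ids[:9], txs[:9]))
    coinbase = b"\x00" * 100
    thin, _ = thin_block_bytes(coinbase, ids, mempool, lambda t: txs[ids.index(t)])
    full = HEADER_SIZE + len(coinbase) + sum(len(t) for t in txs)
    print(f"thin block: {thin} bytes, full block: {full} bytes")
```

The saving depends entirely on the mempool hit rate: every missing transaction costs its full size plus an extra round trip.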
legendary
Activity: 905
Merit: 1012
June 02, 2014, 10:48:11 PM
#21
As I said, protocol change required.
member
Activity: 229
Merit: 13
June 02, 2014, 10:36:47 PM
#20
But the protocol layer can fix that. A block that is just header + coinbase + txid list would be pretty short.
Yes, but what if one or more of the transactions needed to assemble a block from this template are not in my mempool?
This situation occurs when a transaction reaches a miner, the miner includes it in a block and solves the block immediately afterwards.
Or the miner takes a very old (a month or a year old) transaction from its mempool.
So no node on the network has this transaction in its mempool.

OK, the first node receives the "template" and has to ask its peer for the missing transaction.

Quote
https://en.bitcoin.it/wiki/Protocol_specification#getdata
getdata is used in response to inv, to retrieve the content of a specific object, and is usually sent after receiving an inv packet, after filtering known elements. It can be used to retrieve transactions, but only if they are in the memory pool or relay set - arbitrary access to transactions in the chain is not allowed to avoid having clients start to depend on nodes having full transaction indexes (which modern nodes do not).

The peer does not have this tx in its memory pool, because the tx is already in a block.
legendary
Activity: 905
Merit: 1012
June 02, 2014, 06:53:45 PM
#19
But the protocol layer can fix that. A block that is just header + coinbase + txid list would be pretty short.
member
Activity: 229
Merit: 13
June 02, 2014, 07:52:20 AM
#18
Quote
1. If I sync once a day I need about 3 mins just to fetch the data (250 KiB/s channel, about 200 KiB block, 240 blocks).
~144 blocks

Quote
3. The new block will contain transactions previously propagated through the network, so some (all?) transactions get received by peers twice - when transaction is created, and as a part of a block. Is this correct?
This is usually correct.
newbie
Activity: 14
Merit: 0
June 02, 2014, 06:17:09 AM
#17
Block chain size is currently rate limited to about 1MB per 10 minutes, or about 13kbps. It is not at all clear that compressing this data will result in faster syncs, but if that is what interests you then by all means give it a shot.

1. If I sync once a day I need about 3 mins just to fetch the data (250 KiB/s channel, about 200 KiB block, 240 blocks).

2. As I understand it, clients propagate new transactions through the network instantly, and whenever there's a new block there's a spike in the amount of data (which may well exceed 13 kbps).

3. The new block will contain transactions previously propagated through the network, so some (all?) transactions get received by peers twice - when transaction is created, and as a part of a block. Is this correct?
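As a rough check on the numbers in this post, the arithmetic can be written out. This is a back-of-the-envelope sketch only: it assumes a decimal megabyte for the 1MB limit, one block every ten minutes (144 per day), and ignores protocol overhead and verification time. The function names are made up for illustration.

```python
MB = 1_000_000  # bytes; decimal megabyte assumed for the block size limit


def average_rate_kbps(max_block_bytes=1 * MB, block_interval_s=600):
    """Upper bound on average chain growth, in kilobits per second."""
    return max_block_bytes * 8 / block_interval_s / 1000


def blocks_per_day(block_interval_s=600):
    """Expected number of blocks in 24 hours at the target interval."""
    return 24 * 60 * 60 // block_interval_s


def sync_seconds(blocks, block_kib=200, channel_kib_per_s=250):
    """Pure download time for `blocks` blocks of `block_kib` KiB each."""
    return blocks * block_kib / channel_kib_per_s


if __name__ == "__main__":
    print(f"rate limit:     {average_rate_kbps():.1f} kbps")
    print(f"blocks per day: {blocks_per_day()}")
    print(f"240 blocks:     {sync_seconds(240):.0f} s")
    print(f"144 blocks:     {sync_seconds(144):.0f} s")
```

With 240 blocks the download alone takes a little over three minutes on this channel; with one day's worth of blocks at the target interval it is closer to two.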
legendary
Activity: 905
Merit: 1012
The issues there are not block chain data size, but rather the network sync algorithm. The bitcoind code that is deployed right now uses a rather stupid algorithm for fetching blocks from peers that results in downloading the same block multiple times. There is a developer working on that right now. Block chain size is currently rate limited to about 1MB per 10 minutes, or about 13kbps. It is not at all clear that compressing this data will result in faster syncs, but if that is what interests you then by all means give it a shot.

Note that there is already a Script compressor which is used by the ultraprune code to greatly reduce the size of common bitcoin scriptPubKeys.
newbie
Activity: 14
Merit: 0
Quote
The main time-expensive routine is verifying ECDSA signatures, not downloading.
But we can not (?) eliminate this step on every node.

Hmm... Maybe some checkpoints? Let's say we have bootstrap.dat and all index files up to the block on May 1, 2014,
and the client has a hardcoded hash of this data.
Then a new user has to download the bootstrap and indexes, check the hash and... not verify all signatures from the beginning of the Bitcoin era.

ECDSA checking may be expensive, but I open my wallet every few days and on every open I see considerable network traffic for a couple of minutes (I'm on ADSL). I wouldn't mind halving the time. Also, for bitcoin nodes with lots of downlinks this might be a bigger problem.
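The hardcoded-hash check proposed in the quote above could look roughly like this. It is only a sketch under assumptions: SHA-256 is chosen here for illustration (the post does not say which hash), the function names are hypothetical, and this is not how Bitcoin Core handles bootstrap.dat.

```python
import hashlib


def read_chunks(path, chunk_size=1 << 20):
    """Yield a file's contents in chunks so large files need not fit in memory."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk


def sha256_chunks(chunks):
    """Hex SHA-256 digest of the concatenation of all byte chunks."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def bootstrap_matches(chunks, expected_hex):
    """True if the downloaded data hashes to the value hardcoded in the client."""
    return sha256_chunks(chunks) == expected_hex.lower()
```

A client would call `bootstrap_matches(read_chunks("bootstrap.dat"), HARDCODED_HASH)` once after download, where `HARDCODED_HASH` stands for the value shipped with the client. Note that this only proves the file matches what the client's authors published; it shifts trust to whoever set the hash.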
newbie
Activity: 14
Merit: 0
I don't think common disk compression methods are efficient for the blockchain. Efficient compression means understanding the underlying structure.

Speaking of cheap vs. expensive: I think the user's time is more expensive than processor time or disk space. We can significantly reduce the time spent waiting for another wallet sync, or for a new wallet to initialize.

Block chain data does not compress impressively on a global scale, but indices on addresses and tx hashes do.

Bits of Proof stores both the block chain data and supplementing indices in LevelDB and achieves high performance in retrieving transactions referring to an arbitrary HD master key, which is why it powers the myTREZOR web wallet.

I am sure it could be further optimized with your ideas, so let me know if you'd like to discuss them in that scope.

I'm definitely interested in this. I think optimizing network transmission and storage on portable/battery-powered devices are the major targets to consider. I agree with the comments that disk space on a PC is not an issue.
newbie
Activity: 14
Merit: 0
Quote
I disagree.
You will benefit from compressing big chunks of data (such as blk files), not small pieces (transactions in blocks).

I didn't say there's no benefit from compressing big chunks. However in this thread I'd like to study interest and requirements rather than benefits of a specific compression technique. I think it's pretty clear that disk compression is not a solution.
member
Activity: 229
Merit: 13
Quote
Bitcoin Core already skips signature verification on blocks before the latest checkpoint.

But it parses all blocks and transactions to create indexes. Or am I wrong?
administrator
Activity: 5222
Merit: 13032
Hmm... Maybe some checkpoints? Let's say we have bootstrap.dat and all index files up to the block on May 1, 2014,
and the client has a hardcoded hash of this data.
Then a new user has to download the bootstrap and indexes, check the hash and... not verify all signatures from the beginning of the Bitcoin era.

Bitcoin Core already skips signature verification on blocks before the latest checkpoint.