
Topic: [SCALING] Minisketch

member
Activity: 99
Merit: 91
January 02, 2019, 07:09:35 PM
#20
We don't plan on changing how the mempool works-- rather, for each peer a sketch would be kept with all the transactions that your node would like to send / expect to receive with that peer. It takes a bit of memory but it's pretty negligible compared to other data kept per peer (such as the filters used to avoid redundant tx relay today).

Although this is a very intriguing idea, I have a couple of questions regarding node incentives and resources, as well as propagation latency.

In terms of peer sketches: what determines how often a node updates a peer sketch? Does it update every time it receives a transaction, or only when a peer requests reconciliation? If the former, maintaining the sketch takes more resources. If the latter, sketch creation causes a delay, followed by the delay of determining the difference and then requesting the missing transactions. It was expressed that this could be done without a hard fork because it doesn't change the incentives. I think this does change node incentives in a nondeterministic way, which could lead to various forms of selfish, resource-optimizing behavior, especially against small nodes.

The second question I have is about the net effect on latency. The theoretical propagation time of data across the network effectively sets an upper bound on network throughput. Introducing latency to aggregate, share, compute, request, and then share transactions likely adds considerable latency to the network, which can actually decrease throughput. Any thoughts on where the trade-off between bandwidth savings and latency is optimal?
legendary
Activity: 4410
Merit: 4766
December 27, 2018, 07:40:01 AM
#19
legendary
Activity: 1456
Merit: 1175
Always remember the cause!
December 24, 2018, 08:42:12 AM
#18
I'm preparing a draft for this, but I'm really sick of doing work on problems that you guys in the team are not interested in.
And I'm really sick of your insulting lectures when you can't even bother to keep up on the state of the art in what's already been proposed. Tongue
Just check this post and my follow-ups here: https://bitcointalksearch.org/topic/m.48855816. You need to be more humble, imo.

P.S.
OP! It is unacceptable that in a self-moderated topic you kept Greg's shit intact and removed my answer. Now I regret the merits I sent to you  Tongue
legendary
Activity: 1456
Merit: 1175
Always remember the cause!
December 24, 2018, 01:59:33 AM
#17
I'm preparing a draft for this, but I'm really sick of doing work on problems that you guys in the team are not interested in.
And I'm really sick of your insulting lectures when you can't even bother to keep up on the state of the art in what's already been proposed. Tongue Please spare me the excuses about how it's other people's fault that you're not doing other things. You seem to have plenty of time to throw mud...

There are already existing designs for an assumevalid 'pruned sync'...  but not enough hours in a day to implement everything immediately, and there are many components that have needed their own research (e.g. rolling hashes like the ecmh, erasure codes to avoid having snapshots multiplying storage requirements, etc.).

If you want to help--great! But no one needs the drama.  It's hard enough balancing all the trade-offs, considerations, and boring engineering without having to worry about someone being bent out of shape because other people aren't going out of their way to make them feel important. It's hard enough just understanding all the considerations that other engineers express, so no one has time for people who come in like a bull in a china shop and don't put in that effort towards other people's work. It doesn't matter who you are, no one can guarantee that any engineering effort will work out or that its results will be used even if it does.  The history of Bitcoin (as is the case in all other engineering fields) is littered with ideas that never (yet) made it to widespread use-- vastly more of them from the very people you think aren't listening to you than from you. That's just how engineering goes in the real world. Don't take it personally.

And there it is, the usual Greg Maxwell drama ... I'm just asking a simple question: is this problem, fast boot, considered important at all? Is there any hope for noobs like me to have their work committed to the code? Or is all we can do forking off from legendary engineers like you?  Tongue

What do I get? A personal attack about me not being up to date in the field, being called a maniac and a narcissist, ... Why? Because I've somehow insulted Greg the great by saying that I'm sick of working on problems that are not considered a problem at all by the team in charge of the code.

What I don't get? An answer to my question: is this problem, the bootstrap nightmare, considered a priority by the Core team at all? Believe me, Maxwell, it is part of my research  Cheesy

P.S.
Speaking of engineering ... You guys don't even care about software engineering; the Bitcoin code is far from being engineered, and the bootstrap problem tells you everything about it. I do agree that decentralized consensus-based p2p systems are a new field, a new problem domain, and best practices are not established yet, but I don't expect to be taught how 'engineering goes in the real world' by people who maintain a system that takes an age to boot.

staff
Activity: 4284
Merit: 8808
December 23, 2018, 09:51:00 PM
#16
And I think this method will fit in cryptos. If so, it would be a good idea to relay the FEE parameter with these hash values too. Then you could add a HAND-SHAKING stage at the beginning of these back-and-forths:
[...]
node minimal fee threshold) Alice and Bob (within the hand-shaking stage) could express their minimal fee threshold to the other side and save even more bandwidth from the beginning. Not sure,

Bitcoin nodes already tell their peers the minimum feerate they'll accept, so their peers won't offer anything that they won't accept. It doesn't even need to go into the IDs.   See BIP 133.  It would be possible to put the feerate in the IDs for even more fine-grained/realtime filtering, but I believe that would be a net waste of bandwidth, due to the extra space for that information and the relatively few extra transactions that get sent in between feefilter updates.
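
To make that concrete, here's a hedged sketch of the BIP 133 mechanism: the feefilter message carries a single 8-byte little-endian feerate (satoshis per 1000 bytes), and a peer shouldn't announce transactions paying less than the last value it received from you. The struct and function names below are mine, purely illustrative, not Bitcoin Core's actual implementation:

Code:
#include <stdint.h>

/* Illustrative stand-in for the BIP 133 "feefilter" payload:
 * one 8-byte little-endian feerate, in satoshis per 1000 bytes. */
typedef struct {
    uint64_t feerate;
} feefilter_msg;

/* Relay-side check: announce a tx to a peer only if it clears the
 * feerate floor the peer advertised in its last feefilter message. */
static int should_announce(uint64_t tx_feerate, const feefilter_msg *peer)
{
    return tx_feerate >= peer->feerate;
}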

Quote
And I have a question here. If I understand it right, BCH here needs a rearrangement of the data in the mempool - so, continuing, we should ask: does this rearrangement need more space too? Also, the encoding/decoding processes engage a node's processor. Might this put an end to nodes running on miniature hardware like the Raspberry Pi family?

We don't plan on changing how the mempool works-- rather, for each peer a sketch would be kept with all the transactions that your node would like to send / expect to receive with that peer. It takes a bit of memory but it's pretty negligible compared to other data kept per peer (such as the filters used to avoid redundant tx relay today).

For computation we've been designing assuming we need to accommodate an RPi 3 ... and part of the purpose of building minisketch as a fairly well-optimized library was understanding exactly what the performance tradeoffs would be.

One nice thing is that all the CPU heavy lifting is done by the party that requests reconciliation, so if you are CPU-starved you can just reconcile less frequently. Doing so will also further reduce your bandwidth, but has the downside of propagating transactions somewhat more slowly.
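
To make the division of labor concrete, here's a minimal sketch of one reconciliation round in the style of the example in libminisketch's README. The 32-bit short IDs, the capacity of 16, and the dummy ID values are illustrative choices, not a proposed standard:

Code:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <minisketch.h>

int main(void) {
    /* Alice builds a sketch over her short IDs and serializes it; the
     * serialized sketch (capacity * bits / 8 = 64 bytes here) is all she
     * sends, no matter how many transactions she actually has. */
    minisketch *alice = minisketch_create(32, 0, 16);
    for (uint64_t id = 1000; id < 1010; ++id) minisketch_add_uint64(alice, id);
    size_t sersize = minisketch_serialized_size(alice);
    unsigned char *wire = malloc(sersize);
    minisketch_serialize(alice, wire);
    minisketch_destroy(alice);

    /* Bob, the requester, does the expensive part: merge and decode. */
    minisketch *bob = minisketch_create(32, 0, 16);
    for (uint64_t id = 1005; id < 1015; ++id) minisketch_add_uint64(bob, id);
    minisketch *remote = minisketch_create(32, 0, 16);
    minisketch_deserialize(remote, wire);
    minisketch_merge(bob, remote); /* bob now encodes the symmetric difference */

    uint64_t diff[16];
    ssize_t n = minisketch_decode(bob, 16, diff);
    if (n < 0) {
        /* More than 16 differences: fall back (another round, or plain INVs). */
    } else {
        for (ssize_t i = 0; i < n; ++i)
            printf("%llu is in only one of the two sets\n", (unsigned long long)diff[i]);
    }
    minisketch_destroy(bob);
    minisketch_destroy(remote);
    free(wire);
    return 0;
}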
 


full member
Activity: 135
Merit: 178
..
December 23, 2018, 09:34:26 PM
#15
libminisketch was published last week: https://github.com/sipa/minisketch/blob/master/README.md

from the introduction section of the URL above:
"If the elements are large, it may be preferable to compute the sketches over hashes of the set elements."

And I think this method will fit in cryptos. If so, it would be a good idea to relay the FEE parameter with these hash values too. Then you could add a HAND-SHAKING stage at the beginning of these back-and-forths:

To clarify for anyone else who might be reading this, BCH in this case has nothing to do with Bitcoin Cash but rather stands for Bose-Chaudhuri-Hocquenghem. I just find it a fun coincidence: https://en.wikipedia.org/wiki/BCH_code

And I have a question here. If I understand it right, BCH here needs a rearrangement of the data in the mempool - so, continuing, we should ask: does this rearrangement need more space too? Also, the encoding/decoding processes engage a node's processor. Might this put an end to nodes running on miniature hardware like the Raspberry Pi family?

staff
Activity: 4284
Merit: 8808
December 23, 2018, 06:35:52 PM
#14
I'm preparing a draft for this, but I'm really sick of doing work on problems that you guys in the team are not interested in.
And I'm really sick of your insulting lectures when you can't even bother to keep up on the state of the art in what's already been proposed. Tongue Please spare me the excuses about how it's other people's fault that you're not doing other things. You seem to have plenty of time to throw mud...

There are already existing designs for an assumevalid 'pruned sync'...  but not enough hours in a day to implement everything immediately, and there are many components that have needed their own research (e.g. rolling hashes like the ecmh, erasure codes to avoid having snapshots multiplying storage requirements, etc.).

If you want to help--great! But no one needs the drama.  It's hard enough balancing all the trade-offs, considerations, and boring engineering without having to worry about someone being bent out of shape because other people aren't going out of their way to make them feel important. It's hard enough just understanding all the considerations that other engineers express, so no one has time for people who come in like a bull in a china shop and don't put in that effort towards other people's work. It doesn't matter who you are, no one can guarantee that any engineering effort will work out or that its results will be used even if it does.  The history of Bitcoin (as is the case in all other engineering fields) is littered with ideas that never (yet) made it to widespread use-- vastly more of them from the very people you think aren't listening to you than from you. That's just how engineering goes in the real world. Don't take it personally.
legendary
Activity: 1456
Merit: 1175
Always remember the cause!
December 23, 2018, 08:50:08 AM
#13
Initial sync is already an enormous problem, and even if you assume computers/bandwidth improve at 18% year over year (which is a big assumption...) any blocksize over ~300kbytes means that the initial sync time will continue to get worse.
We are above 300k already, aren't we? So why not solve the bootstrap problem once and for all, regardless of anything else?

I understand how crucial the need for full validation is, but there are drawbacks beyond bootstrap becoming more and more time-consuming. Most importantly, the code needs to be backward compatible in a radically exaggerated way, which leads to software bloat, as you know.

Theoretically, once a node is convinced about the state at a given height, it is done with the history (that is how pruning works, after all). What if we could establish a solid algorithm that provides the same security level as the current boot-from-genesis strategy?

I'm preparing a draft for this, but I'm really sick of doing work on problems that you guys in the team are not interested in. Is there any chance of getting your attention for such a proposal? I understand there are always a lot of issues, proposals, and TODOs; I just need to know how much attention a proposal for a secure alternative to boot-from-genesis deserves.

staff
Activity: 4284
Merit: 8808
December 23, 2018, 05:04:37 AM
#12
Thanks for the information. I agree that initial sync will become a big problem, but isn't verification time very fast, so this won't be a problem unless the block size limit is increased too much (unless verification time grows non-linearly)?
Initial sync is already an enormous problem, and even if you assume computers/bandwidth improve at 18% year over year (which is a big assumption...) any blocksize over ~300kbytes means that the initial sync time will continue to get worse.
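
A back-of-the-envelope way to see why a threshold exists at all (my simplification, not the exact model behind the ~300 kB figure): with a fixed block size $b$ and $R$ blocks per year, chain size grows linearly while hardware improves geometrically at rate $g$:

$$S(t) = S_0 + bRt, \qquad v(t) = v_0 (1+g)^t, \qquad T(t) = \frac{S(t)}{v(t)}$$

$$\frac{d}{dt}\ln T(t) = \frac{bR}{S(t)} - \ln(1+g)$$

So initial sync time $T$ keeps getting worse exactly while the chain's relative growth $bR/S(t)$ exceeds $\ln(1.18) \approx 0.166$. The real estimate also has to weigh validation cost, not just bytes, but the shape of the trade-off is this race between linear chain growth and geometric hardware improvement.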

Quote
since even the share of nodes running 0.16 or above is barely above 50% [1] (https://luke.dashjr.org/programs/bitcoin/files/charts/software.html)
It's unclear; we know that the numbers are somewhat distorted by spynodes, which commonly claim to be old versions, but we don't know by how much. I know that on nodes where I've aggressively banned spynodes I see a much newer node mix than luke does.

Regardless, in the future nodes could eventually decide to stop relaying unconfirmed transactions to old nodes... it's pretty backwards compatible to do so. But that's getting way ahead of things...
legendary
Activity: 3122
Merit: 2178
Playgram - The Telegram Casino
December 20, 2018, 08:17:10 PM
#11
Minisketch: a library for BCH-based set reconciliation

There'd be a certain irony if that approach did indeed help with on-chain scaling by allowing for bigger blocks with fewer downsides.

It wouldn't be the first (or last) time that other cryptocurrencies have used code developed by the Bitcoin devs if so

To clarify for anyone else who might be reading this, BCH in this case has nothing to do with Bitcoin Cash but rather stands for Bose-Chaudhuri-Hocquenghem. I just find it a fun coincidence:
https://en.wikipedia.org/wiki/BCH_code

(it's linked in the readme, I wasn't aware of these codes myself)
legendary
Activity: 3430
Merit: 3080
December 20, 2018, 07:40:32 PM
#10
Minisketch: a library for BCH-based set reconciliation

There'd be a certain irony if that approach did indeed help with on-chain scaling by allowing for bigger blocks with fewer downsides.

It wouldn't be the first (or last) time that other cryptocurrencies have used code developed by the Bitcoin devs if so


Increasing the block size still increases the UTXO set size and the time to sync and validate all blocks. Minisketch does not help with initial sync or with validation, it only reduces the amount of bandwidth you consume for normal transaction relay (i.e. relaying unconfirmed transactions after you have already synced).

Thanks for the information. I agree that initial sync will become a big problem, but isn't verification time very fast, so this won't be a problem unless the block size limit is increased too much (unless verification time grows non-linearly)?

That depends. Some transactions may lead to a quadratic increase in verification times [1]. This problem is fixed with SegWit transactions; however, legacy transactions are still a thing, so that might be important to keep in mind.

Schnorr sigs improve verification performance, or at least the Bitcoin-developed standard for the Koblitz elliptic curve does


Not every new technology in Bitcoin is for making more transaction throughput. In this case, this new technology is to reduce the total bandwidth cost for a node, which is good, but unrelated to transaction throughput.

Hmmm, I interpreted Minisketch relay as a means to improve the Bitcoin network's ability to handle the additional capacity that's gradually becoming available as segwit increases in usage (and likewise with schnorr). Arguably it's a marginal improvement, but it still makes the on-chain scaling that's unrealised (but possible) more viable. Still, maybe the scaling label for this thread is a little hyperbolic
staff
Activity: 3458
Merit: 6793
Just writing some code
December 20, 2018, 07:01:20 PM
#9
Increased UTXO size means increased user base, doesn't it?
Not necessarily.

And you think it is infeasible to have a UTXO an order of magnitude bigger? Why?
The same reason that it has been for the past several years: a larger UTXO set makes verifying blocks slower and increases node requirements, thus making it more expensive for people to run nodes. More transactions in general (which correlates with a larger UTXO set) make it harder to run a node, which means that there will be fewer nodes and thus less decentralization.

I am not against Bitcoin adoption in general. I am for more people using Bitcoin. However I do not believe that we can support having more people at this time with current technology. Minisketch does not change this fact because it is used for network relay, which is completely unrelated to blocks, consensus, and the UTXO set. Not every new technology in Bitcoin is for making more transaction throughput. In this case, this new technology is to reduce the total bandwidth cost for a node, which is good, but unrelated to transaction throughput.

Let's not turn this thread into another block size argument. That's completely off topic.
legendary
Activity: 3122
Merit: 2178
Playgram - The Telegram Casino
December 20, 2018, 06:58:18 PM
#8
That's pretty neat; I'm looking forward to seeing whether this will have any practical implications for Bitcoin.


Increasing the block size still increases the UTXO set size and the time to sync and validate all blocks. Minisketch does not help with initial sync or with validation, it only reduces the amount of bandwidth you consume for normal transaction relay (i.e. relaying unconfirmed transactions after you have already synced).

Thanks for the information. I agree that initial sync will become a big problem, but isn't verification time very fast, so this won't be a problem unless the block size limit is increased too much (unless verification time grows non-linearly)?

That depends. Some transactions may lead to a quadratic increase in verification times [1]. This problem is fixed with SegWit transactions; however, legacy transactions are still a thing, so that might be important to keep in mind.

[1] https://bitcoincore.org/en/2016/01/26/segwit-benefits/#linear-scaling-of-sighash-operations
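
A rough sketch of where the quadratic scaling in [1] comes from: a legacy transaction with $n$ inputs computes one signature hash per input, and each of those hashes covers a modified copy of (nearly) the whole transaction, whose size $s$ itself grows roughly linearly with $n$:

$$\text{hash work} \approx n \cdot O(s) = O(n^2) \quad \text{since } s = O(n)$$

BIP 143 (the SegWit signature hash) lets the shared parts be hashed once and reused across inputs, bringing the total back to linear.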
legendary
Activity: 3430
Merit: 3080
December 20, 2018, 05:52:05 PM
#7
Now suppose we need miners to relay a data structure like what carlton has implemented

Nothing to do with me, I'm just delivering the news
legendary
Activity: 1456
Merit: 1175
Always remember the cause!
December 20, 2018, 05:28:30 PM
#6
I just read the general info and it's really interesting, since internet bandwidth and latency are the biggest limitations for on-chain scaling.
Improving block reconciliation and helping with block size is not the only impact. Actually, I'm more interested in the algorithm because of its efficiency in checking two sets for not spending the same output (via the same or different txns).

Besides my PoCW proposal, I've been working on sharding for quite a while; interestingly, in both cases I've encountered this problem: at a given state (block height), how do we choose between multiple blocks processed/proposed in parallel?

Let's define two blocks to be antagonist if and only if they include transactions (identical or not) that spend the same output. If two blocks are not antagonist, a hypothetical protocol might allow both to be included in the blockchain.

Now suppose we need miners to relay a data structure like what carlton has implemented, representing the spent UTXOs (and not the txns); using a simple XOR operation, we would be able to verify the antagonism of two competing blocks before downloading them. It would be a huge step forward.
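
A hedged illustration of how that check could work, using a minisketch over hashed spent outpoints rather than a bare XOR accumulator (decoding the merged sketches is what actually exposes the overlap): since $|A \triangle B| = |A| + |B| - 2|A \cap B|$, a decoded difference smaller than $|A| + |B|$ proves the two blocks spend at least one common output. The 64-bit outpoint hashes, the capacity parameter, and the function itself are illustrative assumptions, not part of libminisketch or any concrete proposal:

Code:
#include <stdint.h>
#include <stddef.h>
#include <minisketch.h>

#define MAX_CAPACITY 64

/* Returns 1 if the two spend-sets provably overlap (antagonist blocks),
 * 0 if they are disjoint, -1 if the difference exceeded the capacity.
 * Outpoint hashes are assumed nonzero, as sketch elements must be. */
int blocks_antagonist(const uint64_t *spends_a, size_t na,
                      const uint64_t *spends_b, size_t nb,
                      size_t capacity)
{
    if (capacity > MAX_CAPACITY) return -1;
    minisketch *sa = minisketch_create(64, 0, capacity);
    minisketch *sb = minisketch_create(64, 0, capacity);
    for (size_t i = 0; i < na; ++i) minisketch_add_uint64(sa, spends_a[i]);
    for (size_t i = 0; i < nb; ++i) minisketch_add_uint64(sb, spends_b[i]);
    minisketch_merge(sa, sb); /* sa now encodes A xor B */

    uint64_t diff[MAX_CAPACITY];
    ssize_t n = minisketch_decode(sa, capacity, diff);
    minisketch_destroy(sa);
    minisketch_destroy(sb);
    if (n < 0) return -1;          /* too many differences to decode */
    return (size_t)n < na + nb;    /* overlap iff |A xor B| < |A| + |B| */
}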

Increasing the block size still increases the UTXO set size and the time to sync and validate all blocks. Minisketch does not help with initial sync or with validation, it only reduces the amount of bandwidth you consume for normal transaction relay (i.e. relaying unconfirmed transactions after you have already synced).
Although I'm more of a block-time-decrease fan than a block-size-increase one, and I'm concerned with rather more sophisticated approaches to on-chain scaling, I think your argument is generally against Bitcoin adoption, and that is disappointing.

Increased UTXO size means increased user base, doesn't it? And you think it is infeasible to have a UTXO an order of magnitude bigger? Why?

As for your bootstrapping argument, we have UTXO commitment proposals and a range of novel solutions, combined with pruning, ready to be implemented and used.

Bootstrapping is not a big deal imo and bitcoin is not doomed to low adoption because of its UTXO size.
legendary
Activity: 3430
Merit: 3080
December 20, 2018, 04:53:59 PM
#5
While some devs thought Minisketch could be used to connect to more nodes, IMO increasing the block size is more viable, as the current block size is simply not enough if Bitcoin is mass adopted (and almost everyone uses LN)
I agree to a certain extent, but I'm still conscious that we've not yet seen the full impact of the segwit block capacity increase

What kind of full impact are you talking about? It's been more than a year since SegWit activation, and we should be able to see the impact by now.

Right, but the average block size is still not consistently above 1 MiB. Those days are coming though (if Lightning became very popular, there could be regular periods with a consistent 24-hour average block size of >= 3 MiB)


Yes, but based on my understanding, both nodes must support Minisketch; otherwise there's no difference, since a node with Minisketch is forced to use the old method.
Right, but there are very few upgrades that can be made to Bitcoin where old versions can benefit from new features.

You're right, but do you think older client versions will get upgrades only for specific features/critical bug fixes (just like Core 0.15.2 & 0.14.3)?

Major features are usually added in 0.x.0 releases; bugfixes, softforks and minor feature backports comprise any 0.x.x releases. Minisketch relaying seems like it would be considered a major feature, so older versions wouldn't get it.
legendary
Activity: 3430
Merit: 3080
December 20, 2018, 03:34:53 PM
#4
I just read the general info and it's really interesting, since internet bandwidth and latency are the biggest limitations for on-chain scaling.
While some devs thought Minisketch could be used to connect to more nodes, IMO increasing the block size is more viable, as the current block size is simply not enough if Bitcoin is mass adopted (and almost everyone uses LN)

I agree to a certain extent, but I'm still conscious that we've not yet seen the full impact of the segwit block capacity increase. Minisketch relay for both transactions and blocks would mitigate that, but it will be a while before we reach average daily sizes of (say) 2-3 MB, and the overall blockchain will grow even more between now and then. I agree with achow101 that there's still initial block download to consider (using a rolling UTXO set could be used to make a contrary case, but that's for another discussion)


libminisketch was published last week: https://github.com/sipa/minisketch/blob/master/README.md

The link doesn't work (GitHub page only says "Not Found"), the correct link is https://github.com/sipa/minisketch/blob/master/README.md (or just check their main page)

Fixed that, apologies

It's obviously early in the Minisketch story, but this is a p2p layer change that doesn't require a soft fork to implement.

Yes, but based on my understanding, both nodes must support Minisketch; otherwise there's no difference, since a node with Minisketch is forced to use the old method.

Right, but there are very few upgrades that can be made to Bitcoin where old versions can benefit from new features.
staff
Activity: 3458
Merit: 6793
Just writing some code
December 20, 2018, 03:20:36 PM
#3
While some devs thought Minisketch could be used to connect to more nodes, IMO increasing the block size is more viable, as the current block size is simply not enough if Bitcoin is mass adopted (and almost everyone uses LN)
Increasing the block size still increases the UTXO set size and the time to sync and validate all blocks. Minisketch does not help with initial sync or with validation, it only reduces the amount of bandwidth you consume for normal transaction relay (i.e. relaying unconfirmed transactions after you have already synced).
legendary
Activity: 1456
Merit: 1175
Always remember the cause!
December 20, 2018, 01:23:22 PM
#2
Kudos  Smiley
Excellent work here; I was desperately looking for such a solution, especially interested in the efficient symmetric-difference evaluation feature of the algorithm. I'm so excited right now, let me finish reading the details. Thank you very much dude, O....M....G  Shocked
legendary
Activity: 3430
Merit: 3080
December 20, 2018, 11:34:14 AM
#1
libminisketch was published last week: https://github.com/sipa/minisketch/blob/master/README.md

This is a software library that can be used to allow set differences to be encoded and relayed using less data than before. Bitcoin currently relays blocks using a different technique known as "compact blocks" (BIP152), but Minisketch should improve on the performance of compact blocks by several factors (also, compact block relay degrades, needing extra round trips, if too many of a given block's transactions are missing from a node's mempool).

This type of technique has been suggested as a scaling technique in the past, but the previously suggested method (IBLT, invertible Bloom lookup tables) has probabilistic decoding, and hence an associated failure rate. Minisketch always decodes successfully, provided the set difference does not exceed the sketch's capacity.

Not only could Minisketch reduce the amount of data needed to relay blocks between nodes, it could also be used to reduce the amount of data needed to relay transactions. Current relay can use 200-300 MB of bandwidth per day; using Minisketch as part of the tx relay method would reduce this considerably (a lot of redundant inventory request/reply messages are sent under the present relaying logic, i.e. nodes use significant bandwidth simply asking each other "which transactions have you got")
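
Rough arithmetic behind that redundancy (illustrative numbers, not measurements): with flooding, a node with $d$ peers ends up exchanging an inventory entry of roughly 36 bytes per transaction with nearly every peer, even though a single announcement would have sufficed. With reconciliation the per-peer cost tracks only the set difference:

$$B_{\text{flood}} \approx d \cdot 36 \cdot n_{\text{tx}}, \qquad B_{\text{sketch}} \approx \sum_{i=1}^{d} |\Delta_i| \cdot \frac{b}{8}$$

where $|\Delta_i|$ is the number of transactions you and peer $i$ don't have in common and $b$ is the sketch's field size in bits. Once a transaction has mostly propagated, the $|\Delta_i|$ are small, so the factor-of-$d$ redundancy largely disappears.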


It's obviously early in the Minisketch story, but this is a p2p layer change that doesn't require a soft fork to implement. Along with other potential improvements to reduce the Bitcoin network's resource load (Schnorr, taproot, MAST, rolling UTXO set), we could be using a significantly leaner network this time next year.