
Topic: Why would the extra merkle commitment make asicboost uneconomical? (Read 1110 times)

full member
Activity: 217
Merit: 238
Consider: today miners could use centralized devices to generate midstates, but instead the S9/R3 miners have fairly expensive FPGAs with 16MB of attached dual channel DDR2133 to generate the ~2000 midstates per second they need.  

Did you mean 16 GB?  16 MB sounds a bit low and would make 4-way collision finding quite expensive.

If it is 16 GB, does this indicate that they planned collision finding in their design?  Or is there another reason you need that much memory for mining?

Using the same assumption of collecting collisions for 5 seconds, you need about 45 MHash/s without the commitment to compute 500 4-way collisions per second, and 550 MHash/s when the witness commitment is required.
But you can save more by using a longer collection time, e.g. 14 MHash/s / 180 MHash/s for 30 s.   What is the estimated hash rate of the FPGA?
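For concreteness, here is a small Python sketch (mine, not from the thread) that re-derives these per-miner figures from the Poisson approximation discussed further down, assuming an S9-class miner needs ~2000 midstates/s, i.e. ~500 fresh 4-way collisions per second. It lands around 50 and 14 MHash/s for the two window lengths, in the same ballpark as the figures above.

Code:
from math import factorial

BUCKETS = 2 ** 32                 # size of the 32-bit space being collided on
collisions_per_sec = 2000 / 4     # ~2000 midstates/s, 4 midstates per 4-way collision

for window in (5, 30):            # seconds of collision collection
    target = collisions_per_sec * window
    # expected 4-way collisions ~= 2^32 * x^4 / 4!  for x = hashes / 2^32 << 1
    x = (target * factorial(4) / BUCKETS) ** 0.25
    rate = x * BUCKETS / window   # required hashes per second
    print(f"{window:2d} s window: ~{rate / 1e6:.0f} MHash/s plain, "
          f"~{rate * 12 / 1e6:.0f}-{rate * 13 / 1e6:.0f} MHash/s with the extra commitment")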
staff
Activity: 4172
Merit: 8419
Because it's a 12x-ish additional work factor, that's all.  Is 12x a killer? Depends on how specifically you've constructed it.

If you try to put it all on one big device, you run into IO/memory bottlenecks much more easily-- your 100 million midstates per second is over 64 Gbit/s of data, and if that device goes down your whole farm is down. The device is also highly non-covert, so you cannot keep what you are doing secret from your facilities staff. It is much more straightforward to use local FPGAs to generate the collisions.  And from what I'm told, use of local devices to generate them is how the unreleased Spondoolies design worked.

Consider: today miners could use centralized devices to generate midstates, but instead the S9/R3 miners have fairly expensive FPGAs with 16MB of attached dual channel DDR2133 to generate the ~2000 midstates per second they need.  (When will Trezor ship with such an incredible FPGA? Never, I assume.) They could use a centralized device to do so, but they don't. (Well, it looks like privately Bitmain runs two S9s per controller, so twice that.)

If you've adopted a design with distributed generation, then you can't escape the cost-- perhaps you could deal with it, but that doesn't mean the infrastructure anyone has already built does deal with it.
jr. member
Activity: 31
Merit: 1
I guess the topic is covert AB, but the real issue is whether there is any point at which AB becomes uneconomical at all, even if the overt form can be blocked by policy.
jr. member
Activity: 31
Merit: 1
...

Now, the extra commitment in the UTXO makes computing the hashes more expensive.  Instead of a single hash, you need to compute 12-13 Hashes for the Merkle root (if you still use Merkle grinding to generate the commitment hashes).  But it requires no extra storage.  So the total extra cost is 12-13 GHash/s for a 400 PHash/s pool.  Everything else is unchanged.


With segwit active, for a single transaction with 13 inputs and 13 outputs all signed ANYONECANPAY|SINGLE, it's possible to permute the input/output pairs to get 13! permutations without the need to re-sign, as an ANYONECANPAY|SINGLE signature in segwit does not commit to the specific position of the pair.
If the 13 inputs are derived from the same transaction, and the outputs are all the same, you should really only need to swap 13 bytes around in the serialized transaction to generate a different, valid transaction and txid. (Not true; it might be true if the inputs are not signed at all.)
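As a side note (my own arithmetic, not from the post), 13 pairs is presumably the smallest number whose permutation count covers the 2^32 space being ground over:

Code:
from math import factorial

# 13 input/output pairs can be reordered in 13! ways, each (per the post) giving
# a distinct valid txid without re-signing; 12! falls short of 2^32, 13! exceeds it.
print(factorial(12), factorial(12) > 2 ** 32)   # 479001600   False
print(factorial(13), factorial(13) > 2 ** 32)   # 6227020800  True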
legendary
Activity: 978
Merit: 1080
If one collects collisions on a central server (the most efficient way to find a lot of collisions in a big pool), one needs about 3 billion hashes to create 25 million 4-way collisions.  If one can live with a 5 second delay (i.e. blocks miss some high-fee transactions from the last 5 seconds), one can reduce this to 5 billion hashes in 5 seconds, or a billion hashes per second.

Your analysis appears correct. To elaborate on the math above:

Given x * 2^32 hashes, the probability of a k-way collision on a particular 32 bit value equals
choose(x * 2^32, k) * (2^-32)^k * (1 - 2^-32)^{x * 2^32 - k} which is about x^k/k! * e^-x.

For x=3/4 we get a fraction of 25M/4B 4-way collisions.
And for x=5/4 we get a fraction of 125M/4B 4-way collisions
(we actually get more than an extra quarter of that in useful collisions from k=5 and higher).
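A minimal numeric check of this approximation (my sketch, not from the post), multiplying x^k/k! * e^-x by the 2^32 possible values, roughly reproduces the 25M/4B and 125M/4B figures and the extra-quarter-plus contribution from k >= 5:

Code:
from math import exp, factorial

def expected_k_way(x, k, bits=32):
    # Expected number of values hit exactly k times when x * 2^bits uniformly
    # random hashes land in 2^bits buckets: Poisson(x) per bucket, times 2^bits.
    return (2 ** bits) * x ** k / factorial(k) * exp(-x)

for x in (3 / 4, 5 / 4):
    four = expected_k_way(x, 4)
    extra = sum(expected_k_way(x, k) for k in range(5, 12))
    print(f"x = {x}: ~{four / 1e6:.0f}M 4-way collisions, ~{extra / 1e6:.0f}M more at k >= 5")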
full member
Activity: 217
Merit: 238
I know that it requires some extra hashes, but if my math is correct we are talking about 12-13 GHash/s for a 400 PHash/s pool.  And I don't think 12-13 GHash/s costs millions of dollars per year.

This is how I derived the number:

A 400 PHash/s pool grinds through 100 million mid-states per second.  With 4-way asicboost, it needs about 25 million 4-way collisions per second.

If one collects collisions on a central server (the most efficient way to find a lot of collisions in a big pool), one needs about 3 billion hashes to create 25 million 4-way collisions.  If one can live with a 5 second delay (i.e. blocks miss some high-fee transactions from the last 5 seconds), one can reduce this to 5 billion hashes in 5 seconds, or a billion hashes per second.

Collecting the hashes and storing them to find the collision is not trivial, but should be possible.  You have to do it for any form of covert asicboost.   

Now, the extra commitment in the UTXO makes computing the hashes more expensive.  Instead of a single hash, you need to compute 12-13 Hashes for the Merkle root (if you still use Merkle grinding to generate the commitment hashes).  But it requires no extra storage.  So the total extra cost is 12-13 GHash/s for a 400 PHash/s pool.  Everything else is unchanged.
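To tie the numbers in this post together, here is a hedged Python sketch (mine, under the post's assumptions) that inverts the 4-way collision formula to find the central server's required hash rate with and without the 5 second collection window; the 12-13 Merkle-path hashes per candidate then turn the ~1 GHash/s into the 12-13 GHash/s quoted above.

Code:
from math import exp, factorial

BUCKETS = 2 ** 32

def four_way(n):
    # Expected exact 4-way collisions among n random 32-bit values (Poisson approx.).
    x = n / BUCKETS
    return BUCKETS * x ** 4 / factorial(4) * exp(-x)

def hashes_for(target):
    # Bisect on [0, 4 * 2^32], where four_way() is increasing, for the hash count
    # that yields the target number of 4-way collisions.
    lo, hi = 0.0, 4.0 * BUCKETS
    for _ in range(100):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if four_way(mid) < target else (lo, mid)
    return hi

midstates = 100e6               # midstates/s for the ~400 PHash/s pool (from the post)
collisions = midstates / 4      # 25 million 4-way collisions needed per second

per_second = hashes_for(collisions)            # collected within one second
windowed = hashes_for(collisions * 5) / 5      # collected over a 5 second window

print(f"no window:  ~{per_second / 1e9:.1f} GHash/s")   # ~3.2, "about 3 billion hashes"
print(f"5 s window: ~{windowed / 1e9:.1f} GHash/s")     # ~1.1, "a billion hashes per second"
# With the extra commitment each candidate costs ~12-13 hashes instead of one,
# hence the 12-13 GHash/s extra cost for a 400 PHash/s pool.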