Topic: minisketch: txids vs wtxids (Read 170 times)

legendary
Activity: 1456
Merit: 1175
Always remember the cause!
October 07, 2019, 05:13:38 PM
#7
Fwiw here is the quote from the BIP.
Quote
A new data structure, witness, is defined. Each transaction will have 2 IDs.
Definition of txid remains unchanged: the double SHA256 of the traditional serialization format:
  
Code:
[nVersion][txins][txouts][nLockTime]
A new wtxid is defined: the double SHA256 of the new serialization with witness data:
  
Code:
[nVersion][marker][flag][txins][txouts][witness][nLockTime]
(And for the last sentence about malleability the DoS case is now handled, see above)
Exactly! According to this quote, txids are hashes of transactions without scriptSig (witness data), and wtxids are hashes of raw transactions including witness data.
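To make the two definitions in the BIP quote concrete, here is a minimal sketch that hashes both serializations. The field values are placeholder bytes, not a valid transaction, and the helper names (`dsha256`, `txid`, `wtxid`) are made up for illustration. The marker and flag bytes (0x00, 0x01) and the byte-reversed hex display follow the usual convention.

```python
import hashlib

def dsha256(data: bytes) -> bytes:
    """Double SHA256, used for both ids."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def txid(n_version: bytes, txins: bytes, txouts: bytes, n_locktime: bytes) -> str:
    # [nVersion][txins][txouts][nLockTime]: the witness is not hashed
    return dsha256(n_version + txins + txouts + n_locktime)[::-1].hex()

def wtxid(n_version: bytes, txins: bytes, txouts: bytes,
          witness: bytes, n_locktime: bytes) -> str:
    # [nVersion][marker][flag][txins][txouts][witness][nLockTime]
    marker, flag = b"\x00", b"\x01"
    payload = n_version + marker + flag + txins + txouts + witness + n_locktime
    return dsha256(payload)[::-1].hex()

# Placeholder field contents (assumptions for illustration only).
version = (2).to_bytes(4, "little")
txins = b"\x01" + b"\xaa" * 41
txouts = b"\x01" + b"\xbb" * 31
locktime = (0).to_bytes(4, "little")
witness_a = b"\x01\x02\xc1\xc2"
witness_b = b"\x01\x02\xd1\xd2"

same_txid = txid(version, txins, txouts, locktime)
wtxid_a = wtxid(version, txins, txouts, witness_a, locktime)
wtxid_b = wtxid(version, txins, txouts, witness_b, locktime)
print(same_txid)
print(wtxid_a != wtxid_b)
```

Swapping the witness changes the wtxid while the txid stays the same, because the witness bytes never enter the txid preimage.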

By analogy, legacy txids are equivalent to segwit wtxids because both cover embedded signature data. My point is that using such ids in the relay network is unjustified, because they are inherently vulnerable to malleability-like phenomena. I'm not talking about exactly the same mempool flooding as in classical malleability, but flooding the relay network is a possibility. I'm also not convinced by what I quoted from Greg Maxwell upthread, who justifies using wtxids in minisketch by asserting that an adversary could fool peers by uploading mutated witness data. That is absurd: nodes are not supposed to relay an invalid txn, and adversaries could easily be identified and banned on their very first attempt.

But I see your point: trading a little bit of security for usability (even if it reintroduces multiple (somewhat) trusted actors, which makes things far more difficult). Why all this just for bootstrap time?
It is not about any form of trust. UTXO commitment has been discussed for a long time, and there is no trust problem involved. The only point would be a disadvantage in very long-range chain-reorg situations, which is just a hypothetical problem with no real-world game-theoretic possibility. What would bitcoin be without game theory and incentive analysis anyway?

sr. member
Activity: 279
Merit: 435
October 07, 2019, 04:16:11 PM
#6
A segwit client is simply vulnerable to a flooding attack by an attacker who intentionally generates multiple signatures for the same txn if and only if wtxids are used as a reference in relay network instead of txids.
There was indeed a DoS vector which was hopefully patched (https://github.com/bitcoin/bitcoin/pull/8312, https://github.com/bitcoin/bitcoin/pull/8525, https://github.com/bitcoin/bitcoin/pull/8499) (actually it was quite some time ago: for 0.13).

Considering your comment, I'm afraid that there are some misunderstandings regarding the terms wtxid and txid.

In a legacy bitcoin transaction, with scriptSig being part of the txn body, i.e. not being segregated, the sha256 hash of the txn (which obviously yields different hash values for different scriptSigs), is called wtxid in our (new) terminology.

After segwit we have another option as well: hashing the main txn body (no witness data) and using it as the main reference. This is called the txid.
Obviously, using wtxids opens the door to transaction malleability whether you are segwit-aware or not.
I... don't think I'm confused at all.
Fwiw here is the quote from the BIP.
And it is the most sensitive point :)

Thinking inside the box, you are right: you need witness data to verify. Thinking outside the box, though, you may find it reasonable to have a better, incremental verification strategy. Suppose I start pruning witness data before removing the actual blocks from my history. I could still help with blockchain reconciliation for nodes that are satisfied with a medium level of verification once a threshold of confirmations (more blocks) is reached.

Again, thinking inside the box gives no clue as to why one should prune one's blockchain incrementally. I mean, you either want the block or you don't, yes?

No! Outside the box, things look a bit different. I can imagine a fast-sync strategy in which bootstrapping nodes do not need signatures once a threshold of strongly verified blocks has been reached, but still want to verify the integrity of the blocks and their consistency with the claimed (probably committed) UTXO set. It would be a moderate verification strategy.

So we could have a UTXO set committed to by, say, 1000 blocks, where the older 500 blocks have no witness data and the 500 most recent blocks are fully maintained. A bootstrapping node then verifies that at least 1000 blocks commit to the UTXO hash, of which 500 are strongly verified and stacked upon another, moderately verified half.

Such a node would have a very fast boot process, need only a multi-gigabyte HDD, and still be practically a full node for most usual use cases, just like a pruned node. To be more precise: it would be up to 2 times more compact than a comparable pruned node, thanks to the incremental moderate pruning strategy described above.
I don't find it reasonable ;)
But I see your point: trading a little bit of security for usability (even if it reintroduces multiple (somewhat) trusted actors, which makes things far more difficult). Why all this just for bootstrap time?
legendary
Activity: 1456
Merit: 1175
Always remember the cause!
October 07, 2019, 02:25:37 PM
#5
When I made that comment, I was unaware of where wTxIds are being used. But since they are already computed there is no additional cost and using them makes more sense as the hash covers the entire transaction.
wTxids are totally meaningless, and being "already computed" is implementation-dependent, not a requirement. I can imagine a better bitcoin without such computation at all: witness data could be queried in a much smarter way.

In any hypothetical relay scheme, like minisketch, we don't need wtxids to be relayed. In my idealistic scenario, once a node issues a request for a specific [set of] txids, it can simply specify whether it needs witness data or not, and it would be the sender's responsibility to send proper witness data. The whole witness data can and should be omitted from the initial phase (any sort of gap analysis), and relaying such data may be optional. Again, in my perfect implementation the same basic requirement is met in the bootstrap process: nodes have an option to query a txn without witness data.
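For reference, a request keyed by txid with a "do I want the witness" switch can be sketched as a single inventory entry. This assumes the BIP144 inventory types (`MSG_TX` and `MSG_WITNESS_TX`) and the usual 4-byte type plus 32-byte hash layout. The `inv_entry` helper and the example txid are hypothetical, and this is not a full `getdata` message.

```python
import struct

MSG_TX = 1
MSG_WITNESS_FLAG = 1 << 30
MSG_WITNESS_TX = MSG_TX | MSG_WITNESS_FLAG  # 0x40000001

def inv_entry(txid_hex: str, want_witness: bool) -> bytes:
    """Build one inventory entry: little-endian type, then the 32-byte hash."""
    inv_type = MSG_WITNESS_TX if want_witness else MSG_TX
    # Displayed txids are byte-reversed relative to the wire encoding.
    return struct.pack("<I", inv_type) + bytes.fromhex(txid_hex)[::-1]

example_txid = "11" * 32  # hypothetical id, for illustration only
without_witness = inv_entry(example_txid, want_witness=False)
with_witness = inv_entry(example_txid, want_witness=True)
print(without_witness[:4].hex(), with_witness[:4].hex())
```

Both entries carry the same hash and differ only in the type field, which is roughly the "specify whether it needs witness data" switch described above.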

A segwit client is simply vulnerable to a flooding attack by an attacker who intentionally generates multiple signatures for the same txn if and only if wtxids are used as a reference in relay network instead of txids.

After segwit we have another option as well: hashing the main txn body (no witness data) and using it as the main reference. This is called the txid.
Obviously, using wtxids opens the door to transaction malleability whether you are segwit-aware or not.
That would be a double-spend attempt, and using txids instead is not going to prevent that. The same flooding scenario can still happen if txids were used (change the sequence, for instance, to get a different txid). They'll be rejected and the node will probably ban that IP address.
No double-spend attempt is involved. Just provide different (valid) ECDSA signatures and we have different wtxids for the same txn. Your banning strategy won't work against sophisticated and distributed versions of such an attack. Changing the sequence number is different in this context: it would be an actual double-spend and yields different txids.

Not sure why you are linking this to malleability though.
I think it is quite a similar situation: the attacker can establish multiple connections to multiple nodes, relay the same transactions with different witness data all over the network, and keep the network busy gossiping about them over and over before any node has figured out that many of the wtxids actually refer to a txn it is already aware of.

I do agree it is not exactly similar to the old malleability issue as the transaction won't find its way to the mempool but it will still keep the relay network busy propagating different versions of the same transaction.
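The flooding concern can be illustrated with a toy model: one transaction body, many witness variants, and a node that fetches once per distinct announced id. All names here are hypothetical and the "witness" is just a counter. The sketch does not model signature validation, bans, or the opposite concern raised elsewhere in the thread, where a txid-keyed node cannot tell a good witness from a junk one before fetching.

```python
import hashlib

def h(data: bytes) -> str:
    """Double SHA256 as a hex string (toy ids, not real serialization)."""
    return hashlib.sha256(hashlib.sha256(data).digest()).hexdigest()

def make_variants(body: bytes, n: int):
    """Return n (txid, wtxid) pairs: same body, a different witness each time."""
    return [(h(body), h(body + i.to_bytes(4, "little"))) for i in range(n)]

def count_fetches(announcements, key_index: int) -> int:
    """Fetch once per id not seen before; key_index 0 = txid, 1 = wtxid."""
    seen = set()
    fetches = 0
    for ann in announcements:
        key = ann[key_index]
        if key not in seen:
            seen.add(key)
            fetches += 1
    return fetches

variants = make_variants(b"toy-tx-body", 50)
by_txid = count_fetches(variants, 0)
by_wtxid = count_fetches(variants, 1)
print(by_txid, by_wtxid)
```

In this model the txid-keyed node fetches once and the wtxid-keyed node fetches once per variant, which is the asymmetry being argued about.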
legendary
Activity: 1042
Merit: 2805
Bitcoin and C♯ Enthusiast
October 07, 2019, 01:35:25 PM
#4
When I made that comment, I was unaware of where wTxIds are being used. But since they are already computed there is no additional cost and using them makes more sense as the hash covers the entire transaction.

A segwit client is simply vulnerable to a flooding attack by an attacker who intentionally generates multiple signatures for the same txn if and only if wtxids are used as a reference in relay network instead of txids.

After segwit we have another option as well: hashing the main txn body (no witness data) and using it as the main reference. This is called the txid.
Obviously, using wtxids opens the door to transaction malleability whether you are segwit-aware or not.

That would be a double-spend attempt, and using txids instead is not going to prevent that. The same flooding scenario can still happen if txids were used (change the sequence, for instance, to get a different txid). They'll be rejected and the node will probably ban that IP address.
Not sure why you are linking this to malleability though.
legendary
Activity: 1456
Merit: 1175
Always remember the cause!
October 07, 2019, 10:45:47 AM
#3
Two different wtxids do not represent two different txns but different txids definitively do so.
It does for an upgraded (Segwit aware) client.
A segwit client is simply vulnerable to a flooding attack by an attacker who intentionally generates multiple signatures for the same txn if and only if wtxids are used as a reference in relay network instead of txids.

Considering your comment, I'm afraid that there are some misunderstandings regarding the terms wtxid and txid.

In a legacy bitcoin transaction, with scriptSig being part of the txn body, i.e. not being segregated, the sha256 hash of the txn (which obviously yields different hash values for different scriptSigs), is called wtxid in our (new) terminology.

After segwit we have another option as well: hashing the main txn body (no witness data) and using it as the main reference. This is called the txid.
Obviously, using wtxids opens the door to transaction malleability whether you are segwit-aware or not.

To be more specific: I think even in the bootstrap process we could have segwit witness data pruned if there were enough blocks under which the containing block is buried.
??
Without the witness data, an input (of an up-to-date transaction) can not be validated.
And it is the most sensitive point :)

Thinking inside the box, you are right: you need witness data to verify. Thinking outside the box, though, you may find it reasonable to have a better, incremental verification strategy. Suppose I start pruning witness data before removing the actual blocks from my history. I could still help with blockchain reconciliation for nodes that are satisfied with a medium level of verification once a threshold of confirmations (more blocks) is reached.

Again, thinking inside the box gives no clue as to why one should prune one's blockchain incrementally. I mean, you either want the block or you don't, yes?

No! Outside the box, things look a bit different. I can imagine a fast-sync strategy in which bootstrapping nodes do not need signatures once a threshold of strongly verified blocks has been reached, but still want to verify the integrity of the blocks and their consistency with the claimed (probably committed) UTXO set. It would be a moderate verification strategy.

So we could have a UTXO set committed to by, say, 1000 blocks, where the older 500 blocks have no witness data and the 500 most recent blocks are fully maintained. A bootstrapping node then verifies that at least 1000 blocks commit to the UTXO hash, of which 500 are strongly verified and stacked upon another, moderately verified half.

Such a node would have a very fast boot process, need only a multi-gigabyte HDD, and still be practically a full node for most usual use cases, just like a pruned node. To be more precise: it would be up to 2 times more compact than a comparable pruned node, thanks to the incremental moderate pruning strategy described above.
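The acceptance rule proposed here could be sketched roughly as follows. Everything in it is hypothetical: Bitcoin blocks carry no UTXO commitment today, and `BlockSummary`, its fields, and `accept_snapshot` exist only to make the 1000/500 rule explicit.

```python
from dataclasses import dataclass

@dataclass
class BlockSummary:
    utxo_commitment: str  # hypothetical field: hash of the committed UTXO set
    has_witness: bool     # False once witness data has been pruned
    fully_verified: bool  # True if signatures were checked ("strong" verification)

def accept_snapshot(chain, utxo_hash, min_depth=1000, min_full=500) -> bool:
    """chain is ordered oldest to newest and covers blocks committing to the snapshot."""
    committed = [b for b in chain if b.utxo_commitment == utxo_hash]
    if len(committed) < min_depth:
        return False
    # The most recent min_full blocks must still carry witnesses and be fully checked.
    recent = committed[len(committed) - min_full:]
    return all(b.has_witness and b.fully_verified for b in recent)

old = [BlockSummary("abc", False, False)] * 500  # moderately verified half
new = [BlockSummary("abc", True, True)] * 500    # strongly verified half
print(accept_snapshot(old + new, "abc"))
print(accept_snapshot(old + new[:400], "abc"))
```

The first call passes because 1000 blocks commit to the hash and the newest 500 are fully verified. The second fails because only 900 blocks commit to it.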
sr. member
Activity: 279
Merit: 435
October 07, 2019, 09:58:51 AM
#2
Hello,

Two different wtxids do not represent two different txns but different txids definitively do so.
It does for an upgraded (Segwit aware) client.

Why should you? Because you are an adversary? So, as an adversary, couldn't you produce multiple witness data for the same tx? Aren't we back to the transaction malleability era?
Old clients not aware of wtxids are, and always will be, vulnerable to transaction malleability. So we are not back to the transaction malleability era if we responsibly update our clients :).

My point is wtxids are vulnerable to txn malleability and I see no reason to use them in minisketch or any new proposal.
We now build upon Segwit.

To be more specific: I think even in the bootstrap process we could have segwit witness data pruned if there were enough blocks under which the containing block is buried.
??
Without the witness data, an input (of an up-to-date transaction) can not be validated.
legendary
Activity: 1456
Merit: 1175
Always remember the cause!
October 07, 2019, 08:45:45 AM
#1
Hello,
I made a mistake and posted a reply in @CarltonBanks' thread about minisketch. It is self-moderated and you know Carlton: a troll who fakes being a troll hunter :D


Wtxids are not used anywhere (so it shouldn't be pre-computed already) and they are more expensive to compute,
Sure they are, they're required to tell two different transactions apart.
With all due respect, I completely disagree. Two different wtxids do not represent two different txns but different txids definitively do so.

When txs are only identified by txids I can take a valid transaction mutate its witness to make it invalid (or just too low a feerate), and it'll have the same txid, so if you fetch by txid you can't avoid fetching the same junk multiple times.
Why should you? Because you are an adversary? So, as an adversary, couldn't you produce multiple witness data for the same tx? Aren't we back to the transaction malleability era?
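The quoted argument about fetching by txid can be made concrete with a toy model. A junk copy (invalid witness) and a good copy share one txid but have different wtxids, and the node may or may not remember ids it has rejected. The `simulate` function and the string ids are hypothetical, and the model ignores bans and bandwidth limits.

```python
def simulate(offers, key, remember_rejects):
    """offers: (txid, wtxid, witness_ok) announcements in arrival order.

    Returns (downloads, accepted).
    """
    rejected = set()
    downloads = 0
    accepted = False
    for txid, wtxid, witness_ok in offers:
        ident = txid if key == "txid" else wtxid
        if accepted or ident in rejected:
            continue  # nothing to fetch: already have it, or id is blacklisted
        downloads += 1
        if witness_ok:
            accepted = True
        elif remember_rejects:
            rejected.add(ident)
    return downloads, accepted

junk = ("T", "W-junk", False)
good = ("T", "W-good", True)
offers = [junk, junk, junk, good]

txid_no_memory = simulate(offers, "txid", remember_rejects=False)
txid_with_memory = simulate(offers, "txid", remember_rejects=True)
wtxid_with_memory = simulate(offers, "wtxid", remember_rejects=True)
print(txid_no_memory, txid_with_memory, wtxid_with_memory)
```

In this model a txid-keyed node either re-downloads the junk each time or blacklists the txid and never fetches the good copy. A wtxid-keyed node can blacklist only the junk variant.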

My point is wtxids are vulnerable to txn malleability and I see no reason to use them in minisketch or any new proposal.

To be more specific: I think even in the bootstrap process we could have segwit witness data pruned if there were enough blocks under which the containing block is buried.
