I don't really expect this will be implemented, but I hope people will find this interesting.
Transaction structure is the same, but an extra field, txSize, is added. The transaction hash is the hash of everything with the scriptSig removed, and this is the hash that goes into the Merkle Root. As the scriptSig does not go into the Merkle Root, it won't really be recorded in the blockchain. People can freely mutate the scriptSig (even after confirmation). Miners will accept a transaction if they see a valid scriptSig and the transaction size is equal to txSize. This will make sure the block size is not mutable.
Payer could opt to sign the txSize or not.
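For concreteness, here is a minimal Python sketch of the proposed hashing and acceptance rules. The dict layout, field names, and the flat serialization are illustrative stand-ins, not the real Bitcoin wire format:

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    """Bitcoin's usual double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def txid_without_scriptsig(tx: dict) -> bytes:
    """Transaction hash with every scriptSig omitted, as proposed.

    `tx` is a simplified stand-in for the real format:
    {'version': int, 'tx_size': int,
     'inputs':  [{'prevout': bytes, 'script_sig': bytes}, ...],
     'outputs': [{'value': int, 'script_pubkey': bytes}, ...]}
    """
    parts = [tx['version'].to_bytes(4, 'little'),
             tx['tx_size'].to_bytes(4, 'little')]   # txSize is the new field
    for txin in tx['inputs']:
        parts.append(txin['prevout'])                # scriptSig deliberately skipped
    for txout in tx['outputs']:
        parts.append(txout['value'].to_bytes(8, 'little'))
        parts.append(txout['script_pubkey'])
    return double_sha256(b''.join(parts))

def miner_accepts(tx: dict, serialized: bytes, script_sigs_valid: bool) -> bool:
    """Miner-side rule: valid scriptSigs and actual size == txSize."""
    return script_sigs_valid and len(serialized) == tx['tx_size']
```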
This could be a hardfork, or a softfork with the auxiliary block:
https://bitcointalksearch.org/topic/auxiliary-block-increasing-max-block-size-with-softfork-283746

Any comment?
But does this stop malleable transactions? Even if the transaction size is the same, you can still create malleable transactions by shuffling inputs/outputs around, right?
However, I had an idea last night that is still stuck in my head, and I need to get it falsified: it is stupidly simple, so I'm most likely wrong. So help me prove this wrong.
First, let's define a total order on all transactions:
Tx(A) < Tx(B) if SerializedByteSeq(Tx(A)) < SerializedByteSeq(Tx(B))
where the '<' operator in SerializedByteSeq(X) < SerializedByteSeq(Y) is lexicographic byte-string comparison.
Furthermore, there is always a smallest element in this total order. I'm not sure what the smallest possible transaction looks like, but there is definitely one. Note that if two transactions are of equal size (in bytes), we still apply the lexicographic rule ('AA' < 'AB').
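In code the comparison itself is trivial: Python's '<' on byte strings is exactly this lexicographic order. A tiny sketch (SerializedByteSeq is a stand-in for whatever canonical serialization the network uses):

```python
def tx_less_than(ser_a: bytes, ser_b: bytes) -> bool:
    """Tx(A) < Tx(B): lexicographic comparison of the serialized bytes."""
    return ser_a < ser_b

# Equal-size strings compare byte by byte, and a proper prefix sorts first:
assert tx_less_than(b'AA', b'AB')   # the 'AA' < 'AB' rule from above
assert tx_less_than(b'A', b'AA')    # shorter prefix comes first
```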
Because we have a total order, any subset of transactions is also totally ordered.
Now let's consider all possible transactions for a given set of inputs + outputs.
Fixing the inputs and outputs gives us such a subset.
And since only finitely many transactions can spend a given set of inputs to a given set of outputs (their size is bounded), there must be a smallest such transaction (again using the lexicographic definition).
Now, for each Bitcoin node in the network, we will from now on apply the following rules to unconfirmed transactions:
* If we receive a transaction Tx(A) with inputs+outputs IO(Tx(A)) = <{Inputs},{Outputs}> and we haven't seen any transaction with that IO set before, we add it to our pool of unconfirmed transactions.
* If we receive a transaction Tx(A) with IO(Tx(A)) = <{Inputs},{Outputs}> AND we have an existing Tx(B) with IO(Tx(B)) = <{Inputs},{Outputs}> (same inputs and outputs) AND Tx(A) < Tx(B), then we replace the existing Tx(B) with Tx(A). Otherwise we just drop Tx(A). (A sketch of both rules follows below.)
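A minimal sketch of these two rules, reusing the simplified transaction dict from the earlier sketch. The IO-key extraction is a toy stand-in for however a real node would canonicalize the input and output sets:

```python
# mempool maps an IO key -> the smallest serialized transaction seen so far
mempool = {}

def io_key(tx: dict) -> tuple:
    """IO(Tx) = <{Inputs},{Outputs}> as a hashable key (toy extraction)."""
    inputs = frozenset(txin['prevout'] for txin in tx['inputs'])
    outputs = frozenset((txout['value'], txout['script_pubkey'])
                        for txout in tx['outputs'])
    return (inputs, outputs)

def on_receive(tx: dict, serialized: bytes) -> None:
    """Apply the two replacement rules from the list above."""
    key = io_key(tx)
    existing = mempool.get(key)
    if existing is None:
        mempool[key] = serialized        # rule 1: first tx for this IO set
    elif serialized < existing:          # rule 2: strictly smaller replaces
        mempool[key] = serialized
    # otherwise the incoming transaction is simply dropped
```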
The only thing the original broadcaster of a transaction needs to do is pick the smallest transaction (w.r.t. the total order), and then it will be impossible for an attacker's mutated version to take precedence.
The pros of this solution are:
* It doesn't require any data structure changes
* Nodes can smoothly update to the new version (using the above rules) to gradually block malleable transactions and DoS attempts.
* It's 100% backwards compatible with what we have today. I'm not saying that the Tx(B) in Tx(A) < Tx(B) is an invalid transaction, just that Tx(B) has lower precedence.
So how hard is it for an original Tx broadcaster to pick the lexicographically smallest transaction?
It shouldn't be that hard. The whole thing reminds me of how you sign a set of strings in X.509 certificates that use ASN.1 DER encoding.
What you normally do (to ensure a unique hash) is to sort the strings lexicographically first. If you look at all the sub-sequences of data in a Tx string, it shouldn't be that hard to work out the best order (at least not for simple transactions).
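As a toy illustration of that DER-style idea (ignoring the real serialization format, and ignoring that reordering inputs/outputs would generally require re-signing): plain lexicographic sorting of variable-length pieces does not always minimize the concatenation, but the classic "compare a+b against b+a" sort does:

```python
from functools import cmp_to_key

def _concat_cmp(a: bytes, b: bytes) -> int:
    """Order pieces so that their concatenation is lexicographically smallest.

    Plain sorting can fail for variable-length pieces: sorted([b'C', b'CA'])
    gives b'C' + b'CA' = b'CCA', but b'CA' + b'C' = b'CAC' is smaller.
    """
    if a + b < b + a:
        return -1
    if a + b > b + a:
        return 1
    return 0

def smallest_tx_bytes(inputs: list, outputs: list) -> bytes:
    """Toy model: a tx is just its serialized inputs followed by its outputs.

    The inputs block has the same total length under any permutation, so
    minimizing it first and then the outputs block minimizes the whole string.
    """
    ins = sorted(inputs, key=cmp_to_key(_concat_cmp))
    outs = sorted(outputs, key=cmp_to_key(_concat_cmp))
    return b''.join(ins) + b''.join(outs)
```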
As this is way too simple, now shoot me in the head and prove me wrong.