What about a soft fork: pools don't include malled txs and clients don't relay them. I think this would very quickly solve the problem.
What does a malled tx look like? A tx can be mutated in many ways, which makes identifying the "correct" one impossible. Most nodes and pools already drop the duplicate; the issue is that the duplicate is getting to the pool before the original. It wouldn't end up in a block otherwise.
If you have two txs that are equally valid and correct, how do you know which one was mutated after sending? You can't, so you do what you already do: keep the one you receive first and drop the other. That means if the mutated tx gets to a miner's memory pool first, it will get included in a block.
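A rough sketch of that first-seen behaviour, in case it helps. This is a toy, not how any real node stores its memory pool: the `Mempool` class, the `accept` method and the txid strings are all made up for illustration, and an "outpoint" is just a (previous txid, index) tuple.

```python
class Mempool:
    """Toy first-seen pool: one accepted spender per outpoint (hypothetical)."""

    def __init__(self):
        self.spender = {}  # outpoint -> txid of the first tx seen spending it

    def accept(self, txid, outpoints):
        # Reject any tx that spends an outpoint already claimed in the pool.
        # The pool has no way to tell which of two conflicting txs is "original".
        if any(op in self.spender for op in outpoints):
            return False
        for op in outpoints:
            self.spender[op] = txid
        return True


pool = Mempool()
inputs = [("prev_tx", 0)]

# The mutated copy happens to arrive first, so it wins.
mutated_first = pool.accept("txid_mutated", inputs)
original_second = pool.accept("txid_original", inputs)
print(mutated_first, original_second)  # True False
```

Both txs spend the same inputs, so whichever arrives second looks like a double spend and is dropped, even if it is the one the sender actually created.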
I thought it was possible to determine whether a tx looks like it was created by the standard client. If that isn't possible, or there are too many different legitimate tx styles out there, then I guess we will have to deal with malled txs.
All that really needs to be done is for transactions to be reformatted to standard format and txids generated from that. Any other txid would be invalid. Surely?
That is a rather large "all" by itself, as the input script is rather "loose". That isn't all you need to do, though. ECDSA signatures are not immutable - ouch. You also need to restrict signatures to a single "form" (DER), as OpenSSL (which Bitcoin uses) validates a variety of forms - double ouch.
To produce immutable hashes you need to:
a) Limit signatures to DER. OpenSSL verifies non-canonical forms of signatures. Initially Bitcoin was no more restrictive than OpenSSL. Since Bitcoin v0.8, non-DER signatures are non-standard and are not relayed by nodes running v0.8+. This was MtGox's problem: their "Goxxed v0" custom client was creating non-DER signatures, and thus the txs were being dropped by most nodes.
b) Make ECDSA signatures immutable. ECDSA signatures are not immutable; they were never designed to be and they never will be. For a given payload and private key, ECDSA accepts both S and -S: if a signature with S is valid, the same signature with -S is valid too, and anyone can make that swap without the private key. So same tx data, equally valid signature, different hash. To resolve this, "Bitcoin signatures" will need to be more restrictive than ECDSA signatures. In other words, all (future) valid Bitcoin signatures are valid ECDSA signatures, but not all valid ECDSA signatures will be valid Bitcoin signatures.
c) Allow only one form of input script for a given outcome. The Bitcoin protocol is rather loose when it comes to op-codes in the input script. It is possible to make some changes to these and still produce the same output (just like 3 + 2 = 3 + 2 + 0). Fixing this aspect will require tighter rules on what is a valid input script.
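To make point b) concrete, here is a minimal sketch in plain Python (3.8+ for the modular inverse via `pow`). It is a demonstration only, under several assumptions: the curve arithmetic is a bare-bones, non-constant-time secp256k1 implementation; `toy_txid` hashes a made-up serialization, not the real transaction format; the private key `d` and nonce `k` are arbitrary fixed values (a fixed nonce is insecure in real use); and `canonical`, which keeps the smaller of S and N - S, is just one possible way to pin down a single allowed form.

```python
import hashlib

# secp256k1 field prime, group order and generator point.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def mul(k, pt):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = add(result, pt)
        pt = add(pt, pt)
        k >>= 1
    return result


def sign(z, d, k):
    """Textbook ECDSA signature of hash z with key d and nonce k."""
    r = mul(k, G)[0] % N
    s = pow(k, -1, N) * (z + r * d) % N
    return r, s


def verify(z, pub, r, s):
    """Textbook ECDSA verification."""
    if not (0 < r < N and 0 < s < N):
        return False
    w = pow(s, -1, N)
    pt = add(mul(z * w % N, G), mul(r * w % N, pub))
    return pt is not None and pt[0] % N == r


def toy_txid(payload, r, s):
    """Double SHA-256 over a toy serialization (NOT the real tx format)."""
    blob = payload + r.to_bytes(32, "big") + s.to_bytes(32, "big")
    return hashlib.sha256(hashlib.sha256(blob).digest()).hexdigest()


def canonical(r, s):
    """Assumed rule: of S and N - S, only the smaller one is allowed."""
    return r, min(s, N - s)


payload = b"pay 1 BTC to Alice"
z = int.from_bytes(hashlib.sha256(payload).digest(), "big")
d = 0x1E99423A4ED27608A15A2616A2B0E9E52CED330AC530EDCC32C8FFC6A526AEDD  # demo key
k = 0xC0FFEE0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF012345678  # demo nonce
pub = mul(d, G)

r, s = sign(z, d, k)
s_flipped = N - s  # computed without touching the private key

print(verify(z, pub, r, s), verify(z, pub, r, s_flipped))
print(toy_txid(payload, r, s) != toy_txid(payload, r, s_flipped))
print(canonical(r, s) == canonical(r, s_flipped))
```

Both signatures verify against the same payload and public key, yet the hash over the serialized data differs. Once only one of the two S values is accepted, the two variants collapse to a single form.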
That is a lot to do, test, and get right. Think of the "upgrading a plane in flight" analogy.
It can be done in phases to provide a safer path to the hard fork.
First the improved rules would be implemented in a new version of the clients (i.e. all clients would always use +S instead of either +S or -S), but clients wouldn't treat the "worse" tx any differently.
Next clients would favor the "better" form of the transaction (i.e. if a node receives duplicates it will keep the one with the +S).
Next clients would consider the "worse" forms to be non-standard and would refuse to relay them (i.e. -S = non-standard, and tx is dropped by an upgraded node but still valid if in a block).
Finally clients would consider txs which break the new "better" rules to be invalid (i.e. after block # XXXXXXXXXX a tx with a -S is invalid, and a block containing a tx with a -S is invalid).
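The four phases above could be sketched as a small policy table. Everything here is hypothetical naming for illustration (`Phase`, `relay_ok`, `block_ok`, `pick_duplicate`); the `canonical` flag stands for "follows the new better rules, e.g. uses +S", and the phase is passed in directly rather than derived from a block number, since no activation height has been chosen.

```python
from enum import IntEnum


class Phase(IntEnum):
    CREATE_ONLY = 1   # new clients create the better form, accept everything
    PREFER = 2        # on duplicates, keep the better form
    NON_STANDARD = 3  # worse form is not relayed, still valid in a block
    INVALID = 4       # worse form is invalid everywhere


def relay_ok(phase, canonical):
    """Would an upgraded node relay this tx?"""
    return canonical or phase < Phase.NON_STANDARD


def block_ok(phase, canonical):
    """Is a block containing this tx still valid?"""
    return canonical or phase < Phase.INVALID


def pick_duplicate(phase, first_seen, later):
    """Choose between two copies of a tx; each is a (txid, canonical) pair."""
    if phase >= Phase.PREFER and later[1] and not first_seen[1]:
        return later
    return first_seen  # default stays first-seen


worse, better = ("txid_minus_s", False), ("txid_plus_s", True)
for phase in Phase:
    print(phase.name,
          relay_ok(phase, False),
          block_ok(phase, False),
          pick_duplicate(phase, worse, better)[0])
```

Each phase only tightens one thing relative to the previous one, which is what makes the rollout gradual: relaying is restricted before block validity is.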
It isn't going to happen overnight, so in the interim mutable transaction ids are just a reality for Bitcoin.