This. Other exchanges, merchants, and service providers have dealt with this for years.
Urgh, actually I think most of them don't.
All services that give you an unconfirmed transaction ID right at withdrawal time are likely to be vulnerable to the same MtGox issue in one form or another.
IMHO they were just too puny for anyone to have bothered; we'll see sinkers and floaters in the coming weeks.
Not really.
The first thing to point out is that if you are competent, very few of your txs should ever end up dropped from the network. MtGox had thousands of them. They had not one problem but at least 4:
a) double spending previously spent coins = invalid tx so nodes dropped it.
b) spending immature coins = invalid tx so nodes dropped it (although these would eventually get through after 120 blocks).
c) paying insufficient fees for relay (not following the fee rules in the reference client, so nodes running on those rules would drop the tx instead of relaying it).
d) using non-canonical signatures two years after the issue was reported and fixed in the reference client.
This changed what is normally an incredibly rare event (a tx not propagating through the network) into one that was widespread and affecting innocent users. Had this giant cluster fuck of layered incompetency not happened, the actions of any scammer would have been more obvious. If you only have a handful of users reporting they didn't get paid, a scammer stands out a lot more than when 20% to 30% of all your withdrawals across all users are failing.
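Problems a) through c) can in principle be caught before a withdrawal is ever broadcast. A rough sketch of that kind of sanity check is below. Everything in it is hypothetical: the `Coin` record, the `precheck` function, and the default thresholds are placeholders for illustration (the 120 figure is the one mentioned above; the fee number is a stand-in, not the reference client's actual rule).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Coin:
    """Hypothetical record of one unspent output in our own wallet database."""
    txid: str
    index: int
    value: int           # satoshis
    confirmations: int
    is_coinbase: bool
    already_spent: bool  # according to our own records


def precheck(coins: List[Coin], total_out: int, size_bytes: int,
             maturity: int = 120, min_fee_per_kb: int = 10_000) -> List[str]:
    """Return a list of reasons this withdrawal would likely be dropped by nodes.

    maturity and min_fee_per_kb are assumed placeholder values, not
    authoritative network rules.
    """
    problems = []
    for c in coins:
        if c.already_spent:
            # case a) spending coins we have already spent
            problems.append(f"{c.txid}:{c.index} already spent")
        if c.is_coinbase and c.confirmations < maturity:
            # case b) spending immature generated coins
            problems.append(f"{c.txid}:{c.index} immature")
    # case c) fee too low to be relayed (size rounded up to whole kB)
    fee = sum(c.value for c in coins) - total_out
    required = min_fee_per_kb * ((size_bytes + 999) // 1000)
    if fee < required:
        problems.append(f"fee {fee} below required {required}")
    return problems
```

An empty list means none of these three checks fired; anything else is a reason not to hand the user a tx ID yet.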
A small exchange doesn't need an automated system to deal with tx mutability. We don't. That is because 99.9%+ of the tx will never be mutated and there are no false positives. There is nothing wrong with using the tx id as long as you don't rely on it as your first, last, and only method of accounting. Mutability doesn't happen by accident.
IF a user reports they didn't get paid and IF the tx hash is not in the blockchain, then you MANUALLY flag the account and look for a tx which has the same inputs and outputs. If you don't have that laundry list of problems above going on, this should be a VERY VERY RARE EVENT and one that likely warrants manual attention anyway. Sure, a super duper automated system would be great, and if you have the funds to build, extensively test, and deploy it, go ahead, but this isn't a routine occurrence for a properly running node. So if your backend can process 99.9% of txs in an automated fashion and flags 1 in 1000 (and honestly probably more like 1 in 100,000) for manual review, well, guess what? That is called doing business. Most automated systems have some events which are logged for manual review.
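The "same inputs and outputs" lookup can be sketched roughly as follows. This is only an illustration, not anyone's production code: `Tx`, `TxIn`, `TxOut`, `fingerprint`, and `find_mutated_twin` are made-up names, and it assumes you already have the candidate confirmed transactions parsed into these records by whatever node or block explorer interface you use.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class TxIn:
    prev_txid: str   # txid of the output being spent
    prev_index: int  # index of that output


@dataclass(frozen=True)
class TxOut:
    script_pubkey: str  # hex of the output script
    value: int          # satoshis


@dataclass(frozen=True)
class Tx:
    txid: str
    inputs: Tuple[TxIn, ...]
    outputs: Tuple[TxOut, ...]


def fingerprint(tx: Tx):
    """Key that ignores the txid and signatures: only what is spent and who is paid."""
    ins = tuple((i.prev_txid, i.prev_index) for i in tx.inputs)
    outs = tuple((o.script_pubkey, o.value) for o in tx.outputs)
    return (ins, outs)


def find_mutated_twin(sent: Tx, confirmed: Iterable[Tx]) -> Optional[Tx]:
    """Return a confirmed tx with a different txid but identical inputs and outputs."""
    want = fingerprint(sent)
    for tx in confirmed:
        if tx.txid != sent.txid and fingerprint(tx) == want:
            return tx
    return None
```

If this returns a transaction, the user was paid under a different tx ID and the account should go to a human, not to an automatic resend.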
The other thing to keep in mind is that, most likely, the only reason it happened is that the user who reports he "didn't get paid" modified the tx intentionally so he could try to collect a second payment. That means putting a man in the loop is a good idea even if you do have a super duper automated, handle-every-edge-case backend processor.