Good to know about the CoinJoin initiative. After reading the thread, I understand the view that it will have more value the more common it becomes in normal use.
I'd just like to clarify what CoinJoin is NOT able to do (and where it differs from traditional mixing services), to avoid what seem to be frequent misunderstandings. I also have a proposal further below.
All CoinJoin transactions considered in my example are "N=3", and I assume that all transactions are of 1 BTC, there is no change involved (unless stated otherwise) and Tx fees are neglected for the sake of simplicity/clarity:
Person A sends 1BTC from Address 1A1in and uses 1A1out as output (change of 1 BTC goes to 1A2in).
Person B sends 1BTC from Address 1B1in and uses 1B1out as output (no change).
Person C sends 1BTC from Address 1C1in and uses 1C1out as output (no change).
At this point, outside taint analysis cannot tell who is the owner of 1A1out/1B1out/1C1out, even if the owners of 1A1in/1B1in/1C1in were perfectly known. So the taint seems to be removed. But in fact the taint is only "hidden" and may re-appear, as we'll see further below!
Now Person A has another coin that he wants to clean, so he engages in another CoinJoin transaction with N=3:
Person A sends 1BTC from Address 1A2in and uses 1A2out as output (no change).
Person X sends 1BTC from Address 1X2in and uses 1X2out as output (no change).
Person Y sends 1BTC from Address 1Y2in and uses 1Y2out as output (no change).
At this point, Person A has 2 BTC in his wallet, namely 1 BTC each in 1A1out and 1A2out. No outside observer can know that these are his addresses, even if it was publicly known that 1A1in belongs to him. So it seems that Person A has "cleaned" these 2 BTC.
But now Person A makes a fatal mistake, because he doesn't know how CoinJoin works internally and succumbs to the fallacy that his addresses 1A1out and 1A2out are "perfectly cleaned" (as if he had used a traditional mixing service, an online-wallet "back-and-forth" transfer or the like):
He sends, say, 1.5 BTC to his Friend "F", so his transaction is:
Input = 1A1out and 1A2out (=1+1 =2 BTC).
Output = 1Friend (1.5 BTC) and 1Achange (0.5 BTC to his own new change address).
Now, for an outside observer of the blockchain it is extremely easy (even manually, without automated scripts) to re-associate addresses 1A1out and 1A2out with 1A1in and hence with Person A: spending both outputs in one transaction shows that they share an owner, and the only participant common to both CoinJoin transactions is the owner of 1A1in, whose change went to 1A2in. This would not have happened if Person A had done proper (manual) "CoinControl", or if he had mixed his coins with a (trustworthy!) traditional mixer.
Even worse: If Person B makes the same mistake as Person A, then the address 1C1out of Person C is also 100% tainted again, even though she (Person C) did not make any mistake herself!
This is an argument for choosing greater N (or for cascading MANY CoinJoin transactions of N=2), to reduce the probability of getting re-tainted due to mistakes made by the other CoinJoin participants.
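To make the re-linking concrete, here is a minimal sketch of how an observer could reason about the example above. It assumes the observer already ties 1A2in to Person A (because it received A's change) and treats all inputs of one transaction as belonging to one owner. The function names are made up for illustration, this is not how any real analysis tool is written.

```python
# Candidate owners of each CoinJoin output as an outside observer sees them.
# Assumption: the observer already ties 1A2in to Person A, because 1A2in
# received the change from A's input in the first CoinJoin.
FIRST_JOIN = {"A", "B", "C"}
SECOND_JOIN = {"A", "X", "Y"}

candidates = {
    "1A1out": set(FIRST_JOIN),
    "1B1out": set(FIRST_JOIN),
    "1C1out": set(FIRST_JOIN),
    "1A2out": set(SECOND_JOIN),
    "1X2out": set(SECOND_JOIN),
    "1Y2out": set(SECOND_JOIN),
}


def owners_after_cospend(candidates, spent_together):
    """Addresses used as inputs of one transaction are assumed to share an
    owner, so that owner must appear in every address's candidate set."""
    return set.intersection(*(candidates[addr] for addr in spent_together))


def resolve_by_elimination(outputs, participants, known):
    """Once all but one output of a CoinJoin are attributed, the last
    output falls to the last remaining participant."""
    open_outputs = [o for o in outputs if o not in known]
    open_owners = set(participants) - set(known.values())
    if len(open_outputs) == 1 and len(open_owners) == 1:
        return {open_outputs[0]: next(iter(open_owners))}
    return {}


# Person A spends 1A1out and 1A2out together:
print(owners_after_cospend(candidates, ["1A1out", "1A2out"]))  # {'A'}

# Person B is exposed the same way, so Person C is exposed by elimination:
print(resolve_by_elimination(
    ["1A1out", "1B1out", "1C1out"], FIRST_JOIN,
    {"1A1out": "A", "1B1out": "B"}))  # {'1C1out': 'C'}
```

With a greater N the second function returns nothing until N-1 participants have slipped up, which is the argument for larger joins.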
So generally, users of CoinJoin must know that any taint that was (temporarily) removed thanks to CoinJoin gets re-established if the user uses different CoinJoin outputs as common inputs for a future transaction. This taint easily re-appears in the same way even if the user performed a cascade of CoinJoin transactions. So careful manual coin-control is indispensable, and maybe clients will implement algorithms that support the user in his/her coin-control, try to avoid re-establishing taint that was formerly removed by CoinJoin-type techniques, and warn the user when necessary.
So I would use a different terminology: Instead of saying that CoinJoin (and the like) "remove" the taint, I'd rather say it "hides" the taint, or to keep the picture of "colored" coins:
CoinJoin and similar techniques do not remove a [red/black/violet/blue] taint, but they paint white color over this taint. If the user practices bad coin control afterwards (or if the other co-participants of the CoinJoin transaction do so), the white paint covering the taint gets peeled off and the taint re-appears.
This is a fundamental restriction that all blockchain-based-multi-signature taint removal techniques (bitprivacy, CoinJoin, ZeroCoin) have in common as I understand. So the uninformed user may get a false sense of safety for his/her privacy when using these techniques. The only possibility for complete taint removal (instead of just over-painting the taint) is an "off-blockchain mixer", which of course has other disadvantages:
(1) need to trust that the operator does not make records which it hands over to some "3-or-more-Letters-Org"
(2) need to trust the operator that he does not simply keep my coins.
The first disadvantage (1) can be alleviated by looking where the mixer service is located. If there are some mixers in "free" countries, or in some countries that do not co-operate with each other well (e.g. Country X + Y), I should be quite safe w.r.t. risk (1) if I simply concatenate a mixing in country X followed by a mixing in country Y.
For disadvantage (2):
Apparently the only way around is to split up the amount to be cleaned into smaller quantities and serialize the procedure. E.g. if I want to clean 100 BTC, I serialize it into 100 times 1 BTC, i.e. I will not send out the (n+1)th BTC to the mixer before I have received the nth BTC back. So my risk of fraud gets reduced to 1 BTC. The price I pay is the increased Tx fees, so I can halve these fees by accepting twice the risk, selecting 2 BTC increments for the mixing procedure instead of 1 BTC.
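The risk/fee tradeoff is simple arithmetic. A small sketch, assuming a flat fee per round trip (the 20,000 satoshi figure is just a placeholder, not an actual fee level):

```python
import math


def serialized_mixing(total_btc, increment_btc, fee_per_round_sat):
    """Send one increment at a time and wait for it to come back
    before sending the next one."""
    rounds = math.ceil(total_btc / increment_btc)
    return {
        "rounds": rounds,
        # Only one increment is ever in the mixer's hands.
        "max_at_risk_btc": min(increment_btc, total_btc),
        # Assumption: each round trip costs the same flat fee.
        "total_fees_sat": rounds * fee_per_round_sat,
    }


print(serialized_mixing(100, 1, 20_000))
# {'rounds': 100, 'max_at_risk_btc': 1, 'total_fees_sat': 2000000}
print(serialized_mixing(100, 2, 20_000))
# {'rounds': 50, 'max_at_risk_btc': 2, 'total_fees_sat': 1000000}
```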
Of course it would be nice if mixing APIs could be standardized and built into the BTC clients, so that a background job can be scheduled from the client: the user selects the addresses and amount for taint removal plus the mixing increments, the mixing server communicates its fee policy and the one-time address per mixing job via the API, and the client communicates the mixing jobs (amount, source address, pay-back address(es)) to the mixer, etc. Some features I would like to see:
- The client is preferably also configurable w.r.t. time increments: do not send the (n+1)th mixing job immediately after the nth mixing job is done, but wait a number of seconds that is configurable by the user and follows e.g. a random Poisson distribution, to make timestamp-based blockchain analysis more difficult.
- The BTC amounts accepted by the mixer per job should not be arbitrary, but a fixed set (e.g. only allow 1*10^n and 3*10^n BTC per job, where n can be [...]), to make blockchain analysis based on tracking equal transfer sizes more difficult.
- The mixer may also send the return to multiple addresses (as specified by the user client via the API) in multiple transactions, to make tracking more difficult.
- Another idea is to include a "provably fair gambling element" in the payback strategy of the mixer (whose statistical sigma could be agreed upon with the client via the API), to further randomize the payout and make tracking more difficult: e.g. payback = 100% +/- n% - he% - Tx fee, where "n" is Gaussian distributed with STD=sigma as chosen by the client, sigma<=sigma_max as defined by the mixer, and he(%) is the "house edge" of the mixer's payback randomization service. Whether the "house edge" of this gambling element is 0% or actually more (i.e. the mixer takes an extra fee for this randomization/obfuscation element) is up to the mixer policy.
- The "provably fair" verification of the "gambling randomization" would also have to be part of the API, so that the client can verify that this randomization is really fair.
- The payout times of the mixer should have a variable delay relative to the incoming payments, to make tracking even more difficult.
Etc.
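As a rough illustration of three of these ideas (fixed job amounts, randomized waiting time, randomized payback), here is a sketch. All function and parameter names are hypothetical, no such API exists as far as I know. The range of the exponent n is left to the caller, and I read the "Poisson" waiting time as the exponentially distributed gap between events of a Poisson process, which is only one possible interpretation.

```python
import random


def allowed_amounts(exponents):
    """Job sizes of the form 1*10^n and 3*10^n BTC for the given exponents.
    Which exponents a mixer accepts is left open here."""
    return sorted(m * 10.0 ** n for n in exponents for m in (1, 3))


def next_delay(mean_seconds, rng):
    """Waiting time before the next mixing job (assumption: exponential
    gaps, i.e. jobs arrive like events of a Poisson process)."""
    return rng.expovariate(1.0 / mean_seconds)


def payback(amount_btc, sigma_pct, sigma_max_pct, house_edge_pct,
            tx_fee_btc, rng):
    """payback = 100% +/- n% - he% - Tx fee, with n ~ Gaussian(0, sigma)."""
    if sigma_pct > sigma_max_pct:
        raise ValueError("sigma exceeds the mixer's sigma_max")
    n_pct = rng.gauss(0.0, sigma_pct)
    return amount_btc * (100.0 + n_pct - house_edge_pct) / 100.0 - tx_fee_btc


rng = random.Random()          # a real client would want a vetted RNG
print(allowed_amounts([0, 1]))  # [1.0, 3.0, 10.0, 30.0]
print(next_delay(600, rng))     # varies per call
print(payback(1.0, 2.0, 5.0, 1.0, 0.0001, rng))  # varies per call
```

The "provably fair" part (letting the client verify how n was drawn) is not covered by this sketch.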
While writing this, I think it would be good to define a BIP for a mixer-client API that supports all these features *. Does anyone know if this has happened already?
After a while, I assume that an infrastructure of mixers will evolve, similarly to the infrastructure of mining pools that we know today. They can differentiate from each other w.r.t. server location/jurisdiction, fee structure, blockchain trackability defense mechanisms (see examples above), etc.
* If you guys think this makes sense, I may start to work on the definition of such a BIP. Sorry this is OT for CoinJoin (albeit related), but the ideas developed while writing this. Further discussions on this in a separate thread I guess(?), if considered useful.
-------------------------------------------------------------------
2nd topic:
Idea how to minimize the likelihood of a CoinJoin sybil attack (i.e. that the N-1 co-participants of my CoinJoin transaction are "3-letter-hostile-organization spies"):
- I initiate TWO CoinJoin transactions instead of just one, of equal output size, and at about the same time, but I inject them at different points of the network if possible (so that an observer does not know they both originate from me).
- When all co-participants have completed their inputs to reach the number N (here I assume N=10 or so at least), I check if BOTH of my inputs show up amongst these N.
- If yes, I can fairly assume that the CoinJoin server did not get flooded with 3-letter-hostile spying-transactions and also the other N-2 inputs are probably legit, and I will sign my two transactions.
- If no, I must assume that the other N-1 transactions are hostile and my second input has been "squeezed out" and pushed to the next CoinJoin transaction, and I better refrain from signing and try again later. Or I sign anyway (because my non-signing might get interpreted as a DOS attack) but then initiate another CoinJoin transaction where hopefully both of my inputs will show up...
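The check in the steps above boils down to a set comparison. A minimal sketch, with hypothetical names and a minimum N of 10 as assumed above:

```python
def should_sign(my_inputs, proposed_inputs, min_participants=10):
    """Sign only if the join is large enough and BOTH of my
    independently submitted inputs made it into the same join."""
    if len(proposed_inputs) < min_participants:
        return False
    return set(my_inputs) <= set(proposed_inputs)


others = ["in%d" % i for i in range(8)]
print(should_sign(["mine1", "mine2"], others + ["mine1", "mine2"]))   # True
print(should_sign(["mine1", "mine2"], others + ["mine1", "spy"]))     # False
```

Note this only raises my confidence, it proves nothing: an attacker who lets some honest inputs through on purpose would still pass this test.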