Author

Topic: P2P coin mixing (Read 3336 times)

jr. member
Activity: 38
Merit: 3
June 01, 2013, 02:24:49 PM
#17
Suppose I have two outputs A and B, each with a value of 0.5 BTC. Presently there is no information in the blockchain that indicates A and B are owned by the same person. If I want to send 0.75 BTC to a third party, I must create a transaction  that takes A and B as inputs and creates a 0.75 output C and a 0.25 change output D.

An attacker looking at this transaction can conclude that the same person owns A, B and, D because it would not have been necessary to combine A and B if the real spend was only 0.25. It would not matter at all if A and B had been mixed prior to this operation or not.

You are quite right, A, B and D are associated at this point in your situation. Of course, with an M-to-N mixer, A and B could have been combined together into say E previously and then E could not have been linked quite as easily to C or D, as it would be unknown which of them is the payment and which is the change. The same result can be obtained however by mixing N-to-N twice, a first time with A and B to unlink them and a second time with D after the transaction you were proposing. In such a way, the outputs of A and B (which we can call F and G) are linked to spend C and change D as before, but since neither F, G, C or D are linked to anything else after the second mixing, who cares?

Now you are right that my approach takes two mixings while yours takes only one but it is actually more secure since D cannot be linked at all to any other of Alice's future transactions as it is mixed right after. In your approach, an adversary actually would have a 1 out of 2 chance of correctly guessing the change address anyways, whereas this probability of guessing is much smaller in the case of a secure mixing.

you have to combine smaller outputs from time to time in transactions

Yep, that's just the point of mixing these small outputs. After the mixing, you can just throw them all together in one large output if that is more intuitive, but you can actually go straight to spending them together in one or many larger transactions without caring if they are linked together, as they are linked to nothing else.
legendary
Activity: 1400
Merit: 1013
May 27, 2013, 07:12:10 AM
#16
]I don't agree here - after all, at some point in time you simply won't have enough large outputs anymore because all outputs keep shrinking.  Thus you have to combine smaller outputs from time to time in transactions (except if you want to "throw away" too small outputs regularly), which then helps to associate all those to you.
I fail to see in what way you are disagreeing with me. That's exactly what I just said.

]Furthermore, another advantage I see in a mixing service is this:  Somewhere I have to get my initial coins; and if I don't mine, then buying is what I have to do.  If I buy for instance at Mt. Gox (yes, I know about localbitcoins and the like, but what if there are no sellers for cash in my area?), then at least *they* know theoretically my real identity and could connect me to everything I buy with the coins.  Using a mixer can break this identification from my real identity to the bought coins.

Notice that I said that a user does not need a external mixing service for this case. If you withdraw that output to and address you control, you can create a series of transactions from that point on that splits the output in to several new outputs in a manner indistinguishable from spending it. Mt Gox will only know that you had the coins at some point; it will will have no way to know if you still have them or which outputs you may still control as change. This is exactly the same thing that mixing accomplishes, so the only remaining problem to be solved is the case of how to combine multiple small outputs into larger outputs in a way that does not reduce your anonymity.
legendary
Activity: 1135
Merit: 1166
May 27, 2013, 01:01:36 AM
#15
If users follow best practices and never reuse addresses and use clients with good output selection algorithms they don't need a external mixing service. The only time when mixing is necessary is when they want to create an output larger than any available inputs.

I don't agree here - after all, at some point in time you simply won't have enough large outputs anymore because all outputs keep shrinking.  Thus you have to combine smaller outputs from time to time in transactions (except if you want to "throw away" too small outputs regularly), which then helps to associate all those to you.

Furthermore, another advantage I see in a mixing service is this:  Somewhere I have to get my initial coins; and if I don't mine, then buying is what I have to do.  If I buy for instance at Mt. Gox (yes, I know about localbitcoins and the like, but what if there are no sellers for cash in my area?), then at least *they* know theoretically my real identity and could connect me to everything I buy with the coins.  Using a mixer can break this identification from my real identity to the bought coins.

I would be very delighted to see a p2p mixer based on this or a similar concept developed.  After all, it seems not too hard (compared, e.g., to the p2p exchange everyone is talking about at the moment), and would be very valuable in my eyes.
legendary
Activity: 1400
Merit: 1013
May 26, 2013, 11:19:34 PM
#14
However, this can be done without a problem after the mixing since all that you can learn from the blockchain is that some of the addresses in the mixing belonged to the same person, and you would have learned that from any M-to-N mixing with M>N anyways. Of course, that means that the number of addresses in the mixing transaction is larger, but this has the upside of increasing the anonymity gained.
I don't follow this reasoning.

Suppose I have two outputs A and B, each with a value of 0.5 BTC. Presently there is no information in the blockchain that indicates A and B are owned by the same person. If I want to send 0.75 BTC to a third party, I must create a transaction  that takes A and B as inputs and creates a 0.75 output C and a 0.25 change output D.

An attacker looking at this transaction can conclude that the same person owns A, B and, D because it would not have been necessary to combine A and B if the real spend was only 0.25. It would not matter at all if A and B had been mixed prior to this operation or not.

If users follow best practices and never reuse addresses and use clients with good output selection algorithms they don't need a external mixing service. The only time when mixing is necessary is when they want to create an output larger than any available inputs.
jr. member
Activity: 38
Merit: 3
May 25, 2013, 11:29:58 AM
#13
One important difference between the schemes you are considering and Yang's is that in your schemes, the people that do the mixing seem to know the link between the input and output addresses of the people that they are mixing with, while this is not the case in Yang's proposal. Another important difference is that you accept inputs of different sizes, while he doesn't. I understand the appeal of using inputs of different sizes but as Yang suggests, we can just transform these into standardized sizes using a greedy algorithm and mix them separately. This means that you don't have to fiddle the sizes as you have been doing.

As for M-to-N mixing with M>N as justusranvier suggested, I believe it is not really an improvement over N-to-N mixing. My reasoning is the following:
The idea of M-to-N mixing seems to be to reduce the number of change addresses each user has. This cannot be done before the mixing as this would link previous transactions. However, this can be done without a problem after the mixing since all that you can learn from the blockchain is that some of the addresses in the mixing belonged to the same person, and you would have learned that from any M-to-N mixing with M>N anyways. Of course, that means that the number of addresses in the mixing transaction is larger, but this has the upside of increasing the anonymity gained.
staff
Activity: 4284
Merit: 8808
October 19, 2012, 08:05:49 AM
#12
Code:
47 BTC -> 50 BTC
50 BTC -> 47 BTC

Another possibility is just having change.

Code:
47 BTC -> 47 BTC
50 BTC -> 47 BTC
            ->  3 BTC

Then the 3 BTC is considered still unmixed by the original owner.  This avoids DOS attacking the txout set, and requires fewer fees.

Another useful optimization is to trigger these activities when you want to make a payment so that it's not even creating an extra transaction and the ownership of the outputs is further clouded (e.g. in that example there could be 1-2 senders and 1-3 recipients who may or may not be the same as the senders).
legendary
Activity: 1106
Merit: 1004
October 19, 2012, 04:33:33 AM
#11

It's indeed very cool. But I don't know how step 2 actually works. I suppose I'll have to read that 17 pages scientific paper on Secure Multi-party Sorting linked there.....  Undecided

1. Once all of the parties submit their signed transactions, aren't they at risk of losing their funds to a rogue mixer?

No. The transaction is already built with the correct outputs before you sign. You verify that your output is there.
legendary
Activity: 1358
Merit: 1003
Ron Gross
October 19, 2012, 01:57:21 AM
#10
Is anyone working on this?
hero member
Activity: 784
Merit: 1000
0xFB0D8D1534241423
September 19, 2012, 07:23:46 PM
#9
The point of mixing is so you can't tell who owns what coins afterwards. If you have a tx like this:

Code:
47 BTC -> 50 BTC
50 BTC -> 47 BTC

you don't have to be a genius to assume that the 47 BTC input maps to the 47 BTC output. The mix has failed.

There isn't an easy way to fix this if the disparities between parties are huge. But if they aren't, as above, we can do:

Code:
47 BTC   -> 10 BTC
50 BTC   -> 10 BTC
         -> 10 BTC
         -> 10 BTC
         -> 1 BTC
         -> 1 BTC
         -> 5 BTC
         -> 5 BTC
         -> 5 BTC
         -> 20 BTC
         -> 20 BTC

and now it's not really clear which output is owned by whom. There are several combinations that work.

Let a represent the minimum of the inputs,
Let b represent the GCD of the inputs.

For each input, the output is to be as many blocks of a as possible, followed by enough b to complete it.

This method works decently:

Code:
IN        OUT
 8 BTC -> 8 BTC
10 BTC -> 8 BTC
       -> 2 BTC

Code:
IN           OUT
3.5 BTC -> 3.5 BTC
6.5 BTC -> 3.5 BTC
9.0 BTC -> 3.5 BTC
        -> 3.5 BTC
        -> 0.5 BTC
        -> 0.5 BTC
        -> 0.5 BTC
        -> 0.5 BTC
        -> 0.5 BTC
        -> 0.5 BTC
        -> 0.5 BTC
        -> 0.5 BTC
        -> 0.5 BTC
        -> 0.5 BTC
That method is unfortunately not perfectly secure. For example, none of the 0.5 BTC outputs belong to the first party (the one supplying 3.5 BTC).
The last one could be done better and shorter:
Code:
3.5 BTC -> 3.0 BTC
        -> 0.5 BTC
6.5 BTC -> 3.0 BTC
        -> 3.0 BTC
        -> 0.5 BTC
9.0 BTC -> 3.0 BTC
        -> 3.0 BTC
        -> 0.5 BTC
        -> 0.5 BTC
        -> 0.5 BTC
        -> 0.5 BTC
        -> 0.5 BTC
        -> 0.5 BTC
13 outputs, instead of 14. Also, looking at the transaction, you cannot figure out any information with 100% certainty about the identity of any one of the outputs.

So we must translate this
Quote
For each input, the output is to be as many blocks of a as possible, followed by enough b to complete it.
Into
Quote
For each input, the output must be composed of blocks no greater than a. Each input must correspond to at least one output of each size block.
Which is only a bound, not a function.
Since all of the inputs are in a single transaction, there is no way to map an input to an output. I can find no way to further shorten the above example.

This example:
Code:
IN        OUT
 8 BTC -> 8 BTC
10 BTC -> 8 BTC
       -> 2 BTC
Fails in a very bad way: the 2 BTC output obviously belongs to the owner of the 10 BTC input.

There is also the unfortunate aspect of probability: looking at the example with 3.5, 6.5, and 9 as inputs, each 0.5 output has a 75% chance of belonging to the owner of the 9 input.


Here is the optimal solution.
Several parties can get together and mix like this:
Code:
3.5 BTC    3.5 BTC
6.5 BTC -> 3.5 BTC
9.0 BTC    3.5 BTC
           3.0 BTC
           5.5 BTC
Of course, the 3.0 and 5.5 BTC outputs are identifiable. They do another transaction*
Code:
3.0 BTC -> 3.0 BTC
5.5 BTC -> 3.0 BTC
        -> 2.5 BTC
The final 2.5 BTC is impossible to mix satisfactorily with the other parties.
* this can all be done in one transaction, actually

Doing an analysis on this transaction yields:
- Each of the 3.5 BTC outputs have a 33% chance of belonging to any given party. 1/3 is obviously the maximum anonymization which can be achieved with 3 parties.
- Each of the 3.0 BTC outputs have a 50% chance of belonging to B or C, and do not belong to A. 1/2 is the maximum anonymization which can be achieved with 2 parties, and trying to mix these coins with party A ruins the perfect anonymization of the 3.5 BTC outputs (see below).
- The 2.5 BTC output belongs to C

Mathematically, any good permutation is equivalent. For example, given 38 0.5 outputs, each output has a 7/38 chance of belonging to party A, a 13/38 chance of belonging to party B, and a 18/38 chance of belonging to party C.



Now that I've shown perfect anonymization to be impossible with unequal inputs, here's my idea of a practical implementation:
- Each node which wants to mix sets two values, Min and Max. Min being the lowest value it wishes to deal with (this is a tradeoff -- a lower value means higher anonymity, but a higher value saves money on tx fees and blockchain size etc), and Max being the amount of money it wishes to supply to the input of the transaction (setting this to the maximum amount of money in the wallet risks disclosing how much money you own and could theoretically allow identification of addresses, but it lets all of your money be anonymized later on.)
- Each node also has a MaxParties or MaxTxSize constant to avoid some global mixing which crashes every computer and begins the apocalypse.
- The node broadcasts these through some channel (p2p or centralized) and connects to peers such that Amax >= Bmin and Bmax >= Amin for all peers A and B.
- After MaxParties is reached or some time limit is expired, they begin constructing a transaction as in the iterative example above.
Code:
Populate associative array A with inputs from each party (each party supplies Max as broadcasted as a Tx input) and an array of output addresses per input.
While array A is not empty
{
    find the minimum value in A
    add an output of this minimum size to the first address in each outputaddress array to the transaction
    remove the minimum key-array pair from A
    remove the first address from each array in A
}
Verify that the transaction spends the total of your input addresses to your output addresses
Sign the transaction
Once signed by everyone, broadcast it.
I would notate array A as something like {3.2: [1abc, 1def], 5.0: [1ghi, 1jkl]...}
In order to find how many output addresses to supply, take Max from each party and run the above code, but keeping a tally of outputaddresses rather than adding outputs pointing to them.
Once this is done, each party can help populate array A by adding their Max and list of addresses, and signing that with the address which is to supply the input.
sr. member
Activity: 323
Merit: 250
July 20, 2012, 04:08:46 PM
#8

This looks very cool. A couple of questions:

1. Once all of the parties submit their signed transactions, aren't they at risk of losing their funds to a rogue mixer?
2. How is m determined for the 1/m probability of having to do another mixing?
staff
Activity: 4270
Merit: 1209
I support freedom of choice
legendary
Activity: 1246
Merit: 1077
July 17, 2012, 03:23:32 PM
#6
The point of mixing is so you can't tell who owns what coins afterwards. If you have a tx like this:

Code:
47 BTC -> 50 BTC
50 BTC -> 47 BTC

you don't have to be a genius to assume that the 47 BTC input maps to the 47 BTC output. The mix has failed.

There isn't an easy way to fix this if the disparities between parties are huge. But if they aren't, as above, we can do:

Code:
47 BTC   -> 10 BTC
50 BTC   -> 10 BTC
         -> 10 BTC
         -> 10 BTC
         -> 1 BTC
         -> 1 BTC
         -> 5 BTC
         -> 5 BTC
         -> 5 BTC
         -> 20 BTC
         -> 20 BTC

and now it's not really clear which output is owned by whom. There are several combinations that work.

Let a represent the minimum of the inputs,
Let b represent the GCD of the inputs.

For each input, the output is to be as many blocks of a as possible, followed by enough b to complete it.

This method works decently:

Code:
IN        OUT
 8 BTC -> 8 BTC
10 BTC -> 8 BTC
       -> 2 BTC

Code:
IN           OUT
3.5 BTC -> 3.5 BTC
6.5 BTC -> 3.5 BTC
9.0 BTC -> 3.5 BTC
        -> 3.5 BTC
        -> 0.5 BTC
        -> 0.5 BTC
        -> 0.5 BTC
        -> 0.5 BTC
        -> 0.5 BTC
        -> 0.5 BTC
        -> 0.5 BTC
        -> 0.5 BTC
        -> 0.5 BTC
        -> 0.5 BTC
legendary
Activity: 1400
Merit: 1013
July 17, 2012, 02:54:49 PM
#5
Code:
47 BTC   -> 10 BTC
50 BTC   -> 10 BTC
         -> 10 BTC
         -> 10 BTC
         -> 1 BTC
         -> 1 BTC
         -> 5 BTC
         -> 5 BTC
         -> 5 BTC
         -> 20 BTC
         -> 20 BTC
You don't need an external mixing service to anonymously break large bitcoin balances into small bitcoin balances - you need one to combine small balances into larger balances.

So a better use case would be several people sending their small balances to a mixing address to obtain larger balances. If everybody takes out bitcoins in increments of BTC1, for example then it's even harder to figure out which inputs map to a particular output because all outputs are the same.
legendary
Activity: 1526
Merit: 1134
July 17, 2012, 11:12:49 AM
#4
The point of mixing is so you can't tell who owns what coins afterwards. If you have a tx like this:

Code:
47 BTC -> 50 BTC
50 BTC -> 47 BTC

you don't have to be a genius to assume that the 47 BTC input maps to the 47 BTC output. The mix has failed.

There isn't an easy way to fix this if the disparities between parties are huge. But if they aren't, as above, we can do:

Code:
47 BTC   -> 10 BTC
50 BTC   -> 10 BTC
         -> 10 BTC
         -> 10 BTC
         -> 1 BTC
         -> 1 BTC
         -> 5 BTC
         -> 5 BTC
         -> 5 BTC
         -> 20 BTC
         -> 20 BTC

and now it's not really clear which output is owned by whom. There are several combinations that work.
legendary
Activity: 1400
Merit: 1013
July 16, 2012, 05:08:38 PM
#3
Because each input is mapped to multiple outputs of a random value split, chain observers cannot tell who owns the resulting outputs, assuming no huge disparities in the values put into the mix. If there are huge disparities, you can assume that the tiny input maps to the set of tiny outputs.
Why are random output values better than identical output values?
legendary
Activity: 1526
Merit: 1134
July 16, 2012, 03:42:20 AM
#2
It can be done simpler and with better scalability if you avoid CHECKMULTISIG.

The participants inform each other of what outputs they want (script+value). Each participant submits several outputs and a randomized ordering is agreed, for example, by having each participant create a very large random number for themselves.

Each participant constructs a transaction that pays out to the full group of outputs. They then create a single input connected to the value they wish to have mixed and sign it with SIGHASH_ANYONECANPAY and broadcast the input to the others along with another large random number. Each time a broadcast input is received it is appended to the transaction and when the full value is accumulated, the inputs are sorted by the random number field to ensure everyone has a bitwise-identical transaction. The participants can now announce the final transaction to the network.

SIGHASH_ANYONECANPAY makes the signatures independent of the other inputs. So it's possible to sign in isolation and later merge the inputs together into one valid transaction.

Because each input is mapped to multiple outputs of a random value split, chain observers cannot tell who owns the resulting outputs, assuming no huge disparities in the values put into the mix. If there are huge disparities, you can assume that the tiny input maps to the set of tiny outputs.
legendary
Activity: 1400
Merit: 1013
July 14, 2012, 07:38:56 PM
#1
These are some thoughts I had while considering the problem of anonymous coin mixing. Not all of the features exist currently in bitcoin but I believe they are possible:

N agents meet over an anonymous communication channel. They collaborate to create an N-of-N multisig address (M). They also agree on a time limit (L) in the form of a future block number and a payout amount (A).

Once those preconditions have been met the begin constructing a transaction (T) which takes M as an input and several outputs of size A. They nominate output addresses in a manner which can not be traced to any particular participant.

When all N agents are satisfied with T they sign the transaction but wait to broadcast it until M is sufficiently funded.

Now it's up to all participants to fund M before the time limit expires. In order to prevent cheating, the transactions they use to fund M expire if they are not included in a block prior to L and are invalid unless they are included in the same block as T. Each participant should include additional input beyond the outputs they have specified as a contribution to the transaction fee. They can attempt to contribute less than their share but that makes it unlikely that T will be processed.

If M is not sufficiently funded prior to L then mixing fails. T never happens and everybody gets their coins back because all the transactions which have M as an output are invalid.
Jump to: