Author

Topic: [RFD] Coin selection algorithm and anonymity (Read 2507 times)

member
Activity: 61
Merit: 10
April 11, 2011, 02:12:11 PM
#7
Ok, so I've worked on this a little for you.

I took the SelectCoinsMinConf function in main.cpp and changed it to match your descriptions, so I have SelectCoins1Conf, SelectCoins2Conf,  and SelectCoins3Conf (4 is going to take a little more work to keep the transactions small).  These functions are called in the SelectCoins function.  I'm not completely sure what the variables nConfMine and nConfTheirs but they seem to have to do with how deep the transactions are in the block chain, so I haven't really messed with them.

Look at the attached file below (can't get a file to attach; upload folder is full) it is just has the new functions and the changed SelectCoins function.  Tell me what you think.  I haven't had time to test this.  I'll have to setup something for the testnet.


Code:
/*
Select a coin which is exactly twice the value of the payment
This makes the two outputs identical, making it impossible to guess which is the payment and which is
the change. Variations: select two or more coins that total exactly twice the payment amount; select
one coin which is approximately twice the payment amount (this will result in two outputs which are close
in value, and as shown above you cannot reliably guess which is the payment)
*/

// Look for a coin that is within 10% of twice the right size
bool SelectCoins1Conf(int64 nTargetValue, int nConfMine, int nConfTheirs, set& setCoinsRet)
{
setCoinsRet.clear();

CRITICAL_BLOCK(cs_mapWallet)
{
vector vCoins;
vCoins.reserve(mapWallet.size());

for (map::iterator it = mapWallet.begin(); it != mapWallet.end(); ++it)
vCoins.push_back(&(*it).second);
random_shuffle(vCoins.begin(), vCoins.end(), GetRandInt);

foreach(CWalletTx * pcoin, vCoins)
{
if (!pcoin->IsFinal() || pcoin->fSpent || !pcoin->IsConfirmed())
continue;

int nDepth = pcoin->GetDepthInMainChain();

if (nDepth < (pcoin->IsFromMe() ? nConfMine : nConfTheirs))
continue;

int64 n = pcoin->GetCredit();

if (n <= 0)
continue;

if (n == nTargetValue)
{
setCoinsRet.insert(pcoin);
return(true);
}else if ((n == (2 * nTargetValue)) ||
 (n >= (1.9 * nTargetValue)) ||
 (n <= (2.1 * nTargetValue)))
{
setCoinsRet.insert(pcoin);
return(true);
}
}
}

return(false);
}

/*
Select two coins such that either (a) the value of each input is less than the payment amount and
sums to greater than (but NOT equal to) the payment amount, or (b) the value of each input is greater
than the payment amount

Example:
A=4, B=2, O1=5, O2=1

A tracker looking at this transaction cannot tell if it satisfies condition (a) or condition
(b), therefore they cannot determine which is the payment and which is the change.
For condition (a), O1 is the payment, and for condition (b) O2 is the payment.
*/
bool SelectCoins2Conf(int64 nTargetValue, int nConfMine, int nConfTheirs, set& setCoinsRet)
{
setCoinsRet.clear();

CRITICAL_BLOCK(cs_mapWallet)
{
vector vCoins;
vCoins.reserve(mapWallet.size());

size_t pCount = 0;

for (map::iterator it = mapWallet.begin(); it != mapWallet.end(); ++it)
{
vCoins.push_back(&(*it).second);
++pCount;
}
random_shuffle(vCoins.begin(), vCoins.end(), GetRandInt);

CWalletTx* pfoundlargercoin1 = NULL;

CWalletTx* smaller_coins[pCount];
size_t smaller_count = 0;
int64 smaller_value = 0;

foreach(CWalletTx * pcoin, vCoins)
{
if (!pcoin->IsFinal() || pcoin->fSpent || !pcoin->IsConfirmed())
continue;

int nDepth = pcoin->GetDepthInMainChain();

if (nDepth < (pcoin->IsFromMe() ? nConfMine : nConfTheirs))
continue;

int64 n = pcoin->GetCredit();

if (n <= 0)
continue;

if (n > nTargetValue)
{
if (pfoundlargercoin1 == NULL)
{
pfoundlargercoin1 = pcoin;
continue;
} else
{
setCoinsRet.insert(pfoundlargercoin1);
setCoinsRet.insert(pcoin);
return(true);
}
} else if (n < nTargetValue)
{
smaller_coins[smaller_count++] = pcoin;
smaller_value += n;
}
}
if (smaller_value < nTargetValue ||
smaller_value == nTargetValue && smaller_count < 2)
return(false);
for (int iLoop1 = 0; iLoop1 < smaller_count; ++iLoop1)
{
for (int iLoop2 = iLoop1; iLoop2 < smaller_count; ++iLoop2)
{
if (smaller_coins[iLoop1]->GetCredit() + smaller_coins[iLoop2]->GetCredit() >= nTargetValue)
{
setCoinsRet.insert(smaller_coins[iLoop1]);
setCoinsRet.insert(smaller_coins[iLoop2]);
return(true);
}
}
}
}

return(false);
}

/*
Select two coins: one whose value exactly matches the payment, and another coin at random.
Example:
A=5, B=2; O1=5, O2=2

It's rather obvious what we're doing here, but since we (and the tracker) know the payment and
change are randomized, a tracker cannot know which is the payment and which is the change.
*/
bool SelectCoins3Conf(int64 nTargetValue, int nConfMine, int nConfTheirs, set& setCoinsRet)
{
setCoinsRet.clear();

CRITICAL_BLOCK(cs_mapWallet)
{
vector vCoins;
vCoins.reserve(mapWallet.size());

size_t pCount = 0;

for (map::iterator it = mapWallet.begin(); it != mapWallet.end(); ++it)
{
vCoins.push_back(&(*it).second);
++pCount;
}
random_shuffle(vCoins.begin(), vCoins.end(), GetRandInt);

CWalletTx* smaller_coins[pCount];
size_t smaller_count = 0;
int64 smaller_value = 0;

CWalletTx* pfrandomcoin = NULL;
CWalletTx* pexactcoin = NULL;

foreach(CWalletTx * pcoin, vCoins)
{
if (!pcoin->IsFinal() || pcoin->fSpent || !pcoin->IsConfirmed())
continue;

int nDepth = pcoin->GetDepthInMainChain();

if (nDepth < (pcoin->IsFromMe() ? nConfMine : nConfTheirs))
continue;

int64 n = pcoin->GetCredit();

if (n <= 0)
continue;

if (n == nTargetValue)
{
pexactcoin = pcoin;

}
else
pfrandomcoin = pcoin;

if (pexactcoin != NULL && pfrandomcoin != NULL)
{
setCoinsRet.insert(pexactcoin);
setCoinsRet.insert(pfrandomcoin);
return(true);
}
}
}

return(false);
}
/*
A special variation on (1) (I call this one the "In yo face!" selection): Select multiple coins adding
to more than the payment amount, and if possible ensure at least one coin is redundant.
Example:
A=2, B=3, C=1, D=1, E=2, O1=4, O2=5

It's clear from examining this transaction that several combinations of the coins could satisfy either
O1 or O2, so the tracker cannot tell which is the payment and which is the change. Note that this algorithm,
unlike the others, is guaranteed to find a set of coins that can be used.
*/
bool SelectCoins4Conf(int64 nTargetValue, int nConfMine, int nConfTheirs, set& setCoinsRet)
{
setCoinsRet.clear();

CRITICAL_BLOCK(cs_mapWallet)
{
vector vCoins;
vCoins.reserve(mapWallet.size());

size_t pCount = 0;

for (map::iterator it = mapWallet.begin(); it != mapWallet.end(); ++it)
{
vCoins.push_back(&(*it).second);
++pCount;
}
random_shuffle(vCoins.begin(), vCoins.end(), GetRandInt);

CWalletTx* smaller_coins[pCount];

CWalletTx* pfoundlargercoin1 = NULL;
size_t smaller_count = 0;
int64 smaller_value = 0;

foreach(CWalletTx * pcoin, vCoins)
{
if (!pcoin->IsFinal() || pcoin->fSpent || !pcoin->IsConfirmed())
continue;

int nDepth = pcoin->GetDepthInMainChain();

if (nDepth < (pcoin->IsFromMe() ? nConfMine : nConfTheirs))
continue;

int64 n = pcoin->GetCredit();

if (n <= 0)
continue;

if (n > nTargetValue)
{
if (pfoundlargercoin1 == NULL)
{
pfoundlargercoin1 = pcoin;
continue;
} else
{
setCoinsRet.insert(pfoundlargercoin1);
setCoinsRet.insert(pcoin);
return(true);
}
} else if (n < nTargetValue)
{
smaller_coins[smaller_count++] = pcoin;
smaller_value += n;
}
}
if (smaller_value < nTargetValue ||
smaller_value == nTargetValue && smaller_count < 2)
return(false);
for (int iLoop1 = 0; iLoop1 < smaller_count; ++iLoop1)
{
for (int iLoop2 = iLoop1; iLoop2 < smaller_count; ++iLoop2)
{
if (smaller_coins[iLoop1]->GetCredit() + smaller_coins[iLoop2]->GetCredit() >= nTargetValue)
{
setCoinsRet.insert(smaller_coins[iLoop1]);
setCoinsRet.insert(smaller_coins[iLoop2]);
return(true);
}
}
}
}

return(false);
}

bool SelectCoins(int64 nTargetValue, set& setCoinsRet)
{
return (SelectCoins2Conf(nTargetValue, 1, 6, setCoinsRet) ||
SelectCoins1Conf(nTargetValue, 1, 6, setCoinsRet) ||
SelectCoins3Conf(nTargetValue, 1, 6, setCoinsRet) ||
SelectCoinsMinConf(nTargetValue, 1, 6, setCoinsRet) ||
SelectCoins2Conf(nTargetValue, 1, 1, setCoinsRet) ||
SelectCoins1Conf(nTargetValue, 1, 1, setCoinsRet) ||
SelectCoins3Conf(nTargetValue, 1, 1, setCoinsRet) ||
SelectCoinsMinConf(nTargetValue, 1, 1, setCoinsRet) ||
SelectCoins2Conf(nTargetValue, 0, 1, setCoinsRet) ||
SelectCoins1Conf(nTargetValue, 0, 1, setCoinsRet) ||
SelectCoins3Conf(nTargetValue, 0, 1, setCoinsRet) ||
SelectCoinsMinConf(nTargetValue, 0, 1, setCoinsRet));
}
member
Activity: 98
Merit: 20
Note that the wallet's coins are first randomized, and then coin selection is initiated.
True, but the randomization really doesn't seem to do much, except perhaps if you have multiple coins that are exactly the right value. In that case, the coin will be randomly selected.
hero member
Activity: 540
Merit: 500
An important part of a future coin selection algorithm should be the ability to spend self-generated coins (which have no history) when that is important.

This patch ( https://bitcointalksearch.org/topic/m.77952 ) is the first step. Next one is to add some mechanism to select from address(es).
donator
Activity: 826
Merit: 1060
An important part of a future coin selection algorithm should be the ability to spend self-generated coins (which have no history) when that is important.
legendary
Activity: 1222
Merit: 1016
Live and Let Live
Sounds logical and rational upon first read.
I agree that the choice of coins should be geared towards making outputs of close to equal size (some more going to the 3rd party, some less going to the 3rd party).
legendary
Activity: 1596
Merit: 1100
Note that the wallet's coins are first randomized, and then coin selection is initiated.
member
Activity: 98
Merit: 20
    Abstract
    The transaction creation algorithm is not optimized for anonymity. It leaks information which could potentially be used to track money flowing through the system. While this may not be of concern to some individuals, it is of concern to others. In order to maximize anonymity for those who wish it, the behaviour of the transaction creation in the standard client should be biased as much as possible towards anonymity. This biasing can easily be accomplished without incurring any extra transaction fees.

    The Problem
    One of Bitcoin's properties is its ability to be anonymous. This ability is not perfect, but it is there. However, the current implementation allows enough side-channel information leakage to make it less difficult for someone to track the flow of cash through various transfers. Closing these side channels may not stop a determined analyst, especially if they have sufficient computing resources, but it will certainly make their job harder. To make the tracker's job harder, we need to minimize the amount of information that allows the tracker to draw an association between two bitcoin addresses.

    The primary problem with the current software is the coin selection algorithm (by "coin" I mean a previous transaction output, i.e. vin[x].prevout. I will sometimes use the terms "coin" and "address" interchangeably). By the way, transaction fees do not affect this discussion, and are assumed to be included in the "payment amount" - the amount of Bitcoins being transferred from one person to another.

    The selection algorithm, which I call the "best fit" algorithm, boils down to:
    • If there is a coin which exactly matches the payment amount, then use it
    • Otherwise, find the coin (if any) which exceeds the payment amount by the least amount; i.e. if the payment is 10BTC and you have an 11BTC coin and a 15BTC coin, use the 11BTC coin
    • Otherwise, find the best combination of coins which sums closest to the payment amount.
    The selection algorithm makes it a little easier for someone to track the ownership of a given coin, which could allow the tracker to link two transactions to a given individual.

    Rule 1 is easy. The coin goes from address A to address B. Following this transaction is trivial.

    Rule 2 is a little more difficult, but not much. The standard client doesn't allow a single payment to multiple recipients, therefore one of the two outputs (the change) must be controlled by the same person who controls address A. The transaction generation routine mitigates this somewhat by randomly selecting whether the change goes into vout[0] or vout[1]. But we know (and therefore the tracker also knows) that the client will try to choose a coin which comes as close as possible in value to the payment, therefore it's very likely that the output with the lesser value is the change, and the output with the greater value is the payment. The greater the difference between the output values, the more this is likely. It's still a guess, of course.

    Examples (output addresses are O1 and O2, input addresses are A, B, C, etc.):
    A=4, O1=3, O2=1: O2 is likely (but not guaranteed) the payment. Therefore O1 is likely controlled by the sender, and O2 is likely controlled by the recipient.
    A=100, O1=99, O2=1: Again, O1 is likely (but still not guaranteed) the payment
    A=10, O1=4, O2=6:  Since O1 and O2 are close together, either one is likely the payment.
    A=10, O1=5, O2=5: Impossible to tell which is the payment and which is the change.

    Rule 3 is trivial. One of the outputs will have a value greater than any individual input coin, and the other output will have a value smaller than any individual coin. The output coin that has the smaller value is the change, and the output coin with the larger value is the payment. Unlike rule 2, that property is guaranteed, and no assumptions or guesses are necessary.
    Examples:
    A=2, B=3, O1=4, O2=1: O1 is the payment, O2 is the change. If the payment were 1BTC, then coin B is not needed at all for the transaction.
    A=2, B=2, C=3, O1=1, O2=6: O2 is the payment, O1 is the change. As in the first example, if the payment were 1BTC, then any one of the inputs would satisfy the payment and the other two would not be included.
    A=?, B=?; O1=4, O2=4: This cannot happen. If it did, one of the two input coins would have to be greater than or equal to the payment amount, so the other one would not be included.


    Potential Coin Selection Algorithms
    A tracker knows that all the input coins to a transaction are controlled by a single person (except under unusual situations not covered by the standard client). Therefore, transactions should use as few inputs as possible.

    Transactions with a single output give away who received the money (it's very unlikely you'll make a payment to yourself).

    Randomizing which output is the payment and which is the change is good, so keep that.

    The goal is to make it difficult for a tracker to determine which output is the payment, and which is the change.

    Keeping those principles in mind, I propose the following algorithms:

    1) Select a coin which is exactly twice the value of the payment
    This makes the two outputs identical, making it impossible to guess which is the payment and which is the change. Variations: select two or more coins that total exactly twice the payment amount; select one coin which is approximately twice the payment amount (this will result in two outputs which are close in value, and as shown above you cannot reliably guess which is the payment)

    2) Select two coins such that either (a) the value of each input is less than the payment amount and sums to greater than (but NOT equal to) the payment amount, or (b) the value of each input is greater than the payment amount

    Example:
    A=4, B=2, O1=5, O2=1

    A tracker looking at this transaction cannot tell if it satisfies condition (a) or condition (b), therefore they cannot determine which is the payment and which is the change. For condition (a), O1 is the payment, and for condition (b) O2 is the payment.

    3) Select two coins: one whose value exactly matches the payment, and another coin at random.
    Example:
    A=5, B=2; O1=5, O2=2

    It's rather obvious what we're doing here, but since we (and the tracker) know the payment and change are randomized, a tracker cannot know which is the payment and which is the change.

    4) A special variation on (1) (I call this one the "In yo face!" selection): Select multiple coins adding to more than the payment amount, and if possible ensure at least one coin is redundant.
    Example:
    A=2, B=3, C=1, D=1, E=2, O1=4, O2=5

    It's clear from examining this transaction that several combinations of the coins could satisfy either O1 or O2, so the tracker cannot tell which is the payment and which is the change. Note that this algorithm, unlike the others, is guaranteed to find a set of coins that can be used.

    Important considerations
    The transaction creation function must be able to use any of the algorithms above. The specific algorithm must be chosen randomly. The randomness can be weighted towards algorithms 1 and 2.

    The transactions that are generated are unlikely to incur any transaction fees greater than they would normally. None of these algorithms chooses excessively large numbers of coins, thus there is no additional cost to the user.

    The current algorithm can potentially loop 1000 times seeking the best combination of coins to come closest to the payment amount. These algorithms eliminate that overhead.

    It is important that the default behaviour of the Bitcoin client is geared towards anonymity. In order for anonymity to succeed, it must be difficult or impossible to distinguish between a transaction generated by someone seeking anonymity, and a transaction generated by someone who doesn't care about anonymity. If the random algorithm selection is used only by those seeking anonymity, then the tracker just needs to look for transactions that don't fit the "best fit" algorithm.

    (Minor edits to clean up formatting)
    Jump to: