IntroductionOne of most commonly raised issues with Bitcoin is that of crime and punishment. Over the past 40 years our societies have, through both fair means and foul, built an integrated anti-money laundering system that attempts to let governments trace the proceeds of crime.
Trying to fight crime through chasing money flows is an intuitively attractive proposition - much crime is motivated by profit, so it stands to reason that if you could somehow make it hard to use the profits of crime, you’d reduce all kinds of unrelated crimes simultaneously. Killing N birds with one stone, so to speak.
Evidence that existing AML approaches are worth the cost is limited - governments seem to rarely study the topic and when they do, findings that it isn’t working very well are simply used to justify making the system ever more pervasive and extreme (e.g. lower reporting thresholds).
However, crime (or fear of crime) is a very common concern amongst citizens and politicians are highly responsive to that, and therefore also highly responsible to requests from law enforcement. The worst case scenario is for Bitcoin vs fiat to be presented as a choice between either freedom or the rule of law because most people would prefer to have both, but given a binary choice we know that many will actually choose the latter over the former. The approach of the USA post 9/11 being a good example.
Can we have our cake and eat it? By that I mean, can we build an approach to fighting crime through finance which is Bitcoin-ish, that is:
- Decentralized
- Open
- Democratic (or market based, take your pick)
- Private
- Effective at raising the bar for criminals
- Potentially acceptable to lots of people with a wide range of political views, so it can be positioned as a credible alternative to just banning Bitcoin or imposing untenable AML requirements on its users?
I think the answer is yes, and we can do it by combining tainting with private set intersection protocols.
Note that the definition of “crime” is not what you might expect and I will go into that later.
OverviewPSI protocols allow a client in posession of a set to intersect its set with that of the server, such that the client doesn’t learn what’s in the servers set (beyond the elements that intersect), and the server doesn’t learn what’s in the client’s set at all (not even the intersection). PSI can be implemented using garbled circuits. Huang, Evans and Katz showed that with optimization generic MPC can either beat or match custom protocols but with greater genericity in their paper “Private Set Intersection: Are Garbled Circuits Better than Custom Protocols?”:
http://rt.cs.virginia.edu/~evans/pubs/ndss2012/psi.pdfLet us imagine that in the market there is an arbitrary number of what I’ll call blacklist providers. Some may be operated by governments, others may be operated by communities that self police (for instance, the Bitcoin Police group here on this forum). To give an extreme example, the Silk Road might operate one themselves for the purposes of handling scammers. Blacklist providers maintain sets of outputs that are blacklisted or tainted in some way, for instance because the owner reported them stolen or because it’s believed they are owned by some criminal enterprise.
End users have wallet apps that subscribe to zero or more blacklists. The users themselves choose which blacklists to use. On receiving a payment they proceed to trace backwards recursively adding outputs to their client set. The depth to trace is discussed later. Once that set is calculated they do a private set intersection with their chosen blacklist providers.
If a transaction is flagged this way, what to do is up to the end user. They could report it. They could not report it but refuse to deal with the counter-party. They could ignore it. If the counterparty is trusted, they could be asked where they got the money from and the process of walking backwards through the graph of real people or entities can begin to try and identify the culprit.
The privacy provided by these protocols is important in two ways:
- End user privacy: the provider of the blacklists doesn’t know what is in the users output set, nor when there’s a hit. This prevents blacklist providers trying to engage in global surveillance. It also means that providers cannot easily mandate any kind of action when a flagged transaction is found, because they do not even know when it occurs.
- Server privacy: the blacklist themselves can be secret, encouraging usage. In many cases victims of theft don’t want to announce to the world that they were hacked or scammed, this scheme means only the blacklist operator needs to know of their predicament. It also means outputs can be tainted or blacklisted without the target necessarily finding out - even if the blacklist is open access and not restricted to particular parties, they’d have to be constantly checking it and repeat queries could be identified and blocked by simply requiring an anonymous account (or passport/fidelity-bond).
Definition of crimeOne issue with trying to fight crime in an international, decentralized financial network is the fact that there’s no one definition of what bad behaviour is. Most obviously, different jursidictions have different laws. Less obviously, the laws of a country may not accurately reflect the feelings of its citizens, e.g. in oppressive regimes or when large numbers of ordinary people end up criminalized (war on drugs, etc).
Fortunately, a list-of-blacklists approach automatically solves this problem for us. If each blacklist represents a particular class of behaviour and nobody except the user knows when there a hit on the blacklist, then the difficulty of tracing a “guilty” party increases dramatically with each hop in the chain of trades which does not result in blocking/reporting.
Let’s make this concrete. On this forum there are lots of libertarians. The libertarian philosophy emphasizes private property rights (theft is bad), and individual freedom (you should be able to get high if you want to). In a highly libertarian society, whilst some entity may serve a blacklist of outputs involved in the drugs trade many people may choose not to check it - or alternatively, to check it and then ignore the outcomes. Even if the government runs that blacklist there’s no way to know when there was a hit.
Alternatively, some private institution may run a blacklist that deals only with theft - if money is stolen via hacking then the outputs can be added to the blacklist set so people who believe theft is bad (ie, nearly everyone) and that the provider is trustworthy can check against it. Note that there’s no automatic punishment of these transactions here, so there’s no concern with people maliciously reporting their own legitimate payments as theft … all that would happen is when the legitimate receiver tries to respend the coins and is questioned/reported/flagged they would show evidence that the payment was real (like, a signed payment request) and the malicious party would be ignored by the blacklist provider in future.
In this way, investigation of crime can be decentralized, both allowing far more huge and effective scale than existing AML whilst simultaneously allowing people to exercise their own judgement over what is bad. The fewer people that are checking a particular blacklist, the more effort is required to recursively ask people “where did you get this money from?” and therefore the likelyhood of prosecution drops in line with societies collective lack of interest.
How deep to traceOne obvious problem is that if the taint depth is fixed, then any bad guy can just generate a tree of transactions deeper than that and know they won’t ever be flagged. The depth can be specified in terms of time rather than tree depth, but this is also an arbitrary magic number that can be gamed - for many crimes waiting a while may not be a big deal.
One possibility is to use whitelists for tracing. Let’s call a hub of economic activity like an exchange, popular merchant or even tax collector a nexus. Nexuses can serve private sets of outputs that they have themselves checked against blacklists. Therefore you can keep tracing until you find a hit against one of the whitelists, at which point you don’t need to go any further - you know that part of the tree is clean. If you're a nexus you then add those outputs to your own private set. Eventually the outputs can be dropped according to some specific formula determined by the cost of the nexuses resources.
If the recursive exploration gets too large without hitting a nexus, that is itself a sign that the transaction may be suspicious, but it’s a signal that’s hard to game because in most cases the counterparty won’t know what the depth profile of your average transaction looks like.
Resource consumption and full-set attacksPSI protocols can be implemented using generic multi-party computation but it was previously assumed that this would be too slow, and special protocols had to be designed. The previously linked paper dispels this notion with measurements of real implementations and shows that with sufficiently smart optimization, generic MPC with Yao’s garbled circuits can match or beat the best known specific protocols. Note that the paper also contains a nice overview of how garbled circuits work if you aren’t familiar with the topic.
The UTXO set size is currently 4M. Even with a 32 bit set size (4 billion possible outputs in the blacklist), according to their results it is possible to run the multi-party computation in around 10 seconds. Admittedly that is with a LAN and desktop computers, not smartphones and WANs. However there have been more optimizations since and real-world demonstration of intersection of sets from large universes such as contact lists that run on Android phones, so by the time anyone actually implements this scheme it’s very likely to be feasible due to better algorithms and hardware.
One advantage to using generic garbled circuits is that you can easily add auditing on top to prevent a client attempting to upload, for example, the entire UTXO set and thus stripping server privacy. The size of the set to intersect can be limited to some reasonably small value and accounts/passports/fidelity bonds used to prevent or block scraping attacks in which someone tries to enumerate the entire set by submitting lots of queries.
Government attacksIt should be obvious that this scheme is not intended to resist a maximally coercive government. Bitcoin itself cannot work inside a totalitarian police state because a currency must be widely accepted to be useful, and thus people must publicly advertise that they accept it. An unconstrained government can just fine, jail or kill anyone who advertises that they accept Bitcoin.
Instead this scheme is designed to be an acceptable proposal to existing western governments that are somewhat democratic. It balances the desire to fight crime with the need for privacy and blocking abuse by the state. In practice, all democracies recognize this balance is important and (try to) constrain what the police can do.
One failure mode is that governments may try to mandate police produced “superblacklists” that merge things the user cares about stopping with things the user doesn’t - given a match, it’s impossible to know what the reason for blacklisting was, and given the requirement for everyone to use it, auditing compliance is fairly easy using sting operations.
There isn’t any good solution to this beyond the people demanding that the blacklists be fine grained (i.e. one for drug offences, one for theft, one for corruption, etc). It’s hard to argue against such a setup because it only adds information to the system: having undifferentiated blacklists is tantamount to admitting that some laws don’t enjoy wide support but you want to enforce them anyway. It’s a position that’s trivially reduced to “I don’t believe in democracy”, which is a difficult position for a politician to hold over the long run.