So, I was thinking a lot about privacy for users that don't run a personal Full Node + Electrum Server themselves.
Of course, with only ~10,000 nodes and millions of Bitcoin users globally, the vast majority of Bitcoin users are sacrificing privacy by querying (sometimes fixed, built-in) Electrum servers.
In cryptography there is the concept of PIR (Private Information Retrieval).
The trivial PIR scheme is the one where the client downloads the whole database and performs the query locally, which would equate to running their own full node.
However, since '95 the PIR literature has offered much smarter ways to achieve private information retrieval, which I think would greatly benefit the majority of Bitcoin users.
The earliest, simpler PIR concepts (simple, but still more sophisticated than trivial PIR) rely on multiple, non-colluding servers holding exact copies of the database. This is exactly what we already have in Bitcoin: tons of nodes with an exact copy of the blockchain. This means it might be possible to implement this mostly client-side, with perhaps only small changes to ElectrumX. Many recent improvements in PIR mostly aim to reduce the number of servers needed; we don't need to worry about that at all, which would make the whole thing much simpler.
The core idea of multi-server PIR is as follows: the client doesn't query just one (Electrum) server, but several, each for a different set of files (addresses). Only one of these files (addresses) is actually of interest to the client. That way, no single server can assume that any particular address it was asked about actually belongs to the client.
For example: a user wants to query the balance of their address 'B' (this could just as well be transaction information or any other blockchain data).
Instead of querying one server for the balance of address B, the client queries multiple servers like this:
- 'Server 1, give me balance of addresses A, B, C'
- 'Server 2, give me balance of addresses A, C'
Now, mostly to save on communication (I think), the servers XOR the binary representations of the balances locally before sending (in the example: balance A XOR balance B XOR balance C, and balance A XOR balance C). The client then XORs the two results: (balance A XOR balance B XOR balance C) XOR (balance A XOR balance C). The balances of the unwanted, randomly chosen addresses cancel each other out, and the client is left with the balance of address B.
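To make the example concrete, here is a minimal Python sketch of the two-server XOR idea. Everything in it is an assumption for illustration: the toy `BALANCES` table, the function names, and the way decoy addresses are picked (a random coin flip per address). Both "servers" are just in-memory copies of the same dictionary, not real Electrum servers.

```python
import secrets
from functools import reduce

# Hypothetical toy database: address label -> balance in satoshis.
BALANCES = {"A": 150_000, "B": 42_000, "C": 7_500, "D": 0, "E": 1_000_000}


def server_answer(database, addresses):
    """Server side: XOR the balances of all requested addresses into one integer."""
    return reduce(lambda acc, addr: acc ^ database[addr], addresses, 0)


def build_queries(all_addresses, target):
    """Client side: build two address sets that differ only in the target address."""
    # Each non-target address becomes a decoy with probability 1/2.
    decoys = {a for a in all_addresses if a != target and secrets.randbits(1)}
    with_target = decoys | {target}
    # Randomly choose which of the two servers sees the set containing the target.
    if secrets.randbits(1):
        return with_target, decoys
    return decoys, with_target


def private_balance(database_1, database_2, target):
    """Query both servers and XOR their answers; the decoy balances cancel out."""
    query_1, query_2 = build_queries(database_1.keys(), target)
    return server_answer(database_1, query_1) ^ server_answer(database_2, query_2)
```

With identical copies on both sides, `private_balance(BALANCES, dict(BALANCES), "B")` returns `42000`, and so does the fixed example from above, `server_answer(BALANCES, ["A", "B", "C"]) ^ server_answer(BALANCES, ["A", "C"])`. Note that the decoys here are drawn from the whole table; how large that pool has to be for real privacy is a separate question this sketch does not answer.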
To implement this in ElectrumX and retain backwards compatibility with non-PIR queries, a PIR query would just need some kind of extra indicator making clear that the client isn't querying multiple addresses to get their separate balances in the clear, but instead wants the XOR of those balances.
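As a rough sketch of what such an indicator could look like on the wire: the first request below uses the existing Electrum protocol method `blockchain.scripthash.get_balance`, while the second uses a method name, `blockchain.scripthash.get_balance_xor`, that is purely hypothetical and does not exist in ElectrumX. The helper function names are also made up for this post.

```python
import json


def plain_balance_request(request_id, scripthash):
    """Ordinary JSON-RPC request for the balance of a single script hash."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "blockchain.scripthash.get_balance",
        "params": [scripthash],
    })


def pir_balance_request(request_id, scripthashes):
    """HYPOTHETICAL request asking the server for the XOR of several balances.

    The method name below is an assumption for illustration only; it is not
    part of the Electrum protocol. A distinct method name acts as the "extra
    indicator", so servers that don't know it simply reject the request.
    """
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "blockchain.scripthash.get_balance_xor",  # hypothetical
        "params": [sorted(scripthashes)],  # sorted so the order leaks nothing
    })
```

Using a separate method name rather than a flag on the existing call is just one option; it keeps old clients and old servers working unchanged.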
It would also be possible to just fetch all the balances and discard the ones the client is not actually interested in, but especially with larger queries the communication overhead would get quite big. XORing the query results solves this issue.
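A quick back-of-the-envelope sketch of that overhead, assuming (my assumption, not a protocol fact) that each balance travels as a fixed 8-byte unsigned integer of satoshis and ignoring all framing:

```python
BALANCE_BYTES = 8  # assumption: one balance encoded as an unsigned 64-bit integer


def response_bytes_plain(num_addresses):
    """Server returns every requested balance separately."""
    return num_addresses * BALANCE_BYTES


def response_bytes_xor(num_addresses):
    """Server returns a single XOR-combined balance, whatever the query size."""
    return BALANCE_BYTES
```

For a query over 1,000 addresses this gives 8,000 bytes of response in the plain case versus 8 bytes with XOR. The request itself still has to list all the addresses either way, so the saving is on the download side only.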
What do you guys think about implementing such a thing in Bitcoin / ElectrumX? Has something like this maybe been attempted already? Any other suggestions for improving the idea? Or is it all superfluous for some reason after all? Glad for any input on the topic!