Author

Topic: PIR / Private Information Retrieval - Privacy for light wallets? (Read 855 times)

hero member
Activity: 924
Merit: 5943
not your keys, not your coins!
By database I am assuming you mean P2P block files, right? Otherwise if you just fetch the server state, it is not truly private because the blockchain is still obtained server-side.
I was just using that terminology to introduce PIR, as it's commonly applied to database queries. As the blockchain is somewhat of a decentralized database, I believed it should be possible to adapt PIR for Bitcoin blockchain queries.

I've yet to check out the paper BlackHatCoiner sent, but at first thought, whatever they did should be possible to adapt to a light Bitcoin client scenario.
All it really needs is querying balances and transaction histories of addresses, as well as submitting transactions to the blockchain.
Not sure whether they have a feature for that, though - few block explorers do have a 'submit raw transaction' feature.

If the paper's authors provide an API to their block explorer, it may directly be usable for the application I had in mind.

Quote from: NotATether link=topic=5365132.msg60991847#msg60991847
I get that it has to be a "light" client, but with no central oracle to verify all the nodes' authenticity, you're still trusting that the database sump hasn't been tampered with before the connection opened (something that encryption cannot remedy).
If you are referring to the general limitation of SPV wallets (trusting the server providers); that's true, like it always is. A partial remedy would be having multiple servers and avoiding large amounts.
legendary
Activity: 1568
Merit: 6660
bitcoincleanup.com / bitmixlist.org
By database I am assuming you mean P2P block files, right? Otherwise if you just fetch the server state, it is not truly private because the blockchain is still obtained server-side.

I get that it has to be a "light" client, but with no central oracle to verify all the nodes' authenticity, you're still trusting that the database sump hasn't been tampered with before the connection opened (something that encryption cannot remedy).
legendary
Activity: 1512
Merit: 7340
Farewell, Leo
Some are in the process of implementing this idea with fully homomorphic encryption, but as a block explorer: https://btc.usespiral.com/
This is their paper: https://eprint.iacr.org/2022/368
And this their repository: https://github.com/menonsamir/spiral-rs
hero member
Activity: 924
Merit: 5943
not your keys, not your coins!
Oh sure; if you want to use the node to hold the funds, that's also a possibility. But then you're going to communicate via HTTP? HTTPS? Will you write a custom communication protocol? Probably best to open SSH (built-in encryption) and use the CLI then. However, opening SSH to the internet is a risk as well. I'd personally probably store the keys (funds) in the light wallet (small amounts) and use a personal Electrum server.
I'm not well versed in internet protocols but the way you do it depends on whether you are going to use the default options available by the node client or are you going to write something on top of it. In bitcoin's network we are communicating over HTTP and the SSL encryption would require certificate which you may not want to bother with for such a simple thing so I'd stick with HTTP with my own encryption.

P.S. It could be a 2of2 multi-signature setup for additional security.
Of course the 'own crypto' could be some proven thing like RSA or just a shared symmetric key. However, I generally like to 'keep it simple, stupid'; so one of the easiest, secure things would honestly be SSH. It's configured already, anyway, so no additional service is going to run on the machine and it's all encrypted by default as well. In case of opening SSH to the internet, I'd recommend to set it to a non-standard port and turning off password-based authentication; instead using an RSA or ECDSA keypair.
legendary
Activity: 3472
Merit: 10611
Oh sure; if you want to use the node to hold the funds, that's also a possibility. But then you're going to communicate via HTTP? HTTPS? Will you write a custom communication protocol? Probably best to open SSH (built-in encryption) and use the CLI then. However, opening SSH to the internet is a risk as well. I'd personally probably store the keys (funds) in the light wallet (small amounts) and use a personal Electrum server.
I'm not well versed in internet protocols but the way you do it depends on whether you are going to use the default options available by the node client or are you going to write something on top of it. In bitcoin's network we are communicating over HTTP and the SSL encryption would require certificate which you may not want to bother with for such a simple thing so I'd stick with HTTP with my own encryption.

P.S. It could be a 2of2 multi-signature setup for additional security.
legendary
Activity: 3430
Merit: 3080
BIP157 filters would have a useful purpose: rescanning transaction history for a wallet on a pruned node that doesn't have the necessary block history.

IIRC scanning with a local copy of the filters turned out to be really slow, hardly faster than scanning the blocks (assuming you had them)

yeah, that was my point.

I never tried the patch, so have no idea as to the performance. All it would do is give you the option to rescan any wallet if you have a pruned node (and all the filters). which is not as useful as it sounds, but still perhaps the most practical option in limited circumstances.

an arguable downside would be that people running pruned nodes may find it hard to justify keeping anything but the minimum blocks, but it's also probably hard to make that problem too much worse than it is already.
hero member
Activity: 924
Merit: 5943
not your keys, not your coins!
It's true; the setup I recommend is also always running Core + Electrum (maybe via Tor - alone for the simplicity of tunneling through NAT like this) and connecting to it remotely.
Electrum protocol is only useful if you don't know the addresses or transactions you need to look up (it is indexed so it is quick to look it up). It is your own node with your own deterministic wallet so it already knows all your addresses and transaction histories.
I also don't think you need Tor, you can simply create a key pair and use it to encrypt all your communication between the two devices. Nobody in the middle would know what you would be doing.
Oh sure; if you want to use the node to hold the funds, that's also a possibility. But then you're going to communicate via HTTP? HTTPS? Will you write a custom communication protocol? Probably best to open SSH (built-in encryption) and use the CLI then. However, opening SSH to the internet is a risk as well. I'd personally probably store the keys (funds) in the light wallet (small amounts) and use a personal Electrum server.
legendary
Activity: 3472
Merit: 10611
It's true; the setup I recommend is also always running Core + Electrum (maybe via Tor - alone for the simplicity of tunneling through NAT like this) and connecting to it remotely.
Electrum protocol is only useful if you don't know the addresses or transactions you need to look up (it is indexed so it is quick to look it up). It is your own node with your own deterministic wallet so it already knows all your addresses and transaction histories.
I also don't think you need Tor, you can simply create a key pair and use it to encrypt all your communication between the two devices. Nobody in the middle would know what you would be doing.
hero member
Activity: 924
Merit: 5943
not your keys, not your coins!
You know what? It might actually be viable e.g. to write a phone wallet that is a fully verifying, but pruned node. Of course it's not for everyone; it will take some time to get up and running. But with computing performance of smartphones catching up with actual laptops these days, it could be viable to code the App in a way that it only syncs at night or when the phone is charging and connected to WiFi, something like that. And given a few nights and unmetered DSL connection, it should work nicely.
This is interesting but is there any demand for it?
It is not just the initial synchronization, the client has to remain in sync by downloading every block as they are found which will consume a lot of battery and your data per day. I think instead of wanting to put your phone under a lot of pressure, it is best to run a full node at home on PC and then use your phone to connect to your own PC over the internet. It could be done over a secure channel with encryption involved.
It's true; the setup I recommend is also always running Core + Electrum (maybe via Tor - alone for the simplicity of tunneling through NAT like this) and connecting to it remotely. However, I see where this could be useful. For example, if someone doesn't have the money and knowledge to get the equipment to run a dedicated node, or maybe there are routing issues (see Russia who somehow wants to block Tor traffic lately) and stuff like that. You'd really have a full node in your pocket at all times, which is somehow pretty cool.

if you can handle mashing some disparate git branches together, and building the resultant monstrosity, Bitcoin Core can be run on an android phone 'today' (I predict that anyone crazy enough to attempt this will take up to a week to get something working). Although I thoroughly recommend not doing this, even once it's an official build.
I don't really like mashing projects together; as you say, it would result in a piece of software that nobody should honestly use.. So if I find the time, I may write something from scratch, but I've mostly moved away from mobile app programming. I'd do an exception to get Bitcoin Core onto mobile, though. Tongue
staff
Activity: 4326
Merit: 8951
BIP157 filters would have a useful purpose: rescanning transaction history for a wallet on a pruned node that doesn't have the necessary block history.
I say "would" because, of course, this feature was coded and then (sadly) abandoned. But it's possible in principle (and, yes, I know, off-topic Cheesy)

IIRC scanning with a local copy of the filters turned out to be really slow, hardly faster than scanning the blocks (assuming you had them) due to the design property that the prior block hash is used to salt the hashing. This property is pretty important for some imagined uses, but pointless for local use.

So yeah, it could be used for that-- but for that application a purely local filter that didn't need that salting would work a lot better. ::shrugs::
legendary
Activity: 3430
Merit: 3080
if you can handle mashing some disparate git branches together, and building the resultant monstrosity, Bitcoin Core can be run on an android phone 'today' (I predict that anyone crazy enough to attempt this will take up to a week to get something working). Although I thoroughly recommend not doing this, even once it's an official build.
legendary
Activity: 3472
Merit: 10611
You know what? It might actually be viable e.g. to write a phone wallet that is a fully verifying, but pruned node. Of course it's not for everyone; it will take some time to get up and running. But with computing performance of smartphones catching up with actual laptops these days, it could be viable to code the App in a way that it only syncs at night or when the phone is charging and connected to WiFi, something like that. And given a few nights and unmetered DSL connection, it should work nicely.
This is interesting but is there any demand for it?
It is not just the initial synchronization, the client has to remain in sync by downloading every block as they are found which will consume a lot of battery and your data per day. I think instead of wanting to put your phone under a lot of pressure, it is best to run a full node at home on PC and then use your phone to connect to your own PC over the internet. It could be done over a secure channel with encryption involved.
hero member
Activity: 924
Merit: 5943
not your keys, not your coins!
Okay, so I read a few bits and pieces here and there about BIP157 and BIP158 today. It seems to me that using these should be a viable alternative for privately using a light wallet, while skipping Electrum altogether.

I'm dubious.  They're a constant factor communication reduction, but the chain is always growing.   So even if they use little enough data to be acceptable it's just a matter of time until they're not again.

Plus, if you're willing to take many more times the bandwidth of a non-private lite wallet just to get privacy you can just run a (pruned) full node.  Yes, it's more bandwidth than the filters but it's MUCH more private because transaction relay means they have announcement privacy, and much more secure.
You know what? It might actually be viable e.g. to write a phone wallet that is a fully verifying, but pruned node. Of course it's not for everyone; it will take some time to get up and running. But with computing performance of smartphones catching up with actual laptops these days, it could be viable to code the App in a way that it only syncs at night or when the phone is charging and connected to WiFi, something like that. And given a few nights and unmetered DSL connection, it should work nicely.
newbie
Activity: 22
Merit: 6
Finally, if I am satisfied, I will empty my Electrum Wallet and keep the Wasabi Wallet.

You might want to keep your Electrum wallet and it's backup in rare case someone send Bitcoin to your old address.


True
legendary
Activity: 3430
Merit: 3080
Okay, so I read a few bits and pieces here and there about BIP157 and BIP158 today. It seems to me that using these should be a viable alternative for privately using a light wallet, while skipping Electrum altogether.

I'm dubious.

BIP157 filters would have a useful purpose: rescanning transaction history for a wallet on a pruned node that doesn't have the necessary block history.

I say "would" because, of course, this feature was coded and then (sadly) abandoned. But it's possible in principle (and, yes, I know, off-topic Cheesy)
staff
Activity: 4326
Merit: 8951
Okay, so I read a few bits and pieces here and there about BIP157 and BIP158 today. It seems to me that using these should be a viable alternative for privately using a light wallet, while skipping Electrum altogether.

I'm dubious.  They're a constant factor communication reduction, but the chain is always growing.   So even if they use little enough data to be acceptable it's just a matter of time until they're not again.

Plus, if you're willing to take many more times the bandwidth of a non-private lite wallet just to get privacy you can just run a (pruned) full node.  Yes, it's more bandwidth than the filters but it's MUCH more private because transaction relay means they have announcement privacy, and much more secure.
hero member
Activity: 924
Merit: 5943
not your keys, not your coins!
Of course, all light wallets (at least the ones I tried) have a default Electrum server.
Keep in mind that using the Electrum protocol but using a single "default Electrum server" without the option to change, is a point of centralization and the wallet can be referred to as "server dependent". These wallets offer 0 privacy and are open to certain attack vectors so they lack some security too.
That's the motivation behind this topic, sorry if that wasn't clear Cheesy

You are right, most light wallets (almost all phone wallets) are server dependent. There may be a hidden option in one or two that let user switch from their centralized server to the more decentralized Electrum nodes but majority will rely on their own service.
Of course, all light wallets (at least the ones I tried) have a default Electrum server. But to my knowledge, they do almost all use Electrum protocol.

That makes sense if you tried BTC light wallet since both Electrum protocol and server are open-source, since the developer just need to use the API. But i have doubt for multi-coin wallet.
Oh yes; I noticed multi-coin wallets sometimes have no possibility to choose the servers for each coin, but I'm not into altcoins to be honest. After all this is under Bitcoin > Development & Technical Discussion and not in Altcoin section.

And I think using BIP157/158 instead, could be a big improvement.

It's definitely big improvement from privacy side, but most light wallet doesn't focus on privacy. Besides, i find there's decent difference on initial time to sync between Electrum (using Tor) and Wasabi wallet which could annoy some regular users.
Yeah; anything that offers more privacy in most cases needs more time; it's one of the big challenges in PIR as well: how to make it efficient. The trivial PIR schemes I presented in the beginning for example, either need a ton of communication or a ton of computation on server-side, both of which will make it slower.
legendary
Activity: 2870
Merit: 7490
Crypto Swap Exchange
You are right, most light wallets (almost all phone wallets) are server dependent. There may be a hidden option in one or two that let user switch from their centralized server to the more decentralized Electrum nodes but majority will rely on their own service.
Of course, all light wallets (at least the ones I tried) have a default Electrum server. But to my knowledge, they do almost all use Electrum protocol.

That makes sense if you tried BTC light wallet since both Electrum protocol and server are open-source, since the developer just need to use the API. But i have doubt for multi-coin wallet.

And I think using BIP157/158 instead, could be a big improvement.

It's definitely big improvement from privacy side, but most light wallet doesn't focus on privacy. Besides, i find there's decent difference on initial time to sync between Electrum (using Tor) and Wasabi wallet which could annoy some regular users.
legendary
Activity: 3472
Merit: 10611
Of course, all light wallets (at least the ones I tried) have a default Electrum server.
Keep in mind that using the Electrum protocol but using a single "default Electrum server" without the option to change, is a point of centralization and the wallet can be referred to as "server dependent". These wallets offer 0 privacy and are open to certain attack vectors so they lack some security too.
hero member
Activity: 924
Merit: 5943
not your keys, not your coins!
Are you sure most of them actually use Electrum protocol (https://electrumx-spesmilo.readthedocs.io/en/latest/protocol.html) or they have similar protocol which simply send list of addresses to server?
You are right, most light wallets (almost all phone wallets) are server dependent. There may be a hidden option in one or two that let user switch from their centralized server to the more decentralized Electrum nodes but majority will rely on their own service.
Of course, all light wallets (at least the ones I tried) have a default Electrum server. But to my knowledge, they do almost all use Electrum protocol. And I think using BIP157/158 instead, could be a big improvement.
legendary
Activity: 3472
Merit: 10611
Are you sure most of them actually use Electrum protocol (https://electrumx-spesmilo.readthedocs.io/en/latest/protocol.html) or they have similar protocol which simply send list of addresses to server?
You are right, most light wallets (almost all phone wallets) are server dependent. There may be a hidden option in one or two that let user switch from their centralized server to the more decentralized Electrum nodes but majority will rely on their own service.
legendary
Activity: 2870
Merit: 7490
Crypto Swap Exchange
Finally, if I am satisfied, I will empty my Electrum Wallet and keep the Wasabi Wallet.

You might want to keep your Electrum wallet and it's backup in rare case someone send Bitcoin to your old address.

From the light clients I use, most seem to be based on the Electrum protocol, with the exception of Breez and Wasabi.

Are you sure most of them actually use Electrum protocol (https://electrumx-spesmilo.readthedocs.io/en/latest/protocol.html) or they have similar protocol which simply send list of addresses to server?
hero member
Activity: 924
Merit: 5943
not your keys, not your coins!
Okay, so I read a few bits and pieces here and there about BIP157 and BIP158 today. It seems to me that using these should be a viable alternative for privately using a light wallet, while skipping Electrum altogether.

From the light clients I use, most seem to be based on the Electrum protocol, with the exception of Breez and Wasabi. They allow to connect to a Bitcoin node that runs with these new BIPs activated (and use a provided one by default which should be fine as well, due to these BIPs).
newbie
Activity: 22
Merit: 6
Thanks @BlackHatCoiner  and @o_e_l_e_o for the insights.

I've already installed the Wasabi Wallet, generated a wallet and written down the 12 seed words and the password. I have also generated a receive address. Next step is to restore the wallet on a second computer, to verify that Wasabi is actually capable of restoring a wallet using the seed words and the password. If everything goes well I will send some BTC to it. Finally, if I am satisfied, I will empty my Electrum Wallet and keep the Wasabi Wallet.
legendary
Activity: 2268
Merit: 18775
That's the feature I am looking for. Having a "paper backup" gives me some peace of mind. I will look further into it and hopefully I will drop electrum and switch to wasabi wallet.
That's certainly the right attitude to have. Backing up your wallet on to paper and keeping it completely non-digital is the best way to store a back up.

Wasabi will generate a 12 word seed phrase for you when you first set up your wallet. You need to write this down. This is compatible with the BIP39 standard for bitcoin wallets, and so you can use it (with your password as I explain below) to recover your wallet in to a fresh install of Wasabi or in to another piece of BIP39 compatible software.

You will also be asked to set up a password for your wallet. Wasabi is a little different from most other wallets, in that this password doesn't just encrypt the wallet saved on your device, but it is also used as a passphrase (also called seed extension or 13th word) in conjunction with your seed phrase to derive your wallet. That means that the seed phrase on its own cannot recover your wallet. You must also have this password in addition. Write it down too.

Ideally store these two written back ups separately.
legendary
Activity: 1512
Merit: 7340
Farewell, Leo
Yeah, Wasabi enhances both your privacy and anonymity. One of its feature, that is called “CoinJoin” and which was firstly proposed in this very forum, combines yours inputs with strangers' and creates outputs that incommode the traceability.

As for anonymity, everything's done via Tor by default. Don't use VPNs, or anything you think will be better than just using Tor. It's more than fine by itself.
newbie
Activity: 22
Merit: 6
Thanks @ETFbitcoin and @o_e_l_e_o for your answers.

I checked the wasabi wallet and it says something like
Quote
The process will also generate a set of recovery words, which you’ll need in case you forget your password.

That's the feature I am looking for. Having a "paper backup" gives me some peace of mind. I will look further into it and hopefully I will drop electrum and switch to wasabi wallet.
legendary
Activity: 2870
Merit: 7490
Crypto Swap Exchange
Is there any way to mitigate the risks of using electrum.

Aside from what @o_e_l_e_o said, consider Wasabi Wallet (https://blog.wasabiwallet.io) which offer good balance between ease of use and privacy protection.

What is this spying about? Recoding the IP associated with the addresses and transactions?

Aside from IP, address and transaction, there are few other things such as
1. Date/time when you connect to server.
2. List of your addresses. It means the server know address A, B, ..., N is owned by same person.
3. Range of your Electrum version which could be determined from version of Electrum protocol (https://electrumx-spesmilo.readthedocs.io/en/latest/protocol.html).

Is there any way to mitigate the risks of using electrum.
The only way to fully mitigate the risks would be to run your own Electrum server and point your Electrum client at your own server.

Take note your privacy already weakened if you use same wallet, you'll need to move your Bitcoin to different wallet without weakening your privacy (e.g. move all Bitcoin from different address in single transaction).
legendary
Activity: 2268
Merit: 18775
Is there any way to mitigate the risks of using electrum.
The only way to fully mitigate the risks would be to run your own Electrum server and point your Electrum client at your own server.

Using a VPN maybe?
Using a VPN hides your true IP address from the Electrum servers you connect to. However, those Electrum servers can still see all the addresses in your wallet you are querying and therefore link them all together. Whether or not that compromises your privacy will depend on how and where you are spending your bitcoin.

In addition, using a VPN simply moves the knowledge of your real IP address from the Electrum server to your VPN server. A better solution (but still not 100%) would be to use Tor instead.
newbie
Activity: 22
Merit: 6

Unfortunately, already pretty much the only incentive for people to run electrum servers is to spy on the clients, and this eliminates that "benefit". Smiley

I am a bit worried about this. I am an occasional user of electrum. I like the fact that you can restore the walled with the seed and that is "instant on" in the sense that there is no need to download the blockchain. Is there any way to mitigate the risks of using electrum. Using a VPN maybe? What is this spying about? Recoding the IP associated with the addresses and transactions?

Thanks in advance.
hero member
Activity: 924
Merit: 5943
not your keys, not your coins!
n0nce gave an informal description of a IT-PIR like technique.  The way he described it though doesn't provide much privacy, because his description makes it sound like he's only asking for a couple distractor addresses. I think that's mostly just an artifact of the description.  In IT-PIR every query normally operates on the entire database.
Yeah; to get information-theoretical security, every query needs to ask for a random (looking) set of items, on average half the number of items in the database.
To make it simple, I showed it using random bit vectors and 'constructing' the last one so that the bitwise XOR of all vectors yields just one address.

In theory, in true multi-server-PIR you actually construct fully random queries for all but one server, then build that query so that the results of all servers XORed together result in the 'actual query'.

For example, in a database with 8 elements, where element 3 is the one of desire, each server gets queries as follows:
  • A: 01101000 (random 8-bit string)
  • B: 10010001 (other random 8-bit string)
  • C: 11011001 (calculated like: 00100000 + (A + B))

If a particular SQL query accesses a particular record, how would this be hidden from the server hosting the DB? If someone queries a potential address that doesn’t exist in the DB (because it has never received a transaction)(via a where type clause), how would this be hidden from the server?
Check my above quote: you don't just query one record. In my toy example, the client queries server A with 3 out of 8 items, server B with 3 out of 8 and server B with 5 out of 8 items. On average, you query half of the database. The element of desire is the one at index 2 (third item); two of the three servers don't even get queried in this case, for that item. So they learn absolutely nothing from those queries, which is why they gain absolutely no advantage and the security is indeed information-theoretical in nature.

In this cryptosystem the public key is an RSA number (the product of two large primes).  The main property of the GM cryptosystem is that if you know the private key (the factorization) you know the order of the multiplicative group mod P. And if you know the order, you can easily tell if a number is a quadratic residue (if it has a sqrt mod P) by computing the jacobi symbol which can be done with an exponentiation ladder or by using the extended euclidean algorithm. Under the assumption of the cryptosystem if you don't know the order you can't tell if a number is a QR or not.

GM is useful because if A=B*C (mod P)  then QR(A) = QR(B) xor QR(C).  So essentially it is a cryptosystem where I can give you encrypted values and you can apply an arbitrary set of XORs on them (and xors on the results of earlier xors) and also xor them with 1 (by squaring them) and then I can decrypt the result of your computation by using my private key to test the QR-ness of the result.
Goldwasser-Micali is a great option as well! It doesn't give information-theoretical security, but computational security which is definitely enough for all intents and purposes as of now. It has its drawbacks as well; but ditching IT-'requirement' definitely opens up more options for more efficient algorithms. However, they are usually a bit trickier to follow along, so I just presented the 'original' IT-PIR scheme as proposed by Chor in '95, since I find it very easy and natural to understand & mostly wanted to get the idea out there.
staff
Activity: 4326
Merit: 8951
I admit I've given more attention on CPIR (because I think the sybil problem makes the IT assumptions not that useful), so I can't off the top of my head give you a concrete construction for a toy IT PIR that has good communications efficiency.

The thing I think you are probably missing most doesn't require you understand the PIR at that level of detail:

What PIR tools really give you is just an "oblivious array":  An array of (e.g.) 100 byte entries of known size, and the PIR lets the client query any entry without the server(s) learning which entry the client queried.

From this alone you can construct your private "database": E.g. to do a prefix match you might require that the array be sorted and have the client perform a bisection search on the array using log(N) oblivious queries.  Any database operation you can imagine can be constructed this way, but obviously they'll only be efficiency if they don't access to many cells. Smiley

While I don't recall what the concrete construction looks like for IT PIR,  I can give a simple-ish (or at least from memory) example of a how a basic CPIR protocol is constructed using the the Goldwasser-Micali (GM) cryptosystem:

In this cryptosystem the public key is an RSA number (the product of two large primes).  The main property of the GM cryptosystem is that if you know the private key (the factorization) you know the order of the multiplicative group mod P. And if you know the order, you can easily tell if a number is a quadratic residue (if it has a sqrt mod P) by computing the jacobi symbol which can be done with an exponentiation ladder or by using the extended euclidean algorithm. Under the assumption of the cryptosystem if you don't know the order you can't tell if a number is a QR or not.

GM is useful because if A=B*C (mod P)  then QR(A) = QR(B) xor QR(C).  So essentially it is a cryptosystem where I can give you encrypted values and you can apply an arbitrary set of XORs on them (and xors on the results of earlier xors) and also xor them with 1 (by squaring them) and then I can decrypt the result of your computation by using my private key to test the QR-ness of the result.

For discussion, we'll imagine that our private database is an array of 2^16 bits which you have and I want to query some particular bit without telling you which.  You arrange the database into a 256x256 square array, and the bit I want to query is at position [i,j].    I pick 256 random numbers mod P such that all of them are quadratic residues, except for the one at position j, Lets call them:  q_0 to q_255

I send you my public key P and these 256 q values. For each row you make a copy of the q values and square (mod P) the value when the bit in that cell is 0, then compute the product (mod P) of all the processed values for that row.  At the end you have 256 values-- one for each row, you send them to me.

reply_row = product(data_(row,col) ? q_col : q_col^2)

I ignore all but the i-th one, and then because I know the private key I can test if the value in question is a 0 by checking if it's a QR or not (because of the above mentioned: QR(A) = QR(B) xor QR(C)).

Now this version is inefficient in that you send me back 256 values, but I only pay attention to the i-th one.   This inefficiency can be fixed by applying the scheme recursively (using the "reply" products as the table being queried).

Extending the scheme to more than 1-bit-per-cell is simple:  You still send the same query, but apply it in parallel to as 1-bit-tables as your message has bits.

Main downside is doing all this mod P work (where P has to be big enough to make it unfactorable!) is slow, and also the queries are size sqrt(table_cells) (with the recursion the response can be as small as you like).
copper member
Activity: 2996
Merit: 2374
There are multiple kinds of PIR.

Information theoretic PIR is


n0nce gave an informal description of a IT-PIR like technique. 
I read some papers on the topic, but I don’t think I have a clear understanding of how IT PIR works.

If a particular SQL query accesses a particular record, how would this be hidden from the server hosting the DB? If someone queries a potential address that doesn’t exist in the DB (because it has never received a transaction)(via a where type clause), how would this be hidden from the server?
sr. member
Activity: 1190
Merit: 469

Percy implements both kinds (and can combine them too)-- and its pages describe the capabilities in great detail.  It might be too technical for some people, --- but that's okay because no one is obligated to give an opinion here if the material is beyond your interest or patience to figure out. Smiley


This forum is lucky to had someone as knowledgeable as mr. Maxwell. He's a legend in bitcoin.

staff
Activity: 4326
Merit: 8951
There are multiple kinds of PIR.

Computational PIR is based on asymmetric cryptography. With CPIR you query one server and it learns nothing about what you're querying. But the downside is that it takes a lot of CPU time.

Information theoretic PIR is based on threshold secret sharing. With IT-PIR you query M servers and no one learns anything about your queries so long as N of the M don't collude.  IT PIR is faster, but it's potentially vulnerable to sybil attack (you think you're connecting to N independent parties but they're not really independent).

Percy implements both kinds (and can combine them too)-- and its pages describe the capabilities in great detail.  It might be too technical for some people, --- but that's okay because no one is obligated to give an opinion here if the material is beyond your interest or patience to figure out. Smiley

n0nce gave an informal description of a IT-PIR like technique.  The way he described it though doesn't provide much privacy, because his description makes it sound like he's only asking for a couple distractor addresses. I think that's mostly just an artifact of the description.  In IT-PIR every query normally operates on the entire database.
sr. member
Activity: 1190
Merit: 469

It is increasing the privacy. Did you read about PIR at all so far?
Yeah I glanced over it but you didn't really give the most concrete simple explanation.

Quote
Just think about it: in the scheme I suggested (which seems unpractical to implement), in theory the privacy is what we call 'perfect'. The server learns nothing at all when receiving the query. The query would be a bitstring of address indexes (I know, such a thing doesn't exist), of which the client wants the balances. Since they are chosen at random, on average half the bits are set, so half of the addresses are selected and their balances retreived, then XORed and finally returned. Any one server receives a list of addresses that has 0.5 probability of actually containing the address the client is interested in, so it doesn't even get an advantage over a random guesser of 50%, but no advantage at all. Perfectly secure.

if you were making all these queries to the same server over and over then yeah but since you're making them to independent servers that have no knowledge of each other and the requests they are receiving then that's different I would think.

what you're basically trying to argue is that by polluting the network with enough requests to different servers you can conceal the actual information you're requesting. i doubt it's really accomplishing that or anything useful. Cheesy
staff
Activity: 4326
Merit: 8951
Unfortunately, already pretty much the only incentive for people to run electrum servers is to spy on the clients, and this eliminates that "benefit". Smiley
So you mean, if Electrum servers can't spy anymore due to such mechanisms, they lose the whole incentive to run such a server?

Exactly, and especially no incentive to run PIR version that would be many times more expensive to run.

It's even hard to have the users pay for access-- because without access they can't spend their Bitcoins!  (and plus, most users would not want to pay a fair price when they can get wallet services from someone that will spy on them but lie about it for free)
hero member
Activity: 924
Merit: 5943
not your keys, not your coins!
~
In order to make a query, you need to send the query to the server. So if you are asking about a million addresses, you will need to send a list of a million addresses to the server.

An electrum client initially has no information about any transaction or any address. So it will need to obtain information about addresses from an electrum server. This is important because it has no way of querying unrelated addresses that have received transactions.
Yes, it's true, it's not trivial indeed. For now, I just use my own Electrum node (guide coming soon), but there must be a better solution for the majority of users which does not run a full node somehow.


In this scheme, the privacy is guaranteed by the fact that the client asks various servers for on average half of the DB entries for each server. Since the client queries multiple servers, on some servers this set of entries does include the item of interest, while on others it does not. The distribution is again 50:50 so no server gains any information (not even a 50% probability reduction when guessing) about which item is actually queried.

unless you have a way of requesting information that the server doesn't know what it is serving then it's not increasing the privacy at all. also, your solution buts a burden on other servers to have to serve data that you're not really interested in you're just using them and wasting their resources/bandwidth.
It is increasing the privacy. Did you read about PIR at all so far? Just think about it: in the scheme I suggested (which seems unpractical to implement), in theory the privacy is what we call 'perfect'. The server learns nothing at all when receiving the query. The query would be a bitstring of address indexes (I know, such a thing doesn't exist), of which the client wants the balances. Since they are chosen at random, on average half the bits are set, so half of the addresses are selected and their balances retreived, then XORed and finally returned. Any one server receives a list of addresses that has 0.5 probability of actually containing the address the client is interested in, so it doesn't even get an advantage over a random guesser of 50%, but no advantage at all. Perfectly secure.
sr. member
Activity: 1190
Merit: 469

In this scheme, the privacy is guaranteed by the fact that the client asks various servers for on average half of the DB entries for each server. Since the client queries multiple servers, on some servers this set of entries does include the item of interest, while on others it does not. The distribution is again 50:50 so no server gains any information (not even a 50% probability reduction when guessing) about which item is actually queried.



unless you have a way of requesting information that the server doesn't know what it is serving then it's not increasing the privacy at all. also, your solution buts a burden on other servers to have to serve data that you're not really interested in you're just using them and wasting their resources/bandwidth.
copper member
Activity: 2996
Merit: 2374
Deterministic light wallet software is going to ask for information about addresses that have never been used, but will be used in the future. You can ask about every address that has ever received a transaction to hide your previously used addresses, but if you ask about an address that has never been used, and that address subsequently receives a transaction, chances are that address belongs to you. The only way to avoid asking for information about addresses that have never been used but will be used in the future is to ask for information about every address ever used, but if you do this, the server will essentially return the entire blockchain.
That's a very good point. That's why to perform traditional multi-server PIR, you would need to query for on average half of all possible addresses per server and get an XOR of the balances from all the servers so from each server you just receive a few bytes. This XOR makes the difference between downloading just a few bytes and downloading the entire blockchain. However, I'm not sure there exists a way for Bitcoin Core and Electrum to quickly get millions of address balances at once. Probably, the right indexes for this don't exist yet. However, other PIR schemes could be used which require less lookups, I think.
In order to make a query, you need to send the query to the server. So if you are asking about a million addresses, you will need to send a list of a million addresses to the server.

When using a database, any column you are filtering by, you should have an index. Since electrum servers are able to return transactions about individual transactions and about individual addresses, their database contains indexes on both columns.

An electrum client initially has no information about any transaction or any address. So it will need to obtain information about addresses from an electrum server. This is important because it has no way of querying unrelated addresses that have received transactions.
hero member
Activity: 924
Merit: 5943
not your keys, not your coins!
Deterministic light wallet software is going to ask for information about addresses that have never been used, but will be used in the future. You can ask about every address that has ever received a transaction to hide your previously used addresses, but if you ask about an address that has never been used, and that address subsequently receives a transaction, chances are that address belongs to you. The only way to avoid asking for information about addresses that have never been used but will be used in the future is to ask for information about every address ever used, but if you do this, the server will essentially return the entire blockchain.
That's a very good point. That's why to perform traditional multi-server PIR, you would need to query for on average half of all possible addresses per server and get an XOR of the balances from all the servers so from each server you just receive a few bytes. This XOR makes the difference between downloading just a few bytes and downloading the entire blockchain. However, I'm not sure there exists a way for Bitcoin Core and Electrum to quickly get millions of address balances at once. Probably, the right indexes for this don't exist yet. However, other PIR schemes could be used which require less lookups, I think.

This isn't hard fundamentally, -- the rocket science programming has already been done: http://percy.sourceforge.net/
I know, right, that's why it would be so cool to see it 'in the field'! Cheesy
I'll definitely look into the schemes that were used in that implementation; maybe they address some of the issues in the first proposed, relatively simple scheme.

But the problem is that the computational cost the server is just astronomical.
Riight, I see. It's really a pity, because in cryptography we work in the 'ideal world' where XOR is free, databases are fast, etc... but then the challenge is really to implement it with good efficiency.

Unfortunately, already pretty much the only incentive for people to run electrum servers is to spy on the clients, and this eliminates that "benefit". Smiley
So you mean, if Electrum servers can't spy anymore due to such mechanisms, they lose the whole incentive to run such a server?
staff
Activity: 4326
Merit: 8951
This isn't hard fundamentally, -- the rocket science programming has already been done: http://percy.sourceforge.net/

Just make a database of transactions with spv proofs.  And also make another databases of addresses to tx-database-indexes.  Maybe staple a cleartext index of the address database to it to reduce the number of bisections needed.  Feed that to percy++.

Bitcoin-core has rpcs that will import txn with their proofs into the wallet-- created partially with this usage in mind!

But the problem is that the computational cost the server is just astronomical.  Under most reasonable cost metrics it's "cheaper" to just send the client the whole database.  This only becomes reasonable if you're assuming extremely expensive wireless data rates or what not, and then you still have the problem of who is going to pay for the server.

Unfortunately, already pretty much the only incentive for people to run electrum servers is to spy on the clients, and this eliminates that "benefit". Smiley
legendary
Activity: 2870
Merit: 7490
Crypto Swap Exchange
Electrum's use of bloom filters does not even address the issue without having an error rate that results in massive amounts of data being sent to the light node.

1. Electrum doesn't use bloom filter, it simply send list of address to Electrum server.
2. Bloom Filter (BIP 37) is broken and now succeed by BIP 157/158.
copper member
Activity: 2996
Merit: 2374
In this scheme, the privacy is guaranteed by the fact that the client asks various servers for on average half of the DB entries for each server. Since the client queries multiple servers, on some servers this set of entries does include the item of interest, while on others it does not. The distribution is again 50:50 so no server gains any information (not even a 50% probability reduction when guessing) about which item is actually queried.

Deterministic light wallet software is going to ask for information about addresses that have never been used, but will be used in the future. You can ask about every address that has ever received a transaction to hide your previously used addresses, but if you ask about an address that has never been used, and that address subsequently receives a transaction, chances are that address belongs to you. The only way to avoid asking for information about addresses that have never been used but will be used in the future is to ask for information about every address ever used, but if you do this, the server will essentially return the entire blockchain.

The number of addresses that have never been used is simply too large to hide addresses that have never received any transactions. There could be 100,000 light nodes each make queries to full nodes that take a year to transmit fully, and if proper RNG is used, it would be possible to deduct which addresses belong to which light node that receive transactions after the query is made.

Electrum's use of bloom filters does not even address the issue without having an error rate that results in massive amounts of data being sent to the light node. It is incorrect to say that electrum uses bloom filters.
legendary
Activity: 3472
Merit: 10611
IMO downloading a bunch of random blocks and discarding them after the addresses were extracted is a lot cheaper.
Surely the problem with this is if I send a query to a server asking for the balance of 1000 addresses last used in block X, 1000 addresses last used in block Y, and one address last used in block Z, then it is entirely obvious which address is mine.

Would a better option just be to connect to 20 different Electrum servers and ask each server to send you 1000 random addresses with a balance? (Change thr numbers as you see fit.)
We could connect to 20 different Electrum nodes too.

But to improve my original idea you could download 3000 blocks and not waste that much traffic either.
The SPV client sends the full node a SendCmpct message.
Then by having the headers already sends a GetData message for certain blocks.
The full node will return a compact block with header and short IDs (much smaller than sending the block itself).
The SPV client can now ask for any number of [full] transactions from that block by sending a GetBlockTxn message. (could be 1 tx per block).

The only issue with this (apart from putting pressure on the node to look up these blocks, constructing compact message and sending) is that I believe bitcoin core implementation of BIP-152 doesn't return old blocks (depth=5) as compact.
legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
Well, looking up balances of any address is possible in core if the balance is non-zero, since that's the UTX0 set.

But indeed there's no address index in bitcoin core, if you want to find the transaction history - they have a transaction index, but oddly won't do an address index. Meh.
legendary
Activity: 2268
Merit: 18775
It's not practical option because Electrum server have request limit to prevent DDoS attack.
Well sure, but to implement any of this suggestion will require lots of new functionality to be added to Electrum servers, so there is no reason the DDoS protection couldn't be addressed too. Given that some people run Electrum wallets with >1000 addresses, asking a server for a one off batch of 1000 random addresses doesn't seem too onerous, especially since you wouldn't need a new batch of addresses everybtime you opened the wallet.
legendary
Activity: 2268
Merit: 18775
IMO downloading a bunch of random blocks and discarding them after the addresses were extracted is a lot cheaper.
Surely the problem with this is if I send a query to a server asking for the balance of 1000 addresses last used in block X, 1000 addresses last used in block Y, and one address last used in block Z, then it is entirely obvious which address is mine.

Would a better option just be to connect to 20 different Electrum servers and ask each server to send you 1000 random addresses with a balance? (Change thr numbers as you see fit.)
legendary
Activity: 3472
Merit: 10611
An idea... The UTXO set is only around 4-5GB. I would easily allow a mobile wallet to use 5GB out of my... 256? 512? GB of storage, which I don't use anyway, if that's required to get perfect privacy.
The main point of using a SPV client is to reduce the amount of data downloaded and stored. I don't think users would go for something like that which they also have to update every block. There is also the additional issue of trust which can be mitigated to some extent if multiple nodes were used but it still is a big issue.

IMO downloading a bunch of random blocks and discarding them after the addresses were extracted is a lot cheaper.

Quote
I'm not sure; can't it do a single DB request which asks for e.g. 1 million items instead of doing 1M requests at one item each? It's a while that I worked with large databases, but I'm pretty sure such 'aggregated' queries are pretty fast. DMBS could actually even allow to query an XOR of the specified items directly, which should be even more performant.
Someone else should comment on this or it has to be tested, my knowledge of databases is very limited.
hero member
Activity: 924
Merit: 5943
not your keys, not your coins!
For example: user wants to query the balance of their address 'B'. (this could be transaction information or any other blockchain information as well)
As @ymgve2 the biggest problem with this idea is that a SPV client doesn't have the blockchain so it doesn't know which addresses were used already so it can not produce addresses A and C and if they are produced randomly the whole process becomes pointless.
Well, in standard 'linear summation PIR' though, the clients also don't have the database. So they also don't yet know about which item has which ID. The theoretical examples use indexes of datasets as a simplification; of course when implementing, it has to be possible to query for addresses (in Bitcoin scenario) or generic SQL queries (in database scenario) under PIR.
Maybe however, it requires a more sophisticated PIR scheme, that's possible as well... Smiley

An idea... The UTXO set is only around 4-5GB. I would easily allow a mobile wallet to use 5GB out of my... 256? 512? GB of storage, which I don't use anyway, if that's required to get perfect privacy.

It's not enough to ask 20 addresses, that's the issue.
The idea is that you could ask for thousands or millions of addresses per server. XOR is insanely performant on most databases.
You are forgetting that the SPV client doesn't just need the balance of its address, it needs the history. Meaning if you query 1 million used addresses the Electrum node has to send back at least a million transactions. Assuming 200 bytes on average, that is at least 200 MB data which is ~4 times bigger that the entire block headers file (some addresses have more transactions and some transactions are much bigger).
Yes, it's true; I have to think about this.

This is also a big burden on the Electrum node as it would have to look up inside its database a million times per request.
I'm not sure; can't it do a single DB request which asks for e.g. 1 million items instead of doing 1M requests at one item each? It's a while that I worked with large databases, but I'm pretty sure such 'aggregated' queries are pretty fast. DMBS could actually even allow to query an XOR of the specified items directly, which should be even more performant.

Was something like this maybe attempted already?
BIP 157 & 158 have similar goal, but it works differently. One of wallet which use it is Wasabi Wallet which also use Tor by default.
That sounds very interesting; I'm happy about any other kind of approach as well! As long as SPV clients don't need to continue sending cleartext queries to servers forever... Cheesy
legendary
Activity: 2870
Merit: 7490
Crypto Swap Exchange
Was something like this maybe attempted already?

BIP 157 & 158 have similar goal, but it works differently. One of wallet which use it is Wasabi Wallet which also use Tor by default.
legendary
Activity: 3472
Merit: 10611
For example: user wants to query the balance of their address 'B'. (this could be transaction information or any other blockchain information as well)
As @ymgve2 the biggest problem with this idea is that a SPV client doesn't have the blockchain so it doesn't know which addresses were used already so it can not produce addresses A and C and if they are produced randomly the whole process becomes pointless.
One solution could be to download one or more random blocks from (regular) full nodes by sending a getdata message (SPV client already has the headers) then use a bunch of random addresses from there.

It's not enough to ask 20 addresses, that's the issue.
The idea is that you could ask for thousands or millions of addresses per server. XOR is insanely performant on most databases.
You are forgetting that the SPV client doesn't just need the balance of its address, it needs the history. Meaning if you query 1 million used addresses the Electrum node has to send back at least a million transactions. Assuming 200 bytes on average, that is at least 200 MB data which is ~4 times bigger that the entire block headers file (some addresses have more transactions and some transactions are much bigger).

This is also a big burden on the Electrum node as it would have to look up inside its database a million times per request.
hero member
Activity: 924
Merit: 5943
not your keys, not your coins!
How does a brand new client with no information about the chain know which "dummy" addresses to query for? It can't ask for the content of randomly generated addresses, because then it would be pretty obvious which one is the real one since only one of the queried addresses were ever seen on the blockchain.
I am not sure yet on the implementation; I was asking mostly about the idea / concept and I'm happy to hear any suggestions! In the theoretical / classical PIR examples, the queries are just bit vectors. All the indexes with value '1', are elements to be queried. Of course, Bitcoin isn't a database like this, where every utxo has a specific database index. However, in classical databases we also don't just query with the item ID either, but instead queries consist of various properties that need to match, and PIR does work on such databases. So I'm sure it can be implemented in Bitcoin.

Also the performance degradation from querying the balance of 20x or 50000x or whatever addresses for each real one would make this useless in practice.
Well, on regular databases at least, querying one or multiple datasets doesn't incur much overhead over querying just one item; and database XOR is also quite fast. If it wasn't, the whole research field of PIR wouldn't exist, after all. I wouldn't say it's useless without even having tried it. Especially multi-server PIR would be quite easy to setup and could be experimented with to really get tangible results.
full member
Activity: 162
Merit: 230
How does a brand new client with no information about the chain know which "dummy" addresses to query for? It can't ask for the content of randomly generated addresses, because then it would be pretty obvious which one is the real one since only one of the queried addresses were ever seen on the blockchain.

Also the performance degradation from querying the balance of 20x or 50000x or whatever addresses for each real one would make this useless in practice.
hero member
Activity: 924
Merit: 5943
not your keys, not your coins!
Maybe, due to my tiredness today, but can you explain me a little bit what's this private information retrieval or how it works in simple terms, cause I make no sense from the wikipedia article. The only thing I've understood is that you can retrieve an item from a server without having to reveal this server-side; I'm not sure for this either.
Excellent! Smiley You can retrieve a database entry without revealing to the server which entry you are looking for. That's exactly what PIR is all about!

I'm trying to acknowledge this with common sense and fail at the how the hell part. How can you really be sure that what you're sending to the server cannot be seen by it?
Yep, some schemes are secure against computationally unbounded attackers, while others are only secure against computationally bounded attackers, but in practice they are all pretty good. The more sophisticated ones use OT etc., which is often needed to overcome the issue of having multiple non-colluding servers. The cool thing in Bitcoin is that we DO already have such servers in place. So we can use one of the much simpler schemes like the 'original' PIR concept from '95.

In this scheme, the privacy is guaranteed by the fact that the client asks various servers for on average half of the DB entries for each server. Since the client queries multiple servers, on some servers this set of entries does include the item of interest, while on others it does not. The distribution is again 50:50 so no server gains any information (not even a 50% probability reduction when guessing) about which item is actually queried.

Each server returns an XOR of the queried items; when XORing all results together, the result is the one single 'item of interest', since the queries are constructed such that all other items cancel each other out through XOR.

For example, in a database with 8 elements, where element 3 is the one of desire, each server gets queries as follows:
  • A: 01101000 (random 8-bit string)
  • B: 10010001 (other random 8-bit string)
  • C: 11011001 (calculated like: 00100000 + (A + B))

This gives very high security (I think information-theoretical) because the number of 1's and 0's are on average equally distributed.
So in theory querying 2 servers like this would result in a result 'download' that has the size of the balances of all Bitcoin addresses (on average half of all addresses per server) Grin

Maybe better explanation of the above example:
Code:
A: 01101000 => server A is queried for the second, third and 5th database item
B: 10010001 => server B is queried for items 1, 4, 8
C: 11011001 => server C is queried for items 1, 2, 4, 5, 8

01101000 was chosen uniformly at random, 10010001 as well.
We want item 3, so the query would be 00100000.

Construct C:
    01101000
XOR 10010001
XOR 00100000
------------------
    11011001

The servers return the queried item, but after XORing them. In a database with 8 items it doesn't become quite obvious yet, but think of it having 100.000 items. On average, every server is queried for ~50.000 items, performs an XOR of them and returns that to the client.

XORing the files (or address balances) is fine because the query C is constructed so that XORing all of the queries (or all of the resulting datasets) nets to just 'item 3'.
legendary
Activity: 1512
Merit: 7340
Farewell, Leo
Maybe, due to my tiredness today, but can you explain me a little bit what's this private information retrieval or how it works in simple terms, cause I make no sense from the wikipedia article. The only thing I've understood is that you can retrieve an item from a server without having to reveal this server-side; I'm not sure for this either.

The article requires you to know terms such as oblivious transfer or what's the quadratic residuosity problem which I've no idea.

I'm trying to acknowledge this with common sense and fail at the how the hell part. How can you really be sure that what you're sending to the server cannot be seen by it? Let's assume you want to know the balance of an address. In order to find out, you'll have to send it to a server and the server will return you the amount based on the search it has made. If it doesn't know the address how will it search its database?
hero member
Activity: 924
Merit: 5943
not your keys, not your coins!
Could this not be done at the client side?
Electrum has a list of servers, it asks 20 servers for information 20 addresses and just gives the user the information for the 5 they need. (made up numbers)
It's not enough to ask 20 addresses, that's the issue.
The idea is that you could ask for thousands or millions of addresses per server. XOR is insanely performant on most databases.

In theory, in true multi-server-PIR you actually construct fully random queries for all but one server, then build that query so that the results of all servers XORed together result in the 'actual query'.

For example, in a database with 8 elements, where element 3 is the one of desire, each server gets queries as follows:
  • A: 01101000 (random 8-bit string)
  • B: 10010001 (other random 8-bit string)
  • C: 11011001 (calculated like: 00100000 + (A + B))

This gives very high security (I think information-theoretical) because the number of 1's and 0's are on average equally distributed.
So in theory querying 2 servers like this would result in a result 'download' that has the size of the balances of all Bitcoin addresses (on average half of all addresses per server) Grin

I'm not sure if it is plausible to query each server for on average half of all Bitcoin addresses (even if just the XOR of them has to be transmitted). So maybe another PIR scheme may be better suited, or it may be deemed 'good enough' to just query each server for e.g. max. of 10.000 addresses or something like that.

But, it does generate a lot more traffic for the servers. Which most (all?) people are providing for free so there is that.
Well, think about the mobile wallet users Grin I don't think refreshing your wallet balance and waiting for a 5GB download (made up number Wink) would be much fun and cheap on the monthly data plan ^^
legendary
Activity: 3500
Merit: 6320
Crypto Swap Exchange
Could this not be done at the client side?
Electrum has a list of servers, it asks 20 servers for information 20 addresses and just gives the user the information for the 5 they need. (made up numbers)

Then it does internally what it needs to do a transaction and sends that through a 21st server.

The local desktop keeps a record of what server it asked what and what server it sent what thought.

Not as robust as full PIR but with enough clients asking enough questions it really makes data mining harder.

But, it does generate a lot more traffic for the servers. Which most (all?) people are providing for free so there is that.

-Dave
hero member
Activity: 924
Merit: 5943
not your keys, not your coins!
So, I was thinking a lot about privacy for users that don't run a personal Full Node + Electrum Server themselves.
Of course, with only ~10.000 nodes and millions of Bitcoin users globally, it means the vast majority of Bitcoin users is sacrificing privacy by querying (sometimes fixed, built-in) Electrum servers.

In cryptography there is the concept of PIR / Private Information Retrieval.
The trivial PIR scheme is the one where the client downloads the whole database and performs the query locally, which would equate to running their own full node.
However, since '95 with PIR there are much smarter ways to achieve private information retrieval which I think would highly benefit the majority of Bitcoin users.

The earliest, simpler (but more sophisticated than trivial PIR) PIR concepts actually rely on multiple, non-colluding servers with exact copies of the database. This is exactly what we have in Bitcoin already. Tons of nodes with an exact copy of the blockchain. This means it might be possible to implement this mostly client-side with maybe little changes to ElectrumX. Many recent improvements in PIR are mostly aimed to reduce the number of needed servers; however, we don't need to worry about this at all which would make the whole thing much simpler.

The core idea of multi-server PIR is as follows: the client doesn't query just one (Electrum) server, but instead multiple, for various different files (addresses). Only one of these files (addresses) is the one actually interesting for the client. That way any one server cannot assume that any of the datasets queried actually belong to the client.

For example: user wants to query the balance of their address 'B'. (this could be transaction information or any other blockchain information as well)
Instead of querying one server for the balance of address B, it queries multiple servers like this:
  • 'Server 1, give me balance of addresses A, B, C'
  • 'Server 2, give me balance of addresses A, C'.

Now, mostly to save on communication(I think), the servers XOR the binary representation of the balances (in the example: balanceA+balanceB+balanceC and balanceA+balanceC) locally before sending; then the client will XOR the results: (balanceA+balanceB+balanceC) + (balanceA+balanceC). The unwanted, randomly chosen addresses' balances cancel each other out and they gain the balance of address B.

To implement it into ElectrumX, and retain backwards compatibility to non-PIR queries, a PIR query would just need some kind of extra indicator that makes it clear the client isn't just querying multiple addresses for which they want the separate balances in clear; instead, that the client wants the XOR of those balances.
It would be possible to just get the balances and discard the ones a client is not actually interested in, as well, but especially with larger queries the communication overhead would get quite big. XOR of the query results solves this issue.

What do you guys think about implementing such thing in Bitcoin / ElectrumX? Was something like this maybe attempted already? Other suggestions for improvements of the idea? Or is it all superfluous for some reason after all? Glad for any input on the topic!
Jump to: