Pages:
Author

Topic: PIR / Private Information Retrieval - Privacy for light wallets? - page 2. (Read 797 times)

legendary
Activity: 3472
Merit: 10611
Are you sure most of them actually use Electrum protocol (https://electrumx-spesmilo.readthedocs.io/en/latest/protocol.html) or they have similar protocol which simply send list of addresses to server?
You are right, most light wallets (almost all phone wallets) are server dependent. There may be a hidden option in one or two that let user switch from their centralized server to the more decentralized Electrum nodes but majority will rely on their own service.
legendary
Activity: 2870
Merit: 7490
Crypto Swap Exchange
Finally, if I am satisfied, I will empty my Electrum Wallet and keep the Wasabi Wallet.

You might want to keep your Electrum wallet and it's backup in rare case someone send Bitcoin to your old address.

From the light clients I use, most seem to be based on the Electrum protocol, with the exception of Breez and Wasabi.

Are you sure most of them actually use Electrum protocol (https://electrumx-spesmilo.readthedocs.io/en/latest/protocol.html) or they have similar protocol which simply send list of addresses to server?
hero member
Activity: 882
Merit: 5829
not your keys, not your coins!
Okay, so I read a few bits and pieces here and there about BIP157 and BIP158 today. It seems to me that using these should be a viable alternative for privately using a light wallet, while skipping Electrum altogether.

From the light clients I use, most seem to be based on the Electrum protocol, with the exception of Breez and Wasabi. They allow to connect to a Bitcoin node that runs with these new BIPs activated (and use a provided one by default which should be fine as well, due to these BIPs).
newbie
Activity: 22
Merit: 6
Thanks @BlackHatCoiner  and @o_e_l_e_o for the insights.

I've already installed the Wasabi Wallet, generated a wallet and written down the 12 seed words and the password. I have also generated a receive address. Next step is to restore the wallet on a second computer, to verify that Wasabi is actually capable of restoring a wallet using the seed words and the password. If everything goes well I will send some BTC to it. Finally, if I am satisfied, I will empty my Electrum Wallet and keep the Wasabi Wallet.
legendary
Activity: 2268
Merit: 18587
That's the feature I am looking for. Having a "paper backup" gives me some peace of mind. I will look further into it and hopefully I will drop electrum and switch to wasabi wallet.
That's certainly the right attitude to have. Backing up your wallet on to paper and keeping it completely non-digital is the best way to store a back up.

Wasabi will generate a 12 word seed phrase for you when you first set up your wallet. You need to write this down. This is compatible with the BIP39 standard for bitcoin wallets, and so you can use it (with your password as I explain below) to recover your wallet in to a fresh install of Wasabi or in to another piece of BIP39 compatible software.

You will also be asked to set up a password for your wallet. Wasabi is a little different from most other wallets, in that this password doesn't just encrypt the wallet saved on your device, but it is also used as a passphrase (also called seed extension or 13th word) in conjunction with your seed phrase to derive your wallet. That means that the seed phrase on its own cannot recover your wallet. You must also have this password in addition. Write it down too.

Ideally store these two written back ups separately.
legendary
Activity: 1512
Merit: 7340
Farewell, Leo
Yeah, Wasabi enhances both your privacy and anonymity. One of its feature, that is called “CoinJoin” and which was firstly proposed in this very forum, combines yours inputs with strangers' and creates outputs that incommode the traceability.

As for anonymity, everything's done via Tor by default. Don't use VPNs, or anything you think will be better than just using Tor. It's more than fine by itself.
newbie
Activity: 22
Merit: 6
Thanks @ETFbitcoin and @o_e_l_e_o for your answers.

I checked the wasabi wallet and it says something like
Quote
The process will also generate a set of recovery words, which you’ll need in case you forget your password.

That's the feature I am looking for. Having a "paper backup" gives me some peace of mind. I will look further into it and hopefully I will drop electrum and switch to wasabi wallet.
legendary
Activity: 2870
Merit: 7490
Crypto Swap Exchange
Is there any way to mitigate the risks of using electrum.

Aside from what @o_e_l_e_o said, consider Wasabi Wallet (https://blog.wasabiwallet.io) which offer good balance between ease of use and privacy protection.

What is this spying about? Recoding the IP associated with the addresses and transactions?

Aside from IP, address and transaction, there are few other things such as
1. Date/time when you connect to server.
2. List of your addresses. It means the server know address A, B, ..., N is owned by same person.
3. Range of your Electrum version which could be determined from version of Electrum protocol (https://electrumx-spesmilo.readthedocs.io/en/latest/protocol.html).

Is there any way to mitigate the risks of using electrum.
The only way to fully mitigate the risks would be to run your own Electrum server and point your Electrum client at your own server.

Take note your privacy already weakened if you use same wallet, you'll need to move your Bitcoin to different wallet without weakening your privacy (e.g. move all Bitcoin from different address in single transaction).
legendary
Activity: 2268
Merit: 18587
Is there any way to mitigate the risks of using electrum.
The only way to fully mitigate the risks would be to run your own Electrum server and point your Electrum client at your own server.

Using a VPN maybe?
Using a VPN hides your true IP address from the Electrum servers you connect to. However, those Electrum servers can still see all the addresses in your wallet you are querying and therefore link them all together. Whether or not that compromises your privacy will depend on how and where you are spending your bitcoin.

In addition, using a VPN simply moves the knowledge of your real IP address from the Electrum server to your VPN server. A better solution (but still not 100%) would be to use Tor instead.
newbie
Activity: 22
Merit: 6

Unfortunately, already pretty much the only incentive for people to run electrum servers is to spy on the clients, and this eliminates that "benefit". Smiley

I am a bit worried about this. I am an occasional user of electrum. I like the fact that you can restore the walled with the seed and that is "instant on" in the sense that there is no need to download the blockchain. Is there any way to mitigate the risks of using electrum. Using a VPN maybe? What is this spying about? Recoding the IP associated with the addresses and transactions?

Thanks in advance.
hero member
Activity: 882
Merit: 5829
not your keys, not your coins!
n0nce gave an informal description of a IT-PIR like technique.  The way he described it though doesn't provide much privacy, because his description makes it sound like he's only asking for a couple distractor addresses. I think that's mostly just an artifact of the description.  In IT-PIR every query normally operates on the entire database.
Yeah; to get information-theoretical security, every query needs to ask for a random (looking) set of items, on average half the number of items in the database.
To make it simple, I showed it using random bit vectors and 'constructing' the last one so that the bitwise XOR of all vectors yields just one address.

In theory, in true multi-server-PIR you actually construct fully random queries for all but one server, then build that query so that the results of all servers XORed together result in the 'actual query'.

For example, in a database with 8 elements, where element 3 is the one of desire, each server gets queries as follows:
  • A: 01101000 (random 8-bit string)
  • B: 10010001 (other random 8-bit string)
  • C: 11011001 (calculated like: 00100000 + (A + B))

If a particular SQL query accesses a particular record, how would this be hidden from the server hosting the DB? If someone queries a potential address that doesn’t exist in the DB (because it has never received a transaction)(via a where type clause), how would this be hidden from the server?
Check my above quote: you don't just query one record. In my toy example, the client queries server A with 3 out of 8 items, server B with 3 out of 8 and server B with 5 out of 8 items. On average, you query half of the database. The element of desire is the one at index 2 (third item); two of the three servers don't even get queried in this case, for that item. So they learn absolutely nothing from those queries, which is why they gain absolutely no advantage and the security is indeed information-theoretical in nature.

In this cryptosystem the public key is an RSA number (the product of two large primes).  The main property of the GM cryptosystem is that if you know the private key (the factorization) you know the order of the multiplicative group mod P. And if you know the order, you can easily tell if a number is a quadratic residue (if it has a sqrt mod P) by computing the jacobi symbol which can be done with an exponentiation ladder or by using the extended euclidean algorithm. Under the assumption of the cryptosystem if you don't know the order you can't tell if a number is a QR or not.

GM is useful because if A=B*C (mod P)  then QR(A) = QR(B) xor QR(C).  So essentially it is a cryptosystem where I can give you encrypted values and you can apply an arbitrary set of XORs on them (and xors on the results of earlier xors) and also xor them with 1 (by squaring them) and then I can decrypt the result of your computation by using my private key to test the QR-ness of the result.
Goldwasser-Micali is a great option as well! It doesn't give information-theoretical security, but computational security which is definitely enough for all intents and purposes as of now. It has its drawbacks as well; but ditching IT-'requirement' definitely opens up more options for more efficient algorithms. However, they are usually a bit trickier to follow along, so I just presented the 'original' IT-PIR scheme as proposed by Chor in '95, since I find it very easy and natural to understand & mostly wanted to get the idea out there.
staff
Activity: 4172
Merit: 8419
I admit I've given more attention on CPIR (because I think the sybil problem makes the IT assumptions not that useful), so I can't off the top of my head give you a concrete construction for a toy IT PIR that has good communications efficiency.

The thing I think you are probably missing most doesn't require you understand the PIR at that level of detail:

What PIR tools really give you is just an "oblivious array":  An array of (e.g.) 100 byte entries of known size, and the PIR lets the client query any entry without the server(s) learning which entry the client queried.

From this alone you can construct your private "database": E.g. to do a prefix match you might require that the array be sorted and have the client perform a bisection search on the array using log(N) oblivious queries.  Any database operation you can imagine can be constructed this way, but obviously they'll only be efficiency if they don't access to many cells. Smiley

While I don't recall what the concrete construction looks like for IT PIR,  I can give a simple-ish (or at least from memory) example of a how a basic CPIR protocol is constructed using the the Goldwasser-Micali (GM) cryptosystem:

In this cryptosystem the public key is an RSA number (the product of two large primes).  The main property of the GM cryptosystem is that if you know the private key (the factorization) you know the order of the multiplicative group mod P. And if you know the order, you can easily tell if a number is a quadratic residue (if it has a sqrt mod P) by computing the jacobi symbol which can be done with an exponentiation ladder or by using the extended euclidean algorithm. Under the assumption of the cryptosystem if you don't know the order you can't tell if a number is a QR or not.

GM is useful because if A=B*C (mod P)  then QR(A) = QR(B) xor QR(C).  So essentially it is a cryptosystem where I can give you encrypted values and you can apply an arbitrary set of XORs on them (and xors on the results of earlier xors) and also xor them with 1 (by squaring them) and then I can decrypt the result of your computation by using my private key to test the QR-ness of the result.

For discussion, we'll imagine that our private database is an array of 2^16 bits which you have and I want to query some particular bit without telling you which.  You arrange the database into a 256x256 square array, and the bit I want to query is at position [i,j].    I pick 256 random numbers mod P such that all of them are quadratic residues, except for the one at position j, Lets call them:  q_0 to q_255

I send you my public key P and these 256 q values. For each row you make a copy of the q values and square (mod P) the value when the bit in that cell is 0, then compute the product (mod P) of all the processed values for that row.  At the end you have 256 values-- one for each row, you send them to me.

reply_row = product(data_(row,col) ? q_col : q_col^2)

I ignore all but the i-th one, and then because I know the private key I can test if the value in question is a 0 by checking if it's a QR or not (because of the above mentioned: QR(A) = QR(B) xor QR(C)).

Now this version is inefficient in that you send me back 256 values, but I only pay attention to the i-th one.   This inefficiency can be fixed by applying the scheme recursively (using the "reply" products as the table being queried).

Extending the scheme to more than 1-bit-per-cell is simple:  You still send the same query, but apply it in parallel to as 1-bit-tables as your message has bits.

Main downside is doing all this mod P work (where P has to be big enough to make it unfactorable!) is slow, and also the queries are size sqrt(table_cells) (with the recursion the response can be as small as you like).
copper member
Activity: 2870
Merit: 2298
There are multiple kinds of PIR.

Information theoretic PIR is


n0nce gave an informal description of a IT-PIR like technique. 
I read some papers on the topic, but I don’t think I have a clear understanding of how IT PIR works.

If a particular SQL query accesses a particular record, how would this be hidden from the server hosting the DB? If someone queries a potential address that doesn’t exist in the DB (because it has never received a transaction)(via a where type clause), how would this be hidden from the server?
sr. member
Activity: 1036
Merit: 350

Percy implements both kinds (and can combine them too)-- and its pages describe the capabilities in great detail.  It might be too technical for some people, --- but that's okay because no one is obligated to give an opinion here if the material is beyond your interest or patience to figure out. Smiley


This forum is lucky to had someone as knowledgeable as mr. Maxwell. He's a legend in bitcoin.

staff
Activity: 4172
Merit: 8419
There are multiple kinds of PIR.

Computational PIR is based on asymmetric cryptography. With CPIR you query one server and it learns nothing about what you're querying. But the downside is that it takes a lot of CPU time.

Information theoretic PIR is based on threshold secret sharing. With IT-PIR you query M servers and no one learns anything about your queries so long as N of the M don't collude.  IT PIR is faster, but it's potentially vulnerable to sybil attack (you think you're connecting to N independent parties but they're not really independent).

Percy implements both kinds (and can combine them too)-- and its pages describe the capabilities in great detail.  It might be too technical for some people, --- but that's okay because no one is obligated to give an opinion here if the material is beyond your interest or patience to figure out. Smiley

n0nce gave an informal description of a IT-PIR like technique.  The way he described it though doesn't provide much privacy, because his description makes it sound like he's only asking for a couple distractor addresses. I think that's mostly just an artifact of the description.  In IT-PIR every query normally operates on the entire database.
sr. member
Activity: 1036
Merit: 350

It is increasing the privacy. Did you read about PIR at all so far?
Yeah I glanced over it but you didn't really give the most concrete simple explanation.

Quote
Just think about it: in the scheme I suggested (which seems unpractical to implement), in theory the privacy is what we call 'perfect'. The server learns nothing at all when receiving the query. The query would be a bitstring of address indexes (I know, such a thing doesn't exist), of which the client wants the balances. Since they are chosen at random, on average half the bits are set, so half of the addresses are selected and their balances retreived, then XORed and finally returned. Any one server receives a list of addresses that has 0.5 probability of actually containing the address the client is interested in, so it doesn't even get an advantage over a random guesser of 50%, but no advantage at all. Perfectly secure.

if you were making all these queries to the same server over and over then yeah but since you're making them to independent servers that have no knowledge of each other and the requests they are receiving then that's different I would think.

what you're basically trying to argue is that by polluting the network with enough requests to different servers you can conceal the actual information you're requesting. i doubt it's really accomplishing that or anything useful. Cheesy
staff
Activity: 4172
Merit: 8419
Unfortunately, already pretty much the only incentive for people to run electrum servers is to spy on the clients, and this eliminates that "benefit". Smiley
So you mean, if Electrum servers can't spy anymore due to such mechanisms, they lose the whole incentive to run such a server?

Exactly, and especially no incentive to run PIR version that would be many times more expensive to run.

It's even hard to have the users pay for access-- because without access they can't spend their Bitcoins!  (and plus, most users would not want to pay a fair price when they can get wallet services from someone that will spy on them but lie about it for free)
hero member
Activity: 882
Merit: 5829
not your keys, not your coins!
~
In order to make a query, you need to send the query to the server. So if you are asking about a million addresses, you will need to send a list of a million addresses to the server.

An electrum client initially has no information about any transaction or any address. So it will need to obtain information about addresses from an electrum server. This is important because it has no way of querying unrelated addresses that have received transactions.
Yes, it's true, it's not trivial indeed. For now, I just use my own Electrum node (guide coming soon), but there must be a better solution for the majority of users which does not run a full node somehow.


In this scheme, the privacy is guaranteed by the fact that the client asks various servers for on average half of the DB entries for each server. Since the client queries multiple servers, on some servers this set of entries does include the item of interest, while on others it does not. The distribution is again 50:50 so no server gains any information (not even a 50% probability reduction when guessing) about which item is actually queried.

unless you have a way of requesting information that the server doesn't know what it is serving then it's not increasing the privacy at all. also, your solution buts a burden on other servers to have to serve data that you're not really interested in you're just using them and wasting their resources/bandwidth.
It is increasing the privacy. Did you read about PIR at all so far? Just think about it: in the scheme I suggested (which seems unpractical to implement), in theory the privacy is what we call 'perfect'. The server learns nothing at all when receiving the query. The query would be a bitstring of address indexes (I know, such a thing doesn't exist), of which the client wants the balances. Since they are chosen at random, on average half the bits are set, so half of the addresses are selected and their balances retreived, then XORed and finally returned. Any one server receives a list of addresses that has 0.5 probability of actually containing the address the client is interested in, so it doesn't even get an advantage over a random guesser of 50%, but no advantage at all. Perfectly secure.
sr. member
Activity: 1036
Merit: 350

In this scheme, the privacy is guaranteed by the fact that the client asks various servers for on average half of the DB entries for each server. Since the client queries multiple servers, on some servers this set of entries does include the item of interest, while on others it does not. The distribution is again 50:50 so no server gains any information (not even a 50% probability reduction when guessing) about which item is actually queried.



unless you have a way of requesting information that the server doesn't know what it is serving then it's not increasing the privacy at all. also, your solution buts a burden on other servers to have to serve data that you're not really interested in you're just using them and wasting their resources/bandwidth.
copper member
Activity: 2870
Merit: 2298
Deterministic light wallet software is going to ask for information about addresses that have never been used, but will be used in the future. You can ask about every address that has ever received a transaction to hide your previously used addresses, but if you ask about an address that has never been used, and that address subsequently receives a transaction, chances are that address belongs to you. The only way to avoid asking for information about addresses that have never been used but will be used in the future is to ask for information about every address ever used, but if you do this, the server will essentially return the entire blockchain.
That's a very good point. That's why to perform traditional multi-server PIR, you would need to query for on average half of all possible addresses per server and get an XOR of the balances from all the servers so from each server you just receive a few bytes. This XOR makes the difference between downloading just a few bytes and downloading the entire blockchain. However, I'm not sure there exists a way for Bitcoin Core and Electrum to quickly get millions of address balances at once. Probably, the right indexes for this don't exist yet. However, other PIR schemes could be used which require less lookups, I think.
In order to make a query, you need to send the query to the server. So if you are asking about a million addresses, you will need to send a list of a million addresses to the server.

When using a database, any column you are filtering by, you should have an index. Since electrum servers are able to return transactions about individual transactions and about individual addresses, their database contains indexes on both columns.

An electrum client initially has no information about any transaction or any address. So it will need to obtain information about addresses from an electrum server. This is important because it has no way of querying unrelated addresses that have received transactions.
Pages:
Jump to: