This thread hasn't had a lot of activity recently. Here's what I'd say:
Bloom filter based on SHA(SHA(privkey)) is interesting. Allows for plausible deniability by encoding multiple wallets (different wallets are accessed by using a different password). If OP wants to change the spec to a simple bloom filter, I'd be in support of that. Proposed algorithm:
let m (number of bits in field) = 32
let n (number of passwords inserted) = 2 (assuming 1 "real" password and 1 "fake" password. This is only for optimization; the algorithm supports more)
To minimize the probability of a false collision, we want to use (m/n)ln(2) "hash functions" per password, so
let k (number of set bits per password) = (32/2)ln(2) = 11.09 = ~11
The probability of false positives is (1-e^(-kn/m))^k = (1-e^(-22/32))^11 = ~.000458
Now, this is much larger than the chance of a false positive with just straight up SHA(SHA(privkey)), but the ability to add arbitrary passwords is pretty awesome. I would like to encourage discussion on this. I am honestly on the fence about this one; both are good solutions.
SHA(SHA(privkey)):
pro: Very low chance of the user accidentally opening wallet with wrong password
con: If compelled to decrypt wallet, there's no easy way to avoid actually giving them your real wallet info
Bloom filter:
pro: Very easy to add a fake/secondary wallet (which you can actually put a little bit of money in) with any fake password you want. Plausible deniability.
con: Higher chance of user incorrectly entering password.
Pseudocode:
filter = 0 # 32 bit integer
valid_passwords = [user_password, fake_password] # Can be any number of passwords. To preserve plausible deniability of all users, the spec should mandate a randomly generated fake password if the user doesn't want one
for password in valid_passwords:
privkey = decrypt_wallet(password).calculate_master_secret()
hash = SHA(SHA(privkey))
for i in range(0,11):
filter |= 1 << (hash[i] & 0x1F) # Sets a random bit in the filter to 1
English description:
Create a real password P1 and a "fake" password P2. The user can specify P2 if they want a "fake" password that will decrypt to a valid wallet. If the user doesn't want a "fake" password, P2 is to be randomly generated (so all users have a similar bloom filter). For each password P0...Pn:
1. Decrypt the wallet and calculate the master secret key Kn
2. Calculate SHA(SHA(Kn)) and call this Hn (Hn[0:4] is currently used in the spec as a wallet checksum)
3. For each of the first 11 bytes in Hn (let's call each byte Hn_i):
a. Take the lowest 5 bits of Hn_i. Call this number (which ranges from 0 to 31) D
b. Set the Dth bit of the bloom filter to 1
I think the existing date field is sufficient for optimization. It doubt it would be worth the effort to put more detail in the wallet spec. I would like to hear from gmaxwell and other client devs on this, since they're the authority on SPV optimization. Yes, we can optimize scan times by putting restrictions on the order of wallet generation, but I think that belongs in a separate BIP.