Pages:
Author

Topic: Proposal: Base58 encoded HD Wallet root key with optional encryption - page 2. (Read 20986 times)

member
Activity: 116
Merit: 10
Updated the spec.

Changed prefix to 2 bytes, 'RK' and 'rk' for clear version and encrypted version respectively.
Added entropy field to encrypted version, moved KDF field from prefix into entropy field.
Changed computation of H to use PBKDF2-HMAC-SHA512 instead of Scrypt.
Changed checksum field to bloom field in encrypted version. Now supports 2 passwords.
member
Activity: 116
Merit: 10
Ok, I've done some searching and unfortunately, there are no 2 byte prefixes for all 3 possible lengths if we increase everything by 1 or 2 bytes.

I'm down for increasing the length a bit with a random salt. I don't have any strong opinions on how this is done, but it sounds good to me.

That would also mean we could go back to my original idea about sha(sha(privkey)) being used as a bloom filter element.

Yes. I'm making the change right now. I'm currently updating the test vectors and adding new ones, plus second password.

member
Activity: 98
Merit: 10
Ok, I've done some searching and unfortunately, there are no 2 byte prefixes for all 3 possible lengths if we increase everything by 1 or 2 bytes.

I'm down for increasing the length a bit with a random salt. I don't have any strong opinions on how this is done, but it sounds good to me.

That would also mean we could go back to my original idea about sha(sha(privkey)) being used as a bloom filter element.
member
Activity: 116
Merit: 10
No BIP number as of yet. I'll have to ask the list again, once we've finished the bloom filter stuff.

I'd say call it HD wallet root key. Smiley
newbie
Activity: 39
Merit: 0
This proposal is next up for development in my bip32utils library.  Do you have any indication yet what a BIP number might be for it, and do you have a proposed short name for it?  Otherwise, I'll call it the "riplin export format" Smiley
member
Activity: 116
Merit: 10
Ok, I've done some searching and unfortunately, there are no 2 byte prefixes for all 3 possible lengths if we increase everything by 1 or 2 bytes. But I found a way to make it work if we increase by 1/2/3 bytes for the 16/32/64 byte root keys respectively. Not ideal, but at least there's a pattern.

I've also changed the prefix from 'ws' and 'WS' to 'rk' and 'RK', something I wanted to do ever since the term 'root key' was introduced.

The unencrypted lengths stay the same, but the prefix length is reduced to 2 bytes, giving us one extra byte.

Actually, 1 byte less works for the unencrypted variant. Updated the table.

LengthPrefixMinMaxCount
24RK0x28C10x28C66
40RK0x4AC50x4AD113
72RK0xFBB30xFBDE44
26rk0xF83F0xF85321
43rk0x67310x67399
76rk0x4EB40x4EB96

So the unencrypted format becomes something like:

prefix(2 bytes) + date(2 bytes) + checksum(4 bytes) + root key(16/32/64 bytes)


The encrypted format becomes something like:

prefix(2 bytes) + date(2 bytes) + KDF(5 bits) + entropy(11/19/27 bits) + bloom filter(4 bytes) + encrypted root key(16/32/64 bytes)


Thoughts?
member
Activity: 116
Merit: 10
There's a third option. Basically, if we want the same security as we have right now, the bloom has to be based on the full decryption pass and thus can't be part of the salt.

So we need to add some extra bytes for salt entropy. Currently, the prefix is 3 bytes long. I'll have to check, but if we add 1 or 2 extra bytes, it'll become 2 bytes again, giving us either 11 or 19 bits of extra entropy (+ 5 bits for KDF selection).

I'll try and have a look at that tomorrow.
member
Activity: 98
Merit: 10

Actually, come to think of it, this gives a brute-force attack a big optimization. They only have to test against the bloom filter now. There's no need to do the strong hash. So this has to be hashed using the strong hash.


OK, we have two options here:

1. Use strongH(password) to calculate bloom filter. Slow, but secure. Unfortunately, this can not be delegated. So perhaps a no-go.

2. Use sha(password) to calculate bloom filter. Fast, but reduces the password search space by up to eleven-ish bits (I think). I thought you were saying you were OK with this. However, I guess eleven-ish bits is a little high for me.

Alternate proposed solution:

Code:
filter = 0 # 32 bit integer
valid_passwords = [user_password, fake_password] # Can be any number of passwords. To preserve plausible deniability of all users, the spec should mandate a randomly generated fake password if the user doesn't want one
for password in valid_passwords:
     hash = PBKDF2_HMAC_SHA512(password, "", 65536, 11)
     for i in range(0,11):
          filter |= 1 << (hash[i] & 0x1F) # Sets a random bit in the filter to 1

This still allows the attacker to reduce the search space by something like 2^11, but at the expense of doing 2^16 extra rounds of PBKDF2-HMAC-SHA512 per checked password.

Even the lowest-end device can handle this without delegation.
member
Activity: 116
Merit: 10
Actually, it makes you wonder whether or not the key needs to be encrypted with a strong KDF at all at that point. If the strong KDF is used to generate the bloom filter, then the hash used to encrypt is going to be irrelevant, right? As an attacker, you're testing to see if you generated a password that passes the bloom filter and you move on from there, so that's the step that should be costly.

So that means that this is the step that has to be outsourced. And that means that we need a good salt for this step. So we're right back where we started.
member
Activity: 116
Merit: 10
3. This means doing the strong hash multiple times and not being able to outsource it. I think it's fine to use something like a double SHA256 here. It's not going to leak anything useful anyway.

3. I agree. Revised code:

Code:
filter = 0 # 32 bit integer
valid_passwords = [user_password, fake_password] # Can be any number of passwords. To preserve plausible deniability of all users, the spec should mandate a randomly generated fake password if the user doesn't want one
for password in valid_passwords:
     hash = sha(password)
     for i in range(0,11):
          filter |= 1 << (hash[i] & 0x1F) # Sets a random bit in the filter to 1

Actually, come to think of it, this gives a brute-force attack a big optimization. They only have to test against the bloom filter now. There's no need to do the strong hash. So this has to be hashed using the strong hash.

member
Activity: 98
Merit: 10
1. There isn't really any reason to not at least *allow* PBKDF2-SHA512 as one of the available KDFs.

[1.] How many iterations do you suggest if we were to add this as one of the KDFs?

2. It would be easier for embedded and mobile devices if Scrypt wasn't mandatory.

3. The Scrypt with parameters 10,1,1 might as well be replaced with PBKDF2-SHA512.

[2.] Very well, I'll see if I can rework it to use PBKDF2-SHA512 for the preH and postH sorry, I meant for H.

Alternate bloom filter proposal:
Code:
filter = 0 # 32 bit integer
valid_passwords = [user_password, fake_password] # Can be any number of passwords. To preserve plausible deniability of all users, the spec should mandate a randomly generated fake password if the user doesn't want one
for password in valid_passwords:
     hash = strongH(password, "") # strongH is the user-selected KDF. Fixed/no salt used. See below.
     for i in range(0,11):
          filter |= 1 << (hash[i] & 0x1F) # Sets a random bit in the filter to 1
It's acceptable to use strongH without a salt here, because we're only revealing 10 or fewer bits of the password hash. We're not revealing enough information about the password to do any damage. This means the bloom filter is a function of *both* passwords (one of which may be random), so it still adds useful entropy to the salt.

3. This means doing the strong hash multiple times and not being able to outsource it. I think it's fine to use something like a double SHA256 here. It's not going to leak anything useful anyway.

1. We have a lot of KDF codes to spare. I would recommend 2 or even 3 PBKDF2-HMAC-SHA512 versions (just like we have with Scrypt). We are not too focused on performance, so these are my suggested iteration counts:

65536 (2^16). More than most existing implementations use for the faster PBKDF2-HMAC-SHA256. Acceptable performance on very resource-constrained embedded systems.

2097152 (2^21). Very heavy. In Python, takes 30+seconds to do this many rounds of HMAC-SHA512. Probably takes at least a few seconds in well-optimized C.

And, if we want a really heavy option (although honestly, I think the Scrypt options are probably fine for this), we can also do

67108864 (2^26). For the very paranoid.

2. Great! Like I said, it should be pretty much a drop-in to the list of KDFs. It would work pretty much exactly the same. What do you think about using it for the final key-stretch (the Scrypt with parameters 10,1,1)? If we got rid of the 10,1,1 Scrypt or replaced it with PBKDF2-HMAC-SHA512, people could stick to explicitly NIST-approved memory-constant crypto. Good for the paranoid and good for embedded developers.

3. I agree. Revised code:

Code:
filter = 0 # 32 bit integer
valid_passwords = [user_password, fake_password] # Can be any number of passwords. To preserve plausible deniability of all users, the spec should mandate a randomly generated fake password if the user doesn't want one
for password in valid_passwords:
     hash = sha(password)
     for i in range(0,11):
          filter |= 1 << (hash[i] & 0x1F) # Sets a random bit in the filter to 1
member
Activity: 116
Merit: 10
1. There isn't really any reason to not at least *allow* PBKDF2-SHA512 as one of the available KDFs.

How many iterations do you suggest if we were to add this as one of the KDFs?

2. It would be easier for embedded and mobile devices if Scrypt wasn't mandatory.

3. The Scrypt with parameters 10,1,1 might as well be replaced with PBKDF2-SHA512.

Very well, I'll see if I can rework it to use PBKDF2-SHA512 for the preH and postH sorry, I meant for H.

Alternate bloom filter proposal:
Code:
filter = 0 # 32 bit integer
valid_passwords = [user_password, fake_password] # Can be any number of passwords. To preserve plausible deniability of all users, the spec should mandate a randomly generated fake password if the user doesn't want one
for password in valid_passwords:
     hash = strongH(password, "") # strongH is the user-selected KDF. Fixed/no salt used. See below.
     for i in range(0,11):
          filter |= 1 << (hash[i] & 0x1F) # Sets a random bit in the filter to 1
It's acceptable to use strongH without a salt here, because we're only revealing 10 or fewer bits of the password hash. We're not revealing enough information about the password to do any damage. This means the bloom filter is a function of *both* passwords (one of which may be random), so it still adds useful entropy to the salt.

This means doing the strong hash multiple times and not being able to outsource it. I think it's fine to use something like a double SHA256 here. It's not going to leak anything useful anyway.

member
Activity: 98
Merit: 10
Of course, there's also the non-offloadable scrypt hash later on. However, the parameters of 10,1,1 are so small that it's probably barely any better than PBKDF2-SHA512 for attack resistance.

[1.] That one is only done for key stretching and the hypothetical "what if the 3rd party gets a hold of your wallet" situation. It's not the main protection against brute-force attacks.

If we allow for, say, PBKDF2-SHA512 to be used for strongH generation, it can be run on extremely memory-contrained devices. I also really doubt that the second, non-offloadable scrypt with parameters 10,1,1 is much better than PBKDF2-SHA512.

[2.] Here's an interesting thread on PBKDF2: http://stackoverflow.com/questions/4433216/password-encryption-pbkdf2-using-sha512-x-1000-vs-bcrypt

Look at that table.

[3.] I'd rather introduce an extra Scrypt option that's a bit weaker than 14,8,8 over introducing PBKDF2-SHA512.


[4.]Additionally, about the bloom filtering, I think we're going to have a bit of an issue with that one. The checksum as it currently exists is part of the salt used when generating preH. We won't be able to do that if we introduce the bloom filter. This will lower the entropy of the salt considerably.

1. I'm aware of this. However, you haven't addressed the fact that, if the device processing the delegated "strong" hash is compromised, the security of the wallet is reduced to that of HMAC-SHA512, which is significantly lower than *either* Scrypt of PBKDF2-SHA512. There are two possibilities here:

     a.) No PBKDF2-SHA512. Memory constrained devices are *forced* to use external device to calculate strong hash. This introduces a huge new attack surface, because if the external device is compromised, only a single HMAC-SHA512 stands in the way.

     b.) We allow PBKDF2-SHA512. Memory constrained devices at least have the *option* of stretching the key on their own. Remember, no one is forced to use PBKDF2-SHA512. Attack surface is significantly diminished.

2. Yes, Scrypt is effective, but clearly so is PBKDF2. I'm not suggesting replacing Scrypt; merely allowing people not to use it.

3. Why not both? There are additional advantages to having PBKDF2-SHA512 as well as Scrypt. Beyond the fact that it's friendlier for memory-constrained devices, it could also simplify code (It would be possible to have an embedded device with only SHA2-family hashes, which I know are easily implemented on even very weak 8-bit microcontrollers), and would allow people who don't trust Scrypt yet to use only NIST-certified and thoroughly time-vetted hash algorithms.

4. This is true. Something to consider. I have a suggestion at the end of this post.


The Mycelium Wallet on Android already uses scrypt for its password-protected backups, and for BIP38. On very old devices we have trouble running this, so we will have to introduce a "weaker" version for devices with < 20MB per process.

For a future standard for HD wallet, non-hardcoded scrypt parameters would be preferable, depending on password length and device speed and usage frequency, and usecases (offline device) the desired parameters could be different. I like scrypt, though. If it turns out that scrypt is broken, there should be a V2 of this spec, with a new hashing algo, or the ability to specify the KDF algo.

Some comments on what i have read so far:

1) direct encoding of scrypt parameters would be preferable as direct values as opposed to versioned parameter lists (up& downwards compatible)

2) The generation date is not really that important. Maybe other wallets need it but out software would ignore it with the current scheme, but i see how it potentially speeds up initial syncing for a fresh backup. but thinking long-term it will not help much, since the time between backup and restore will always increase.

3) If the KDF is offloadable, make sure that the "non-offloadable" part does not strictly require a scrypt implementation, so that it may run on devices with very limited resources. (think future revisions of embedded devices)

4) We absolutely want the checksum with bloom filtering in future, so that the user may specify additional passwords/pins for multiple identities. Our non-trivial job will be to design a wallet format that preserves the deniability with the stored metadata.
again, the chosen parameters should not be hard-coded, but i can imagine that default values like 4 bytes with 6 hash functions (optimum for N=4, p=0.02) would make sense.
the question is, do we want to push the plausible deniability so far, as to obfuscate the fact how many additional passwords possibly exists? if so, we would hit a limit after adding 4 passwords,

5) what i do not fully understand, is how bloom filtering impacts the preH entropy. can you elaborate on this?



Absolutely agree. I think having PBKDF2-SHA512 as an option (not the only option!) would nicely address 1, 3, and your non-enumerated concerns. In case Scrypt is broken (or more likely, is vulnerable to memory/timing side channel attacks), there should be no *mandatory* usage of Scrypt; it should simply be one of the selectable KDFs. I suggest replacing the 10,1,1 Scrypt with PBKDF2-SHA512. StrongH is where all the big key stretching comes from anyway, so we might as well limit our usage of Scrypt to that.

Re. 5), the problem is this. The bloom filter is a function of (salt, password, encrypted data). Therefore, we can't use the bloom filter in the salt, because then we would get that the bloom filter is a function of ((version, date, bloom filter), password, encrypted data). It's impossible to have the bloom filter be a function of itself. I have an alternate suggestion at the end of this post.


[1.] We've got 32 possible KDFs. Surely that's enough to have a Scrypt parameter set for every hardware platform that wants to implement this? Additionally, I want to add support for catena in the future as it fixes some issues found in Scrypt. See: https://github.com/cforler/catena


[2.] It uses Scrypt 10,1,1 purely for the key stretching. I didn't want to muck about with a fixed output length hash function, because I potentially need more bytes than SHA512 currently provides.


[3.] I'm right there with you, but I also want to preserve the entropy of preH's salt.


1. Indeed. If we have enough KDF codes for Scrypt and Catena, don't we have enough KDF codes for one or two PBKDF2-SHA512s?

2. PBKDF2-SHA512 provides an arbitrary number of output bytes! It would be a *very* simple addition to the spec, because it does basically the exact same thing as Scrypt.

3. Perhaps we can do both.


Summary:

1. There isn't really any reason to not at least *allow* PBKDF2-SHA512 as one of the available KDFs.

2. It would be easier for embedded and mobile devices if Scrypt wasn't mandatory.

3. The Scrypt with parameters 10,1,1 might as well be replaced with PBKDF2-SHA512.

4. The bloom filter as I described it earlier would negatively impact salt entropy. It would be best if we could re-work this.


Alternate bloom filter proposal:
Code:
filter = 0 # 32 bit integer
valid_passwords = [user_password, fake_password] # Can be any number of passwords. To preserve plausible deniability of all users, the spec should mandate a randomly generated fake password if the user doesn't want one
for password in valid_passwords:
     hash = strongH(password, "") # strongH is the user-selected KDF. Fixed/no salt used. See below.
     for i in range(0,11):
          filter |= 1 << (hash[i] & 0x1F) # Sets a random bit in the filter to 1
It's acceptable to use strongH without a salt here, because we're only revealing 10 or fewer bits of the password hash. We're not revealing enough information about the password to do any damage. This means the bloom filter is a function of *both* passwords (one of which may be random), so it still adds useful entropy to the salt.
member
Activity: 116
Merit: 10
1) direct encoding of scrypt parameters would be preferable as direct values as opposed to versioned parameter lists (up& downwards compatible)

We've got 32 possible KDFs. Surely that's enough to have a Scrypt parameter set for every hardware platform that wants to implement this? Additionally, I want to add support for catena in the future as it fixes some issues found in Scrypt. See: https://github.com/cforler/catena

3) If the KDF is offloadable, make sure that the "non-offloadable" part does not strictly require a scrypt implementation, so that it may run on devices with very limited resources. (think future revisions of embedded devices)

It uses Scrypt 10,1,1 purely for the key stretching. I didn't want to muck about with a fixed output length hash function, because I potentially need more bytes than SHA512 currently provides.

4) We absolutely want the checksum with bloom filtering in future, so that the user may specify additional passwords/pins for multiple identities. Our non-trivial job will be to design a wallet format that preserves the deniability with the stored metadata.
again, the chosen parameters should not be hard-coded, but i can imagine that default values like 4 bytes with 6 hash functions (optimum for N=4, p=0.02) would make sense.
the question is, do we want to push the plausible deniability so far, as to obfuscate the fact how many additional passwords possibly exists? if so, we would hit a limit after adding 4 passwords,

I'm right there with you, but I also want to preserve the entropy of preH's salt.

5) what i do not fully understand, is how bloom filtering impacts the preH entropy. can you elaborate on this?

Quote
6. Calculate the checksum = SHA256(SHA256(IL))[0...3]

7. Create the salt as being: prefix + date + checksum.
8. Derive hash preH from the passphrase and the salt using HMAC-SHA512(key = salt, msg = passphrase).

Currently the checksum is a subsection of the hash of the HD wallet master private key and is used in the salt.

Since for the second password we basically don't know what the key is until we decrypt for the first time, we can't change the bloom.

We'd either have to remove the checksum from the salt (losing entropy) or change the input of the bloom (not being based on the HD wallet master private key).
hero member
Activity: 668
Merit: 501
Bump. Would like to hear opinions on bloom filter.

The Mycelium Wallet on Android already uses scrypt for its password-protected backups, and for BIP38. On very old devices we have trouble running this, so we will have to introduce a "weaker" version for devices with < 20MB per process.

For a future standard for HD wallet, non-hardcoded scrypt parameters would be preferable, depending on password length and device speed and usage frequency, and usecases (offline device) the desired parameters could be different. I like scrypt, though. If it turns out that scrypt is broken, there should be a V2 of this spec, with a new hashing algo, or the ability to specify the KDF algo.

Some comments on what i have read so far:

1) direct encoding of scrypt parameters would be preferable as direct values as opposed to versioned parameter lists (up& downwards compatible)

2) The generation date is not really that important. Maybe other wallets need it but out software would ignore it with the current scheme, but i see how it potentially speeds up initial syncing for a fresh backup. but thinking long-term it will not help much, since the time between backup and restore will always increase.

3) If the KDF is offloadable, make sure that the "non-offloadable" part does not strictly require a scrypt implementation, so that it may run on devices with very limited resources. (think future revisions of embedded devices)

4) We absolutely want the checksum with bloom filtering in future, so that the user may specify additional passwords/pins for multiple identities. Our non-trivial job will be to design a wallet format that preserves the deniability with the stored metadata.
again, the chosen parameters should not be hard-coded, but i can imagine that default values like 4 bytes with 6 hash functions (optimum for N=4, p=0.02) would make sense.
the question is, do we want to push the plausible deniability so far, as to obfuscate the fact how many additional passwords possibly exists? if so, we would hit a limit after adding 4 passwords,

5) what i do not fully understand, is how bloom filtering impacts the preH entropy. can you elaborate on this?




member
Activity: 116
Merit: 10
Of course, there's also the non-offloadable scrypt hash later on. However, the parameters of 10,1,1 are so small that it's probably barely any better than PBKDF2-SHA512 for attack resistance.

That one is only done for key stretching and the hypothetical "what if the 3rd party gets a hold of your wallet" situation. It's not the main protection against brute-force attacks.

If we allow for, say, PBKDF2-SHA512 to be used for strongH generation, it can be run on extremely memory-contrained devices. I also really doubt that the second, non-offloadable scrypt with parameters 10,1,1 is much better than PBKDF2-SHA512.

Here's an interesting thread on PBKDF2: http://stackoverflow.com/questions/4433216/password-encryption-pbkdf2-using-sha512-x-1000-vs-bcrypt

Look at that table.

I'd rather introduce an extra Scrypt option that's a bit weaker than 14,8,8 over introducing PBKDF2-SHA512.


Additionally, about the bloom filtering, I think we're going to have a bit of an issue with that one. The checksum as it currently exists is part of the salt used when generating preH. We won't be able to do that if we introduce the bloom filter. This will lower the entropy of the salt considerably.

member
Activity: 98
Merit: 10

As a KDF?  Can you provide some evidence for that?

If instead you were saying PBKDF2-SHA512, okay— but probably not worth the complexity to include more mandatory code.

The parameters selected should work on embedded devices. And with the support for delegation it doesn't matter quite as much if the selected parmeters didn't work on some device.

Granted, there isn't a whole lot of skepticism of scrypt, but people do recognize that scrypt is not as thoroughly vetted as, say, PBKDF2-SHA*. For those worried about, say, side channel attacks on memory usage, not using scrypt may be preferable.

Let's say we use the absolute minimum scrypt parameters for our embedded device. The most memory-efficient KDF function (KDF 0 in this spec) will use over 16MB of RAM, which is too high for many very small/cheap devices. Yes, we can delegate out the strong hashing, but this might not always be practical or wanted. Users may not have access to a stronger device (think of a very small, cheap, autonomous wallet generator device), or may not trust a stronger device enough to use it and possibly reduce their security to that of HMAC-SHA512.

Of course, there's also the non-offloadable scrypt hash later on. However, the parameters of 10,1,1 are so small that it's probably barely any better than PBKDF2-SHA512 for attack resistance.

Agreed. The whole point of using Scrypt is to reduce the number of brute force attempts per second. SHA is very suitable for optimization whereas Scrypt less so.

Assume an external device must be used for strong hashing. If the external device is compromised, and the attacker has access to the encrypted wallet such that they can determine the salt, the security of the passphrase is limited to HMAC-SHA512 anyway.

If we allow for, say, PBKDF2-SHA512 to be used for strongH generation, it can be run on extremely memory-contrained devices. I also really doubt that the second, non-offloadable scrypt with parameters 10,1,1 is much better than PBKDF2-SHA512.
member
Activity: 116
Merit: 10
Agreed. The whole point of using Scrypt is to reduce the number of brute force attempts per second. SHA is very suitable for optimization whereas Scrypt less so.

gmaxwell, do you have any thoughts on changing the checksum to be a bloom filter so 2 passwords can be used (for plausible deniability)?

staff
Activity: 4284
Merit: 8808
1. Some people trust SHA more than Scrypt
As a KDF?  Can you provide some evidence for that?

I've never seen someone suggest that but perhaps they should.

In any case it absolutely should not use SHA512 as a KDF. SHA512 is a hash function, not a KDF, and it does not make an acceptably secure option.

If instead you were saying PBKDF2-SHA512, okay— but probably not worth the complexity to include more mandatory code.

Quote
2. Scrypt is hard on embedded devices. SHA isn't.
The parameters selected should work on embedded devices. And with the support for delegation it doesn't matter quite as much if the selected parmeters didn't work on some device.
member
Activity: 98
Merit: 10
Bump. Would like to hear opinions on bloom filter.

Would it be possible to slightly modify the spec to allow for only non-scrypt hashing algorithms?

That would involve adding, say, SHA512 as a KDF, and getting rid of the fixed "scrypt.hash" in the encryption/decryption functions.

2 reasons this might be a good idea:
1. Some people trust SHA more than Scrypt
2. Scrypt is hard on embedded devices. SHA isn't.
Pages:
Jump to: