Pages:
Author

Topic: Proposal: Base58 encoded HD Wallet root key with optional encryption - page 5. (Read 21014 times)

hero member
Activity: 836
Merit: 1030
bits of proof
A HD root can generate 2^32-1 keys and any of these keys might be the root for a further derivation. Now assume you get a HD root and would want to determine the funds associated with it. It is not feasible to find them in absence of assumptions. I suggested some assumptions I keep while developing HD wallets.

Assumptions:

1. : the root you have identifies leafs of the hierarchy. If not then use the same assumptions on the sub-levels on-by-one.
2. : if key(n) derivable from the current root is first used in block b then key(m) | m>n is not used in block d | d < b.
3. : if you do not find a use for key (n) then there is no use of any key (m) | m > n + w.

You might not like to have the assumptions, but I think they are sensible. I can not see how you implement scaleable discovery of funds associated with a root without these or similar assumptions.

If brute forcing the checksum for an alternate password is not feasible with 4 bytes then reduce the checksum length instead of introducing a more complex concept.
member
Activity: 98
Merit: 10
Grau, if I'm understanding you correctly, you say that we could optimize blockchain scan times by only using keys serially? So the root key is used in the blockchain before any derivative keys, and those are used before any of their derivative keys, etc.

How do you propose to enforce this? Prevent users from choosing which address they'd like to use? It seems like that would put some annoying limitations on flexibility.


I agree with riplin that brute forcing a 32 bit hash is not a viable solution. Not only would the plausibly deniable passwords be very hard to remember, but it would take forever!

The bloom filter is interesting. You could choose arbitrary passwords to use for plausible deniability.
hero member
Activity: 836
Merit: 1030
bits of proof
Therefore I suggest to extend the proposal with a start key index (4 bytes) and a look ahead window (4 bytes).
Actually 2 bytes should be more than enough for look ahead.
hero member
Activity: 836
Merit: 1030
bits of proof
For plausible deniability it should be possible to mine an alternate password that create a key that passes the checksum as is.

If you mean mining a password on the current checksum (as currently defined in the spec), then you've got quite a bit of work ahead of you. Matching a 32bit checksum is a lot of work and the resulting password could be total gibberish and pretty hard to remember, not to mention that you're basically brute forcing the KDF, which is was chosen to resist such attacks. Or am I misunderstanding what you mean?

Yes, its a lot of work, but doable and the denial is then rather plausible. If you think that is too big of a hurdle then reduce the checksum length.


Storing the HD root is useful.
Recreating the keys that are used on the block chain needs however further conventions, so it can be done in a scalable manner.

Let's call the keys generated by the root key(0), key(1), key (3)....

Now let us assume that if key (n) is first used in block b, then there should be no reference to key(m) (m > n) in any block before b. With this rule one can efficiently scan the block chain in a single pass using a look ahead window of w:

Assume key(s) is used after block b (identified by the birth date of the root) and scan subsequent blocks. If any key (m) (s<=m
Note that the search window increases during the scan. A good HD wallet software would further optimize the use of keys such that it seeks to reduce the search window by spending outputs on keys earlier in the key sequence. It would seek to re-set first key index s with a new birth date.

Therefore I suggest to extend the proposal with a start key index (4 bytes) and a look ahead window (4 bytes).




member
Activity: 116
Merit: 11
For plausible deniability it should be possible to mine an alternate password that create a key that passes the checksum as is.

If you mean mining a password on the current checksum (as currently defined in the spec), then you've got quite a bit of work ahead of you. Matching a 32bit checksum is a lot of work and the resulting password could be total gibberish and pretty hard to remember, not to mention that you're basically brute forcing the KDF, which is was chosen to resist such attacks. Or am I misunderstanding what you mean?


Awesome! Smiley
hero member
Activity: 836
Merit: 1030
bits of proof
For plausible deniability it should be possible to mine an alternate password that create a key that passes the checksum as is.
member
Activity: 116
Merit: 11
I've been doing some tests with the bloom code in the qt client, and if I understand it correctly, when set to 4 expected elements and a 1% false positive rate, the bloom filter is 4 bytes in size with 5 hash functions.

So that would generate up to 20 random bits of data in the 4 byte field. I think that's sufficient to keep the entropy up. Also, since it's 4 elements, that's 4 potential passwords that could be used.

In other words, I could modify the spec to encode up to 4 passwords into the checksum instead of the double SHA256 of the private key. Any less than 4 passwords would be filled with 5 random bits of data in the checksum for obfuscation.

What do you guys think?
member
Activity: 116
Merit: 11
Thanks. Some small suggested changes:

1. Specify endianness of the each field. In your examples, the prefix is big-endian, but the date is little-endian. I recommend big-endian for all fields (since in many scripting languages, it's easier to go from an integer to a big-endian string than a little-endian string).

Bitcoin is little endian, with exception of big int math. BIP 38 is also little endian, with exception of prefix. I've clarified that the date field is little endian.

2. Recommend that client implementers check the blockchain, say, a day before the beginning of the "recorded" date to account for slightly incorrect date calculations. Either that or specify that the date code should be based on date minus one day.

Done.

All test vectors passed in my implementation!

Yay!

Please specify that unicode passwords (like the chinese one you gave in the test vectors) should be converted to UTF-8.

Done.
member
Activity: 98
Merit: 10
All test vectors passed in my implementation!

Please specify that unicode passwords (like the chinese one you gave in the test vectors) should be converted to UTF-8.
member
Activity: 98
Merit: 10
Thanks. Some small suggested changes:

1. Specify endianness of the each field. In your examples, the prefix is big-endian, but the date is little-endian. I recommend big-endian for all fields (since in many scripting languages, it's easier to go from an integer to a big-endian string than a little-endian string).

2. Recommend that client implementers check the blockchain, say, a day before the beginning of the "recorded" date to account for slightly incorrect date calculations. Either that or specify that the date code should be based on date minus one day.
member
Activity: 116
Merit: 11
I've changed the checksum to be a double SHA256 of the private key instead of the address string and I changed the hashing so it can be outsourced to a 3rd party.

Great. I'm working to update my reference implementation.

Great! Make sure you can parse the test vectors. Smiley Also, apetersson's suggestion of changing the checksum to a bloom filter is interesting. The only thing I worry about is losing entropy, so anyone here that knows how to construct a sufficiently obfuscated bloom filter that's both useful and doesn't affect entropy too much, please let us know.

I'm a little confused on the 3rd party thing, though. What is the need for this?

On page 2 in this thread, there's discussion on having 3rd party KDF support added.

Also, a few questions on the implementation side of things:

Quote
8. Derive hash preH from the passphrase and the salt using HMAC-SHA512(key = salt, msg = passphrase).
9. Derive hash strongH from preH using the selected KDF + parameters, where preH is both salt and message. This step can optionally be outsourced to a 3rd party.
10. Derive hash postH from the salt and the passphrase using HMAC-SHA512(key = passphrase, msg = salt).
11. Derive a hash H from the strongH and the postH using scrypt, where message = postH and salt = strongH and parameters n = 210, r = 1, p = 1. The output length = root key length + 32.

1. Why do we use key=salt, msg=passphrase in 8. but key=passphrase, msg=salt in 10.?

If the 3rd party that does the key stretching for us is malicious, and preH and postH are the same, he has all he needs to decrypt our root key (should he ever get his hands on it). By using a different salt in step 11, that's no longer the case and he'll have to brute force it.

2. Can you please expand on step 9.? If we're using scrypt, we also need to provide an output length parameter. What should that be? Why do we use preH for both salt and message? Is that safe?

Oops, you're right. Length of strongH = 64 bytes. I'll update the spec. Using preH as salt and message is perfectly fine. It's basically the same as not having a salt like SHA256.

member
Activity: 98
Merit: 10
I've changed the checksum to be a double SHA256 of the private key instead of the address string and I changed the hashing so it can be outsourced to a 3rd party.

Great. I'm working to update my reference implementation.

I'm a little confused on the 3rd party thing, though. What is the need for this?

Also, a few questions on the implementation side of things:

Quote
8. Derive hash preH from the passphrase and the salt using HMAC-SHA512(key = salt, msg = passphrase).
9. Derive hash strongH from preH using the selected KDF + parameters, where preH is both salt and message. This step can optionally be outsourced to a 3rd party.
10. Derive hash postH from the salt and the passphrase using HMAC-SHA512(key = passphrase, msg = salt).
11. Derive a hash H from the strongH and the postH using scrypt, where message = postH and salt = strongH and parameters n = 210, r = 1, p = 1. The output length = root key length + 32.

1. Why do we use key=salt, msg=passphrase in 8. but key=passphrase, msg=salt in 10.?

2. Can you please expand on step 9.? If we're using scrypt, we also need to provide an output length parameter. What should that be? Why do we use preH for both salt and message? Is that safe?
hero member
Activity: 668
Merit: 501
would it be possible to modify the checksum to a bloom filter so that you can do plausible deniability with multiple passwords?
member
Activity: 116
Merit: 11
I've changed the checksum to be a double SHA256 of the private key instead of the address string and I changed the hashing so it can be outsourced to a 3rd party.

sr. member
Activity: 358
Merit: 250
Well, I certainly intend it to replace BIP38.  BIP38 was dropped onto the wiki without any public review or discussion and contains a number of unfortunate shortcomings which are corrected here.

The WRT third party generation, I think this proposal does accommodate that in a not-quite 1:1 manner... in that you can send someone off your encrypted key to transcribe.  Though even if it doesn't thats just a single use case— the fact that BIP38 can only accommodate a single address is a reason something should replace it even for that use case. Though I'm not sure how common that usecase will be in the future considering recent regulatory activity.

Regardless of how it found its way to the wiki (over a year ago), BIP38 has already become the defacto standard for single key encryption for off-line storage, possibly since there wasn't really anything else out there. Many (most?) paper wallets now support it, along with at least one major wallet (Blockchain).

To be sure it may have a few shortcomings like fixed scrypt parameters, but its actually pretty versatile and cryptographically sound for its intended use: Encrypting single private keys and third-party key generation - it appears that it was never really intended to create/backup deterministic wallets, so these aren't necessarily shortcomings anyway, since the standards probably don't have much overlap for most use cases.

member
Activity: 98
Merit: 10
Block number might be a reasonable alternative to a date? Less arbitrary?

Good thinking, but that actually exacerbates the problem I brought up earlier (that some computers aren't internet connected). The block production time is actually a tad unpredictable, since sometimes (in fact, for the last few months) the network hash rate grows so fast that the difficulty adjustment algorithm can't compensate enough to keep the average block production time very close to 10 minutes. Therefore, it's effectively impossible for an offline wallet generator (which is probably one of the most common uses of these wallet encryption algorithms) to get an accurate, or even semi-accurate, block number. But date can trivially be mapped to block height on a networked machine, once the coins are being spent.
member
Activity: 130
Merit: 10
Block number might be a reasonable alternative to a date? Less arbitrary?
member
Activity: 116
Merit: 11
I just boarded a plane, so I can't look at I right now, but a quick question for gmaxwell and others about using the hash of the private key instead of the address: yes / no?
member
Activity: 98
Merit: 10
Implemented it here: https://github.com/wyager/Encrypted-HD-wallet

I followed the suggestion of the checksum being SHA256(SHA256(private_key))[0:3] instead of SHA256(SHA256(bitcoin_address))[0:3].

It's not heavily tested, but as far as I can tell it works fine.
Pages:
Jump to: