Topic: Deterministic wallets - page 5. (Read 48433 times)

3. Isn't it cleaner for type-1 if k_i is the direct output of a HMAC?

Yes, it seems better to do k_i=I_L instead of k_i=k_par+I_L (mod n)

If I'm guessing correctly, the reason that BIP32 specifies that for I_L>=n that resulting keypair is invalid is because we would like to have the privkey chosen uniformly at random from the valid range, so if we take I_L modulo n then the smaller privkey values will have a little bit higher probability?

Edit: instead of specifying that k_i is invalid, we could specify for example k_i=SHA256(I_L) in this case (and invoke SHA256 again if k_i is still invalid, and so on), so that we couldn't have forced gaps in the indices of the HD wallet? But if the probability is less than 1/2^127 then specifying that i is invalid is also fine.

I still agree with thanke that for type-1 derivation it makes slightly more sense to have k_i=I_L and specify that k_i is invalid if I_L=0 or I_L>=n, rather than to have k_i=k_par+I_L (mod n) and specify that k_i is invalid if I_L>=n. Note that the OP of this thread also specifies for type-1 that "Privatekey(type,n) is then simply set to H(n|S|type)". With k_i=I_L we could still treat d=k_i-k_par as the difference, so d*G+K_par is the corresponding pubkey, in case this d in needed for compatibility with type-2 in some possible context.

Since n = 2^256 - 2^32 - 2^9 - 2^8 - 2^7 - 2^6 - 2^4 - 1 the case I_L >= n occurs in one out of 2^224 times...

iddo

sr. member

Activity: 360

Merit: 251

Quote from: iddo on May 07, 2013, 07:44:50 AM

3. Isn't it cleaner for type-1 if k_i is the direct output of a HMAC?

Yes, it seems better to do k_i=I_L instead of k_i=k_par+I_L (mod n)

If I'm guessing correctly, the reason that BIP32 specifies that for I_L>=n that resulting keypair is invalid is because we would like to have the privkey chosen uniformly at random from the valid range, so if we take I_L modulo n then the smaller privkey values will have a little bit higher probability?

Edit: instead of specifying that k_i is invalid, we could specify for example k_i=SHA256(I_L) in this case (and invoke SHA256 again if k_i is still invalid, and so on), so that we couldn't have forced gaps in the indices of the HD wallet? But if the probability is less than 1/2^127 then specifying that i is invalid is also fine.

I still agree with thanke that for type-1 derivation it makes slightly more sense to have k_i=I_L and specify that k_i is invalid if I_L=0 or I_L>=n, rather than to have k_i=k_par+I_L (mod n) and specify that k_i is invalid if I_L>=n. Note that the OP of this thread also specifies for type-1 that "Privatekey(type,n) is then simply set to H(n|S|type)". With k_i=I_L we could still treat d=k_i-k_par as the difference, so d*G+K_par is the corresponding pubkey, in case this d in needed for compatibility with type-2 in some possible context.

thanke

member

Activity: 104

Merit: 10

Quote from: iddo on May 07, 2013, 07:44:50 AM

Edit3: I mentioned in post #123 that we possibly do gain more security when the chaincodes also depend on the pubkeys, because usually just the hashed address of a pubkey is publicly known (until the coins get spent). Do you disagree?

If Kpar is used as a parent then it will be a rather "static" object, so that Kpar will likely be used for spending during the lifetime of c_i. Plus when you compute a derived key you would need the pubkey and the chaincode at the same time anyway. Besides that, I disagree based on the general picture: pubkey=public, chaincode=anonymity, privkey=funds, which I like and find easy to sell and easy to argue about.

My only point was that usually hash(pubkey)=public rather than pubkey=public, so if for example the attacker somehow obtained the privkey k_5 and the chaincode c_1 but only knows the corresponding hash(K_1) then with your scheme he can derive c_2,c_3,c_4,c_5 and then derive the entire subtree of k_5, while with BIP32 he couldn't do it unless the coins of K_1 get spent and K_1 is revealed on the blockchain.

Yes, I understood your point. My point is that K_1 is so static (valid for the lifetime of the tree rooted at K_1) that it will get revealed on the blockchain rather soon. To avoid your attack the user would have to empty the whole subtree rooted at K_1 before ever using spending from K_1. That is not realistic.

With chaincode=anonymity I think that you meant that "leakage of a chaincode"=="loss of anonymity", not that the chaincodes themselves contribute to anonymity? The purpose of the chaincodes is hierarchical derivations, in particular the ability to delegate/share a subtree with another person, and perhaps greater security when we only need to access the relevant chaincode instead of the master seed. Are there other purposes that I'm missing?

Yes, agree, these are the purposes.

Quote from: iddo on May 07, 2013, 07:44:50 AM

But your concern reveals a weakness in our general concept. That's also why I once asked if intermediate keys are to be used on the blockchain at all. Because re-using pubkeys on the blockchain is generally not encouraged. Intermediate nodes can avoid that by reserving one child key branch (i=0) for themselves and using that instead of their "own" key.

Which concern?
I failed to understand how using only the keys of leaf nodes is related to not re-using keys on the blockchain. The key gets re-used when you receive more than one payment to the same receiving address, how is that related to the hierarchical layout of the wallet?

Your concern about a privacy loss if cpar leaks and Kpar has been used for spending.

It is generally discouraged to receive more than one payment to the same address. One reason is anonymity (not only the receiver's, also the payer's anonymity), the other reason is that if you have two outputs to the same address, you're unlikely to spend them both at the same time, so the public key will be revealed before you get to spend the second output. I am saying that this general guideline is not compatible with using intermediate tree nodes for receiving, because an intermediate node is too "static" (it roots a subtree which probably has a significant lifetime). Only leafs are suitable "one-time objects".

iddo

sr. member

Activity: 360

Merit: 251

Edit3: I mentioned in post #123 that we possibly do gain more security when the chaincodes also depend on the pubkeys, because usually just the hashed address of a pubkey is publicly known (until the coins get spent). Do you disagree?

If Kpar is used as a parent then it will be a rather "static" object, so that Kpar will likely be used for spending during the lifetime of c_i. Plus when you compute a derived key you would need the pubkey and the chaincode at the same time anyway. Besides that, I disagree based on the general picture: pubkey=public, chaincode=anonymity, privkey=funds, which I like and find easy to sell and easy to argue about.

My only point was that usually hash(pubkey)=public rather than pubkey=public, so if for example the attacker somehow obtained the privkey k_5 and the chaincode c_1 but only knows the corresponding hash(K_1) then with your scheme he can derive c_2,c_3,c_4,c_5 and then derive the entire subtree of k_5, while with BIP32 he couldn't do it unless the coins of K_1 get spent and K_1 is revealed on the blockchain. Similarly with regard to privacy concerns, if the attacker obtained just c_1 then he could derive c_2,c_3,...,c_i until reaching two consecutive pubkeys that were revealed on the blockchain, which he would detect by testing G*f(c_i)+K_i==K_{i+1}, and therefore know the subtree of all future pubkeys from here (as opposed to BIP32 where the attacker only has c_1 and hash(K_1) and therefore cannot do anything because he cannot derive any other chaincodes).

With chaincode=anonymity I think that you meant that "leakage of a chaincode"=="loss of anonymity", not that the chaincodes themselves contribute to anonymity? The purpose of the chaincodes is hierarchical derivations, in particular the ability to delegate/share a subtree with another person, and perhaps greater security when we only need to access the relevant chaincode instead of the master seed. Are there other purposes that I'm missing?

But your concern reveals a weakness in our general concept. That's also why I once asked if intermediate keys are to be used on the blockchain at all. Because re-using pubkeys on the blockchain is generally not encouraged. Intermediate nodes can avoid that by reserving one child key branch (i=0) for themselves and using that instead of their "own" key.

Which concern?
I failed to understand how using only the keys of leaf nodes is related to not re-using keys on the blockchain. The key gets re-used when you receive more than one payment to the same receiving address, how is that related to the hierarchical layout of the wallet?

thanke

member

Activity: 104

Merit: 10

Edit2: You're hiding the output size of the HMAC there, it should be 512 bits, and if c_i=I_R is 256 bits then you feed less than optimal entropy into the hash function, so the lengths don't work as well as they could. Please write your proposed type-2 HMAC derivation more clearly?

This was intentional, it should be trivial to fill in the corrent sizes once everything else is settled.

Edit3: I mentioned in post #123 that we possibly do gain more security when the chaincodes also depend on the pubkeys, because usually just the hashed address of a pubkey is publicly known (until the coins get spent). Do you disagree?

If Kpar is used as a parent then it will be a rather "static" object, so that Kpar will likely be used for spending during the lifetime of c_i. Plus when you compute a derived key you would need the pubkey and the chaincode at the same time anyway. Besides that, I disagree based on the general picture: pubkey=public, chaincode=anonymity, privkey=funds, which I like and find easy to sell and easy to argue about.

But your concern reveals a weakness in our general concept. That's also why I once asked if intermediate keys are to be used on the blockchain at all. Because re-using pubkeys on the blockchain is generally not encouraged. Intermediate nodes can avoid that by reserving one child key branch (i=0) for themselves and using that instead of their "own" key.

4. A further suggestion for type-1 on top of 3.: (k_i,c_i) := HMAC(S,i) where S := HMAC(kpar,cpar). This achieves that k_i=func(kpar,cpar,i) as before. Plus it has the following properties:
- storing S there is no need to decrypt kpar when deriving children (the new advantage of this)
- children cannot be derived from kpar alone (iddo insisted on this property)
- children cannot be derived from cpar alone (we assume cpar is stored unencrypted)
Storing S is of course optional. The common (default) use would be just to recompute it each time.

On the one hand you say that it's advantageous only if we store S, and on the other hand you recommend to recompute S each time?
Basically if we always store S then it means that when we derive a new privkey then only the child privkey (not the parent privkey) will be in memory? But the user will still need to provide his AES passphrase to encrypt the child privkey, so I don't think that we gain much with this idea?

Correct, that's why I wouldn't bother with storing S in the reference implementation. We also don't want to complicate extended keys further, their structure shall remain unchanged, they are just pairs.
Storing S is more of an option for an advanced user, who wants to store S on a different machine than k_par for some reason, and encrypts the k_i with a different key than he encrypts k_par with. As a result of this, parent and child funds are completely separated, whereas before parent funds required a subset (kpar) of the information needed for child funds (kpar and cpar).

If you have an actual reason to doubt SHA512 then I think that resorting to SHA256 shouldn't help you sleep better at nights.

True. What about reducing code dependencies? Is that irrelevant?

iddo

sr. member

Activity: 360

Merit: 251

3. Isn't it cleaner for type-1 if k_i is the direct output of a HMAC?

Yes, it seems better to do k_i=I_L instead of k_i=k_par+I_L (mod n)

4. A further suggestion for type-1 on top of 3.: (k_i,c_i) := HMAC(S,i) where S := HMAC(kpar,cpar). This achieves that k_i=func(kpar,cpar,i) as before. Plus it has the following properties:
- storing S there is no need to decrypt kpar when deriving children (the new advantage of this)
- children cannot be derived from kpar alone (iddo insisted on this property)
- children cannot be derived from cpar alone (we assume cpar is stored unencrypted)
Storing S is of course optional. The common (default) use would be just to recompute it each time.

On the one hand you say that it's advantageous only if we store S, and on the other hand you recommend to recompute S each time?
Basically if we always store S then it means that when we derive a new privkey then only the child privkey (not the parent privkey) will be in memory? But the user will still need to provide his AES passphrase to encrypt the child privkey, so I don't think that we gain much with this idea?

Quote from: thanke on May 05, 2013, 12:14:07 PM

5. It is certainly possible to do everything with HMAC-SHA256 alone. For instance, if you need two values like (I_L,I_R) you can do I_L=HMAC(secret,0) and I_L=HMAC(secret,1). The question is, does it reduce dependencies of the code, code review, etc. to be worthwhile? Or does it matter here that SHA256 has by now become the best-tested hash function in the world?

If you have an actual reason to doubt SHA512 then I think that resorting to SHA256 shouldn't help you sleep better at nights.

iddo

sr. member

Activity: 360

Merit: 251

Quote

Doesn't it break type-1 derivations?

Of course, type-1 would be left at (I_L,I_R) := func(cpar,kpar,i).

Apologies, please see my edited post there.

Quote from: thanke on May 05, 2013, 12:14:07 PM

Quote from: iddo on May 05, 2013, 11:45:07 AM

Suppose that's I'm a user of BIP32 and I wish to re-use some I_L, how could I do it?
It appears that this concern is with regard to a non-issue, because we cannot reuse I_L with BIP32 ? And even if we could, we'd only lose privacy, not security.
I'm not sure whether the speedup that we obtain with the additive variant is significant. If it is, then I guess that it's better to implement the faster version, especially for heavy users such as online wallet servers, e.g. the MtGox wallet. Maybe it'd also be helpful for smartphones etc.
About side-channel attacks, I don't think that there's any significant difference between I_L*k_par and I_L+k_par ?

True, it's a non-issue with BIP32 as it stands, as long as nobody tries to re-use I_L by some non-spec behaviour.
It seems the decisive argument between additive and multiplicative is protection against timing-attacks, probably Pieter can comment on that. If both can be implemented in constant time I would go with multiplicative.

If some hacker tries to re-use I_L because of some peculiar reason, and loses privacy (not security) as a result, I could live with that...
I don't think that there could be any timing attacks if the privkey derivation is just I_L*k_par or I_L+k_par, Pieter already commented on timing attacks in post #140

thanke

member

Activity: 104

Merit: 10

Quote from: iddo on May 05, 2013, 11:45:07 AM

[..] it's difficult to keep track of all the posts in this long thread:)

Right

, so why again was this:

Quote from: iddo on May 05, 2013, 11:45:07 AM

After all, I am still in favor of type-2 in the form of (I_L,I_R) := HMAC(c_par,i), k_i := I_L*k_par, c_i := I_R. The two reasons are: 1) it allows something new without breaking anything old, 2) it simplifies the definition.

Doesn't it break type-1 derivations?

Of course, type-1 would be left at (I_L,I_R) := func(cpar,kpar,i).

Suppose that's I'm a user of BIP32 and I wish to re-use some I_L, how could I do it?
It appears that this concern is with regard to a non-issue, because we cannot reuse I_L with BIP32 ? And even if we could, we'd only lose privacy, not security.
I'm not sure whether the speedup that we obtain with the additive variant is significant. If it is, then I guess that it's better to implement the faster version, especially for heavy users such as online wallet servers, e.g. the MtGox wallet. Maybe it'd also be helpful for smartphones etc.
About side-channel attacks, I don't think that there's any significant difference between I_L*k_par and I_L+k_par ?

True, it's a non-issue with BIP32 as it stands, as long as nobody tries to re-use I_L by some non-spec behaviour.
It seems the decisive argument between additive and multiplicative is protection against timing-attacks, probably Pieter can comment on that. If both can be implemented in constant time I would go with multiplicative.

iddo

sr. member

Activity: 360

Merit: 251

2. Additive vs multiplicative: I don't know what your status is regarding constant-time implementations. But unless timing attacks discredit the multiplicative variant, I would stay with it because of the issue raised in #155. Note that #155 is a non-issue as long as you can guarantee that no I_L is ever reused. If you can't guarantee that, in particular if you wanted to allow the re-use of chaincodes, then multiplicative would be mandatory.

I haven't noticed post #155, it's difficult to keep track of all the posts in this long thread:)

Suppose that's I'm a user of BIP32 and I wish to re-use some I_L, how could I do it?
It appears that this concern is with regard to a non-issue, because we cannot reuse I_L with BIP32 ? And even if we could, we'd only lose privacy, not security.
I'm not sure whether the speedup that we obtain with the additive variant is significant. If it is, then I guess that it's better to implement the faster version, especially for heavy users such as online wallet servers, e.g. the MtGox wallet. Maybe it'd also be helpful for smartphones etc.
About side-channel attacks, I don't think that there's any significant difference between I_L*k_par and I_L+k_par ?

iddo

sr. member

Activity: 360

Merit: 251

Quote from: iddo on May 03, 2013, 01:05:27 PM

Quote from: Pieter Wuille on May 03, 2013, 12:10:13 PM

Is there a use case to allow updating keys without updating chain codes, that's worth breaking the current spec for?

The supposed use case is described in post #167,#172,#176 (edit: to be self-contained here, the use case is that A gives B some pubkey and the corresponding chaincode so that B could derive the subsequent pubkeys via type-2, then A creates a new keypair and gives the new pubkey to B, to be used with the same old chaincode, therefore there's no need to send a new chaincode via eavesdropping-resistant communication, but the communication still has to be authenticated otherwise anyone could impersonate A).

Let me elaborate on this use case. Think of many B's, for example A is a telcom corp and the B's are millions of dealers in every village on the planet. Each B knows its I_L which is considered as B's random name, issued by A, used to identify B. Now if A publishing a (signed) new root pubkey, this is a single, identical piece of information required by all B's. No individual piece of information is required to be exchanged with each B. The scenario is mainly intended for when the new root pubkey is completely unrelated to the old root pubkey, for example if a new owner takes over A. Think of A's root pubkey being published in a certificate of some PKI. Then each B can download it from keyservers. Regardless of encryption, this is a significantly different kind of communication than of each node in the tree getting updated from its parent.

That could indeed be nice, but still it's probably not a big deal if A also creates a new root chaincode and sends it to all the B's, especially if this root replacement would typically be a one-time important event, as opposed to frequent events. Also, if A and the B's knew in advance that they would need to support this kind of secure communication, then they can use something like attribute based encryption, and thereby have a solution which is completely external to BIP32 and achieves the same properties that you seek, and it's also more flexible because it could be used by A and the B's to exchange other kinds of data instead of just the chaincode.

And I'm not so sure that the properties that you seek come without risks: it appears that in the scenario that you described the stakes are raised, in the sense that if an attacker succeeds in forging A's signature for the root pubkey, then the attacker can cause huge damage with just one forged signature.