Pages:
Author

Topic: Deterministic wallets - page 5. (Read 48264 times)

member
Activity: 104
Merit: 10
May 10, 2013, 08:09:38 AM
I found an argument why the multiplier I_L in K_i := I_L *Kpar (or k_i := I_L*G + Kpar) should depend on Kpar.

Recall: In BIP 32 we have (IL,IR) = func(cpar,Kpar,i) whereas I previously had suggested (IL,IR) = func(cpar,i).

The argument is about the provability of the link between Kpar and K_i. There are three things that one might want to prove:
1. Kpar and Ki have the same owner
2. Ki was derived from Kpar (and not some other parent)
3. Kpar existed before Ki

Revealing I_L clearly proves 1. So let's discuss the other two properties.
Case IL=func(cpar,i): Given Ki we can choose arbitrary cpar and i, compute I_L:=func(cpar,i), compute Kpar with Ki=I_L*Kpar. So we can trivially come up with infinitely many plausible parents Kpar.
Case IL=func(cpar,Kpar,i): The above is not possible anymore, at least if Kpar is properly hashed in func. If we know Kpar we can check that it is a parent, otherwise we cannot come up with any plausible parent at all. So we can prove 2. and 3. after all.

For this to be useful, as we might not want to reveal cpar, we would define:
I_L := func(func(cpar,i),Kpar)
and reveal func(cpar,i) as proof of the link. So I actually suggest (for type-2):
Code:
I := HMAC( HMAC(cpar,i) , Kpar)

However, this is not yet an argument why c_i should be func(cpar,Kpar,i) instead of func(cpar,i).
sr. member
Activity: 360
Merit: 251
May 09, 2013, 02:14:57 PM
The reason for specifying that I_L >= n isn't allowed is indeed to guarantee uniformity. However, as this has such a ridiculously small chance to ever occur, I don't want whatever rule is used to deal with it to be complex to implement.

Also, the reason for using the same construction (k_i = k_par + I_L) for type-1, is to 1) make the code to implement it more uniform 2) not give a performance advantage to type-1. I agree that strictly from a security point of view k_i = I_L is just as good.

OK, great, please let me just mention that IMHO you should finalize BIP32 as it is now. In the time period from when BIP32 is final until it's ready to be rolled out in the Satoshi client, maybe we should solicit more peer review by other crypto specialists, just to be extra safe before this major change.

Regarding thanke's alternative scheme, I'd say that it could have some drawbacks (possibly less security/privacy, using less than optimal entropy), and its use case isn't too convincing (and doesn't support type-1 derivations that already existed in the subtree, so we'd need to be careful not to cause mix-ups).

There are still pending issues that are external to BIP32 itself, and as you mentioned it'd be good to have recommended guidelines.
I'll summarize here several external issues that seemed important to me:
1) exporting some intermediate node as the root node of a new HD wallet. One use case for this is the hot storage / cold storage issue, where we use type-1 at some relevant place of the tree structure to derive a new keypair whose subtree would be considered hot storage, and exporting this new keypair as a root of a secondary HD wallet file so that its AES passphrase will decrypt only the hot storage privkeys. In this scenario it could be nice that the secure cold storage wallet also gives access to the hot storage part.
2) incentivize users to keep their chaincodes private, I recommended to have a deniable encryption feature in this context.
3) HD brainwallet passphrase derivation via self-descriptive strengthened keying. BTW if you agree that scrypt makes sense for password derivation, this also implies that you wouldn't need to mess with OpenCL code in the Satoshi client (because we'd minimize the GPU advantage anyway). I asked Colin Percival about his recommended scrypt parameters for password derivation and he said r=8, p=1: http://bitbin.it/E68HeKkM
legendary
Activity: 1072
Merit: 1174
May 08, 2013, 06:26:04 PM
The reason for specifying that I_L >= n isn't allowed is indeed to guarantee uniformity. However, as this has such a ridiculously small chance to ever occur, I don't want whatever rule is used to deal with it to be complex to implement.

Also, the reason for using the same construction (k_i = k_par + I_L) for type-1, is to 1) make the code to implement it more uniform 2) not give a performance advantage to type-1. I agree that strictly from a security point of view k_i = I_L is just as good.
sr. member
Activity: 360
Merit: 251
May 08, 2013, 05:28:46 PM
I still agree with thanke that for type-1 derivation it makes slightly more sense to have k_i=I_L and specify that k_i is invalid if I_L=0 or I_L>=n, rather than to have k_i=k_par+I_L (mod n) and specify that k_i is invalid if I_L>=n. Note that the OP of this thread also specifies for type-1 that "Privatekey(type,n) is then simply set to H(n|S|type)". With k_i=I_L we could still treat d=k_i-k_par as the difference, so d*G+K_par is the corresponding pubkey, in case this d in needed for compatibility with type-2 in some possible context.

One tiny correction: k_i=n is valid according to https://en.bitcoin.it/wiki/Private_key, so if type-1 is k_i=I_L then k_i is invalid when I_L=0 or I_L>n
Alternatively, type-1 could be k_i=1+I_L, and it'd be invalid when k_i>n
Or maybe we should stick with the current k_i=k_par+I_L (mod n), any of these 3 possibilities would be OK.
sr. member
Activity: 360
Merit: 251
May 08, 2013, 05:27:22 PM
It is generally discouraged to receive more than one payment to the same address. One reason is anonymity (not only the receiver's, also the payer's anonymity), the other reason is that if you have two outputs to the same address, you're unlikely to spend them both at the same time, so the public key will be revealed before you get to spend the second output. I am saying that this general guideline is not compatible with using intermediate tree nodes for receiving, because an intermediate node is too "static" (it roots a subtree which probably has a significant lifetime). Only leafs are suitable "one-time objects".

I don't really see the problem here. The default HD wallet layout that probably most users will have is just one or more subaccounts where the subaccount is an unbounded linear path, meaning that it'd be similar to the current random-independent wallet in the sense that the user would just keep generating new receiving addresses as needed, except that your initial backup of the root node is enough to restore all the subsequent privkeys, and greater security / less hassle because there's no need to access the encrypted privkeys in order to generate new receiving addresses (plus the other possible benefits like type-2 pubkey delegation and brainwallet passphrase).
member
Activity: 104
Merit: 10
May 08, 2013, 07:24:50 AM
True, the order of G is 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
so roughly one out of 2^128 times
sr. member
Activity: 360
Merit: 251
May 08, 2013, 06:15:40 AM
Since n =  2^256 - 2^32 - 2^9 - 2^8 - 2^7 - 2^6 - 2^4 - 1 the case I_L >= n occurs in one out of 2^224 times...

I think that that's the order of the field F_p, not n=order(G)
member
Activity: 104
Merit: 10
May 08, 2013, 06:11:24 AM
3. Isn't it cleaner for type-1 if k_i is the direct output of a HMAC?

Yes, it seems better to do k_i=I_L instead of k_i=k_par+I_L (mod n)

If I'm guessing correctly, the reason that BIP32 specifies that for I_L>=n that resulting keypair is invalid is because we would like to have the privkey chosen uniformly at random from the valid range, so if we take I_L modulo n then the smaller privkey values will have a little bit higher probability?

Edit: instead of specifying that k_i is invalid, we could specify for example k_i=SHA256(I_L) in this case (and invoke SHA256 again if k_i is still invalid, and so on), so that we couldn't have forced gaps in the indices of the HD wallet? But if the probability is less than 1/2^127 then specifying that i is invalid is also fine.

I still agree with thanke that for type-1 derivation it makes slightly more sense to have k_i=I_L and specify that k_i is invalid if I_L=0 or I_L>=n, rather than to have k_i=k_par+I_L (mod n) and specify that k_i is invalid if I_L>=n. Note that the OP of this thread also specifies for type-1 that "Privatekey(type,n) is then simply set to H(n|S|type)". With k_i=I_L we could still treat d=k_i-k_par as the difference, so d*G+K_par is the corresponding pubkey, in case this d in needed for compatibility with type-2 in some possible context.

Since n =  2^256 - 2^32 - 2^9 - 2^8 - 2^7 - 2^6 - 2^4 - 1 the case I_L >= n occurs in one out of 2^224 times...
sr. member
Activity: 360
Merit: 251
May 08, 2013, 05:08:01 AM
3. Isn't it cleaner for type-1 if k_i is the direct output of a HMAC?

Yes, it seems better to do k_i=I_L instead of k_i=k_par+I_L (mod n)

If I'm guessing correctly, the reason that BIP32 specifies that for I_L>=n that resulting keypair is invalid is because we would like to have the privkey chosen uniformly at random from the valid range, so if we take I_L modulo n then the smaller privkey values will have a little bit higher probability?

Edit: instead of specifying that k_i is invalid, we could specify for example k_i=SHA256(I_L) in this case (and invoke SHA256 again if k_i is still invalid, and so on), so that we couldn't have forced gaps in the indices of the HD wallet? But if the probability is less than 1/2^127 then specifying that i is invalid is also fine.

I still agree with thanke that for type-1 derivation it makes slightly more sense to have k_i=I_L and specify that k_i is invalid if I_L=0 or I_L>=n, rather than to have k_i=k_par+I_L (mod n) and specify that k_i is invalid if I_L>=n. Note that the OP of this thread also specifies for type-1 that "Privatekey(type,n) is then simply set to H(n|S|type)". With k_i=I_L we could still treat d=k_i-k_par as the difference, so d*G+K_par is the corresponding pubkey, in case this d in needed for compatibility with type-2 in some possible context.
member
Activity: 104
Merit: 10
May 08, 2013, 03:57:35 AM
Edit3: I mentioned in post #123 that we possibly do gain more security when the chaincodes also depend on the pubkeys, because usually just the hashed address of a pubkey is publicly known (until the coins get spent). Do you disagree?

If Kpar is used as a parent then it will be a rather "static" object, so that Kpar will likely be used for spending during the lifetime of c_i. Plus when you compute a derived key you would need the pubkey and the chaincode at the same time anyway. Besides that, I disagree based on the general picture: pubkey=public, chaincode=anonymity, privkey=funds, which I like and find easy to sell and easy to argue about.

My only point was that usually hash(pubkey)=public rather than pubkey=public, so if for example the attacker somehow obtained the privkey k_5 and the chaincode c_1 but only knows the corresponding hash(K_1) then with your scheme he can derive c_2,c_3,c_4,c_5 and then derive the entire subtree of k_5, while with BIP32 he couldn't do it unless the coins of K_1 get spent and K_1 is revealed on the blockchain.

Yes, I understood your point. My point is that K_1 is so static (valid for the lifetime of the tree rooted at K_1) that it will get revealed on the blockchain rather soon. To avoid your attack the user would have to empty the whole subtree rooted at K_1 before ever using spending from K_1. That is not realistic.

With chaincode=anonymity I think that you meant that "leakage of a chaincode"=="loss of anonymity", not that the chaincodes themselves contribute to anonymity? The purpose of the chaincodes is hierarchical derivations, in particular the ability to delegate/share a subtree with another person, and perhaps greater security when we only need to access the relevant chaincode instead of the master seed. Are there other purposes that I'm missing?

Yes, agree, these are the purposes.

But your concern reveals a weakness in our general concept. That's also why I once asked if intermediate keys are to be used on the blockchain at all. Because re-using pubkeys on the blockchain is generally not encouraged. Intermediate nodes can avoid that by reserving one child key branch (i=0) for themselves and using that instead of their "own" key.

Which concern?
I failed to understand how using only the keys of leaf nodes is related to not re-using keys on the blockchain. The key gets re-used when you receive more than one payment to the same receiving address, how is that related to the hierarchical layout of the wallet?

Your concern about a privacy loss if cpar leaks and Kpar has been used for spending.

It is generally discouraged to receive more than one payment to the same address. One reason is anonymity (not only the receiver's, also the payer's anonymity), the other reason is that if you have two outputs to the same address, you're unlikely to spend them both at the same time, so the public key will be revealed before you get to spend the second output. I am saying that this general guideline is not compatible with using intermediate tree nodes for receiving, because an intermediate node is too "static" (it roots a subtree which probably has a significant lifetime). Only leafs are suitable "one-time objects".
sr. member
Activity: 360
Merit: 251
May 07, 2013, 08:44:50 AM
Edit3: I mentioned in post #123 that we possibly do gain more security when the chaincodes also depend on the pubkeys, because usually just the hashed address of a pubkey is publicly known (until the coins get spent). Do you disagree?

If Kpar is used as a parent then it will be a rather "static" object, so that Kpar will likely be used for spending during the lifetime of c_i. Plus when you compute a derived key you would need the pubkey and the chaincode at the same time anyway. Besides that, I disagree based on the general picture: pubkey=public, chaincode=anonymity, privkey=funds, which I like and find easy to sell and easy to argue about.

My only point was that usually hash(pubkey)=public rather than pubkey=public, so if for example the attacker somehow obtained the privkey k_5 and the chaincode c_1 but only knows the corresponding hash(K_1) then with your scheme he can derive c_2,c_3,c_4,c_5 and then derive the entire subtree of k_5, while with BIP32 he couldn't do it unless the coins of K_1 get spent and K_1 is revealed on the blockchain. Similarly with regard to privacy concerns, if the attacker obtained just c_1 then he could derive c_2,c_3,...,c_i until reaching two consecutive pubkeys that were revealed on the blockchain, which he would detect by testing G*f(c_i)+K_i==K_{i+1}, and therefore know the subtree of all future pubkeys from here (as opposed to BIP32 where the attacker only has c_1 and hash(K_1) and therefore cannot do anything because he cannot derive any other chaincodes).

With chaincode=anonymity I think that you meant that "leakage of a chaincode"=="loss of anonymity", not that the chaincodes themselves contribute to anonymity? The purpose of the chaincodes is hierarchical derivations, in particular the ability to delegate/share a subtree with another person, and perhaps greater security when we only need to access the relevant chaincode instead of the master seed. Are there other purposes that I'm missing?

But your concern reveals a weakness in our general concept. That's also why I once asked if intermediate keys are to be used on the blockchain at all. Because re-using pubkeys on the blockchain is generally not encouraged. Intermediate nodes can avoid that by reserving one child key branch (i=0) for themselves and using that instead of their "own" key.

Which concern?
I failed to understand how using only the keys of leaf nodes is related to not re-using keys on the blockchain. The key gets re-used when you receive more than one payment to the same receiving address, how is that related to the hierarchical layout of the wallet?
member
Activity: 104
Merit: 10
May 06, 2013, 04:56:01 AM
Edit2: You're hiding the output size of the HMAC there, it should be 512 bits, and if c_i=I_R is 256 bits then you feed less than optimal entropy into the hash function, so the lengths don't work as well as they could. Please write your proposed type-2 HMAC derivation more clearly?

This was intentional, it should be trivial to fill in the corrent sizes once everything else is settled.

Edit3: I mentioned in post #123 that we possibly do gain more security when the chaincodes also depend on the pubkeys, because usually just the hashed address of a pubkey is publicly known (until the coins get spent). Do you disagree?

If Kpar is used as a parent then it will be a rather "static" object, so that Kpar will likely be used for spending during the lifetime of c_i. Plus when you compute a derived key you would need the pubkey and the chaincode at the same time anyway. Besides that, I disagree based on the general picture: pubkey=public, chaincode=anonymity, privkey=funds, which I like and find easy to sell and easy to argue about.

But your concern reveals a weakness in our general concept. That's also why I once asked if intermediate keys are to be used on the blockchain at all. Because re-using pubkeys on the blockchain is generally not encouraged. Intermediate nodes can avoid that by reserving one child key branch (i=0) for themselves and using that instead of their "own" key.

4. A further suggestion for type-1 on top of 3.: (k_i,c_i) := HMAC(S,i) where S := HMAC(kpar,cpar). This achieves that k_i=func(kpar,cpar,i) as before. Plus it has the following properties:
- storing S there is no need to decrypt kpar when deriving children (the new advantage of this)
- children cannot be derived from kpar alone (iddo insisted on this property)
- children cannot be derived from cpar alone (we assume cpar is stored unencrypted)
Storing S is of course optional. The common (default) use would be just to recompute it each time.

On the one hand you say that it's advantageous only if we store S, and on the other hand you recommend to recompute S each time?
Basically if we always store S then it means that when we derive a new privkey then only the child privkey (not the parent privkey) will be in memory? But the user will still need to provide his AES passphrase to encrypt the child privkey, so I don't think that we gain much with this idea?

Correct, that's why I wouldn't bother with storing S in the reference implementation. We also don't want to complicate extended keys further, their structure shall remain unchanged, they are just pairs.
Storing S is more of an option for an advanced user, who wants to store S on a different machine than k_par for some reason, and encrypts the k_i with a different key than he encrypts k_par with. As a result of this, parent and child funds are completely separated, whereas before parent funds required a subset (kpar) of the information needed for child funds (kpar and cpar).

If you have an actual reason to doubt SHA512 then I think that resorting to SHA256 shouldn't help you sleep better at nights.

True. What about reducing code dependencies? Is that irrelevant?
sr. member
Activity: 360
Merit: 251
May 05, 2013, 05:33:44 PM
3. Isn't it cleaner for type-1 if k_i is the direct output of a HMAC?

Yes, it seems better to do k_i=I_L instead of k_i=k_par+I_L (mod n)

4. A further suggestion for type-1 on top of 3.: (k_i,c_i) := HMAC(S,i) where S := HMAC(kpar,cpar). This achieves that k_i=func(kpar,cpar,i) as before. Plus it has the following properties:
- storing S there is no need to decrypt kpar when deriving children (the new advantage of this)
- children cannot be derived from kpar alone (iddo insisted on this property)
- children cannot be derived from cpar alone (we assume cpar is stored unencrypted)
Storing S is of course optional. The common (default) use would be just to recompute it each time.

On the one hand you say that it's advantageous only if we store S, and on the other hand you recommend to recompute S each time?
Basically if we always store S then it means that when we derive a new privkey then only the child privkey (not the parent privkey) will be in memory? But the user will still need to provide his AES passphrase to encrypt the child privkey, so I don't think that we gain much with this idea?

5. It is certainly possible to do everything with HMAC-SHA256 alone. For instance, if you need two values like (I_L,I_R) you can do I_L=HMAC(secret,0) and I_L=HMAC(secret,1). The question is, does it reduce dependencies of the code, code review, etc. to be worthwhile? Or does it matter here that SHA256 has by now become the best-tested hash function in the world?     

If you have an actual reason to doubt SHA512 then I think that resorting to SHA256 shouldn't help you sleep better at nights.
sr. member
Activity: 360
Merit: 251
May 05, 2013, 01:27:14 PM
Quote
Doesn't it break type-1 derivations?
Of course, type-1 would be left at (I_L,I_R) := func(cpar,kpar,i).
Apologies, please see my edited post there.

Suppose that's I'm a user of BIP32 and I wish to re-use some I_L, how could I do it?
It appears that this concern is with regard to a non-issue, because we cannot reuse I_L with BIP32 ? And even if we could, we'd only lose privacy, not security.
I'm not sure whether the speedup that we obtain with the additive variant is significant. If it is, then I guess that it's better to implement the faster version, especially for heavy users such as online wallet servers, e.g. the MtGox wallet. Maybe it'd also be helpful for smartphones etc.
About side-channel attacks, I don't think that there's any significant difference between I_L*k_par and I_L+k_par ?

True, it's a non-issue with BIP32 as it stands, as long as nobody tries to re-use I_L by some non-spec behaviour.
It seems the decisive argument between additive and multiplicative is protection against timing-attacks, probably Pieter can comment on that. If both can be implemented in constant time I would go with multiplicative.

If some hacker tries to re-use I_L because of some peculiar reason, and loses privacy (not security) as a result, I could live with that...
I don't think that there could be any timing attacks if the privkey derivation is just I_L*k_par or I_L+k_par, Pieter already commented on timing attacks in post #140
member
Activity: 104
Merit: 10
May 05, 2013, 01:14:07 PM
[..] it's difficult to keep track of all the posts in this long thread:)

Right  Wink, so why again was this:

After all, I am still in favor of type-2 in the form of (I_L,I_R) := HMAC(c_par,i), k_i := I_L*k_par, c_i := I_R. The two reasons are: 1) it allows something new without breaking anything old, 2) it simplifies the definition.

Doesn't it break type-1 derivations?

Of course, type-1 would be left at (I_L,I_R) := func(cpar,kpar,i).

Suppose that's I'm a user of BIP32 and I wish to re-use some I_L, how could I do it?
It appears that this concern is with regard to a non-issue, because we cannot reuse I_L with BIP32 ? And even if we could, we'd only lose privacy, not security.
I'm not sure whether the speedup that we obtain with the additive variant is significant. If it is, then I guess that it's better to implement the faster version, especially for heavy users such as online wallet servers, e.g. the MtGox wallet. Maybe it'd also be helpful for smartphones etc.
About side-channel attacks, I don't think that there's any significant difference between I_L*k_par and I_L+k_par ?

True, it's a non-issue with BIP32 as it stands, as long as nobody tries to re-use I_L by some non-spec behaviour.
It seems the decisive argument between additive and multiplicative is protection against timing-attacks, probably Pieter can comment on that. If both can be implemented in constant time I would go with multiplicative.
sr. member
Activity: 360
Merit: 251
May 05, 2013, 12:45:07 PM
2. Additive vs multiplicative: I don't know what your status is regarding constant-time implementations. But unless timing attacks discredit the multiplicative variant, I would stay with it because of the issue raised in #155. Note that #155 is a non-issue as long as you can guarantee that no I_L is ever reused. If you can't guarantee that, in particular if you wanted to allow the re-use of chaincodes, then multiplicative would be mandatory.

I haven't noticed post #155, it's difficult to keep track of all the posts in this long thread:)

Suppose that's I'm a user of BIP32 and I wish to re-use some I_L, how could I do it?
It appears that this concern is with regard to a non-issue, because we cannot reuse I_L with BIP32 ? And even if we could, we'd only lose privacy, not security.
I'm not sure whether the speedup that we obtain with the additive variant is significant. If it is, then I guess that it's better to implement the faster version, especially for heavy users such as online wallet servers, e.g. the MtGox wallet. Maybe it'd also be helpful for smartphones etc.
About side-channel attacks, I don't think that there's any significant difference between I_L*k_par and I_L+k_par ?
sr. member
Activity: 360
Merit: 251
May 05, 2013, 12:13:56 PM
  • Is there a use case to allow updating keys without updating chain codes, that's worth breaking the current spec for?

The supposed use case is described in post #167,#172,#176 (edit: to be self-contained here, the use case is that A gives B some pubkey and the corresponding chaincode so that B could derive the subsequent pubkeys via type-2, then A creates a new keypair and gives the new pubkey to B, to be used with the same old chaincode, therefore there's no need to send a new chaincode via eavesdropping-resistant communication, but the communication still has to be authenticated otherwise anyone could impersonate A).

Let me elaborate on this use case. Think of many B's, for example A is a telcom corp and the B's are millions of dealers in every village on the planet. Each B knows its I_L which is considered as B's random name, issued by A, used to identify B. Now if A publishing a (signed) new root pubkey, this is a single, identical piece of information required by all B's. No individual piece of information is required to be exchanged with each B. The scenario is mainly intended for when the new root pubkey is completely unrelated to the old root pubkey, for example if a new owner takes over A. Think of A's root pubkey being published in a certificate of some PKI. Then each B can download it from keyservers. Regardless of encryption, this is a significantly different kind of communication than of each node in the tree getting updated from its parent.

That could indeed be nice, but still it's probably not a big deal if A also creates a new root chaincode and sends it to all the B's, especially if this root replacement would typically be a one-time important event, as opposed to frequent events. Also, if A and the B's knew in advance that they would need to support this kind of secure communication, then they can use something like attribute based encryption, and thereby have a solution which is completely external to BIP32 and achieves the same properties that you seek, and it's also more flexible because it could be used by A and the B's to exchange other kinds of data instead of just the chaincode.

And I'm not so sure that the properties that you seek come without risks: it appears that in the scenario that you described the stakes are raised, in the sense that if an attacker succeeds in forging A's signature for the root pubkey, then the attacker can cause huge damage with just one forged signature.

After all, I am still in favor of type-2 in the form of (I_L,I_R) := HMAC(c_par,i), k_i := I_L*k_par, c_i := I_R. The two reasons are: 1) it allows something new without breaking anything old, 2) it simplifies the definition.

Doesn't it break type-1 derivations?

Edit: OK, after looking at your (3) and (4) above, I see that you don't break type-1 derivation, you just break the property that'd have the same old chaincodes after replacing the root keypair, because the type-1 derivations will result in new chaincodes.

Edit2: You're hiding the output size of the HMAC there, it should be 512 bits, and if c_i=I_R is 256 bits then you feed less than optimal entropy into the hash function, so the lengths don't work as well as they could. Please write your proposed type-2 HMAC derivation more clearly?

Edit3: I mentioned in post #123 that we possibly do gain more security when the chaincodes also depend on the pubkeys, because usually just the hashed address of a pubkey is publicly known (until the coins get spent). Do you disagree?
member
Activity: 104
Merit: 10
May 05, 2013, 05:13:48 AM
I've updated the specification to use addition instead of multiplication, but now I'd like to make the BIP32 specification final soon.

A few remarks before finalization, taking into account and drawing from the long discussion with iddo:

1. Can we reserve more than one bit of the number i for different kind of derivations, not only the highest bit? This would allow addition other kinds of derivations or tweaks of existing ones in the future.

2. Additive vs multiplicative: I don't know what your status is regarding constant-time implementations. But unless timing attacks discredit the multiplicative variant, I would stay with it because of the issue raised in #155. Note that #155 is a non-issue as long as you can guarantee that no I_L is ever reused. If you can't guarantee that, in particular if you wanted to allow the re-use of chaincodes, then multiplicative would be mandatory.

3. I am having an issue with the type-1 derivation. In current BIP 32 both types of derivation first produce an I_L in their own way and then set k_i := I_L * k_par (or + instead of *). For type-1 I_L depends on k_par. Isn't it cleaner for type-1 if k_i is the direct output of a HMAC? Why the additional multiplication with k_par? This is unnecessarily complicated compared to setting (k_i,c_i) := HMAC(kpar,cpar,i). iddo phrased the desired property for type-1: "assuming that all the chaincodes of the wallet are public, and the privkey k_i leaks, the attacker still cannot find the privkey k_par". It is counter-intuitive to this property that a number I_L with k_i = I_L * k_par is even computed. Besides k_i and k_par, I_L is one more (unnecessary) thing that an implementation has to prevent from being swapped out to disk.

4. A further suggestion for type-1 on top of 3.: (k_i,c_i) := HMAC(S,i) where S := HMAC(kpar,cpar). This achieves that k_i=func(kpar,cpar,i) as before. Plus it has the following properties:
- storing S there is no need to decrypt kpar when deriving children (the new advantage of this)
- children cannot be derived from kpar alone (iddo insisted on this property)
- children cannot be derived from cpar alone (we assume cpar is stored unencrypted)
Storing S is of course optional. The common (default) use would be just to recompute it each time.

5. It is certainly possible to do everything with HMAC-SHA256 alone. For instance, if you need two values like (I_L,I_R) you can do I_L=HMAC(secret,0) and I_L=HMAC(secret,1). The question is, does it reduce dependencies of the code, code review, etc. to be worthwhile? Or does it matter here that SHA256 has by now become the best-tested hash function in the world?     

6. BIP 32 should, for each derivation type, explicitly list all properties that it claims to achieve. So that wallet implementers can rely on these properties instead of digging into the details of the definitions. 

7. I understand that it is not the concern of BIP 32 to specify how clients handle extended pubkeys. However, since this is important for security there should be some guidelines, possibly in a separate document. For example, I am thinking of an "export bit" for extended keys. A type-2 derived extended key would get this bit set for lifetime, and it would prohibit the export of its privkey.

  • Is there a use case to allow updating keys without updating chain codes, that's worth breaking the current spec for?

The supposed use case is described in post #167,#172,#176 (edit: to be self-contained here, the use case is that A gives B some pubkey and the corresponding chaincode so that B could derive the subsequent pubkeys via type-2, then A creates a new keypair and gives the new pubkey to B, to be used with the same old chaincode, therefore there's no need to send a new chaincode via eavesdropping-resistant communication, but the communication still has to be authenticated otherwise anyone could impersonate A).

Let me elaborate on this use case. Think of many B's, for example A is a telcom corp and the B's are millions of dealers in every village on the planet. Each B knows its I_L which is considered as B's random name, issued by A, used to identify B. Now if A publishing a (signed) new root pubkey, this is a single, identical piece of information required by all B's. No individual piece of information is required to be exchanged with each B. The scenario is mainly intended for when the new root pubkey is completely unrelated to the old root pubkey, for example if a new owner takes over A. Think of A's root pubkey being published in a certificate of some PKI. Then each B can download it from keyservers. Regardless of encryption, this is a significantly different kind of communication than of each node in the tree getting updated from its parent.

After all, I am still in favor of type-2 in the form of (I_L,I_R) := HMAC(c_par,i), k_i := I_L*k_par, c_i := I_R. The two reasons are: 1) it allows something new without breaking anything old, 2) it simplifies the definition.

  • Is there a reason to disallow secret derivation after public derivation?

IMHO absolutely not, because I see practical use cases for doing it (posts #122, #203, #205), and I don't see any reason to disallow it.

I think iddo is right. Secret derivation after public derivation allows a wallet implementation to be absolutely strict about the export bit. Because if the user desires to export the privkey of a type-2 derived extended key, then the wallet would just do a secret derivation and export the privkey of the result of that.
member
Activity: 63
Merit: 10
Vires in Numeris
May 04, 2013, 10:41:07 AM
Wouldn't it just be simpler to add the base point incrementally? Why not just start out with a private and public key, store the private one securely, then just keep on adding the base point to the public key to get new addresses.

Are you suggesting k_i=k_par+1, K_i=K_par+G ?
Then the keygen isn't pseudorandom (post #62), if one privkey leaks then the entire wallet immediately leaks too, and there's no hierarchical derivation.
Ah. I see. Nevermind what I said.
sr. member
Activity: 360
Merit: 251
May 04, 2013, 01:41:43 AM
Wouldn't it just be simpler to add the base point incrementally? Why not just start out with a private and public key, store the private one securely, then just keep on adding the base point to the public key to get new addresses.

Are you suggesting k_i=k_par+1, K_i=K_par+G ?
Then the keygen isn't pseudorandom (post #62), if one privkey leaks then the entire wallet immediately leaks too, and there's no hierarchical derivation.
Pages:
Jump to: