Author

Topic: how deep in the tree of address---hd wallet on restore (Read 1070 times)

legendary
Activity: 1456
Merit: 1081
I may write code in exchange for bitcoins.
@fbueller, thanks for the discussion and th elinks.  I'm checking out that thread you linked me to now.

@DH, your description is a much better way to state the problem.  I was mainly just trying to capture the question of whether there's a smart way to know how deep in the hd tree (or chain, if there's just one branch) to generate when loading a wallet.  Seems like the principle here is kinda heurisitc, generate addresses until you find a large sequence of addresses which seem to be unused.  It's a good point that software can also prompt the user for whether to keep going on this (if the balance looks right).

Thank again for the info, folks.
legendary
Activity: 3472
Merit: 4801
Hi guys, I may be missing something crucial here, but I'm sure you guys will correct me soon enough.

HD wallets let a user worry less about making a new backup for change addresses because the change addresses are all deterministically generatable from the seed.  That let me to thinking about restoring from an hd backup.  Say you only have the seed and you've lost everything else.   Your wallet needs to know which addressess are yours when it loads the blockchain, it should show you your balance based on how many addresses on the blockchain have spendable coins and belong to you.  But here's the question, how deep in the tree of generatable addresses should it go when looking?  How can it know beforehand (on a restore) how deep in the tree you went in creating addresses for yourself.

For example, consider receiving an address as you load the blockchain, you ask yourself "is this my address"?  No.  Is this my first change address?  No.  Is this my second change address? No.  ...  When do you stop and say, "okay, this address isn't in my tree of addresses"?

Thanks in advance for the insight!

Honestly, I have no idea how the HD wallets handle recovery from a seed, so take everything I say in the post purely as a guess or as "the way I think I would probably do it if I were writing the software".

The way you seem to be looking at it seems to be backwards.  If I understand correctly you're assuming that you start with the first address in the first block in the blockchain and check it against every possible address that might be generated from the seed. Then you move to the next address in the blockchain and repeat the process, and continue until the entire blockchain has been loaded/processed.

If I were writing the software, I think I'd probably process the entire blockchain as if I didn't have any addresses at all, building up the complete UTXO list.

Next, I'd collect a list of all unique addresses in the UTXO list.

Then, I'd generate the first candidate address from my seed, and check to see if the address exists in the UTXO list. If so, then I've found an address that has some bitcoins associated with it.

I'd continue to generate each possible address from the seed and check for "exists in the UTXO list" until I've encountered some predefined gap with no balances (perhaps 5000 addresses?).

So, if I generate from the seed "gap" number of consecutive candidate addresses without having found any of them in the UTXO, I'd assume I've found all addresses with a balance and I'd be done.

I might ask the user if the resulting balance appears to be correct.  If they answer "yes", then the wallet is completely recovered.  If they answer "no", then I might continue to search for a larger gap.
sr. member
Activity: 412
Merit: 287
Not knowing which address index to use from a backup was a problem I had with copay - one participant lost his wallet, and the other users didn't have the address index he used. We lost bitcoins until Bitpay were able to find it on the server side given the timestamp in the other users wallet.. There was some bug that didn't associate the wallet correctly on a minor release, but it shown me how hard it is without this information. There's certainly no way you can derive the whole tree.
sr. member
Activity: 412
Merit: 287
BIP44 is just one of such schemes - so while it doesn't even tell you which derivations (like cointype, purpose, etc) to follow exactly, knowing it's BIP44 is at least a start.

I'm not sure how the Schildbach wallet works - but I presume it derives addresses along one HD path, eg, m/0'/a/b/x. As long as you know the only number that really changes is x (0'/a/b are wallet specific), you can brute force n addresses, stopping when the last 10 or whatever have not received outputs.

It's even more awkward with HD multisignature accounts. The other participants keys, the # required signatures, and even the order they are specified in the script, are all additional metadata that are required in order to derive the correct address from a backup.

EDIT: Someone is comparing wallets for compatibility between schemes here: https://bitcointalksearch.org/topic/deterministic-wallet-compatibility-matrix-1000544
legendary
Activity: 1456
Merit: 1081
I may write code in exchange for bitcoins.
I guess I need to look more closely at BIP44.  Perhaps the answer to my question is in there.  And, what you said about metadata certainly hints at an answer.  I should clarify, though, it's not that I'm writing software to implement this, it's that I'm just curious how current software does it.  I use, for example, Schildbach's wallet for android.  This wallet has an hd seed and I know what the seed is.  I was simply imagining how I might recover all my transactions in some worst-case scenario where all I had left was my seed.  If I loaded my seed into a wallet, how could I know in principle whether a given address was in my tree of addresses upon loading the blockchain?
sr. member
Activity: 412
Merit: 287
You can't practically brute force all the possible keys, so only derive what you need. Electrum uses a gap limit, on the single HD chain it uses for addresses. You could assume that branches will be sequential, and do something similar for branches.

But you can't do this indefinitely, which is why defined structures like BIP44 exist. They generally require some metadata in addition to your root key.

So it depends how you want to do it!
legendary
Activity: 1456
Merit: 1081
I may write code in exchange for bitcoins.
Hi guys, I may be missing something crucial here, but I'm sure you guys will correct me soon enough.

HD wallets let a user worry less about making a new backup for change addresses because the change addresses are all deterministically generatable from the seed.  That let me to thinking about restoring from an hd backup.  Say you only have the seed and you've lost everything else.   Your wallet needs to know which addressess are yours when it loads the blockchain, it should show you your balance based on how many addresses on the blockchain have spendable coins and belong to you.  But here's the question, how deep in the tree of generatable addresses should it go when looking?  How can it know beforehand (on a restore) how deep in the tree you went in creating addresses for yourself.

For example, consider receiving an address as you load the blockchain, you ask yourself "is this my address"?  No.  Is this my first change address?  No.  Is this my second change address? No.  ...  When do you stop and say, "okay, this address isn't in my tree of addresses"?

Thanks in advance for the insight!
Jump to: