
Topic: Technical help needed regarding Electrum mnemonics (Read 212 times)

legendary
Activity: 1042
Merit: 2805
Bitcoin and C♯ Enthusiast
Yeah, I do realize that! But the problem isn't the HMAC function itself, it's what is being passed to it.

You see, unlike BIP-39, Electrum heavily modifies the words in the given string before passing them to the HMAC function. For example, in the case of the test vector linked above, all spaces are removed and the result is then passed to the HMAC function as "msg". That makes the data part 36 bytes (12 characters of 3 UTF-8 bytes each), whereas it would normally be somewhere around 80+.
Code:
眼 悲 叛 改 节 跃 衡 响 疆 股 遂 冬 -> 眼悲叛改节跃衡响疆股遂冬

Basically, my code was simply missing these 2 methods:
https://github.com/Autarkysoft/Denovo/blob/648419d8a3ccd590051ab390539b0b7147917d4e/Src/Autarkysoft.Bitcoin/ImprovementProposals/ElectrumMnemonic.cs#L234-L258
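
For illustration, here is a minimal Python sketch of the space-removal step described above. It assumes a simplified CJK test that only covers the U+4E00 to U+9FFF block; the linked methods and Electrum's mnemonic.py handle more ranges and also lower-case the text and strip accents. The names `is_cjk` and `strip_cjk_spaces` are hypothetical and not taken from either code base.

```python
import unicodedata


def is_cjk(ch: str) -> bool:
    # Assumption: only the CJK Unified Ideographs block is checked here.
    return 0x4E00 <= ord(ch) <= 0x9FFF


def strip_cjk_spaces(text: str) -> str:
    # Normalize with NFKD, then collapse whitespace runs into single spaces.
    text = " ".join(unicodedata.normalize("NFKD", text).split())
    out = []
    for i, ch in enumerate(text):
        # Drop a space only when both of its neighbours are CJK characters.
        if (ch == " " and 0 < i < len(text) - 1
                and is_cjk(text[i - 1]) and is_cjk(text[i + 1])):
            continue
        out.append(ch)
    return "".join(out)


words = "眼 悲 叛 改 节 跃 衡 响 疆 股 遂 冬"
msg = strip_cjk_spaces(words)
# Print the message handed to HMAC and its length in UTF-8 bytes.
print(msg, len(msg.encode("utf-8")))
```

Non-CJK text keeps its single spaces, so an English mnemonic passes through this step unchanged apart from whitespace collapsing.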
legendary
Activity: 3710
Merit: 1586
see here:

https://en.wikipedia.org/wiki/HMAC

in electrum it's being used as a fancy hash function. the message is the seed mnemonic and the key is simply the string "Seed version" in byte form.
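
As a rough sketch of that call using only the Python standard library (the function name `seed_version_digest` is made up for this example, and the mnemonic is assumed to be already normalized the way Electrum expects before it gets here):

```python
import hashlib
import hmac


def seed_version_digest(mnemonic: str) -> str:
    # Key: the ASCII bytes of "Seed version". Message: the UTF-8 mnemonic.
    return hmac.new(b"Seed version", mnemonic.encode("utf-8"),
                    hashlib.sha512).hexdigest()


digest = seed_version_digest("眼悲叛改节跃衡响疆股遂冬")
# HMAC-SHA512 gives 64 bytes, i.e. 128 hex characters.
print(digest[:3], len(digest))
```

Any difference in the message bytes, even a single space, gives a completely different digest, so the normalization done beforehand matters as much as the HMAC call.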
legendary
Activity: 1042
Merit: 2805
Bitcoin and C♯ Enthusiast
legendary
Activity: 3710
Merit: 1586
it's outputting the following when using the functions in electrum:

Code:
1002a06ca7de987ae74c4189cfdc3cf45bb3a836d21019bc19e265d2858e3eccc7b26c9a5509a4f7ccb9f9fc96663d268ca2a1e49dd2d08af832e96fccaf95c5

so it's a valid segwit seed.
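
The prefix check itself can be sketched like this. The two prefix values are the ones quoted in this thread; the function name `seed_type` and the returned labels are only illustrative, and other prefixes are deliberately left out.

```python
SEED_PREFIX = "01"      # standard seed, as quoted in this thread
SEED_PREFIX_SW = "100"  # segwit seed, as quoted in this thread


def seed_type(digest_hex: str):
    # Only the leading hex characters of the digest are inspected.
    if digest_hex.startswith(SEED_PREFIX):
        return "standard"
    if digest_hex.startswith(SEED_PREFIX_SW):
        return "segwit"
    return None  # neither of the two prefixes matched


# First 16 hex characters of the two digests posted in this thread.
print(seed_type("1002a06ca7de987a"))
print(seed_type("0f5c4c9ff66e87bc"))
```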
legendary
Activity: 1042
Merit: 2805
Bitcoin and C♯ Enthusiast
I'm trying to add an Electrum mnemonic recovery option to The FinderOuter. Looking at the test_mnemonic.py file, I'm having trouble understanding the tests, even though I think I understand how the mnemonic.py file works.

The English test vectors in this file, and also random keys I created using Electrum, work fine: after computing HmacSha512(data=words, key="Seed version") I get a digest that starts with SEED_PREFIX (01) or SEED_PREFIX_SW (100). E.g. the first digest is 1001bc7d1ea...., and that prefix is all Electrum looks for in a valid mnemonic (there is no checksum).

But the rest don't.
Take the Chinese case, for example. The digest is
Code:
0f5c4c9ff66e87bcbdde59f06ad540eba48fe06f7bdfa16b1248cb868ae3cd3fe7f6ea08e50bc77fdec1f08d8b5710bd25a9d76427e636feed23cbcb3ec8b8cb
which is incorrect.
It is worth mentioning that the words are normalized using form KD, as with BIP-39, and produce the same bytes as the "words_hex" in the test vector.
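
A small Python sketch of that normalization step, for reference. It only reproduces the NFKD plus UTF-8 encoding described above; the variable names are arbitrary, and the bytes it prints still contain the 0x20 separators between the words.

```python
import unicodedata

words = "眼 悲 叛 改 节 跃 衡 响 疆 股 遂 冬"

# NFKD leaves these ideographs unchanged; the spaces are kept as well.
normalized = unicodedata.normalize("NFKD", words)
data = normalized.encode("utf-8")

print(data.hex())  # hex form, comparable against "words_hex"
print(len(data))   # number of bytes fed to the HMAC in this setup
```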

So what is the problem here?