
Topic: How do I identify the valid checksums for bip39 if I generate 11/12 of the word?

legendary
Activity: 1512
Merit: 7340
Farewell, Leo
Thanks but the idea here was for me to learn how to do as much as possible myself
Building the software from scratch requires a certain degree of technical competence. If you don't feel confident with that, I strongly recommend that you either use someone else's code that you have read yourself, or study software engineering until you do feel confident enough.

That said, can you give me a walk through on how I would do that step manually on windows?  TY
First of all, I want to make it clear that I don't want you to trust me. I want you to verify me. The code isn't difficult to read. Most of it happens in Form1.cs. I make use of the NBitcoin and Bitcoin.Net libraries which are broadly used in other software too.

There are two ways to run this program. One is to import the source code into Visual Studio 2019 and compile it. The easier way is to download CoinFlippedSeed-v0.3.zip, make sure that the SHA-1 of the zip is 4DA93F3D72A9EB65282650E15D4E3C288A28FD71*, unzip the binaries and run CoinFlippedSeed.exe.

*You can try to skip the integrity verification part (that is the SHA-1 verification) for the moment, just to try out the software, but it's important to do it regularly on most of the software you install. It makes sure that the binaries aren't compromised. Do it if you're about to create a Bitcoin wallet with funds deposited.
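
The integrity check described above can also be done with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes Python is available on the machine. It computes the SHA-1 of a downloaded file and compares it with the published value.

```python
import hashlib

def file_sha1(path, chunk_size=65536):
    """Return the SHA-1 of a file as an uppercase hex string."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in chunks so large downloads do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()

def matches_published_hash(path, published_hex):
    """Compare the file's SHA-1 with the published value, ignoring case."""
    return file_sha1(path) == published_hex.strip().upper()

# Example call (file name and hash are the ones quoted in the post):
# matches_published_hash("CoinFlippedSeed-v0.3.zip",
#                        "4DA93F3D72A9EB65282650E15D4E3C288A28FD71")
```

If the function returns False, the download does not match what was published and should not be run.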
member
Activity: 104
Merit: 120
Thanks but the idea here was for me to learn how to do as much as possible myself and to avoid putting trust in any particular software relating to building your own bitcoin wallet offline.  That said, can you give me a walk through on how I would do that step manually on windows?  TY
legendary
Activity: 1512
Merit: 7340
Farewell, Leo
Do you have any suggestions on the best way to do this on an offline Windows machine?
Hash bytes? Sure, but there are programs that let you make a seed yourself completely, not just for this part. Here's one I've written: https://github.com/AngeloMetal/CoinFlippedSeed

Also, when you asked "Windows box? You mean Windows Forms in Visual Studio?" I simply meant a Windows PC.  Thanks.
The above program works on Windows.
member
Activity: 104
Merit: 120
Hi BlackHatCoiner,

Thanks for the tips.  When you stated "You need to convert your 128-bit string to bytes, and then hash that. It's just that most libraries do this conversion in the background, which brings some confusion", do you have any suggestions on the best way to do this on an offline Windows machine?

Also, when you asked "Windows box? You mean Windows Forms in Visual Studio?" I simply meant a Windows PC.  Thanks.
legendary
Activity: 1512
Merit: 7340
Farewell, Leo
2)   Convert the binary 128 bit string to hexadecimal.
3)   Perform a SHA 256 hash of the hexadecimal.
You don't hash the hexadecimal, and that's why you don't need to convert the binaries to hexadecimal. Hash functions take input as bytes. You need to convert your 128-bit string to bytes, and then hash that. It's just that most libraries do this conversion in the background, which brings some confusion.
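
A minimal Python sketch of that conversion, for illustration only. The helper name `bits_to_bytes` and the placeholder bit pattern are assumptions, not part of any library. It packs a string of 0s and 1s into raw bytes and shows that hashing those bytes is not the same as hashing the typed characters.

```python
import hashlib

def bits_to_bytes(bit_string):
    """Pack a string of '0'/'1' characters into raw bytes (8 bits per byte)."""
    bits = "".join(bit_string.split())  # drop any spaces or line breaks
    if len(bits) % 8 != 0 or set(bits) - {"0", "1"}:
        raise ValueError("expected a multiple of 8 binary digits")
    return int(bits, 2).to_bytes(len(bits) // 8, "big")

bits = "11110010" * 16  # placeholder pattern, not real entropy
raw = bits_to_bytes(bits)

# 16 raw bytes versus 128 ASCII characters: different inputs, different hashes.
hash_of_bytes = hashlib.sha256(raw).hexdigest()
hash_of_text = hashlib.sha256(bits.encode("ascii")).hexdigest()
```

Here `raw` is 16 bytes long, while the text version is 128 bytes long, which is why the two digests differ.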

Alternatively I imagine I could simply roll a 16-sided die to get 32 unique hex values and skip steps 1 and 2
Note that a 16-sided die is likely to be more prone to returning less random results than a 6-sided die, and even more so than a coin. You should run a chi-squared test to check this.
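
One way to run such a test is sketched below in Python. The function names are hypothetical and the 5% significance level is an assumption made for this example. The critical values are the standard upper 5% points of the chi-squared distribution for 1, 5 and 15 degrees of freedom (coin, 6-sided die, 16-sided die).

```python
from collections import Counter

# Upper 5% critical values of the chi-squared distribution, keyed by
# degrees of freedom (number of faces minus one).
CRITICAL_5_PERCENT = {1: 3.841, 5: 11.070, 15: 24.996}

def chi_squared_statistic(rolls, faces):
    """Sum of (observed - expected)^2 / expected over all faces 1..faces."""
    counts = Counter(rolls)
    expected = len(rolls) / faces
    return sum((counts.get(face, 0) - expected) ** 2 / expected
               for face in range(1, faces + 1))

def looks_fair(rolls, faces):
    """True if the statistic stays below the 5% critical value."""
    return chi_squared_statistic(rolls, faces) < CRITICAL_5_PERCENT[faces - 1]
```

Record a few hundred rolls as numbers from 1 to 16 and pass them as `rolls` with `faces=16`. A commonly cited rule of thumb is to have at least 5 expected rolls per face, so 80 rolls would be a bare minimum for a 16-sided die.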

That all said, can anyone here give me some insight with how I would perform steps 2 and 3 on a windows box (ideally offline)?
Windows box? You mean Windows Forms in Visual Studio?
member
Activity: 104
Merit: 120
Thanks all.  So if I'm getting this right and I wanted to simply create my own independent entropy for a BIP 39 12 word seed, I can do it by performing the following steps:

1)   Take 128 bit entropy (i.e. 11 randomly selected BIP 39 words and identifying their 11 bit codes + 7 random bits - or perhaps just 128 coin flips).
2)   Convert the binary 128 bit string to hexadecimal.
3)   Perform a SHA 256 hash of the hexadecimal.
4)   Convert this SHA 256 hex digest to a binary number and take the first 4 bits of this binary number output as the checksum.
5)   Append the checksum identified in step 4 to the entropy from step 1 and deconstruct the 132 bits into 12 groupings of 11 bits to get the BIP 39 12 word lists.

Alternatively I imagine I could simply roll a 16-sided die to get 32 unique hex values and skip steps 1 and 2 but would need to add a step between 4 and 5 above to convert the hex I rolled into binary to append the checksum.

That all said, can anyone here give me some insight with how I would perform steps 2 and 3 on a windows box (ideally offline)?


legendary
Activity: 2380
Merit: 5213
The correct checksum is 0001, so the last word is 11111000001.
The correct checksum is 1001 and the last 11 bits are 11111001001.
I think you made a typo, because the last word is still "Weird" and your final result is correct.
legendary
Activity: 4466
Merit: 3391
1)   I first generated a random 128 bit entropy as such:

11110010101100010111001111000101110101011010101011111111111010111011100000000100001001011111111101011111111000100000010101111100
2)   I next performed a hash of the entropy by saving it in a notepad.txt file then performing the following command:  certutil -hashfile test.txt SHA256

3)   The resulting hash is: bc4f595b36de2533832a47bf66535612688d81594449693bed9414180ab7cad4

4)   The first 4 bits of the hash would be 1011.  This is my understanding as I believe that when converting from hexadecimal to binary you must always represent each binary value with four bits.  In this example, b is converted to binary as 1011.  

The correct checksum is 0001, so the last word is 11111000001. The phrase is verify merit vapor prize quiz volume theme lucky young yellow life weird

Everything you did looks OK, except that you cannot use Notepad to create the file being hashed, because it stores a text version and not the binary itself. If you save a hex value instead of binary with Notepad, you may be able to use "CertUtil -decodehex ..." to convert to binary for the sha256 calculation.
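
As an alternative to Notepad, a short Python sketch can write the entropy to disk as raw bytes, so that `certutil -hashfile` hashes the right content. The function names and the file name `entropy.bin` are assumptions chosen for this example.

```python
import hashlib

def write_entropy_file(hex_string, path):
    """Write the bytes represented by hex_string to path and return them."""
    raw = bytes.fromhex(hex_string)
    with open(path, "wb") as handle:  # "wb": no text encoding, no newline
        handle.write(raw)
    return raw

def sha256_of_file(path):
    """SHA-256 of the file contents, the same input CertUtil -hashfile reads."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()
```

For example, after `write_entropy_file("F2B173C5D5AAFFEBB80425FF5FE2057C", "entropy.bin")` the file is exactly 16 bytes long, and `certutil -hashfile entropy.bin SHA256` should report the same digest as `sha256_of_file("entropy.bin")`.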

You can use this site to check your results: https://iancoleman.io/bip39/
legendary
Activity: 2268
Merit: 18748
(BIP 39 word "west")
As hosseinimr93 has pointed out, your checksum is incorrect. The correct final word should be "weird", not "west".

So if I understand it right then, the only requirement for a valid 12th word for this 12 word BIP 39 phrase would have to contain 1011 at the end of their bit pattern.  That would mean that in addition to the BIP 39 word "west" that I chose two other options could have been either  “earth” number 555 decimal / 1000101011 binary and also the word “maximum” number 1099 / binary 10001001011  Is this correct?
Ignoring the fact you calculated the checksum incorrectly, your understanding here is wrong. There is exactly one word ("weird") which will be a valid final word for the 128 bits of entropy you have selected. There will be other words you could replace "weird" with and still have a valid 12 word seed phrase, but given that the last word contains 7 bits of entropy as well as 4 bits of checksum, then if you choose one of these other words then you will have a different 128 bits of entropy. Further, if you choose one of these other valid words, there is no guarantee that the 4 digit checksum would be the same given you are changing the entropy.

For example, the entropy you have given above encodes this seed phrase:
Code:
verify merit vapor prize quiz volume theme lucky young yellow life weird

This is also a valid seed phrase:
Code:
verify merit vapor prize quiz volume theme lucky young yellow life debris

Weird encodes the following:  1111100 1001
Debris encodes the following: 0011100 0011
The last 4 bits of each (set apart here) are the checksum.

Two different valid words, but with different entropy and different checksums.
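
The split between entropy bits and checksum bits in the last word can be sketched in Python as follows. The function names are hypothetical. `FIRST_121` holds the first 121 bits of the example entropy from this thread, taken from its hex form F2B173C5D5AAFFEBB80425FF5FE2057C.

```python
import hashlib

def split_last_word(word_bits):
    """Split the 11 bits of a 12th word into (7 entropy bits, 4 checksum bits)."""
    return word_bits[:7], word_bits[7:]

def last_word_is_valid(first_121_bits, word_bits):
    """Check the 4 checksum bits against SHA-256 of the full 128-bit entropy."""
    tail, checksum = split_last_word(word_bits)
    entropy = int(first_121_bits + tail, 2).to_bytes(16, "big")
    expected = format(hashlib.sha256(entropy).digest()[0] >> 4, "04b")
    return checksum == expected

# First 121 bits of the example entropy (hex F2B173C5D5AAFFEBB80425FF5FE2057C).
FIRST_121 = format(int("F2B173C5D5AAFFEBB80425FF5FE2057C", 16), "0128b")[:121]
```

Calling `last_word_is_valid(FIRST_121, "11111001001")` checks "weird". Passing the bits of any other word changes the 7 entropy bits, so the hash is recomputed over different entropy each time.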
legendary
Activity: 2380
Merit: 5213
3)   The resulting hash is: bc4f595b36de2533832a47bf66535612688d81594449693bed9414180ab7cad4
Your calculation is wrong.
You need to pass your entropy to the SHA256 function as hex input, not as text.

First, you need to convert your entropy to a hexadecimal number.
The result is F2B173C5D5AAFFEBB80425FF5FE2057C.

The hex number needs to be hashed through the SHA256 function.
The result is 931258d717865a310cfc24a9161b21f4c0d02e0bb4cf12894516170a10e72339

If you convert the result to a binary number, the first 4 bits would be 1001
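
These three steps can be reproduced offline with Python's hashlib. The sketch below uses a hypothetical helper name. It assumes the entropy is supplied as a hex string and takes the first ENT/32 bits of the digest, which is 4 bits for 128-bit entropy.

```python
import hashlib

def bip39_checksum_bits(entropy_hex):
    """Return the BIP39 checksum (ENT/32 bits) as a string of 0s and 1s."""
    entropy = bytes.fromhex(entropy_hex)
    checksum_len = len(entropy) * 8 // 32  # 4 bits for 128-bit entropy
    digest = hashlib.sha256(entropy).digest()
    digest_bits = "".join(format(byte, "08b") for byte in digest)
    return digest_bits[:checksum_len]
```

`bytes.fromhex` is what turns the hex digits into the 16 raw bytes that are actually hashed, so `bip39_checksum_bits("F2B173C5D5AAFFEBB80425FF5FE2057C")` hashes the entropy itself and not its text representation.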
legendary
Activity: 3472
Merit: 10611
Would you be able to give me an idea how I could perform the checksum on a windows box for my entropy example?
Sorry, I have no idea.

Quote
My (apparently mis) understanding from the previous replies was that you take the SHA 256 digest of the 128 bit entropy then use the first 4 bits of that as the checksum occupying the last four bits of the 12th word.
That part is correct. The misunderstanding is after you computed and appended the checksum to the end and when you start changing your entropy.
member
Activity: 104
Merit: 120
Thank you for the reply. I guess I must have misunderstood some of the previous replies. Would you be able to give me an idea how I could perform the checksum on a windows box for my entropy example? My (apparently mis) understanding from the previous replies was that you take the SHA 256 digest of the 128 bit entropy then use the first 4 bits of that as the checksum occupying the last four bits of the 12th word. In this case the first hexadecimal value from said SHA 256 digest was b and when converting b hex into binary it's 1011 which I appended to the end of the original 128 bit entropy. Perhaps I'm not calculating the checksum correctly? Thanks.
legendary
Activity: 3472
Merit: 10611
The original word-list(s) are found here:
https://github.com/bitcoin/bips/blob/master/bip-0039/bip-0039-wordlists.md

Quote
*The 12th word can have several different possible words as all that needs to be present in the last word is the four bits of the 11 bit pattern for the 12th word.
~
So if I understand it right then, the only requirement for a valid 12th word for this 12 word BIP 39 phrase would have to contain 1011 at the end of their bit pattern.  That would mean that in addition to the BIP 39 word "west" that I chose two other options could have been either  “earth” number 555 decimal / 1000101011 binary and also the word “maximum” number 1099 / binary 10001001011  Is this correct?
That's not how it works.
The last 4 bits are the checksum of the 128-bit entropy, not arbitrary bits. This means that if you change even a single bit inside the 128-bit entropy, the 4-bit checksum will usually change as well.

I think you misunderstood the previous comments. They are talking about collision. If choosing "maximum" instead of "west" gives you a correct mnemonic, you are manually brute forcing the words to find a collision. In which case it is not just about the last word, you can change any other bit inside the 128-bit entropy. For example you could change the 5th word and still have the same last word (and same other 10 words).
member
Activity: 104
Merit: 120
Hi everyone and thank you for the excellent feedback.  Just to be sure I understand things properly, I’ve gone ahead and outlined my understanding in a step by step write up for an example of how I believe that one could calculate their 12 word BIP 39 seed.  Please let me know if I got this correct.


A few key points I took away from you all are:

* You need 128 bits of entropy for BIP39 for a 12 word seed phrase

* Each BIP 39 word has an 11 bit code (earth = # 555 or 1000101011 in binary) that I believe is located here: https://github.com/hatgit/BIP39-wordlist-printable-en

* The 128 bits of entropy for BIP39 also require an additional 4 bits for a checksum in the 12th word.  This checksum is placed in the last 4 bits of the 11 bit word.

* To obtain the additional 4 bits for the checksum you need to perform a SHA256 hash on the 128 bits of entropy and then take the first 4 bits of this hash and append them to the 128 bits, which gives you a total of 132 bits.

* Once these 132 bits have been created, you then deconstruct them into 12 groupings of 11 bits and then identify the valid BIP 39 words that correlate to their bit patterns.

* The 12th word can have several different possible words as all that needs to be present in the last word is the four bits of the 11 bit pattern for the 12th word.

With all that said, here is what I did to confirm my understanding of the above.  Please let me know if there are any obvious errors. Note that this is just an example entropy and nothing I will ever use to generate my own seed.  Thank you all for your help!

1)   I first generated a random 128 bit entropy as such:

11110010101100010111001111000101110101011010101011111111111010111011100000000100001001011111111101011111111000100000010101111100

2)   I next performed a hash of the entropy by saving it in a notepad.txt file then performing the following command:  certutil -hashfile test.txt SHA256

3)   The resulting hash is: bc4f595b36de2533832a47bf66535612688d81594449693bed9414180ab7cad4

4)   The first 4 bits of the hash would be 1011.  This is my understanding as I believe that when converting from hexadecimal to binary you must always represent each binary value with four bits.  In this example, b is converted to binary as 1011. 

5)   Next I appended the 4 bits derived from the first character of the hexadecimal hash value, converted as follows: ENT+CS = 111100101011000101110011110001011101010110101010111111111110101110111000000001000010010111111111010111111110001000000101011111001011

6)   Divide the resulting 132 bits into the following lists:

11110010101
10001011100
11110001011
10101011010
10101111111
11110101110
11100000000
10000100101
11111111010
11111111000
10000001010
11111001011 (BIP 39 word "west")
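
The grouping in step 6 can be sketched in Python as below. The function names are hypothetical. The code only splits whatever bits it is given and converts each group to a zero-based word index, so it does not check whether the appended checksum is correct.

```python
def split_into_groups(bits, size=11):
    """Cut a bit string into consecutive groups of `size` bits."""
    bits = "".join(bits.split())  # tolerate spaces from copy and paste
    if len(bits) % size != 0:
        raise ValueError("length is not a multiple of the group size")
    return [bits[i:i + size] for i in range(0, len(bits), size)]

def group_to_index(group):
    """Zero-based position of the word in the BIP39 list."""
    return int(group, 2)
```

A 132-bit input yields 12 groups. Any input whose length is not a multiple of 11 raises an error, which catches a miscounted bit string early.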

So if I understand it right then, the only requirement for a valid 12th word for this 12 word BIP 39 phrase would have to contain 1011 at the end of their bit pattern.  That would mean that in addition to the BIP 39 word "west" that I chose two other options could have been either  “earth” number 555 decimal / 1000101011 binary and also the word “maximum” number 1099 / binary 10001001011  Is this correct?

Also with respect to the way I computed the hash of the 128 bits, I did the following:  I entered all the 1s and 0s into a notepad file and saved it with a .txt extension.  I then ran CertUtil on said file, which gave me the above SHA256 digest. Does this produce the correct hash of the binary stream?  I’m not sure if I did this correctly. Thank you.
legendary
Activity: 2268
Merit: 18748
When you consider 11 fixed words and randomly selecting the 12th word, then yes, the numbers become exact rather than averages, as for any given first 7 bits (not 8 as you have used) of the last word then there is exactly 1 combination of the last 4 bits which is valid.

When approaching the problem from OP's point of view of randomly selecting words and hoping for a valid seed phrase then it becomes an average as if you were to take a 12 word seed phrase and cycle through all possibilities for the first word (for example) there is no guarantee that you would end up with 128 valid seed phrases, due to the unpredictable nature of the checksum.
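
The difference between the two cases can be explored with a small Python experiment. This is a sketch with hypothetical function names. Word positions and word numbers are zero-based, so position 11 is the 12th word.

```python
import hashlib

def is_valid_12_word_phrase(indices):
    """indices: twelve zero-based word numbers (0..2047)."""
    bits = "".join(format(i, "011b") for i in indices)
    entropy = int(bits[:128], 2).to_bytes(16, "big")
    expected = format(hashlib.sha256(entropy).digest()[0] >> 4, "04b")
    return bits[128:] == expected

def count_valid_substitutions(indices, position):
    """Try all 2048 words at `position` and count the valid phrases."""
    trial = list(indices)
    valid = 0
    for candidate in range(2048):
        trial[position] = candidate
        valid += is_valid_12_word_phrase(trial)
    return valid
```

With `position=11` the count is always 128. With any other position the result depends on the hashes involved and is only 128 on average.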
legendary
Activity: 2380
Merit: 5213
There are (on average) 128 words which will be a valid checksum for a 12 word seed phrase. It is 8 words (on average) for 24 word seed phrases.
Thanks for the correction. I edited that post.
But isn't that exactly 128 words for the 12 word seed phrase and exactly 8 words for the 24 word seed phrase?

Let's say I have the first 11 words of a 12 word seed phrase and the last word is unknown.
There are 128 possibilities for the first 7 bits of the last word and 16 possibilities for its last 4 bits (corrected from 256 and 8 bits).

There's a 1/128 chance that the first 7 bits of the word I choose are 0000000.
There's a 1/128 chance that the first 7 bits of the word I choose are 0000001.
There's a 1/128 chance that the first 7 bits of the word I choose are 0000010.
.......
.......
.......


If the first 7 bits are 0000000, there's 1 possibility for the last 4 bits that make the seed phrase valid. The chance is 1/16.
If the first 7 bits are 0000001, there's 1 possibility for the last 4 bits that make the seed phrase valid. The chance is 1/16.
If the first 7 bits are 0000010, there's 1 possibility for the last 4 bits that make the seed phrase valid. The chance is 1/16.
......
......
......



Therefore the chance of having a valid BIP39 seed phrase is always 1/16 (128 out of 2048 words)
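
The counting argument can be checked by brute force. The Python sketch below uses a hypothetical function name. It takes the bits of all words except the last and lists every zero-based word index that completes a valid phrase, one per possible value of the missing entropy bits.

```python
import hashlib

def valid_last_word_indices(first_bits, total_words=12):
    """List every zero-based index that completes the phrase validly.

    first_bits holds the bits of all words except the last:
    121 bits for 12 words, 253 bits for 24 words.
    """
    entropy_len = total_words * 32 // 3      # 128 or 256 bits
    checksum_len = entropy_len // 32         # 4 or 8 bits
    missing = entropy_len - len(first_bits)  # 7 or 3 entropy bits
    valid = []
    for tail in range(2 ** missing):
        entropy_bits = first_bits + format(tail, f"0{missing}b")
        entropy = int(entropy_bits, 2).to_bytes(entropy_len // 8, "big")
        digest = hashlib.sha256(entropy).digest()
        checksum = int.from_bytes(digest, "big") >> (256 - checksum_len)
        valid.append((tail << checksum_len) | checksum)
    return valid
```

For 12 words the list always has 128 entries, and for 24 words it always has 8, whatever the earlier words are.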
legendary
Activity: 2268
Merit: 18748
Anyway, if you have the first 11 words and you want to have valid BIP39 seed phrase, there are 8 words that can be used as the 12th word.
There are (on average) 128 words which will be a valid checksum for a 12 word seed phrase. It is 8 words (on average) for 24 word seed phrases.

Specifically what I'm trying to do is print out a list of the 2048 bip39 words and randomly select 12 to create my own offline generated seed.
Don't do this! It is an incredibly insecure method of generating a seed phrase. You will not and can not choose words randomly, despite your best efforts. Humans are not random. Whatever seed phrase you end up with at the end of this process will not represent 128 bits of entropy.

I'm trying to ensure true randomness in seed creation and this seems to be the only way I can come up with outside of being able to independently verify the code from wallet manufacturers etc.
Do not select words. Instead, flip a fair coin 128 times to create your entropy, calculate and append the 4 bit checksum, and then encode that 132 bit number into the corresponding words. For each 11 bit section, convert it to decimal and then add 1 before looking up the word on a BIP39 word list that is numbered from 1 (the first word, abandon, is 00000000000).
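
The whole procedure fits in a short Python sketch. The function name is hypothetical. Treating heads as 1 and tails as 0 is an assumption made for this example, and the flips are expected as a 128-character string of H and T. The result is a list of twelve word numbers to look up on a printed list.

```python
import hashlib

def flips_to_word_numbers(flips, one_based=True):
    """Turn 128 coin flips ('H'/'T') into twelve word numbers.

    Assumption for this sketch: heads = 1, tails = 0.
    """
    if len(flips) != 128 or set(flips) - {"H", "T"}:
        raise ValueError("expected exactly 128 flips written as H or T")
    bits = "".join("1" if flip == "H" else "0" for flip in flips)
    entropy = int(bits, 2).to_bytes(16, "big")
    checksum = format(hashlib.sha256(entropy).digest()[0] >> 4, "04b")
    bits += checksum  # 132 bits in total
    numbers = [int(bits[i:i + 11], 2) for i in range(0, 132, 11)]
    # Add 1 when the printed list is numbered from 1 instead of 0.
    return [n + 1 for n in numbers] if one_based else numbers
```

Set `one_based=False` if the word list in use numbers its first word as 0.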
legendary
Activity: 4466
Merit: 3391
1) How does one identify the corresponding bit pattern from the BIP 39 word list?  Is it as simple as finding the full BIP 39 word list and then the patterns are in alphabetical order? For example, would I be correct to assume that the first word alphabetically on the BIP 39 list is abandon and so the 11 bit pattern would be 00000000001, whereas the second word alphabetically is ability, which should correlate to 00000000010?

The words in the BIP-39 word lists are in a specific order, but I wouldn't depend on them being in alphabetical order. And, as hosseinimr93 pointed out, the first word is 0. Here are the "official" lists: BIP 39 Word Lists

2) Do you know an easy way to identify the SHA 256 hash of a 128 bit stream offline on a Windows PC or an Android device?

Most languages have cryptography libraries on Windows and Android that include a variety of hash calculations. Windows has the CertUtil command, if that is what you are looking for.
legendary
Activity: 2380
Merit: 5213
1) How does one identify the corresponding bit pattern from the BIP 39 word list?  Is it as simple as finding the full BIP 39 word list and then the patterns are in alphabetical order? For example, would I be correct to assume that the first word alphabetically on the BIP 39 list is abandon and so the 11 bit pattern would be 00000000001, whereas the second word alphabetically is ability, which should correlate to 00000000010?
Yes. Just take note that the first word (abandon) represents 00000000000 and the second word (ability) represents 00000000001.
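
The mapping between a word's position and its 11-bit pattern is a plain binary conversion. A Python sketch with hypothetical helper names follows. It covers both zero-based indices and printed lists whose first word is numbered 1.

```python
def index_to_bits(index):
    """11-bit pattern for a zero-based word index (abandon = 0)."""
    if not 0 <= index <= 2047:
        raise ValueError("BIP39 indices run from 0 to 2047")
    return format(index, "011b")

def bits_to_index(bits):
    """Zero-based word index for an 11-bit pattern."""
    if len(bits) != 11 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 11 binary digits")
    return int(bits, 2)

def line_number_to_bits(line_number):
    """Same, for a printed list whose first word is numbered 1."""
    return index_to_bits(line_number - 1)
```

So `index_to_bits(0)` gives the pattern for abandon and `index_to_bits(1)` the pattern for ability.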


2) Do you know an easy way to identify the SHA 256 hash of a 128 bit stream offline on a Windows PC or an Android device?
If you are familiar with Python programming, you can use the hashlib library.
member
Activity: 104
Merit: 120
Excellent reply!  I do have a few questions for you though:

1) How does one identify the corresponding bit pattern from the BIP 39 word list?  Is it as simple as finding the full BIP 39 word list and then the patterns are in alphabetical order? For example, would I be correct to assume that the first word alphabetically on the BIP 39 list is abandon and so the 11 bit pattern would be 00000000001, whereas the second word alphabetically is ability, which should correlate to 00000000010?

2) Do you know an easy way to identify the SHA 256 hash of a 128 bit stream offline on a Windows PC or an Android device?

Thank you very much for the excellent reply!