
Topic: Could the BIP39 word list be completely replaced? (Read 467 times)

member
Activity: 239
Merit: 59
a young loner on a crusade
A desired feature of a defined wordlist is interoperability.
The main feature of a wordlist is to prevent writing mistakes. Words are easier to reproduce than long numbers.
"Planet" is on the list, but "plane" isn't. Don't mistake it for "plain". "Brake" and "break" could lead to mistakes, and there are many more similar sounding or looking words. If the BIP39 list had 30,000 words, spotting writing mistakes would become much more difficult.
legendary
Activity: 3038
Merit: 4418
Crypto Swap Exchange
No, he is right. You can indeed use the word "hello" as an entire seed phrase if you want. Obviously it doesn't follow the BIP39 protocol in terms of length, entropy, checksum, etc., but you can indeed ignore all that, feed "hello" into the PBKDF2 algorithm, and generate a wallet. In fact, someone has done that already. Using the string "hello" as a BIP39 seed phrase, you can generate the following address at m/44'/0'/0'/0/1:

19ag68hqdbjwC2cLDZs5HRrxRCm4ETr2Wb

This address was used back in 2017.
Yeah, I was confused about the original post, as I was under the impression that it was talking about the entropy being used rather than the actual seed. You can also generate an insecure seed from any passphrase by hashing it to produce entropy of the appropriate size.
hero member
Activity: 714
Merit: 1010
Crypto Swap Exchange
I wonder whether, and how many of, the insecure brainwallet fails have been revived as initial entropy for BIP39 mnemonic words and wallet derivation. Not that this would make them any more secure...

DO NOT use any publicly known words or sentences as input to SHA256 and the result as a private key or as initial entropy for BIP39. You will lose coins...
legendary
Activity: 2268
Merit: 18711
No, that is not what that post meant.
No, he is right. You can indeed use the word "hello" as an entire seed phrase if you want. Obviously it doesn't follow the BIP39 protocol in terms of length, entropy, checksum, etc., but you can indeed ignore all that, feed "hello" into the PBKDF2 algorithm, and generate a wallet. In fact, someone has done that already. Using the string "hello" as a BIP39 seed phrase, you can generate the following address at m/44'/0'/0'/0/1:

19ag68hqdbjwC2cLDZs5HRrxRCm4ETr2Wb

This address was used back in 2017.
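
For anyone who wants to see the step being described, here is a minimal sketch in Python of feeding an arbitrary string into PBKDF2 the way BIP39 does (HMAC-SHA512, 2048 rounds, salt "mnemonic" plus an optional passphrase). The function name bip39_seed is my own, and this only produces the 64-byte seed; deriving the address at m/44'/0'/0'/0/1 needs a BIP32 implementation on top, which is not shown here.

```python
import hashlib
import unicodedata


def bip39_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Stretch any mnemonic string into a 64-byte seed (BIP39 style)."""
    # BIP39 normalises both inputs with NFKD before hashing.
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    # PBKDF2 with HMAC-SHA512, 2048 iterations, 64 bytes of output.
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


# No wordlist or checksum is involved at this stage, so "hello" is accepted.
seed = bip39_seed("hello")
print(len(seed), seed.hex())
```

Note that nothing in this step looks at the wordlist, which is why a non-standard phrase still "works" at the protocol level even though most wallets refuse it.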
legendary
Activity: 3038
Merit: 4418
Crypto Swap Exchange
Yes, you can actually use any random words you like to create a seed phrase, but most wallets that use the BIP39 standard will not be able to recover that wallet. As for the checksum, it can easily be solved by computing it and then picking the final word that matches. But it is a bad idea to manually generate your own seed phrase, because it doesn't have the randomness that your wallet or machine would produce.
No, that is not what that post meant. Sufficient entropy is required to generate the mnemonic using BIP39. The acceptable size of the entropy is between 128 and 256 bits, and hence you can actually just use the SHA256 hash of "hello" as the entropy, generating a rather insecure mnemonic.

To calculate the addresses (as well as the corresponding private key), we use a KDF on the mnemonic to come up with the seed. For example, using SHA256 of "hello" as the entropy yields:
Quote
stuff media welcome miracle hair crowd confirm cloud exhibit dust pigeon sauce gym copy truth salad dirt scissors sunny about cable wing opinion cheap
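
A rough sketch of that procedure in Python, for illustration only: hash the string, treat the 32-byte digest as entropy, append the 8-bit checksum, and cut the result into 11-bit groups. It prints word indices rather than words because the wordlist file is not included here, and whether the result lines up with the phrase quoted above depends on the exact bytes that were hashed (for example, a trailing newline changes everything), which I am only assuming.

```python
import hashlib

# Assumption: the entropy is SHA256 over the five ASCII bytes of "hello".
entropy = hashlib.sha256(b"hello").digest()      # 32 bytes = 256 bits
checksum_bits = len(entropy) * 8 // 32           # 8 bits for 256-bit entropy
checksum_hash = hashlib.sha256(entropy).digest()

# Append the leading checksum bits to the entropy as one big integer.
total_bits = len(entropy) * 8 + checksum_bits
value = (int.from_bytes(entropy, "big") << checksum_bits) | (
    checksum_hash[0] >> (8 - checksum_bits)
)

# Split into 11-bit groups, most significant group first.
indices = [
    (value >> (total_bits - 11 * (i + 1))) & 0x7FF
    for i in range(total_bits // 11)
]

print(entropy.hex())
print(indices)  # 24 numbers, each an index into a 2048-word list
```

Each index would then be looked up in the chosen 2048-word list to get the 24 words.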
hero member
Activity: 868
Merit: 952

This also means you can use any random garbage as a seed input for BIP39 wallets - you can use, and someone already has used, stuff like "hello" as the seed phrase, and it works fine. Though most wallets will probably complain and/or refuse to let you use something that doesn't follow the standard wordlist and checksum format.

Yes, you can actually use any random words you like to create a seed phrase, but most wallets that use the BIP39 standard will not be able to recover that wallet. As for the checksum, it can easily be solved by computing it and then picking the final word that matches. But it is a bad idea to manually generate your own seed phrase, because it doesn't have the randomness that your wallet or machine would produce.
full member
Activity: 161
Merit: 230
One thing that I haven't seen mentioned that is worth pointing out is that the sequence of words is the actual seed input used in the HD wallet calculations - the 128 or 256 bits you start with is just a way to create the string of words, the bits are not the HD seed in itself.

This also means you can use any random garbage as a seed input for BIP39 wallets - you can use, and someone already has used, stuff like "hello" as the seed phrase, and it works fine. Though most wallets will probably complain and/or refuse to let you use something that doesn't follow the standard wordlist and checksum format.
legendary
Activity: 3038
Merit: 4418
Crypto Swap Exchange
Even in the case of a computer with more than 500 qubits?
Quantum computing doesn't affect private key generation or address generation, because the speedup against asymmetric cryptography is far larger than the one against symmetric cryptography. As such, it would be reasonable to assume that QC can potentially crack Bitcoin addresses only after the public key has been exposed, not just the hash and certainly not during the generation of addresses. It wouldn't matter how many words are in your wordlist or what kind of generation you're using.
hero member
Activity: 560
Merit: 1060
When it comes to quantum computing, the cryptography used in Bitcoin is obsolete; it needs a new, different kind of cryptography called quantum-proof cryptography!

This is when Bitcoin will need to have a serious fork in order to make it bullet-proof against quantum computing threats. But in this case, cryptography will face issues as a whole. It won't be bitcoin only that will need to change.
legendary
Activity: 2576
Merit: 1248
This means that even if you create an entropy of 2000 bits to produce a private key, someone can still find the same key by attacking the elliptic curve behind ECDSA directly, without touching the seed phrase at all.

Apart from the fact that seed phrases are used for a larger number of private keys!




Even in the case of a computer with more than 500 qubits?

When it comes to quantum computing, the cryptography used in Bitcoin is obsolete; it needs a new, different kind of cryptography called quantum-proof cryptography!
member
Activity: 74
Merit: 27
The question is, is there a necessity for a different word list? Simply no, because this BIP wordlist is safe and there is nothing to worry about.


Even in the case of a computer with more than 500 qubits?
hero member
Activity: 560
Merit: 1060

You explained it well with arguments but people still can't understand. They follow very primitive logic that more is better, they can't understand that the whole Thesaurus and a tiny BIP wordlist, both of them are equally safe for generating 12 or 24 words seed phrase. To be frank, no one ever had a problem with it, no one's wallet has ever been hacked by bruteforcing seed phrases and I don't really understand why are people looking for solutions for a problem that doesn't exist.

Maybe people don't understand, maybe people are lazy to dig into the documentation of how BIP39 actually works. I don't know what exactly the problem is. You can read and try to understand BIP39 at various places where it's explained. For me a nice spot is here: https://learnmeabitcoin.com/technical/mnemonic, it's visual, some explanatory code for code nerds and lots of details and links.
Who doesn't understand that?


It's certainly laziness, I can tell you from personal examples that people don't want to bother learning new stuff.

learnmeabitcoin.com is indeed a great website. Personally I love reading "Mastering Bitcoin" by A. Antonopoulos, which is one of the best books I have ever read, but it is a bit more difficult.

Then again, anyone can read both, but the question is, are they willing to do so?

Most people think what Synchronice says, that the more words, the more secure the phrase is. I would give a point, though, to those who don't know what a bit is in computer science. Unfortunately (and fortunately too) Bitcoin requires basic computer knowledge. But I see this as a strong positive of Bitcoin.

Bitcoin is an incredible amalgam of Cryptography, Math and Computer Science. But luckily we are here to help and be helped.
legendary
Activity: 2268
Merit: 18711
Correct. You can see the subset of words from that list used here in Electrum version 1.1: https://github.com/spesmilo/electrum/blob/3760486a6a9279ffbd852f0be43c8f7a823a9427/lib/mnemonic.py#L23

Since the wordlist was only 1626 words, would this not weaken anything generated by the early 2011 version of Electrum?
No, it didn't. The seed phrases were still 128 bits of entropy, and there was no checksum. 12 words from a list of 1626 gives 128.005 bits. 1626 is precisely the minimum number of words the wordlist would need for 12 words to give a minimum of 128 bits; 1625 words gives 127.99 bits. Once you add the 4 bit checksum as BIP39 did and you want to encode 132 bits, then your wordlist needs to expand to 2048.

Here is the new_seed function from Electrum version 1.1 which as you can see generates a random 128 bit number: https://github.com/spesmilo/electrum/blob/3760486a6a9279ffbd852f0be43c8f7a823a9427/lib/wallet.py#L338

The size of the wordlist doesn't make the security. Fewer words in the wordlist means you need more of the words to represent your chunk of entropy and vice versa.
Exactly this. You could technically have a word list with only two words, it's just that your seed phrase would end up being 132 words long.
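
The arithmetic above is easy to check yourself. Below is a small Python sketch (the helper names are mine, not from Electrum or BIP39) that finds the smallest wordlist able to carry a given number of bits in a given number of words, and the number of words needed for a given list size. It uses exact integer comparisons so floating point rounding doesn't matter near the boundary.

```python
import math


def min_wordlist_size(bits: int, words: int) -> int:
    """Smallest list size n such that n**words >= 2**bits."""
    n = 2
    while n ** words < 2 ** bits:
        n += 1
    return n


def words_needed(bits: int, list_size: int) -> int:
    """Fewest words from a list of list_size that can encode `bits` bits."""
    count = 1
    while list_size ** count < 2 ** bits:
        count += 1
    return count


print(12 * math.log2(1626))        # just over 128 bits
print(12 * math.log2(1625))        # just under 128 bits
print(min_wordlist_size(128, 12))  # 12 words, no checksum
print(min_wordlist_size(132, 12))  # 12 words, 128 bits + 4-bit checksum
print(words_needed(132, 2))        # a two-word "wordlist"
```

The same functions show the trade-off in both directions: shrink the list and the phrase gets longer, grow the list and it gets shorter, while the entropy being encoded stays the same.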
hero member
Activity: 714
Merit: 1010
Crypto Swap Exchange
Since the wordlist was only 1626 words, would this not weaken anything generated by the early 2011 version of Electrum?

Something I have been looking at for some time now but never really seen it brought up in any of the topics around wordlists.

I don't think it matters, as long as the entropy that is encoded or represented by the mnemonic words is as random as possible and isn't bad, i.e. generated in a weak manner that would allow some sort of successful attack.

We don't have time, money or energy on this planet to break good random 128 bits of a wallet's initial entropy by brute-forcing it (there's no other way than that). And for a 256bit entropy it's not going to be easier, for sure.
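
To put a rough number on that, here is a back-of-the-envelope sketch in Python. The guessing rate of one trillion attempts per second is purely my assumption for illustration, not a measured figure; plug in whatever rate you find plausible.

```python
# Assumed attacker speed: 10**12 guesses per second (illustrative only).
GUESSES_PER_SECOND = 10**12
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_to_search(bits: int, rate: int = GUESSES_PER_SECOND) -> float:
    """Years needed to try half of a 2**bits keyspace (the average case)."""
    return (2 ** (bits - 1)) / rate / SECONDS_PER_YEAR


for bits in (128, 256):
    print(f"{bits} bits: about {years_to_search(bits):.2e} years")
```

Even if the assumed rate is off by many orders of magnitude, the result for 128 bits stays far beyond any practical time scale.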

In the end it doesn't really matter how you represent this random entropy by any wordlist, as long as you can recreate the entropy from your mnemonic words in an unambiguous way. What matters is standardisation if you want interoperability between different wallets.


You explained it well with arguments but people still can't understand. They follow very primitive logic that more is better, they can't understand that the whole Thesaurus and a tiny BIP wordlist, both of them are equally safe for generating 12 or 24 words seed phrase. To be frank, no one ever had a problem with it, no one's wallet has ever been hacked by bruteforcing seed phrases and I don't really understand why are people looking for solutions for a problem that doesn't exist.

Maybe people don't understand, maybe people are lazy to dig into the documentation of how BIP39 actually works. I don't know what exactly the problem is. You can read and try to understand BIP39 at various places where it's explained. For me a nice spot is here: https://learnmeabitcoin.com/technical/mnemonic, it's visual, some explanatory code for code nerds and lots of details and links.
Who doesn't understand that?

What people have to understand is the importance of good randomness of the initial entropy, which is represented by some well-defined procedure and a standardized (for interoperability!) wordlist. The size of the wordlist doesn't make the security. Fewer words in the wordlist means you need more of the words to represent your chunk of entropy and vice versa.
legendary
Activity: 2870
Merit: 7490
Crypto Swap Exchange
I have always been interested in the wordlist used by an early version of Electrum.

The list had 1626 words, and it relates to US patent no. 5892470, where each word does not represent a fixed digit.
Instead, the digit represented by a word is variable, it depends on the previous word.

I'm sure it used the list from http://en.wiktionary.org/wiki/Wiktionary:Frequency_lists/Contemporary_poetry

Since the wordlist was only 1626 words, would this not weaken anything generated by the early 2011 version of Electrum?

Something I have been looking at for some time now but never really seen it brought up in any of the topics around wordlists.

Some time ago I tried running Electrum 0.3, where it generates a 128-bit seed, so assuming it's generated with a secure RNG it should be plenty secure. Although I never checked how Electrum converts the 128-bit seed to 12 words.

I do not think this will be possible in the future. I do not think it is worth speculating on. But no one knows what will happen in the future.
Why won't that be possible? Just have BIP V1 and BIP V2 and that's all.

Small nitpick, you'll need a different BIP number rather than adding a version number. For example, Bech32 is listed under BIP 173 while Bech32m is listed under BIP 350.

hero member
Activity: 882
Merit: 792
Watch Bitcoin Documentary - https://t.ly/v0Nim
I do not think this will be possible in the future. I do not think it is worth speculating on. But no one knows what will happen in the future.
Why won't that be possible? Just have BIP V1 and BIP V2 and that's all. The question is, is there a necessity for a different word list? Simply no, because this BIP wordlist is safe and there is nothing to worry about.

Is it necessary, or is it better? No! The fact that your seed phrase is selected from a set of 2048 words makes it super secure, so anything higher is overkill.

But doesn't higher entropy mean more security? Yes, it does, but when we talk about bitcoin private keys, you can only get a maximum of 128 bits of security. This means that even if you create an entropy of 2000 bits to produce a private key, someone can still find the same key by attacking the elliptic curve behind ECDSA directly, without touching the seed phrase at all.

Therefore, I believe there is no need for larger seed phrases or more english words in them. We must focus on securing the backups properly and not on trying to increase security in this regard.
You explained it well with arguments but people still can't understand. They follow very primitive logic that more is better, they can't understand that the whole Thesaurus and a tiny BIP wordlist, both of them are equally safe for generating 12 or 24 words seed phrase. To be frank, no one ever had a problem with it, no one's wallet has ever been hacked by bruteforcing seed phrases and I don't really understand why are people looking for solutions for a problem that doesn't exist.
hero member
Activity: 1220
Merit: 612
OGRaccoon
I have always been interested in the wordlist used by an early version of Electrum.

The list had 1626 words, and it relates to US patent no. 5892470, where each word does not represent a fixed digit.
Instead, the digit represented by a word is variable, it depends on the previous word.

I'm sure it used the list from http://en.wiktionary.org/wiki/Wiktionary:Frequency_lists/Contemporary_poetry

Since the wordlist was only 1626 words, would this not weaken anything generated by the early 2011 version of Electrum?

Something I have been looking at for some time now but never really seen it brought up in any of the topics around wordlists.

legendary
Activity: 2268
Merit: 18711
Please, can you let me know: is the BIP39 wordlist a univocal list or not?
Technically, no. Practically, pretty much yes.

BIP39 can work with any wordlist. There are multiple wordlists in different languages, and you could even create and use your own wordlist if you wanted (although you definitely shouldn't do this). But because of the way BIP39 works, if you don't know the wordlist used then you cannot verify the checksum of your seed phrase. So if you used a customized wordlist then you would not be able to verify your checksum and might not be able to recover your wallet in any other piece of software. Because of this, every BIP39 wallet uses one of the standardized wordlists, and the vast majority of BIP39 wallets stick to using the English wordlist for maximum compatibility, since you cannot move the same seed phrase between wordlists.
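
To make the checksum point concrete, here is a hedged Python sketch of why verification depends on the wordlist: the checksum is computed over bits, and the only way to get from words back to bits is each word's position in the list. The function names and the two made-up wordlists (a0000... and b0000...) are hypothetical and only there so the example runs without the real BIP39 files.

```python
import hashlib


def entropy_to_words(entropy: bytes, wordlist: list[str]) -> list[str]:
    """Encode entropy plus its checksum as words from the given list."""
    cs_bits = len(entropy) * 8 // 32
    total_bits = len(entropy) * 8 + cs_bits
    checksum = hashlib.sha256(entropy).digest()[0] >> (8 - cs_bits)
    value = (int.from_bytes(entropy, "big") << cs_bits) | checksum
    return [
        wordlist[(value >> (total_bits - 11 * (i + 1))) & 0x7FF]
        for i in range(total_bits // 11)
    ]


def checksum_ok(words: list[str], wordlist: list[str]) -> bool:
    """Verify the checksum; impossible if the words are not in this list."""
    index = {word: i for i, word in enumerate(wordlist)}
    if len(words) not in (12, 15, 18, 21, 24):
        return False
    if any(word not in index for word in words):
        return False
    value = 0
    for word in words:
        value = (value << 11) | index[word]
    cs_bits = len(words) * 11 // 33
    entropy_len = len(words) * 11 * 32 // 33 // 8
    entropy = (value >> cs_bits).to_bytes(entropy_len, "big")
    expected = hashlib.sha256(entropy).digest()[0] >> (8 - cs_bits)
    return (value & ((1 << cs_bits) - 1)) == expected


# Two hypothetical 2048-word lists standing in for real ones.
list_a = [f"a{i:04d}" for i in range(2048)]
list_b = [f"b{i:04d}" for i in range(2048)]

phrase = entropy_to_words(bytes(16), list_a)
print(checksum_ok(phrase, list_a))  # True
print(checksum_ok(phrase, list_b))  # False
```

A wallet that only ships the standard lists is in the position of list_b here: it has no way to map the custom words back to bits, so it cannot confirm the phrase is valid.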
legendary
Activity: 3038
Merit: 4418
Crypto Swap Exchange
are a little bit confusing to me.

Please, can you let me know: is the BIP39 wordlist a univocal list or not?

Thank you for your patience.
BIP39 has wordlists in quite a few different languages. You can find them in the BIP39 repository on GitHub: https://github.com/bitcoin/bips/commits/master/bip-0039. Notice how for each language, there is only one corresponding wordlist. Having different wordlists for the same language would introduce ambiguity, but it is fine to have multiple wordlists in different languages, as the language can be used to tell which wordlist to refer to.
hero member
Activity: 560
Merit: 1060

The device generates 128 bits and the output is 12 words (128 digits, 0 or 1), or 256 bits and the output is 24 words (256 digits, 0 or 1), and so on.
So the entropy's output is always a binary number, that can be 128 digit long or 256 digit long.
After this phase, the binary number has to be hashed, and the output will add 4 digits (128 becomes 132, 256 become 260), always taken between 0 and 1.


You are nearly there, but for 256 bits of initial entropy, after hashing it, you will keep the first 8 bits of the hashed value instead of 4 for the 128 bits.

So briefly:

128 bits of entropy + 4 bits checksum = 132 bits split into 12 segments of 11 bits = 12 words

256 bits of entropy + 8 bits checksum = 264 bits split into 24 segments of 11 bits = 24 words

Please refer to this link for more info https://github.com/bitcoinbook/bitcoinbook/blob/develop/ch05.asciidoc#mnemonic-code-words-bip-39
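
The same bookkeeping as a tiny Python sketch, in case it helps to see every allowed size at once. The function name is mine; the rule it encodes is the one summarised above: the checksum is one bit per 32 bits of entropy, and every word covers 11 bits.

```python
def mnemonic_layout(entropy_bits: int) -> tuple[int, int, int]:
    """Return (checksum_bits, total_bits, word_count) for an entropy size."""
    if entropy_bits % 32 != 0 or not 128 <= entropy_bits <= 256:
        raise ValueError("entropy must be 128-256 bits in steps of 32")
    checksum_bits = entropy_bits // 32
    total_bits = entropy_bits + checksum_bits
    return checksum_bits, total_bits, total_bits // 11


for ent in range(128, 257, 32):
    cs, total, words = mnemonic_layout(ent)
    print(f"{ent} bits of entropy + {cs} bits checksum = {total} bits = {words} words")
```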