
Topic: Why you cannot enter an arbitrary seed in Electrum - page 5. (Read 65082 times)

legendary
Activity: 1302
Merit: 1008
Core dev leaves me neg feedback #abuse #political
Electrum does not let you use an arbitrary sequence of words as seed. This is because humans are not good at generating really random phrases.

The seed generated by Electrum is a 128-bit random number. It is encoded as a sequence of 12 words, for the purpose of memorization. However, it is important to understand that it has 128 bits of entropy. A phrase generated by a human, or picked from a random book opened at a random page, will in general be much less random, and much more vulnerable to attacks (and "much more" here means astronomically more).

In this type of attack, time is on the side of the attacker. It is perfectly possible for an attacker to try all the phrases existing in a large database of books, and some variants of those, until they find a wallet. In contrast, it is not possible to do the same with 2^128 random phrases.

As you may have noticed, it is possible to bypass this protection; if you restore your wallet from a hexadecimal string, any string length will be accepted. However, this will only work with hexadecimal inputs. Thus, if you absolutely insist on using an arbitrary phrase as seed, you will need to hex-encode it yourself. Consider this as a protection.

So, I've been trying this and I'm a bit confused. I chose some arbitrary words, hex-encoded them, and entered that hex code as the seed. When I then viewed the seed in Electrum, it had become something like 45 words, or sometimes even 100 words. Is this normal?

(Even when I chose only words from the Electrum passphrase dictionary, it still re-encoded them: I chose 16 words and they became something like 60 words.)
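
A rough way to see why the word count grows, sketched in Python. It assumes the restored hex seed is encoded at the same ratio as a normal seed (128 bits become 12 words, so 32 bits become 3 words); `estimated_mnemonic_words` is a hypothetical helper, not an Electrum function.

```python
def estimated_mnemonic_words(phrase: str) -> int:
    """Estimate how many words a hex-encoded phrase turns into."""
    hex_seed = phrase.encode("utf-8").hex()  # two hex digits per byte
    bits = len(hex_seed) * 4                 # four bits per hex digit
    return bits * 12 // 128                  # assumed ratio: 12 words per 128 bits


# A stand-in phrase: 16 four-letter words separated by spaces.
sixteen_words = " ".join(["word"] * 16)
print(len(sixteen_words), estimated_mnemonic_words(sixteen_words))
```

Under that assumption, a 16-word phrase of about 80 characters comes out at roughly 60 words, which is in line with the numbers reported above: every character of the phrase is treated as 8 bits of seed, whether or not it adds any real randomness.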
hero member
Activity: 715
Merit: 500
Bitcoin Venezuela
I understand that the seed in number form is 128bits of entropy. But is the mnemonic 128 bits too? 12 words out of 1600 are 128bits of entropy? Just curious.
The randomness ("entropy") of one word chosen at random from a list of 1600 comes from its 1600 equally likely outcomes, and the number of bits in 1600 is log2(1600) = 10.644, because 2^10.644 = 1600. So in a string of words chosen at random from a list of 1600, each word contributes 10.644 bits of randomness (entropy). To get 128 bits of entropy you need 128/10.644 ≈ 12 words. 1600 is a pretty short list, which is why Electrum makes such long seeds.
Diceware uses a longer list of 7776 words. log2(7776) = 12.925; 128/12.925 = 9.9. 10 words in Diceware would give 129 bits of entropy.
I generated a spreadsheet with about 55,000 words by collecting Scrabble lists and pasting them into Excel. More entropy there. By using random numbers to select words, I can generate a strong passphrase which is short enough that I have some chance of remembering it. The Scrabble list also has some real oddball words. Makes for a memorable passphrase. The extremely commonplace vocabulary words in the Electrum and Diceware lists generate very bland phrases that are not memorable at all, besides just being very long.
Electrum should have gone with a much bigger word list.


Have you read this?

Quote
The main reason is that the encoding method used in RFC1751 collides with patent US5892470 A.
My personal opinion is of course that this patent is ridiculous and should never have been granted.
However, I did not want to take any risk, because if the site hosting the source code (github, gitorious)
receives a cease and desist letter, they will remove the project rather than hire a lawyer to defend the
case. And if someone decided to target Bitcoin software in general, this patent gives them a reason to
attack Electrum.
 
In order to circumvent that patent, I used a different encoding algorithm and a different dictionary.
Of course I could have changed only the encoding algorithm and kept the same dictionary, but that
would have been a terrible idea, because it means Electrum would have generated RFC 1751 valid
passphrases, but decodes these phrases differently. This would definitely have been considered as a bug.
 
Another reason not to use the same dictionary as in the RFC is that it contains mostly short words,
which are not good for long-term memorization. People often believe that short words are easier to
remember, because they confuse short-term and long-term memory. STM and LTM are separate functions,
that are performed in anatomically distinct parts of the brain (hippocampus and cortex, respectively).
It is true that sequences of short words are easier to store and recall in short term memory (Baddeley
et al 1975), but that does not make them good candidates for long term memory storage. In order
to store a list of words in long term memory, these words must be both familiar and salient (not too
common and with some semantic or emotional load). Another good thing that boosts memory is to
have words from different categories (eg verbs and nouns), as explained in this paper:
http://csjarchive.cogsci.rpi.edu/proceedings/2008/pdfs/p2183.pdf

 
This is why I used words from a poetry list found on Wikimedia; this list contained words that were both
familiar and salient. Starting from this list, I first removed words that I found too short or too common,
and verbs that were conjugated with different tenses. (I also removed nsfw words such as "fuck" and "shit",
although I realize I forgot a few of them). After that, I still had more words than needed, so I ran an
optimization algorithm, in order to select the subset with maximal average Hamming distance between words.
 
cheers
 
Thomas
sr. member
Activity: 304
Merit: 380
I understand that the seed in number form is 128bits of entropy. But is the mnemonic 128 bits too? 12 words out of 1600 are 128bits of entropy? Just curious.
The randomness ("entropy") of one word chosen at random from a list of 1600 comes from its 1600 equally likely outcomes, and the number of bits in 1600 is log2(1600) = 10.644, because 2^10.644 = 1600. So in a string of words chosen at random from a list of 1600, each word contributes 10.644 bits of randomness (entropy). To get 128 bits of entropy you need 128/10.644 ≈ 12 words. 1600 is a pretty short list, which is why Electrum makes such long seeds.
Diceware uses a longer list of 7776 words. log2(7776) = 12.925; 128/12.925 = 9.9. 10 words in Diceware would give 129 bits of entropy.
I generated a spreadsheet with about 55,000 words by collecting Scrabble lists and pasting them into Excel. More entropy there. By using random numbers to select words, I can generate a strong passphrase which is short enough that I have some chance of remembering it. The Scrabble list also has some real oddball words. Makes for a memorable passphrase. The extremely commonplace vocabulary words in the Electrum and Diceware lists generate very bland phrases that are not memorable at all, besides just being very long.
Electrum should have gone with a much bigger word list.
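
The arithmetic in this post can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the list sizes are the ones mentioned in this thread (1600, the 1626 used in the bit-strength calculation, Diceware's 7776, and the roughly 55,000-word spreadsheet).

```python
import math


def bits_per_word(list_size: int) -> float:
    """Entropy in bits of one word picked uniformly at random from the list."""
    return math.log2(list_size)


def words_needed(target_bits: int, list_size: int) -> int:
    """Smallest number of random words that reaches at least target_bits."""
    return math.ceil(target_bits / bits_per_word(list_size))


for name, size in [("1600-word list", 1600), ("1626-word list", 1626),
                   ("Diceware", 7776), ("Scrabble sheet", 55000)]:
    print(f"{name}: {bits_per_word(size):.3f} bits/word, "
          f"{words_needed(128, size)} words for 128 bits")
```

Strictly speaking, 12 words from exactly 1600 fall just short of 128 bits (12 x 10.644 is about 127.7), so rounding up gives 13; with the 1626 figure, 12 words are enough.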
sr. member
Activity: 302
Merit: 250
Diceware 5 words bit strength:

>>> from math import log
>>> log(7776**5,2)
64.624

Electrum 12 words bit strength:

>>> log(1626**12,2)
128.005

https://www.google.com/search?q=log(1626**12%2C2)

https://www.google.com/search?q=log(7776**5%2C2)

Entering these into the Google calculator gives different results, which doesn't help either.
sr. member
Activity: 302
Merit: 250


Diceware 5 words bit strength:

>>> from math import log
>>> log(7776**5,2)
64.624

Electrum 12 words bit strength:

>>> log(1626**12,2)
128.005


If you or someone else could show this calculation in 'normal' mathematical notation (a picture or an external link with different numbers would be OK), that would be helpful.

I have found this as well: https://security.stackexchange.com/questions/36246/what-is-the-entropy-of-just-1-diceware-passphrase-like-my-passphrase

Still not clear.
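
In conventional notation, the two Python expressions quoted above read as follows. In Python, `**` is exponentiation and the second argument of `log` is the base, so both lines are base-2 logarithms of (list size) raised to (number of words). The per-word values are rounded.

```latex
\begin{align*}
\text{Diceware, 5 words:} \quad
  & \log_2\!\left(7776^{5}\right) = 5 \log_2 7776 \approx 5 \times 12.925 \approx 64.62 \text{ bits} \\
\text{Electrum, 12 words:} \quad
  & \log_2\!\left(1626^{12}\right) = 12 \log_2 1626 \approx 12 \times 10.667 \approx 128.0 \text{ bits}
\end{align*}
```

In words: the number of bits is the number of words multiplied by the bits per word, and the bits per word is the base-2 logarithm of the list size.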

What this means: Diceware 5 words is no longer secure. With dictionary size 7776, use at least 7 or 8 words for critical/financial stuff.
If we assume that a flawed(?) electronic random number generator is as random as actual dice.
legendary
Activity: 3682
Merit: 1580
As you may have noticed, it is possible to bypass this protection; if you restore your wallet from a hexadecimal string, any string length will be accepted. However, this will only work with hexadecimal inputs. Thus, if you absolutely insist on using an arbitrary phrase as seed, you will need to hex-encode it yourself. Consider this as a protection.
I am not a cryptographer (what are some good sources to learn some very basic concepts? Maybe one good article for noobs), so this is a basic question: let's say I used my own passphrase and I am happy with it, my passphrase is (obviously)

the quick brown fox jumps over the lazy dog

then how do I hex-encode it to become an Electrum seed?

Here you go:

https://www.google.com.pk/search?q=letters+to+hex
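
The same conversion can be done locally instead of pasting a passphrase into a website. A minimal Python sketch, assuming the phrase is encoded as UTF-8 bytes (the thread does not say which text encoding is expected, and for plain ASCII text it makes no difference):

```python
phrase = "the quick brown fox jumps over the lazy dog"

# Each character becomes one byte, written as two hexadecimal digits.
hex_seed = phrase.encode("utf-8").hex()
print(hex_seed)

# Decoding the hex gives the original phrase back, so nothing is added or lost.
print(bytes.fromhex(hex_seed).decode("utf-8") == phrase)
```

Hex encoding only changes how the phrase is written down. It adds no randomness, so the resulting seed is exactly as guessable as the phrase itself.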

Quote
and just for fun, can I also convert it for Electrum style 12 words?

Easiest way for you is to create a new wallet and use the restore function. Paste in the hex and then view the seed to get the electrum words.

Edit: Python is not my strong suit, so I am sure this can be done in the Electrum console as well. But to do it in the Linux shell you just run python, then do "import electrum", and then "electrum.mnemonic.mn_encode( '34343' )", of course replacing the numbers with your hex.

Edit2: Yeah you can do it in the electrum console too. Just type import electrum first and then the rest.
legendary
Activity: 1092
Merit: 1016
760930
What are your thoughts on this: http://www.sendspace.com/file/68tgbd
You have to roll your own seed. 5 dice for each word = 60 rolls; if a roll is invalid (i.e. under certain circumstances not applicable), roll again. IMO this should lead to a truly random seed, which can't be compromised by faulty or limited random number generator implementations. It's a bit cumbersome, but a decent choice for long-term storage.

Any downsides or potential risks I don't see?

This method is well known as Diceware - http://world.std.com/~reinhold/diceware.html

Good question! How does Diceware (5 words with dice, dictionary size of 7776) compare to Electrum (12 words, from a dictionary of 1600) for practical purposes, for use as your master password?

Diceware 5 words bit strength:

>>> from math import log
>>> log(7776**5,2)
64.624

Electrum 12 words bit strength:

>>> log(1626**12,2)
128.005


What this means: Diceware 5 words is no longer secure. With dictionary size 7776, use at least 7 or 8 words for critical/financial stuff.
sr. member
Activity: 302
Merit: 250
As you may have noticed, it is possible to bypass this protection; if you restore your wallet from a hexadecimal string, any string length will be accepted. However, this will only work with hexadecimal inputs. Thus, if you absolutely insist on using an arbitrary phrase as seed, you will need to hex-encode it yourself. Consider this as a protection.
I am not a cryptographer (what are some good sources to learn some very basic concepts? Maybe one good article for noobs), so this is a basic question: let's say I used my own passphrase and I am happy with it, my passphrase is (obviously)

the quick brown fox jumps over the lazy dog

then how do I hex-encode it to become an Electrum seed?

and just for fun, can I also convert it for Electrum style 12 words?
sr. member
Activity: 302
Merit: 250
What are your thoughts on this: http://www.sendspace.com/file/68tgbd
You have to roll your own seed. 5 dice for each word = 60 rolls; if a roll is invalid (i.e. under certain circumstances not applicable), roll again. IMO this should lead to a truly random seed, which can't be compromised by faulty or limited random number generator implementations. It's a bit cumbersome, but a decent choice for long-term storage.

Any downsides or potential risks I don't see?

This method is well known as Diceware - http://world.std.com/~reinhold/diceware.html

Good question! How does Diceware (5 words with dice, dictionary size of 7776) compare to Electrum (12 words, from a dictionary of 1600) for practical purposes, for use as your master password?
hero member
Activity: 686
Merit: 500
A pumpkin mines 27 hours a night
What are your thoughts on this: http://www.sendspace.com/file/68tgbd
You have to roll your own seed. 5 dice for each word = 60 rolls; if a roll is invalid (i.e. under certain circumstances not applicable), roll again. IMO this should lead to a truly random seed, which can't be compromised by faulty or limited random number generator implementations. It's a bit cumbersome, but a decent choice for long-term storage.

Any downsides or potential risks I don't see?
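
One way the "roll again" rule could work, sketched in Python. This is an assumption, not necessarily what the linked file does: five dice give 6^5 = 7776 outcomes, and outcomes that cannot be spread evenly over the word list are rejected so that every word stays equally likely. The list size of 1626 is taken from the bit-strength calculation elsewhere in this thread, and `rolls_to_word_index` is a made-up name.

```python
WORDLIST_SIZE = 1626                 # assumed size of the Electrum word list
DICE_PER_WORD = 5
OUTCOMES = 6 ** DICE_PER_WORD        # 7776 possible five-dice results
# Largest multiple of the list size that fits; results at or above it are re-rolled.
LIMIT = OUTCOMES - OUTCOMES % WORDLIST_SIZE  # 6504


def rolls_to_word_index(rolls):
    """Map five dice rolls (each 1-6) to a word index, or None for 'roll again'."""
    if len(rolls) != DICE_PER_WORD or any(r not in range(1, 7) for r in rolls):
        raise ValueError("expected five rolls, each between 1 and 6")
    value = 0
    for r in rolls:
        value = value * 6 + (r - 1)  # read the rolls as a base-6 number
    if value >= LIMIT:
        return None                  # invalid roll: discard and roll again
    return value % WORDLIST_SIZE


print(rolls_to_word_index([1, 1, 1, 1, 2]))
print(rolls_to_word_index([6, 6, 6, 6, 6]))
```

With these numbers about 16% of five-dice results are rejected (1272 out of 7776). Skipping the rejection step and simply taking the remainder would make some words more likely than others.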
hero member
Activity: 784
Merit: 1010
Bitcoin Mayor of Las Vegas
From Thomas...

Quote
The main reason is that the encoding method used in RFC1751 collides with patent US5892470 A.
My personal opinion is of course that this patent is ridiculous and should never have been granted.
However, I did not want to take any risk, because if the site hosting the source code (github, gitorious)
receives a cease and desist letter, they will remove the project rather than hire a lawyer to defend the
case. And if someone decided to target Bitcoin software in general, this patent gives them a reason to
attack Electrum.
 
In order to circumvent that patent, I used a different encoding algorithm and a different dictionary.
Of course I could have changed only the encoding algorithm and kept the same dictionary, but that
would have been a terrible idea, because it means Electrum would have generated RFC 1751 valid
passphrases, but decodes these phrases differently. This would definitely have been considered as a bug.
 
Another reason not to use the same dictionary as in the RFC is that it contains mostly short words,
which are not good for long-term memorization. People often believe that short words are easier to
remember, because they confuse short-term and long-term memory. STM and LTM are separate functions,
that are performed in anatomically distinct parts of the brain (hippocampus and cortex, respectively).
It is true that sequences of short words are easier to store and recall in short term memory (Baddeley
et al 1975), but that does not make them good candidates for long term memory storage. In order
to store a list of words in long term memory, these words must be both familiar and salient (not too
common and with some semantic or emotional load). Another good thing that boosts memory is to
have words from different categories (eg verbs and nouns), as explained in this paper:
http://csjarchive.cogsci.rpi.edu/proceedings/2008/pdfs/p2183.pdf
 
This is why I used words from a poetry list found on Wikimedia; this list contained words that were both
familiar and salient. Starting from this list, I first removed words that I found too short or too common,
and verbs that were conjugated with different tenses. (I also removed nsfw words such as "fuck" and "shit",
although I realize I forgot a few of them). After that, I still had more words than needed, so I ran an
optimization algorithm, in order to select the subset with maximal average Hamming distance between words.
 
cheers
 
Thomas
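
To illustrate the last step described in the quote, here is a small Python sketch of picking words that differ from each other as much as possible. It is not the algorithm used for Electrum (the quote does not describe it): it is a simple greedy heuristic, and since Hamming distance is only defined for strings of equal length, treating the shorter word as padded is an assumption made here. `hamming` and `pick_distinct_words` are hypothetical names.

```python
from itertools import zip_longest


def hamming(a: str, b: str) -> int:
    """Count the positions at which two words differ (shorter word is padded)."""
    return sum(x != y for x, y in zip_longest(a, b, fillvalue=""))


def pick_distinct_words(candidates, k):
    """Greedy heuristic: keep adding the word farthest from those already chosen."""
    pool = sorted(set(candidates))
    if k < 1 or k > len(pool):
        raise ValueError("k must be between 1 and the number of distinct candidates")
    chosen = [pool.pop(0)]           # start from the alphabetically first word
    while len(chosen) < k:
        # Total distance to the chosen words; maximizing it raises the average.
        best = max(pool, key=lambda w: sum(hamming(w, c) for c in chosen))
        pool.remove(best)
        chosen.append(best)
    return chosen


print(pick_distinct_words(["cat", "car", "dog"], 2))
```

A greedy pass like this does not guarantee the best possible subset. It only shows the idea that similar-looking words such as "cat" and "car" are unlikely to end up in the list together.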
hero member
Activity: 784
Merit: 1010
Bitcoin Mayor of Las Vegas
Whoops, I had assumed that Electrum was using RFC 1751 for translating bits to words...

http://tools.ietf.org/html/rfc1751

Can I ask what the reason was for not using it and going with a poetry frequency list instead?
member
Activity: 73
Merit: 10
Well, technically you can:
  • Run electrum with parameter: -w fun.bin (to generate new custom wallet named fun.bin)
  • Select [Restore]
  • Enter the word "god" 12 times (or any combination of words from the Electrum dictionary)
  • You now have your own fully functional, funny-seeded wallet, which is therefore very insecure and likely to be cracked by someone

Gleb
newbie
Activity: 42
Merit: 0
Quote
Of course, Armory uses waaaay more than 128 bits of entropy, but I'll be bringing it down to 128 or 160 in the next release -- I was thinking 160 because I wanted to give a little margin in case your system does not have a high-quality entropy pool at creation time.  This because I totally agree with ThomasV -- 128 bits is a nice, unbreakable value.  Maybe in 1000 years when we have Dyson spheres around a few different stars for the purpose of collecting energy to break my wallet, they might break 128 bits.  

I hope you were exaggerating. 128-bit encryption could be broken "routinely" in about 100 years. Armchair explanation: DES at 56 bits can be broken "routinely" by the NSA, CIA, etc. If Moore's Law is sustainable, the number of transistors in a chip will double every 1.5 years. Let's say that every doubling in the number of transistors doubles the speed (because, in the end, cracking a code is a highly parallelizable task, so doubling the number of processors WILL double the speed). So every 1.5 years the number of bits that can be cracked "routinely" rises by 1 (double speed = +1 bit, because +1 bit doubles the keyspace). Going from 56 to 128 bits is 72 extra bits, so 72 * 1.5 = 108 years. But note that DES was already being cracked "routinely" some years ago.

(Read for example http://en.wikipedia.org/wiki/EFF_DES_cracker : in 1998 the EFF brute-force cracked DES in 56 hours for $250,000. So if Moore's Law is sustainable, in 2106 AES-128 could be cracked in 56 hours, but note that some years before that a determined cracker with a few million dollars and a month of time could probably crack it.)
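
The armchair estimate above boils down to one multiplication. A Python sketch of it, using only the poster's own assumptions (one extra bit becomes crackable per doubling, one doubling every 1.5 years, 1998 as the starting year); the function name is made up, and this reproduces the estimate rather than endorsing it as a forecast.

```python
def years_until_breakable(bits_broken_now: int, bits_target: int,
                          years_per_doubling: float = 1.5) -> float:
    """Years until bits_target is as crackable as bits_broken_now is today."""
    extra_bits = bits_target - bits_broken_now  # each extra bit doubles the keyspace
    return extra_bits * years_per_doubling


years = years_until_breakable(56, 128)
print(years, 1998 + years)
```

The result is very sensitive to the doubling period: at 2 years per doubling instead of 1.5, the same 72 bits take 144 years.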
full member
Activity: 209
Merit: 148
I understand that the seed in number form is 128bits of entropy. But is the mnemonic 128 bits too? 12 words out of 1600 are 128bits of entropy? Just curious.

Yes. As long as they are *randomly* chosen.
legendary
Activity: 3682
Merit: 1580
I understand that the seed in number form is 128bits of entropy. But is the mnemonic 128 bits too? 12 words out of 1600 are 128bits of entropy? Just curious.
full member
Activity: 150
Merit: 100
Thank you! Thank you! ...
I know a 128-bit seed is good enough to defeat brute force attacks, but then wouldn't it be even better to support a 256-bit seed? Any thoughts on allowing that option?
hero member
Activity: 715
Merit: 500
Bitcoin Venezuela

This is the most basic rule of ECDSA -- use a different random number for each signature.  I'd say that this should be a very difficult mistake to make, but apparently Playstation 3 also had some under-qualified developers in this regard.  

It's nothing new.  It's just the risk of "rolling your own" when dealing with crypto algorithms -- you don't understand the importance of each step, or have any guarantee you did it right.

Even when you think you did it right, you're probably open to things like timing attacks -- where someone gets your system to sign a whole bunch of stuff and collects statistics on the time it took -- which reveals information about the private key.  Proper implementations avoid this.

You answered my question before I could even refresh the page! Good to see it can be avoided by putting the right minds to work. It seems there are still people who don't get that it is money they are playing with.
legendary
Activity: 1428
Merit: 1093
Core Armory Developer

This is the most basic rule of ECDSA -- use a different random number for each signature.  I'd say that this should be a very difficult mistake to make, but apparently Playstation 3 also had some under-qualified developers in this regard.  

It's nothing new.  It's just the risk of "rolling your own" when dealing with crypto algorithms -- you don't understand the importance of each step, or have any guarantee you did it right.

Even when you think you did it right, you're probably open to things like timing attacks -- where someone gets your system to sign a whole bunch of stuff and collects statistics on the time it took -- which reveals information about the private key.  Proper implementations avoid this.
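
Why reusing the random number is fatal can be shown with plain modular arithmetic. The Python sketch below (3.8 or later, for `pow(x, -1, n)`) uses the ECDSA signing equation s = k^-1 * (z + r*d) mod n and the secp256k1 group order. To stay short it does no elliptic-curve math: `r` is a stand-in value rather than the real x-coordinate of k*G, and the key, nonce and message hashes are toy numbers. The recovery algebra is the same either way.

```python
# Order of the secp256k1 group used by Bitcoin (a prime).
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def sign_s(private_key: int, nonce: int, r: int, message_hash: int) -> int:
    """The s half of an ECDSA signature: s = k^-1 * (z + r*d) mod n."""
    return pow(nonce, -1, N) * (message_hash + r * private_key) % N


def recover_private_key(r: int, s1: int, z1: int, s2: int, z2: int) -> int:
    """Recover d from two signatures made with the same nonce (same r)."""
    # s1 - s2 = k^-1 * (z1 - z2), so k = (z1 - z2) / (s1 - s2) mod n.
    nonce = (z1 - z2) * pow((s1 - s2) % N, -1, N) % N
    # s1 * k = z1 + r*d, so d = (s1*k - z1) / r mod n.
    return (s1 * nonce - z1) * pow(r, -1, N) % N


d = 0x1234567890ABCDEF   # toy private key
k = 0xDEADBEEF           # the nonce that is wrongly used twice
r = 0xCAFEBABE           # stand-in for the x-coordinate of k*G
z1, z2 = 1111, 2222      # toy hashes of two different messages

s1 = sign_s(d, k, r, z1)
s2 = sign_s(d, k, r, z2)
print(recover_private_key(r, s1, z1, s2, z2) == d)
```

Anyone who sees two signatures with the same `r` can run the second function, which is why a fresh nonce per signature is not optional.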
hero member
Activity: 715
Merit: 500
Bitcoin Venezuela