
Topic: Does more seed words equal better security? - page 2. (Read 1106 times)

legendary
Activity: 2268
Merit: 18697
and i still see people beating their chest showing off how much they know about the math of the hashing cycle of sha, ecdsa and ripemd160..
We are having a discussion about the security of different seed phrases. No one is beating their chest about anything.

and thats about the human security of what randomiser/human personal selection entropy
Here you go with the "personal selection" again. Humans are not random. Despite how random you think you are being, you aren't. I'll take 128 bits of properly generated entropy over any of your "human chosen words from a list of 32,000 words" any day of the week. Not to mention that choosing words is completely the wrong way to think about the whole thing. You generate entropy, not words. The words are simply an encoding of the entropy.

remember the question is
"does more seed words"
Are 256 bits of entropy encoded by 24 words more secure than 128 bits of entropy encoded by 12 words? Sure.
Does that result in private keys which are more secure? No.
Is any harebrained scheme where someone picks their own words going to be more secure than either of those? No.
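
To make the "words are simply an encoding of the entropy" point concrete, here is a minimal Python sketch of the BIP39 encoding step. It stops at the 11-bit word indices and does not load the 2048-word list; the function name `entropy_to_indices` is made up for this illustration.

```python
import hashlib
import secrets

def entropy_to_indices(entropy: bytes) -> list[int]:
    """Split entropy plus its BIP39 checksum into 11-bit word indices."""
    ent_bits = len(entropy) * 8
    cs_bits = ent_bits // 32  # BIP39: one checksum bit per 32 bits of entropy
    checksum = hashlib.sha256(entropy).digest()
    # Append the leading cs_bits of the SHA-256 digest to the entropy.
    value = (int.from_bytes(entropy, "big") << cs_bits) | (checksum[0] >> (8 - cs_bits))
    n_words = (ent_bits + cs_bits) // 11
    # Read the combined bit string 11 bits at a time, most significant first.
    return [(value >> (11 * (n_words - 1 - i))) & 0x7FF for i in range(n_words)]

indices_12 = entropy_to_indices(secrets.token_bytes(16))  # 128 bits -> 12 words
indices_24 = entropy_to_indices(secrets.token_bytes(32))  # 256 bits -> 24 words
print(len(indices_12), len(indices_24))
```

Each index selects one word from the list, so the randomness comes entirely from `secrets.token_bytes`, never from a person choosing words.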
legendary
Activity: 1512
Merit: 7340
Farewell, Leo
remember the question is
"does more seed words"
Don't miss the forest for the trees; the title may say that, but in the original post, 20kevin20 asks if Bitcoin would be more secure if we extended the phrase with additional words. Therefore, we answer that an attacker will prefer computing 2^160 hashes rather than searching a range of mnemonics which exceeds it. Besides that, calculating a RIPEMD-160 hash takes less time than generating a BIP39 seed.

having 10 seed words of 32000 library(d) is more secure than 12seed with randomiser(b) or personally chosen(c)
Again, if it exceeds the time 2^160 hashes would take, then the point is lost.
full member
Activity: 206
Merit: 447
Allow me to rephrase. Yes, finding a valid Electrum seed requires 3 times the hashes of a valid BIP39 seed (assuming it takes a full 4096 attempts to find a valid prefix), but if searching the entire space for a specific seed, then it would be easier with Electrum seeds than with BIP39 seeds, no?

Seems that I got confused. Yes, it would be easier.

On the other hand, while attacking a specific seed, it's far more probable that you stumble upon another seed before finding it.

The difficulty in derivation is mainly the number of elliptic curve multiplications
Sure, but an additional three HMAC-SHA512s and additions per derivation path is not trivial when considering 2^128 seeds.

If my numbers are correct generating one public key from private is ~68 times slower than a single HMAC-SHA512. That's why I assume elliptic curve operations are the slow thing here.

legendary
Activity: 4270
Merit: 4534
its been many posts and many hours. and i still see people beating their chest showing off how much they know about the math of the hashing cycle of sha, ecdsa and ripemd160..

but the question of SEEDS.. is the part pre hashing cycle
and thats about the human security of what randomiser/human personal selection entropy
which can make the difference between 500^12 or 2048^12

yet again. if they want to talk about the 2^160 post hash cycle(a)
they are ignoring the less secure(b,c)

a. 2^160 =   1461501600000000000000000000000000000000000000000
b. 2048^12 =          5444517900000000000000000000000000000000
c. 500^12 =                  244140625000000000000000000000000

by the way.
having 10 seed words of 32000 library(d) is more secure than 12seed with randomiser(b) or personally chosen(c)
d. 32000^10 =   1125899900000000000000000000000000000000000000

and thats without having to do any gorilla chest beating of whos the smartest and explaining the hashing functions

yep you will have much better luck brute forcing seeds in (b,c,d) than you would by trying all (a) combinations
so. try to keep to the topic of the SEEDs and not the post ripemd160 entropy

remember the question is
"does more seed words"
not
"whats the most combinations post keyhash cycle"
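
For reference, the four search spaces compared above can be computed exactly with Python's arbitrary-precision integers. This is only a sketch of the arithmetic; the bit figures assume every word is picked uniformly at random, which hand-picked words are not.

```python
import math

# Labels follow the a/b/c/d cases used in the post.
search_spaces = {
    "a: RIPEMD-160 outputs, 2^160": 2**160,
    "b: 12 words from 2048, 2048^12": 2048**12,
    "c: 12 words from 500, 500^12": 500**12,
    "d: 10 words from 32000, 32000^10": 32000**10,
}

for label, size in search_spaces.items():
    # Show the size in scientific notation and as an equivalent number of bits.
    print(f"{label}: {size:.4e} (~{math.log2(size):.1f} bits)")
```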
legendary
Activity: 2268
Merit: 18697
Attacking a specific Electrum seed is 3 times harder compared to BIP39, if we look at a single derivation path.
Allow me to rephrase. Yes, finding a valid Electrum seed requires 3 times the hashes of a valid BIP39 seed (assuming it takes a full 4096 attempts to find a valid prefix), but if searching the entire space for a specific seed, then it would be easier with Electrum seeds than with BIP39 seeds, no? There are fewer valid Electrum seeds as you point out here:
With BIP39 the attack is 2^128 PBKDF2, while Electrum is 2^121.6 equivalent PBKDF2. After that we have 2^128 address derivations for BIP39, and 2^119.9 for Electrum.

The difficulty in derivation is mainly the number of elliptic curve multiplications
Sure, but an additional three HMAC-SHA512s and additions per derivation path is not trivial when considering 2^128 seeds.
full member
Activity: 206
Merit: 447
So attacking a specific seed is easier for Electrum seeds, but if attacking any used seed then that may not be the case.

The opposite. Attacking a specific Electrum seed is 3 times harder compared to BIP39, if we look at a single derivation path. Attacking a sufficient number of derivation paths (100?) makes the difficulty the same.

I looked up PBKDF2 vs Address Derivation timings, and for the usual non-hardened addresses (m/84'/0'/0'/0/0) AD is about 10 times faster than PBKDF2. Hardened only derivation is 30 times faster. Specialized hardware might change the ratio.
Remember as well that Electrum uses simpler derivation paths which would be easier to derive than the BIP39 ones. It uses m/0/0 for the first legacy address, and m/0'/0/0 for the first segwit address.

The difficulty in derivation is mainly the number of elliptic curve multiplications (by scalar), m/0/0 uses 3 multiplications and 2 additions. m/0'/0/0 does exactly the same, so does m/84'/0'/0'/0/0. m/0'/0' is single multiplication. Non-hardened child means we need the parent public key, and have to add it to another public key.

legendary
Activity: 2268
Merit: 18697
What is the proportion of BIP39 seeds versus Electrum ones? If there are 100 times more BIP39 seeds than Electrum, then the chance of stumbling upon BIP39 is higher.
Good point. So attacking a specific seed is easier for Electrum seeds, but if attacking any used seed then that may not be the case. Interestingly, if you assume 100 times more BIP39 than Electrum seeds as you have, then you end up with very similar numbers between the two.

I looked up PBKDF2 vs Address Derivation timings, and for the usual non-hardened addresses (m/84'/0'/0'/0/0) AD is about 10 times faster than PBKDF2. Hardened only derivation is 30 times faster. Specialized hardware might change the ratio.
Remember as well that Electrum uses simpler derivation paths which would be easier to derive than the BIP39 ones. It uses m/0/0 for the first legacy address, and m/0'/0/0 for the first segwit address.
full member
Activity: 206
Merit: 447
Does this not make attacking an Electrum seed theoretically easier? (2^132.58 versus 2^139 as I stated above?)

Only when doing exhaustive search. But then "ease" of attack depends on other things as well. What is the proportion of BIP39 seeds versus Electrum ones? If there are 100 times more BIP39 seeds than Electrum, then the chance of stumbling upon BIP39 is higher.

I looked up PBKDF2 vs Address Derivation timings, and for the usual non-hardened addresses (m/84'/0'/0'/0/0) AD is about 10 times faster than PBKDF2. Hardened only derivation is 30 times faster. Specialized hardware might change the ratio.


The reason why this does not result in a loss of entropy is that you cannot know in advance if a seed is valid or not prior to checking the seed. You will need to perform a calculation on every seed candidate before ruling it out as not being your seed.

Entropy is a word with many meanings. The Shannon Entropy, measured in bits, does decrease. It has nothing to do with how many calculations are done.

legendary
Activity: 2268
Merit: 18697
someone handpicking 12 words. means their entropy of library might just be 500 words they commonly use and are personal to them..
Someone hand picking words is almost certainly going to be less secure than a randomly generated 12 word seed phrase, regardless if they are picking from a list of 500 words or 32k words or 100k words.

next up is the HUMAN element of when using a randomiser
There should not be a human element at all. You should allow your software to randomly generate entropy for you. As soon as you introduce a human element, then you are far less secure than if you just let the software generate a 12 word seed phrase for you.

so a 20 word of 32k library allows for the most randomness
No, it doesn't. The number of words or the size of the library have no direct correlation with "randomness". I could pick the same word 12, or 20, or 100 times and be far less secure than a standard 12 word BIP39 seed phrase.
legendary
Activity: 1512
Merit: 7340
Farewell, Leo
someone handpicking 12 words. means their entropy of library might just be 500 words they commonly use and are personal to them..
Why would someone handpick twelve words when he can generate them and ensure that he's made a completely unpredictable choice? As you said, there may be people who won't use specific words, but that doesn't matter that much; what matters is that they won't make an unpredictable choice, while the whole point of cryptography is to always generate the private key with no human intervention. (Assuming he knows what he's doing.)

i honestly thought this topic was about seed word security of DO MORE SEED WORDS EQUAL BETTER SECURITY
seems many want to think its about the ecdsa sha ripemd160 process, and the pre to post bit differences either side of that process...
The point of this thread is to answer whether more words means better security, and we've explained that an attacker won't have to brute force an extremely long seed phrase; he'll find it easier to find a RIPEMD-160 collision instead.
legendary
Activity: 4270
Merit: 4534
132 bits of entropy only for Electrum seed phrases.
Aren't they 128 too, but with 8 bits of entropy?

EG is it better to have a 12 seed with a library of 32k words
or a 20 seed using a library of 2048
Let's leave aside the fact that each private key has 128 bits of security; if someone tried to brute force your address, he'd find it easier to go straight to calculating 2^160 hashes rather than searching 32000^12 or 2048^20 combinations. Those are far larger numbers than the number of possible RIPEMD-160 outputs.

The twelve words with 2048 words in total is a great choice, but if you feel insecure, your best option would be 15 words that provide 165 bits. Anything longer than that would be an “overdose”.

yet again..
my whole point was..
the HUMAN ELEMENT

someone handpicking 12 words. means their entropy of library might just be 500 words they commonly use and are personal to them..
EG many IT/Network nerds might choose words affiliated with IT/networking. and not even think to use words like 'voyage' / vicious

so 12 words of a library of 500 handpicked words is very bad.
(its why a few passphrase wallets got emptied)

next up is the HUMAN element of when using a randomiser
is it better to have 12 words or 24 words of a 2048 library
or a 20 word of a 32k library

and the answer is. most people write down their seeds so human memory is of no issue and so a 20 word of 32k library allows for the most randomness

..
i honestly thought this topic was about seed word security of DO MORE SEED WORDS EQUAL BETTER SECURITY
seems many want to think its about the ecdsa sha ripemd160 process, and the pre to post bit differences either side of that process..

but anyways moving on, ive said my piece

answering to below..
(sticking with speaking laymans<-emphasis)
(using basic math of entropy and not the technical annals of a certain wallets prefered method of conversion)
i know you want to obsess about the 2^160 to go through all keys..

but for a HUMAN wanting to know his security risk of HIS seed key..
ill lay out the math
how many combinations:
a. 2^160 =   1461501600000000000000000000000000000000000000000
b. 2048^12 =          5444517900000000000000000000000000000000
c. 500^12 =                  244140625000000000000000000000000

a=ripemd160 combinations
b=12 seed with 2048 library+good randomiser
c=manually choosing personalised words from common vocab

his 12 word seed with 2048 library. can be found easier than ripemd160
his personally chosen words from his common vocab can be found even easier

so if a brute forcer was looking for a particular persons seed and knew his vocab preference by scanning all his posts and finding the words he uses.
a bruteforcer could find his seed in 16 fewer significant figures than bruteforcing all ripemd combinations

it doesnt matter about how many combinations there are in the hash process
because his seed keys have less combinations at the beginning

its never a debate about total combinations a process allows
its that his key is somewhere in the middle of
5444517900000000000000000000000000000000
or
244140625000000000000000000000000
before it even goes though any particular wallets prefered conversion method
legendary
Activity: 2268
Merit: 18697
You mean 3 hex digits (or 3 nibbles).
Fixed, thanks.

Normalizing hashes to PBKDF2. In other words BIP39 is 2048xHMAC + 1xAD, Electrum 6144xHMAC + 1xAD.
Ahh right, I'm with you now. So yes, it is harder to attack a single valid Electrum seed compared to a single valid BIP39 seed, but for a 3 character prefix there are only 2^120 valid Electrum seeds compared to 2^128 valid seeds for BIP39. Does this not make attacking an Electrum seed theoretically easier? (2^132.58 versus 2^139 as I stated above?)

The reason why this does not result in a loss of entropy is that you cannot know in advance if a seed is valid or not prior to checking the seed. You will need to perform a calculation on every seed candidate before ruling it out as not being your seed.
This is exactly what I said above:
An attacker does not know in advance which seeds from the 2^132 possibilities result in a hash with the necessary prefix. The only way to obtain this data is to brute force every one of the 2^132 possible seeds.
copper member
Activity: 1624
Merit: 1899
Amazon Prime Member #7
Is this correct?
I believe so.

The probability of a seed having the correct version prefix for a 3 byte prefix is 2^-12, which is 1 in 4096. For those seeds with a correct version prefix, an attacker must perform a total of 2049 hashes. For the other 4095 possibilities, one hash is sufficient to exclude that seed. This means an average of 1.5 hashes per seed is required, as opposed to 2048 with BIP39, which is indeed a 1365.33... speed up.
The reason why this does not result in a loss of entropy is that you cannot know in advance if a seed is valid or not prior to checking the seed. You will need to perform a calculation on every seed candidate before ruling it out as not being your seed. According to the electrum devs, the cost to rule out a seed candidate as being outright invalid is less than calculating the actual seed. While technically not reducing the number of bits of entropy, it would somewhat reduce the cost of a bruteforce attack with a given n bits of entropy, when compared with a setup in which every seed candidate is valid.

I would compare the above to j2002ba2's comparison above of only accepting dice rolls that are a 1 or a 6 on a 6-sided die. In his example, no calculation is needed in advance: the die is reduced to a coin, with one side being valued as True and the other being valued as False, and the seed is calculated accordingly. An individual attacker may not specifically know you are using the "1" and "6" constraints but may brute force with two random numbers in order to have a lower space of possible values, and with many attackers, one will eventually try 1 and 6.
full member
Activity: 206
Merit: 447
The probability of a seed having the correct version prefix for a 3 byte prefix is 2^-12, which is 1 in 4096.
You mean 3 hex digits (or 3 nibbles).

The cost of computing a single valid seed for BIP39 is 1xPBKDF2 + 1xAD (address derivation), while for Electrum it is 3xPBKDF2 + 1xAD.
I don't follow you here. Why is it 3x PBKDF2 for Electrum?

Normalizing hashes to PBKDF2. In other words BIP39 is 2048xHMAC + 1xAD, Electrum 6144xHMAC + 1xAD.

legendary
Activity: 2268
Merit: 18697
Is this correct?
I believe so.

The probability of a seed having the correct version prefix for a 3 character prefix is 2^-12, which is 1 in 4096. For those seeds with a correct version prefix, an attacker must perform a total of 2049 hashes. For the other 4095 possibilities, one hash is sufficient to exclude that seed. This means an average of 1.5 hashes per seed is required, as opposed to 2048 with BIP39, which is indeed a 1365.33... speed up.

With BIP39 being 2^128 * 2048 hashes, that would be 2^139.
With Electrum being 2^132 * 1.5 hashes, that would be 2^132.58, which is the same as your 2^121.6 * 2048.
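
The averages quoted above can be checked with a few lines of Python. This is just the arithmetic from the post under its own assumptions (one HMAC to test the version prefix, 2048 more for PBKDF2 on a valid seed); the variable names are illustrative.

```python
import math

PBKDF2_ROUNDS = 2048  # HMAC-SHA512 iterations in one PBKDF2 call
PREFIX_BITS = 12      # a 3 hex character version prefix

# One candidate in 2^12 passes the version check and costs 1 + 2048 hashes;
# every other candidate is rejected after a single hash.
p_valid = 1 / 2**PREFIX_BITS
avg_hashes = p_valid * (1 + PBKDF2_ROUNDS) + (1 - p_valid) * 1

bip39_log2 = 128 + math.log2(PBKDF2_ROUNDS)  # exhaustive search, in log2 hashes
electrum_log2 = 132 + math.log2(avg_hashes)
speed_up = PBKDF2_ROUNDS / avg_hashes

print(avg_hashes, bip39_log2, round(electrum_log2, 2), round(speed_up, 2))
```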

The cost of computing a single valid seed for BIP39 is 1xPBKDF2 + 1xAD (address derivation), while for Electrum it is 3xPBKDF2 + 1xAD.
I don't follow you here. Why is it 3x PBKDF2 for Electrum?
full member
Activity: 206
Merit: 447
The act of randomly choosing one of 2^119.9 possibilities cannot magically have more than 119.9 bits entropy, this is the upper limit.
But an attacker does not know the set of 2^119.9 you are choosing from.

You are right. Looks like I'm nitpicking.

Entropy is 2^119.9, while the attack surface remains 2^132.

Thinking about it, one needs to make on average 4096 HMAC-SHA512 to get a correct version, which is twice the HMACs needed by PBKDF2. This looks like speeding up the attack 1365 times ((4096 * 2048) / (4096 + 2048)), compared to doing PBKDF2 on every possible input.

With BIP39 the attack is 2^128 PBKDF2, while Electrum is 2^121.6 equivalent PBKDF2. After that we have 2^128 address derivations for BIP39, and 2^119.9 for Electrum.

Is this correct?

EDIT:

The cost of computing a single valid seed for BIP39 is 1xPBKDF2 + 1xAD (address derivation), while for Electrum it is 3xPBKDF2 + 1xAD. So Electrum seeds are harder to attack, it all depends on the cost of AD versus PBKDF2. If AD is the main cost, then it's about the same. When PBKDF2 is the main cost, Electrum seeds are 3 times harder to attack.

So if BIP39 has 128 bit security, Electrum has 128-129.5 bit equivalent security.
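
A rough way to see where the "128-129.5 bit" range comes from is to express the cost of one address derivation (AD) in units of one PBKDF2 call and compare the two totals. The helper `extra_bits` below is hypothetical and only encodes the 1xPBKDF2 + 1xAD versus 3xPBKDF2 + 1xAD model from the post.

```python
import math

def extra_bits(ad_cost: float) -> float:
    """Extra work, in bits, to test one Electrum seed relative to one BIP39 seed.

    ad_cost is the cost of one address derivation measured in PBKDF2 calls.
    """
    bip39_cost = 1 + ad_cost     # 1 x PBKDF2 + 1 x AD
    electrum_cost = 3 + ad_cost  # 3 x PBKDF2 + 1 x AD
    return math.log2(electrum_cost / bip39_cost)

# AD free, AD ten times faster than PBKDF2, AD equal to PBKDF2, AD dominating.
for ad_cost in (0.0, 0.1, 1.0, 100.0):
    print(ad_cost, round(extra_bits(ad_cost), 2))
```

When PBKDF2 dominates, the gap approaches log2(3), about 1.58 bits; when AD dominates, it shrinks towards zero.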

legendary
Activity: 2268
Merit: 18697
The act of randomly choosing one of 2^119.9 possibilities cannot magically have more than 119.9 bits entropy, this is the upper limit.
But an attacker does not know the set of 2^119.9 you are choosing from.

An attacker does not know in advance which seeds from the 2^132 possibilities result in a hash with the necessary prefix. The only way to obtain this data is to brute force every one of the 2^132 possible seeds. Sure, if they find an invalid seed then they do not need to go through the rest of the steps of PBKDF2 and generating private keys and addresses to check for funds, but this does not mean the initial seed has less than 132 bits of entropy.

Consider the following example: Let's say I impose a prefix requirement that is 130 bits long. By your logic, my entropy has now been reduced to 2 bits, and so my seed would be trivial to brute force. Obviously this is not the case.
full member
Activity: 206
Merit: 447
I read it, and looked at the source. It starts with 132 bits of entropy, but then discards the ones with no matching version, thus reducing the entropy.
It doesn't discard the entropy, it increments it[1], and you don't lose entropy by incrementing it. It is basically the same concept as vanity address generators: they too start from a random entropy, then increment it until they reach the desired address.

[1] https://github.com/spesmilo/electrum/blob/3bc8ef6651ed9d9aff0531b3597f80eca4886301/electrum/mnemonic.py#L208

The act of randomly choosing one of 2^119.9 possibilities cannot magically have more than 119.9 bits entropy, this is the upper limit.

You lose entropy by rejecting a significant portion of the possible input. Yes, incrementing is rejecting. It's like throwing a die, then rejecting all but numbers 1 and 6, this gives you 1 bit entropy, not 2.58. Or most extreme: tossing a coin, where only heads are valid, getting 0 bits entropy.

Let's for example start with 11 bits entropy, a single word seed. With version "01" we get only 7 possible words: "best" "frequent" "lounge" "spin" "stay" "tone" "true", with input range for each 0-170, 171-741, 742-1058, 1059-1679, 1680-1703, 1704-1826, 1827-1866; 1867-2047 raises exception due to overflow. After applying the entropy formula we get 2.18 bits entropy. Version "100" is way more drastic - no possible words, zero bits entropy.
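
The die and single-word examples can be reproduced with the Shannon entropy formula, H = -sum(p * log2(p)). The sketch below takes the seven input ranges exactly as listed in the post; whether the probabilities should be taken over the 1867 inputs that yield a word or over all 2048 inputs is an assumption, so both are shown.

```python
import math

def shannon_bits(probs):
    """Shannon entropy in bits of a list of outcome probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

fair_die = shannon_bits([1 / 6] * 6)  # all six faces accepted
one_or_six = shannon_bits([0.5, 0.5])  # only faces 1 and 6 accepted

# Input ranges (inclusive) that end at each word, as given in the post.
ranges = {
    "best": (0, 170), "frequent": (171, 741), "lounge": (742, 1058),
    "spin": (1059, 1679), "stay": (1680, 1703), "tone": (1704, 1826),
    "true": (1827, 1866),
}
counts = [hi - lo + 1 for lo, hi in ranges.values()]

over_valid_inputs = shannon_bits([c / sum(counts) for c in counts])
over_all_inputs = shannon_bits([c / 2048 for c in counts])  # not normalised

print(round(fair_die, 2), one_or_six)
print(round(over_valid_inputs, 2), round(over_all_inputs, 2))
```

Dividing by 2048 gives roughly 2.18, which seems to match the figure in the post; normalising over the 1867 valid inputs gives a slightly higher value. Either way it is far below the 11 bits the single word started with.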

legendary
Activity: 3472
Merit: 10611
I read it, and looked at the source. It starts with 132 bits of entropy, but then discards the ones with no matching version, thus reducing the entropy.
It doesn't discard the entropy, it increments it[1], and you don't lose entropy by incrementing it. It is basically the same concept as vanity address generators: they too start from a random entropy, then increment it until they reach the desired address.

[1] https://github.com/spesmilo/electrum/blob/3bc8ef6651ed9d9aff0531b3597f80eca4886301/electrum/mnemonic.py#L208
full member
Activity: 206
Merit: 447
Aren't they 128 too, but with 8 bits of entropy?
I assume you mean 8 bits of checksum, but no. Electrum seed phrases have 132 bits of entropy. It generates 132 bits of entropy, hashes it, checks if the resulting hash starts with the correct version number, and if not, then it increases the entropy by one and hashes it again. It repeats this until it finds a hash starting with the desired version number. Once it has, it turns the full 132 bits of entropy into a seed phrase. This is obviously different from BIP39, which generates 128 bits of entropy, hashes for the checksum, appends it to make 132 bits, and then turns those 132 bits into the seed phrase.

You can read more about this process here: https://electrum.readthedocs.io/en/latest/seedphrase.html
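
The generate, hash, check, increment loop described above can be sketched in Python as follows. This is a simplified illustration, not Electrum's code: it hashes the raw entropy bytes, whereas Electrum hashes the normalized mnemonic text, and the function name `find_versioned_entropy` is invented here. The HMAC key "Seed version" and the hex prefixes "01" and "100" are taken from the linked documentation and the posts in this thread.

```python
import hashlib
import hmac
import secrets

def find_versioned_entropy(prefix: str = "100", bits: int = 132) -> tuple[int, int]:
    """Return (entropy, attempts) such that the HMAC-SHA512 hex digest starts with prefix."""
    entropy = secrets.randbits(bits)
    attempts = 1
    while True:
        # Simplification: hash the integer's bytes instead of the mnemonic text.
        msg = entropy.to_bytes((bits + 7) // 8, "big")
        digest = hmac.new(b"Seed version", msg, hashlib.sha512).hexdigest()
        if digest.startswith(prefix):
            return entropy, attempts
        entropy += 1  # increment the candidate rather than drawing a new one
        attempts += 1

entropy, attempts = find_versioned_entropy("100")
print(f"found after {attempts} HMACs")  # about 4096 on average for a 3 hex digit prefix
```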


I read it, and looked at the source. It starts with 132 bits of entropy, but then discards the ones with no matching version, thus reducing the entropy. For "standard" (p2pkh, version starting with "01") the resulting entropy is at most 132-8=124 bits. For "segwit" (version starting with "100") it is even worse 132-12=120 bits. Additionally the BIP39 checksum must fail. For 12-word seed this reduces the entropy slightly as well, by ~0.093 bits. So a newly generated wallet would have entropy about 119.9 bits.

To make it clear: for any 132 bit number the chance of it being a correct "segwit" seed is 1/4096 * 15/16, thus getting only 2^119.9 possibilities.
"standard" has a chance 1/256 * 15/16, giving 2^123.9 possibilities.
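
The two exponents follow directly from the probabilities given in the post. A short check, assuming a 4-bit BIP39 checksum for 12 words (so 15 of every 16 candidates fail it) and using a hypothetical helper name:

```python
import math

def valid_seed_bits(prefix_hex_digits: int, total_bits: int = 132) -> float:
    """log2 of the number of candidates with the right version that are not valid BIP39."""
    p_version = 16.0 ** -prefix_hex_digits  # each hex digit of prefix costs 4 bits
    p_not_bip39 = 15 / 16                   # the 4-bit BIP39 checksum must fail
    return total_bits + math.log2(p_version * p_not_bip39)

print(round(valid_seed_bits(3), 1))  # "segwit", version prefix "100"
print(round(valid_seed_bits(2), 1))  # "standard", version prefix "01"
```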
