
Topic: 24 word seed question : is splitting it in half dangerous? (Read 2408 times)

legendary
Activity: 1568
Merit: 6660
bitcoincleanup.com / bitmixlist.org
I don't even understand why you talk of CS collision... A CS collision is certain: given a known CS, there are about 2^248 combinations (2^256 / 2^8) that give the same CS.

The reason I brought it up is that obtaining the second half of the mnemonic with a hypothetically larger seed would reduce its usefulness (while making the first half more useful), but then I realized that the CS is not an independent variable which can be modified without changing the entropy.
newbie
Activity: 3
Merit: 0
In both instances, you still need to bruteforce 12 words to get the actual seed phrase... So, you're working with a 2048^12 search space... which is 5444517870735015415413993718908291383296.
I don't think we can call it the same. If the total words are 12 and you are missing all 12, then the search space is 2^128. But if the total words are 24 and you have 12, then you still have half of the entropy, and depending on which half, it could be a lot simpler. For example, if you have the second half of the words, then the bulk of the search space is suddenly reduced by roughly 94% because of the checksum.

I think this is incorrect.

When someone knows the last 12 words of the 24-word phrase, they know 132 bits. But since 8 of those bits are the checksum, they gain only 124 bits of information. So, there are still 256 - 124 = 132 bits left to attack. Now, the brute-force attacker knows the checksum, so we need to understand how many combinations among the 2^132 generate the same checksum: those that don't can be immediately discarded. The combinations that yield the same CS are about 2^124 (2^132 / 2^8).
For an unknown 12-word phrase, by contrast, the search space is 2^128. So, the former is slightly less secure, but negligibly so.

That's actually assuming the checksum is 8 bits long. It doesn't necessarily have to be, as the checksum length can be set to an arbitrary value that satisfies the formula in https://github.com/bitcoin/bips/blob/master/bip-0039.mediawiki#user-content-Generating_the_mnemonic (where MS=24 is the length of the resulting mnemonic phrase). But as far as I know, all of the wallets I know of use 8 bit checksums, and I haven't heard of a situation where a larger checksum length might be required to prevent checksum collision.

*OK, I realized that even the checksum and thereby the mnemonic phrase length is constrained by the entropy, so this doesn't actually hold, but it does raise the question on whether 8 bits of entropy will be enough to prevent a checksum collision from happening, particularly in a mass-adoption scenario where trillions of wallets are generated per day by businesses, software, apps, etc.

Then there is always the possibility of writing gibberish words in front of the mnemonic to make it appear like it's longer & non-standard.

I don't understand this.
In the documentation it is clearly written: 256 bits of entropy + 8 bits of CS.

I don't even understand why you talk of CS collision... A CS collision is certain: given a known CS, there are about 2^248 combinations (2^256 / 2^8) that give the same CS.
legendary
Activity: 1568
Merit: 6660
bitcoincleanup.com / bitmixlist.org
In both instances, you still need to bruteforce 12 words to get the actual seed phrase... So, you're working with a 2048^12 search space... which is 5444517870735015415413993718908291383296.
I don't think we can call it the same. If the total words are 12 and you are missing all 12, then the search space is 2^128. But if the total words are 24 and you have 12, then you still have half of the entropy, and depending on which half, it could be a lot simpler. For example, if you have the second half of the words, then the bulk of the search space is suddenly reduced by roughly 94% because of the checksum.

I think this is incorrect.

When someone knows the last 12 words of the 24-word phrase, they know 132 bits. But since 8 of those bits are the checksum, they gain only 124 bits of information. So, there are still 256 - 124 = 132 bits left to attack. Now, the brute-force attacker knows the checksum, so we need to understand how many combinations among the 2^132 generate the same checksum: those that don't can be immediately discarded. The combinations that yield the same CS are about 2^124 (2^132 / 2^8).
For an unknown 12-word phrase, by contrast, the search space is 2^128. So, the former is slightly less secure, but negligibly so.

That's actually assuming the checksum is 8 bits long. It doesn't necessarily have to be, as the checksum length can be set to an arbitrary value that satisfies the formula in https://github.com/bitcoin/bips/blob/master/bip-0039.mediawiki#user-content-Generating_the_mnemonic (where MS=24 is the length of the resulting mnemonic phrase). But as far as I know, all of the wallets I know of use 8 bit checksums, and I haven't heard of a situation where a larger checksum length might be required to prevent checksum collision.

*OK, I realized that even the checksum and thereby the mnemonic phrase length is constrained by the entropy, so this doesn't actually hold, but it does raise the question on whether 8 bits of entropy will be enough to prevent a checksum collision from happening, particularly in a mass-adoption scenario where trillions of wallets are generated per day by businesses, software, apps, etc.

Then there is always the possibility of writing gibberish words in front of the mnemonic to make it appear like it's longer & non-standard.
newbie
Activity: 3
Merit: 0
In both instances, you still need to bruteforce 12 words to get the actual seed phrase... So, you're working with a 2048^12 search space... which is 5444517870735015415413993718908291383296.
I don't think we can call it the same. If the total words are 12 and you are missing all 12, then the search space is 2^128. But if the total words are 24 and you have 12, then you still have half of the entropy, and depending on which half, it could be a lot simpler. For example, if you have the second half of the words, then the bulk of the search space is suddenly reduced by roughly 94% because of the checksum.

I think this is incorrect.

When someone knows the last 12 words of the 24-word phrase, they know 132 bits. But since 8 of those bits are the checksum, they gain only 124 bits of information. So, there are still 256 - 124 = 132 bits left to attack. Now, the brute-force attacker knows the checksum, so we need to understand how many combinations among the 2^132 generate the same checksum: those that don't can be immediately discarded. The combinations that yield the same CS are about 2^124 (2^132 / 2^8).
For an unknown 12-word phrase, by contrast, the search space is 2^128. So, the former is slightly less secure, but negligibly so.
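
The arithmetic in this post can be checked with exact integers. This is only a minimal sketch, assuming the standard BIP39 layout for 24 words (11 bits per word, 256 bits of entropy plus an 8-bit checksum); all variable names are made up for illustration.

```python
# Remaining work for an attacker who holds 12 of the 24 BIP39 words.
# Assumption: standard layout, 24 words * 11 bits = 256 entropy bits + 8 checksum bits.
BITS_PER_WORD = 11
ENTROPY_BITS = 256
CHECKSUM_BITS = 8

known_bits = 12 * BITS_PER_WORD  # 132 bits, whichever half is found

# Case 1: the LAST 12 words are known (124 entropy bits + the 8 checksum bits).
unknown_entropy_bits = ENTROPY_BITS - (known_bits - CHECKSUM_BITS)  # 132
# Only about 1 in 2^8 candidates reproduces the known checksum.
survivors_last_half_known = 2**unknown_entropy_bits // 2**CHECKSUM_BITS

# Case 2: the FIRST 12 words are known (132 entropy bits, checksum unknown).
candidates_first_half_known = 2 ** (ENTROPY_BITS - known_bits)

# Reference: a completely unknown 12-word BIP39 phrase (128 bits of entropy).
unknown_12_word_phrase = 2**128

print(survivors_last_half_known == 2**124)
print(candidates_first_half_known == 2**124)
print(unknown_12_word_phrase // survivors_last_half_known)  # factor between the two
```

The last line prints the factor by which the unknown 12-word phrase is larger than the filtered 24-word case, which is the "slightly less secure" gap mentioned above.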
newbie
Activity: 3
Merit: 0
Hackers can possibly brute force the whole seed phrase from just twelve of its words using powerful computational tools; tools like btcrecover can do it with high computational power.
No, it isn't.

A 24 word BIP39 phrase has 256 bits of entropy, with 8 bits of checksum. Depending on which 12 words the attacker knows, then, the remaining 12 words have either 132 bits or 124 bits of entropy. Both are still far outside the realm of possibility, with the time taken to brute force measured in billions of years even with huge amounts of cloud computing dedicated to the task.


Yes, but if 2^132 is the initial searching space when someone knows the last 12 words of a 24 phrase, you have to consider the fact that only 2^124 combinations generate the same known checksum.
In other words, the brute-force attacker can immediately discard all but about 1 in 2^8 of the combinations without deriving the addresses to check whether they hold any coins.
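
Here is a rough sketch of that filter, assuming the BIP39 rule that the checksum of a 24-word phrase is the first 8 bits of SHA-256 over the 32 bytes of entropy. The function and variable names are my own, and the "attacker" here just samples random guesses for the unknown 132 bits instead of enumerating them.

```python
import hashlib
import secrets

def checksum_byte(entropy: bytes) -> int:
    """BIP39 checksum for 256-bit entropy: the first 8 bits of SHA-256(entropy)."""
    if len(entropy) != 32:
        raise ValueError("this sketch only handles 32-byte (24-word) entropy")
    return hashlib.sha256(entropy).digest()[0]

# The victim's entropy. The last 12 words reveal its low 124 bits plus the checksum.
real = secrets.randbits(256)
known_low_124 = real & ((1 << 124) - 1)
known_checksum = checksum_byte(real.to_bytes(32, "big"))

def candidate(high_132: int) -> bytes:
    """Combine a guess for the unknown top 132 bits with the known low 124 bits."""
    return ((high_132 << 124) | known_low_124).to_bytes(32, "big")

# Sample random guesses and count how many survive the cheap checksum test.
trials = 1 << 16
survivors = sum(
    checksum_byte(candidate(secrets.randbits(132))) == known_checksum
    for _ in range(trials)
)
print(f"{survivors} of {trials} guesses pass the checksum "
      f"(about {trials // 256} expected)")
```

Only the survivors need the expensive step (seed derivation and address lookup); everything else is rejected after a single hash.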
legendary
Activity: 3472
Merit: 10611
In both instances, you still need to bruteforce 12 words to get the actual seed phrase... So, you're working with a 2048^12 search space... which is 5444517870735015415413993718908291383296.
I don't think we can call it the same. If the total words are 12 and you are missing all 12, then the search space is 2^128. But if the total words are 24 and you have 12, then you still have half of the entropy, and depending on which half, it could be a lot simpler. For example, if you have the second half of the words, then the bulk of the search space is suddenly reduced by roughly 94% because of the checksum.
HCP
legendary
Activity: 2086
Merit: 4363
Bit confused with this.  Electrum only has a 12 word seed.  The Ledger Nano S has a 24 word seed.  So wouldn't having just half of the Ledger Nano seed, which is 12 words, be essentially the same thing as having no words of an Electrum seed?
That is essentially correct.

In both instances, you still need to bruteforce 12 words to get the actual seed phrase... So, you're working with a 2048^12 search space... which is 5444517870735015415413993718908291383296.

legendary
Activity: 2268
Merit: 18775
-snip-
In terms of an attacker trying to brute force a 12 word seed or a 24 word seed with 12 words known, jerry0 is correct though.

A 12 word Electrum seed with no known words has 132 bits of entropy needing brute forced.
A 24 word Ledger Nano seed with 12 words known has either 132 or 124 bits of entropy needing brute forced, depending on whether the checksum word is known or not.

There are other differences in regards to derivation path and so on, but broadly speaking, they are comparably difficult to brute force.
legendary
Activity: 3472
Merit: 10611
Bit confused with this.  Electrum only has a 12 word seed.  The Ledger Nano S has a 24 word seed.  So wouldn't having just half of the Ledger Nano seed, which is 12 words, be essentially the same thing as having no words of an Electrum seed?
No, because the size of the search space is a power of 2, so it grows exponentially with the entropy size. Beyond a certain size the entropy is considered secure. A 12 word mnemonic offers 128 bits of entropy, which is secure and, more importantly, offers exactly the same level of security as a bitcoin private key (a key for a 256-bit elliptic curve, which has 128 bits of security).

Cutting 12 words in half is cutting your entropy down to 64 which is not as secure anymore.
This is the difference:
2^64  = 18446744073709551616
2^128 = 340282366920938463463374607431768211456
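
To put those two numbers side by side as time, here is a back-of-the-envelope sketch. The guess rate is a made-up round figure (one trillion guesses per second), not a measurement of any real hardware, and the helper name is hypothetical.

```python
# Rough brute-force time for a 2**bits search space at an ASSUMED guess rate.
GUESSES_PER_SECOND = 10**12          # assumption: one trillion guesses per second
SECONDS_PER_YEAR = 365 * 24 * 3600

def years_to_exhaust(bits: int, rate: int = GUESSES_PER_SECOND) -> float:
    """Years needed to try every value in a 2**bits search space."""
    return 2**bits / rate / SECONDS_PER_YEAR

for bits in (64, 128):
    print(f"2^{bits}: about {years_to_exhaust(bits):.3g} years")
```

At that assumed rate the 64-bit space is exhausted in well under a year, while the 128-bit space needs on the order of 10^19 years.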
full member
Activity: 1792
Merit: 186
If I have a 24 word seed, and I split it into 2 - 12 and 12
If an attacker somehow finds either the first 12 words or the 2nd 12 words, is there any way they can use that to easily derive/find the other 12 words? Or am I as safe as if I simply had a 12 word seed (which many wallets use)?
Hackers can possibly brute force the whole seed phrase from just twelve of its words using powerful computational tools; tools like btcrecover can do it with high computational power. It is not good to split your seed phrase; instead you can use Shamir's secret sharing, which divides the seed phrase into shares properly and is the best option for this.


Bit confused with this.  Electrum only has a 12 word seed.  The Ledger Nano S has a 24 word seed.  So wouldn't having just half of the Ledger Nano seed, which is 12 words, be essentially the same thing as having no words of an Electrum seed?


legendary
Activity: 2268
Merit: 18775
Hackers can possibly brute force the whole seed phrase from just twelve of its words using powerful computational tools; tools like btcrecover can do it with high computational power.
No, it isn't.

A 24 word BIP39 phrase has 256 bits of entropy, with 8 bits of checksum. Depending on which 12 words the attacker knows, then, the remaining 12 words have either 132 bits or 124 bits of entropy. Both are still far outside the realm of possibility, with the time taken to brute force measured in billions of years even with huge amounts of cloud computing dedicated to the task.

If we are talking about a 15 or 18 word BIP39 phrase on the other hand, then the remaining entropy in those cases ranges from 28 bits to 66 bits, which is somewhere in the range of "very easy" to "possible in a few weeks/months", depending on the computing power involved.

Yes, brute forcing 12 words is exponentially easier than brute forcing 24, but it is still impossible for the time being.
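
For anyone who wants to reproduce these figures, a small sketch follows. It assumes the BIP39 rule that a phrase of MS words carries MS / 3 checksum bits and 11 bits per word; the function name and its parameters are hypothetical, and it counts unknown entropy bits before any checksum filtering.

```python
def remaining_bits(total_words: int, known_words: int, checksum_word_known: bool) -> int:
    """Unknown entropy bits left in a BIP39 phrase when known_words are known.

    checksum_word_known says whether the final word (which carries the
    checksum bits) is among the known words.
    """
    if total_words not in (12, 15, 18, 21, 24):
        raise ValueError("BIP39 phrases have 12, 15, 18, 21 or 24 words")
    if not 0 <= known_words <= total_words:
        raise ValueError("known_words out of range")
    if known_words == total_words:
        return 0
    checksum_bits = total_words // 3                  # 4, 5, 6, 7 or 8 bits
    unknown_bits = (total_words - known_words) * 11   # 11 bits per missing word
    # If the final word is missing, its checksum bits add no entropy of their own.
    return unknown_bits if checksum_word_known else unknown_bits - checksum_bits

for total in (15, 18, 24):
    print(total,
          remaining_bits(total, 12, checksum_word_known=True),
          remaining_bits(total, 12, checksum_word_known=False))
```

With 12 words known, this gives the 28 to 66 bit range for 15 and 18 word phrases and the 132 or 124 bits for a 24 word phrase.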
legendary
Activity: 1652
Merit: 1208
Gamble responsibly
If I have a 24 word seed, and I split it into 2 - 12 and 12
If an attacker somehow finds either the first 12 words or the 2nd 12 words, is there any way they can use that to easily derive/find the other 12 words? Or am I as safe as if I simply had a 12 word seed (which many wallets use)?
Hackers can possibly brute force the whole seed phrase from just twelve of its words using powerful computational tools; tools like btcrecover can do it with high computational power. It is not good to split your seed phrase; instead you can use Shamir's secret sharing, which divides the seed phrase into shares properly and is the best option for this.
legendary
Activity: 2268
Merit: 18775
Yet in this video, Andreas positively hates the idea: https://www.youtube.com/watch?v=p5nSibpfHYE
It seems his biggest issue here is that without using a system like Shamir's, finding one of the shares reveals two thirds of the secret, and reduces the remaining entropy from 2^256 to 2^80. Whereas finding anything less than m shares in an m-of-n Shamir's scheme reveals nothing about the secret, every additional share in an m-of-n non-Shamir's scheme makes brute forcing the remaining unknown words progressively easier.

He's not incorrect that brute forcing 2^80 is exponentially easier than brute forcing an entire seed phrase, and he's also not wrong that brute forcing 2^80 is potentially possible in the not too distant future, and for those reasons I agree that a Shamir's secret sharing scheme is a better mechanism. However, if you are concerned about your back up being discovered and cannot physically store it in a more secure fashion, then a 2-of-3 scheme is still better than no scheme at all.

And with any secret splitting scheme, you have to take into account how you would deal with one or more of your shares being lost or destroyed. If using any 2-of-3 scheme, I would make at least 2 copies of each share, so you could lose at least 3 shares and still be able to recover your coins.
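
To make the non-Shamir 2-of-3 layout concrete, here is a minimal sketch. It assumes the common arrangement where the phrase is cut into thirds A, B and C and the three shares hold A+B, B+C and A+C; the function names are hypothetical and the words are placeholders, not a real seed.

```python
def split_two_of_three(words: list[str]) -> list[dict[int, str]]:
    """Split a phrase into three shares, each holding two thirds of the words
    keyed by 1-based position, so that any two shares cover every position."""
    if len(words) % 3 != 0:
        raise ValueError("word count must be divisible by 3")
    third = len(words) // 3
    numbered = {i + 1: w for i, w in enumerate(words)}
    a = {i: w for i, w in numbered.items() if i <= third}
    b = {i: w for i, w in numbered.items() if third < i <= 2 * third}
    c = {i: w for i, w in numbered.items() if i > 2 * third}
    return [{**a, **b}, {**b, **c}, {**a, **c}]

def recombine(share_x: dict[int, str], share_y: dict[int, str]) -> list[str]:
    """Rebuild the ordered phrase from any two shares."""
    merged = {**share_x, **share_y}
    return [merged[i] for i in sorted(merged)]

phrase = [f"word{i:02d}" for i in range(1, 25)]   # placeholder words only
shares = split_two_of_three(phrase)

missing = len(phrase) - len(shares[0])            # words a single share lacks
print(len(shares[0]), "words per share,", missing, "words missing from each")
print(recombine(shares[0], shares[1]) == phrase)
```

Each share exposes 16 of the 24 words, so whoever finds one share is left with 8 unknown words, which is where the 2^80 figure above comes from (8 * 11 bits minus the 8 checksum bits).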
newbie
Activity: 1
Merit: 0
I wanted to re-up this topic in case developments since it was last discussed alter people's opinions.

Splitting a 24 word mnemonic into two parts seems like a good way of performing at least one backup.

And the idea seems to have garnered moderate approval here in this thread.

Yet in this video, Andreas positively hates the idea: https://www.youtube.com/watch?v=p5nSibpfHYE

I'm not sure if that's because he's against having 16/24 in one place and whether his concerns would be assuaged in a 12/24 set-up.
hero member
Activity: 761
Merit: 606
I wish I had done this.

When I was arrested they searched my home and office, found my backup with all 24 words, and confiscated all of my Bitcoins.  So it can be dangerous to have all 24 words where they can be found by homeland security.

That is why I like my Trezor.  Even finding my Trezor SEED words leaves them lacking an enormously long hidden wallet passphrase (in my head only).  I keep a short decoy passphrase to "throw them off the scent" if I ever get in that situation.  It may cost me 2-4 BTC but that would be it.  There is no proof I have a hidden wallet either.

I have always hated reading about your situation.
legendary
Activity: 2646
Merit: 1138
All paid signature campaigns should be banned.
I wish I had done this.

When I was arrested they searched my home and office, found my backup with all 24 words, and confiscated all of my Bitcoins.  So it can be dangerous to have all 24 words where they can be found by homeland security.
full member
Activity: 204
Merit: 100
OK thanks for replies.

What if someone knows the checksum word - can that help at all, because it means certain combinations are not possible - or are the combinatorics still too vast?
legendary
Activity: 1806
Merit: 1164
Hi,

If I have a 24 word seed, and I split it into 2 - 12 and 12
If an attacker somehow finds either the first 12 words or the 2nd 12 words, is there any way they can use that to easily derive/find the other 12 words? Or am I as safe as if I simply had a 12 word seed (which many wallets use)?

Basically, do the 2 halves in any way relate / help to derive each other, or do I retain half the entropy with half the seed? And does it matter if it's the half including the checksum (can that be used in any way to derive / guess what words it could be)?

Thanks  

*if this is wrong forum / somewhere better to ask please advise.

Having half the seed does not help an attacker. What could harm you is storing the halves separately and losing access. Put the seed someplace safe offline. How safe depends on how much bitcoin you own. If you are talking hardware wallets you can also add a layer of security by using a password to protect the seed. If someone gained access to your Trezor or Ledger seed it would be useless to them, they could not reconstruct your accounts without the password.
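
As a sketch of why the words alone are not enough once a password is set: on BIP39 wallets this extra password is normally the optional passphrase, which is mixed into the salt when the mnemonic is stretched into the wallet seed (PBKDF2-HMAC-SHA512, 2048 rounds, salt "mnemonic" + passphrase). The helper name and the example passphrase below are made up, and the mnemonic is the public all-zero test phrase, never to be used for real funds. The sketch does not validate the word list or checksum.

```python
import hashlib
import unicodedata

def bip39_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Derive the 64-byte BIP39 seed from a mnemonic and optional passphrase."""
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)

# Public test phrase (all-zero entropy), used here only for illustration.
words = " ".join(["abandon"] * 11 + ["about"])

seed_plain = bip39_seed(words)
seed_hidden = bip39_seed(words, "hypothetical long passphrase")

print(seed_plain.hex()[:16])
print(seed_hidden.hex()[:16])
print(seed_plain != seed_hidden)   # same words, different wallet
```

The same words with a different passphrase give an unrelated seed, so the accounts behind the passphrase cannot be reconstructed from the paper backup alone.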
legendary
Activity: 3472
Merit: 10611
to really get the feel of the chances go look at some of the "Electrum Seed recovery tools" out there. you can even download one from Github and try recovering a dummy seed from a new empty wallet or using this:
Quote
constant forest adore false green weave stop guy fur freeze giggle clock
from documents. try with 1 missing word then with 2 and so on and see how long it takes.
hero member
Activity: 761
Merit: 606
From a math perspective, logic would say that of course having the first 12 words of a 24 word puzzle is better than having none of them.  However, the math is so complicated that it's still beyond any plausible range of calculation that I have ever seen or read about.

As an example (this is my example and not mathematically provable by me): I can give you my btc address (view my signature as an example) so you have the entire address.  Now I tell you to "sign" for that account without me giving you the matching private key.  It can't be done.
full member
Activity: 204
Merit: 100
Hi - thanks. I understand the risks in terms of splitting up paper etc, just wondering about the technical side of it.

So basically, if someone finds half of my 24 word seed, I am still as protected as if I just had a normal 12 word seed? They still have to brute force something that is not brute forceable?
hero member
Activity: 761
Merit: 606
Having the first 12 words will not mathematically assist someone in discovering the remaining 12 words.  However, you have to consider that you are now twice as likely to lose your seed words, since you are protecting two sheets of paper instead of one.  Of course it is obvious that it would be more difficult for a "bad guy" to get the correct 24 words from total scratch.  I just mean the words are not related in a way where the first 12 go into a formula to work out the last 12.  It might be easier to swap out 3 of the 24 words with fake words that are still valid entries in the wallet's word list.  Then you simply swap the correct 3 words back in to restore the wallet.  I don't like any of these ideas because they are error prone.
full member
Activity: 204
Merit: 100
Hi,

If I have a 24 word seed, and I split it into 2 - 12 and 12
If an attacker somehow finds either the first 12 words or the 2nd 12 words, is there any way they can use that to easily derive/find the other 12 words? Or am I as safe as if I simply had a 12 word seed (which many wallets use)?

Basically, do the 2 halves in any way relate / help to derive each other, or do I retain half the entropy with half the seed? And does it matter if it's the half including the checksum (can that be used in any way to derive / guess what words it could be)?

Thanks  

*if this is wrong forum / somewhere better to ask please advise.