@jonald_fyookball
> We should note that when brute forcing seeds, none of the preceding letters are known.
You don't need to /know/ any actual element of the password. The speedup comes from knowing, for English, the conditional probabilities of certain words or characters given what precedes them. So you don't have to run an exhaustive search over all combinations in the dictionary, or rather: you can try the more likely combinations first, then the slightly less likely ones, etc.
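To put a rough number on "try the more likely ones first", here is a minimal sketch. The probabilities are made up for illustration and `expected_guesses` is just a helper name I picked; the only point is that guessing in descending order of probability lowers the expected number of attempts compared to treating every candidate as equally likely.

```python
def expected_guesses(probs):
    """Expected number of attempts if candidates are tried in the given order."""
    return sum(rank * p for rank, p in enumerate(probs, start=1))

# Hypothetical distribution over 8 candidate phrases (made-up numbers, sums to 1).
skewed = [0.40, 0.25, 0.15, 0.10, 0.05, 0.03, 0.01, 0.01]
# Same 8 candidates if they had been chosen uniformly at random.
uniform = [1 / 8] * 8

best_first = expected_guesses(sorted(skewed, reverse=True))  # most likely first
worst_first = expected_guesses(sorted(skewed))               # least likely first
blind = expected_guesses(uniform)                            # order doesn't matter

print(f"most likely first: {best_first:.2f} guesses on average")
print(f"uniform choice:    {blind:.2f} guesses on average")
print(f"least likely first:{worst_first:.2f} guesses on average")
```

With these toy numbers the ordered search needs about 2.33 guesses on average versus 4.5 for the uniform case. How large the gap is for real seeds depends entirely on how skewed the real distribution is.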
> It's like trying to brute force guess all the moves of billions of chess games played between relatively strong engines.
Not really. It's closer to having a slightly better algorithm for finding the optimal move in a chess game than the next best alternative algorithm. Which sounds exactly like something you'd want to use if you plan to win at chess.
I said it before: I'm not an expert on password cracking algorithms, and I can't say what speedup to expect in exact numbers. But I know enough about (statistical) language models to say that I'm pretty sure it could make quite a difference if implemented right, and if the assumption is correct that what you're trying to find is not a random sequence, but something generated by "English", or "close to English".
Now then, let's see what a quick Google search comes up with...
(1)
The result of which is (usually) a more efficient way of cracking passwords. So instead of guessing every possible combination of characters incrementally, it uses a statistical model where the most common characters are used first. 'C' followed by 'a' or 'e' for example, or 'q' followed by 'u'.
from:
https://www.trustwave.com/Resources/SpiderLabs-Blog/Hashcat-Per-Position-Markov-Chains/
Which describes (from what I can tell) an application to password cracking of Shannon's insight mentioned above.
(2)
The result is a series of statistically generated brute-force attacks based on a mathematical system known as Markov chains. Hashcat makes it simple to implement this method. By looking at the list of passwords that already have been cracked, it performs probabilistically ordered, per-position brute-force attacks.
from:
http://www.wired.co.uk/news/archive/2013-05/28/password-cracking/page/2
That one is not even based on any underlying "English grammar", but it's the same principle: there's a set of conditional probabilities they can work with given that the sequence hasn't been chosen at random.
In a sense, the "grammar" here is the "grammar of previously discovered passwords".
Super slick, by the way, must admit that.
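To make the principle behind (1) and (2) concrete, here's a toy sketch. To be clear, this is NOT hashcat's implementation: it's a plain first-order (character-to-next-character) Markov model rather than a per-position one, the "known passwords" list is invented, and `train_bigrams` and `log_prob` are names I made up for the example. It learns transition counts from the known strings and then sorts all two-letter candidates so the most probable ones get tried first.

```python
from collections import Counter, defaultdict
from itertools import product
import math

def train_bigrams(samples):
    """Count how often each character follows the previous one ('^' marks the start)."""
    counts = defaultdict(Counter)
    for s in samples:
        prev = "^"
        for ch in s:
            counts[prev][ch] += 1
            prev = ch
    return counts

def log_prob(candidate, counts, alphabet, alpha=1.0):
    """Log-probability of a candidate under the model, with add-alpha smoothing."""
    total = 0.0
    prev = "^"
    for ch in candidate:
        row = counts.get(prev, Counter())
        num = row[ch] + alpha
        den = sum(row.values()) + alpha * len(alphabet)
        total += math.log(num / den)
        prev = ch
    return total

# Invented stand-in for "passwords that have already been cracked".
known = ["queen", "quest", "quiet", "cat", "cent"]
alphabet = sorted(set("".join(known)))
model = train_bigrams(known)

# Every two-letter string over the alphabet, most probable first.
candidates = ["".join(t) for t in product(alphabet, repeat=2)]
ordered = sorted(candidates, key=lambda c: log_prob(c, model, alphabet), reverse=True)

print(ordered[:5])
```

On this toy data "qu" ends up at the front of the queue, which is the 'q' followed by 'u' effect from the first quote. A plain brute force would visit the same candidates, just in an order that ignores this information.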
(3)
Both Figure 4-6 and Figure 4-7 indicate that the Markov Chains method recovers passwords faster than Brute-force.
from:
https://www.ma.rhul.ac.uk/static/techrep/2013/MA-2013-07.pdf on page 38
This one's probably the closest to what I had in mind. Password cracking based on Markov Chains that encode some form of "English knowledge" to guide the search. And, who would have thought, it's faster than brute forcing.
Sorry if this comes across as rude, but that was the last message on this topic from me.
I've made the point that I believe needs to be mentioned in the context of this discussion, and that point itself is not a matter of discussion, but a mathematical certainty:
The entropy of English or near-English phrases is lower than that of randomly generated sequences.
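If you want to see the entropy gap on a small scale, here's a rough sketch. It only looks at single-letter frequencies in one short sample sentence (which I picked arbitrarily), so it ignores the conditional structure discussed above and, if anything, understates the difference. The helper name `entropy_bits` is my own.

```python
import math
from collections import Counter

def entropy_bits(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Letters drawn uniformly at random from a 26-letter alphabet.
uniform = [1 / 26] * 26

# Letter frequencies observed in an arbitrary English sample sentence.
text = "the quick brown fox jumps over the lazy dog and then the dog sleeps"
letters = [c for c in text if c.isalpha()]
freq = Counter(letters)
observed = [n / len(letters) for n in freq.values()]

h_uniform = entropy_bits(uniform)
h_observed = entropy_bits(observed)

print(f"uniform letters: {h_uniform:.2f} bits per character")
print(f"sample sentence: {h_observed:.2f} bits per character")
```

The uniform case comes out at log2(26), about 4.70 bits per character, which is the maximum possible for 26 symbols. Any non-uniform letter distribution lands below that.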
Now, admittedly, whether you think the above is worth making it more difficult for users to remember their password is a different matter. But that's a trade-off decision then, between usability and (guaranteed) safety.
And all these things considered, I think that Thomas V found an excellent solution: by default, seed generation is random, because on average, humans suck at coming up with randomness.
If however you have shown that you have a modicum of technical knowledge, you can enter your own seed, and then it's your own responsibility to ensure it is good enough.
Think of it like a "You must be this tall to ride" sign at the entrance of a roller coaster, with a pair of walking stilts placed right next to it.
If you're sure you want to roll your own, you can already do so. I don't see any need to ask Thomas to invite everyone to come up with their own seed, because the likely result is that average seed quality would decrease.
(EDIT) One thing, to be clear: I agree with you guys that it's hard, if not impossible, for most people to memorize the random seed. That's why you should probably write it down or print it, and find a way to store it away.
Hell, if your funds warrant that level of security, put it into a sealed envelope and place that one into an insured bank vault. The same principles of storing anything physical of great value apply here, except that, in our favor, (a) you rarely if ever need to get the item (only to recover your keys), and (b) the item is small, so hiding it or renting a safe deposit box is easier than it would be for a larger object.