Shorter words for seed wordlist | Bitcointalksearch.org

Quote

The main reason is that the encoding method used in RFC1751 collides with patent US5892470 A.
My personal opinion is of course that this patent is ridiculous and should never have been granted.
However, I did not want to take any risk, because if the site hosting the source code (github, gitorious)
receives a cease and desist letter, they will remove the project rather than hire a lawyer to defend the
case. And if someone decided to target Bitcoin software in general, this patent gives them a reason to
attack Electrum.

In order to circumvent that patent, I used a different encoding algorithm and a different dictionary.
Of course I could have changed only the encoding algorithm and kept the same dictionary, but that
would have been a terrible idea, because it means Electrum would have generated RFC 1751 valid
passphrases, but decodes these phrases differently. This would definitely have been considered as a bug.

Another reason not to use the same dictionary as in the RFC is that it contains mostly short words,
which are not good for long-term memorization. People often believe that short words are easier to
remember, because they confuse short-term and long-term memory. STM and LTM are separate functions,
that are performed in anatomically distinct parts of the brain (hippocampus and cortex, respectively).
It is true that sequences of short words are easier to store and recall in short term memory (Baddeley
et al 1975), but that does not make them good candidates for long term memory storage. In order
to store a list of words in long term memory, these words must be both familiar and salient (not too
common and with some semantic or emotional load). Another good thing that boosts memory is to
have words from different categories (eg verbs and nouns), as explained in this paper:
http://csjarchive.cogsci.rpi.edu/proceedings/2008/pdfs/p2183.pdf

This is why I used words from a poetry list found on Wikimedia; this list contained words that were both
familiar and salient. Starting from this list, I first removed words that I found too short or too common,
and verbs that were conjugated with different tenses. (I also removed nsfw words such as "fuck" and "shit",
although I realize I forgot a few of them). After that, I still had more words than needed, so I ran an
optimization algorithm, in order to select the subset with maximal average Hamming distance between words.

cheers

Thomas

From here Why you cannot enter an arbitrary seed in Electrum

Topic: Shorter words for seed wordlist (Read 593 times)