(this is a proposal based on a discussion with gmaxwell and etotheipi, and inspired by a proposed extension to the Casascius mini private key format)
1. AbstractThere is currently a lot of talk about deriving keys or wallets from a relatively short seed in a deterministic way ("brainwallets"). As your wallet's keys are exposed to the whole world, it is essential to make sure it is hard to mount a brute-force attack against potentially easy-to-guess seeds (in particular when the seeds are user-chosen passwords). This is a proposal for a standard for doing this derivation, which has some nice properties.
2. IntroductionTypically, key strengthening is used to increase the time necessary for trying a single seed. This does however require either fixing the number of iterations (which should increase over time as hardware gets faster), or forcing the user to remember the number of iterations. It also lacks a way to verify a seed was correct, as every seed results in a valid key.
An alternative is using a checksumming system. For example, requiring that SHA256("BLAH" + seed) starts with a given number of zero bits, and using SHA256(seed) as key. With N required zero bits, this also slows down testing of seeds by 2
N per valid seed. It does provide a way to verify a seed was correct, is very fast to do the actual derivation, but also introduces some variation in the seed generation.
Neither solution however addresses the fact that it requires the derivation count or checksum strength to be known in advance (either as part of the specification, or by making the user remember). However, by combining both strengthening and checksumming, we can have seeds that encode their own strength, without compromising security.
3. Proposed algorithmThis is the proposed derivation algorithm:
- Calculate H(H(H(...(seed)...))), with 257 iterations, and H = HMAC-SHA512.
- If the 257th iteration results in a hash that starts with 8 zero bits, the 256th iteration hash is the output (the key).
- If not, keep doing iterations
- If the 513th iteration results in a hash that starts with 9 zero bits, the 512th iteration hash is the output.
- If the 1025th iterations results in a hash that starts with 10 zero bits, the 1024th iteration hash is the output.
- ...
This results in a variable-strength keying that requires 2
16, 2
18, 2
20, ... iterations of H per valid seed (both for a legitimate user creating a new seed, and an attacker trying to attack them). However, an attacker who does not know which strength the user chose, cannot give up quickly after a limited number of iterations. In what follows, we will however assume an attacker does now, so the security of this scheme does not depend on that property.
What this does essentially, is split the generation difficulty equally over making it take 2
N steps to generate a key, and at the same time, making only 1 in 2
N seed valid. This has some nice properties:
- Generating a key of a given strength S, requires 216+2*S iterations on average
- An attacker that has an oracle which can iterate all valid seeds of a given strength in constant time, still needs as many elements as there are in the original seed space to test all valid keys.
Note that HMAC-SHA512 does take a salt as input. This salt can be an application-defined default (such as the fixed string "Bitcoin key derivation"), but the application can also prompt the user for some extra random but non-secret bits (such as an email address, a birthdate, ...). There have been threads above such a salt alone, if I recall correctly.
4. ImprovementsThe potential difficulties (number of iterations necessary to a generate a valid seed) offered by the scheme described above are relatively far apart: a factor 4. We can however tweak the algorithm a bit.
Instead of checking for a particular number of zero bits at the start of a hash, we can interpret the hash as a number, and compare it directly to some predefined constant. This is similar to how the proof-of-work check happens in Bitcoin itself. Assume we want the difficulties to start at 2
16 still, but have them be a factor 2 instead of a factor 4 apart, we can use this algorithm:
- Calculate H(H(H(...(seed)...))), with 257 iterations, and H = HMAC-SHA512.
- If the 257th iteration results in a hash whose first 32 bits are below 0x101000, the 256th iteration hash is the output (the key).
- If not, keep doing iterations
- If the 363th iteration results in a hash whose first 32 bits are below 0xB60183, the 362th iteration hash is the output.
- If the 513th iteration results in a hash whose first 32 bits are below 0x80C1A2, the 512th iteration hash is the output.
- If the 726th iteration results in a hash whose first 32 bits are below 0x5B2143, the 725th iteration hash is the output.
- ...
(more constants and mathematical model to obtain them can be found
here; the numbers are found by a numerical approximation).
This results in difficulties very close to 2
16, 2
17, 2
18, ...
Another potential improvement is using a different H function, such as something based on scrypt.
5. ImplementationAn example implementation in Javascript can be found
here. It does use a modified set of constants, which result in difficulties around 2
10, 2
11, 2
12, ... instead, as difficulties around 2
16 are already unusably slow (though possible) in Javascript.