Self-descriptive strengthened keying: a standard for deriving keys from seeds - page 2.

(this is a proposal based on a discussion with gmaxwell and etotheipi, and inspired by a proposed extension to the Casascius mini private key format)

1. Abstract

There is currently a lot of talk about deriving keys or wallets from a relatively short seed in a deterministic way ("brainwallets"). As your wallet's keys are exposed to the whole world, it is essential to make sure it is hard to mount a brute-force attack against potentially easy-to-guess seeds (in particular when the seeds are user-chosen passwords). This is a proposal for a standard for doing this derivation, which has some nice properties.

2. Introduction

Typically, key strengthening is used to increase the time necessary for trying a single seed. This does however require either fixing the number of iterations (which should increase over time as hardware gets faster), or forcing the user to remember the number of iterations. It also lacks a way to verify a seed was correct, as every seed results in a valid key.

An alternative is using a checksumming system. For example, requiring that SHA256("BLAH" + seed) starts with a given number of zero bits, and using SHA256(seed) as key. With N required zero bits, this also slows down testing of seeds by 2^N per valid seed. It does provide a way to verify a seed was correct, is very fast to do the actual derivation, but also introduces some variation in the seed generation.

Neither solution however addresses the fact that it requires the derivation count or checksum strength to be known in advance (either as part of the specification, or by making the user remember). However, by combining both strengthening and checksumming, we can have seeds that encode their own strength, without compromising security.

3. Proposed algorithm

This is the proposed derivation algorithm:

Calculate H(H(H(...(seed)...))), with 257 iterations, and H = HMAC-SHA512.
If the 257th iteration results in a hash that starts with 8 zero bits, the 256th iteration hash is the output (the key).
If not, keep doing iterations
If the 513th iteration results in a hash that starts with 9 zero bits, the 512th iteration hash is the output.
If the 1025th iterations results in a hash that starts with 10 zero bits, the 1024th iteration hash is the output.
...

This results in a variable-strength keying that requires 2¹⁶, 2¹⁸, 2²⁰, ... iterations of H per valid seed (both for a legitimate user creating a new seed, and an attacker trying to attack them). However, an attacker who does not know which strength the user chose, cannot give up quickly after a limited number of iterations. In what follows, we will however assume an attacker does now, so the security of this scheme does not depend on that property.

What this does essentially, is split the generation difficulty equally over making it take 2^N steps to generate a key, and at the same time, making only 1 in 2^N seed valid. This has some nice properties:

Generating a key of a given strength S, requires 2^16+2*S iterations on average
An attacker that has an oracle which can iterate all valid seeds of a given strength in constant time, still needs as many elements as there are in the original seed space to test all valid keys.

Note that HMAC-SHA512 does take a salt as input. This salt can be an application-defined default (such as the fixed string "Bitcoin key derivation"), but the application can also prompt the user for some extra random but non-secret bits (such as an email address, a birthdate, ...). There have been threads above such a salt alone, if I recall correctly.

4. Improvements

The potential difficulties (number of iterations necessary to a generate a valid seed) offered by the scheme described above are relatively far apart: a factor 4. We can however tweak the algorithm a bit.

Instead of checking for a particular number of zero bits at the start of a hash, we can interpret the hash as a number, and compare it directly to some predefined constant. This is similar to how the proof-of-work check happens in Bitcoin itself. Assume we want the difficulties to start at 2¹⁶ still, but have them be a factor 2 instead of a factor 4 apart, we can use this algorithm:

Calculate H(H(H(...(seed)...))), with 257 iterations, and H = HMAC-SHA512.
If the 257th iteration results in a hash whose first 32 bits are below 0x101000, the 256th iteration hash is the output (the key).
If not, keep doing iterations
If the 363th iteration results in a hash whose first 32 bits are below 0xB60183, the 362th iteration hash is the output.
If the 513th iteration results in a hash whose first 32 bits are below 0x80C1A2, the 512th iteration hash is the output.
If the 726th iteration results in a hash whose first 32 bits are below 0x5B2143, the 725th iteration hash is the output.
...

(more constants and mathematical model to obtain them can be found here; the numbers are found by a numerical approximation).

This results in difficulties very close to 2¹⁶, 2¹⁷, 2¹⁸, ...

Another potential improvement is using a different H function, such as something based on scrypt.

5. Implementation

An example implementation in Javascript can be found here. It does use a modified set of constants, which result in difficulties around 2¹⁰, 2¹¹, 2¹², ... instead, as difficulties around 2¹⁶ are already unusably slow (though possible) in Javascript.

Topic: Self-descriptive strengthened keying: a standard for deriving keys from seeds - page 2. (Read 6221 times)