Pages:
Author

Topic: Self-descriptive strengthened keying: a standard for deriving keys from seeds - page 2. (Read 6202 times)

legendary
Activity: 1526
Merit: 1134
I would definitely think about a function like scrypt here.

Despite this algorithm, I'm still missing / failing to understand what stops you calculating rainbow tables for common passwords given sufficient CPU time (it only has to be done once). Is the only defence against this telling users "choose a strong password or die"? What happens if technology improves and over time, functions that seemed strong become weaker (as happened with MD5/SHA for password hashing) - it results in money being stolen.
legendary
Activity: 1072
Merit: 1181
(this is a proposal based on a discussion with gmaxwell and etotheipi, and inspired by a proposed extension to the Casascius mini private key format)

1. Abstract

There is currently a lot of talk about deriving keys or wallets from a relatively short seed in a deterministic way ("brainwallets"). As your wallet's keys are exposed to the whole world, it is essential to make sure it is hard to mount a brute-force attack against potentially easy-to-guess seeds (in particular when the seeds are user-chosen passwords). This is a proposal for a standard for doing this derivation, which has some nice properties.

2. Introduction

Typically, key strengthening is used to increase the time necessary for trying a single seed. This does however require either fixing the number of iterations (which should increase over time as hardware gets faster), or forcing the user to remember the number of iterations. It also lacks a way to verify a seed was correct, as every seed results in a valid key.

An alternative is using a checksumming system. For example, requiring that SHA256("BLAH" + seed) starts with a given number of zero bits, and using SHA256(seed) as key. With N required zero bits, this also slows down testing of seeds by 2N per valid seed. It does provide a way to verify a seed was correct, is very fast to do the actual derivation, but also introduces some variation in the seed generation.

Neither solution however addresses the fact that it requires the derivation count or checksum strength to be known in advance (either as part of the specification, or by making the user remember). However, by combining both strengthening and checksumming, we can have seeds that encode their own strength, without compromising security.

3. Proposed algorithm

This is the proposed derivation algorithm:
  • Calculate H(H(H(...(seed)...))), with 257 iterations, and H = HMAC-SHA512.
  • If the 257th iteration results in a hash that starts with 8 zero bits, the 256th iteration hash is the output (the key).
  • If not, keep doing iterations
  • If the 513th iteration results in a hash that starts with 9 zero bits, the 512th iteration hash is the output.
  • If the 1025th iterations results in a hash that starts with 10 zero bits, the 1024th iteration hash is the output.
  • ...

This results in a variable-strength keying that requires 216, 218, 220, ... iterations of H per valid seed (both for a legitimate user creating a new seed, and an attacker trying to attack them). However, an attacker who does not know which strength the user chose, cannot give up quickly after a limited number of iterations. In what follows, we will however assume an attacker does now, so the security of this scheme does not depend on that property.

What this does essentially, is split the generation difficulty equally over making it take 2N steps to generate a key, and at the same time, making only 1 in 2N seed valid. This has some nice properties:
  • Generating a key of a given strength S, requires 216+2*S iterations on average
  • An attacker that has an oracle which can iterate all valid seeds of a given strength in constant time, still needs as many elements as there are in the original seed space to test all valid keys.

Note that HMAC-SHA512 does take a salt as input. This salt can be an application-defined default (such as the fixed string "Bitcoin key derivation"), but the application can also prompt the user for some extra random but non-secret bits (such as an email address, a birthdate, ...). There have been threads above such a salt alone, if I recall correctly.

4. Improvements

The potential difficulties (number of iterations necessary to a generate a valid seed) offered by the scheme described above are relatively far apart: a factor 4. We can however tweak the algorithm a bit.

Instead of checking for a particular number of zero bits at the start of a hash, we can interpret the hash as a number, and compare it directly to some predefined constant. This is similar to how the proof-of-work check happens in Bitcoin itself. Assume we want the difficulties to start at 216 still, but have them be a factor 2 instead of a factor 4 apart, we can use this algorithm:
  • Calculate H(H(H(...(seed)...))), with 257 iterations, and H = HMAC-SHA512.
  • If the 257th iteration results in a hash whose first 32 bits are below 0x101000, the 256th iteration hash is the output (the key).
  • If not, keep doing iterations
  • If the 363th iteration results in a hash whose first 32 bits are below 0xB60183, the 362th iteration hash is the output.
  • If the 513th iteration results in a hash whose first 32 bits are below 0x80C1A2, the 512th iteration hash is the output.
  • If the 726th iteration results in a hash whose first 32 bits are below 0x5B2143, the 725th iteration hash is the output.
  • ...
(more constants and mathematical model to obtain them can be found here; the numbers are found by a numerical approximation).

This results in difficulties very close to 216, 217, 218, ...

Another potential improvement is using a different H function, such as something based on scrypt.

5. Implementation

An example implementation in Javascript can be found here. It does use a modified set of constants, which result in difficulties around 210, 211, 212, ... instead, as difficulties around 216 are already unusably slow (though possible) in Javascript.
Pages:
Jump to: