Pages:
Author

Topic: SHA-256 as a boolean function - page 2. (Read 11314 times)

legendary
Activity: 1050
Merit: 1000
You are WRONG!
December 28, 2011, 07:59:50 AM
#4
More-or-less so. The aim is, indeed, to 'reduce needless duplication of work'. However, it's not possible to process all inputs at once.

Generally speaking, main part of SHA-256 is a function of 512 boolean variables. We're going to change only 32 boolean variables to look for certain result, so we can assume other 480 variables fixed. So for a certain block header we can reduce SHA-256 to a function of 32 variables instead of 512 ones, optimizing it in process.

Furthermore, as we're using hardware which can do 32, 64, 128 or 256 binary operations at once, we can utilize this by considering many combinations at once. Let's say for a 64-bit hardware we can, essentially, vary 6 bits at once. Thus we already have a function of (32-6) = 26 variables, so we need only 2^26 combinations. Furthermore, it is possible to optimize things a bit for each such combination.

Processing 64 (or more) combinations at once merely makes it competitive with traditional implementation, it does not reduce number of bit ops done by CPU. Number of bin ops should be reduced at a step when we assume certain bits fixed.


you are forgetting the SHA also is having a internal state...
legendary
Activity: 1022
Merit: 1033
December 28, 2011, 07:52:27 AM
#3
More-or-less so. The aim is, indeed, to 'reduce needless duplication of work'. However, it's not possible to process all inputs at once.

Generally speaking, main part of SHA-256 is a function of 512 boolean variables. We're going to change only 32 boolean variables to look for certain result, so we can assume other 480 variables fixed. So for a certain block header we can reduce SHA-256 to a function of 32 variables instead of 512 ones, optimizing it in process.

Furthermore, as we're using hardware which can do 32, 64, 128 or 256 binary operations at once, we can utilize this by considering many combinations at once. Let's say for a 64-bit hardware we can, essentially, vary 6 bits at once. Thus we already have a function of (32-6) = 26 variables, so we need only 2^26 combinations. Furthermore, it is possible to optimize things a bit for each such combination.

Processing 64 (or more) combinations at once merely makes it competitive with traditional implementation, it does not reduce number of bit ops done by CPU. Number of bin ops should be reduced at a step when we assume certain bits fixed.

donator
Activity: 266
Merit: 252
I'm actually a pineapple
December 28, 2011, 07:21:53 AM
#2
It seems like the main benefit from your proposed approach is that you're considering the function over a bunch of (uniformly distributed) inputs simultaneously, so you'd essentially be trying to reduce needless duplication of work across separate calls to the hash function? At first glance, it does seem like it could be a fruitful direction to pursue.
legendary
Activity: 1022
Merit: 1033
December 28, 2011, 06:58:00 AM
#1
I'm experimenting with an alternative way to compute SHA-256 -- through tracing boolean functions.

Here's the idea:

1. Represent SHA-256 as a huge-ass DAG (directed acyclic graph) with nodes being boolean operations -- AND, OR, NOT, XOR. (One SHA256 block expansion + transform has 121k nodes after some optimization.)

2. Since we go through ~4 billion different nonces it makes sense to assume all other variables constant and optimize some nodes away. I get 86k nodes with 32 variables left.

3. Pre-compute fringe nodes with few dependencies via binary decision diagrams, normal forms or whatever. Optimize out parts of graph which do not need to be computed with certain configurations of nonce.

4. Now we can compute SHA-256 with about as many CPU instructions as there were nodes in DAG. It is possible to compute many combinations at once in parallel, e.g. on CPU which can do 64-bit operations in parallel it is possible to compute 64 combinations.

I've got some preliminary results, without precomputation it, theoretically, should be about as fast as traditional implementation of SHA-256 on CPUs, probably a bit slower due to required load/store operations. With pre-computation and other optimizations it might be faster. Or maybe not. I don't know.

Anybody wants to discuss? I'm not an expert in discrete math, so maybe some SAT solver or boolean function optimization techniques I don't know would be useful here.

Here's discussion in sci.crypt: http://groups.google.com/group/sci.crypt/browse_thread/thread/06ffe40905f725fe

Theoretical considerations:

 * traditional implementation makes advantage of native integer addition, it is very expensive to express addition in form of boolean ops
 * boolean function implementation, however, does not need rotateright -- it is optimized away at graph building stage
 * further, boolean function implementation can take advantage of pre-computation (I see some GPU implementations pre-compute some things, but they cannot go deep).

So this might be of interest for implementation on GPUs which do not have ROTR.
Pages:
Jump to: