Potentially faster method for mining on the CPU - page 3.

jlcooke

newbie

Activity: 46

Merit: 0

Quote from: smolen on August 15, 2013, 03:03:01 PM

Quote from: JoelKatz on August 14, 2013, 02:39:04 PM

Are you saying that an algorithm that cannot be simplified cannot exist?

I'd be happy with good approximation of double SHA-2

Quote from: jlcooke on August 15, 2013, 02:42:17 PM

Even if you find that a (SHA-256)^2 function can be implemented with depth=2 and width=1,000,000,000 - it would be useful work that could be built upon.

Quite unrealistic, but if done - SAT-solvers will step in.

Hehe. I like encouraging people to explore this. Good learning "exorcize" (pun). 32 bit addition logic for botnet: http://jlcooke.ca/btc/sha256_logred.html

Just do that about 768 times and reduce! Tongue

smolen

hero member

Activity: 524

Merit: 500

Quote from: JoelKatz on August 14, 2013, 02:39:04 PM

Are you saying that an algorithm that cannot be simplified cannot exist?

I'd be happy with good approximation of double SHA-2

Quote from: jlcooke on August 15, 2013, 02:42:17 PM

Even if you find that a (SHA-256)^2 function can be implemented with depth=2 and width=1,000,000,000 - it would be useful work that could be built upon.

Quite unrealistic, but if done - SAT-solvers will step in.

jlcooke

newbie

Activity: 46

Merit: 0

Quote from: botnet on August 15, 2013, 01:02:03 AM

Quote from: jlcooke on August 13, 2013, 10:37:27 AM

Let's let the OP try to write up the logic formula for SHA256's first 32 bits of input and 1 bit of output.

Thanks for the links.

I rewrote the code so that it performs the double SHA256 first, constructing an in-memory tree of all the operations and variables (without actually evaluating them, arithmetically or symbolically.) So basically for each bit in the final hash, I have a tree representing all the operations that have to occur (using an in-order walk) to compute that bit.

...

If everything looks good, the longer term plan is to assign branch simplification to different datacenter machines, so that I can have a few thousand of them simplifying in parallel instead of 8 on my desktop.

Fun! This work could be published in crypto journals, you know. Even if you find that a (SHA-256)^2 function can be implemented with depth=2 and width=1,000,000,000 - it would be useful work that could be built upon. And you code to do this could be modified to examine SHA-3, whirlpool, and other hash functions.

For the record - (SHA-256)^2 has 64+64=128 rounds with 6 32-bit additions per round.

botnet

newbie

Activity: 35

Merit: 0

Quote from: jlcooke on August 13, 2013, 10:37:27 AM

Let's let the OP try to write up the logic formula for SHA256's first 32 bits of input and 1 bit of output.

Thanks for the links.

I rewrote the code so that it performs the double SHA256 first, constructing an in-memory tree of all the operations and variables (without actually evaluating them, arithmetically or symbolically.) So basically for each bit in the final hash, I have a tree representing all the operations that have to occur (using an in-order walk) to compute that bit.

What this allows me to do is simplify different branches of the trees independently, and in parallel on different threads. The other thing it gives me is the ability to see the current depth of the tree I'm trying to simplify, which will give me some "order of magnitude" guess of how large the final equation will be.
There's thousands of these branches to simplify and they do take a very long time, but watching them in the debugger, things are reducing and it's not ballooning out of control (yet.)

If everything looks good, the longer term plan is to assign branch simplification to different datacenter machines, so that I can have a few thousand of them simplifying in parallel instead of 8 on my desktop.

JoelKatz

legendary

Activity: 1596

Merit: 1012

Democracy is vulnerable to a 51% attack.

Quote from: BitcoinBarrel on August 14, 2013, 09:49:55 AM

There HAS to be a mathematical shortcut that simplifies the encryption algorithm instead of randomly cracking with brute force. It wouldn't surprise me if there was already a solution out there.

The greatest mathematical minds on the planet specifically designed the algorithm so that there wouldn't be. Are you saying that an algorithm that cannot be simplified cannot exist? (If you think about it, that's obviously incorrect.)

OnkelPaul

legendary

Activity: 1039

Merit: 1005

Quote from: BitcoinBarrel on August 14, 2013, 09:49:55 AM

There HAS to be a mathematical shortcut...

Wishing does not work well in reality unless you address someone who can fulfill your wish. Adressing the universe wishing for a mathematical solution to a problem that's *designed* to have no easily computable mathematical solution won't work.

Onkel Paul

BitcoinBarrel

legendary

Activity: 2062

Merit: 1035

Fill Your Barrel with Bitcoins!

There HAS to be a mathematical shortcut that simplifies the encryption algorithm instead of randomly cracking with brute force. It wouldn't surprise me if there was already a solution out there.

jlcooke

newbie

Activity: 46

Merit: 0

Quote from: JoelKatz on August 12, 2013, 07:32:04 PM

The big problem is that there isn't enough space on all the computers in the universe to store the equation for just one of the bits, even if fully simplified. That and even if every particle in the universe were a supercomputer, the age of the universe wouldn't be long enough to solve the equation. The equations are not linear and so they don't simplify very much.

Cryptographers know of this symbolic reduction technique - and SHA-256 and AES and RSA and about eveyr single ASIC design process uses SymRed to optimize their circuits.

Let's let the OP try to write up the logic formula for SHA256's first 32 bits of input and 1 bit of output. Then try to use Karnaugh Maps (http://en.wikipedia.org/wiki/Karnaugh_map) or other technique to simplify it. I predict it will educate them on how hard a problem this is and what kinds of things cryptographers think of when designing crypto functions.

Edit: Also look at: http://en.wikipedia.org/wiki/Circuit_minimization

JoelKatz

legendary

Activity: 1596

Merit: 1012

Democracy is vulnerable to a 51% attack.

The big problem is that there isn't enough space on all the computers in the universe to store the equation for just one of the bits, even if fully simplified. That and even if every particle in the universe were a supercomputer, the age of the universe wouldn't be long enough to solve the equation. The equations are not linear and so they don't simplify very much.

dell123

newbie

Activity: 28

Merit: 0

I have to say on a CPU brute force mode mining will always be king in my eyes. I like that you're trying to innovate, never stop looking for ways to improve. But your method may need some tweaking if it's going to be the best way to mine for you. You raise some extremely interesting points though, we should chat some time.

botnet

newbie

Activity: 35

Merit: 0

a colleague of mine noted the symbolic addition follows a recurrence relation pattern and hence might be simplified by these means: http://en.wikipedia.org/wiki/Recurrence_relation

bill86

full member

Activity: 159

Merit: 100

@botnet

Quote from: botnet on August 11, 2013, 12:42:17 AM

Quote from: bill86 on August 07, 2013, 08:31:35 PM

Now I understand your idea better. But you are possibly wrong: The main complexity of your approach is not because of the degree of the polynomials but because of the referencing of the literals (you called them 'variables') to each other.

At this point you have just two choices:
1. Try random input (you know this solution under the (wrong) name 'brute force')
2. Try to approximate to a good solution (but then your result would only be partly correct: Who is interested in a 90% correct nonce to the given block and targeted result?) and it is not proven in this case that an approximation exists.

I think there is some misunderstanding of the technique that I will try and clear up with an example. It should in fact give an exact solution, not an approximate one.

What I saw is that your problem is possibly some sort of NP-hard. So I referenced to the only viable solution to get feasible results out of your algorithm: random input or approximation.

Quote from: botnet on August 11, 2013, 12:42:17 AM

In my proposed approach, this is done symbolically. First, symbolically do the addition and Xor operations to produce a set of equations that represents each of the 8 bits of 'target' in terms of the 8 bits of 'nonce'

t0: (!n0)
t1: (Xor[1, n0, n1])
t2: (Xor[n2, n1 && n0])
t3: (Xor[1, !n3, n2 || (n1 && n0)])
t4: (Xor[n4, n3 || n2 || (n1 && n0)])
t5: (Xor[1, n4 && (n3 || n2 || (n1 && n0)), !n5])
t6: (Xor[1, n6, n5 || (n4 && (n3 || n2 || (n1 && n0)))])
t7: (Xor[n6 && (n5 || (n4 && (n3 || n2 || (n1 && n0)))), !n7])

I do not have the time right now to write a proof. So I feel not fully confident about the hardness of your problem at the moment. But I see at the first glance a starting point to prove that your algorithm could solve the 'Linear Equations Problem' (often referenced as 'LIN'), too.

In this case your algorithm would be at least NP-hard and you are out of luck to find a good exact solution for it.

This kind of algorithms (if your algorithm turns out to be NP-hard) is a little bit creepy: If you have only few (testing) data to solve they feel like some sort of good solution. But if you feed the 'real data' workload to them then they will not terminate in about a couple of years (or sometimes millions of years). Surely they will deliver a result (if correctly planned and programmed, i.e. they terminate) but nobody will ever profit from the output.

A good example is the Travelling Salesperson Problem (TSP) (yeah, I know that TSP is NP-complete and not only NP-hard, but this doesn't make a difference at this point). If you have just five or six cities in your input then it has good solutions and is acceptable fast. But if you give all the cities of a small country to it then it will not terminate within some hundred of years waiting time.

Please keep in mind that there is a tiny chance that I could have overseen anything and your algorithm is in fact not NP-hard. Only a polynomial reduction from LIN to your algorithm (or from SAT to your algorithm) can exclude all possibility of doubt.

ZirconiumX

full member

Activity: 286

Merit: 100

OK, I see what you are trying to do here.

There are still some problems, though, namely:

a) Parsing is going to be a b****
b) Not everyone happens to have a copy of Wolfram Mathematica lying around
c) Precomputation is still very computationally intensive.

In response to a) and b), IMO, you might as well include the symbolic solver in the miner, since you really don't need the computational power of Mathematica to check if a number is 1 or 0.

Matthew:out

botnet

newbie

Activity: 35

Merit: 0

Quote from: bill86 on August 07, 2013, 08:31:35 PM

Now I understand your idea better. But you are possibly wrong: The main complexity of your approach is not because of the degree of the polynomials but because of the referencing of the literals (you called them 'variables') to each other.

At this point you have just two choices:
1. Try random input (you know this solution under the (wrong) name 'brute force')
2. Try to approximate to a good solution (but then your result would only be partly correct: Who is interested in a 90% correct nonce to the given block and targeted result?) and it is not proven in this case that an approximation exists.

I think there is some misunderstanding of the technique that I will try and clear up with an example. It should in fact give an exact solution, not an approximate one.

As mentioned, traditional bitcoin mining performs target=SHA256(SHA256(header+nonce)), looking for a very specific solution (top 4 bytes of 'target' are 0.) Mining today uses a brute force approach, iterating through every possible nonce value in the 32 bit address space, performing the double SHA256, and looking at the result to check if those top 4 bytes are 0.

In my example here, rather than the function target=SHA256(SHA256(header+nonce)), let's start with a very simple function:

target = (nonce + 0xAD) ^ 0x6E;

(Assume we're working with 8 bit numbers for this example.)

Just like mining, let's try to find a value for 'nonce' such that target = 0; Mining today would try and brute force try all possible values for 'nonce' until that condition is satisfied, e.g:

for(nonce = 0; nonce <= 255; nonce++)
{
target = (nonce + 0xAD) ^ 0x6E;
if(target == 0)
return nonce;
}

In my proposed approach, this is done symbolically. First, symbolically do the addition and Xor operations to produce a set of equations that represents each of the 8 bits of 'target' in terms of the 8 bits of 'nonce'

t0: (!n0)
t1: (Xor[1, n0, n1])
t2: (Xor[n2, n1 && n0])
t3: (Xor[1, !n3, n2 || (n1 && n0)])
t4: (Xor[n4, n3 || n2 || (n1 && n0)])
t5: (Xor[1, n4 && (n3 || n2 || (n1 && n0)), !n5])
t6: (Xor[1, n6, n5 || (n4 && (n3 || n2 || (n1 && n0)))])
t7: (Xor[n6 && (n5 || (n4 && (n3 || n2 || (n1 && n0)))), !n7])

Now we can solve the system of equations, which will give us values for n0...n7 which satisfy the condition that t0...t7 == 0; This will not be an approximation, we will get exact values for each bit.

As mentioned earlier, we convert the logical operations to arithmetic operations, modulo 2:

t0: (1+n0)
t1: (1+n0+n1)
t2: (n0*n1+n2)
t3: (n2+n0*n1*(1+n2)+n3)
t4: (n2+n3+n2*n3+n0*n1*(1+n2)*(1+n3)+n4)
t5: ((n2+n3+n2*n3+n0*n1*(1+n2)*(1+n3))*n4+n5)
t6: (1+n5+(n2+n3+n2*n3+n0*n1*(1+n2)*(1+n3))*n4*(1+n5)+n6)
t7: (1+(n5+(n2+n3+n2*n3+n0*n1*(1+n2)*(1+n3))*n4*(1+n5))*n6+n7)

And now we can solve (done using wolfram mathematica):

In[1] := Solve[(1 + n0) == 0 && (1 + n0 + n1) == 0 && (n0*n1 + n2) == 0 && (n2 + n0*n1*(1 + n2) + n3) == 0 && (n2 + n3 + n2*n3 + n0*n1*(1 + n2)*(1 + n3) + n4) == 0 && ((n2 + n3 + n2*n3 + n0*n1*(1 + n2)*(1 + n3))*n4 + n5) == 0 && (1 + n5 + (n2 + n3 + n2*n3 + n0*n1*(1 + n2)*(1 + n3))*n4*(1 + n5) + n6) == 0 && (1 + (n5 + (n2 + n3 + n2*n3 + n0*n1*(1 + n2)*(1 + n3))*n4*(1 + n5))*n6 + n7) == 0 && n0*n0 - n0 == 0 && n1*n1 - n1 == 0 && n2*n2 - n2 == 0 && n3*n3 - n3 == 0 && n4*n4 - n4 == 0 && n5*n5 - n5 == 0 && n6*n6 - n6 == 0 && n7*n7 - n7 == 0, {n0, n1, n2, n3, n4, n5, n6, n7}, Modulus -> 2]

Out[1] = {{n0 -> 1, n1 -> 0, n2 -> 0, n3 -> 0, n4 -> 0, n5 -> 0, n6 -> 1, n7 -> 1}}

You can see this produced the exact solution: 11000001 == 0xC1 == 193, which does in fact validate:

(0xC1 + 0xAD) ^ 0x6E == 0

Now we just expand this for bitcoin mining. Do everything I did above, but with the symbolic version of SHA256(SHA256(block_header)) rather than my simple example. Take the first 32 equations from this result set (corresponding to the 4 bytes we want to be 0), and fill in literal values from the block header for everything except for the 32 bits of nonce we want to solve for. Now solve the system of equations; we get the exact bits for nonce that produces a target with the first 4 bytes zeroed.

Again, the symbolic SHA256 stuff can all be computed offline and saved - the computation part of mining now will just be solving that system of 32 equations, which mathematica (or another symbolic math module) can do incredibly fast. Much faster than my GPU can brute-force every possible nonce combination.

bill86

full member

Activity: 159

Merit: 100

@botnet

Quote from: botnet on August 06, 2013, 11:14:42 PM

Quote from: bill86 on August 05, 2013, 06:39:12 AM

What do you mean with the word PolynomialReduce? Do you mean polynomial reduction?

In my case I literally mean the PolynomialReduce function in Wolfram Mathematica: http://reference.wolfram.com/mathematica/ref/PolynomialReduce.html

So my TUInt32 class creates a symbolic representation of every mathematical or logical operator that is applied to it, and after each operation, it feeds that equation through Wolfram Mathematica to reduce and simplify the equation, for example: FullSimplify[PolynomialReduce[a*b, a0^2 - a0, b0^2 - b0, Modulus -> 2]]

(recall that logical operations are being transformed to numeric operations in the modulo 2 number ring)

Now I understand your idea better. But you are possibly wrong: The main complexity of your approach is not because of the degree of the polynomials but because of the referencing of the literals (you called them 'variables') to each other.

At this point you have just two choices:
1. Try random input (you know this solution under the (wrong) name 'brute force')
2. Try to approximate to a good solution (but then your result would only be partly correct: Who is interested in a 90% correct nonce to the given block and targeted result?) and it is not proven in this case that an approximation exists.

@c789

Quote from: c789 on August 07, 2013, 12:47:16 AM

1.21 Gigawatts!

I thank you. My tip address is below.

YMMD!

Quote from: c789 on August 07, 2013, 12:47:16 AM

Really, what level of course material are you guys talking here? I only had the first level of CS classes and I'm having a tough time following. Looks cool though.

In my case: Halting problem, P, NP, some other part of the complexity zoo and SAT were basic stuff. Second year.
ILP came in higher classes.

Are you trying to get a degree in computer science?

@cheater123

Quote from: cheater123 on August 07, 2013, 03:39:52 AM

I am just new here and just watching all about mining what is this but currently so confused here

Welcome to the forum!

A very short introduction to mining
To find a new block the network gives every miner a challenge (i.e. a target value). The higher the difficulty the lower the desired value. Now every miner reads the last block and tries to guess a nonce for it. If the computed result is lower than the target value then a new block has been found and the next challenge uses the newly found block.

Maybe one could try addition as basic algorithm for mining:
lastblock + nonce = result
But that would be pointless: Subtraction would find the nonce by simple computation.
result - lastblock = nonce
That is because of the fact that addition and subtraction are each fast to compute.

So the chosen basic algorithm is SHA256 which is known for a speciality: computing a result by
sha256(original_input)=result
is fast to compute.
But computing the original_input by using only the result is really time consuming.

These algorithms are known as one way function (with trap door).

For reversing to the nonce in a feasible time one would need a theoretical machine which guesses the right value in the first step. Unfortunately there is no known way to produce such a machine in reality.

So every miner uses a random value as nonce and computes:
sha256(sha256(nonce, last_block))=result
and evaluates whether result < target_value.
If so then a new block was found. Otherwise the miner tries another random value as nonce and evaluates again.

cheater123

newbie

Activity: 28

Merit: 0

I am just new here and just watching all about mining what is this but currently so confused here

c789

hero member

Activity: 850

Merit: 1000

1.21 Gigawatts!

I thank you. My tip address is below.

Really, what level of course material are you guys talking here? I only had the first level of CS classes and I'm having a tough time following. Looks cool though.

botnet

newbie

Activity: 35

Merit: 0

Quote from: bill86 on August 05, 2013, 06:39:12 AM

What do you mean with the word PolynomialReduce? Do you mean polynomial reduction?

In my case I literally mean the PolynomialReduce function in Wolfram Mathematica: http://reference.wolfram.com/mathematica/ref/PolynomialReduce.html

So my TUInt32 class creates a symbolic representation of every mathematical or logical operator that is applied to it, and after each operation, it feeds that equation through Wolfram Mathematica to reduce and simplify the equation, for example: FullSimplify[PolynomialReduce[a*b, a0^2 - a0, b0^2 - b0, Modulus -> 2]]

(recall that logical operations are being transformed to numeric operations in the modulo 2 number ring)

bill86

full member

Activity: 159

Merit: 100

@botnet

Quote from: botnet on August 01, 2013, 01:44:06 AM

These will obviously be huge, but you can PolynomialReduce after each step of the hashing to reduce the equation size.

What do you mean with the word PolynomialReduce? Do you mean polynomial reduction?

I did not take the tour to understand your thoughts fully because it looks like some sort of ILP problem to me (at least there seems to be a polynomial reduction from your problem to the ILP).

Generally speaking: If there would be a known feasible solution for something like:
sha256^(-1)(sha256^(-1)(result)) = (nonce, previous_block)
then it would be silly to call sha2 an one way function.

SAT and ILP have feasible solutions only under the assumption that P=NP.

ZirconiumX

full member

Activity: 286

Merit: 100

Right. A 32-bit x86 add operation takes 1 cycle.

Your currently 7-bit symbolic add operation has (roughly - don't trust my counting skills) 66 XOR ops and 67 AND ops. You could have calculated the 133rd fibonacci number by the time you had added 7 bits together.

Another advantage brute-force has is parallelism. Because of the varying brackets, you can't even execute most of your instructions in parallel.

Matthew:out

Topic: Potentially faster method for mining on the CPU - page 3. (Read 14431 times)