Hi,
I maintain the
Open Source FPGA Bitcoin Miner, so I may be able to help. Feel free to ask questions and I'll keep an eye on this thread.
I can see how it might be possible to determine a stop condition before the entire hash is computed. But it's not obvious.
There are no "stop conditions" with SHA-256 (1). You have to compute the whole thing, otherwise the hash is incomplete. As you have found, however, there is a small shortcut because of the way mining works.
And the basic idea is to vary a small postion of x (the nonce) until we find a final hash that is smaller than a certain level (i.e. there's a certain number of leading zeros).
For most miners, all you care about is finding what is called a difficulty 1 share (2). That means there are 32 bits of leading zeros. So, for optimized mining code we only care that the last 32-bits of the final hash are 0. As you saw, those last 32-bits are actually calculated in round 61, and get shifted 3 times in rounds 62, 63, and 64. We don't care about anything calculated in those last three rounds. Hence, the optimization is to skip them, and check the 5th 32-bit word of the hash produced after the 61st round.
You can see that in
the code here. For the second run of SHA-256 it only uses 61 rounds, and reads the 5th word from the output (the code uses IDX(4), because the index starts from 0).
Instead the golder nonce is considered found when a portion of the second hash equals 0xa41f32e7.
There is another shortcut here. In SHA-256, after all 64 rounds have been computed, one last operation is performed. The individual 32-bit words of the input state are added to the individual 32-bits words of the last round's output. You can see that in
The SHA2 wikipedia article, where it says "Add this chunk's hash to result so far."
Now, we're looking to see if the last word is equal to zero:
out[`IDX(7)] + input_state[`IDX(7)] == 0x00000000 ???
The input state to the second run of SHA-256 is a known value. The 8th word of which is 0x5be0cd19, so ...
out[`IDX(7)] + 0x5be0cd19 == 0x00000000 ???
Which is equivalent to:
out[`IDX(7)] == 0x00000000 - 0x5be0cd19 ???
Which is equivalent to:
out[`IDX(7)] == 0xa41f32e7 ???
I hope that explains those two optimizations. Again, feel free to ask questions if you're confused, or if you have other questions. I more or less wrote this quickly, so may have skipped some details.
NOTES(1) I'm not sure what connotation you are giving to "stop conditions," so I will explain a little further to help clear any confusion. When mining, you are given a set of data to perform hashes on. As you noted, this is done by manipulating the nonce and hashing each time the nonce changes. What you really want to do is check
all possible nonces. Even if you've already found a "golden nonce" (one which gives you a hash starting with 32 zeros), you need to
keep searching for more. There could be anywhere between 0 and 2^32 solutions to a given block of work, so it is in your best interest to keep looking for more. Hence, there are no stop conditions in the sense of when to stop running your algorithm, other than having exhausted all possible nonces (at which point, you would get more work).
(2) This is because most miners are connected to a pool server, which accepts any hash meeting at least Difficulty 1 (32 leading zeros). It should be noted, however, that software designed around this optimization will still work for solo mining directly against bitcoind, because bitcoind will merely reject any share that doesn't meet the block difficulty (and every Difficulty 1 share has a possibility of doing so). Also, most FPGA or GPU miners have some high-level controlling software (written in Python or C) which can perform the real difficulty check if necessary. Having the hardware check for shares is really just a way of reducing bandwidth from the hardware to the controlling software.