Well, I've talked to a few people about this but got no real response from anyone on whether it is possible ...
(Woke up with this idea back on the 4th of August ...)
So I guess I need to post in a thread where someone works on a CL kernel and just let them implement it, if they don't already do it.
I've written it in pseudo-code because I still don't follow how the CL file actually does 2^n checks and returns the full list of valid results.
Yeah, I've programmed in almost every language known to man (except C#, and that's avoided by choice), but I still don't quite get the interface from C/C++ to the CL kernel and how that matches what happens.
What I am discussing is the 2nd call to SHA256, the one that takes the output of the first call as its input (not the first call).
Anyway, to explain, here's the end of the SHA256 pseudo code from Wikipedia:
==================
for i from 0 to 63
    s0 := (a rightrotate 2) xor (a rightrotate 13) xor (a rightrotate 22)
    maj := (a and b) xor (a and c) xor (b and c)
    t2 := s0 + maj
    s1 := (e rightrotate 6) xor (e rightrotate 11) xor (e rightrotate 25)
    ch := (e and f) xor ((not e) and g)
    t1 := h + s1 + ch + k[i] + w[i]
    h := g
    g := f
    f := e
    e := d + t1
    d := c
    c := b
    b := a
    a := t1 + t2
Add this chunk's hash to result:
h0 := h0 + a
h1 := h1 + b
h2 := h2 + c
h3 := h3 + d
h4 := h4 + e
h5 := h5 + f
h6 := h6 + g
h7 := h7 + h
Then test if h0..h7 is a share (CHECK0, CHECK1, ?)
==================
Firstly, I added that last line myself, of course.
I understand that with the current difficulty, if h0 != 0 then we don't have a share (call this CHECK0).
If h0 = 0, then check some leading part of h1 based on the current difficulty (call this CHECK1).
... anyone who knows better, feel free to correct this.
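To pin down what I mean by CHECK0 and CHECK1, here is a minimal Python sketch. It takes the two checks exactly as described above. The parameter `zero_bits` (how many leading bits of h1 must be zero) is a made-up stand-in for whatever the current difficulty actually requires, so treat it as an assumption rather than how any real miner does it. Both functions return True when the word passes.

```python
def check0(h0):
    """CHECK0 as described above: h0 must be zero."""
    return h0 == 0


def make_check1(zero_bits):
    """Build CHECK1: the top `zero_bits` bits of the 32-bit word h1 must be zero.

    `zero_bits` is hypothetical; it stands in for the difficulty-dependent
    "leading part of h1" and must be between 0 and 32.
    """
    def check1(h1):
        return (h1 >> (32 - zero_bits)) == 0
    return check1


def is_share(h0, h1, zero_bits):
    """A candidate passes only if both checks pass."""
    return check0(h0) and make_check1(zero_bits)(h1)
```

With `zero_bits = 8`, for example, `is_share(0, 0x00FFFFFF, 8)` passes while `is_share(0, 0x01000000, 8)` does not.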
If the difficulty ever gets high enough that h2 has to be checked, then my optimisation can be made even better by going back one more step (adding an i := 61 block) in the pseudo code shown below.
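The reason going back one more step works is that a, b, c, d behave like a shift register: each round's new 'a' moves into b, then c, then d. A small Python sketch that ignores the arithmetic and only tracks which round produced each value (the labels like "a61" are just my own naming for illustration):

```python
def final_registers(rounds=64):
    """Track which round's 'a' ends up in a, b, c, d after the loop."""
    a, b, c, d = "init_a", "init_b", "init_c", "init_d"
    for i in range(rounds):
        # Same shuffling as the pseudo code: d := c, c := b, b := a, a := new value
        d, c, b, a = c, b, a, f"a{i}"
    return a, b, c, d
```

So the final h1 depends only on the 'a' from round 62, and the final h2 only on the 'a' from round 61, which is why each test can be done as soon as that round's 'a' is known.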
A reasonably simple optimisation of the end of the code, for when we are about to check whether h0..h7 is a share (i.e. only the 2nd hash):
==================
for i from 0 to 61
    s0 := (a rightrotate 2) xor (a rightrotate 13) xor (a rightrotate 22)
    maj := (a and b) xor (a and c) xor (b and c)
    t2 := s0 + maj
    s1 := (e rightrotate 6) xor (e rightrotate 11) xor (e rightrotate 25)
    ch := (e and f) xor ((not e) and g)
    t1 := h + s1 + ch + k[i] + w[i]
    h := g
    g := f
    f := e
    e := d + t1
    d := c
    c := b
    b := a
    a := t1 + t2
i := 62
s0 := (a rightrotate 2) xor (a rightrotate 13) xor (a rightrotate 22)
maj := (a and b) xor (a and c) xor (b and c)
t2 := s0 + maj
s1 := (e rightrotate 6) xor (e rightrotate 11) xor (e rightrotate 25)
ch := (e and f) xor ((not e) and g)
t1 := h + s1 + ch + k[i] + w[i]
tmpa := t1 + t2
tmpb := h1 + tmpa (this is the actual value of h1 at the end)
if tmpb fails CHECK1 then abort - not a share
    (i.e. return false for a share)
h := g
g := f
f := e
e := d + t1
d := c
c := b
b := a
a := tmpa
i := 63
s0 := (a rightrotate 2) xor (a rightrotate 13) xor (a rightrotate 22)
maj := (a and b) xor (a and c) xor (b and c)
t2 := s0 + maj
s1 := (e rightrotate 6) xor (e rightrotate 11) xor (e rightrotate 25)
ch := (e and f) xor ((not e) and g)
t1 := h + s1 + ch + k[i] + w[i]
tmpa := h0 + t1 + t2 (this is the actual value of h0 at the end)
if tmpa fails CHECK0 then abort - not a share
    (i.e. return false for a share)
h := g
g := f
f := e
e := d + t1
d := c
c := b
Add this chunk's hash to result:
h0 := tmpa
h1 := tmpb
h2 := h2 + c
h3 := h3 + d
h4 := h4 + e
h5 := h5 + f
h6 := h6 + g
h7 := h7 + h
It's a share - unless we need to test h2?
==================
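For anyone who wants to try the idea outside a CL kernel, here is a rough Python sketch of the pseudo code above. It is only my illustration, not how any existing miner does it. The function names (`second_hash`, `t1_t2`, `step`) are made up, the two checks are passed in as functions that return True on a pass, and the constants are derived from the first 64 primes rather than typed in by hand. It assumes the second hash is a single 64-byte block (the 32-byte first digest plus padding), so h0..h7 start as the standard initial values.

```python
import hashlib
import struct
from math import isqrt

MASK = 0xFFFFFFFF

def first_primes(n):
    primes, c = [], 2
    while len(primes) < n:
        if all(c % p for p in primes):
            primes.append(c)
        c += 1
    return primes

def icbrt(x):
    # Integer cube root by binary search (avoids float rounding).
    lo, hi = 0, 1 << (x.bit_length() // 3 + 2)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= x:
            lo = mid
        else:
            hi = mid - 1
    return lo

PRIMES = first_primes(64)
K = [icbrt(p << 96) & MASK for p in PRIMES]        # round constants k[0..63]
IV = [isqrt(p << 64) & MASK for p in PRIMES[:8]]   # initial h0..h7

def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK

def schedule(block):
    # Expand the 64-byte block into the 64 message words w[0..63].
    w = list(struct.unpack(">16I", block))
    for i in range(16, 64):
        s0 = rotr(w[i - 15], 7) ^ rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = rotr(w[i - 2], 17) ^ rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)
    return w

def t1_t2(s, k, w):
    a, b, c, d, e, f, g, h = s
    s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
    maj = (a & b) ^ (a & c) ^ (b & c)
    s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
    ch = (e & f) ^ (~e & g)
    return (h + s1 + ch + k + w) & MASK, (s0 + maj) & MASK

def step(s, t1, new_a):
    # The register shuffle at the end of each round.
    a, b, c, d, e, f, g, h = s
    return [new_a, a, b, c, (d + t1) & MASK, e, f, g]

def second_hash(digest1, check0, check1):
    """Return the 8 output words, or None as soon as a check fails."""
    block = digest1 + b"\x80" + b"\x00" * 23 + struct.pack(">Q", 256)
    w = schedule(block)
    s = list(IV)
    for i in range(62):
        t1, t2 = t1_t2(s, K[i], w[i])
        s = step(s, t1, (t1 + t2) & MASK)
    # i = 62: this round's 'a' ends up in 'b', so the final h1 is already known.
    t1, t2 = t1_t2(s, K[62], w[62])
    tmpa = (t1 + t2) & MASK
    tmpb = (IV[1] + tmpa) & MASK
    if not check1(tmpb):
        return None
    s = step(s, t1, tmpa)
    # i = 63: this round's 'a' gives the final h0.
    t1, t2 = t1_t2(s, K[63], w[63])
    tmpa = (IV[0] + t1 + t2) & MASK
    if not check0(tmpa):
        return None
    s = step(s, t1, 0)  # 'a' is no longer needed; 0 is just a placeholder
    return [tmpa, tmpb] + [(IV[j] + s[j]) & MASK for j in range(2, 8)]

if __name__ == "__main__":
    first = hashlib.sha256(b"hello").digest()
    words = second_hash(first, lambda x: True, lambda x: True)
    print(struct.pack(">8I", *words) == hashlib.sha256(first).digest())
```

Passing checks that always succeed makes the function run to the end, so its output can be compared against `hashlib.sha256` as a sanity test. With real checks it returns None early for almost every input, which is the whole point.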
Firstly, the obvious (as I've said twice above):
This should only be done when calculating a hash to be tested as a share.
Since the actual process is a double-hash, the first hash should not, of course, do this.
In i=62:
If the tmpb test (CHECK1) says it isn't a share, it avoids an entire round (i=63), the 'e' calculation at i=62, and any unneeded assignments after that.
Also, we don't care about the actual values of h0-h7 in that case, so there is no need to assign them anything (or do the additions) beyond whatever is needed to signal that the result is not a share (e.g. set h0=-1 if h0..h7 must be examined later, or just return false if that is good enough - I don't know which the code actually needs).
CHECK1's probability of failure is high, so the saving easily covers the cost of the extra calculation (h1 + tmpa) needed to do it.
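To put a rough number on "high": if CHECK1 requires some number of leading bits of h1 to be zero, and tmpb is assumed to be uniformly random, the failure probability follows directly. In this Python sketch `zero_bits`, `skipped_ops` and `extra_ops` are all hypothetical parameters of my own; I am not claiming any particular operation count for a real kernel.

```python
def check1_failure_probability(zero_bits):
    """Chance that a uniformly random 32-bit word fails CHECK1 when
    `zero_bits` (hypothetical) leading bits must all be zero."""
    return 1.0 - 2.0 ** -zero_bits


def net_saving(p_fail, skipped_ops, extra_ops=1):
    """Expected operations saved per hash: the skipped work is only saved
    when the check fails, while the extra work is paid every time."""
    return p_fail * skipped_ops - extra_ops
```

Even with only 8 bits tested, the check fails more than 99% of the time, so the skipped work only has to be slightly more than the extra addition for the early test to come out ahead.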
In i=63:
If the tmpa test (CHECK0) says it isn't a share, it avoids the 'e' calculation at i=63 and any unneeded assignments after that.
As at i=62, we don't care about the actual values of h0-h7 in that case, so there is no need to assign them anything (or do the additions) beyond whatever is needed to signal that the result is not a share.
P.S. Apologies for any and all mistakes I've made, but the concept is there anyway.
Any mistakes? Comments?