Here's a puzzle for coding experts.
I was testing with both sph and 4way running side by side and comparing the hashes.
Everything was fine. Then I started cleaning up the code and the hash broke. What remains
is the last bit of code I can't remove without breaking the hash. I left a couple of commented-out
lines for context (no pun intended). The code as presented works. If I remove the indicated line,
the hash breaks and the miner only submits invalid shares, which are rejected. It should be noted that
ctx_blake was never initialized, nor was sph_blake256 run before sph_blake256_close, so close is
running on uninitialized data. Both variables (hash and ctx_blake) are local and are not referenced anywhere else.
I would suspect local stack corruption, but in reverse: instead of code corrupting the stack,
removing code does.
The input data is four 80-byte streams interleaved for blake_4way.
vhash is four 32-byte hash streams, interleaved, returned from blake_4way.
hash0..3 are vhash deinterleaved so that lyra2 can be run serially on each one.
hash and ctx_blake are not in any way involved in the proper functioning of the code.
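To make the layout described above concrete, here is a minimal scalar sketch of 4x32 interleaving and deinterleaving. It is not the miner's code: interleave_4x32, deinterleave_4x32 and roundtrip_ok are hypothetical names, and they take a count of 32-bit words, whereas the real m128_deinterleave_4x32 is called with a bit length (256) and presumably uses SIMD.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LANES 4

/* Hypothetical scalar helper: interleave four streams word by word,
 * so dst = s0[0] s1[0] s2[0] s3[0] s0[1] s1[1] ... */
static void interleave_4x32(uint32_t *dst,
                            const uint32_t *s0, const uint32_t *s1,
                            const uint32_t *s2, const uint32_t *s3,
                            size_t words)
{
    for (size_t i = 0; i < words; i++) {
        dst[LANES * i + 0] = s0[i];
        dst[LANES * i + 1] = s1[i];
        dst[LANES * i + 2] = s2[i];
        dst[LANES * i + 3] = s3[i];
    }
}

/* Hypothetical scalar helper: the inverse of interleave_4x32. */
static void deinterleave_4x32(uint32_t *d0, uint32_t *d1,
                              uint32_t *d2, uint32_t *d3,
                              const uint32_t *src, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        d0[i] = src[LANES * i + 0];
        d1[i] = src[LANES * i + 1];
        d2[i] = src[LANES * i + 2];
        d3[i] = src[LANES * i + 3];
    }
}

/* Round-trip check on four 8-word (32-byte) streams, the same shape as
 * hash0..hash3 and vhash. Returns 1 if the inputs are recovered. */
static int roundtrip_ok(void)
{
    uint32_t in[LANES][8], out[LANES][8], v[8 * LANES];

    for (size_t lane = 0; lane < LANES; lane++)
        for (size_t i = 0; i < 8; i++)
            in[lane][i] = (uint32_t)(lane * 100 + i);

    interleave_4x32(v, in[0], in[1], in[2], in[3], 8);
    deinterleave_4x32(out[0], out[1], out[2], out[3], v, 8);

    return memcmp(in, out, sizeof in) == 0;
}
```

With this layout, word i of lane k sits at index 4*i + k of the interleaved buffer, which is why vhash needs 8*4 words to hold four 32-byte hashes.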
I'm stumped. Anyone have any insight?
Edit:
I tried stubbing out sph_blake256_close, but the hash still failed. It seems to depend on actually running the code
in the function.
I moved the funky code to the end of the function and everything still works. But still, if I remove it,
the returned hash is invalid. SPH is stable code and not likely to be accessing data it shouldn't.
Even if it did, it would break something, not fix it. It's not even being used properly. There should be
no interactions between the sph code and the 4way code; they have their own data structures and supporting
functions and don't share anything.
I'm even more stumped.
void lyra2z_hash_4way( void *state, const void *input )
{
     uint32_t hash0[8] __attribute__ ((aligned (32)));
     uint32_t hash1[8] __attribute__ ((aligned (32)));
     uint32_t hash2[8] __attribute__ ((aligned (32)));
     uint32_t hash3[8] __attribute__ ((aligned (32)));
     uint32_t vhash[8*4] __attribute__ ((aligned (64)));
     blake256_4way_context ctx __attribute__ ((aligned (64)));

     uint32_t _ALIGN(64) hash[8];
     sph_blake256_context ctx_blake __attribute__ ((aligned (64)));

     //memcpy( &ctx_blake, &lyra2z_blake_mid, sizeof lyra2z_blake_mid );
     //sph_blake256( &ctx_blake, input + 64, 16 );

     // removing the following line breaks the hash
     sph_blake256_close( &ctx_blake, hash );

     memcpy( &ctx, &ctx_mid, sizeof ctx_mid );
     blake256_4way( &ctx, input + (64<<2), 16 );
     blake256_4way_close( &ctx, vhash );

     m128_deinterleave_4x32( hash0, hash1, hash2, hash3, vhash, 256 );

     LYRA2Z( lyra2z_wholeMatrix, hash0, 32, hash0, 32, hash0, 32, 8, 8, 8);
     LYRA2Z( lyra2z_wholeMatrix, hash1, 32, hash1, 32, hash1, 32, 8, 8, 8);
     LYRA2Z( lyra2z_wholeMatrix, hash2, 32, hash2, 32, hash2, 32, 8, 8, 8);
     LYRA2Z( lyra2z_wholeMatrix, hash3, 32, hash3, 32, hash3, 32, 8, 8, 8);

     memcpy( state,    hash0, 32 );
     memcpy( state+32, hash1, 32 );
     memcpy( state+64, hash2, 32 );
     memcpy( state+96, hash3, 32 );
}
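
Since removing a local call can change how the compiler lays out the stack frame, one cheap way to probe the stack-layout suspicion is to check the alignment of the buffers at run time, with and without the sph_blake256_close line. The following is only a sketch: is_aligned and local_is_aligned are hypothetical helpers, not part of the miner or SPH, it assumes GCC or Clang for the aligned attribute, and it does not claim that misalignment is the actual cause.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if p is a multiple of `align` bytes, else 0. */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p % align) == 0;
}

/* Example: check a local declared the same way as hash0..hash3.
 * Assumes a compiler that honors the GNU aligned attribute on locals. */
static int local_is_aligned(void)
{
    uint32_t buf[8] __attribute__ ((aligned (32)));
    buf[0] = 0;                 /* touch the buffer so it is really used */
    return is_aligned(buf, 32);
}
```

Inside lyra2z_hash_4way the same check could be applied to &ctx, vhash, hash0..hash3 and the input pointer passed to blake256_4way, printing or asserting the result in both builds to see whether anything differs.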