Thanks for the tips, everyone. Removing the unrolls made the difference. Now that I have a compiled (and running) version, I'll keep playing with the compiler options and retry with the pragma unrolls back in.
Incidentally, when compiled with sm_35 the miner reports an MH/s value that I think should be KH/s...
[MASTER] Coinshield Network: New Block 29091
367528.1 MH/s | 0 Blks ACC=0 REJ=0 | Height = 29091 | Diff = 35 0-bits | 00:03:17
365121.1 MH/s | 0 Blks ACC=0 REJ=0 | Height = 29091 | Diff = 35 0-bits | 00:03:28
When compiled with sm_30 it reports correctly, around the 28 MH/s mark, but with sm_35 it gives the output above. Should I be concerned?
Your GPU's compute capability is not set correctly in the Makefile, and the kernel is not being launched correctly. Please see the link below:
https://developer.nvidia.com/cuda-gpus
Also, please note that the application currently only works on GPUs with compute capability 3.5 or greater.
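To make the compute capability point concrete, here is a minimal host-only C++ sketch of the check. It is not part of the miner: parseSmTarget and deviceMeetsTarget are hypothetical helper names, and the device's capability is assumed to be looked up by hand from the table at the link above. It only tests a necessary condition, namely that code built for a given sm_XY target cannot run on a device whose compute capability is lower than X.Y.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical helper: turn an nvcc-style target such as "sm_35" into
// {major, minor}. Suffixed targets (for example "sm_90a") are not handled.
std::pair<int, int> parseSmTarget(const std::string& target) {
    const std::string prefix = "sm_";
    if (target.size() < prefix.size() + 2 ||
        target.compare(0, prefix.size(), prefix) != 0) {
        throw std::invalid_argument("expected a target like sm_35");
    }
    int digits = 0;
    for (std::size_t i = prefix.size(); i < target.size(); ++i) {
        if (!std::isdigit(static_cast<unsigned char>(target[i]))) {
            throw std::invalid_argument("expected digits after sm_");
        }
        digits = digits * 10 + (target[i] - '0');
    }
    // The last digit is the minor revision, the rest is the major revision.
    return {digits / 10, digits % 10};
}

// Hypothetical helper: true when the device's compute capability is at
// least the capability the code was compiled for. This is a necessary
// condition only; it does not prove the kernel will launch.
bool deviceMeetsTarget(int devMajor, int devMinor, const std::string& target) {
    return std::make_pair(devMajor, devMinor) >= parseSmTarget(target);
}
```

For example, assuming a card listed as compute capability 3.0, deviceMeetsTarget(3, 0, "sm_30") is true while deviceMeetsTarget(3, 0, "sm_35") is false, which would be consistent with the sm_35 build printing a meaningless hash rate.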
[MASTER] Coinshield Network: New Block 29127
24.0 MH/s | 0 Blks ACC=0 REJ=0 | Height = 29127 | Diff = 36 0-bits | 00:04:09
24.0 MH/s | 0 Blks ACC=0 REJ=0 | Height = 29127 | Diff = 36 0-bits | 00:04:20
24.0 MH/s | 0 Blks ACC=0 REJ=0 | Height = 29127 | Diff = 36 0-bits | 00:04:31
ehmmm... you missed:
You'll never find any block...