Maybe one of you guys who knows CUDA much better than I do can explain this to me:
1) CudaMiner is set up to use: compute_10,sm_10 (CUDA code generation) out of the zip file
This compiles fine
2) I can change Code generation to: compute_30,sm_30;compute_35,sm_35
This too compiles fine in CudaMiner
3) I use said code in nvMiner with: compute_30,sm_30;compute_35,sm_35 and get this:
Error 15 error : Instruction 'shf.l' requires .target sm_35 or higher C:\CCMiner\nvminer\ptxas Debug\nv_kernel2.compute_30.ptx, line 4555; nvminer
Anyone know why? I haven't dug into it yet but was hoping somebody with experience in CUDA could give me a hint where to look or what to do!
If I change Code generation in nvMiner to: compute_35,sm_35 I can compile/link (a great first start) into an EXE, but it crashes immediately because of the following.
Seems that between the different program versions we have three different versions of device_config:
int device_config[8][2];
char *device_config[8];
int *device_config[8];
These are not compatible of course. Taking a break for a few before resolving this and moving on to the next issue that will pop up.
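To make the mismatch concrete, here is a minimal sketch that writes the three declarations as types and lets the compiler confirm they differ. The alias names are made up for illustration; nothing here is taken from the CudaMiner or nvMiner sources, and what each version actually stores in device_config is not assumed.

```cpp
#include <cstddef>
#include <type_traits>

// The three declarations from the post, expressed as types (alias names are hypothetical).
using ConfigPairs    = int[8][2];   // 8 rows of 2 ints, stored inline in the array
using ConfigStrings  = char *[8];   // 8 pointers to character data
using ConfigPointers = int *[8];    // 8 pointers to ints stored elsewhere

// None of these are the same type, so code written against one cannot
// safely read or write an object that was defined as another.
static_assert(!std::is_same<ConfigPairs, ConfigStrings>::value,    "pairs vs strings");
static_assert(!std::is_same<ConfigPairs, ConfigPointers>::value,   "pairs vs pointers");
static_assert(!std::is_same<ConfigStrings, ConfigPointers>::value, "strings vs pointers");

// The first form holds 16 ints directly; the other two hold 8 pointers.
static_assert(sizeof(ConfigPairs)    == 16 * sizeof(int),   "inline storage");
static_assert(sizeof(ConfigPointers) == 8 * sizeof(int *),  "pointer storage");
```

If the merged sources each declare device_config on their own, putting one declaration in a header that every file includes would let the compiler flag the conflicting uses instead of leaving them to show up as a crash at runtime.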
Just wanted to post some progress and see if anyone could maybe tell me why I'm seeing the problem described with shf.l and maybe what to do about it?
Sounds a bit like you dropped a #if __CUDA_ARCH__ >= 350 or #if __CUDA_ARCH__ < 350 somewhere along the way. shf.l is the funnel shift instruction and is available only from 3.5 up. Apparently you're ending up compiling it unconditionally regardless of arch, so when ptxas assembles the compute_30 PTX it hits shf.l and the compiler gives you the finger.
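A minimal sketch of the kind of guard meant here, assuming the kernel uses the funnel shift for a 32-bit rotate. The names rotl32 and HOST_DEVICE are made up for illustration and are not from CudaMiner or nvMiner. __funnelshift_l is the CUDA intrinsic that maps to shf.l, and the fallback branch uses plain shifts so that the compute_30 pass never sees that instruction.

```cpp
#include <cstdint>

// HOST_DEVICE is a hypothetical helper macro so the same function
// builds under nvcc and under a plain C++ compiler.
#if defined(__CUDACC__)
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Rotate a 32-bit value left by n bits (n is taken modulo 32).
HOST_DEVICE inline uint32_t rotl32(uint32_t x, uint32_t n)
{
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 350
    // Device pass for sm_35 and up: funnel shift of the pair (x:x),
    // which is exactly a rotate. This is what compiles to shf.l.
    return __funnelshift_l(x, x, n);
#else
    // Host pass and older device targets (e.g. compute_30): plain shifts.
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
#endif
}
```

With a guard like this, building for compute_30,sm_30;compute_35,sm_35 compiles the function once per target and each pass picks its own branch. If the #if is lost in a merge, the intrinsic ends up in the compute_30 PTX and ptxas reports the error quoted above.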