Topic: CCminer(SP-MOD) Modded NVIDIA Maxwell / Pascal kernels. - page 1167. (Read 2347641 times)

legendary
Activity: 1510
Merit: 1003
1.5.36(sp-MOD) is available here: (5-feb-2015)

...

Post your stats here. Card name/gpu clock/memclock

gtx 750 (1470gpu/1550mem)

x11    2847 khs
quark  5610 khs
lyra    853 khs
fresh  3435 khs
x13    doesn't work
x14    2200 khs
x15    1920 khs

Note: some of my previous records were set with a higher overclock.
legendary
Activity: 1510
Merit: 1003
1.5.36(sp-MOD) is available here: (5-feb-2015)

fixed x11

You removed "h_found[thr_id][0] = 0xffffffff;" from x11.cu but added
Code:
foundNonce = 0xffffffff;
if (foundNonce != 0xffffffff)
...
to x13.cu
As far as I can see, x13 doesn't work now ...
legendary
Activity: 1512
Merit: 1000
quarkchain.io
Hi mate, r36 is still giving me only boos on qubit.
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
1.5.36(sp-MOD) is available here: (5-feb-2015)

fixed x11
fixed anime(klaust)
faster quark, x11 etc

https://github.com/sp-hash/ccminer/releases/tag/1.5.36

The sourcecode is available here:

https://github.com/sp-hash/ccminer

Post your stats here. Card name/gpu clock/memclock
legendary
Activity: 1510
Merit: 1003
i love it )))
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
try to remove this commit. (the MyStreamSynchronize calls)

https://github.com/sp-hash/ccminer/commit/40858b1e817c4ea40cf254fe292b3a3c7ca328f2

I think quark will be faster, but your computer will be slow and unusable if you use the graphics card for other work. The synchronization code was added between release 31 and 32.


In the latest versions (release 33 and up) it is also possible to increase the CPU usage for more hash. This feature was merged from the tpruvot/ccminer fork.

--cpu-affinity    set process affinity to specific cpu core(s) mask
--cpu-priority    set process priority (default: 0 idle, 2 normal to 5 highest)
try --cpu-priority 5
My display is attached to the built-in graphics, so this is not a problem. I will give it a try.
And I always run ccminer with realtime priority. Since --cpu-priority appeared, I use it too.
upd: strange ... I didn't find "MyStreamSynchronize(NULL, 4, thr_id)" on line 237 or nearby in quarkcoin.cu.
The lines
"MyStreamSynchronize(NULL, 1, thr_id)"
"MyStreamSynchronize(NULL, 2, thr_id)"
"MyStreamSynchronize(NULL, 3, thr_id)"
were found and commented out.
Now building ...

Yes, I have moved them around; there is one more further down, "MyStreamSynchronize(NULL, 4, thr_id)". Comment out this one as well.


legendary
Activity: 1510
Merit: 1003
try to remove this commit. (the MyStreamSynchronize calls)

https://github.com/sp-hash/ccminer/commit/40858b1e817c4ea40cf254fe292b3a3c7ca328f2

I think quark will be faster, but your computer will be slow and unusable if you use the graphics card for other work. The synchronization code was added between release 31 and 32.


In the latest versions (release 33 and up) it is also possible to increase the CPU usage for more hash. This feature was merged from the tpruvot/ccminer fork.

--cpu-affinity    set process affinity to specific cpu core(s) mask
--cpu-priority    set process priority (default: 0 idle, 2 normal to 5 highest)



try --cpu-priority 5

My display is attached to the built-in graphics, so this is not a problem. I will give it a try.
And I always run ccminer with realtime priority. Since --cpu-priority appeared, I use it too.


upd: strange ... I didn't find "MyStreamSynchronize(NULL, 4, thr_id)" on line 237 or nearby in quarkcoin.cu.

The lines
"MyStreamSynchronize(NULL, 1, thr_id)"
"MyStreamSynchronize(NULL, 2, thr_id)"
"MyStreamSynchronize(NULL, 3, thr_id)"
were found and commented out.
Now building ...

upd2: you are a wizard ;) A new quark record for my poor GTX750: 5650 KH, better than R31 ;)
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
try to remove this commit. (the MyStreamSynchronize calls)

https://github.com/sp-hash/ccminer/commit/40858b1e817c4ea40cf254fe292b3a3c7ca328f2

I think quark will be faster, but your computer will be slow and unusable if you use the graphics card for other work. The synchronization code was added between release 31 and 32.


In the latest versions (release 33 and up) it is also possible to increase the CPU usage for more hash. This feature was merged from the tpruvot/ccminer fork.

--cpu-affinity    set process affinity to specific cpu core(s) mask
--cpu-priority    set process priority (default: 0 idle, 2 normal to 5 highest)



try --cpu-priority 5
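
For readers unsure what a "core(s) mask" means for --cpu-affinity, here is a minimal sketch of how such a bitmask is usually read. It assumes the common convention that bit i of the mask selects logical core i; ccminer's own option parsing is not shown in this thread, and the helper names core_selected and count_selected are made up for illustration.

```cpp
#include <cstdint>

// Assumption: bit i of the affinity mask selects logical core i.
// This is the usual convention for affinity masks, not code taken from ccminer.
constexpr bool core_selected(std::uint64_t mask, unsigned core) {
    return core < 64 && ((mask >> core) & 1u) != 0;
}

// Count how many cores a mask selects.
constexpr unsigned count_selected(std::uint64_t mask) {
    unsigned n = 0;
    while (mask != 0) {
        n += static_cast<unsigned>(mask & 1u);  // add the lowest bit
        mask >>= 1;                             // move on to the next core
    }
    return n;
}
```

Under that assumption a mask of 0x5 (binary 101) would pin the process to cores 0 and 2, and a mask of 0x1 to core 0 only.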
legendary
Activity: 1510
Merit: 1003

This code will force the kernel to use 64 registers: better on the 750ti, worse on the 970/980.
What I usually do is switch on the compute version and run a different configuration for each compute version.
I think many of the kernels haven't been tweaked for a while, and hash can be gained.
If you look at my Bitcoin change, all I did was change the kernel launch configuration for a 17% speedup.

here is the commit:

https://github.com/sp-hash/ccminer/commit/c79f622969393f52f3462e2c3e967777cae1d7d3
Thanks for the explanation.
R35 with these few mods is the best for me, except maybe for quark (r31 is 20-30 kh better), but that is within measurement error ;)
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
With vardiff I get around 25 khash better in 35 than in 33/34, but I lost around 25 khash after merging the fork with klaus_t (from 32 -> 33).
Will take another look tonight.
But it could be this commit:
https://github.com/sp-hash/ccminer/commit/704e609b9003620dd29a74b2f43924209e193d24
Might work well on the 960-980 cards, but not on the 750ti.
Not many changes though.
Yes, I put back those "__global__ __launch_bounds__(256, 4)" etc. and got a gain of ~20 kh, which is close to measurement error.

This code will force the kernel to use 64 registers: better on the 750ti, worse on the 970/980.
What I usually do is switch on the compute version and run a different configuration for each compute version.
I think many of the kernels haven't been tweaked for a while, and hash can be gained.
If you look at my Bitcoin change, all I did was change the kernel launch configuration for a 17% speedup.

here is the commit:

https://github.com/sp-hash/ccminer/commit/c79f622969393f52f3462e2c3e967777cae1d7d3
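
The "64 registers" figure follows from simple arithmetic: __launch_bounds__(256, 4) asks the compiler to keep at least 4 blocks of 256 threads resident per multiprocessor, so 1024 threads have to share the register file. A minimal sketch of that calculation, assuming 65536 32-bit registers per multiprocessor (the figure for Maxwell cards such as the 750 Ti and 970/980). The per-architecture table in config_for is purely hypothetical and only illustrates the idea of switching on the compute version; the values are placeholders, not tuned numbers from the thread or from the ccminer source.

```cpp
// A launch-bounds pair as written in __launch_bounds__(threads, min_blocks).
struct LaunchConfig {
    unsigned threads_per_block;
    unsigned min_blocks_per_sm;
};

// Assumption: 65536 32-bit registers per multiprocessor (Maxwell-class GPUs).
constexpr unsigned kRegistersPerSM = 65536;

// Rough upper bound on registers per thread implied by a launch-bounds pair.
// The real compiler also applies allocation granularity and a per-thread cap.
constexpr unsigned max_registers_per_thread(LaunchConfig cfg) {
    return kRegistersPerSM / (cfg.threads_per_block * cfg.min_blocks_per_sm);
}

// Hypothetical selection by compute version (5.0 = GTX 750 Ti, 5.2 = GTX 970/980).
// The 5.2 entry is a placeholder value chosen only to show a different budget.
constexpr LaunchConfig config_for(int major, int minor) {
    return (major == 5 && minor == 0) ? LaunchConfig{256, 4}
                                      : LaunchConfig{256, 2};
}
```

With these assumptions, max_registers_per_thread(LaunchConfig{256, 4}) comes out at 64, matching the number in the post, while the placeholder 5.2 entry would allow 128.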
legendary
Activity: 1510
Merit: 1003

With vardiff I get around 25 khash better in 35 than in 33/34, but I lost around 25 khash after merging the fork with klaus_t (from 32 -> 33).

Will take another look tonight.


But it could be this commit:

https://github.com/sp-hash/ccminer/commit/704e609b9003620dd29a74b2f43924209e193d24

Might work well on the 960-980 cards, but not on the 750ti.

Not many changes though.

Yes, I put back those "__global__ __launch_bounds__(256, 4)" etc. and got a gain of ~20 kh, which is close to measurement error.
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
upd: now it works. ~2850 kh (the same speed as in r31 & r32)

With vardiff I get around 25 khash better in 35 than in 33/34, but I lost around 25 khash after merging the fork with klaus_t (from 32 -> 33).

Will take another look tonight.


But it could be this commit:

https://github.com/sp-hash/ccminer/commit/704e609b9003620dd29a74b2f43924209e193d24

Might work well on the 960-980 cards, but not on the 750ti.

Not many changes though.
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
Ahh, I found the problem now
remove the line
h_found[thr_id][0] = 0xffffffff;
in x11.cu
:D :D :D
Yeah, the next line "if (h_found[thr_id][0] != 0xffffffff)" is not quite compatible with it ;)

I added the line in order to speed test the simd kernel alone, but forgot to remove it. :/ I am away from my dev computer for a while, so I will fix it on github later tonight.
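
Why that one leftover line produced "no accepts, no boos" can be shown with a simplified stand-in. This is not the actual x11.cu code: the function names and the constant kNoNonce are invented for illustration, and the only thing taken from the thread is the pattern of writing the 0xffffffff sentinel immediately before testing for it.

```cpp
#include <cstdint>

// Sentinel meaning "the kernel found no nonce" (value taken from the thread).
constexpr std::uint32_t kNoNonce = 0xffffffffu;

// Buggy order: the result slot is overwritten with the sentinel right before
// the check, so the "found something" branch can never be taken.
constexpr bool reports_share_buggy(std::uint32_t found) {
    found = kNoNonce;          // leftover debug line
    return found != kNoNonce;  // always false, whatever the kernel returned
}

// Fixed order: the result is tested as the kernel left it.
constexpr bool reports_share_fixed(std::uint32_t found) {
    return found != kNoNonce;
}
```

In the buggy variant no nonce is ever submitted, so the pool neither accepts nor rejects anything, which is consistent with the symptom reported for x11.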
legendary
Activity: 1510
Merit: 1003

Ahh, I found the problem now

remove the line

h_found[thr_id][0] = 0xffffffff;

in x11.cu

:D :D :D

Yeah, the next line "if (h_found[thr_id][0] != 0xffffffff)" is not quite compatible with it ;)

upd: now it works. ~2850 kh (the same speed as in r31 & r32)
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
Seems to work for others. (yaamp.com)

Which algos are broken? Did you try fresh or quark? It might be an issue with CUDA 7.0 (which is only released as a beta).


Only x11 is broken. All the others work fine.
I compiled with the CUDA 6.5 preset (the default). Having CUDA 7 installed is not the issue.
You should recheck the code. I didn't see any working Release 35 clients in yaamp's x11 stats ...

Ahh, I found the problem now

remove the line

h_found[thr_id][0] = 0xffffffff;

in x11.cu
legendary
Activity: 1510
Merit: 1003
Seems to work for others. (yaamp.com)

Which algos are broken? Did you try fresh or quark? It might be an issue with CUDA 7.0 (which is only released as a beta).


Only x11 is broken. All the others work fine.
I compiled with the CUDA 6.5 preset (the default). Having CUDA 7 installed is not the issue.
You should recheck the code. I didn't see any working Release 35 clients in yaamp's x11 stats ...
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
Seems to work for others. (yaamp.com)

Which algos are broken? Did you try fresh or quark? It might be an issue with CUDA 7.0 (which is only released as a beta).



sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
You need cuda 6.5
legendary
Activity: 1510
Merit: 1003
Strange, it works here.

Make sure you have the latest driver.

I might have copied the wrong exe file though. If you build it from github it should work.
Just recompiled. Same result: no accepts, no boos. The driver is 347.12 from the CUDA 7.0 package, as always.
Previous releases run x11 normally.
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
Strange, it works here.

Make sure you have the latest driver.

I might have copied the wrong exe file though. If you build it from github it should work.