Topic: CCminer(SP-MOD) Modded NVIDIA Maxwell / Pascal kernels. - page 1228. (Read 2347641 times)

sr. member
Activity: 255
Merit: 250
-i 19 = 1 << 19 (left shift) = 524288

This will be in the next release, 1.4.7 (-i 0 is the default: a fixed value for each algo, often based on the best performance on the 750 Ti).

Here is a shared spreadsheet for 970/980 users to fill in, so that the default values can be tuned for these cards:

https://docs.google.com/spreadsheets/d/1dI1Cc3JhhsA-UdIRlndvQ9huX8FmoJT2TXFDB0wdp54/edit?usp=sharing

I will link the 1.4.7 test version here soon.

Here it is : https://dl.dropboxusercontent.com/u/31689596/ccminer/ccminer-rel1.4.7-test-x86.7z

Thanks for this. Is there any difference between ccminer-50 and ccminer-50-52 when running on a 750ti?
legendary
Activity: 1484
Merit: 1082
ccminer/cpuminer developer
-i 19 = 1 << 19 (left shift) = 524288

This will be in the next release, 1.4.7 (-i 0 is the default: a fixed value for each algo, often based on the best performance on the 750 Ti).

Here is a shared spreadsheet for 970/980 users to fill in, so that the default values can be tuned for these cards:

https://docs.google.com/spreadsheets/d/1dI1Cc3JhhsA-UdIRlndvQ9huX8FmoJT2TXFDB0wdp54/edit?usp=sharing

I will link the 1.4.7 test version here soon.

Here it is : https://dl.dropboxusercontent.com/u/31689596/ccminer/ccminer-rel1.4.7-test-x86.7z
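
As a rough sketch of the -i arithmetic described in this post: the intensity is used as a shift count, and 0 falls back to a per-algo default. The function name and the default value used in the usage note are made up for illustration and are not taken from the ccminer source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not ccminer's actual code): turn the value given
   with -i into a throughput, i.e. hashes per kernel launch.
   intensity == 0 means "use the default chosen for this algo". */
static uint32_t resolve_throughput(unsigned intensity, uint32_t algo_default)
{
    if (intensity == 0)
        return algo_default;          /* -i 0: keep the per-algo default */

    assert(intensity < 32);           /* larger shifts do not fit in 32 bits */
    return UINT32_C(1) << intensity;  /* -i 19 -> 1 << 19 = 524288 */
}
```

For example, resolve_throughput(19, some_default) gives 524288, while resolve_throughput(0, 262144) simply returns the assumed default of 262144.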
legendary
Activity: 1400
Merit: 1050
Which -i parameter is best for the 750 Ti?
The smallest value that gives you 99-100% GPU usage. (I am talking about the throughput, not exactly the -i parameter.) It must also be a multiple of 256.
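
A minimal sketch of the "multiple of 256" rule above, assuming you are experimenting with raw throughput values. The helper names are hypothetical and do not come from ccminer; finding the value that reaches 99-100% GPU usage still has to be done by measurement.

```c
#include <assert.h>
#include <stdint.h>

#define THROUGHPUT_STEP 256u   /* throughput must be a multiple of this */

/* Hypothetical helper: round a candidate throughput up to the next
   multiple of 256 (values that already are multiples stay unchanged). */
static uint32_t round_up_throughput(uint32_t throughput)
{
    /* guard against wraparound in the addition below */
    assert(throughput <= UINT32_MAX - (THROUGHPUT_STEP - 1u));
    return (throughput + THROUGHPUT_STEP - 1u) / THROUGHPUT_STEP * THROUGHPUT_STEP;
}

/* Hypothetical helper: is this a usable (non-zero, 256-aligned) throughput? */
static int is_valid_throughput(uint32_t throughput)
{
    return throughput != 0u && throughput % THROUGHPUT_STEP == 0u;
}
```

Any power of two from 1 << 8 upward (for example the 524288 produced by -i 19) already passes is_valid_throughput.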
hero member
Activity: 789
Merit: 501
Which -i parameter is best for the 750 Ti?
legendary
Activity: 1484
Merit: 1082
ccminer/cpuminer developer
Weird results with your last changes, so I will do nothing more on groestl ;)
I added the -i param to prevent these useless throughput commits
https://github.com/tpruvot/ccminer/commit/9f62014690a479976c6ba6af1ed5f51c07f7e86a

This is nice. Note that the throughput should be editable on the command line as well. Quark runs 10% faster after I multiplied the throughput by 10 (750 Ti/970/980). Same for NIST-5. The only problem is when mining at low difficulty.

-i param = command line :/
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
Been working on Neoscrypt, so still at 9.6MH/s on my 290X. 980 hasn't caught up yet, though.

My check-ins today improve the X11 hash rate by another 50-100 KHASH on the stock-clocked 980 (7850-7900 KHASH). With water cooling, perhaps we can reach 9200 KHASH.
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
Weird results with your last changes, so I will do nothing more on groestl ;)
I added the -i param to prevent these useless throughput commits
https://github.com/tpruvot/ccminer/commit/9f62014690a479976c6ba6af1ed5f51c07f7e86a

This is nice. Note that the throughput should be editable on the command line as well. Quark runs 10% faster after I multiplied the throughput by 10 (750 Ti/970/980). Same for NIST-5. The only problem is when mining at low difficulty.
legendary
Activity: 1484
Merit: 1082
ccminer/cpuminer developer
Weird results with your last changes, so I will do nothing more on groestl ;)

I added the -i param to prevent these useless throughput commits
https://github.com/tpruvot/ccminer/commit/9f62014690a479976c6ba6af1ed5f51c07f7e86a
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
I found a way to improve your groestl change a bit (committed)
EDIT: hmm, in fact not exactly, but... it's hard to compare
I didn't get any speedup when testing your changes, but I improved my groestl; it should now be 2-3% faster.

this:
int andmask1 = ((~((threadIdx.x & 0x03) - 3)) & 0xffff0000);

rewritten to this:

uint32_t andmask1 = -((threadIdx.x & 0x03) == 3) & 0xffff0000;

and some small other changes.

The % operator (modulo) is more expensive than a bitwise AND.
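
A small host-side check of the two points above, written as plain C so it can be compiled without CUDA. The parameter tid stands in for threadIdx.x, and all function names are made up for illustration. It shows that the old and new mask expressions agree for every lane, and that for unsigned values x % 4 can be replaced by x & 3.

```c
#include <stdint.h>

/* Original form: (lane - 3) wraps around for lanes 0..2, so only
   lane 3 yields all ones after the bitwise NOT. */
static uint32_t andmask_old(uint32_t tid)
{
    return (~((tid & 0x03u) - 3u)) & 0xffff0000u;
}

/* Rewritten form: the comparison yields 0 or 1; negating 1 gives
   all ones, which is then masked down to the upper 16 bits. */
static uint32_t andmask_new(uint32_t tid)
{
    return (uint32_t)-(int32_t)((tid & 0x03u) == 3u) & 0xffff0000u;
}

/* Returns 1 if both forms agree for thread ids 0..1023. */
static int masks_agree(void)
{
    for (uint32_t tid = 0; tid < 1024u; ++tid)
        if (andmask_old(tid) != andmask_new(tid))
            return 0;
    return 1;
}

/* Returns 1 if x % 4 equals x & 3 for the unsigned values 0..1023
   (this holds for any power-of-two divisor with the matching mask). */
static int mod4_matches_and3(void)
{
    for (uint32_t x = 0; x < 1024u; ++x)
        if ((x % 4u) != (x & 3u))
            return 0;
    return 1;
}
```

Only every fourth thread (lane 3 of each group of four) gets the mask 0xffff0000; all other lanes get 0.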
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
Of all the betas I have sent out, version 6 seems to be the most stable. It has been running stable all night with +150 on the GPU clock and no "booooo" results (rejected shares).
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
Perf is reduced on the 750 Ti (Linux); otherwise, I found a way to improve your groestl change a bit (committed)
EDIT: hmm, in fact not exactly, but... it's hard to compare

This version was tweaked to run fast on the 980: launch bounds, threads, code. I think I will need to add a test for compute 5.0 and write separate code for the 750 Ti in some places.
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
You get this error if you try to run compute52.exe on a 750 Ti. The compute 5.2 build is for the 970 and 980 cards only.
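
To illustrate why the 5.2 build fails on a 750 Ti, here is a hedged sketch of the usual CUDA binary compatibility rule: code compiled only for one compute capability runs on devices with the same major version and an equal or higher minor version. The helper name is hypothetical, and the sketch assumes the build contains no PTX fallback. In a real program the device values would come from the `major` and `minor` fields filled in by cudaGetDeviceProperties.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: can a binary built only for compute capability
   build_major.build_minor run on a device of dev_major.dev_minor?
   Assumes no PTX is embedded for just-in-time compilation. */
static bool build_runs_on(int build_major, int build_minor,
                          int dev_major, int dev_minor)
{
    assert(build_major > 0 && dev_major > 0);
    assert(build_minor >= 0 && dev_minor >= 0);
    return dev_major == build_major && dev_minor >= build_minor;
}
```

With the 750 Ti at compute 5.0 and the 970/980 at compute 5.2, build_runs_on(5, 2, 5, 0) is false, while build_runs_on(5, 0, 5, 2) is true.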
full member
Activity: 170
Merit: 100
Code:
@ECHO off
setx GPU_MAX_ALLOC_PERCENT 100
ccminer.exe -a x11 -o stratum+tcp://us1.coinking.io:6666 -u Travis9x.ASRockX11 -p x -D
PAUSE
Code:
SUCCESS: Specified value was saved.
*** ccMiner for nVidia GPUs by Christian Buchner and Christian H. ***
         This is the forked version 1.4.7.SP (sp-hash@github)
          Built with VC++ 2013 and nVidia CUDA SDK 6.5

          based on pooler-cpuminer 2.3.2 (c) 2010 Jeff Garzik, 2012 pooler
            and HVC extension from http://hvc.1gh.com/

        Cuda additions Copyright 2014 Christian Buchner, Christian H.

        Include some of djm34 additions, cleaned by Tanguy Pruvot
                  Optimized Kernals By SP^Cryptoburnes

[2014-11-08 19:18:16] 2 miner threads started, using 'x11' algorithm.
[2014-11-08 19:18:16] Starting Stratum on stratum+tcp://us1.coinking.io:6666
[2014-11-08 19:18:16] Binding thread 0 to cpu 0
[2014-11-08 19:18:16] Binding thread 1 to cpu 1
[2014-11-08 19:18:17] Failed to get Stratum session id
[2014-11-08 19:18:17] Stratum difficulty set to 0.004
[2014-11-08 19:18:17] DEBUG: job_id=45ecd6d dc92 xnonce2=00000000 time=19:11:57
[2014-11-08 19:18:17] us1.coinking.io:6666 sent x11 block 2849
[2014-11-08 19:18:17] sleeptime: 500 ms
[2014-11-08 19:18:17] job 45ecd6d dc92 target change: f9ff060000 (1.0)
[2014-11-08 19:18:17] sleeptime: 500 ms
[2014-11-08 19:18:17] job 45ecd6d dc92 00000000
[2014-11-08 19:18:17] job 45ecd6d dc92 target change: f9ff060000 (1.0)
[2014-11-08 19:18:17] job 45ecd6d dc92 7fffffff
Cuda error in file 'C:/code/ccminer-sp/x11/cuda_x11_simd512.cu' in line 647 : invalid texture reference.
Press any key to continue . . .
Any thoughts about this texture reference error?
legendary
Activity: 1484
Merit: 1082
ccminer/cpuminer developer
Checked in some more modded kernels:

x11 +300KHASH (980) NIST5 boost, faster x13,x15

https://github.com/sp-hash/ccminer

Perf is reduced on the 750 Ti (Linux); otherwise, I found a way to improve your groestl change a bit (committed)
EDIT: hmm, in fact not exactly, but... it's hard to compare

Profile of my current version for x11 on a 750 Ti / Linux (2800 kH):
Code:
Time(%)      Time     Calls       Avg       Min       Max  Name
 20.75%  3.64387s        93  39.181ms  39.064ms  41.831ms  x11_echo512_gpu_hash_64(int, unsigned int, unsigned long*, unsigned int*)
 18.75%  3.29341s        94  35.036ms  34.963ms  39.033ms  quark_groestl512_gpu_hash_64_quad(int, unsigned int, unsigned int*, unsigned int*)
 12.87%  2.26079s        93  24.310ms  24.180ms  27.077ms  x11_shavite512_gpu_hash_64(int, unsigned int, unsigned long*, unsigned int*)
 11.11%  1.95157s        93  20.985ms  20.926ms  23.382ms  x11_simd512_gpu_expand_64(int, unsigned int, unsigned long*, unsigned int*, uint4*)
  7.24%  1.27073s        94  13.518ms  13.483ms  15.056ms  x11_cubehash512_gpu_hash_64(int, unsigned int, unsigned long*, unsigned int*)
  5.32%  933.86ms        94  9.9347ms  9.8739ms  11.096ms  quark_jh512_gpu_hash_64(int, unsigned int, unsigned long*, unsigned int*)
  5.04%  884.74ms        94  9.4122ms  9.2574ms  10.502ms  x11_luffa512_gpu_hash_64(int, unsigned int, unsigned long*, unsigned int*)
  3.08%  540.45ms        94  5.7494ms  5.7279ms  6.3724ms  quark_bmw512_gpu_hash_64(int, unsigned int, unsigned long*, unsigned int*)
  3.07%  539.01ms        93  5.7958ms  5.7538ms  5.8993ms  x11_simd512_gpu_compress2_64(int, unsigned int, unsigned long*, unsigned int*, uint4*, int*)
  2.80%  491.53ms        93  5.2852ms  5.1886ms  5.4446ms  x11_simd512_gpu_compress1_64(int, unsigned int, unsigned long*, unsigned int*, uint4*, int*)
  2.76%  484.47ms        94  5.1540ms  5.1358ms  5.7421ms  quark_blake512_gpu_hash_80(int, unsigned int, void*)
  2.71%  475.78ms        94  5.0615ms  5.0070ms  5.6117ms  quark_skein512_gpu_hash_64(int, unsigned int, unsigned long*, unsigned int*)
  2.60%  456.75ms        94  4.8591ms  4.8225ms  5.4034ms  quark_keccak512_gpu_hash_64(int, unsigned int, unsigned long*, unsigned int*)
  1.61%  283.04ms        93  3.0434ms  3.0159ms  3.3809ms  x11_simd512_gpu_final_64(int, unsigned int, unsigned long*, unsigned int*, uint4*, int*)
  0.28%  49.941ms        93  537.00us  534.29us  543.03us  cuda_check_gpu_hash_64(int, unsigned int, unsigned int*, unsigned int*, unsigned int*)
legendary
Activity: 1484
Merit: 1082
ccminer/cpuminer developer
A tip for Windows users using Chrome:

type chrome://flags/ and disable WebGL

Chrome will be faster if you mine on the GPU (and the miner too) ;)

This made no difference to hash rate on my win 8 mining rigs.

It's not for rigs; I guess you don't use Chrome on them. It's more for normal users who, like me, have recently seen decreased performance on all algos (when Chrome is open)...
sr. member
Activity: 255
Merit: 250
A tip for Windows users using Chrome:

type chrome://flags/ and disable WebGL

Chrome will be faster if you mine on the GPU (and the miner too) ;)

This made no difference to hash rate on my win 8 mining rigs.
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
Checked in some more modded kernels:

x11 +300KHASH (980) NIST5 boost, faster x13,x15

https://github.com/sp-hash/ccminer