Topic: CCminer(SP-MOD) Modded NVIDIA Maxwell / Pascal kernels. - page 1239. (Read 2347601 times)

legendary
Activity: 1512
Merit: 1000
quarkchain.io
My 970 and 980 have been working with the latest miner on X13 for several hours, but report about 3.5-3.8% of losses
sp_
legendary
Activity: 2926
Merit: 1087
Team Black developer
Today I worked on x13 (fugue)

      x0 = ((c0 ^ r0) & SPH_C32(0xFF000000)) \
         | ((c1 ^ r1) & SPH_C32(0x00FF0000)) \
         | ((c2 ^ r2) & SPH_C32(0x0000FF00)) \
         | ((c3 ^ r3) & SPH_C32(0x000000FF)); \
      x1 = ((c1 ^ (r0 << 8)) & SPH_C32(0xFF000000)) \
         | ((c2 ^ (r1 << 8)) & SPH_C32(0x00FF0000)) \
         | ((c3 ^ (r2 << 8)) & SPH_C32(0x0000FF00)) \
         | ((c0 ^ (r3 >> 24)) & SPH_C32(0x000000FF)); \
      x2 = ((c2 ^ (r0 << 16)) & SPH_C32(0xFF000000)) \
         | ((c3 ^ (r1 << 16)) & SPH_C32(0x00FF0000)) \
         | ((c0 ^ (r2 >> 16)) & SPH_C32(0x0000FF00)) \
         | ((c1 ^ (r3 >> 16)) & SPH_C32(0x000000FF)); \
      x3 = ((c3 ^ (r0 << 24)) & SPH_C32(0xFF000000)) \
         | ((c0 ^ (r1 >> 8)) & SPH_C32(0x00FF0000)) \
         | ((c1 ^ (r2 >> 8)) & SPH_C32(0x0000FF00)) \
         | ((c2 ^ (r3 >> 8)) & SPH_C32(0x000000FF)); \

Replaced with:
      t0 = __byte_perm(c0, c1, 0x0145);\
      t1 = __byte_perm(c0, c1, 0x2367);\
      t2 = __byte_perm(c2, c3, 0x0145);\
      t3 = __byte_perm(c2, c3, 0x2367);\
      t4 = __byte_perm(t0, t3, 0x0347);\
      t6 = __byte_perm(t1, t2, 0x4703);\
      t7 = __byte_perm(c1, c2, 0x0505);\
      t8 = __byte_perm(c0, c3, 0x6363);\
      t9 = __byte_perm(t7, t8, 0x0145);\
      t10 = __byte_perm(c0, c3, 0x4141);\
      t11 = __byte_perm(c1, c2, 0x1717);\
      t12 = __byte_perm(t10, t11, 0x0145);\
      t13 = __byte_perm(r0, r1, 0x0505);\
      t14 = __byte_perm(r2, r3, 0x3737);\
      t15 = __byte_perm(t13, t14, 0x0145);\
      t16 = __byte_perm(r0, r1, 0x1616);\
      t17 = __byte_perm(r2, r3, 0x3434);\
      t18 = __byte_perm(t16, t17, 0x0145);\
      t19 = __byte_perm(r0, r1, 0x2727);\
      t20 = __byte_perm(r2, r3, 0x0505);\
      t21 = __byte_perm(t19, t20, 0x0145);\
      t22 = __byte_perm(r0, r1, 0x3434);\
      t23 = __byte_perm(r0, r2, 0x5151);\
      t24 = __byte_perm(t22, t23, 0x0145);\
      x0 = t4^t15;\
      x1 = t9^t18;\
      x2 = t6^t21;\
      x3 = t12^t24;\

There is a bug somewhere in the perms, so the code doesn't hash correctly. The PTX shows about half as many assembly instructions, but the speed is the same or perhaps a little bit faster. Seems to be a dead end. :/
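
One way to hunt a bug like this is to model `__byte_perm` on the host and compare a permute-based expression against the mask-and-OR reference over random inputs. The sketch below does that for `x0` only. The helper names (`byte_perm`, `x0_masks`, `x0_perms`, `count_mismatches`) and the selectors `0x0050`, `0x7200`, `0x7610` are illustrative assumptions, not the selectors from the post; the model assumes each selector nibble uses its low 3 bits to pick one of the 8 bytes of `{y:x}`.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side model of CUDA's __byte_perm(x, y, s): the bytes of {y:x} are
   numbered 0..7 (x holds 0..3, y holds 4..7); nibble i of s selects the
   source byte for result byte i. Only the low 3 bits of a nibble are used. */
uint32_t byte_perm(uint32_t x, uint32_t y, uint32_t s)
{
    uint64_t src = ((uint64_t)y << 32) | x;
    uint32_t out = 0;
    for (int i = 0; i < 4; i++) {
        uint32_t sel = (s >> (4 * i)) & 7u;
        out |= (uint32_t)((src >> (8 * sel)) & 0xFFu) << (8 * i);
    }
    return out;
}

/* Reference: the mask-and-OR form of x0 from the original macro. */
uint32_t x0_masks(const uint32_t c[4], const uint32_t r[4])
{
    return ((c[0] ^ r[0]) & 0xFF000000u)
         | ((c[1] ^ r[1]) & 0x00FF0000u)
         | ((c[2] ^ r[2]) & 0x0000FF00u)
         | ((c[3] ^ r[3]) & 0x000000FFu);
}

/* Candidate: the same value built with byte permutes.
   Selectors here are illustrative, not the ones posted above. */
uint32_t x0_perms(const uint32_t c[4], const uint32_t r[4])
{
    uint32_t lo = byte_perm(c[3] ^ r[3], c[2] ^ r[2], 0x0050); /* bytes 0, 1 */
    uint32_t hi = byte_perm(c[1] ^ r[1], c[0] ^ r[0], 0x7200); /* bytes 2, 3 */
    return byte_perm(lo, hi, 0x7610);
}

/* Small xorshift PRNG so the check is deterministic. */
uint32_t xorshift32(uint32_t *state)
{
    uint32_t v = *state;
    v ^= v << 13;
    v ^= v >> 17;
    v ^= v << 5;
    *state = v;
    return v;
}

/* Number of random inputs on which the two forms disagree. */
int count_mismatches(int trials)
{
    uint32_t seed = 0x12345678u;
    int bad = 0;
    for (int t = 0; t < trials; t++) {
        uint32_t c[4], r[4];
        for (int i = 0; i < 4; i++) {
            c[i] = xorshift32(&seed);
            r[i] = xorshift32(&seed);
        }
        if (x0_masks(c, r) != x0_perms(c, r))
            bad++;
    }
    return bad;
}

/* Aborts if the permute form ever differs from the reference. */
void self_check(void)
{
    assert(count_mismatches(100000) == 0);
}
```

The same harness could be extended to `x1`..`x3` by adding a reference and a candidate function per word, which would show which output word a wrong selector affects.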
legendary
Activity: 2716
Merit: 1116
All ccminer builds crash at random times when I try to close the miner running on the 980 cards. Does anybody have a clue about this?

Only the 980 crashes; the 750Ti closes the miner's window OK.

Cheers
sr. member
Activity: 285
Merit: 250
Guys, can you please post your configs and OCs?
On my EVGA SC GTX970 ACX 1.0 I can't get more than:
X11 -  5000-5200
X13 -  4000-4500

It's driving me nuts!
Restart your computer first
use this ccminer: https://github.com/tpruvot/ccminer/releases
there is no real 'config' u just set pool and set x11 as algo
you should get 5600 stock and 6100 if you OC your clock by 200 and leave memory untouched, using EVGA Precision X 16
sr. member
Activity: 285
Merit: 250
Thanks for testing guys. I managed to remove the bugs in quark and jackpotcoin.
I will compile a version with compute 5.2 which will give boosts on 970 and 980

excited for this :)
As I mentioned earlier in my testing report for the GTX970, no improvements on x11
legendary
Activity: 3164
Merit: 1003
Thanks for testing guys. I managed to remove the bugs in quark and jackpotcoin.
I will compile a version with compute 5.2 which will give boosts on 970 and 980
You're welcome, I do have a lot of booo's. Please let me know when you have it compiled for 750ti. Thank you.
sr. member
Activity: 271
Merit: 251
Guys, can you please post your configs and OCs?
On my EVGA SC GTX970 ACX 1.0 I can't get more than:
X11 -  5000-5200
X13 -  4000-4500

It's driving me nuts!
legendary
Activity: 3248
Merit: 1070
https://github.com/tpruvot/ccminer/releases - this miner seems more optimized on 900s
GTX970 on x13 - 5000 kH/s
GTX980 on x13 - 5850 kH/s

how about consumption?
Hmm, I don't have a wattmeter, but MSI AB shows power usage between 85-90% on both cards

still room for improvement then, i think a 970 could reach 6k on x13, at peak maybe
legendary
Activity: 2716
Merit: 1116
3x980 + 1x750Ti:

subefotos
sr. member
Activity: 271
Merit: 251
driver version 344.16
"-q -r 3 -R 10 -a" as config
no OC - I click on the DEFAULT in PrecisionX
Win7 x64 4GB
I have some experience mining, but you could say I'm a newbie in CUDA mining.
legendary
Activity: 1512
Merit: 1000
quarkchain.io
I have the EVGA SC 970 ACX 1.0 and only getting 5100 at X11.
Using the https://github.com/tpruvot/ccminer/releases
Default clocks and everything.
Which build are you guys using and what clocks?
Tell us more: driver version? OC settings? Miner cfg?
sr. member
Activity: 271
Merit: 251
I have the EVGA SC 970 ACX 1.0 and only getting 5100 at X11.
Using the https://github.com/tpruvot/ccminer/releases
Default clocks and everything.
Which build are you guys using and what clocks?
legendary
Activity: 1512
Merit: 1000
quarkchain.io
Hmm, I saw something else: with this boost version the 970 can get almost no accepted shares. Strange...
EDIT: I lowered the OC and it started accepting shares, at 4970 kH/s
sp_
legendary
Activity: 2926
Merit: 1087
Team Black developer
Thanks for testing guys. I managed to remove the bugs in quark and jackpotcoin.
I will compile a version with compute 5.2 which will give boosts on 970 and 980
legendary
Activity: 1512
Merit: 1000
quarkchain.io
https://github.com/tpruvot/ccminer/releases - this miner seems more optimized on 900s
GTX970 on x13 - 5000 kH/s
GTX980 on x13 - 5850 kH/s

how about consumption?
Hmm, I don't have a wattmeter, but MSI AB shows power usage between 85-90% on both cards
legendary
Activity: 3248
Merit: 1070
https://github.com/tpruvot/ccminer/releases - this miner seems more optimized on 900s
GTX970 on x13 - 5000 kH/s
GTX980 on x13 - 5850 kH/s

how about consumption?
legendary
Activity: 1512
Merit: 1000
quarkchain.io
https://github.com/tpruvot/ccminer/releases - this miner seems more optimized on 900s
GTX970 on x13 - 5000 kH/s
GTX980 on x13 - 5850 kH/s
sp_
legendary
Activity: 2926
Merit: 1087
Team Black developer
I don't see any improvements on X11 against my version:

Yours :
Code:
==30576== Profiling application: ccminersp.exe -a x11 --benchmark
==30576== Profiling result:
Time(%)      Time     Calls       Avg       Min       Max  Name
 20.95%  14.5236s       296  49.066ms  48.325ms  52.197ms  x11_echo512_gpu_hash_64(int, unsi
 19.27%  13.3595s       297  44.982ms  44.554ms  50.817ms  quark_groestl512_gpu_hash_64_quad
 11.74%  8.13434s       296  27.481ms  26.976ms  30.969ms  x11_shavite512_gpu_hash_64(int, u
 10.63%  7.36593s       296  24.885ms  24.556ms  27.590ms  x11_simd512_gpu_expand_64(int, un
  6.99%  4.84835s       296  16.380ms  16.199ms  18.494ms  x11_cubehash512_gpu_hash_64(int,
  5.29%  3.66619s       297  12.344ms  12.201ms  13.939ms  quark_jh512_gpu_hash_64(int, unsi
  5.02%  3.47754s       296  11.748ms  11.322ms  13.248ms  x11_luffa512_gpu_hash_64(int, uns
  4.24%  2.93967s       296  9.9313ms  9.4073ms  10.335ms  x11_simd512_gpu_compress2_64(int,
  3.51%  2.43131s       296  8.2139ms  8.0358ms  8.5024ms  x11_simd512_gpu_compress1_64(int,
  2.87%  1.99201s       297  6.7071ms  6.3100ms  7.6564ms  quark_bmw512_gpu_hash_64(int, uns
  2.69%  1.86559s       297  6.2814ms  6.1864ms  7.1012ms  quark_skein512_gpu_hash_64(int, u
  2.66%  1.84484s       297  6.2116ms  6.1330ms  7.0152ms  quark_keccak512_gpu_hash_64(int,
  2.38%  1.65098s       297  5.5589ms  5.4953ms  6.2752ms  quark_blake512_gpu_hash_80(int, u
  1.47%  1.01983s       296  3.4454ms  3.3832ms  3.6530ms  x11_simd512_gpu_final_64(int, uns
  0.28%  195.46ms       295  662.57us  652.23us  699.94us  cuda_check_gpu_hash_64(int, unsig
  0.00%  875.13us       295  2.9660us  2.4000us  12.479us  [CUDA memset]
  0.00%  627.25us       295  2.1260us  1.3430us  9.0870us  [CUDA memcpy DtoH]
  0.00%  63.389us        82     773ns     512ns  1.2480us  [CUDA memcpy HtoD]

and mine :
Code:
Time(%)      Time     Calls       Avg       Min       Max  Name
 20.80%  6.36219s       146  43.577ms  42.978ms  44.897ms  x11_echo512_gpu_hash_64(int, unsigned int, __int64*, unsigned int*)
 18.86%  5.77071s       146  39.525ms  39.119ms  44.727ms  quark_groestl512_gpu_hash_64_quad(int, unsigned int, unsigned int*, unsigned int*)
 11.66%  3.56600s       146  24.425ms  24.071ms  27.458ms  x11_shavite512_gpu_hash_64(int, unsigned int, __int64*, unsigned int*)
 10.59%  3.24123s       146  22.200ms  21.875ms  24.802ms  x11_simd512_gpu_expand_64(int, unsigned int, __int64*, unsigned int*, uint4*)
  7.53%  2.30443s       146  15.784ms  15.575ms  17.894ms  x11_cubehash512_gpu_hash_64(int, unsigned int, __int64*, unsigned int*)
  5.30%  1.62276s       146  11.115ms  10.976ms  12.522ms  quark_jh512_gpu_hash_64(int, unsigned int, __int64*, unsigned int*)
  4.99%  1.52627s       146  10.454ms  10.235ms  11.892ms  x11_luffa512_gpu_hash_64(int, unsigned int, __int64*, unsigned int*)
  4.21%  1.28899s       146  8.8287ms  8.4681ms  9.1675ms  x11_simd512_gpu_compress2_64(int, unsigned int, __int64*, unsigned int*, uint4*, int*)
  3.48%  1.06443s       146  7.2906ms  6.9558ms  7.5275ms  x11_simd512_gpu_compress1_64(int, unsigned int, __int64*, unsigned int*, uint4*, int*)
  3.05%  934.36ms       147  6.3562ms  6.2786ms  7.1199ms  quark_bmw512_gpu_hash_64(int, unsigned int, __int64*, unsigned int*)
  2.68%  820.56ms       146  5.6203ms  5.5330ms  6.3415ms  quark_skein512_gpu_hash_64(int, unsigned int, __int64*, unsigned int*)
  2.64%  807.37ms       146  5.5299ms  5.4526ms  6.2354ms  quark_keccak512_gpu_hash_64(int, unsigned int, __int64*, unsigned int*)
  2.63%  804.03ms       147  5.4696ms  5.4003ms  6.1889ms  quark_blake512_gpu_hash_80(int, unsigned int, void*)
  1.46%  446.72ms       146  3.0597ms  3.0059ms  3.1561ms  x11_simd512_gpu_final_64(int, unsigned int, __int64*, unsigned int*, uint4*, int*)
  0.11%  33.535ms       145  231.28us  214.59us  265.21us  cuda_check_gpu_hash_64(int, unsigned int, unsigned int*, unsigned int*, unsigned int*)
  0.00%  428.88us       145  2.9570us  2.4000us  9.4720us  [CUDA memset]
  0.00%  320.11us       145  2.2070us  1.3440us  9.4390us  [CUDA memcpy DtoH]
  0.00%  77.406us       110     703ns

My version is compiled with compute 5.0 only. The next version will be compiled with 5.2 as well. Something strange with your numbers though. I didn't modify luffa512, but my version is slower in your test. Perhaps compute 5.2 compilation will help
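
To make the luffa512 observation concrete, the Avg columns of the two profiles above can be compared kernel by kernel. The sketch below is only an illustration: the struct and function names are hypothetical, `sp_ms` holds the Avg values from the `ccminersp.exe` profile and `other_ms` those from the second profile, and only four of the listed kernels are included. With these figures the unmodified luffa512 kernel and echo512 both come out about 12% slower in the first profile.

```c
#include <stdio.h>

/* Average kernel time in milliseconds, copied from the two profiles. */
struct kernel_avg {
    const char *name;
    double sp_ms;     /* Avg column, ccminersp.exe profile */
    double other_ms;  /* Avg column, second profile */
};

const struct kernel_avg KERNELS[] = {
    { "x11_echo512_gpu_hash_64",           49.066, 43.577 },
    { "quark_groestl512_gpu_hash_64_quad", 44.982, 39.525 },
    { "x11_cubehash512_gpu_hash_64",       16.380, 15.784 },
    { "x11_luffa512_gpu_hash_64",          11.748, 10.454 },
};
const int NUM_KERNELS = (int)(sizeof KERNELS / sizeof KERNELS[0]);

/* Ratio above 1.0 means the kernel took longer in the first profile. */
double slowdown(const struct kernel_avg *k)
{
    return k->sp_ms / k->other_ms;
}

/* Print one ratio per kernel for a quick side-by-side look. */
void print_slowdowns(void)
{
    for (int i = 0; i < NUM_KERNELS; i++)
        printf("%-36s %.3f\n", KERNELS[i].name, slowdown(&KERNELS[i]));
}
```

A kernel whose source did not change showing a similar ratio to the others suggests the difference comes from outside the kernel code, which fits the guess that the compute 5.0 build target plays a role.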
newbie
Activity: 9
Merit: 0
after some trouble getting it started i got this:

x11: 2750 - 2850 khash (nvminer1.2u-d8: 2740 - 2780 khash)
x13: 2100 - 2180 khash (nvminer1.2u-d8: 2090 - 2110 khash)
x15: 1880 - 1910 khash (nvminer1.2u-d8: 1840 - 1860 khash)

750ti, Core: 1332 Mhz, Ram: 3425 Mhz
legendary
Activity: 3164
Merit: 1003
Using CCMINER -djm34-m7v7, average of 5 x Gigabyte GTX 750 Ti the black edition (GV-N75TWF2BK-2GI) with 50/50 OC, not solomining.
And ccminer-Maxwell.7z, the same 5 x Gigabyte GTX 750 Ti the black edition (GV-N75TWF2BK-2GI) with 50/50 OC, not solomining.
With different pools.
X11 ~2675,~2725  X13 ~2050,~2155  X15 ~1825,~1910 nist5  ~8250,~8360  quark ~4100,~5280

I'm going to try overclocking more; I may get more.
   
   
   