Topic: [ANN] cudaMiner & ccMiner CUDA based mining applications [Windows/Linux/MacOSX] - page 240. (Read 3426985 times)

newbie
Activity: 48
Merit: 0
Any viruses or trojans inside the cudaMiner software?

Claymore made $100,000 on the Monero miner with a 5% tip...

Hidden tip or a legit fee for using his apps?
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
Any viruses or trojans inside the cudaMiner software?

Claymore made $100,000 on the Monero miner with a 5% tip...
legendary
Activity: 1223
Merit: 1000
Any viruses or trojans inside the cudaMiner software?
sr. member
Activity: 311
Merit: 250
Hey,

I'm using ccminer 2.1 and keep getting the error "abnormal hashes, exiting with code 211!" when mining x11 on four 750 Ti cards. Any advice as to what may be causing this? ccminer just shuts down after the error, which occurs after a few hours of mining.



Thanks


Friend, use ccminer v1.1. I use it here and have no problems.
member
Activity: 65
Merit: 10
Hey,

I'm using ccminer 2.1 and keep getting the error "abnormal hashes, exiting with code 211!" when mining x11 on four 750 Ti cards. Any advice as to what may be causing this? ccminer just shuts down after the error, which occurs after a few hours of mining.



Thanks
hero member
Activity: 675
Merit: 514
I am already bound to ancient versions of ccminer, because I still have compute 2.1...

Can't such areas really be handled with #ifdef, keeping a not-so-good version for older GPUs?
But maybe I wish for too much. Obviously everybody mostly improves the versions for their own needs...
Sure, that's possible.
Example for the rotate function in sha256:
Code:
#if __CUDA_ARCH__ < 350
#define rrot(x, bits) (((x) >> (bits)) | ((x) << (32 - (bits))))
#else
#define rrot(x, bits) __funnelshift_r((x), (x), (bits))
#endif
But usually there are other reasons for not supporting older cards.
full member
Activity: 266
Merit: 100
I've updated to the beta drivers 340.88 and the hash rate increased slightly. Now getting 7.1 MH/s or so, but still not the 7.8/7.9 MH/s you are getting. I also tried the minep.it pool but, as I was expecting, it didn't change anything.

Any more ideas?

I have not followed the whole conversation, but I am wondering if one of you is running on risers and the other not?

That's an interesting theory. I read somewhere that indeed it could affect the hash rate http://cryptomining-blog.com/1276-first-impressions-from-a-6-card-mining-rig-using-geforce-gtx-750-ti-gpus/

I, on my side, am using risers. Powered ones, for that matter. The difference is rather significant: ~800/900 kH/s.

I am also using ASUS, not Gigabyte. Perhaps that could also be the reason.

Good observation.  No risers of any kind for me.

Well I don't think it's a riser issue. I've connected the 2 GPUs directly to the slots of the mobo and I get the same hashrate as with risers... back to square one.
legendary
Activity: 1400
Merit: 1050
I won't have the source available until after the weekend as I want to compile/test on Linux first and need to get the makefiles set up properly.
However, if you want to start playing now, pull down the latest from djm34 and use this to test with. https://github.com/djm34/ccminer
If I remember correctly, I tried to compile for compute 5.0 under CUDA 6 and it broke something in the FRESH algo.  Because of this I went back to compute 3/3.5 on CUDA 5.5 for the last release.
So the point being: after trying your changes, please make sure to test every algo to make sure something else doesn't break under CUDA 6 and compute 5 (if you try this).
BTW, are you doing your testing on Windows or Linux?

Ok. I'll start with the djm34 branch. But in order to test all the algos I need the unified source code. I will be developing on Windows.
If possible, try to put changes in cuda_helper.h rather than breaking everybody else's code...
hero member
Activity: 868
Merit: 1000
Can anyone send me a link to the latest (merged) NVminer source code?

Seems like the funnel shift is missing in some of the 11 algorithms (with a ROL, compute 5.0+ devices will get a boost):


From Github (ccminer 1.2):

cuda_x11_luffa512.cu:

#define TWEAK(a0,a1,a2,a3,j)\
    a0 = (a0<<(j))|(a0>>(32-j));\
    a1 = (a1<<(j))|(a1>>(32-j));\
    a2 = (a2<<(j))|(a2>>(32-j));\
    a3 = (a3<<(j))|(a3>>(32-j));
   
#define MIXWORD(a0,a4)\
    a4 ^= a0;\
    a0  = (a0<<2) | (a0>>(30));\
    a0 ^= a4;\
    a4  = (a4<<14) | (a4>>(18));\
    a4 ^= a0;\
    a0  = (a0<<10) | (a0>>(22));\
    a0 ^= a4;\
    a4  = (a4<<1) | (a4>>(31));   

cuda_x11_cubehash512.cu:
   
#define ROTATEUPWARDS7(a) (((a) << 7) | ((a) >> 25))
#define ROTATEUPWARDS11(a) (((a) << 11) | ((a) >> 21))

etc..

By rewriting these macros, the shift+shift+or CUDA instructions can be replaced with a single rol, 3 times faster (Maxwell / compute 5.0+).

Hope you succeed in optimizing the code.
The problem is that it may definitively break compatibility with other versions.

5.0 is the future. Need to get started and pave the way for the 800 series. By the time the 800 series is out, there will be perfectly optimized 5.0 
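
For anyone following along, here is a rough sketch of what the rewrite discussed above could look like. It is not taken from ccminer: `rotl32_sketch` and `MIXWORD_SKETCH` are made-up names, and the 350 threshold is an assumption based on the "compute 3.5 and higher" statement in this thread. The idea is to keep one rotate helper that uses `__funnelshift_l` on newer architectures and falls back to shift+shift+or everywhere else, so older cards still compile. The two `#define`s at the top only exist so the same file also builds with a plain C compiler for testing.

```c
#include <stdint.h>

/* Let a plain C compiler accept the CUDA qualifiers used below. */
#ifndef __CUDACC__
#define __device__
#define __forceinline__ inline
#endif

/* Hypothetical helper, not from ccminer: rotate left by n, valid for 0 < n < 32. */
static __device__ __forceinline__ uint32_t rotl32_sketch(uint32_t x, uint32_t n)
{
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 350
    return __funnelshift_l(x, x, n);    /* one funnel-shift instruction */
#else
    return (x << n) | (x >> (32 - n));  /* shift + shift + or fallback */
#endif
}

/* MIXWORD from cuda_x11_luffa512.cu with each rotate routed through the helper. */
#define MIXWORD_SKETCH(a0, a4) do {      \
    (a4) ^= (a0);                        \
    (a0)  = rotl32_sketch((a0), 2);      \
    (a0) ^= (a4);                        \
    (a4)  = rotl32_sketch((a4), 14);     \
    (a4) ^= (a0);                        \
    (a0)  = rotl32_sketch((a0), 10);     \
    (a0) ^= (a4);                        \
    (a4)  = rotl32_sketch((a4), 1);      \
} while (0)
```

Since both branches compute the same rotation, the hash output should not change; only the instruction count per rotate does. Whether that gives a measurable speedup would still need to be tested per algo.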
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
I won't have the source available until after the weekend as I want to compile/test on Linux first and need to get the makefiles set up properly.
However, if you want to start playing now, pull down the latest from djm34 and use this to test with. https://github.com/djm34/ccminer
If I remember correctly, I tried to compile for compute 5.0 under CUDA 6 and it broke something in the FRESH algo.  Because of this I went back to compute 3/3.5 on CUDA 5.5 for the last release.
So the point being: after trying your changes, please make sure to test every algo to make sure something else doesn't break under CUDA 6 and compute 5 (if you try this).
BTW, are you doing your testing on Windows or Linux?

Ok. I'll start with the djm34 branch. But in order to test all the algos I need the unified source code. I will be developing on Windows.
legendary
Activity: 3668
Merit: 6382
I am already bound to ancient versions of ccminer, because I still have compute 2.1...


Can't such areas really be handled with #ifdef, keeping a not-so-good version for older GPUs?
But maybe I wish for too much. Obviously everybody mostly improves the versions for their own needs...
full member
Activity: 168
Merit: 100
Cuda 6.0


Check:

Version features and specifications

The funnel shift is available for compute 3.5 and higher.

http://en.wikipedia.org/wiki/CUDA

How to inline CUDA Assembly:
http://docs.nvidia.com/cuda/inline-ptx-assembly/index.html#axzz37iRLSsMj

Instruction set.


docs.nvidia.com/cuda/parallel-thread-execution/index.html#logic-and-shift-instructions

8.7.5.1. Logic and Shift Instructions:
(8.7.5.6. Logic and Shift Instructions: shf)


So the following macro should be converted to something like:

a0 = (a0<<(j))|(a0>>(32-j)); --> shf.l.wrap.b32 a0, a0, a0, j;
...

I have some time over the weekend to do the full implementation. Just give me the latest branch to work on...



I won't have the source available until after the weekend as I want to compile/test on Linux first and need to get the makefiles set up properly.

However, if you want to start playing now, pull down the latest from djm34 and use this to test with. https://github.com/djm34/ccminer

If I remember correctly, I tried to compile for compute 5.0 under CUDA 6 and it broke something in the FRESH algo.  Because of this I went back to compute 3/3.5 on CUDA 5.5 for the last release.

So the point being: after trying your changes, please make sure to test every algo to make sure something else doesn't break under CUDA 6 and compute 5 (if you try this).

BTW, are you doing your testing on Windows or Linux?
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
Christian has already broken compatibility for any hardware below compute 3.0. In the Killer Groestl implementation he uses compute 3.0+ instructions, like perm:

static __device__ uint32_t cuda_swab32(uint32_t x)
{
    return __byte_perm(x, 0, 0x0123);
}

I will implement the changes by using a compiler flag like this:

#if __CUDA_ARCH__ >= 130
    return (uint32_t)__double2hiint(__longlong_as_double(x));
#else
    return (uint32_t)(x >> 32);
#endif

So if you compile with compute 5.0 you will get the Maxwell funnel shift instead.

The current ccminer runs at the same speed for compute 3.0, 3.5 and 5.0. This is about to change.
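
As a rough illustration of the compiler-flag approach applied to the byte swap itself (a sketch only, not the actual ccminer code): `swab32_sketch` is a hypothetical name, and the 300 cutoff simply mirrors the "compute 3.0+" statement in this post rather than anything checked against the hardware documentation. The fallback branch does the same byte reversal with plain shifts and masks, which also lets the function build with an ordinary C compiler.

```c
#include <stdint.h>

/* Let a plain C compiler accept the CUDA qualifiers used below. */
#ifndef __CUDACC__
#define __device__
#define __forceinline__ inline
#endif

/* Hypothetical name; reverses the byte order of a 32-bit word like cuda_swab32. */
static __device__ __forceinline__ uint32_t swab32_sketch(uint32_t x)
{
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 300
    /* Assumed cutoff taken from the post: use the byte-permute intrinsic. */
    return __byte_perm(x, 0, 0x0123);
#else
    /* Portable fallback: move each byte to its mirrored position. */
    return (x << 24)
         | ((x << 8) & 0x00ff0000u)
         | ((x >> 8) & 0x0000ff00u)
         | (x >> 24);
#endif
}
```

Both branches give the same result, so which one is selected only affects speed, not correctness.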

full member
Activity: 168
Merit: 100
I've updated to the beta drivers 340.88 and the hash rate increased slightly. Now getting 7.1 MH/s or so, but still not the 7.8/7.9 MH/s you are getting. I also tried the minep.it pool but, as I was expecting, it didn't change anything.

Any more ideas?

I have not followed the whole conversation, but I am wondering if one of you is running on risers and the other not?

That's an interesting theory. I read somewhere that indeed it could affect the hash rate http://cryptomining-blog.com/1276-first-impressions-from-a-6-card-mining-rig-using-geforce-gtx-750-ti-gpus/

I, on my side, am using risers. Powered ones, for that matter. The difference is rather significant: ~800/900 kH/s.

I am also using ASUS, not Gigabyte. Perhaps that could also be the reason.

Good observation.  No risers of any kind for me.
legendary
Activity: 1400
Merit: 1050
Can anyone send me a link to the latest (merged) NVminer source code?

Seems like the funnel shift is missing in some of the 11 algorithms (with a ROL, compute 5.0+ devices will get a boost):


From Github (ccminer 1.2):

cuda_x11_luffa512.cu:

#define TWEAK(a0,a1,a2,a3,j)\
    a0 = (a0<<(j))|(a0>>(32-j));\
    a1 = (a1<<(j))|(a1>>(32-j));\
    a2 = (a2<<(j))|(a2>>(32-j));\
    a3 = (a3<<(j))|(a3>>(32-j));
   
#define MIXWORD(a0,a4)\
    a4 ^= a0;\
    a0  = (a0<<2) | (a0>>(30));\
    a0 ^= a4;\
    a4  = (a4<<14) | (a4>>(18));\
    a4 ^= a0;\
    a0  = (a0<<10) | (a0>>(22));\
    a0 ^= a4;\
    a4  = (a4<<1) | (a4>>(31));   

cuda_x11_cubehash512.cu:
   
#define ROTATEUPWARDS7(a) (((a) << 7) | ((a) >> 25))
#define ROTATEUPWARDS11(a) (((a) << 11) | ((a) >> 21))

etc..

By rewriting these macros, the shift+shift+or CUDA instructions can be replaced with a single rol, 3 times faster (Maxwell / compute 5.0+).

Hope you succeed in optimizing the code.
The problem is that it may definitively break compatibility with other versions.
hero member
Activity: 868
Merit: 1000
Can anyone send me a link to the latest (merged) NVminer source code?

Seems like the funnel shift is missing in some of the 11 algorithms (with a ROL, compute 5.0+ devices will get a boost):


From Github (ccminer 1.2):

cuda_x11_luffa512.cu:

#define TWEAK(a0,a1,a2,a3,j)\
    a0 = (a0<<(j))|(a0>>(32-j));\
    a1 = (a1<<(j))|(a1>>(32-j));\
    a2 = (a2<<(j))|(a2>>(32-j));\
    a3 = (a3<<(j))|(a3>>(32-j));
   
#define MIXWORD(a0,a4)\
    a4 ^= a0;\
    a0  = (a0<<2) | (a0>>(30));\
    a0 ^= a4;\
    a4  = (a4<<14) | (a4>>(18));\
    a4 ^= a0;\
    a0  = (a0<<10) | (a0>>(22));\
    a0 ^= a4;\
    a4  = (a4<<1) | (a4>>(31));   

cuda_x11_cubehash512.cu:
   
#define ROTATEUPWARDS7(a) (((a) << 7) | ((a) >> 25))
#define ROTATEUPWARDS11(a) (((a) << 11) | ((a) >> 21))

etc..

By rewriting these macros, the shift+shift+or CUDA instructions can be replaced with a single rol, 3 times faster (Maxwell / compute 5.0+).

Hope you succeed in optimizing the code.
newbie
Activity: 23
Merit: 0

Not that I can see either.
All closed, and binaries for Windows only :-(

Looking forward to trying something when (if?) the source does get released though.

This weekend I am going to try KopiemTu 1.4 and see what that is like - reading good things about it, and I like the idea of trying some tweaking of the card on linux.
https://litecointalk.org/index.php?topic=16800.0

No sourcecode available in nvminer.zip.
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
Cuda 6.0


Check:

Version features and specifications

The funnel shift is available for compute 3.5 and higher.

http://en.wikipedia.org/wiki/CUDA

How to inline CUDA Assembly:
http://docs.nvidia.com/cuda/inline-ptx-assembly/index.html#axzz37iRLSsMj

Instruction set.


docs.nvidia.com/cuda/parallel-thread-execution/index.html#logic-and-shift-instructions

8.7.5.1. Logic and Shift Instructions:
(8.7.5.6. Logic and Shift Instructions: shf)


So the following macro should be converted to something like:

a0 = (a0<<(j))|(a0>>(32-j)); --> shf.l.wrap.b32 a0, a0, a0, j;
...

I have some time over the weekend to do the full implementation. Just give me the latest branch to work on...
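
A minimal sketch of how the inline PTX version might be wrapped, in case it helps whoever picks this up. `rotl32_ptx_sketch` is a made-up name, and the 350 guard is an assumption taken from the "compute 3.5 and higher" note above. The PTX form is `shf.l.wrap.b32 d, a, b, c`, where `a` is the low word, `b` the high word and `c` the shift amount, so passing the same register for both `a` and `b` gives a rotate. The non-CUDA branch is only there so the function can be compiled and checked on the host.

```c
#include <stdint.h>

/* Let a plain C compiler accept the CUDA qualifiers used below. */
#ifndef __CUDACC__
#define __device__
#define __forceinline__ inline
#endif

/* Hypothetical helper: rotate x left by n (n is taken modulo 32). */
static __device__ __forceinline__ uint32_t rotl32_ptx_sketch(uint32_t x, uint32_t n)
{
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 350
    uint32_t r;
    /* d = upper 32 bits of (b:a) << (c & 31); with a == b this is a rotate. */
    asm("shf.l.wrap.b32 %0, %1, %2, %3;" : "=r"(r) : "r"(x), "r"(x), "r"(n));
    return r;
#else
    n &= 31;                                     /* same wrap behaviour as .wrap */
    return n ? (x << n) | (x >> (32 - n)) : x;   /* avoid an undefined 32-bit shift */
#endif
}
```

In practice the `__funnelshift_l` intrinsic should produce the same instruction without hand-written assembly, so the inline PTX route is mostly useful when the compiler does not pick it up on its own.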

legendary
Activity: 1400
Merit: 1050
Can anyone send me a link to the latest (merged) NVminer source code?

Seems like the funnel shift is missing in some of the 11 algorithms (with a ROL, compute 5.0+ devices will get a boost):


From Github (ccminer 1.2):

cuda_x11_luffa512.cu:

#define TWEAK(a0,a1,a2,a3,j)\
    a0 = (a0<<(j))|(a0>>(32-j));\
    a1 = (a1<<(j))|(a1>>(32-j));\
    a2 = (a2<<(j))|(a2>>(32-j));\
    a3 = (a3<<(j))|(a3>>(32-j));
   
#define MIXWORD(a0,a4)\
    a4 ^= a0;\
    a0  = (a0<<2) | (a0>>(30));\
    a0 ^= a4;\
    a4  = (a4<<14) | (a4>>(18));\
    a4 ^= a0;\
    a0  = (a0<<10) | (a0>>(22));\
    a0 ^= a4;\
    a4  = (a4<<1) | (a4>>(31));   

cuda_x11_cubehash512.cu:
   
#define ROTATEUPWARDS7(a) (((a) << 7) | ((a) >> 25))
#define ROTATEUPWARDS11(a) (((a) << 11) | ((a) >> 21))

etc..

By rewriting these macros, the shift+shift+or CUDA instructions can be replaced with a single rol, 3 times faster (Maxwell / compute 5.0+).
Which CUDA version?
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
No sourcecode available in nvminer.zip.