Topic: [ANN] cudaMiner & ccMiner CUDA based mining applications [Windows/Linux/MacOSX] - page 1064. (Read 3426991 times)

newbie
Activity: 33
Merit: 0
Here are my results for the two new versions. I am using a slightly overclocked GTX 560 with an i5 4670K.

01-12 => 07-12 : I lost 10khash/s (140 => 130)

01-12 => 07-12 : I lost 5khash/s (140 => 135)

I am using the x64 version with these options:

Code:
cudaminer.exe -o stratum+tcp://azertyuiop.com:1234 -O ME.1:1 -H 1 -i 0

My launch config on the 01-12 version, which gives me my best results (142 khash/s):

Code:
 -l F111x2 

01-12 => 07-12 & with -H 2 : I lost 7khash/s (140 => 133)

Am I doing something wrong?

Thanks !
newbie
Activity: 20
Merit: 0
The x86 binary of the new build is a tad faster on my GTX580. Still using the -H1 switch, since my oldie i7-920 is way too fast to be even bothered into a full power state anyway.
newbie
Activity: 28
Merit: 0
I just tried the newest version, both x64 and x86 with the various settings for -H. The x64 builds used to be the most efficient for my GTX560, but the 12-10 x86 build gives an all-around 4% hashrate improvement over the 12-07 x64 builds. This holds for all the -H settings. Excellent stuff!
member
Activity: 98
Merit: 10
2013-12-10 release

Using the new -H 2 option, with either x64 or x86, I'm sitting a few (3-4) khash lower than using -H 1. Obviously it's likely to have a much different effect on a lower end system like an Atom.
sr. member
Activity: 283
Merit: 250
Wow, great work Buchner. Again I love my twin GTX 680s a bit more, which isn't so little to begin with!
hero member
Activity: 756
Merit: 502
Tired of heavy CPU load? Change your -H 1 switch to -H 2 (or just remove the switch, as 2 is now the default). Your GPU will do ALL the work, though it may run a little hotter. This release is now suitable for mining rigs of 1 MHash/s or more running on cheap mainboards with low-end CPUs; even an Intel Atom would do ;) Of course, the nVidia GPUs required for this kind of hash rate aren't cheap...

Download the 2013-12-10 release. I also cleaned up the Readme a bit and fixed a bug that negatively affected hash validation on some cards.

With -H 2 (full offloading to GPU) it may be more efficient to run the x86 binary of cudaminer as the x64 version has increased register pressure in some CUDA kernels, leading to slightly lower hash rates sometimes. Because the cudaminer binary is mostly idling now, there's almost no use running the more bloated x64 binary.

Christian


newbie
Activity: 11
Merit: 0
I also thought so, but I have libcudart5.0 installed...

Edit: it finally compiled successfully, and I don't understand why :)
hero member
Activity: 756
Merit: 502

Probably not using the CUDA 5.0 SDK? This function was added for Kepler-type devices and probably came with CUDA release 5...
newbie
Activity: 11
Merit: 0
I am sorry, what am I doing wrong if I get this kind of output?

make[2]: Entering directory `/home/wizzard/CudaMiner'
g++  -g -O2   -o cudaminer -pthread -L/usr/local/cuda/lib64 cudaminer-cpu-miner.o cudaminer-util.o cudaminer-sha2.o cudaminer-scrypt.o salsa_kernel.o spinlock_kernel.o legacy_kernel.o fermi_kernel.o test_kernel.o titan_kernel.o -L/usr/lib/x86_64-linux-gnu -lcurl compat/jansson/libjansson.a -lpthread  -lcudart -fopenmp 
salsa_kernel.o: In function `find_optimal_blockcount(int, KernelInterface*&, bool&, int&)':
/home/wizzard/CudaMiner/salsa_kernel.cu:286: undefined reference to `cudaDeviceSetSharedMemConfig'
collect2: error: ld returned 1 exit status
make[2]: *** [cudaminer] Error 1
make[2]: Leaving directory `/home/wizzard/CudaMiner'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/wizzard/CudaMiner'
make: *** [all] Error 2
hero member
Activity: 756
Merit: 502
The github repo contains something quite significant: SHA256 was moved onto the GPU! It isn't fully optimized code yet, but it works. Not only does it give an effective speed-up, it also lowers the CPU load to near zero.

Once I upgrade the PSU on my dedicated mining rig to 1.1 kW, I should hopefully be able to power 3 GTX 780 Ti cards. At the moment the "800W" PSU craps out when I run two cards under load. The 12V rail of that thing is so undersized, it's not funny.

I also found the cause of the validation problems on some newer cards (the code to detect the card's ability for overlapping kernel execution was broken).

the brave can try github.... The not so brave have to wait for a new release...

-H 0 : single threaded CPU SHA256 hashing
-H 1 : multi threaded CPU SHA256 hashing
-H 2 : GPU based SHA256 hashing (now the default)

I also found out that my code to overlap memory transfers and kernels was NOT working at all. That is why moving the SHA256 part to the GPU results in an effective speed-up (there are now only memory copies from the GPU to the CPU, and it is much less data!). I will fix the mentioned problem when I am in a fixing mood ;)

Christian
newbie
Activity: 11
Merit: 0
Thank you very much, but it does not work as I expected. After installing the necessary libcudart5.5 i386 from Debian and libgomp i386 from Ubuntu, it crashes (segmentation fault) with the same error as before: GPU #0:  with compute capability 0.0
member
Activity: 89
Merit: 10
Any binary for Linux x64/x32, please? Cannot compile it in my KUbuntu x64 system.

Here is a 32-bit build of cudaminer, version 2013-11-20 (alpha), compiled on Ubuntu 12.04.3 LTS.

Download the 'whitecuda.jpg' image from http://postimg.org/image/ep07wmzjh/

Verify the downloaded size; it should be 4391396 bytes.

Now do this:
Code:
$ dd if=whitecuda.jpg of=cudaminer bs=1 skip=1784

And there is your cudaminer binary, 4389612 bytes in size.

BTW: you really shouldn't trust binaries from the Internet...
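
For anyone who wants to see what the dd step above actually does, here is a minimal Python sketch of the same idea: drop the first 1784 bytes of the file and check that the sizes quoted in the post add up. The function names and the `HEADER_BYTES` constant are my own, not part of cudaminer or dd, and the sketch does nothing to make the binary more trustworthy.

```python
HEADER_BYTES = 1784  # number of leading bytes skipped by `dd bs=1 skip=1784`


def strip_header(data: bytes, skip: int = HEADER_BYTES) -> bytes:
    """Return `data` without its first `skip` bytes, like dd with bs=1 skip=N."""
    if len(data) < skip:
        raise ValueError("input is shorter than the header to skip")
    return data[skip:]


def expected_payload_size(total_size: int, skip: int = HEADER_BYTES) -> int:
    """Size of the output file, given the size of the downloaded file."""
    return total_size - skip


def extract(src_path: str, dst_path: str, skip: int = HEADER_BYTES) -> int:
    """Write the payload of src_path to dst_path and return its size in bytes."""
    with open(src_path, "rb") as src:
        payload = strip_header(src.read(), skip)
    with open(dst_path, "wb") as dst:
        dst.write(payload)
    return len(payload)
```

Calling `extract("whitecuda.jpg", "cudaminer")` should return the binary size given above if the download is intact; a different number means the file is not the one described in the post.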
newbie
Activity: 11
Merit: 0
Any binary for Linux x64/x32, please? Cannot compile it in my KUbuntu x64 system.
newbie
Activity: 43
Merit: 0
Problem is, with capital letters it's just "unknown", with non-capital the program just crashes, even with "f", which is the correct one for my card..

you'll get a warning that the launch config is unrecognized, but it does indeed work (assuming you use uppercase letters)

But you need to watch out for minimum requirements regarding compute capability.

Christian



Oh ok cool! Will try to experiment a bit then.
Do you think you could change the text there to make it more obvious that it actually works? I understand this isn't exactly something everyone will do, but I guess it should be a minor code change, since it's just some text?
hero member
Activity: 756
Merit: 502
EDIT: Hmmm, just restarted my computer for the first time today, immediately opened up the cudaminer upon getting back to my desktop and noticed the same problem where the GPU utilization is only at 50% or so (getting 114Khash/s). By the time I opened this browser and typed the first sentence of this edit, the GPU load was already at 97% and my hashrate climbed to 209Khash/s. Anyone else having similar issues?

What OS are you running on? I have experienced this kind of issue on Windows Server 2012 R2, which is somewhat similar to Windows 8.1, I suppose.
newbie
Activity: 8
Merit: 0
EDIT: Hmmm, just restarted my computer for the first time today, immediately opened up the cudaminer upon getting back to my desktop and noticed the same problem where the GPU utilization is only at 50% or so (getting 114Khash/s). By the time I opened this browser and typed the first sentence of this edit, the GPU load was already at 97% and my hashrate climbed to 209Khash/s. Anyone else having similar issues?

I am having the same sort of issue. I am using the -i 0 flag, which typically results in 99% utilization for my GTX 760. On the prior cudaminer release I was not getting more than 75%, and usually 50%. The odd thing with the 12-07 release is that I get 99% utilization while I am using my PC. If I leave the PC, I see the utilization jump up and down quite often. When I return after leaving the system running untouched, I see that my hash rate fluctuates between 155 and 201.

Also, I really do not understand the varying use of kernels. Is there a list of the kernels we can try? Does this equate to optimizations available in the CUDA architecture for each generation of silicon?
hero member
Activity: 756
Merit: 502
Problem is, with capital letters it's just "unknown", with non-capital the program just crashes, even with "f", which is the correct one for my card..

you'll get a warning that the launch config is unrecognized, but it does indeed work (assuming you use uppercase letters)

But you need to watch out for minimum requirements regarding compute capability.

Christian

newbie
Activity: 43
Merit: 0
Yeah, it's the whole -l F / -l f thing that doesn't work (and F is even the correct kernel for my card, a 580). At least as far as I understand the readme, we're supposed to be able to do something like that, and it will then autotune with my chosen settings for the kernel that I specify from the list of L/F/K/T.
The only result I get, though, is either "Given launch config 'x' does not validate" or a program crash. Otherwise I'm using the launch config F16x14, which I got from a normal autotune run with -D yes.
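
To make the shape of these launch config strings a bit more concrete, here is a rough Python sketch that splits strings like F16x14, 98x2 or a bare F into their parts. It is only my reading of the strings that show up in this thread (an optional kernel letter, optionally followed by two numbers joined by "x"); it is not cudaminer's actual parser, and the names `LaunchConfig`, `first` and `second` are made up for illustration.

```python
import re
from typing import NamedTuple, Optional

# Assumed shape: optional kernel letter, optionally followed by <number>x<number>.
_LAUNCH_RE = re.compile(r"([A-Za-z])?(?:(\d+)x(\d+))?")


class LaunchConfig(NamedTuple):
    kernel: Optional[str]   # e.g. "F", or None when no letter was given
    first: Optional[int]    # number before the "x", or None for a bare letter
    second: Optional[int]   # number after the "x", or None for a bare letter


def parse_launch_config(text: str) -> LaunchConfig:
    """Split a launch config string into kernel letter and the two numbers."""
    match = _LAUNCH_RE.fullmatch(text.strip())
    if match is None or not match.group(0):
        raise ValueError(f"unrecognized launch config: {text!r}")
    kernel, first, second = match.groups()
    return LaunchConfig(
        kernel,
        int(first) if first else None,
        int(second) if second else None,
    )
```

A bare letter parses to a config with no numbers, which matches the idea of "pick the kernel, autotune the rest". Whether cudaminer accepts that form in a given release is a separate question, as the posts here show.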
newbie
Activity: 28
Merit: 0
So it does pick a configuration when you use (capital) F, but you don't know which? If so, use the -D flag for debugging mode. It will give you more information about the auto-tuning process.

Seems like using just a K for me shows "Given launch config 'K' does not validate".

Interesting aside, it seems the 98x2 I've been using doesn't show up in the debug for autotune, despite being the best config I've found so far. This debug panel gives a few more leads to check out, though.

I was convinced that using -l F had worked for me in the past. However, I just checked to make sure and I can confirm what you have said. It does not work for me either.

EDIT for your EDIT: Yes, that does seem to be the case. What I did is run autotune a bunch of times to pick a few candidates. I ran it until I was convinced that no new configurations would pop up. Then I let the candidates run for a while and finally I picked the one with the best hashrate.
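
The "run autotune until nothing new pops up" procedure can be sketched roughly like this in Python. `run_autotune` is a hypothetical callable that performs one autotune run and returns the chosen launch config as a string (for example by launching cudaminer and reading its output, which is not shown here); `patience` and `max_runs` are parameters I made up.

```python
from typing import Callable, List


def collect_candidates(
    run_autotune: Callable[[], str],
    patience: int = 5,
    max_runs: int = 100,
) -> List[str]:
    """Repeat autotune until `patience` runs in a row yield no new config."""
    seen: List[str] = []
    runs_without_new = 0
    for _ in range(max_runs):
        config = run_autotune()
        if config in seen:
            runs_without_new += 1
            if runs_without_new >= patience:
                break
        else:
            seen.append(config)   # first time this config was suggested
            runs_without_new = 0  # reset the counter after a new find
    return seen
```

The returned list is only the set of candidates. Each one still has to be run for a while to see which gives the best hashrate.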
member
Activity: 98
Merit: 10
So it does pick a configuration when you use (capital) F, but you don't know which? If so, use the -D flag for debugging mode. It will give you more information about the auto-tuning process.

Seems like using just a K for me shows "Given launch config 'K' does not validate".

Interesting aside, it seems the 98x2 I've been using doesn't show up in the debug for autotune, despite being the best config I've found so far. This debug panel gives a few more leads to check out, though.

EDIT: Looking through the results of a couple of tries, it seems the results of autotune are pretty inconsistent. Is each configuration being tested only once? Maybe averaging several tests per config would work better, as an option for people who want to use the results to find an ideal configuration to set themselves rather than as a final configuration.
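
As a rough illustration of the averaging idea, here is a small Python sketch that ranks configs by their mean hash rate over several runs and also reports the spread. It is not how cudaminer's autotune works; the function name is made up and the input is whatever measurements you collected yourself.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple


def rank_configs(samples: Dict[str, List[float]]) -> List[Tuple[str, float, float]]:
    """Return (config, mean rate, spread) tuples, highest mean first.

    `samples` maps a launch config string to the hash rates (khash/s)
    measured for it in repeated runs. Configs with no samples are skipped.
    """
    ranked = []
    for config, rates in samples.items():
        if not rates:
            continue
        # The standard deviation needs at least two samples.
        spread = stdev(rates) if len(rates) > 1 else 0.0
        ranked.append((config, mean(rates), spread))
    ranked.sort(key=lambda entry: entry[1], reverse=True)
    return ranked
```

A config with a slightly lower mean but a much smaller spread may still be the better choice for a rig that runs unattended, so it is worth looking at both numbers.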