Topic: [ANN] cudaMiner & ccMiner CUDA based mining applications [Windows/Linux/MacOSX] - page 1122. (Read 3427002 times)

sr. member
Activity: 252
Merit: 254
Is there any way to limit the temperature the cards reach? The first version had my GTX 580s sitting at around 80 degrees, but they're now rising way past that.

Go to techpowerup.com and grab nvidiainspector. It will let you adjust the fan speeds on the cards manually (I just set mine at 100). It will also let you set clock speeds and voltages through the command line (I made a batch file to do all of the above on my 3 NVIDIA cards).

newbie
Activity: 47
Merit: 0
Is there any way to limit the temperature the cards reach? The first version had my GTX 580s sitting at around 80 degrees, but they're now rising way past that.
hero member
Activity: 756
Merit: 502
I removed it, and installed libcurl4-openssl-dev:i386, which changed a huge number of things to 32-bit. Your binary now fails on a different shared library: libcudart.so.5.0.

Help on using gchill's binary, or on compiling cudaMiner under Ubuntu will be greatly appreciated!

There's not much missing now

libcudart.so.5.0 is on your system, but you need to set LD_LIBRARY_PATH so that the 32-bit version of it is found first.

Try /usr/lib32/nvidia-current (if you installed the driver using apt-get).

Otherwise, look for the file using find / -name libcudart.so.5.0

If you installed the CUDA toolkit 5.0 (which I presume you have), then you might also find this library somewhere under /usr/local/cuda-5.0
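
A minimal sketch of that search, assuming Python is available on the box. It walks the two directories named above, keeps only copies of libcudart.so.5.0 whose ELF header says 32-bit, and prints an export line you could paste into your shell. The function names are made up for illustration; nothing here is part of cudaMiner.

```python
import os

def elf_class(path):
    """Return 32 or 64 for an ELF file, or None if it is not a readable ELF file."""
    try:
        with open(path, "rb") as f:
            header = f.read(5)
    except OSError:
        return None  # e.g. a dangling symlink or a permission problem
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return None
    # Byte 4 of the ELF header (EI_CLASS): 1 = 32-bit, 2 = 64-bit.
    return {1: 32, 2: 64}.get(header[4])

def find_library_dirs(name, roots, want_bits=32):
    """Walk the given roots and return directories holding a matching library."""
    hits = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            if name in filenames:
                if elf_class(os.path.join(dirpath, name)) == want_bits:
                    hits.append(dirpath)
    return hits

if __name__ == "__main__":
    # Search roots taken from the post above; adjust to your install.
    roots = ["/usr/lib32/nvidia-current", "/usr/local/cuda-5.0"]
    dirs = find_library_dirs("libcudart.so.5.0", roots)
    if dirs:
        print("export LD_LIBRARY_PATH=" + dirs[0] + ":$LD_LIBRARY_PATH")
    else:
        print("no 32-bit libcudart.so.5.0 found under", roots)
```

Checking the ELF class matters because a 64-bit libcudart.so.5.0 earlier on the path will not satisfy a 32-bit binary.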
hero member
Activity: 756
Merit: 502
cudaminer -m 1,1 -i 1,1 -l auto,auto -o http://127.0.0.1:8332 -O username:password

Let's start with single memory allocation mode (SLI'ed cards don't like my chunked memory allocation scheme at all), interactive mode so the autotune finishes faster, and -l auto,auto can later be replaced with known good kernel launch configurations.

If that stuff still won't work, try running on a single device at first using -d 0

Christian
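
For rigs with more than two cards, the comma-separated per-device values in that command line can be generated rather than typed. This is only a sketch: it uses the flags quoted in the post (-m, -i, -l, -o, -O, -d) and nothing else, the helper name is hypothetical, and combining -d with single-valued options for the one-card test is an assumption on my part.

```python
def build_cudaminer_args(url, userpass, num_devices=2, launch="auto", device=None):
    """Build an argument list for cudaminer with one value per device.

    If `device` is given, only that device is used (the -d fallback) and
    every per-device option gets a single value.
    """
    count = 1 if device is not None else num_devices

    def per_device(value):
        # "1" repeated for two devices becomes "1,1"
        return ",".join([str(value)] * count)

    args = ["cudaminer"]
    if device is not None:
        args += ["-d", str(device)]
    args += [
        "-m", per_device(1),       # single memory allocation mode
        "-i", per_device(1),       # interactive mode
        "-l", per_device(launch),  # kernel launch configuration or "auto"
        "-o", url,
        "-O", userpass,
    ]
    return args

if __name__ == "__main__":
    print(" ".join(build_cudaminer_args("http://127.0.0.1:8332", "username:password")))
    print(" ".join(build_cudaminer_args("http://127.0.0.1:8332", "username:password", device=0)))
```

Once autotune has reported a known good configuration, pass it as `launch` in place of "auto".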
full member
Activity: 196
Merit: 100
Christian,

I know you are busy.  I have a dedicated mining LAN full of AMD cards that I use for hashing.

However, I think this is a great idea you are working on - in fact my wife's sister is a CUDA developer doing GPGPU work on human brain modeling for a major university and I have been bugging her about writing a CUDA optimized mining client for my TITANS to see what they could really do.  She has so far declined.

....and then I fortuitously stumbled upon your thread.

It seems like great work so far  - however I have had trouble getting my two SLIed cards to actually mine.

The program will now start, connect through stratum mining proxy to the pool, optimize, then sit there and do no visible work.

If you had two TITANS in SLI, what version, and what exact command line, with options, would you input, to get this to work on TITAN?

For instance - cudaminer -a scrypt -o http://127.0.0.1:8332 -O username:password etc etc?
newbie
Activity: 33
Merit: 0
Trust me, you want to compile this for 32 bit.


Why can't it be 64-bit? I can only chroot to gentoo-prefix (which does not support multilib) to compile cudaminer.

A 32-bit compile on a 64-bit host is possible but very tricky thanks mainly to libcurl.  Here's a link to a 32-bit Linux binary from the April 14 code.  Give it a try.  Hopefully you won't run into shared library issues...
https://www.dropbox.com/s/twjdug4l0z4rnz0/cudaminer_414.gz

I was unable to compile cudaMiner myself (Ubuntu 12.04), so I tried your build. It fails with a shared library error on libcurl.

I installed libcurl4-gnutls-dev, which had no effect.

I removed it, and installed libcurl4-openssl-dev:i386, which changed a huge number of things to 32-bit. Your binary now fails on a different shared library: libcudart.so.5.0.

Help on using gchill's binary, or on compiling cudaMiner under Ubuntu will be greatly appreciated!
hero member
Activity: 756
Merit: 502
Is there a reason CPU usage is at higher levels when using interactive mode as opposed to not using it? I'm at about 25-30% usage in interactive mode and under 12% when not.

I believe that something weird is happening INSIDE the CUDA APIs, possibly due to some required synchronization between kernel launches. To prove that I will have to do application profiling. And then I would have to find workarounds.

With the latest release I find that the mining speeds in interactive mode have greatly improved for me. I did try some changes in the order I issue CUDA API calls. That seems to have helped a lot.
hero member
Activity: 756
Merit: 502
because of the crash I can't get the cock speed to return to normal levels and it runs at 512Mhz

Issues with cock speed, hear hear...
newbie
Activity: 27
Merit: 0
I created a google doc to track card hash performance with cudaMiner, might be easier than sifting through the thread.

https://docs.google.com/spreadsheet/lv?key=0AjMqJzI7_dCvdG9fZFN1Vjd0WkFOZmtlejltd0JXbmc


Thanks. Added my GTX 480 & GTS 450 to the mix.
newbie
Activity: 8
Merit: 0
I created a google doc to track card hash performance with cudaMiner, might be easier than sifting through the thread.

https://docs.google.com/spreadsheet/lv?key=0AjMqJzI7_dCvdG9fZFN1Vjd0WkFOZmtlejltd0JXbmc
hero member
Activity: 756
Merit: 502
Following the computation on total memory used, what is the reason that not even half of the memory available on the Titan is being used? Up until the last release today (4/14) autotune has always picked a configuration of 300-307x2 (now 263x2) which comes out to about 2.5GB of used memory at most. Since scrypt is a space/computation tradeoff algorithm isn't there potential for almost a doubling in hash performance if the full 6GB could be used?

At the moment there is no trade-off. I use the full scratchpad size per thread. So required memory size scales with the number of threads.
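
A rough way to check the 2.5GB figure from the question. Two assumptions that are not stated in the thread: the launch configuration BxW means B blocks of W warps (32 threads per warp), and the scrypt parameters are Litecoin's (N=1024, r=1), which makes each thread's scratchpad 128*r*N bytes = 128 KiB.

```python
# Assumed scrypt parameters (Litecoin): N=1024, r=1 -> 128 * r * N bytes per thread.
SCRATCHPAD_BYTES = 128 * 1 * 1024
THREADS_PER_WARP = 32  # fixed warp size on NVIDIA GPUs

def scratchpad_memory_bytes(blocks, warps_per_block):
    """Total scratchpad memory if every thread holds a full scratchpad."""
    threads = blocks * warps_per_block * THREADS_PER_WARP
    return threads * SCRATCHPAD_BYTES

if __name__ == "__main__":
    for blocks, warps in [(307, 2), (263, 2)]:
        total = scratchpad_memory_bytes(blocks, warps)
        print(f"{blocks}x{warps}: {total / 1e9:.2f} GB")
    # 307x2: 2.58 GB
    # 263x2: 2.21 GB
```

Under those assumptions the numbers line up with the "about 2.5GB at most" in the question, and they show why memory use follows the thread count rather than the card's capacity.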

A trade-off happens when you try to reduce the scratchpad size at the cost of increased computation. Trying this is on my TODO list.

Christian
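
To illustrate the kind of trade-off described above, here is a toy sketch, not cudaMiner code. It stores only every gap-th scratchpad entry and recomputes the missing ones on demand. SHA-256 stands in for scrypt's real mixing function purely to keep the example short, and all names are hypothetical.

```python
import hashlib

def mix(block):
    """Stand-in mixing step (SHA-256 here; real scrypt uses a Salsa20/8 based mix)."""
    return hashlib.sha256(block).digest()

def build_full_table(seed, n):
    """Full scratchpad: entry i is the seed mixed i times."""
    table = []
    x = seed
    for _ in range(n):
        table.append(x)
        x = mix(x)
    return table

def build_sparse_table(seed, n, gap):
    """Keep only every gap-th entry, cutting memory by roughly a factor of gap."""
    table = []
    x = seed
    for i in range(n):
        if i % gap == 0:
            table.append(x)
        x = mix(x)
    return table

def lookup_sparse(table, index, gap):
    """Recompute entry `index` from the nearest stored entry at or below it."""
    x = table[index // gap]
    for _ in range(index % gap):  # extra computation paid for the saved memory
        x = mix(x)
    return x

if __name__ == "__main__":
    full = build_full_table(b"seed", 1024)
    sparse = build_sparse_table(b"seed", 1024, 4)
    print(len(full), len(sparse))
    print(all(lookup_sparse(sparse, i, 4) == full[i] for i in range(1024)))
```

With a gap of 4 the table is a quarter of the size, and each lookup costs up to 3 extra mixing steps. Whether that nets out faster on a given card is exactly the open question.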
hero member
Activity: 756
Merit: 502
But if you ctrl-c during the auto-tuning, it waits for the auto tuning to finish before shutting down.

The question is, how much further can you take this? ;)

I'll add a check for the abort flag into the autotuning loop.

I haven't explored using inline assembly code yet. We may see yet another speed boost from that.

Christian
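
The abort-flag check mentioned above might look roughly like this. It is a sketch in Python rather than the miner's actual C/CUDA code, with a hypothetical autotune function and made-up benchmark numbers: a Ctrl-C handler sets a flag, and the tuning loop tests it before each candidate instead of running to completion.

```python
import signal
import threading

def autotune(candidates, benchmark, abort):
    """Try each launch configuration, stopping early once abort is set.

    Returns (config, rate) for the best configuration measured so far,
    or None if nothing was measured.
    """
    best = None
    for config in candidates:
        if abort.is_set():  # the check the original loop was missing
            break
        rate = benchmark(config)
        if best is None or rate > best[1]:
            best = (config, rate)
    return best

def main():
    abort = threading.Event()
    # Ctrl-C only sets the flag; the loop decides when to stop.
    signal.signal(signal.SIGINT, lambda signum, frame: abort.set())
    # Hypothetical khash/s figures, for illustration only.
    fake_rates = {"32x4": 100.0, "64x2": 130.0, "16x8": 140.0}
    print(autotune(list(fake_rates), fake_rates.get, abort))

if __name__ == "__main__":
    main()
```

Checking once per candidate means shutdown waits for at most one benchmark run, not the whole sweep.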
hero member
Activity: 756
Merit: 502
Is there some reason the -C 1 flag should not be used, then, when it seems to work and gives a speed boost of several kh/s?

For me the results don't validate with -C 1, and my speed drops by a factor of 2-3.

Christian
newbie
Activity: 19
Merit: 0
With a GTX 660 Ti I now get 155 kh/s, using the flags -l 70x2 -C 1 -m 1 -i 1

These settings give the highest hashrate, determined via the --benchmark flag.

Alongside that, the CPU (i7 3770) runs minerd with 6 threads (the best count in combination with the GPU) at 50 kh/s.

The GTX 660 Ti is running at 1058 MHz core, 6008 MHz memory, 56°C core temperature, 125W power usage (triple display).

It hasn't crashed yet.

Is there some reason the -C 1 flag should not be used, then, when it seems to work and gives a speed boost of several kh/s?

No positive results found.

Thx for your work on this.



newbie
Activity: 20
Merit: 0
Is this normal?

http://puu.sh/2wVmH

If so, that's my 560 Ti

Did you get this figured out? I'm seeing the same issue with p2pool
newbie
Activity: 22
Merit: 0
560Ti 900 core clock:
2013-04-09: [2013-04-11 03:21:01] GPU #0:  107.78 khash/s with configuration 32x4
2013-04-10: [2013-04-11 03:32:00] GPU #0:  138.84 khash/s with configuration  64x2

Sweeet!

2013-04-14: [2013-04-15 01:45:32] GPU #0:  146.28 khash/s with configuration  16x8
And no crash on ctrl-c, awesome!
But if you ctrl-c during the auto-tuning, it waits for the auto tuning to finish before shutting down.

The question is, how much further can you take this? ;)


Google Doc page to track GPU outputs.

credit goes to Cairpre
Where?
hero member
Activity: 1330
Merit: 502
GT520 went from 18.4 to 19 khash/s on the latest version...
newbie
Activity: 20
Merit: 0
Is there a reason CPU usage is at higher levels when using interactive mode as opposed to not using it? I'm at about 25-30% usage in interactive mode and under 12% when not.
newbie
Activity: 12
Merit: 0
Following the computation on total memory used, what is the reason that not even half of the memory available on the Titan is being used? Up until the last release today (4/14) autotune has always picked a configuration of 300-307x2 (now 263x2) which comes out to about 2.5GB of used memory at most. Since scrypt is a space/computation tradeoff algorithm isn't there potential for almost a doubling in hash performance if the full 6GB could be used?

Btw, the latest work has been fantastic. My card hashes at a much more consistent rate and about 30khash/s more than previous builds. Fixing the Ctrl-C is just a nice usability improvement.