
Topic: [ANN] cudaMiner & ccMiner CUDA based mining applications [Windows/Linux/MacOSX] - page 1131. (Read 3426947 times)

sr. member
Activity: 378
Merit: 255
I tried this out on Amazon's EC2 and saw a marked improvement over cgminer with scrypt. The auto-tune found 223x2, which I presume is aiming for half of the CUDA cores (224x2).

https://bitcointalksearch.org/topic/mining-on-amazon-ec2-scrypt-or-btc-169377
sr. member
Activity: 840
Merit: 251
So I was thinking cool, I might be able to get another 50 kh/s from that workstation that's been sitting in my office for the last few months. WRONG! I'm averaging almost 400 kh/s right now from a Quadro 600... If this keeps up and doesn't burn my office down by the morning (I'm not physically at the location at the moment, so I'm hoping nothing is smoking right now), expect a decent donation to this project soon.
omo
full member
Activity: 147
Merit: 100
After upgrading to the latest driver, I got my Quadro FX 2800M running:
Code:
.......
.......
[2013-04-10 12:28:56] GPU #0:   14.12 khash/s with configuration S10x3
[2013-04-10 12:28:56] GPU #0: using launch configuration S10x3
[2013-04-10 12:28:56] GPU #0: Quadro FX 2800M, 960 hashes, 0.00 khash/s
[2013-04-10 12:28:56] GPU #0: Quadro FX 2800M, 960 hashes, 6.91 khash/s
[2013-04-10 12:28:56] LONGPOLL detected new block
[2013-04-10 12:28:56] DEBUG: got new work
[2013-04-10 12:28:56] GPU #0: Quadro FX 2800M, 7680 hashes, 12.41 khash/s
[2013-04-10 12:29:06] DEBUG: hash <= target
Hash:   00005f1b32735cc067794562b715b6dc1b42133abb3416db26e127ca7e8d59de

and GTX 680:
Code:
[2013-04-10 14:06:19] GPU #0:  143.56 khash/s with configuration  166x2
[2013-04-10 14:06:19] GPU #0: using launch configuration  166x2
[2013-04-10 14:06:20] GPU #0: GeForce GTX 680, 10624 hashes, 0.06 khash/s
[2013-04-10 14:06:20] DEBUG: got new work in 647 ms
[2013-04-10 14:06:20] GPU #0: GeForce GTX 680, 10624 hashes, 69.53 khash/s
[2013-04-10 14:06:23] DEBUG: hash <= target
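A side note on reading these logs: in both of them, the first hash count after the launch configuration equals blocks x warps x 32 (S10x3 gives 960, 166x2 gives 10624). The sketch below checks that relation on lines copied from the logs above. It assumes the configuration string means "blocks x warps per block" with an optional kernel prefix letter, which the logs suggest but do not state; later lines can report several launches at once (7680 = 8 x 960), so only the first hash line after each configuration is compared.

```python
import re

# Lines copied from the two logs above.
LOG = """\
[2013-04-10 12:28:56] GPU #0: using launch configuration S10x3
[2013-04-10 12:28:56] GPU #0: Quadro FX 2800M, 960 hashes, 0.00 khash/s
[2013-04-10 14:06:19] GPU #0: using launch configuration  166x2
[2013-04-10 14:06:20] GPU #0: GeForce GTX 680, 10624 hashes, 0.06 khash/s
"""

# Assumed format: optional prefix letter, then <blocks>x<warps per block>.
CONFIG_RE = re.compile(r"using launch configuration\s+([A-Z]?)(\d+)x(\d+)")
HASHES_RE = re.compile(r"(\d+) hashes, ([\d.]+) khash/s")
THREADS_PER_WARP = 32  # CUDA warp size


def check_log(text):
    """Pair each launch configuration with the first hash count after it."""
    results = []
    config = None
    for line in text.splitlines():
        m = CONFIG_RE.search(line)
        if m:
            config = (int(m.group(2)), int(m.group(3)))
            continue
        m = HASHES_RE.search(line)
        if m and config is not None:
            blocks, warps = config
            expected = blocks * warps * THREADS_PER_WARP
            results.append((blocks, warps, int(m.group(1)), expected))
            config = None  # ignore later lines, which may cover several launches
    return results


for blocks, warps, reported, expected in check_log(LOG):
    print(f"{blocks}x{warps}: reported {reported}, blocks*warps*32 = {expected}")
```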
omo
full member
Activity: 147
Merit: 100
I saw your edit on your previous post. Is your computer in another language? Your card name isn't even showing, so maybe it's a language issue?

Are your drivers up to date?



Yeah, my Win7 is the Chinese version.
I shall check the driver.
Thank you.
sr. member
Activity: 247
Merit: 250
I saw your edit on your previous post. Is your computer in another language? Your card name isn't even showing, so maybe it's a language issue?

Are your drivers up to date?

omo
full member
Activity: 147
Merit: 100
I tried the latest version on a Quadro FX 2800M card; it crashed after printing the following messages:
Code:
E:\cudaminer-2013-04-09>cudaminer.exe  --url http://litecoinpool.org:9332/ --user user --pass x --thread 1
           *** CudaMiner for nVidia GPUs by Christian Buchner ***
                     This is version 2013-04-09 (alpha)
        based on pooler-cpuminer 2.2.3 (c) 2010 Jeff Garzik, 2012 pooler
               Cuda additions Copyright 2013 Christian Buchner
           My donation address: LKS1WDKGED647msBQfLBHV3Ls8sveGncnm

[2013-04-10 10:43:36] 1 miner threads started, using 'scrypt' algorithm.
[2013-04-10 10:43:37] Long-polling activated for http://litecoinpool.org:9332/LP
[2013-04-10 10:43:37] GPU #0:  with compute capability 0.47446600
[2013-04-10 10:43:37] GPU #0: Performing auto-tuning (Patience...)
[2013-04-10 10:43:37] GPU #0:    0.00 khash/s with configuration  0x0
[2013-04-10 10:43:37] GPU #0: using launch configuration  0x0


What happens if you don't specify the thread option? Same thing? You can also use -D to see what autotuning does.

Thank you for your reply.
If I don't specify the thread option, it doesn't get past the donation line.
I specified -D and got slightly different messages, but it still crashed:
Code:
E:\cudaminer-2013-04-09>cudaminer.exe  --url http://litecoinpool.org:9332/ --user user --pass x --thread 1 -D

           *** CudaMiner for nVidia GPUs by Christian Buchner ***
                     This is version 2013-04-09 (alpha)
        based on pooler-cpuminer 2.2.3 (c) 2010 Jeff Garzik, 2012 pooler
               Cuda additions Copyright 2013 Christian Buchner
           My donation address: LKS1WDKGED647msBQfLBHV3Ls8sveGncnm

[2013-04-10 11:07:21] 1 miner threads started, using 'scrypt' algorithm.
[2013-04-10 11:07:22] Long-polling activated for http://litecoinpool.org:9332/LP
[2013-04-10 11:07:22] DEBUG: got new work in 1019 ms
[2013-04-10 11:07:22] GPU #0:  with compute capability 532.0
[2013-04-10 11:07:22] GPU #0: Performing auto-tuning (Patience...)
[2013-04-10 11:07:22] GPU #0:    0.00 khash/s with configuration  0x0
[2013-04-10 11:07:22] GPU #0: using launch configuration  0x0
sr. member
Activity: 247
Merit: 250
I tried the latest version on a Quadro FX 2800M card; it crashed after printing the following messages:
Code:
E:\cudaminer-2013-04-09>cudaminer.exe  --url http://litecoinpool.org:9332/ --user user --pass x --thread 1
           *** CudaMiner for nVidia GPUs by Christian Buchner ***
                     This is version 2013-04-09 (alpha)
        based on pooler-cpuminer 2.2.3 (c) 2010 Jeff Garzik, 2012 pooler
               Cuda additions Copyright 2013 Christian Buchner
           My donation address: LKS1WDKGED647msBQfLBHV3Ls8sveGncnm

[2013-04-10 10:43:36] 1 miner threads started, using 'scrypt' algorithm.
[2013-04-10 10:43:37] Long-polling activated for http://litecoinpool.org:9332/LP
[2013-04-10 10:43:37] GPU #0:  with compute capability 0.47446600
[2013-04-10 10:43:37] GPU #0: Performing auto-tuning (Patience...)
[2013-04-10 10:43:37] GPU #0:    0.00 khash/s with configuration  0x0
[2013-04-10 10:43:37] GPU #0: using launch configuration  0x0


What happens if you don't specify the thread option? Same thing? You can also use -D to see what autotuning does.


Never mind. It isn't even seeing your cards properly. Language issue maybe? I see Japanese characters in one of the printouts.
omo
full member
Activity: 147
Merit: 100
I tried the latest version on a Quadro FX 2800M card; it crashed after printing the following messages:
Code:
E:\cudaminer-2013-04-09>cudaminer.exe  --url http://litecoinpool.org:9332/ --user user --pass x --thread 1
           *** CudaMiner for nVidia GPUs by Christian Buchner ***
                     This is version 2013-04-09 (alpha)
        based on pooler-cpuminer 2.2.3 (c) 2010 Jeff Garzik, 2012 pooler
               Cuda additions Copyright 2013 Christian Buchner
           My donation address: LKS1WDKGED647msBQfLBHV3Ls8sveGncnm

[2013-04-10 10:43:36] 1 miner threads started, using 'scrypt' algorithm.
[2013-04-10 10:43:37] Long-polling activated for http://litecoinpool.org:9332/LP
[2013-04-10 10:43:37] GPU #0:  with compute capability 0.47446600
[2013-04-10 10:43:37] GPU #0: Performing auto-tuning (Patience...)
[2013-04-10 10:43:37] GPU #0:    0.00 khash/s with configuration  0x0
[2013-04-10 10:43:37] GPU #0: using launch configuration  0x0

Edit:
I got similar messages on a GTX 680 box (a desktop accessed remotely). All machines run Win7 64-bit:
Code:
[2013-04-10 11:01:22] 1 miner threads started, using 'scrypt' algorithm.
[2013-04-10 11:01:23] Long-polling activated for http://litecoinpool.org:9332/LP

[2013-04-10 11:01:23] GPU #0: ,跸?羨 with compute capability 10002000.10002000
[2013-04-10 11:01:23] GPU #0: Performing auto-tuning (Patience...)
[2013-04-10 11:01:23] GPU #0:    0.00 khash/s with configuration  0x0
[2013-04-10 11:01:23] GPU #0: using launch configuration  0x0
newbie
Activity: 59
Merit: 0
Not nearly as impressive as I thought it would be. Is ECC turned off?
Performance is not scaling up with core count as much as I would have hoped for.
Nice to know the code does not barf on 64bit and that shares are accepted.

Yes, ECC is disabled.  The older version only got ~110 khash/sec, so the Titan version is nearly 70% faster on the K20.
sr. member
Activity: 247
Merit: 250
Thank god! I can finally post! No longer have to email Christian directly :)


\o/

GTX 570 (GPU at 822 MHz): 165 kH/s
650M: 29 kH/s

On the list: GTX 9800 to be tested.

The pool, though, is reporting that I am mining at 190 kH/s and 41 kH/s.

hero member
Activity: 914
Merit: 500
Hello,

First of all, good work Christian. Applause!

I have a GTX 295 (2 GPUs). The first version of cudaminer performs a bit better than the latest one (09/04): between 1 and 2 khash/s better.
Without overclocking I get ±36 khash/s on each GPU.


Tested for Bitcoin mining (not with cudaminer), and there it went to ±52 Mhash/s on each GPU.
Should I be able to get more or less, Litecoin-wise?


What do I have to overclock/underclock to be efficient?

Bitcoin uses SHA-256, whereas Litecoin uses scrypt for proof of work. Scrypt is much more memory-intensive, so it's a world of difference.
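To make the difference concrete, here is a minimal sketch using Python's standard `hashlib`. The all-zero 80-byte header is a placeholder rather than real work, and the snippet only illustrates the two hash functions; it is not how cudaMiner or any GPU miner is implemented. The scrypt parameters (N=1024, r=1, p=1, with the header used as both password and salt) are the ones Litecoin uses, and they imply a scratchpad of 128 * r * N bytes per hash, whereas SHA-256 needs only a few hundred bytes of state.

```python
import hashlib

# Placeholder 80-byte block header (all zeros), not real work.
header = bytes(80)

# Bitcoin-style proof of work: SHA-256 applied twice to the header.
sha_pow = hashlib.sha256(hashlib.sha256(header).digest()).digest()

# Litecoin-style proof of work: scrypt with N=1024, r=1, p=1,
# using the header as both password and salt, 32-byte output.
N, R, P = 1024, 1, 1
scrypt_pow = hashlib.scrypt(header, salt=header, n=N, r=R, p=P, dklen=32)

# scrypt's main working buffer is 128 * r * N bytes per hash.
scratchpad_bytes = 128 * R * N

print("SHA-256d digest bytes:", len(sha_pow))
print("scrypt digest bytes:  ", len(scrypt_pow))
print("scrypt scratchpad:    ", scratchpad_bytes // 1024, "KiB per hash")
```

Every hash a GPU works on concurrently needs its own scratchpad, which is why memory size and bandwidth matter so much more for scrypt than for SHA-256.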
newbie
Activity: 11
Merit: 0
Hello,

First of all, good work Christian. Applause!

I have a GTX 295 (2 GPUs). The first version of cudaminer performs a bit better than the latest one (09/04): between 1 and 2 khash/s better.
Without overclocking I get ±36 khash/s on each GPU.


Tested for Bitcoin mining (not with cudaminer), and there it went to ±52 Mhash/s on each GPU.
Should I be able to get more or less, Litecoin-wise?


What do I have to overclock/underclock to be efficient?
hero member
Activity: 756
Merit: 502
I couldn't get a 32-bit compile in Ubuntu 12.04 because of libcurl issues.  For the 64-bit compile, I had to add -fpermissive to CXXFLAGS to get the compiler to accept a cast related to jansson.  After that, I have a binary.  

Here's the autotune for a K20.
http://pastebin.com/s9Cyb8yA

Not nearly as impressive as I thought it would be. Is ECC turned off?
Performance is not scaling up with core count as much as I would have hoped for.
Nice to know the code does not barf on 64bit and that shares are accepted.
newbie
Activity: 59
Merit: 0
I couldn't get a 32-bit compile in Ubuntu 12.04 because of libcurl issues.  For the 64-bit compile, I had to add -fpermissive to CXXFLAGS to get the compiler to accept a cast related to jansson.  After that, I have a binary. 

Here's the autotune for a K20.
http://pastebin.com/s9Cyb8yA
sr. member
Activity: 490
Merit: 254
I am still lacking data points on compute 1.0 and 1.1 devices (8800GTX, etc...), and also compute 1.2. Really ancient hardware ;)

I actually have an old box with an 8800GTX in it! I will run it for a while and report my results.


Update:
Autotune picked 1x4
Hashrate of 2.4 kH/sec

I didn't bother with much more testing due to such a low rate.
sr. member
Activity: 490
Merit: 254
Memory size can be a limiting factor if you have lots of SP (streamprocessors or CUDA cores) on the cards.

e.g. on a 660Ti with 3GB I can go for 290x2 (which consumes some 2.5 GB of RAM), whereas on smaller cards we can choose e.g. 148x2.

That explains why mine didn't work on 290x2 then and picked 148x2. I thought I had a 3 GB 660TI, but then looked at the box and indeed it was only a 2 GB model.

Would you mind explaining the math a bit more? From what you posted it seems as if I could go higher than 148x2, but not to 290x2 which you mention consumes 2.5 GB.

For others' reference:

GTX 660Ti 2GB
314.22 WHLQ Driver
GPU Core OC to 1320 MHz
autotune launch config settings 148x2
130-140 khash/sec with 98.8% average accepted share rate - via console
130.8 kH/sec reported by Pool (average over 10 minutes)
hero member
Activity: 756
Merit: 502
What is the autotune using to determine the outcome? Should I try to find a suitable config and then fix that in the startup params? Or is it better if I let it autotune on each startup?

Due to measurement errors there is always a bit of randomness involved. Occasionally it will even find a surprising new configuration that suddenly beats the one you were using previously.

It may depend on current overclocking settings as well (memory vs. core clock), that's why it's called "tuning".
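As a rough illustration of why the result can change between runs, here is a generic benchmark-and-pick loop. It is a sketch under stated assumptions, not cudaMiner's actual autotune code: `autotune`, `noisy_measure`, the candidate list, and the "true" rates are all hypothetical, and taking the median of several runs is just one common way to damp measurement noise.

```python
import random
import statistics


def autotune(measure, candidates, repeats=5):
    """Return (config, rate) for the candidate with the highest median rate."""
    best, best_rate = None, float("-inf")
    for cfg in candidates:
        # Repeating each measurement and taking the median damps the noise.
        rate = statistics.median(measure(cfg) for _ in range(repeats))
        if rate > best_rate:
            best, best_rate = cfg, rate
    return best, best_rate


# Hypothetical stand-in for a real benchmark: made-up "true" rates in khash/s
# with a few percent of random measurement error on top.
TRUE_RATES = {(148, 2): 135.0, (74, 4): 133.0, (32, 8): 120.0}
rng = random.Random()


def noisy_measure(cfg):
    return TRUE_RATES[cfg] * rng.uniform(0.97, 1.03)


# With repeats=1, close candidates can swap places from run to run;
# more repeats make the choice more stable.
print(autotune(noisy_measure, list(TRUE_RATES), repeats=1))
print(autotune(noisy_measure, list(TRUE_RATES), repeats=9))
```

If a run settles on a configuration that holds up over a longer mining session, fixing it in the startup parameters avoids repeating the measurement on every start.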

Thanks to everybody who is posting their launch configs here. It allowed me to improve the heuristics code for people who do not have the patience to wait through an autotune session.

I am still lacking data points on compute 1.0 and 1.1 devices (8800GTX, etc...), and also compute 1.2. Really ancient hardware ;) And of course, the TITAN!

Due to good benchmark results, I just ordered myself a GTX 570 (Club 3D also, 149 Euros)
Anyone got a GTX 590? This dual GPU card could do 300 kHash/s I guess.

Also I posted one last binary+source code update for today, which includes the improved heuristics for --no-autotune. I found and fixed one more bug appearing on Linux in my 9600M GPU. Most autotune measurements were showing 0 kHash suddenly.

Christian
hero member
Activity: 756
Merit: 502
Is it the SP's available or card memory available that is a limiting factor for the CUDA miner? I ask because I have a 670 with 4GB (vs. 2GB standard) and was curious if I should pop that in and give it a try.

Memory size can be a limiting factor if you have lots of SP (streamprocessors or CUDA cores) on the cards.

e.g. on a 660Ti with 3GB I can go for 290x2 (which consumes some 2.5 GB of RAM), whereas on smaller cards we can choose e.g. 148x2.

The more CUDA cores a card has, the larger the grid x block value we need to throw at it to keep the SPs busy - but also the memory requirement grows.
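The memory figures can be approximated with a little arithmetic. The sketch below assumes that a launch configuration BxW means B blocks of W warps (32 threads each), one hash per thread, and that each concurrent hash needs the standard scrypt scratchpad of 128 * r * N bytes (128 KiB for Litecoin's N=1024, r=1). The hash counts in logs posted elsewhere in this thread (166x2 reporting 10624 hashes) are consistent with the first assumption; the per-hash allocation is an assumption about the miner, and `scratchpad_mib` and `max_blocks` are illustrative names only.

```python
THREADS_PER_WARP = 32               # CUDA warp size
SCRATCHPAD_BYTES = 128 * 1 * 1024   # scrypt: 128 * r * N with r=1, N=1024
MIB = 1024 * 1024


def scratchpad_mib(blocks, warps_per_block):
    """Approximate scratchpad memory (MiB) for one launch configuration."""
    hashes = blocks * warps_per_block * THREADS_PER_WARP
    return hashes * SCRATCHPAD_BYTES / MIB


def max_blocks(card_mib, warps_per_block):
    """Upper bound on blocks if the whole card held nothing but scratchpads."""
    per_block = warps_per_block * THREADS_PER_WARP * SCRATCHPAD_BYTES
    return (card_mib * MIB) // per_block


for blocks, warps in [(290, 2), (148, 2)]:
    print(f"{blocks}x{warps}: about {scratchpad_mib(blocks, warps):.0f} MiB")

print("upper bound on a 2048 MiB card at x2:", max_blocks(2048, 2), "blocks")
```

Under these assumptions 290x2 comes to 2320 MiB (about 2.4 GB, in line with the "some 2.5 GB" above) and 148x2 to 1184 MiB. The upper bound is purely theoretical: the driver, the display, and other buffers also need memory, and autotune picks the fastest configuration, not the largest one that fits.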

I have now improved my heuristics (the --no-autotune case) for such cases on compute 3.0 cards.

I have no experience with the 670 or 680 cards yet. But 4GB vs 2GB can make a difference in the autotune results.

Christian

hero member
Activity: 914
Merit: 500
Is it the SP's available or card memory available that is a limiting factor for the CUDA miner? I ask because I have a 670 with 4GB (vs. 2GB standard) and was curious if I should pop that in and give it a try.