
Topic: [ANN] cudaMiner & ccMiner CUDA based mining applications [Windows/Linux/MacOSX] - page 1072. (Read 3426947 times)

newbie
Activity: 10
Merit: 0
Hi, try different combinations, maybe all of them :-)

 -i 0 (or 1)
 -m 1 (with or without)
 -C 0 (or 1 or 2, or without)
 -H 1 (with or without)
Autotune the kernel for best results: -l Fauto (or -l Tauto or -l Sauto)

Hi, I tried dozens of combinations. No luck, it's impossible to get more than 20 kHash/s O.o

Antares88: I have a 9800GT, which is actually the same as the 8800GT. As I posted before, I couldn't find good settings for that card. Also, I get strange differences in speed every time I run the miner with the same settings.
If you succeed in finding optimal settings for that card, please share them. Smiley It's interesting how much these old cards can score. The best speed I get is ~25 kH/s, but it's not stable, as the MCU doesn't always get the same load. Sometimes it gets only half the load and I get only 14 kH/s Sad

Sad The results seem pretty random... Do you use the 64-bit or the 32-bit cudaminer? What drivers?
newbie
Activity: 3
Merit: 0
I think it is because http://api.bitcoin.cz:8332 is only for SHA256 (Bitcoin) mining, not scrypt (Litecoin) mining...

Ahhhh OK, that clears up some of the confusion. I use Bitcoin, not Litecoin; that's where all the confusion lies. I figured it could go vice versa, but I guess not.
sr. member
Activity: 396
Merit: 250
I speak: LT, RU, EN
Antares88: I have a 9800GT, which is actually the same as the 8800GT. As I posted before, I couldn't find good settings for that card. Also, I get strange differences in speed every time I run the miner with the same settings.
If you succeed in finding optimal settings for that card, please share them. Smiley It's interesting how much these old cards can score. The best speed I get is ~25 kH/s, but it's not stable, as the MCU doesn't always get the same load. Sometimes it gets only half the load and I get only 14 kH/s Sad
13G
newbie
Activity: 17
Merit: 0
Hi, try different combinations, maybe all of them :-)

 -i 0 (or 1)
 -m 1 (with or without)
 -C 0 (or 1 or 2, or without)
 -H 1 (with or without)
Autotune the kernel for best results: -l Fauto (or -l Tauto or -l Sauto)
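For example (just an illustration; the pool host, port, worker and password below are placeholders, not a recommendation of any particular pool), a full command line combining these flags might look like:

cudaminer.exe -a scrypt -o stratum+tcp://<pool-host>:<port> -O <worker>:<password> -i 0 -m 1 -C 1 -H 1 -l Fauto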
newbie
Activity: 10
Merit: 0
Hi everyone, I'm new here.

I'm trying to mine Litecoins with my GPU, an 8800GT, but I'm not able to get more than 20 kHash/s. I know it's an old card and that cudaMiner is young, but from the charts it seems I should get much better results.
I'm using the settings -l 28x3 -i 1 -C 0
L14x3 and L26x3 give similar results.
Many other configurations just don't work, with cudaMiner crashing. Others give impossible hash rates (e.g. 66x2 -> 324 kHash/s) that are not validated by the CPU. Autotune gives different results every time and often crashes when trying L0x0.
Any advice?

I'm on Win7 x64, CPU i5-2500k, nvidia driver 320.78
13G
newbie
Activity: 17
Merit: 0
I think it is because http://api.bitcoin.cz:8332 is only for SHA256 (Bitcoin) mining, not scrypt (Litecoin) mining...
sr. member
Activity: 362
Merit: 250
0diz, that means your shares are being rejected by the pool. Are you sure you have the address and port set correctly? In your log output it says port 3333, but in the URL below it you show port 8332.
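For reference (the host, port, worker and password here are placeholders, not a pointer to any particular pool), a scrypt pool is normally passed in this form, with the stratum port in the URL matching the one the pool lists for scrypt:

cudaminer.exe -a scrypt -o stratum+tcp://<pool-host>:<scrypt-port> -O <worker>:<password>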
newbie
Activity: 3
Merit: 0
So I'm trying to get cudaMiner to work, but this is what I keep getting... I searched the forum, but nothing about (booooo) came up  Huh


[2013-11-26 16:42:19] 1 miner threads started, using 'scrypt' algorithm.
[2013-11-26 16:42:19] Binding thread 0 to cpu 0
[2013-11-26 16:42:23] Starting Stratum on stratum+tcp://stratum.bitcoin.cz:3333
[2013-11-26 16:42:24] Stratum detected new block
[2013-11-26 16:42:25] GPU #0: GeForce GTX 650 with compute capability 3.0
[2013-11-26 16:42:25] GPU #0: interactive: 1, tex-cache: 0 , single-alloc: 0
[2013-11-26 16:42:25] GPU #0: Performing auto-tuning (Patience...)
[2013-11-26 16:42:39] GPU #0:   61.54 khash/s with configuration K10x6
[2013-11-26 16:42:39] GPU #0: using launch configuration K10x6
[2013-11-26 16:42:39] GPU #0: GeForce GTX 650, 5760 hashes, 0.39 khash/s
[2013-11-26 16:42:40] GPU #0: GeForce GTX 650, 24960 hashes, 38.10 khash/s
[2013-11-26 16:42:41] GPU #0: GeForce GTX 650, 44160 hashes, 39.32 khash/s
[2013-11-26 16:42:41] accepted: 0/1 (0.00%), 39.32 khash/s (booooo)
[2013-11-26 16:42:42] Stratum detected new block
[2013-11-26 16:42:42] GPU #0: GeForce GTX 650, 61440 hashes, 39.78 khash/s
[2013-11-26 16:42:43] GPU #0: GeForce GTX 650, 21120 hashes, 37.61 khash/s
[2013-11-26 16:42:43] accepted: 0/2 (0.00%), 37.61 khash/s (booooo)
[2013-11-26 16:42:45] GPU #0: GeForce GTX 650, 69120 hashes, 39.92 khash/s
[2013-11-26 16:42:45] accepted: 0/3 (0.00%), 39.92 khash/s (booooo)
[2013-11-26 16:42:46] GPU #0: GeForce GTX 650, 63360 hashes, 39.82 khash/s
[2013-11-26 16:42:46] accepted: 0/4 (0.00%), 39.82 khash/s (booooo)

I'm using the following setup: "cudaminer.exe -o http://api.bitcoin.cz:8332 -O Worker:Pass"

Does anyone know how to stop the (booooo) from happening? Or does anyone have any tips to try?

Note: I'm running a Win 7 x64 machine
newbie
Activity: 11
Merit: 0
What settings would be best for the K5000 & K6000? Autotuning the K5000 gets at best 136 kH/s.
hero member
Activity: 756
Merit: 502
Some tests on my 780Ti show that the 450 kHash/s performance limit is somewhat related to memory transfer performance.

When I comment out the compute intensive xor_salsa subroutine entirely, the GPU still hashes at 450 kHash/s, consumes 85% TDP and shows 99% GPU load, 50% memory controller load. This is quite astonishing for a GPU that only moves data between global and shared memory and back! Why is it still consuming so much power? And why does it only show 50% memory controller load even if this appears to be the limiting factor?

In order to achieve ANY improvement here, I will have to come up with some new ideas about more efficient memory storage layout and transfers.
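(For those curious what that subroutine actually computes: below is a minimal sketch of the salsa20/8 core that an xor_salsa-style routine is built around, following the standard scrypt reference formulation. It is only an illustration of the arithmetic involved, not cudaMiner's actual kernel code, which is restructured quite differently for the GPU.)

#include <stdint.h>

// Illustrative only: salsa20/8 core as used inside scrypt's ROMix,
// i.e. B ^= Bx, then 8 Salsa20 rounds on the result, then add back.
__device__ void xor_salsa8_sketch(uint32_t B[16], const uint32_t Bx[16])
{
    uint32_t x[16];
    for (int i = 0; i < 16; ++i)
        x[i] = (B[i] ^= Bx[i]);

#define R(a, b) (((a) << (b)) | ((a) >> (32 - (b))))
    for (int i = 0; i < 8; i += 2) {
        // column rounds
        x[ 4] ^= R(x[ 0] + x[12],  7);  x[ 8] ^= R(x[ 4] + x[ 0],  9);
        x[12] ^= R(x[ 8] + x[ 4], 13);  x[ 0] ^= R(x[12] + x[ 8], 18);
        x[ 9] ^= R(x[ 5] + x[ 1],  7);  x[13] ^= R(x[ 9] + x[ 5],  9);
        x[ 1] ^= R(x[13] + x[ 9], 13);  x[ 5] ^= R(x[ 1] + x[13], 18);
        x[14] ^= R(x[10] + x[ 6],  7);  x[ 2] ^= R(x[14] + x[10],  9);
        x[ 6] ^= R(x[ 2] + x[14], 13);  x[10] ^= R(x[ 6] + x[ 2], 18);
        x[ 3] ^= R(x[15] + x[11],  7);  x[ 7] ^= R(x[ 3] + x[15],  9);
        x[11] ^= R(x[ 7] + x[ 3], 13);  x[15] ^= R(x[11] + x[ 7], 18);
        // row rounds
        x[ 1] ^= R(x[ 0] + x[ 3],  7);  x[ 2] ^= R(x[ 1] + x[ 0],  9);
        x[ 3] ^= R(x[ 2] + x[ 1], 13);  x[ 0] ^= R(x[ 3] + x[ 2], 18);
        x[ 6] ^= R(x[ 5] + x[ 4],  7);  x[ 7] ^= R(x[ 6] + x[ 5],  9);
        x[ 4] ^= R(x[ 7] + x[ 6], 13);  x[ 5] ^= R(x[ 4] + x[ 7], 18);
        x[11] ^= R(x[10] + x[ 9],  7);  x[ 8] ^= R(x[11] + x[10],  9);
        x[ 9] ^= R(x[ 8] + x[11], 13);  x[10] ^= R(x[ 9] + x[ 8], 18);
        x[12] ^= R(x[15] + x[14],  7);  x[13] ^= R(x[12] + x[15],  9);
        x[14] ^= R(x[13] + x[12], 13);  x[15] ^= R(x[14] + x[13], 18);
    }
#undef R

    for (int i = 0; i < 16; ++i)
        B[i] += x[i];
}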
sr. member
Activity: 396
Merit: 250
I speak: LT, RU, EN
Windows XP or Linux works better for old Compute 1.x devices. Different driver model.
Strange that you're not getting decent performance on Windows XP 64-bit...

I don't run these old cards anymore. I wish I could still test this...

Are you running with -i 0 and -C 2 as well? What happens when you let it autotune the Fermi or Kepler kernel instead of the Legacy kernel with -l F or -l K?

I don't run XP; as I wrote, I run Vista x64.
I tried -i 0 -C 2. With --benchmark the system starts lagging. When mining it's not as bad, but it still feels a little laggy.
But -i 0 -C 2 doesn't give better results. Just after getting ~24 kH/s, I stopped the miner and tried different -i and -C values, and autotuned the Fermi and Kepler kernels, but couldn't get more than 15.2 kH/s. After those tests I returned to my default command line:
cudaminer -a scrypt -o stratum+tcp://coinotron.com:3334 -O mirer.usr:pass -i 1 -l K14x3 -R 5 -H 1
and I get 24-25.27 kH/s again! I don't get what makes the difference. It looks like some register in the card or some setting value in the driver is changing. I can't explain it...

Maybe I can run some tests for you? Smiley It would be interesting to find the reason for such a strange problem.
hero member
Activity: 756
Merit: 502
Windows XP or Linux works better for old Compute 1.x devices. Different driver model.
Strange that you're not getting decent performance on Windows XP 64-bit...

I don't run these old cards anymore. I wish I could still test this...

Are you running with -i 0 and -C 2 as well? What happens when you let it autotune the Fermi or Kepler kernel instead of the Legacy kernel with -l F or -l K?
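For example (pool URL, port and worker are placeholders), a Kepler autotune run could be started with something like:

cudaminer -a scrypt -o stratum+tcp://<pool-host>:<port> -O <worker>:<password> -i 0 -C 2 -H 1 -l K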


sr. member
Activity: 396
Merit: 250
I speak: LT, RU, EN
Hi everyone.
I got interested in mining LTC and have an NVIDIA 9800GT card. I know it's an old card and not very effective, but I'm not trying to make much profit; it's just out of interest.
I tried cudaminer 2013-11-20 (both 32-bit and 64-bit) on Windows Vista x64 SP2, driver ver. 331.82 (also tried 314.22). My CPU is a Q6600.
As someone wrote in this forum, a 9800GT can get ~45 kH/s if the card is overclocked, so I was expecting around 30-35 kH/s without overclocking. But I get very strange results:
Autotune gives a different result/configuration every time. So by trial and error I've found the best setting to be:
-l K14x3
Unfortunately, I get only ~14 kH/s with these settings. Card loads at that speed are GPU: ~95%, MCU: 15%.
But sometimes, I just don't know why, I get a better result: 24 kH/s. The card loads change to GPU: ~92%, MCU: 26%, so I believe the GPU is used more effectively in that case. But why does the MCU load differ with the same config?
If I stop and run the miner again, I get 14 kH/s again... To get the better speed I sometimes have to stop/start the miner many times (I didn't set out to do it deliberately); it happens about once a day.
I've tried different drivers and a few different versions of cudaminer. What could be the reason for such low speed and such different results with the same configuration? And how can I track down the problem?
13G
newbie
Activity: 17
Merit: 0
cbuchner1:

At which clock is your GTX 780 Ti running in GPU-Z? 450 kHash/s is unbelievable...

I have a Zotac GTX Titan AMP! with an i7 2600K CPU @ 4.6 GHz:

The best results I have: -d 0 -i 0 -C 2 -m 1 -H 1 -l T222x1
 80 °C, 2180 RPM fan and 88% TDP @ 966 MHz


           *** CudaMiner for nVidia GPUs by Christian Buchner ***
                     This is version 2013-11-20 (alpha)
        based on pooler-cpuminer 2.3.2 (c) 2010 Jeff Garzik, 2012 pooler
               Cuda additions Copyright 2013 Christian Buchner
           My donation address: LKS1WDKGED647msBQfLBHV3Ls8sveGncnm

[2013-11-26 19:54:45] 1 miner threads started, using 'scrypt' algorithm.
[2013-11-26 19:54:45] Long-polling activated for http://localhost:8332/lp
[2013-11-26 19:54:46] GPU #0: GeForce GTX TITAN with compute capability 3.5
[2013-11-26 19:54:46] GPU #0: the 'T' kernel ignores the texture cache argument
[2013-11-26 19:54:46] GPU #0: interactive: 0, tex-cache: 0 , single-alloc: 1
[2013-11-26 19:54:46] GPU #0: using launch configuration T222x1
[2013-11-26 19:54:46] GPU #0: GeForce GTX TITAN, 7104 hashes, 31.27 khash/s
[2013-11-26 19:54:48] GPU #0: GeForce GTX TITAN, 923520 hashes, 383.56 khash/s
[2013-11-26 19:54:48] accepted: 1/1 (100.00%), 383.56 khash/s (yay!!!)
[2013-11-26 19:54:54] GPU #0: GeForce GTX TITAN, 2401152 hashes, 389.99 khash/s
[2013-11-26 19:54:54] accepted: 2/2 (100.00%), 389.99 khash/s (yay!!!)
[2013-11-26 19:55:06] GPU #0: GeForce GTX TITAN, 4631808 hashes, 391.26 khash/s
[2013-11-26 19:55:06] accepted: 3/3 (100.00%), 391.26 khash/s (yay!!!)
member
Activity: 104
Merit: 10
Still had no real joy with this, but I've just won an ATI 5870 on eBay. Should I get a good deal better hash rate than 70-90 kH/s?
https://litecoin.info/Mining_Hardware_Comparison#AMD_.28ATI.29

According to that, anywhere from 250-450 kH/s for that card.
sr. member
Activity: 490
Merit: 254
Yep, I just downloaded fruittool's version as well to compare. I used the exact same launch settings as I posted above, and it is also about 40 kHash lower, or ~180 kHash/s in my case. I also cross-checked what the console was reporting with the statistics on my pool, and the results were consistent: lower with the 64-bit clients.

Back to using the 32-bit version for now and 220 kHash/s.

Did you try the -H 1 switch? Assuming you have a dual core or more. The 64-bit build seems to use more CPU and may benefit from it more or less depending on CPU speed.

-H 1 gives me about 5 kHash more, so it helps some, but it's still nearly 35 kHash/s lower than the 32-bit client. This box is an older quad-core i7 with HT, so 8 usable cores.
hero member
Activity: 756
Merit: 502
Maybe... I have a Phenom II X4 955 at 3.2 GHz. When mining, it is utilized at about 30%.

Don't forget the -H 1 flag; otherwise it will only use a single core of the CPU, which can be a bottleneck.
hero member
Activity: 756
Merit: 502
Card runs at 1150/1800 and gets an average of 360 kH/s

Interesting. I should also try to autotune the Kepler kernel on my 780 Ti. Maybe I'll get a few extra kHash/s out of it.
Did you allow 100% TDP, or more?

Your settings should scale to the 780 Ti as 15/12 * 360 kHash = 450 kHash, according to the number of enabled SMX engines. But then of course there's also the thermal and power (TDP) limit, which can ruin the day.


I am not having much luck running a Kepler kernel on my 780 Ti. I can't get much above 400 kHash/s. -C 2 -l K119x4 is the best I can do, and it results in 401.8 kHash/s.

UPDATE: using the 32-bit cudaminer binary and -C 2 -l K119x4 actually gets me to 445 kHash/s, but at the price of increased CPU usage.

Back to the Titan kernel for me!

UPDATE 2: And now I find that with the 32-bit cudaminer and -l T30x16 I actually get around 450 kHash/s. There seems to be a definite register advantage in the 32-bit CUDA kernels, making the code run faster.

One other thing that bugs me is that my Phenom X6 1055T CPU is pegged at 80% load when all 4 of my GPUs are running at around 800 kHash/s total. I think I should add an option to do the SHA256 hashing on the GPU, lowering the CPU load below 10%, even if it costs a few kHash/s of performance.
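(Side note for anyone building from source who wants to check the register pressure themselves: nvcc prints per-kernel register counts when given -Xptxas -v, e.g. something like the line below, with the actual kernel .cu file substituted in. This is a generic CUDA toolkit option, not a cudaMiner-specific build step.)

nvcc -arch=sm_35 -Xptxas -v -c <kernel_file>.cu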
hero member
Activity: 675
Merit: 514
Maybe there's a difference between AMD and Intel CPUs?
Maybe... I have a Phenom II X4 955 at 3.2 GHz. When mining, it is utilized at about 30%.
I have a Phenom II X4 too, so that's not the reason.

newbie
Activity: 12
Merit: 0
Currently getting 180-190 kH/s on my 660 Ti (not OC) with this setup:
cudaminer.exe -H 1 -d 0 -i 0 -C2 -l K14x14
On 64-bit cudaminer it is slightly slower and I'm getting 160 kH/s

It's very similar for me. Interesting launch config, by the way. I shall try that one.

Maybe the 64-bit compiled kernel version is running into a register limitation, as pointers are then 64-bit and require 2 registers each. I can compare the resulting PTX code to verify the assumption.
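(For anyone who wants to try the same comparison: nvcc can emit the intermediate PTX with -ptx, and compiling once with -m32 and once with -m64 and diffing the two outputs shows the extra pointer registers; cuobjdump --dump-ptx extracts the PTX from an already-built binary. These are standard CUDA toolkit tools, nothing cudaMiner-specific.)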
I found it somewhere on the net :-)
With other configs I get only 80-150 kH/s, I don't know why. People here get more kH/s than my GPU. With this config it is running better.

Maybe there's a difference between AMD and Intel CPUs?
Maybe... I have a Phenom II X4 955 at 3.2 GHz. When mining, it is utilized at about 30%.