Author

Topic: [ANN] cudaMiner & ccMiner CUDA based mining applications [Windows/Linux/MacOSX] - page 995. (Read 3426921 times)

hero member
Activity: 756
Merit: 502
Two new experimental kernels added to github - currently for Linux only. The Visual C++
project has not yet been updated. You will want to run ./autogen.sh and configure after
doing a git pull.

"Z" code submission by nVidia for Compute 3.5 devices (GTX 780 etc...). Good for scrypt.
"Y" code submission by nVidia, modified to run on Compute 3.0 devices also. Good for scrypt.

I find that scrypt-jane still runs faster with the "X" (Fermi) and "K/T" (Kepler/Titan) kernels
from the current github code.

Test away... Especially the Z kernel is expected to rule. I haven't tested it yet in detail.
Best config for "Z" is No. of SMX x 24, according to the engineer who wrote it.
Best config for "Y" is (guessing) No. of SMX x 32   - or just autotune.
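
As a rough sketch of the two rules of thumb above, the snippet below builds a launch string for the -l option from an SMX count. It assumes that "No. of SMX x 24" is meant as the launch string <kernel><SMX count>x24, and that the second number counts warps; both are my reading rather than something confirmed here. The function name and the example SMX count of 12 are illustrative only.

```python
# Second number of the launch string suggested above for each kernel.
# The value for "Y" was only a guess in the post.
SUGGESTED_WARPS = {"Z": 24, "Y": 32}


def suggest_launch_config(kernel: str, smx_count: int) -> str:
    """Return a string such as 'Z12x24' for use with cudaminer's -l option.

    Assumption: first number = number of SMX units on the card,
    second number = the suggested value for that kernel.
    """
    if kernel not in SUGGESTED_WARPS:
        raise ValueError(f"no suggestion for kernel {kernel!r}")
    if smx_count < 1:
        raise ValueError("smx_count must be positive")
    return f"{kernel}{smx_count}x{SUGGESTED_WARPS[kernel]}"


if __name__ == "__main__":
    # 12 is a made-up SMX count; look up the real value for your card.
    print(suggest_launch_config("Z", 12))
```

Treat the result as a starting point only; autotune may still find something better.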

The Z kernel is best run with -C 0 (it supports -C 1 and -C 2, but that is mostly pointless).

When you make kHash/s benchmarks, compare with the best scrypt values achieved with the
2013-12-18 release.

I got 86 kHash/s on GTX 750M with the -C2 flag and -l Y4x32 in some quick tests, which
might be slightly faster than what the 2013-12-18 release delivered.

Christian
member
Activity: 85
Merit: 10
Thanks for getting back to me.

I mine at middlecoin.com:

https://bitcointalksearch.org/topic/ann-profit-switching-auto-exchanging-pool-wwwmiddlecoincom-259649

It mines a lot of different coins, then converts them into bitcoin,

like mutlipool.in

Here is my .bat setup below, with my username and password.

cudaminer -o stratum+tcp://asia.middlecoin.com:3333 -u HuhHuhHuhHuhHuhHuh? -p 123

Thanks for trying to help me. I am very grateful.

member
Activity: 106
Merit: 10
Hello there, you great and very smart people!

Well, this is my very first post in this cudaminer thread.

I have 2 computers (lappys).

One has an NVIDIA GT 750M 4GB GPU. It is running at about 75 kH/s and shows K4x16 when the autotune starts up (I don't know what that means). Can I increase it to do better?

My second lappy has an NVIDIA GTX 670M 3GB GPU. It is also getting about 75 kH/s and shows F56x2 when the autotune starts up. Can I increase that somehow?

I am sadly nowhere near as smart or great as the rest of you. Is there a way to get more kH/s out of my cards without blowing my lappys up?

Any help would be gratefully received.

Thanks

Sorry to ask again and be a pain in the ass, but can someone give some advice on how to get more kH/s out of the GPUs in my 2 laptops, please?
Hi
First of all, what are you trying to mine? Litecoin, another scrypt-based coin, or maybe YACoin?
The hashrate is different for each kind of coin.
Autotune tries to find the best values for you. Once it has found them (like F56x2), you can start cudaminer with -l F56x2 to skip the autotune every time.
But we need to know which coin you are mining before we can suggest more tuning.
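
For anyone wondering how to read strings like F56x2 or K4x16, here is a minimal sketch that splits one into its parts. It assumes the usual reading of a launch config as an optional kernel letter, then a block count, then warps per block. The class and function names are made up for illustration and are not part of cudaminer.

```python
import re
from typing import NamedTuple


class LaunchConfig(NamedTuple):
    kernel: str  # kernel prefix letter, e.g. "F" or "K"; empty if none was given
    blocks: int  # first number (assumed: thread blocks)
    warps: int   # second number (assumed: warps per block)


# Optional single letter, then <digits>x<digits>, e.g. "F56x2" or "128x2".
_PATTERN = re.compile(r"^([A-Za-z]?)(\d+)x(\d+)$")


def parse_launch_config(text: str) -> LaunchConfig:
    """Split a launch string as printed by autotune into its three parts."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a launch config: {text!r}")
    kernel, blocks, warps = match.groups()
    return LaunchConfig(kernel.upper(), int(blocks), int(warps))


if __name__ == "__main__":
    for example in ("F56x2", "k4x16", "128x2"):
        print(example, "->", parse_launch_config(example))
```

The kernel letter matters as much as the numbers, so pass the whole string to -l rather than the numbers alone when you want a specific kernel.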
member
Activity: 106
Merit: 10
I built the latest commit (111) for you.
Please note that this comes without any warranties or anything. Donations please go to cbuchner!
Thanks @cbuchner for your continued work!
64-bit: https://www.dropbox.com/s/7qp3cwgufivu5jt/cudaminer_commit_111_x64.rar
32-bit: https://www.dropbox.com/s/z6aenjphoew7xs1/cudaminer_commit_111_x86.rar
member
Activity: 85
Merit: 10
Hello there, you great and very smart people!

Well, this is my very first post in this cudaminer thread.

I have 2 computers (lappys).

One has an NVIDIA GT 750M 4GB GPU. It is running at about 75 kH/s and shows K4x16 when the autotune starts up (I don't know what that means). Can I increase it to do better?

My second lappy has an NVIDIA GTX 670M 3GB GPU. It is also getting about 75 kH/s and shows F56x2 when the autotune starts up. Can I increase that somehow?

I am sadly nowhere near as smart or great as the rest of you. Is there a way to get more kH/s out of my cards without blowing my lappys up?

Any help would be gratefully received.

Thanks

Sorry to ask again and be a pain in the ass, but can someone give some advice on how to get more kH/s out of the GPUs in my 2 laptops, please?
hero member
Activity: 809
Merit: 501
I want to use the -L version myself! Is someone posting binaries for this?
hero member
Activity: 840
Merit: 1000
Hmm, I'm getting like 19 coins a day with 100 kH/s (<0% rejected) on vert.bitcrush.info pool.

Difficulty went up a lot since my post :-/ I'm down to 200 coins a day with 1000 kH/s now. However, the exchange rate went up quite a bit too (0.0004 BTC on CoinedUp), so profitability is about the same.
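
A quick check of the "about the same" claim, using only figures from this thread: the earlier estimate was 500 coins a day for around 0.08 BTC, and the current figures are 200 coins a day at 0.0004 BTC each. The function name is made up for illustration, and pool fees and electricity are ignored.

```python
def daily_btc(coins_per_day: float, price_btc: float) -> float:
    """Gross daily income in BTC, ignoring pool fees and electricity."""
    return coins_per_day * price_btc


if __name__ == "__main__":
    # Current figures from the post: 200 coins/day at 0.0004 BTC each.
    now = daily_btc(200, 0.0004)
    # Price implied by the earlier estimate of 500 coins/day for about 0.08 BTC.
    implied_old_price = 0.08 / 500
    print(f"income now:        {now:.4f} BTC/day")
    print(f"implied old price: {implied_old_price:.5f} BTC per coin")
```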


Hey, when I checked my mobile phone this morning I noticed that an nVidia engineer has submitted an optimized kernel for Kepler devices. Apparently they are aware of the whole AMD vs nVidia  mining discrepancy and want to help me put nVidia into a better position.

I will review their code submission and integrate it if it's better than my code (which is likely, considering they designed this silicon). They don't include scrypt-jane yet, so I will have to do that part myself.

Christian




That is very cool. Can't wait to see the results.
member
Activity: 85
Merit: 10
Hello there, you great and very smart people!

Well, this is my very first post in this cudaminer thread.

I have 2 computers (lappys).

One has an NVIDIA GT 750M 4GB GPU. It is running at about 75 kH/s and shows K4x16 when the autotune starts up (I don't know what that means). Can I increase it to do better?

My second lappy has an NVIDIA GTX 670M 3GB GPU. It is also getting about 75 kH/s and shows F56x2 when the autotune starts up. Can I increase that somehow?

I am sadly nowhere near as smart or great as the rest of you. Is there a way to get more kH/s out of my cards without blowing my lappys up?

Any help would be gratefully received.

Thanks

member
Activity: 70
Merit: 10
Woot, I can finally post here!

Is Scrypt jane heavily dependent on memory available? If so how are 6GB Titans performing?
I think no one has tested this yet.

I have!

I think Titans have a problem allocating all that memory.
Quoted from the Scrypt-jane spreadsheet:
"GPU Memory usage: 2883 MB"
"Seems Kepler Kernal has better memory allocation then the Titan for Scrypt-Jane. Texture Cache set to -C 1 throws an error indicating it fails over to -C 2 but if you launch a -C2 the hashrate is nearly halved."
Edit: Also "Current Titan kernal appears not to allow much more then this before driver soft crashes currently."

This was me.

Could this be a 32-bit limitation? Maybe we need a working 64-bit version?

Possibly, I was using Patoberli's x86 build of commit 92 for both the T16x1 and K21x1 runs.

don't think so. Features for the Titan kernel have been brought to the same level as for Kepler.  Just -C 1 and 2 aren't needed for Titan, as caching is automatic and always implied.

This is true, but on the x86 build of commit 92, T16x1 netted about 3.68 while K21x1 was 3.93 (with a few spikes to 3.97), all with a 325+ core offset... I have not been able to try an x64 build yet, or any build with the new lookup gap implementation (no time to build one myself, sadly), so I am not aware of how this may change in later commits.

I can say without a doubt that the K kernel was much more stable than the T kernel. I could go as high as T19x1 (any more and the driver would hard crash), but the hash rate dropped off significantly after T16x1 (T8x2 was virtually the same). The K kernel simply allowed me to reach K21x1 with little to no issue and a significant improvement in hash rate.

Given some time later this week I plan to build a more recent commit as well as an x64 for further testing.

For those interested in scrypt-jane overclocking: memory bandwidth is very much not a factor. I was able to apply a -502 (maximum possible) memory offset with literally no change in hash rate. The memory controller was maybe 17% utilized, so there is zero bottleneck here. It is safe to drop the memory clock to improve total TDP on your cards.

Core offset can be set vastly higher than for a normal overclock. Where I was able to get +155 when running scrypt, I can now easily and safely reach +325 to +350 without much issue. This is with stock BIOS and default drivers while air cooled, so do not be afraid to push the core clock higher when running scrypt-jane.
full member
Activity: 182
Merit: 100
I have been solo mining YACoin all day with the latest client and -l 128x2 -b 1024 -L 4 -i 1 --algo=scrypt-jane at 4khash/s and I haven't found a single block, bad luck or something wrong?

This might sound like a conspiracy, but both the old and the new YAC wallet work perfectly for me, except that seemingly at random, after a couple of hours of solo mining, it shuts off the miner with (Internal Error: 500) and it can't reconnect. Debug.log has nothing. Sometimes it happens after an hour, sometimes a day later, while I'm not even using the PC. I'm not saying it does that when I'm about to find a block, buuuut since I solo mined for more time than I care to admit at 2-4 kH/s and got nothing, it kind of popped up in my head.

Hmm I haven't gotten 500 errors, no...
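
Whether a dry spell like this is just bad luck can be estimated with a Poisson model. The sketch below assumes the Bitcoin-style convention that a block at difficulty D takes D * 2^32 hashes on average, which I have not verified for YACoin. The difficulty in the example is a placeholder, not a real network figure, and the function names are made up.

```python
import math

# Assumption: Bitcoin-style scaling, 2**32 hashes per unit of difficulty.
HASHES_PER_DIFFICULTY = 2 ** 32


def expected_seconds_per_block(difficulty: float, hashrate: float) -> float:
    """Average time to find one block when mining solo at `hashrate` hashes/s."""
    return difficulty * HASHES_PER_DIFFICULTY / hashrate


def chance_of_no_block(difficulty: float, hashrate: float, seconds: float) -> float:
    """Poisson probability of finding zero blocks in `seconds` of mining."""
    mean_blocks = seconds / expected_seconds_per_block(difficulty, hashrate)
    return math.exp(-mean_blocks)


if __name__ == "__main__":
    difficulty = 1.0     # placeholder, replace with the current network difficulty
    hashrate = 4000.0    # 4 kH/s, as in the post above
    one_day = 24 * 3600
    days = expected_seconds_per_block(difficulty, hashrate) / one_day
    print(f"expected days per block: {days:.1f}")
    print(f"chance of no block in a day: {chance_of_no_block(difficulty, hashrate, one_day):.2f}")
```

If the chance of finding nothing in a day comes out high for the real difficulty, an empty day says little about whether something is broken.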
legendary
Activity: 2002
Merit: 1051
ICO? Not even once.
I have been solo mining YACoin all day with the latest client and -l 128x2 -b 1024 -L 4 -i 1 --algo=scrypt-jane at 4khash/s and I haven't found a single block, bad luck or something wrong?

This might sound like a conspiracy, but both the old and the new YAC wallet work perfectly for me, except that seemingly at random, after a couple of hours of solo mining, it shuts off the miner with (Internal Error: 500) and it can't reconnect. Debug.log has nothing. Sometimes it happens after an hour, sometimes a day later, while I'm not even using the PC. I'm not saying it does that when I'm about to find a block, buuuut since I solo mined for more time than I care to admit at 2-4 kH/s and got nothing, it kind of popped up in my head.

Wanted to report my results using the latest git version of cudaminer against vertcoin. With my gtx670 I am averaging around 126khps.  Using a 64bit version does not offer any improvement on my end.

As far as I have noticed, scrypt gets no benefit from 64-bit, only scrypt-jane does.
In fact, for me the x86 version of the 2013-12-18 release was slightly faster than the x64 one. I haven't really mined scrypt since then though.

PS:
1000 kH would get you 500 coins per day at the moment, at the current sell price that's around 0.08 BTC per day. So yeah very profitable assuming the value holds steady, maybe the most profitable coin out? Seems better than Doge even.

Hmm, I'm getting like 19 coins a day with 100 kH/s (<0% rejected) on vert.bitcrush.info pool.
newbie
Activity: 9
Merit: 0

Christian, are you communicating with your Nvidia Friend about CUDA 6? Will it give any performance enhancements for our old Fermi cards?

the communication was so far limited to a kernel submission from nVidia.

It's a high register count (1 hash per thread) Compute 3.5 kernel that gives some marginal improvement over Dave Andersen's work. Unfortunately it's not well suited for implementing a LOOKUP_GAP.

Christian


Told you that your work was getting noticed. Just didn't know it went all the way up to Nvidia itself.

On a more related note, I would imagine you welcoming CUDA 6 with open arms due to simplified memory management.

Additionally, the ARM CPU that should be on Maxwell cards should be really nice for mining. I envision a Maxwell kernel that uses it to handle things that aren't great for the GPU, while getting CPU usage to a more consistently near-zero level.
full member
Activity: 182
Merit: 100
I have been solo mining YACoin all day with the latest client and -l 128x2 -b 1024 -L 4 -i 1 --algo=scrypt-jane at 4khash/s and I haven't found a single block, bad luck or something wrong?

a bad dry spell like that (with 20kHash/s mining power) caused me to join the yac.coinmine.pl pool again. ;-)

Also note that within 2 weeks you have to upgrade your yacoin wallet to the new 4.2 release. Failing to do so will definitely break your solo mining (possible blockchain split at block 420000 for non-upgraders).
I have updated the wallet, so it was just bad luck I suppose...
I am indeed on a pool now.
hero member
Activity: 756
Merit: 502
PSA - when you sell your mined YAC, don't just put up a sell order for the current (=last) price. Aim higher. You will wait 2-3 days, but eventually your order will get fulfilled.

Code:
Date                 Type  Pair     Price           Amount            Total
2014-01-19 12:59:32  Sell  YAC/BTC  0.00004600 BTC    101.000000 YAC  0.0046 BTC
2014-01-19 12:59:22  Sell  YAC/BTC  0.00004600 BTC  2,065.170435 YAC  0.0950 BTC
2014-01-19 12:59:16  Sell  YAC/BTC  0.00004600 BTC    101.000000 YAC  0.0046 BTC
2014-01-19 12:59:16  Sell  YAC/BTC  0.00004600 BTC    101.000000 YAC  0.0046 BTC
2014-01-19 12:59:06  Sell  YAC/BTC  0.00004600 BTC    101.000000 YAC  0.0046 BTC
2014-01-19 12:59:01  Sell  YAC/BTC  0.00004600 BTC    101.000000 YAC  0.0046 BTC
2014-01-19 12:56:56  Sell  YAC/BTC  0.00004600 BTC    101.000000 YAC  0.0046 BTC
2014-01-19 12:56:36  Sell  YAC/BTC  0.00004600 BTC    101.000000 YAC  0.0046 BTC
2014-01-19 12:56:30  Sell  YAC/BTC  0.00004600 BTC    101.000000 YAC  0.0046 BTC
2014-01-19 12:56:30  Sell  YAC/BTC  0.00004600 BTC    101.000000 YAC  0.0046 BTC
2014-01-19 12:46:21  Sell  YAC/BTC  0.00004600 BTC    101.000000 YAC  0.0046 BTC
2014-01-19 12:39:26  Sell  YAC/BTC  0.00004200 BTC  1,524.000000 YAC  0.0640 BTC
2014-01-14 04:44:49  Sell  YAC/BTC  0.00004900 BTC  2,850.000000 YAC  0.1397 BTC

See, there are trade bots and buyers who occasionally put their order limits well above the current price.
hero member
Activity: 756
Merit: 502

Christian, are you communicating with your Nvidia Friend about CUDA 6? Will it give any performance enhancements for our old Fermi cards?

the communication was so far limited to a kernel submission from nVidia.

It's a high register count (1 hash per thread) Compute 3.5 kernel that gives some marginal improvement over Dave Andersen's work. Unfortunately it's not well suited for implementing a LOOKUP_GAP.

Christian
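
For readers unfamiliar with LOOKUP_GAP: as I understand it, this is the scrypt time-memory trade-off where only every Nth scratchpad entry is stored and the missing ones are recomputed on demand. The toy below shows just that idea. It uses SHA-256 as a stand-in mixing function, not scrypt's real Salsa20/8 BlockMix, and the names mix and romix are illustrative rather than taken from cudaminer.

```python
import hashlib


def mix(block: bytes) -> bytes:
    """Stand-in for scrypt's BlockMix. NOT the real Salsa20/8 core."""
    return hashlib.sha256(block).digest()


def romix(seed: bytes, n: int, gap: int = 1) -> bytes:
    """Toy ROMix loop that stores only every `gap`-th scratchpad entry."""
    x = hashlib.sha256(seed).digest()

    # Phase 1: walk the chain V[0..n-1], keeping V[i] only when i % gap == 0.
    stored = []
    for i in range(n):
        if i % gap == 0:
            stored.append(x)
        x = mix(x)

    # Phase 2: n data-dependent lookups into the (partly missing) scratchpad.
    for _ in range(n):
        j = int.from_bytes(x[:4], "little") % n
        v = stored[j // gap]       # nearest stored entry at or below j
        for _ in range(j % gap):   # recompute the skipped steps up to V[j]
            v = mix(v)
        x = mix(bytes(a ^ b for a, b in zip(x, v)))
    return x


if __name__ == "__main__":
    full = romix(b"example", 64, gap=1)   # whole scratchpad kept in memory
    gapped = romix(b"example", 64, gap=4) # a quarter of the memory, more hashing
    print("results match:", full == gapped)
```

The gapped run gives the same result with less memory but extra mixing work per lookup, which is the trade-off a kernel has to absorb.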
hero member
Activity: 756
Merit: 502
I have been solo mining YACoin all day with the latest client and -l 128x2 -b 1024 -L 4 -i 1 --algo=scrypt-jane at 4khash/s and I haven't found a single block, bad luck or something wrong?

a bad dry spell like that (with 20kHash/s mining power) caused me to join the yac.coinmine.pl pool again. ;-)

Also note that within 2 weeks you have to upgrade your yacoin wallet to the new 4.2 release. Failing to do so will definitely break your solo mining (possible blockchain split at block 420000 for non-upgraders).

hero member
Activity: 756
Merit: 502
I disabled SLI on mine, it was behaving weird with it. I get roughly ~2.6/2.7 per GTX 660 card now. I can't specify a decent -l number though, it errors out every time I do it. It only works with -L 2.

I still can't figure out why the autotune causes cudaminer to crash with 2 GTX 660s.

Can someone suggest command instructions to get this working?

Appreciate it!

Workaround: run two separate instances, one with -d 0, the other with -d 1.
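
A minimal sketch of that workaround, building one command line per card. Only the flags already shown in this thread are used (-d, -o, -u, -p, --algo). The pool URL, worker name and password are placeholders, and the helper name is made up for illustration.

```python
import shlex


def build_commands(base_args, device_ids):
    """Return one cudaminer argument list per GPU, each pinned with -d."""
    return [["cudaminer", "-d", str(dev), *base_args] for dev in device_ids]


if __name__ == "__main__":
    # Placeholder pool settings; substitute your own.
    base = [
        "--algo=scrypt-jane",
        "-o", "stratum+tcp://pool.example:3333",
        "-u", "worker",
        "-p", "x",
    ]
    for cmd in build_commands(base, [0, 1]):
        # Printed rather than executed; run each line in its own terminal.
        print(shlex.join(cmd))
```

Each printed line can go into its own terminal or .bat file, so a crash on one card does not take the other instance down.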
hero member
Activity: 756
Merit: 502
Hi Christian, I recently git cloned the source and compiled it on Ubuntu 13.10 x64. My GPU is a GTX 570. Everything compiled fine, but when running it to mine Litecoin (scrypt) I got this error:

GPU #0: GeForce GTX 570 result does not validate on CPU (i=5456, s=0)!

Below is the execution config.
./cudaminer -a scrypt -o stratum+tcp://hk2.wemineltc.com:3333 -u -p -l F15x16 -C 1 -m 1 -H 2 -i 0

Is it a bug or some wrong config?

Today I was told that CUDA 5.0 and the current github code give validation errors on Fermi devices with the F kernel. Use 5.5! I was also informed that removing the rule in Makefile.am specific to fermi_kernel.cu gives a boost in scrypt performance with the F kernel. For some reason, compiling it for sm_20 under Linux causes a 40% performance drop. Removing the rule causes the .cu.o rule to take over, which compiles this file for compute_10. Run autogen and configure again afterwards.

Christian
full member
Activity: 239
Merit: 103
Wanted to report my results using the latest git version of cudaminer against vertcoin. With my gtx670 I am averaging around 126khps.  Using a 64bit version does not offer any improvement on my end.

I'm getting 160 kH/s with a 680, but I'm using an older cudaminer version. With the latest git I get around 140 kH/s.
The 640 is pinned at 40 kH/s regardless of which version I use. If you compare this to a 6-core Xeon, which gets 33 kH/s, it's really bad, but expected.
newbie
Activity: 4
Merit: 0
I disabled SLI on mine, it was behaving weird with it. I get roughly ~2.6/2.7 per GTX 660 card now. I can't specify a decent -l number though, it errors out every time I do it. It only works with -L 2.

I still can't figure out why the autotune causes cudaminer to crash with 2 GTX 660s. I still can't get this to work. I have SLI turned off. Both GTX 660s have 2 GB of RAM. There's no overclock on either one; everything is at stock speeds.

Can someone suggest command instructions to get this working?

Appreciate it!