[ANN] cudaMiner & ccMiner CUDA based mining applications [Windows/Linux/MacOSX] - page 1133.

cbuchner1

hero member

Activity: 756

Merit: 502

Quote from: Caesar V on April 09, 2013, 08:22:51 AM

<- Lost newb

Can a kind sir please direct me to a guide so I can get this functioning?

Looks like you need to switch to a pool that uses the getwork protocol, not Stratum. For Stratum you need to set up a local proxy server. I can't help you with that, as I haven't tried it myself.
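For readers wondering why a Stratum port answers with "Empty reply from server": getwork is ordinary JSON-RPC carried over HTTP with basic authentication, whereas Stratum speaks line-based JSON over a raw TCP socket, so an HTTP POST sent to a Stratum port gets nothing usable back. The sketch below only builds such a getwork request without sending it. The host name `pool.example` and the helper name are placeholders for illustration, not part of cudaMiner.

```python
import base64
import json
import urllib.request


def build_getwork_request(url, user, password):
    """Build (but do not send) a getwork JSON-RPC request over HTTP."""
    body = json.dumps({"method": "getwork", "params": [], "id": 0}).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# pool.example is a hypothetical getwork endpoint.
req = build_getwork_request("http://pool.example:9332", "myworkername", "mypassword")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```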

cbuchner1

hero member

Activity: 756

Merit: 502

Quote from: wndrbr3d on April 09, 2013, 08:27:32 AM

The 04-08 build works on the 690 if I tell it only one thread, so we're getting close!

If it'd be easier than getting it working properly in an SLI rig, could there be a command line to set GPU affinity? That way I could just kick off two copies of cudaminer for their respective GPU.

--device 0

--device 1

(device indexing starts at 0 in CUDA)

--device implies the -t (threads) option, based on how many devices are given in a comma argument list.

--device 1,2 works if you want to skip your CUDA device 0, but crunch on devices 1 and 2.
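The relationship between the device list and the thread count can be sketched as follows. This is an illustration in Python of the behaviour described above, not cudaMiner's actual option parser; `parse_devices` is a made-up helper name.

```python
def parse_devices(arg):
    """Turn a comma list such as "1,2" into a list of CUDA device indices."""
    devices = [int(part) for part in arg.split(",") if part.strip()]
    if any(d < 0 for d in devices):
        raise ValueError("CUDA device indices start at 0")
    return devices


for arg in ("0", "1", "1,2"):
    devices = parse_devices(arg)
    threads = len(devices)  # one mining thread per listed device
    print(f"--device {arg}: threads={threads}, devices={devices}")
```

So two separate instances with `--device 0` and `--device 1` each run a single thread, while one instance with `--device 1,2` runs two threads and leaves device 0 idle.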

RTFM ;)

Christian

wndrbr3d

hero member

Activity: 914

Merit: 500

The 04-08 build works on the 690 if I tell it only one thread, so we're getting close!

If it'd be easier than getting it working properly in an SLI rig, could there be a command line to set GPU affinity? That way I could just kick off two copies of cudaminer for their respective GPU.

Otherwise, so far so good! --no-autotune gives me ~110khash/sec on GPU0 in a 690.

Caesar V

sr. member

Activity: 369

Merit: 250

<- Lost newb

Can a kind sir please direct me to a guide so I can get this functioning?

I also get this error when I run the .exe

[2013-04-07 21:36:47] 1 miner threads started, using 'scrypt' algorithm.
[2013-04-07 21:36:47] HTTP request failed: Empty reply from server
[2013-04-07 21:36:47] json_rpc_call failed, retry after 15 seconds
[2013-04-07 21:37:02] HTTP request failed: Recv failure: Connection was reset
[2013-04-07 21:37:02] json_rpc_call failed, retry after 15 seconds

cbuchner1

hero member

Activity: 756

Merit: 502

Where are the Titan owners? Hello, we have support for Titan and I really need someone to test this. If you could post the result of a Titan autotune session with debug flags to the forum - that would be immensely helpful.

cudaminer.exe -D >log.txt

This would generate two large tables in the log file: one for the normal kernel, the other for the spinlock kernel. (Hopefully cudaminer prints to stdout and not stderr. You would also have to wait for autotune to complete, which can take quite long if your card has a lot of RAM: 6 GB for Titan, of which at most 4 GB are addressable from a 32-bit binary.)
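If the miner turns out to log to stderr, a plain `>log.txt` leaves the file empty. One way to sidestep the question is to capture both streams. The sketch below does that from Python; the child process is a stand-in that writes one line to each stream, and `run_and_log` is a hypothetical helper name.

```python
import os
import subprocess
import sys
import tempfile


def run_and_log(cmd, log_path):
    """Run cmd and write both its stdout and its stderr to log_path."""
    with open(log_path, "w") as log:
        return subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT).returncode


# Stand-in for the miner: writes one line to stdout and one to stderr.
demo = [sys.executable, "-c",
        "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"]

log_path = os.path.join(tempfile.mkdtemp(), "log.txt")
run_and_log(demo, log_path)
with open(log_path) as f:
    captured = f.read()
print(captured)
```

For the real thing the call would be `run_and_log(["cudaminer.exe", "-D"], "log.txt")`. Directly in the Windows command prompt the same effect comes from `cudaminer.exe -D >log.txt 2>&1`.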

Christian

cbuchner1

hero member

Activity: 756

Merit: 502

Quote from: ?? on ??

I've just sent 40 LTC to your address. Another 60 LTC will be on its way when you get it done.
P.S. My nVidia card is C2050, I wonder what hashrate it can achieve.
Thank you again for your work.

Wow, you really mean it. That's a really huge donation. Thanks.

I started to put the autoconf/automake scripts back together. Now I am at the point where all the CPU side things build, but I have yet to integrate the NVCC compiler into the automake procedure. I will use Google to see how other people have done this kind of integration.

Tesla C2050, that's a Fermi class device with 448 streaming processors. I would think it can do around 100 kHash/s based on my experience with the GTX 460. It will take a few months before it mines those 40 LTC back, especially as the difficulty has been shooting up a lot.
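The "few months" estimate can be checked with the usual expected-yield formula: a block takes on average difficulty × 2^32 hashes to find. The sketch below assumes the 50 LTC block reward in force in 2013. The difficulty value is a placeholder chosen only to make the arithmetic concrete, not the actual network difficulty at the time, and pool fees and rising difficulty are ignored.

```python
def expected_coins_per_day(hashrate_hps, difficulty, block_reward):
    """Expected coins per day for a given hashrate (hashes per second)."""
    hashes_per_block = difficulty * 2**32  # average work needed per block
    blocks_per_day = hashrate_hps * 86400 / hashes_per_block
    return blocks_per_day * block_reward


HYPOTHETICAL_DIFFICULTY = 300.0  # placeholder, NOT a measured value
BLOCK_REWARD_LTC = 50.0          # Litecoin block reward in 2013

rate = expected_coins_per_day(100e3, HYPOTHETICAL_DIFFICULTY, BLOCK_REWARD_LTC)
days = 40 / rate
print(f"{rate:.3f} LTC/day, {days:.0f} days to mine 40 LTC")
```

With these placeholder inputs, 100 kHash/s needs roughly four months for 40 LTC, and longer if the difficulty keeps climbing.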

Christian

cbuchner1

hero member

Activity: 756

Merit: 502

Quote from: BaghdadSteve on April 08, 2013, 11:18:48 PM

Tried the 4-08 update but auto-tuner does some wacky things before finally getting to mining at a higher rate (130khs), but nothing actually shows up valid with the pool, where with 4-06 it does.

Can you try the 04-08 version with --no-autotune? I might have broken autotune a bit, as indicated by people having issues with the "S" kernel being used in a 1x4 configuration, even though that kernel can't work at 4 warps ;)

So because the kernel isn't run, it throws off the timing measurements resulting in inflated kHash values, and then autotune picks that broken configuration. Meh.
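The failure mode described above, and one way to guard against it, can be sketched as follows: a candidate configuration only competes if its kernel actually ran and produced valid output, so a launch that fails instantly can no longer win on its bogus timing. Everything here is illustrative. `autotune`, `fake_run` and the numbers in the table are invented for the example and are not taken from cudaMiner.

```python
def autotune(candidates, run_kernel):
    """Return the fastest config among those whose kernel really ran."""
    best, best_rate = None, 0.0
    for cfg in candidates:
        ok, hashes, seconds = run_kernel(cfg)
        if not ok or seconds <= 0:
            continue  # a failed launch returns almost instantly; ignore its timing
        rate = hashes / seconds
        if rate > best_rate:
            best, best_rate = cfg, rate
    return best, best_rate


def fake_run(cfg):
    """Hypothetical benchmark results: (kernel_ran, hashes, seconds)."""
    table = {
        "S1x4": (False, 1000, 0.001),   # launch fails, looks 8x faster than it is
        "290x2": (True, 1000, 0.008),
        "145x2": (True, 1000, 0.010),
    }
    return table[cfg]


best, best_rate = autotune(["S1x4", "290x2", "145x2"], fake_run)
print(best, round(best_rate))
```

Without the `ok` check, the broken `S1x4` entry would report the highest rate and be selected, which matches the inflated kHash values people saw.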

cbuchner1

hero member

Activity: 756

Merit: 502

Quote from: dentldir on April 08, 2013, 10:29:20 PM

Autotune on a 660ti autodetected S1x4 which reported 136-138kH/s, but all shares were rejected.

1 LTC sent to LKS1WDKGED647msBQfLBHV3Ls8sveGncnm

At least the non-autotune version works for you!

S1x4 doesn't look right to me. That's just one block launched on a card with 7 multiprocessors @ 192 cores each.

Oh, and the S kernel generally works with 1..3 warps only (x1,x2,x3). So it appears my autotune has an issue here with this "special" kernel. I only added that back to support older hardware like my GTX 260 at decent speeds, because my latest optimizations didn't work for these devices.
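To make the numbers concrete: a configuration such as `290x2` means 290 blocks of 2 warps each, and a warp is 32 threads. The sketch below parses the strings quoted in this thread and flags the two problems mentioned above. The accepted syntax (optional kernel letter, then blocks, `x`, warps) is inferred from the autotune output shown here and is an assumption about the format, as are the helper names.

```python
import re

WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def parse_launch_config(text):
    """Split e.g. "S1x4" into (kernel_letter, blocks, warps_per_block)."""
    m = re.fullmatch(r"([A-Za-z]?)(\d+)x(\d+)", text)
    if m is None:
        raise ValueError(f"unrecognised launch config: {text!r}")
    return m.group(1), int(m.group(2)), int(m.group(3))


def check_launch_config(text, multiprocessors):
    """Return (total_threads, list_of_problems) for a launch config."""
    kernel, blocks, warps = parse_launch_config(text)
    problems = []
    if kernel == "S" and not 1 <= warps <= 3:
        problems.append("S kernel works with 1..3 warps only")
    if blocks < multiprocessors:
        problems.append("fewer blocks than multiprocessors")
    return blocks * warps * WARP_SIZE, problems


# The 660 Ti discussed above has 7 multiprocessors.
for cfg in ("S1x4", "290x2"):
    print(cfg, check_launch_config(cfg, multiprocessors=7))
```

`S1x4` comes out as 128 threads in a single block, far too little for a card with 7 × 192 = 1344 cores, while `290x2` launches 18560 threads.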

Thanks for the donation. I'll try to have that autotune bug fixed soon. Also working on getting Linux compilation to work again.

grosminer

hero member

Activity: 718

Merit: 500

Quote from: dentldir on April 08, 2013, 10:29:20 PM

Autotune on a 660ti autodetected S1x4 which reported 136-138kH/s, but all shares were rejected.

No autotune using 290x2 on the same card comes in a little slower, 115-132kH/s, but very low reject rate and faster than the previous version.

Same here for the first part (S1x4 = 100% reject).
Tried using 290x2: my device driver (314.14) crashed and I had 40-50% rejects.

So I'm back on 2013-04-04.

BaghdadSteve

newbie

Activity: 20

Merit: 0

Managed to figure out the proxy thing and get things running earlier today; went from 75 khash/s on guiminer to 120 khash/s with the 4-06 version of cudaMiner. Tried the 4-08 update, but the auto-tuner does some wacky things before finally getting to mining at a higher rate (130 khash/s), and nothing actually shows up as valid with the pool, whereas with 4-06 it does. Hmm. Sticking with 4-06 for now, will donate when I have something to donate :-*

dentldir

sr. member

Activity: 333

Merit: 250

Autotune on a 660ti autodetected S1x4 which reported 136-138kH/s, but all shares were rejected.

No autotune using 290x2 on the same card comes in a little slower, 115-132kH/s, but very low reject rate and faster than the previous version.

1 LTC sent to LKS1WDKGED647msBQfLBHV3Ls8sveGncnm

Looking forward to testing Linux support.

cbuchner1

hero member

Activity: 756

Merit: 502

Check first post. And fire up the Titans. :D

Almost everyone else may get 20% faster kHash.

And we have command line support for tinkerers (--devices, --launch-config), the README.txt explains it.
Also the debug (-D) option will give a lot of info during autotuning.

I'd love to see an autotuning log for a Titan. Really.

Now excuse me - I've got mining to do.

Christian

cbuchner1

hero member

Activity: 756

Merit: 502

I might add linux build support back in at some point. BUT. So far no one has donated anything to my donation address, and hence it feels utterly bizarre when you wave a 100 LTC bill around like that. In all of the last 2 weeks I have only mined 13 LTC.

I just added a little Microsoft-ism to the code (scrypt.cpp): I use a parallel for construct around the HMAC SHA256 hashes. This takes the load off a single CPU core and allows the mining to go above ~120 kHash/s without choking the CPU. For Linux compilation I would have to either use OpenMP or turn this feature off via #ifdef. But Microsoft Visual Studio does not support OpenMP in the Express versions...

;D
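The idea of spreading the independent per-nonce hashes over several CPU cores can be sketched like this. It is written in Python rather than the C++ of scrypt.cpp, and it illustrates only the shape of the work split, not the performance: whether Python threads actually run this in parallel depends on the interpreter. The first scrypt step for Litecoin-style parameters (r=1, p=1) is one PBKDF2-HMAC-SHA256 iteration over the 80-byte header, using the header as both password and salt and producing 128 bytes. The helper names and the all-zero header are made up for the example.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def scrypt_prehash(header):
    """First scrypt step: PBKDF2-HMAC-SHA256, 1 iteration, 128-byte output."""
    return hashlib.pbkdf2_hmac("sha256", header, header, 1, dklen=128)


def prehash_batch(headers, workers=4):
    """Hash every header; each one is independent, so the loop can be split."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(scrypt_prehash, headers))


# Dummy 80-byte headers that differ only in the trailing 4-byte nonce.
headers = [bytes(76) + nonce.to_bytes(4, "little") for nonce in range(8)]
parallel = prehash_batch(headers)
sequential = [scrypt_prehash(h) for h in headers]
print(parallel == sequential, len(parallel[0]))
```

The results are identical to the sequential loop because no iteration depends on another, which is exactly what makes a parallel for (or an OpenMP pragma on Linux) applicable.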

Update!

I've just added some special code for Geforce Titan. It is compiled against device capability 3.5 and makes use of the new funnel shifter, it keeps the register count tidy (64) and it consumes very little shared memory. And as extra effort, it is automatically enabled when you have a Titan. I am really curious what khash this will produce.

Now I need to get my command line overrides for device selection and launch configuration done. An update within the next 6 hours is likely!

Christian

cbuchner1

hero member

Activity: 756

Merit: 502

NOTE: I was told that coinotron is a Stratum pool, so set up a local getwork->stratum proxy and connect cudaMiner to http://localhost:9332 instead. I really shouldn't be doing your homework for setting up this app. ;)
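Before pointing the miner at the proxy, it can save some head-scratching to confirm that something is actually listening on the local port. A minimal check, assuming the proxy uses port 9332 as in the URL above (`is_listening` is a hypothetical helper name):

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print("proxy reachable:", is_listening("localhost", 9332))
```

If this prints `False`, the proxy is not running (or is bound to a different port), and cudaMiner's json_rpc_call will fail no matter which pool credentials are passed with -O.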

BaghdadSteve

newbie

Activity: 20

Merit: 0

Quote from: AzNmeowmeow on April 07, 2013, 10:44:54 PM

Quote from: aigeezer on April 07, 2013, 07:48:45 PM

Thanks for the CUDA miner. I've certainly got uses for it.

I gave it a quick try hoping to use the coinotron pool and got:

E:\cudaminer-2013-04-06>cudaminer -t 1 -o http://coinotron.com:3334 -O myworkername:mypassword
*** CudaMiner for nVidia GPUs by Christian Buchner ***
This is version 2013-04-04 (alpha)
based on pooler-cpuminer 2.2.3 (c) 2010 Jeff Garzik, 2012 pooler
Cuda additions Copyright 2013 Christian Buchner
My donation address: LKS1WDKGED647msBQfLBHV3Ls8sveGncnm

[2013-04-07 21:36:47] 1 miner threads started, using 'scrypt' algorithm.
[2013-04-07 21:36:47] HTTP request failed: Empty reply from server
[2013-04-07 21:36:47] json_rpc_call failed, retry after 15 seconds
[2013-04-07 21:37:02] HTTP request failed: Recv failure: Connection was reset
[2013-04-07 21:37:02] json_rpc_call failed, retry after 15 seconds
^C
E:\cudaminer-2013-04-06>

I'm a tired old man - see anything wrong with what I did? I used actual name and password, of course,
but I'll double-check them. ;)

Yeah, I'm getting a similar problem. Can anyone give me the instructions I'm missing?

I'm getting this same issue on my GTX 670 machine. Looking forward to testing this once I can get it to work - looking good so far, though.

Ryu.Hayabusa

member

Activity: 84

Merit: 10

Great work! I was running BFGMiner on a 9800GTX+ and was only getting ~2khash/s, which is terrible. Just ran this and it gave a near 10x increase in speed to ~19khash/s.

cbuchner1

hero member

Activity: 756

Merit: 502

I ran some initial tests on GTX 660Ti this morning. It looks like my newly optimized CUDA kernels up the khash from 112 to about 135 khash tops.

Don't know why JSON API calls would be failing, other than the pool being under DDOS attack or heavy load.

Maybe I should still allow the single memory allocation as a fallback, in case chunked allocation gets too slow?

Christian

AzNmeowmeow

member

Activity: 69

Merit: 10

Quote from: aigeezer on April 07, 2013, 07:48:45 PM

Thanks for the CUDA miner. I've certainly got uses for it.

I gave it a quick try hoping to use the coinotron pool and got:

E:\cudaminer-2013-04-06>cudaminer -t 1 -o http://coinotron.com:3334 -O myworkername:mypassword
*** CudaMiner for nVidia GPUs by Christian Buchner ***
This is version 2013-04-04 (alpha)
based on pooler-cpuminer 2.2.3 (c) 2010 Jeff Garzik, 2012 pooler
Cuda additions Copyright 2013 Christian Buchner
My donation address: LKS1WDKGED647msBQfLBHV3Ls8sveGncnm

[2013-04-07 21:36:47] 1 miner threads started, using 'scrypt' algorithm.
[2013-04-07 21:36:47] HTTP request failed: Empty reply from server
[2013-04-07 21:36:47] json_rpc_call failed, retry after 15 seconds
[2013-04-07 21:37:02] HTTP request failed: Recv failure: Connection was reset
[2013-04-07 21:37:02] json_rpc_call failed, retry after 15 seconds
^C
E:\cudaminer-2013-04-06>

I'm a tired old man - see anything wrong with what I did? I used actual name and password, of course,
but I'll double-check them. ;)

Yeah, I'm getting a similar problem. Can anyone give me the instructions I'm missing?

vonross2012

member

Activity: 102

Merit: 10

I'm running about 10khash/s faster with this over Guiminer. I didn't adjust any of the settings yet. Ty

wndrbr3d

hero member

Activity: 914

Merit: 500

Quote from: wndrbr3d on April 07, 2013, 08:23:24 PM

Quote from: cbuchner1 on April 07, 2013, 07:07:33 PM

wndrbr3d: pass the --no-autotune option please or wait a few minutes more... It could also be stuck at requesting a LONGPOLL connection with the pool. Longpoll can be disabled in options, too.

Still no bueno, although you're getting closer

This version doesn't crash on start up, but now it just seems to not mine. Not entirely sure yet.

Here's what I see happening in GPU-Z:

GPU load seems to go up and down some while the host CPU (i5-3570) gets pegged. Perhaps the CPU can't feed the massive number of SPs, since the SHA256 calculation happens on the CPU?

Dove into the code and I think there's something going on with the chunked memory allocation routine in find_optimal_blockcount(). I had it log the warps through the loop; you can see the two threads doing their allocations, but things start getting REAAAALLY slow... ;)

Not sure what could cause this. When I set -t 1, it flies through this no problem so I imagine it has something to do with the two GPU threads being kicked off at once.

Hope this helps!
