
Topic: [ANN] cudaMiner & ccMiner CUDA based mining applications [Windows/Linux/MacOSX] - page 1120.

hero member
Activity: 756
Merit: 502
Any news on a secondary download source? Dropbox, GitHub, SourceForge?

You can get the source code from GitHub now, but not the binaries. For Linux compilation this should suffice.

hero member
Activity: 756
Merit: 502
From 8khash to 28khash. Interesting. Now, wonder what the 8800GTX will do...

That's more like what I would have expected from these cards.

Strangely, when I enable texture caching, the performance determined during autotune is about 10-25% higher than without the cache. But the performance achieved during actual mining is about 30% lower than without the cache. So why does the measured advantage turn into a disadvantage? This discrepancy needs to be understood before I can put out another version. I've even tried to completely randomize the input data during autotune - but no change. I really want to get that measured gain into the actual mining. I hope it's not just an illusion.
full member
Activity: 176
Merit: 100
Whoa, never mind Linux. This just happened when I disabled Aero/desktop composition:


From 8khash to 28khash. Interesting. Now, wonder what the 8800GTX will do...

I do think the resolution of the auto-tune is a bit sketchy, though. It seems to fly through the khash/sec timings far faster than it can get an accurate reading, which leaves many of the test results all over the place (20... 18... 20... 22... 18...). There's stuff going on in the background (like drawing on the screen) that I'm sure causes bumps in readings it never tests twice. Maybe increase the test duration for each step, and lock into multiples of two? I can't imagine "13x3" serving any better purpose than a rounded number like 14x2 or such... (or am I wrong there?)
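For what it's worth, here's roughly what I mean - a sketch only, not the actual autotune code (the kernel name, batch size and repeat count are placeholders): time each candidate config several times and keep the median, so one background hiccup can't skew the result.

Code:
#include <algorithm>
#include <vector>
#include <cuda_runtime.h>

__global__ void scrypt_core_kernel() { }   // stand-in for the real scrypt kernel

// Time one launch configuration 'repeats' times and return the median kHash/s.
float median_khash(dim3 grid, dim3 block, int hashes_per_launch, int repeats)
{
    std::vector<float> khash(repeats);
    cudaEvent_t t0, t1;
    cudaEventCreate(&t0);
    cudaEventCreate(&t1);
    for (int i = 0; i < repeats; ++i) {
        cudaEventRecord(t0);
        scrypt_core_kernel<<<grid, block>>>();
        cudaEventRecord(t1);
        cudaEventSynchronize(t1);
        float ms = 0.0f;
        cudaEventElapsedTime(&ms, t0, t1);
        khash[i] = hashes_per_launch / ms;   // hashes per millisecond == kHash/s
    }
    cudaEventDestroy(t0);
    cudaEventDestroy(t1);
    std::nth_element(khash.begin(), khash.begin() + repeats / 2, khash.end());
    return khash[repeats / 2];               // median is robust against a one-off spike
}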
full member
Activity: 176
Merit: 100
Well, these old cards don't have dynamic clocks for 2D/3D modes - which is why they get so damn hot while just sitting idle. I do think, however, that if Linux gives that much of a performance boost, it'd be worth just dumping Ubuntu on these things to mine with. They're "shell" computers anyway - optimally they'll just sit on a shelf connected to power and network, remote-controlled for mining. There are TONS of motherboards, hard drives, CPUs, memory sticks, and GPUs lying around that I'd love to put to work while the shop doesn't have to pay for power Wink You got a recommendation for a Linux distro that'll do the job best? Cheesy
hero member
Activity: 756
Merit: 502
CUDA driver release 304.54 (Linux) or 306.94 (Windows) or later is required for CUDA 5.0 apps like cudaMiner. Because I am not doing any error checking yet, the program will simply crash if these requirements are not met.
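Just to illustrate what is missing - this is not in the current code - a wrapper like the following around every CUDA runtime call would at least print why the program bails out (e.g. cudaErrorInsufficientDriver on an old driver) instead of crashing silently:

Code:
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err = (call);                                     \
        if (err != cudaSuccess) {                                     \
            fprintf(stderr, "CUDA error at %s:%d: %s\n",              \
                    __FILE__, __LINE__, cudaGetErrorString(err));     \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Example use: CUDA_CHECK(cudaSetDevice(0));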

About 8800/9800 card performance: I am seeing the same performance variations on an nVidia 9600M GT on Windows. Sometimes 4 kHash, sometimes 6 kHash.

Linux: I get a solid 9.6 kHash.

I am starting to believe there's something wrong with the Windows drivers for very old card models.

Could be that the device is not clocking up for CUDA workloads? Have you tried running any kind of DirectX or OpenGL app simultaneously, to see if that makes it get up to speed?

UPDATE: the texture cache feature seems to work in 1D and 2D modes now, but does not really make things faster yet. I do get accepted and verified shares though (happy!)

UPDATE2: I may have solved the excessive CPU utilization problem on Windows, too.

Christian
full member
Activity: 176
Merit: 100
Well, I definitely appreciate that someone's put some work into an nVidia miner!  Grin

Maybe I'm alone here, but I kinda think most of us *aren't* going to go out and buy all-new cards just to mine Litecoin. Maybe. Maybe not. I dunno. But the most valuable use I have for it now is going through a junk-pile at the shop and pulling out all the 8000-series and higher cards and building mining systems for them (while the shop owner and I work together mining Bitcoin, of course... hehe).

That said, the best card that's been in the pile so far is a 9800GT (which was kinda impressive - thought it was an 8800). So I've got a 9800GT and a 8800GTX working right now with this cudaMiner.

Here's the problem I ran into. Both are showing all-over-the-map performance variations. The 8800GTX was previously cranking out 34-36 khps (with accepted results); when I moved to a 64-bit Windows 7 SP1 install (previously 32-bit Vista SP0 from the initial OEM install), it shot up to ~44 khps. However, after updating the drivers (which also let me crank the fan speed higher), it fell through the floor and now lingers around 16 khps.

And that 9800GT? It was cranking out a pretty pathetic 16 khps under a 32-bit Win7 SP0 install. When I moved that up to Win7 SP1 x64, it again shot up to ~24 khps, but that wasn't stable either - the next time I restarted the miner, it was only doing... EIGHT... YES... EIGHT! 8 khps.


I've been playing with different driver versions, and it seems that cudaMiner won't run at all with any drivers below version 300 (it just silently crashes/exits without any output other than the initial banner - not even an error log entry). Can't make any sense of it... :/
hero member
Activity: 756
Merit: 502
cudaMiner's inconsistent CPU usage is a topic that I will be working on. You can currently only play with the -i flag to see if it makes a difference.

I think I found out what is wrong with the texture cache: I was not computing the texel coordinates correctly - in particular, I failed to add a block- and warp-specific texel offset. Results do validate now, but I see a performance degradation instead of a gain.

I will have to determine whether it is better to use 2-dimensional texturing or a single 1-dimensional linear texture. I may even allow passing the dimensionality in via the -C flag directly Wink
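
To make the offset issue concrete, the 1D case looks roughly like this (a simplified sketch, not the real kernel - the names and the per-warp layout constants are just for illustration):

Code:
// The whole scratchpad bound to one linear 1D texture of uint4 texels.
texture<uint4, 1, cudaReadModeElementType> texScratch;

// Fetch uint4 number 'k' (0..7) of the 128-byte block written by 'lane'
// at scrypt iteration 'j', for the warp with global index 'warpSlot'.
__device__ uint4 scratch_fetch(int warpSlot, int lane, int j, int k)
{
    const int TEXELS_PER_WARP = 1024 * 32 * 8;   // 1024 iterations * 32 lanes * 8 uint4
    int base  = warpSlot * TEXELS_PER_WARP;      // the block+warp specific offset I forgot
    int texel = j * (32 * 8) + lane * 8 + k;     // coordinate local to this warp's slice
    return tex1Dfetch(texScratch, base + texel);
}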

Christian
newbie
Activity: 19
Merit: 0
cudaMiner shows CPU usage near 100%, how can I fix it?
newbie
Activity: 19
Merit: 2
Got 185 kH/s on a GTX 680 at 1214/6038

Thanks for the software!
hero member
Activity: 756
Merit: 502
A few days ago I ordered a used 560Ti 448 core edition (~130 Euros) because of the stellar performance figures.

I believe the high memory bandwidth of the 500 series cards is mainly responsible for their performance. And the ratio of core count to memory throughput seems rather well balanced for this type of application.

The Kepler series (6xx) seems to have too many CUDA cores and a memory interface that isn't any better than the 500 series. In other words: too much compute power in relation to bandwidth.

About future optimization possibilities:

I do believe that adding a LOOKUP_GAP implementation for factors 2 and 3 may boost the performance slightly - and more significantly for Kepler cards and the GTX Titan (250 kHash for a non-overclocked Titan seems really low).
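
The idea in a nutshell - plain host-side C rather than the CUDA kernel, and block_mix / xor_block are just stand-ins for the real salsa20/8 routines, so treat this as a sketch: store only every GAP-th scratchpad entry and recompute the skipped ones on demand in the second loop.

Code:
#include <stdint.h>
#include <string.h>

#define GAP 2                          /* lookup gap factor (2 or 3) */
#define N   1024                       /* scrypt-(1024,1,1) as used by Litecoin */

void block_mix(uint32_t B[32]);                       /* salsa20/8 BlockMix - not shown */
void xor_block(uint32_t X[32], const uint32_t Y[32]); /* X ^= Y - not shown */

void smix_lookup_gap(uint32_t X[32], uint32_t *V)     /* V holds (N/GAP)*32 words */
{
    uint32_t T[32];
    int i, j, k;

    /* first loop: keep only every GAP-th state, advance X as usual */
    for (i = 0; i < N; i++) {
        if (i % GAP == 0)
            memcpy(&V[(i / GAP) * 32], X, 32 * sizeof(uint32_t));
        block_mix(X);
    }

    /* second loop: random lookups - recompute the states that were skipped */
    for (i = 0; i < N; i++) {
        j = X[16] & (N - 1);                          /* Integerify(X) mod N */
        memcpy(T, &V[(j / GAP) * 32], 32 * sizeof(uint32_t));
        for (k = 0; k < j % GAP; k++)                 /* at most GAP-1 extra BlockMix steps */
            block_mix(T);
        xor_block(X, T);
        block_mix(X);
    }
}

With GAP=2 the scratchpad shrinks by half at the cost of, on average, half an extra BlockMix per lookup - exactly the trade you want on cards that have compute to spare but not bandwidth.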

I think that by using some inline PTX assembly for the xor_salsa implementation we can get another slight boost, and maybe also a reduction in kernel register count.
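
For example - just a sketch of the kind of thing I mean, none of this is in the miner yet - a rotate-left helper using the funnel-shift instruction, which only exists on sm_32+ parts, with a plain-C fallback for older GPUs:

Code:
// rotate-left, as used all over the xor_salsa (Salsa20/8) quarter rounds
__device__ __forceinline__ unsigned int rotl32(unsigned int x, unsigned int n)
{
#if __CUDA_ARCH__ >= 320
    unsigned int r;
    // funnel shift with wrap == rotate; a single instruction on GK110-class parts
    asm("shf.l.wrap.b32 %0, %1, %1, %2;" : "=r"(r) : "r"(x), "r"(n));
    return r;
#else
    return (x << n) | (x >> (32 - n));
#endif
}

// one quarter-round step would then read e.g.:  b ^= rotl32(a + d, 7);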

I have doubts about the potential and/or feasibility of the texture cache. The texture cache would certainly work better with a very small scratchpad - a smaller lookup table improves the cache hit-to-miss ratio - but that might require such a high LOOKUP_GAP value that any memory performance benefit is offset by the required extra computation.

Christian
sr. member
Activity: 840
Merit: 251

Thanks for this! I've updated the sheet with the Quadro cards I've been messing with (600, 4000 and 4600). I haven't really had enough time to mess with the settings much, so I pretty much let autotune do its thing and then used whatever kernel it decided on (and noted it in the spreadsheet).
hero member
Activity: 675
Merit: 514
My only prior experience is with SourceForge, but I will see how I can get started on GitHub.
UPDATE: I think they've removed the feature to serve binary distributions as separate downloads.
SourceForge is OK, I think. You could use that for the binaries.
hero member
Activity: 756
Merit: 502
Will you improve the SHA-256 version? I see that your miner achieves a good khash rate, so I can't wait for a fully working GPU version Smiley

No motivation to do so, as Bitcoin mining is so unprofitable.
hero member
Activity: 756
Merit: 502
Out of curiosity, how much further do you think you can push nVidia cards? Do you see any improvements coming any time soon, or if we see another large improvement, will it be due to an unusual find?

My crystal ball is currently malfunctioning.  I advise that you consult a fortune teller of your choosing Wink
newbie
Activity: 19
Merit: 0
Will you improve the SHA-256 version? I see that your miner achieves a good khash rate, so I can't wait for a fully working GPU version Smiley
newbie
Activity: 47
Merit: 0
Out of curiosity, how much further do you think you can push nVidia cards? Do you see any improvements coming any time soon, or if we see another large improvement, will it be due to an unusual find?
hero member
Activity: 756
Merit: 502
When compiling 04-14 in Linux (Ubuntu 12.04), I'm getting the following message not seen in 04-09:

It's a known problem - try targeting a 32-bit executable, as shown in configure.sh.

g++-multilib, ia32-libs and libcurl4-dev:i386 should be installed prior to that.
hero member
Activity: 756
Merit: 502
Christian, could you just post the source to GitHub and host the binaries there?

My only prior experience is with SourceForge, but I will see how I can get started on GitHub.

UPDATE: I think they've removed the feature to serve binary distributions as separate downloads.
hero member
Activity: 756
Merit: 502
Thank you for your work!  Smiley
I think you did a great job!

What I miss is a variable to control the system/GPU load.
The --interactive flag does not really work for me; I even experienced greater desktop lag with "interactive 1"...

For interactive mode you need to let autotune choose a smaller workload. Manually specifying the same -l parameter as for non-interactive mode is not a good idea.

Interactive mode tries to schedule around 60 individual CUDA kernel launches per second, with a millisecond of CPU+GPU sleep time in between -> 60 frame updates on the display should remain possible, so you can watch movies or porn or whatever while mining Wink
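
The scheduling idea, roughly (a simplified sketch, not the actual miner code - kernel name and batch size are placeholders):

Code:
#include <chrono>
#include <thread>
#include <cuda_runtime.h>

__global__ void small_scrypt_batch() { }    // stand-in for a small batch of hashes

// keep each launch well under 1/60 s and nap between launches,
// so the display driver gets a time slot for every frame
void interactive_loop(volatile bool *keep_mining)
{
    dim3 grid(30), block(128);              // deliberately small workload per launch
    while (*keep_mining) {
        small_scrypt_batch<<<grid, block>>>();
        cudaDeviceSynchronize();            // block the CPU thread instead of spin-polling
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
}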

Christian
newbie
Activity: 17
Merit: 0
Thank you for your work!  Smiley
I think you did a great job!

What I miss is a variable to control the system/GPU load.
The --interactive flag does not really work for me; I even experienced greater desktop lag with "interactive 1"...