
Topic: [ANN] cudaMiner & ccMiner CUDA based mining applications [Windows/Linux/MacOSX] - page 1030. (Read 3426921 times)

sr. member
Activity: 350
Merit: 250
I run it, and on the very first hash attempt the driver crashes and recovers; then it carries on.

OK, so I had everything installed, but I can't for the life of me figure out how to clone the Git repository, so I am going to give up lol
legendary
Activity: 2002
Merit: 1051
ICO? Not even once.
Do you mean that when you first run cudaMiner, it hangs for so long that your driver crashes?
Because for me it does hang for some time, but not long enough to trigger driver recovery.

Anyway, after the initial hang, everything is perfect for me.

The only time I get crashes is when cudaMiner states that there's not enough memory (with the current kernel configuration). If it's the same for you, you could try T20x1 and maybe even a bit higher, until you're using the maximum amount of VRAM. I only have a Kepler, so I'm not entirely sure, but that's what I'd try.
sr. member
Activity: 350
Merit: 250
T16x1 gives me 2.2 GB of memory used, hashing at 3.07 khash/s.

It does cause a driver crash, and debug output says "DEBUG: got new work in 0ms" (sometimes 15ms),
so I take it that it is still running OK.
DBG
member
Activity: 119
Merit: 100
Digital Illustrator + Software/Hardware Developer
So 2012 for sure? And I take it that I will need to compile it all on the system with the GPU in it? Can't do it on my laptop, for example?

When it comes to compiling/cross-compiling, all you need are the correct libraries and some time (e.g. you could compile this on a computer that only has integrated graphics).

POST HI-JACKING (semi-related)


I actually have a setup for nightly builds (I break things a lot when I code, so I needed a continuous compiling/integration solution) and would be happy to compile/host new cudaMiner builds. The settings I have now only create a new package when there has been a commit, and then it waits 30 minutes before queuing the build. Travis CI (which I'm not a huge fan of, but it's godlike compared to Jenkins) is available to all GitHub projects, so I'll probably pull the current master branch and try to come up with a working YAML file. Then, if cbuchner1 wants to use it (assuming I can nail down all the prerequisites), he can.
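For a Linux CI job, the build would likely boil down to commands along these lines. This is only a sketch: the repository path is the commonly referenced cbuchner1/CudaMiner, and the autotools script names and CUDA install location are assumptions that may differ from the actual repo.

```shell
# Hypothetical nightly-build steps for cudaMiner on Linux.
# Script names (autogen.sh/configure) and the CUDA path are assumed,
# not confirmed against the repo; adjust to match the checkout.
git clone https://github.com/cbuchner1/CudaMiner.git
cd CudaMiner
./autogen.sh                              # generate the configure script
./configure --with-cuda=/usr/local/cuda   # point the build at the CUDA toolkit
make -j"$(nproc)"                         # parallel build
```

A CI job would run exactly this sequence after each commit, which is all the "working YAML file" needs to wrap.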
sr. member
Activity: 350
Merit: 250
I will give it a try with the new drivers and see if it runs OK.
legendary
Activity: 2002
Merit: 1051
ICO? Not even once.
It was using 2 GB out of 3 GB with the files I was using, but it's taking forever to get this compiled to run a test, as I don't have copies of both; I was only sent one. And a new driver has just been released, which has now cut my TV off while installing, lol.

Try going a bit higher, like T16x1 or T18x1.
sr. member
Activity: 350
Merit: 250
Will my 780 outlast the N rise?
And 12 GB? Mmmm, me want one.
hero member
Activity: 756
Merit: 502
there are some GTX 660 (non-Ti) with 4 GB DDR3 RAM sold in bulk for 150 Euros. 192 bit memory interface. Possibly OEM ware. 4 SMX.

This would be ideal for scrypt-jane. The only downside is that it is a dual slot card requiring an extra 6 pin power connector.

My next miner will have 4 GT 640 cards and 1 GTX 660 card in it. Total of 20 GB video RAM. Wink  Possibly hitting 12 kHash/s.

Due to the large VRAM, these cards will survive the next N-factor increase, which is due in May this year. Some 2 GB cards will run into problems then.

the latest 12 GB Tesla cards would be fun to mine with!

Christian
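A quick arithmetic check on that planned rig. The per-card hashrates below are assumptions based on figures quoted elsewhere in this thread (~1.96 kH/s for a GT 640-class part, ~2.44 kH/s for a non-Ti 660), so treat the total as a ballpark:

```shell
# Sanity check: 4x GT 640 (4 GB each) + 1x GTX 660 (4 GB).
vram=$((4 * 4 + 4))                              # total VRAM in GB
echo "Total VRAM: ${vram} GB"                    # prints 20 GB
total=$(awk 'BEGIN { print 4 * 1.96 + 2.44 }')   # assumed per-card rates
echo "Estimated total: ${total} kH/s"            # prints 10.28 kH/s
```

So the 20 GB figure checks out exactly, while the hashrate estimate lands a bit under the hoped-for 12 kHash/s.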
sr. member
Activity: 350
Merit: 250
It was using 2 GB out of 3 GB with the files I was using, but it's taking forever to get this compiled to run a test, as I don't have copies of both; I was only sent one. And a new driver has just been released, which has now cut my TV off while installing, lol.
legendary
Activity: 2002
Merit: 1051
ICO? Not even once.
Edit: I didn't trust the T4x4 because of the driver crash. T14x1 is what autotune sets me up with, at around 2.7 khash/s and no driver crash. I did try T10x1 and T20x1, which were posted earlier, but they gave hash rates of about 1.5 khash/s for me and crashed my driver as well. I am using the latest Windows 7 driver for it.

I am also mining with a 64-bit compile, which may have an effect, as 32-bit made a huge difference on the official cudaMiner release.

Check your VRAM usage with each setup. As far as I've noticed, more VRAM = faster hashrate with scrypt-jane.
Autotune gave me K13x1, which used 1694 MB of VRAM, but after manually picking and testing values
I ended up with K14x1, which uses 1822 MB (out of 2 GB). That's the maximum I could get cudaMiner to use without the driver crashing (note: idle memory use is 0 MB, since it's a secondary card), and it resulted in slightly faster hashrates.

Now I'm reaching 2.44 kH/s with a non-Ti GTX 660.
hero member
Activity: 756
Merit: 502
Is it normal that cudaMiner is not using 100% of the GPU RAM? Using normal scrypt mining.

Code:
+------------------------------------------------------+
| NVIDIA-SMI 5.319.17   Driver Version: 319.17         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K10.G2.8GB    Off  | 0000:05:00.0     Off |                    0 |
| N/A   58C    P0   108W / 117W |     1085MB /  3583MB |     99%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K10.G2.8GB    Off  | 0000:06:00.0     Off |                    0 |
| N/A   53C    P0   108W / 117W |     1085MB /  3583MB |     99%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K10.G1.8GB    Off  | 0000:85:00.0     Off |                    0 |
| N/A   59C    P0   115W / 117W |     1085MB /  3583MB |     99%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla K10.G1.8GB    Off  | 0000:86:00.0     Off |                    0 |
| N/A   68C    P0   117W / 117W |     1085MB /  3583MB |     99%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0     24937  ./CudaMiner/cudaminer                               4297MB  |
|    1     24937  ./CudaMiner/cudaminer                               4297MB  |
|    2     24937  ./CudaMiner/cudaminer                               4297MB  |
|    3     24937  ./CudaMiner/cudaminer                               4297MB  |
+-----------------------------------------------------------------------------+

Perfectly normal.
sr. member
Activity: 350
Merit: 250
So 2012 for sure? And I take it that I will need to compile it all on the system with the GPU in it? Can't do it on my laptop, for example?
And how do I clone the Git repository into Visual Studio?

I figured out the libcurl install from their guide, which literally states to run the command prompt and "then run 'nmake vc' in curl's root directory."
Once I know how to clone the Git repository, I should be ready.
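If cloning from inside Visual Studio proves awkward, the plain command line works too. A sketch, assuming Git for Windows (or msysgit) is installed and that the repo is the commonly referenced cbuchner1/CudaMiner:

```shell
# Clone from an ordinary Git prompt, then open the project files
# in Visual Studio afterwards.
git clone https://github.com/cbuchner1/CudaMiner.git
cd CudaMiner
git log -1 --oneline   # confirm which commit you are building
```

Pulling new commits later is then just `git pull` in the same directory.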
full member
Activity: 196
Merit: 100
Don't waste your time with 2013. The CUDA 5.5 toolkit is not compatible with it.
newbie
Activity: 19
Merit: 0
Is it normal that cudaMiner is not using 100% of the GPU RAM? Using normal scrypt mining.

Code:
+------------------------------------------------------+
| NVIDIA-SMI 5.319.17   Driver Version: 319.17         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K10.G2.8GB    Off  | 0000:05:00.0     Off |                    0 |
| N/A   58C    P0   108W / 117W |     1085MB /  3583MB |     99%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K10.G2.8GB    Off  | 0000:06:00.0     Off |                    0 |
| N/A   53C    P0   108W / 117W |     1085MB /  3583MB |     99%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K10.G1.8GB    Off  | 0000:85:00.0     Off |                    0 |
| N/A   59C    P0   115W / 117W |     1085MB /  3583MB |     99%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla K10.G1.8GB    Off  | 0000:86:00.0     Off |                    0 |
| N/A   68C    P0   117W / 117W |     1085MB /  3583MB |     99%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0     24937  ./CudaMiner/cudaminer                               4297MB  |
|    1     24937  ./CudaMiner/cudaminer                               4297MB  |
|    2     24937  ./CudaMiner/cudaminer                               4297MB  |
|    3     24937  ./CudaMiner/cudaminer                               4297MB  |
+-----------------------------------------------------------------------------+
sr. member
Activity: 350
Merit: 250
Using T4x4 with -I 0 gives me 2.88 khash/s in Windows 7. It does cause my drivers to crash when it initially loads, but they then recover, my GPU is still at 99%, and 2.3 GB of memory is used. So I take it that the miner is still running, as it would normally hang after a driver crash.

So a 660 Ti is faster at this than my GTX 780? Whyyyy? This algorithm is going to take a lot of work to figure out. Also, are we finding better speeds in Linux than in Windows?

I stopped trying to get the Linux version running because I couldn't get CUDA 5.5 to install in Ubuntu 13.10, so I don't know whether it gets better performance or not.

Edit: I didn't trust the T4x4 because of the driver crash. T14x1 is what autotune sets me up with, at around 2.7 khash/s and no driver crash. I did try T10x1 and T20x1, which were posted earlier, but they gave hash rates of about 1.5 khash/s for me and crashed my driver as well. I am using the latest Windows 7 driver for it.

I am also mining with a 64-bit compile, which may have an effect, as 32-bit made a huge difference on the official cudaMiner release.

EDIT 2: OK, so I am going to follow the example Treggar posted on compiling this. I am using Express 2013, but I do have every other version (free from my university ;-) ), so I am going to try to compile my own 32-bit build from that. The instructions I will be using are as follows:

1) Install Visual Studio 2012 Express (free from Microsoft)
2) Install the CUDA 5.5 toolkit (free from NVIDIA)
3) Download the libcurl source & read the README on how to build the library for Windows
4) Download the pthreads library for Windows - I downloaded the prebuilt lib
5) Open a Visual C++ command prompt & build the libcurl library - I built a static lib and included it in the exe
6) Clone the Git repository inside Visual Studio
7) Edit the properties of the project to include the libcurl & pthreads includes & libs
8) Build release
9) Profit from YAC!!~?
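For step 5, the libcurl part amounts to something like the following from the Visual Studio command prompt. Only `nmake vc` comes from curl's guide as quoted earlier in the thread; the directory path is an assumption about where you unpacked the source:

```shell
REM Run from the "VS2012 x86 Native Tools Command Prompt"
REM so that nmake and the compiler are on PATH.
cd C:\curl    REM wherever you unpacked the libcurl source (assumed path)
nmake vc      REM builds the library, per curl's own guide
```

The resulting static lib is what gets referenced in the project properties in step 7.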
newbie
Activity: 4
Merit: 0

Are you using the latest version from git or a binary release? I see a commit message on Dec 28th that says "add back support for chunked memory allocation and texture cache to Kepler kernel. Slight speed-ups with -C 1 are seen." which implies he removed it at one point, possibly when upgrading to CUDA 5.5.


I was using the latest binary, the 12-18 release. I'll try compiling from git, thanks.

edit: Is there a guide somewhere for compiling on Windows? I've got all the components, but I'm unsure what to do with them.
sr. member
Activity: 350
Merit: 250
Thanks :-)

Would be great if we could get it doing lots at the same time. Although my 780 seems to be sitting at 100% usage even though I have the intensity set to 0.
hero member
Activity: 756
Merit: 502
I am breaking some speed records for Yacoin mining today Wink

WOW. 1.96 kHash/s on a GT 750M with 4 GB DDR3 VRAM using the latest GitHub code, with -l K2x8. This is a laptop. And it's only using 2077 MB out of 4096, including running an Xorg display on Kubuntu 12.04 64-bit. And the device remains usable even while mining without the -i 1 flag!

Code:
[2014-01-07 12:20:10] GPU #0: GeForce GT 750M with compute capability 3.0
[2014-01-07 12:20:10] GPU #0: interactive: 0, tex-cache: 0 , single-alloc: 0
[2014-01-07 12:20:10] GPU #0: using launch configuration K2x8
[2014-01-07 12:20:12] GPU #0: GeForce GT 750M, 1.87 khash/s
[2014-01-07 12:20:18] Stratum detected new block
[2014-01-07 12:20:19] GPU #0: GeForce GT 750M, 1.89 khash/s
[2014-01-07 12:20:33] Stratum detected new block
[2014-01-07 12:20:33] GPU #0: GeForce GT 750M, 1.93 khash/s
[2014-01-07 12:20:54] GPU #0: GeForce GT 750M, 1.94 khash/s
[2014-01-07 12:20:54] accepted: 1/1 (100.00%), 1.94 khash/s (yay!!!)
[2014-01-07 12:21:53] GPU #0: GeForce GT 750M, 1.96 khash/s
[2014-01-07 12:21:53] accepted: 2/2 (100.00%), 1.96 khash/s (yay!!!)
[2014-01-07 12:22:43] Stratum detected new block
[2014-01-07 12:22:44] GPU #0: GeForce GT 750M, 1.96 khash/s

Code:
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 750M     Off  | 0000:01:00.0     N/A |                  N/A |
| N/A   90C  N/A     N/A /  N/A |   2077MiB /  4095MiB |     N/A      Default |

I think the equivalent desktop part would be a GT 640 with 4 GB DDR3 memory. In my testing on Windows I haven't seen those 1.9 kHash/s so far. My next mining build might consist of a mainboard running 5 GT 640 cards with 4 GB each.


And a GTX 660 Ti DirectCU II OC from Asus on Ubuntu 9.04 32-bit (EDIT: previously incorrectly stated 12.04 64-bit):

Code:
[2014-01-07 13:22:14] GPU #0: GeForce GTX 660 Ti with compute capability 3.0
[2014-01-07 13:22:14] GPU #0: interactive: 0, tex-cache: 0 , single-alloc: 1
[2014-01-07 13:22:14] GPU #0: using launch configuration K7x3
[2014-01-07 13:22:16] GPU #0: GeForce GTX 660 Ti, 2.86 khash/s
[2014-01-07 13:22:29] GPU #0: GeForce GTX 660 Ti, 3.09 khash/s
[2014-01-07 13:22:29] accepted: 1/1 (100.00%), 3.09 khash/s (yay!!!)
[2014-01-07 13:22:49] GPU #0: GeForce GTX 660 Ti, 3.12 khash/s
[2014-01-07 13:22:49] accepted: 2/2 (100.00%), 3.12 khash/s (yay!!!)

Code:
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 660 Ti  Off  | 0000:01:00.0     N/A |                  N/A |
| 62%   70C  N/A     N/A /  N/A |   2875MiB /  3071MiB |     N/A      Default |

On my 780 Tis I think I use -l T4x4 (EDIT: correction, it was -l T7x3, but -l T23x1 is actually slightly better) to get 3.2-3.3 kHash/s. I will check again when I get home.

High-end cards are partially idle when doing scrypt-jane. It might be possible to do simultaneous scrypt-jane and scrypt mining to use all the resources on the cards. However, it would require some very smart multithreading logic and simultaneous execution of kernels on the GPU.

Christian
sr. member
Activity: 350
Merit: 250
Ok thanks :-)
My miner seems to sit at around 2.77 khash/s now. It's at 75%, so 3 of 4 accepted. I now have 3 transactions, but my wallet is still only showing 2 actual and saying 3 at the moment. So right now, if the 3rd one is real, I'm at 150 Yacoins when they mature.
newbie
Activity: 34
Merit: 0
T12x32 is way too big for scrypt-jane. That's what you should use on regular scrypt coins. For Yacoin, that kind of config would require something like 50 GB of GPU RAM.
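The ~50 GB figure is easy to verify: scrypt needs 128·r·N bytes of scratchpad per concurrent hash (r = 1 for scrypt-jane), and a "TBxW" launch config runs B·W·32 CUDA threads. The N value below is an assumption about Yacoin's chain state at the time (N = 32768, i.e. 2^15):

```shell
# Scratchpad memory implied by launch config T12x32 on scrypt-jane.
# N=32768 is an assumed value for Yacoin's N at the time of this thread.
blocks=12; warps=32; n=32768
threads=$((blocks * warps * 32))             # threads = blocks x warps x 32
bytes=$((threads * 128 * n))                 # 128 * r * N bytes each, r = 1
echo "${threads} threads -> $((bytes / 1024 / 1024 / 1024)) GiB"
```

That works out to 12288 threads needing 48 GiB of scratchpad, which matches the "something like 50 GB" claim.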

Autotune is not perfect. Look at the chart (run with -D to show debug info), then try configs just outside the populated area of the chart until you've explored all the edges and found your limits. When you go too far, it will print something like this: "GPU #0: Launch config 'T11x2' requires too much memory!"

I'm using T20x1 on my 780 and getting 3.4 kHash/s. T10x2 works well for me as well.

Thanks for the tips. I've tried some, but T10x1 still seems to be the best. T20x1 doesn't perform well at all on my 780: only 1.2 kH/s :\

Driver issue maybe? Running 331.92 here.

I'm running OS X 10.9 on a hackintosh, so I don't know that I can help you out there. I did try it out on two GTX 670s at work today, though, and they got 2 kH/s each with autotune, on 331.20 on Linux. Are you running CUDA 5.5?

For those of you attempting to hash QQC or YBC or any other scrypt-N coin: you're probably going to need to modify some code and recompile. It looks to me like the time value used to calculate N is hardcoded to Yacoin's start time. See GetNfactor() in scrypt-jane.cpp. edit: just reread the last two pages and I see that some of you have already done that. Doh.
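A quick way to find that hardcoded start time is simply to grep the source. A sketch, assuming a cudaMiner checkout in the current directory:

```shell
# Show where GetNfactor() appears so the Yacoin start-time constant
# can be located and adjusted for other scrypt-N coins.
grep -n "GetNfactor" scrypt-jane.cpp
```

From there you would change the time constant for the coin in question and rebuild.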


Can someone please explain this to me? It looks contradictory.

-C, --texture-cache   comma separated list of flags (0/1) specifying which of the CUDA devices shall use the texture cache for mining. Kepler devices will profit.

This says kepler devices will profit.

GPU #0: GeForce GTX 770 with compute capability 3.0
GPU #0: the 'K' kernel ignores the texture cache argument

My 770 is a Kepler device, but cudaMiner says that my kernel ignores the texture cache...

So which is it, and how will my device profit from a launch option that is ignored?

Are you using the latest version from git or a binary release? I see a commit message on Dec 28th that says "add back support for chunked memory allocation and texture cache to Kepler kernel. Slight speed-ups with -C 1 are seen." which implies he removed it at one point, possibly when upgrading to CUDA 5.5.

I can't use all my VRAM either. cbuchner1, what setting are you using to get 3.2 kH/s? And is it using all available memory?

I just checked my Yacoin. What does "immature" mean? I have 99.88 immature Yacoins overnight (well, in 5 hours), but I don't know what it means.

My Yacoin config:
./cudaminer -u coercion.1 -p x -a scrypt-jane -o 127.0.0.1:9942 -l T21x1 -H 1

I just realized I can't run T21x1 since I pulled today. I ran that config all last night, and now I can only run T20x1.

Immature means you can't spend them yet, because they need to be included in more blocks so that a fork doesn't screw you or anyone you transact with.