
Topic: NVIDIA Kepler (K20) from 134MHash/s to 330MHash/s with CUDA - page 7. (Read 73308 times)

newbie
Activity: 49
Merit: 0
Hi,

I have tested the Windows version on Windows 7 64-bit.

When I started the program, cudart32_50.dll or cudart64_50.dll was missing. I installed CUDA for Windows (https://developer.nvidia.com/cuda-downloads) and then everything ran.
This may be a side effect of switching to the CUDA runtime library.

If an error reports a crash at bitconminer.cpp:174, set the parameter -gpugrid to 256 or 512. The crash comes from an internal Windows limit: no calculation on the GPU can run longer than 4 sec.

With GT555M I get:

22.3 - 22.6 MHash/s with OpenCL
22.3 - 22.9 MHash/s with CUDA

I have put all links to the repositories and the errors in the top post.
newbie
Activity: 27
Merit: 0

This version of C:\miners\rpcminer nvopt\rpcminer-mod-cuda.exe is not compatible with the version of Windows you're running.


Er..odd. Nothing changed since last version working from yesterday?

Can you download Dependency Walker from http://www.dependencywalker.com/ (the x86 version) and run it and open the rpcminer-mod-cuda.exe file and see if you get any errors. An error is typically a file shown in red in the top left tree.

I downloaded the x64 version since that is the OS I have. Output below:

http://i.imgur.com/TBOh3eV.png
member
Activity: 79
Merit: 10

This version of C:\miners\rpcminer nvopt\rpcminer-mod-cuda.exe is not compatible with the version of Windows you're running.


Er..odd. Nothing changed since last version working from yesterday?

Can you download Dependency Walker from http://www.dependencywalker.com/ (the x86 version) and run it and open the rpcminer-mod-cuda.exe file and see if you get any errors. An error is typically a file shown in red in the top left tree.
newbie
Activity: 27
Merit: 0
I have updated the Windows build with psychocoder's changes:

https://github.com/cdmackie/rpcminer-mod (use the master branch)

We'll merge them together shortly.

To just run, you only need the bin folder, and run the rpcminer-mod-cuda.exe. There is no need for the ptx files anymore.

To build yourself, you need MSVC 2010 and the CUDA SDK 5.x.

Please post any errors or successes.

This version of C:\miners\rpcminer nvopt\rpcminer-mod-cuda.exe is not compatible with the version of Windows you're running. Check your computer's system information to see whether you need a x86 (32-bit) or x64 (64-bit) version of the program, and then contact the software publisher.

Error message when starting from command line. Win 7 x64
member
Activity: 79
Merit: 10
I have updated the Windows build with psychocoder's changes:

https://github.com/cdmackie/rpcminer-mod (use the master branch)

We'll merge them together shortly.

To just run, you only need the bin folder, and run the rpcminer-mod-cuda.exe. There is no need for the ptx files anymore.

To build yourself, you need MSVC 2010 and the CUDA SDK 5.x.

Please post any errors or successes.
wzl
newbie
Activity: 25
Merit: 0
just compiled it and i'm running it on gtx680, CUDA still new to me, playing with parameters.
Unfortunately I don't have several thousand $ for a K20  Grin
newbie
Activity: 49
Merit: 0
The K20c (http://www.techpowerup.com/gpudb/564/NVIDIA_Tesla_K20c.html) is a high-performance GPU card. These cards are built to run floating-point math calculations and are not meant for a home PC. The home-PC version with the same architecture is the GTX Titan (http://www.techpowerup.com/gpudb/1996/.html).

The GTX680 does not have the new bit rotate (funnel shift) operator, so I think 300+ is not possible.
Theoretical calculation: 1006*8/(3733/160+1194/32)=132 MHash/s   (the magic numbers are the operation counts from the binary for this implementation)

IMO the GTX680 GPU is limited to max 132 MHash/s
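
For anyone who wants to replay the estimate above, here is a minimal sketch of the same arithmetic. Only the two operation counts (3733 and 1194) are described in the post; reading 1006 as the core clock in MHz, 8 as the multiprocessor count, and 160 and 32 as operations per clock per multiprocessor is an assumption, and all constant names are made up for illustration.

```python
# Back-of-envelope GTX680 ceiling, reproducing the formula from the post.
# The meaning attached to each number is an assumption unless noted.

CLOCK_MHZ = 1006       # assumed: GTX680 core clock in MHz
SM_COUNT = 8           # assumed: number of multiprocessors on the GTX680
OPS_FAST = 3733        # from the post: operation count taken from the binary
OPS_SLOW = 1194        # from the post: operation count taken from the binary
FAST_PER_CLOCK = 160   # assumed: fast operations per clock per multiprocessor
SLOW_PER_CLOCK = 32    # assumed: slow (shift-type) operations per clock per multiprocessor


def max_mhash_per_second() -> float:
    """Upper bound in MHash/s: chip-wide clock cycles divided by cycles per hash."""
    cycles_per_hash = OPS_FAST / FAST_PER_CLOCK + OPS_SLOW / SLOW_PER_CLOCK
    return CLOCK_MHZ * SM_COUNT / cycles_per_hash


if __name__ == "__main__":
    print(f"{max_mhash_per_second():.1f} MHash/s")
```

Under these assumptions the result lands between 132 and 133 MHash/s, matching the figure quoted above. The slow term dominates the denominator, which is why a faster rotate would matter so much.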
hero member
Activity: 552
Merit: 500
OK, switched back to faster kernel.

GPU Overview update:

C1070 - old GPU - 30 Streaming Multiprocessors (SM) - ~46MHash/s (slows down with the last patched kernel)
C2050 - old Fermi - 14 SM - ~ 107MHash/s
K20c - new Kepler - 13 SM - ~ 325 - 350 MHash/s

what exactly is K20c ? just so we know here.. and what gpu are you using right now, I cant wait to see 300+ on my 680 ...
member
Activity: 70
Merit: 10
watching both threads with anticipation
newbie
Activity: 49
Merit: 0
OK, switched back to faster kernel.

GPU Overview update:

C1070 - old GPU - 30 Streaming Multiprocessors (SM) - ~46MHash/s (slows down with the last patched kernel)
C2050 - old Fermi - 14 SM - ~ 107MHash/s
K20c - new Kepler - 13 SM - ~ 325 - 350 MHash/s   (options: -aggression=11)

Note: the option -gputhreads changes nothing; all kernels are built to run 256 threads.
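
To compare the architectures in the overview above independently of chip size, the figures can be divided by the SM count. This is only a rough sketch: the numbers are copied from the post, clock speed is ignored, and the dictionary layout and function name are made up for illustration.

```python
# (SM count, low MHash/s, high MHash/s) copied from the overview in the post.
GPU_OVERVIEW = {
    "C1070": (30, 46.0, 46.0),
    "C2050": (14, 107.0, 107.0),
    "K20c": (13, 325.0, 350.0),
}


def mhash_per_sm(overview):
    """Return {card: (low, high)} in MHash/s per streaming multiprocessor."""
    return {
        card: (low / sms, high / sms)
        for card, (sms, low, high) in overview.items()
    }


if __name__ == "__main__":
    for card, (low, high) in mhash_per_sm(GPU_OVERVIEW).items():
        print(f"{card}: {low:.1f} - {high:.1f} MHash/s per SM")
```

Per SM, the K20c comes out at 25 MHash/s or more, against roughly 7.6 for the C2050 and 1.5 for the C1070.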
newbie
Activity: 49
Merit: 0
I have checked in my code at https://github.com/psychocoder-germany/rpcminer-mod. The code is a little bit slower than my first patch. I have moved some calculations to the CPU to save registers.
Now I only need 32 registers for Fermi and have 100% occupancy. There is no need to create ptx files because all versions are inside the binary after compiling.

Sorry that I created a new repo, but it was my first git commit. I am old-school and normally use svn^^

@charliemaggot: Please add windows compile support.
newbie
Activity: 59
Merit: 0
Thanks for the responses guys. I did get this working as i had the wrong port for localhost. 9332 Worked. Only could get one card hashing. Could get 160 easy. 214 was another common Mhash.

300 + Mh but only for 3 or 4 shares then 0.

Back to Cgminer and a steady 170 - 200 for now. I thought this was the gen I had. Fermi, not Kepler. Duh.




member
Activity: 79
Merit: 10
@GimpyPrime The default bitcoinminercuda.ptx file was built to "compute 3.5", which was just from the patch in psychocoder's first post in this thread. It is the latest optimised level for NVIDIA Tesla K20 cards. You need to know the compute level for your card (https://developer.nvidia.com/cuda-gpus) and use the appropriate file. Your GTX 670 should be compute level 3.0, so you would need the .30 file.

You can obviously build them yourself, I was just trying to include as much as possible so it could just be downloaded and run, however I didn't include a note about which file was needed. If you have VS 2010 you can edit the buildcuda.bat file and change the compute number to be appropriate for your card - and change the compiler value if you are using VS2012.

@dentldir Should work with the 30 file on your 660. Latest stable driver? Maybe try again after I rebuild it from psychocoder's latest changes.

@camaro69327 Your card is a 2.0 Fermi device, so not sure you would see much benefit from the Kepler (3.0) optimisation in this thread. Wait and see if psychocoder can create some better optimisations. The app is still using the getwork api, so you need to use http://api.bitcoin.cz:8332 or download their stratum proxy.

@psychocoder I'll start building it for Windows once you get changes done, if you could let me know where they are. Thanks.
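
For the builds that still shipped separate .ptx files, picking the file from the compute level can be sketched as below. The card list only repeats what is stated in this post. The naming pattern bitcoinminercuda.NN.ptx is an assumption inferred from the ".30 file" wording and the bitcoinminercuda.20.ptx name mentioned elsewhere in the thread, and the helper name is hypothetical.

```python
# Compute levels as stated in the post; look up other cards at
# https://developer.nvidia.com/cuda-gpus
KNOWN_CARDS = {
    "Tesla K20": "3.5",
    "GTX 670": "3.0",
    "GTX 660": "3.0",
}


def ptx_file_for(compute_level: str) -> str:
    """Map a compute level such as '3.0' to a .ptx file name.

    Assumed naming pattern: bitcoinminercuda.<major><minor>.ptx
    """
    major, minor = compute_level.split(".")
    if not (major.isdigit() and minor.isdigit()):
        raise ValueError(f"not a compute level: {compute_level!r}")
    return f"bitcoinminercuda.{major}{minor}.ptx"


if __name__ == "__main__":
    for card, level in KNOWN_CARDS.items():
        print(f"{card}: compute {level} -> {ptx_file_for(level)}")
```

According to the post, the default bitcoinminercuda.ptx is the compute 3.5 build, so the selected file has to take the place of that default.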
newbie
Activity: 49
Merit: 0
Please use a pool with getwork support; there is no real stratum support inside the miner.

Tonight (German time) I will post a new repository with my new code. I hope charliemaggot creates a Windows version.
The new code supports all old GPUs (I think back to the GTX9800). No ptx is needed.

I think we can't get a good speedup on old GPUs because they have very slow bit operations.

GPU Overview:

C1070 - old GPU - 30 Streaming Multiprocessors (SM) - ~53MHash/s
C2050 - old Fermi - 14 SM - ~ 90MHash/s
K20c - new Kepler - 13 SM - ~ 325MHash/s
full member
Activity: 196
Merit: 100
I have 2x TITAN but they are on win7 for DirectX.
hero member
Activity: 552
Merit: 500
Might as well jump in... First: no Linux, just a point-and-click old guy on Win 7.

I have 2 - 580 GTX. Using CGminer I get 160-200 Mhash PER card (depends on overclock and using Comp or not).

Trying this... at first I had the "Unable to load CUDA module: 209" error. I grabbed the other .ptx file, bitcoinminercuda.20.ptx. Renamed it and...

Now i am getting "curl return Value = 7"

Kinda getting lost here...lol These are some of the Command lines tried....

rpcminer-mod-cuda.exe -aggression=8 -gpugrid=64 -gputhreads=384 -o - url=http://stratum.bitcoin.cz:3333 -user=####### -password=#####
rpcminer-mod-cuda.exe -aggression=8 -gpugrid=256 -gputhreads=512 - url=http://stratum.bitcoin.cz:3333 -user=###### -password=#####
rpcminer-mod-cuda.exe -url=http://stratum.bitcoin.cz:3333 -user=##### -password=####
rpcminer-mod-cuda.exe -url=http://localhost:8332 -user=##### -password=#### <<(set according to Bitcoin.conf)

"curl return Value = 7"

Thanks for all the hard work you guys do, would really like to get these cards working better (they are embarrassed to announce their terrible Hash rates to all the other cards on my network. Especially the 7970 getting 720 Mhash...lol).


Don't use the stratum URL, use this instead: btcguild.com:8332
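
In libcurl, return code 7 is CURLE_COULDNT_CONNECT, meaning no connection to the host and port could be opened at all. A quick way to rule out a wrong host or port before starting the miner is a plain TCP connect test. The sketch below is independent of rpcminer; the function names and the default port of 8332 are assumptions chosen for illustration.

```python
import socket
from typing import Tuple
from urllib.parse import urlsplit


def parse_host_port(url: str, default_port: int = 8332) -> Tuple[str, int]:
    """Extract (host, port) from a URL; a missing scheme is treated as http."""
    parts = urlsplit(url if "://" in url else "http://" + url)
    if parts.hostname is None:
        raise ValueError(f"no host in {url!r}")
    return parts.hostname, parts.port or default_port


def can_connect(url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = parse_host_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for url in ("http://localhost:8332", "btcguild.com:8332"):
        print(url, "reachable" if can_connect(url) else "NOT reachable")
```

A successful connect only shows that something is listening on that port. It does not show that the server speaks getwork, which the miner still needs.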
hero member
Activity: 552
Merit: 500
This is still very experimental. On Linux I get 350-370 MH/s, while on Windows only max 120 MH/s.

I will try to make a workaround for it.

wow thats a huge jump on linux.. be great to see that for us windows users!

What card are you using?
newbie
Activity: 59
Merit: 0
Might as well jump in... First: no Linux, just a point-and-click old guy on Win 7.

I have 2 - 580 GTX. Using CGminer I get 160-200 Mhash PER card (depends on overclock and using Comp or not).

Trying this... at first I had the "Unable to load CUDA module: 209" error. I grabbed the other .ptx file, bitcoinminercuda.20.ptx. Renamed it and...

Now i am getting "curl return Value = 7"

Kinda getting lost here...lol These are some of the Command lines tried....

rpcminer-mod-cuda.exe -aggression=8 -gpugrid=64 -gputhreads=384 -o - url=http://stratum.bitcoin.cz:3333 -user=####### -password=#####
rpcminer-mod-cuda.exe -aggression=8 -gpugrid=256 -gputhreads=512 - url=http://stratum.bitcoin.cz:3333 -user=###### -password=#####
rpcminer-mod-cuda.exe -url=http://stratum.bitcoin.cz:3333 -user=##### -password=####
rpcminer-mod-cuda.exe -url=http://localhost:8332 -user=##### -password=#### <<(set according to Bitcoin.conf)

"curl return Value = 7"

Thanks for all the hard work you guys do, would really like to get these cards working better (they are embarrassed to announce their terrible Hash rates to all the other cards on my network. Especially the 7970 getting 720 Mhash...lol).
newbie
Activity: 42
Merit: 0
This is still very experimental. On Linux I get 350-370 MH/s, while on Windows only max 120 MH/s.

I will try to make a workaround for it.
sr. member
Activity: 333
Merit: 250
No luck with a 660ti on Windows with any of the .ptx files.  20 and 30 load but the binary crashes after the Target and Done allocating CUDA resource messages.  (rpcminer-mod-cuda.exe has stopped working).

Tried with 314.22 and the CUDA 5.0 dev driver default (306.xx I think?).

Can provide more info if needed.

Thanks.


