Topic: [ANN] cudaMiner & ccMiner CUDA based mining applications [Windows/Linux/MacOSX] - page 204. (Read 3426932 times)

legendary
Activity: 1400
Merit: 1050
Hmmm.. Perhaps there is something I have missed a few posts back, but I am unable to get the latest nvminer working with DOOM.

The error I keep receiving is "Unable to query number of CUDA devices! Is an nVidia driver installed?". Please note that I have the CUDA 6 toolkit installed, and ccminer on cryptonight works flawlessly.

I'm also on drivers 337.88.
I don't have any toolkits installed that I know of. The latest driver will let Doom work, but then no other CUDA miners work, and I get the same error. How and where do I install the CUDA toolkit, please? I've been running CUDA miners for a long time. Thanks.
Do a clean install of the latest beta driver (the latest WHQL driver seems to have a problem (?)).
I use Windows 8.1 and have absolutely no problems.
legendary
Activity: 3164
Merit: 1003
Hmmm.. Perhaps there is something I have missed a few posts back, but I am unable to get the latest nvminer working with DOOM.

The error I keep receiving is "Unable to query number of CUDA devices! Is an nVidia driver installed?". Please note that I have the CUDA 6 toolkit installed, and ccminer on cryptonight works flawlessly.

I'm also on drivers 337.88.
I don't have any toolkits installed that I know of. The latest driver will let Doom work, but then no other CUDA miners work, and I get the same error. How and where do I install the CUDA toolkit, please? I've been running CUDA miners for a long time. Thanks.
legendary
Activity: 1400
Merit: 1050
Take a look at this:

#define MULT2(a,j)\
    tmp = a[7+(8*j)];\
    a[7+(8*j)] = a[6+(8*j)];\
    a[6+(8*j)] = a[5+(8*j)];\
    a[5+(8*j)] = a[4+(8*j)];\
    a[4+(8*j)] = a[3+(8*j)] ^ tmp; /* work is done */\
    a[3+(8*j)] = a[2+(8*j)] ^ tmp; /* work is done */\
    a[2+(8*j)] = a[1+(8*j)];\
    a[1+(8*j)] = a[0+(8*j)] ^ tmp; /* work is done */\
    a[0+(8*j)] = tmp;

Only 3 out of 9 lines of code are doing work.

The luffa implementation is moving 32-bit words around in the cache. Instead of moving the memory, leave the words in place and scramble the index:
create an index with a modulo for every write, and rewrite the loop to use 64-bit registers.
The modulo can be computed with an AND mask, since the border of each block is on a 2^n boundary.

That should give 3-4 times the speed.

Why not test your suggestion? It is only one algo to change.
legendary
Activity: 1400
Merit: 1050
Compiling latest djm34 source failed on Linux:

Code:
nvcc -g -O2 -I . -Xptxas "-abi=no -v" -gencode=arch=compute_30,code=\"sm_30,compute_30\" -gencode=arch=compute_35,code=\"sm_35,compute_35\" --maxrregcount=80 --ptxas-options=-v -I./compat/jansson -o qubit/qubit_luffa512.o -c qubit/qubit_luffa512.cu
qubit/qubit_luffa512.cu(35): error: invalid redeclaration of type name "uint64_t"
/usr/include/stdint.h(55): here

1 error detected in the compilation of "/tmp/tmpxft_00003bab_00000000-9_qubit_luffa512.compute_35.cpp1.ii".
make[2]: *** [qubit/qubit_luffa512.o] Error 2
make[2]: Leaving directory `/home/henning/CryptCoins/compile/djm34/ccminer'

Changed:
Code:
typedef unsigned long long uint64_t;

to:
Code:
typedef unsigned char uint8_t;
typedef unsigned int uint32_t;
typedef unsigned long long uint64_t;;

in qubit/qubit_luffa512.cu
Right, these are defined in stdint.h (my computer doesn't complain about that, though).
Just remove these 3 lines:
typedef unsigned char uint8_t;
typedef unsigned int uint32_t;
typedef unsigned long long uint64_t;

By the way, you really don't want to do:

typedef unsigned long  uint64_t; (however, it depends on the system)

Edit: please don't suggest random changes
(although it doesn't matter, as this algo doesn't use the uint64_t type)
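
To illustrate why the hand-written typedefs are fragile, here is a minimal sketch in plain C (assuming a C11 compiler) that takes the fixed-width types from stdint.h and checks their sizes at compile time. The helper names unsigned_long_bits and rotl64 are made up for this example and are not part of ccminer. unsigned long is typically 64 bits wide on 64-bit Linux but only 32 bits on 64-bit Windows, which is why a typedef to it depends on the system. On 64-bit glibc, stdint.h usually defines uint64_t as unsigned long int, which would explain why a redeclaration as unsigned long long is rejected there.

```c
#include <limits.h>
#include <stdint.h>   /* provides uint8_t, uint32_t, uint64_t */

/* Compile-time checks: these hold wherever the exact-width types exist. */
_Static_assert(sizeof(uint8_t)  * CHAR_BIT == 8,  "uint8_t must be 8 bits");
_Static_assert(sizeof(uint32_t) * CHAR_BIT == 32, "uint32_t must be 32 bits");
_Static_assert(sizeof(uint64_t) * CHAR_BIT == 64, "uint64_t must be 64 bits");

/* Hypothetical helper: width of unsigned long on the current platform.
 * Typically 64 on 64-bit Linux and 32 on 64-bit Windows. */
static inline unsigned unsigned_long_bits(void)
{
    return (unsigned)(sizeof(unsigned long) * CHAR_BIT);
}

/* Hypothetical helper: rotate a 64-bit word left by n bits (0 < n < 64).
 * It relies only on uint64_t, so it behaves the same on every platform. */
static inline uint64_t rotl64(uint64_t x, unsigned n)
{
    return (x << n) | (x >> (64u - n));
}
```

With the stdint.h types in place, none of the three typedef lines is needed in the .cu file.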
full member
Activity: 263
Merit: 100
Hmmm.. Perhaps there is something I have missed a few posts back, but I am unable to get the latest nvminer working with DOOM.

The error I keep receiving is "Unable to query number of CUDA devices! Is an nVidia driver installed?". Please note that I have the CUDA 6 toolkit installed, and ccminer on cryptonight works flawlessly.

I'm also on drivers 337.88.

If you are not compiling from source, you don't need to install the CUDA toolkit. Just update your NVIDIA driver to version 340.xx.
legendary
Activity: 1400
Merit: 1050
Hmmm.. Perhaps there is something I have missed a few posts back, but I am unable to get the latest nvminer working with DOOM.

The error I keep receiving is "Unable to query number of CUDA devices! Is an nVidia driver installed?". Please note that I have the CUDA 6 toolkit installed, and ccminer on cryptonight works flawlessly.

I'm also on drivers 337.88.
in red your problem
newbie
Activity: 55
Merit: 0
Hmmm.. Perhaps there is something I have missed a few posts back, but I am unable to get the latest nvminer working with DOOM.

The error I keep receiving is "Unable to query number of CUDA devices! Is an nVidia driver installed?". Please note that I have the CUDA 6 toolkit installed, and ccminer on cryptonight works flawlessly.

I'm also on drivers 337.88.
full member
Activity: 266
Merit: 100
v1.2.7 "Split Screen ccMiner" (2014-07-30) Source + Windows Binary release

Added key --height: sets the height of the terminal window. Usage: --height=xx (where xx >= 30)
The height of the top section of the split screen depends on the number of CUDA devices, or on the devices chosen with the -d key.
Some fixes for the output info string

https://github.com/zelante/ccminer/releases/tag/v1.2.7

Good to have an update.

I have noticed though that with your split screen version (previous ones, not this one) my rig crashes 1 out of 3 times when I launch ccminer.
full member
Activity: 263
Merit: 100
v1.2.7 "Split Screen ccMiner" (2014-07-30) Source + Windows Binary release

Added key --height: sets the height of the terminal window. Usage: --height=xx (where xx >= 30)
The height of the top section of the split screen depends on the number of CUDA devices, or on the devices chosen with the -d key.
Some fixes for the output info string

https://github.com/zelante/ccminer/releases/tag/v1.2.7
hero member
Activity: 868
Merit: 1000
Take a look at this:

#define MULT2(a,j)\
    tmp = a[7+(8*j)];\
    a[7+(8*j)] = a[6+(8*j)];\
    a[6+(8*j)] = a[5+(8*j)];\
    a[5+(8*j)] = a[4+(8*j)];\
    a[4+(8*j)] = a[3+(8*j)] ^ tmp; /* work is done */\
    a[3+(8*j)] = a[2+(8*j)] ^ tmp; /* work is done */\
    a[2+(8*j)] = a[1+(8*j)];\
    a[1+(8*j)] = a[0+(8*j)] ^ tmp; /* work is done */\
    a[0+(8*j)] = tmp;

Only 3 out of 9 lines of code are doing work.

The luffa implementation is moving 32-bit words around in the cache. Instead of moving the memory, leave the words in place and scramble the index:
create an index with a modulo for every write, and rewrite the loop to use 64-bit registers.
The modulo can be computed with an AND mask, since the border of each block is on a 2^n boundary.

That should give 3-4 times the speed.


Wow sp_, you always come up with coding suggestions worth at least +100% speed. Hope to see x11 at +100% hashrate soon.
legendary
Activity: 3248
Merit: 1070
sp_
legendary
Activity: 2926
Merit: 1087
Team Black developer
Take a look at this:

#define MULT2(a,j)\
    tmp = a[7+(8*j)];\
    a[7+(8*j)] = a[6+(8*j)];\
    a[6+(8*j)] = a[5+(8*j)];\
    a[5+(8*j)] = a[4+(8*j)];\
    a[4+(8*j)] = a[3+(8*j)] ^ tmp; /* work is done */\
    a[3+(8*j)] = a[2+(8*j)] ^ tmp; /* work is done */\
    a[2+(8*j)] = a[1+(8*j)];\
    a[1+(8*j)] = a[0+(8*j)] ^ tmp; /* work is done */\
    a[0+(8*j)] = tmp;

Only 3 out of 9 lines of code are doing work.

The luffa implementation is moving 32-bit words around in the cache. Instead of moving the memory, leave the words in place and scramble the index:
create an index with a modulo for every write, and rewrite the loop to use 64-bit registers.
The modulo can be computed with an AND mask, since the border of each block is on a 2^n boundary.

That should give 3-4 times the speed.
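
For anyone who wants to try the idea, here is a rough sketch in plain C of what a rotating index with an AND mask could look like for one 8-word block. It is only an interpretation of the suggestion, not sp_'s code: rot_block, rot_get, mult2_rot and mult2_versions_agree are made-up names, the 64-bit register part is left out, and nothing here has been benchmarked. On a GPU, any gain would depend on the compiler resolving the indices at compile time, so the sketch only checks that both versions compute the same words.

```c
#include <stdint.h>

/* Reference: the MULT2 macro from the post, written as a function on block j. */
static void mult2_move(uint32_t *a, int j)
{
    uint32_t *b = a + 8 * j;
    uint32_t tmp = b[7];
    b[7] = b[6];
    b[6] = b[5];
    b[5] = b[4];
    b[4] = b[3] ^ tmp;
    b[3] = b[2] ^ tmp;
    b[2] = b[1];
    b[1] = b[0] ^ tmp;
    b[0] = tmp;
}

/* Hypothetical layout: 8 words plus a rotating base offset.
 * Logical word i lives at w[(base + i) & 7]. */
typedef struct {
    uint32_t w[8];
    unsigned base;
} rot_block;

/* Read logical word i; the AND mask replaces the modulo (8 is 2^3). */
static uint32_t rot_get(const rot_block *b, unsigned i)
{
    return b->w[(b->base + i) & 7u];
}

/* Same result as MULT2, but the words stay in place: the shift becomes
 * a change of base, and only the three XOR lines touch memory. */
static void mult2_rot(rot_block *b)
{
    uint32_t tmp;
    b->base = (b->base + 7u) & 7u;      /* base - 1 modulo 8 */
    tmp = b->w[b->base];                /* new word 0 = old word 7 */
    b->w[(b->base + 1u) & 7u] ^= tmp;
    b->w[(b->base + 3u) & 7u] ^= tmp;
    b->w[(b->base + 4u) & 7u] ^= tmp;
}

/* Returns 1 if both versions agree after each of `rounds` applications. */
static int mult2_versions_agree(int rounds)
{
    uint32_t a[8] = { 0x01234567u, 0x89abcdefu, 0xdeadbeefu, 0x0badf00du,
                      0x13579bdfu, 0x2468ace0u, 0xffffffffu, 0x00000001u };
    rot_block b = { { 0x01234567u, 0x89abcdefu, 0xdeadbeefu, 0x0badf00du,
                      0x13579bdfu, 0x2468ace0u, 0xffffffffu, 0x00000001u }, 0u };
    int r;
    unsigned i;

    for (r = 0; r < rounds; r++) {
        mult2_move(a, 0);
        mult2_rot(&b);
        for (i = 0; i < 8; i++) {
            if (rot_get(&b, i) != a[i])
                return 0;
        }
    }
    return 1;
}
```

Calling mult2_versions_agree(64) runs both versions side by side for 64 rounds and compares every word after each round.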
full member
Activity: 252
Merit: 102
Find -lboost_thread_win32 in quid-qt.pro and remove "_win32".


Worked a treat, thanks!

BTW, that Doomcoin logo looks a lot like the Parallax logo...
legendary
Activity: 3164
Merit: 1003
Is anybody able to mine Doom with the 337.88 drivers on Windows 8.1? I can't. The new drivers will, but then none of my old ccminer CUDA .bat files work with them.
newbie
Activity: 27
Merit: 0
Compiling latest djm34 source failed on Linux:

Code:
nvcc -g -O2 -I . -Xptxas "-abi=no -v" -gencode=arch=compute_30,code=\"sm_30,compute_30\" -gencode=arch=compute_35,code=\"sm_35,compute_35\" --maxrregcount=80 --ptxas-options=-v -I./compat/jansson -o qubit/qubit_luffa512.o -c qubit/qubit_luffa512.cu
qubit/qubit_luffa512.cu(35): error: invalid redeclaration of type name "uint64_t"
/usr/include/stdint.h(55): here

1 error detected in the compilation of "/tmp/tmpxft_00003bab_00000000-9_qubit_luffa512.compute_35.cpp1.ii".
make[2]: *** [qubit/qubit_luffa512.o] Error 2
make[2]: Leaving directory `/home/henning/CryptCoins/compile/djm34/ccminer'

Changed:
Code:
typedef unsigned long long uint64_t;

to:
Code:
typedef unsigned long uint64_t;

in qubit/qubit_luffa512.cu
newbie
Activity: 27
Merit: 0
Find -lboost_thread_win32 in quid-qt.pro and remove "_win32".
full member
Activity: 252
Merit: 102
OK, massive difference with that new code. Now getting 360 Mh/s.

Thanks for the info, guys.

Has anyone managed to compile the QT wallet for Doom on Linux? I'm getting an error with that ...

Code:
In file included from src/tor/or.h:98:0,
                 from src/tor/addressmap.c:9:
src/tor/compat_libevent.h:18:25: fatal error: event2/util.h: No such file or directory
 #include <event2/util.h>
                         ^
compilation terminated.
make: *** [build/addressmap.o] Error 1

sudo apt-get install libevent-dev

Thanks, it compiled for longer but threw up a new error:

Code:
/usr/bin/ld: cannot find -lboost_thread_win32
collect2: error: ld returned 1 exit status
make: *** [doomcoin-qt] Error 1
11:02:04: The process "/usr/bin/make" exited with code 2.
Error while building/deploying project bitcoin-qt (kit: Desktop)
When executing step 'Make'

win32?
legendary
Activity: 1400
Merit: 1050
Whoa, just noticed this. How do I get this to compile with DJM34 source code from Github? I'm compiling on Linux.

OR...

DJM34, do you have plans to add this speed increase to your github master?

Just copy the pastebin into doom.cu
and compile again.

You need to add the unified qubit luffa and cuda luffa, called doom luffa.

I just created a new filter, as it is confusing right now...
I would change it to doom in ccminer, but that would confuse even more people.
(I will update my github soon... right now my version is still using the qubit hack...)

Yeah, I named the folder doom and put doom_luffa512 and doom.cu in it. Leave qubit alone, poor guy.
I will do that in time...
I have updated my repository with the "improved" code... That doesn't fix the duplicate share problem, though...
legendary
Activity: 3248
Merit: 1070
Whoa, just noticed this. How do I get this to compile with DJM34 source code from Github? I'm compiling on Linux.

OR...

DJM34, do you have plans to add this speed increase to your github master?

Just copy the pastebin into doom.cu
and compile again.

You need to add the unified qubit luffa and cuda luffa, called doom luffa.

I just created a new filter, as it is confusing right now...
I would change it to doom in ccminer, but that would confuse even more people.
(I will update my github soon... right now my version is still using the qubit hack...)

Yeah, I named the folder doom and put doom_luffa512 and doom.cu in it. Leave qubit alone, poor guy.
legendary
Activity: 1400
Merit: 1050
Whoa, just noticed this. How do I get this to compile with DJM34 source code from Github? I'm compiling on Linux.

OR...

DJM34, do you have plans to add this speed increase to your github master?

Just copy the pastebin into doom.cu
and compile again.

You need to add the unified qubit luffa and cuda luffa, called doom luffa.

I just created a new filter, as it is confusing right now...
I would change it to doom in ccminer, but that would confuse even more people.
(I will update my github soon... right now my version is still using the qubit hack...)