Topic: [ANN] cudaMiner & ccMiner CUDA based mining applications [Windows/Linux/MacOSX] - page 220. (Read 3426989 times)

legendary
Activity: 1400
Merit: 1000
Anyone here mining Boolberry and if so what miner are you using?

I can't get mine to get going.

Linux or Windows?

On Windows 7, running boolberry-opencl 0.2.0.28.

I did get it to run, but it only picks up the first card (GPU 0). It shows "error:  CL_INVALID_VALUE" for the other cards. I am trying to mine with just GPUs 1, 2 and 3 and leave GPU 0 idle.
hero member
Activity: 644
Merit: 500
mmh, too much lag with this version

chrome lags too much, everything is really slow, I don't like it
I guess it is the consequence of increasing the number of threads (my cards are not plugged into any monitor).
This needs to be adjusted...

I will change it back to what it was, I think that stuff is a bit too experimental... (also, the change was targeted at the 780ti and other big cards with plenty of cuda cores... it might be a lot for a 750ti plugged into a monitor, as its gpu usage is already pretty close to 99%)
Yesterday's release was mainly for keccak

my other rig isn't plugged in either (it is fine there), but my main one (which is also a gaming rig, well, not much anymore since the 750ti isn't top for gaming, I'm waiting for the big nvidia beasts for that) is plugged in, yes

Watch your words :D I can play Skyrim on full, so that little 750ti monster suffices for me :D But yeah, even my 760 still does better at gaming, much higher fps. Can't wait for the 8xx monsters ;) The 750ti still makes a great gpu for a casual gamer or for HTPCs ;)
legendary
Activity: 1400
Merit: 1050
the changes for whirlcoin are just in whirlcoin.cu? so I don't have to re-compile everything?
no, I made several changes in whirlcoin.cu and cuda_whirlpool512.cu.
Also, it uses a specific maxregcount, so you need to update ccminer.vcxproj (which will break everything, since it will then want to compile keccak etc... ).
Personally, I would recompile everything to avoid editing everything by hand (and your computer is fast...)

receiving a compute 50 error, isn't it flagged for that?
that's strange. Before compiling I looked at the compiler options, and compute_50, sm_50 wasn't there.
Then I started the build and got the error you mentioned.
Went back to the compiler options and the compute_50 was there. :D
You need to edit the compilation options and remove compute_50, sm_50

right click "ccminer"; references; configuration properties; CUDA C/C++; device; edit code generation then remove "compute_50, sm_50"

ok it works now, random problems lol
something like that, I found the problem, it was in ccminer.vcxproj (updated on github)
For some strange reason, a compute_50 was still there (even though I had removed it with the Visual Studio property editor...)

another error popped up

cannot open ccminer.rc

ok solved, I just removed it lol, dunno what that was
not sure either... must be some file generated by Visual Studio...
legendary
Activity: 3248
Merit: 1070
new version of ccminer:

* small improvement for whirlcoin (increased the total number of threads): gpu usage is 99% for the gtx780ti (as well as the gtx750ti)
in terms of performance: before, I was getting around 19MH/s for gtx780ti+gtx750ti, now I get 20.6MH/s.

* added keccak256 (maxcoin) to ccminer
It isn't a port of cudaminer's kernel, as it makes use of the main function of keccak512 instead of cudaminer's.

In terms of performance: cudaminer gives 293MH/s on my gtx780ti (OC), with this kernel I get around 300MH/s
(I am not sure whether the problem of low shares getting registered by the pool is still present or not... I would assume yes).
Please watch the number of rejected shares. It can be high (this is due to the use of a difficulty multiplier, and some tuning is still needed there)

The changes have been pushed to https://github.com/djm34/ccminer

It should compile on Linux, however this is untested.
I don't plan to release binaries.

Hi, no luck compiling on Linux..

Code:
snip

Contents of Algo256:
Code:
henning@KontorPC ~/CryptCoins/compile/djm34/ccminer $ ls Algo256/
cuda_keccak256.cu  cuda_keccak256.o  keccak256.cu  keccak256.o

I updated github...

can't compile it for maxwell: unsupported gpu compute 50, the makefile lists only 30 and 35
full member
Activity: 137
Merit: 100
750Ti @ 1300/1500 gives ~ 270

mine are at 1350 (stock) 1450 (+200) and only 235 H/s per card...

using this https://github.com/tsiv/ccminer-cryptonight
Palit DualX (1300/1500 stock) makes 270 using
the -l 8x40 switch

still 235, are you using beta drivers? I'm on 337.88

8x60 seems to give the best hash rates on a 750 Ti. If you're on Windows you're also slightly limited by the default bfactor 6; you could try lowering it and the default bsleep of 100 microseconds, but your interactivity will take a hit.

My rig has 6x Palit GeForce GTX 750 Ti StormX Dual, each hitting about 280 H/s at the default factory overclock using 8x60. The piece of shit Asus 750 Ti DC2OCwhatever on my win machine does like 244. Marvelous piece of engineering by Asus, take 6 GHz memory chips and fuck something up so that they can't run beyond 5.4 GHz.

8x60 is better, yes, I'm getting 240 now or slightly more. For bfactor and bsleep, how much can I lower them?

is nvidia like amd with hynix and elpida? that could explain why some cards are better than others


p.s. tried 6 and 66, only 5 H/s more: 245 H/s per card

You can probably get away with bfactor 1, maybe even 0. If you're not concerned about interactivity I'm pretty sure you can run bsleep 0. The way it works is described in the help text (and maybe readme, can't remember) but here's the quick recap:

The bfactor option controls how much of the biggest main loop is done in a single kernel launch. That particular kernel gets split into 2^bfactor pieces. At bfactor 0 (default on Linux because you don't need to worry about your OS being a dick and resetting the display driver because your CUDA kernel is taking too long to run) you get 2^0 = 1 meaning doing the whole thing in a single launch. At 1 you get 2^1 = 2 parts and at the Windows default 6 it gets split into 2^6 = 64 parts. 6 seems to be a reasonable balance between interactivity and performance. Of course if you're running more than one card you only need to worry about interactivity on your primary display GPU. So you could run --bfactor 6,1 which would make the primary GPU run the 64 part split and be fairly interactive while the rest of your cards would run a 2 part split and spend a little more time working instead of sleeping.

After typing that I just realized how insignificant the effect of the interactivity hack is after all :P It's 0.0064 seconds added to the roughly 1.5 second run time of the second loop, not really a big deal. Well, there's probably some overhead for each kernel launch but it's still not much.

And then we have --bsleep, the amount of time to wait before launching the kernel for the next part of the split. Windows defaults of bfactor 6 and bsleep 100 leaves you with 64 parts with 100 microseconds doing nothing after each part, for a total of 0.0064 seconds spent sleeping instead of working. Well, there's probably some overhead involved with each kernel launch too. Might be just the fact that Windows is doing stuff on the GPU that's slowing it down.
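
The arithmetic described above can be sketched in a few lines of C++. This is only an illustration of the numbers in the post, not code from ccminer-cryptonight: the function names `split_parts` and `total_sleep_us` are made up for the example, and it assumes exactly one sleep of `bsleep` microseconds after each part.

```cpp
#include <cassert>
#include <cstdint>

// Number of kernel launches the big main loop is split into: 2^bfactor.
// (Hypothetical helper, named for this example only.)
inline std::uint64_t split_parts(int bfactor) {
    assert(bfactor >= 0 && bfactor < 32);  // keep the shift well defined
    return std::uint64_t{1} << bfactor;
}

// Total time spent sleeping per pass, in microseconds, assuming one sleep
// of bsleep_us after every part of the split.
inline std::uint64_t total_sleep_us(int bfactor, std::uint64_t bsleep_us) {
    return split_parts(bfactor) * bsleep_us;
}
```

With the Windows defaults, `total_sleep_us(6, 100)` gives 6400 microseconds, i.e. the 0.0064 seconds mentioned above, and `split_parts(1)` gives the 2-part split the secondary cards would get with `--bfactor 6,1`.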

What a useless post.

PS. The temperature in my apartment just hit 35C, fuck my balls and fuck trying to think straight in this shit :D
legendary
Activity: 1400
Merit: 1050
Hi no luck compiling on Linux..

Code:
g++  -g -O2 -pthread -L/usr/local/cuda/lib64  -o ccminer ccminer-cpu-miner.o ccminer-util.o ccminer-bmw.o ccminer-blake.o ccminer-groestl.o ccminer-jh.o ccminer-keccak.o ccminer-skein.o ccminer-hefty1.o ccminer-scrypt.o ccminer-sha2.o heavy/heavy.o heavy/cuda_blake512.o heavy/cuda_combine.o heavy/cuda_groestl512.o heavy/cuda_hefty1.o heavy/cuda_keccak512.o heavy/cuda_sha256.o ccminer-fuguecoin.o Algo256/cuda_fugue256.o ccminer-fugue.o ccminer-groestlcoin.o Algo256/cuda_groestlcoin.o ccminer-myriadgroestl.o Algo256/cuda_myriadgroestl.o JHA/jackpotcoin.o JHA/cuda_jha_keccak512.o JHA/cuda_jha_compactionTest.o quark/cuda_quark_checkhash.o quark/cuda_jh512.o quark/cuda_quark_blake512.o quark/cuda_quark_groestl512.o quark/cuda_skein512.o quark/cuda_bmw512.o quark/cuda_quark_keccak512.o quark/quarkcoin.o quark/animecoin.o quark/cuda_quark_compactionTest.o cuda_nist5.o ccminer-cubehash.o ccminer-echo.o ccminer-luffa.o ccminer-shavite.o ccminer-simd.o ccminer-hamsi.o ccminer-hamsi_helper.o ccminer-shabal.o ccminer-whirlpool.o qubit/qubit.o qubit/qubit_luffa512.o x13/x14.o x13/fresh.o x13/x17.o x13/x13.o x13/cuda_x13_hamsi512.o x13/cuda_x13_fugue512.o x13/x15.o x13/cuda_shabal512.o x13/cuda_whirlpool512.o x13/whirlcoin.o Algo256/cuda_keccak256.o Algo256/keccak256.o -L/usr/lib/x86_64-linux-gnu -lcurl compat/jansson/libjansson.a -lpthread  -lcudart -static-libstdc++ -fopenmp -lcrypto -lssl  -lcrypto -lssl
g++: error: Algo256/cuda_fugue256.o: No such file or directory
g++: error: Algo256/cuda_groestlcoin.o: No such file or directory
g++: error: Algo256/cuda_myriadgroestl.o: No such file or directory
make[2]: *** [ccminer] Error 1
make[2]: Leaving directory `/home/henning/CryptCoins/compile/djm34/ccminer'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/henning/CryptCoins/compile/djm34/ccminer'
make: *** [all] Error 2

Contents of Algo256:
Code:
henning@KontorPC ~/CryptCoins/compile/djm34/ccminer $ ls Algo256/
cuda_keccak256.cu  cuda_keccak256.o  keccak256.cu  keccak256.o

I updated github...
full member
Activity: 137
Merit: 100
tsiv, you may want to pull my keccak implementation - getting a lot of reports it works better on Kepler.

da70729, fd9d114 and f26efdb, right? Pulled them into my local repo, building right now and will push to my git repo if everything seems fine. As far as I can tell, the last commit doesn't really do anything useful though. Apparently __CUDA_ARCH__ is only defined when nvcc (or cudafe++, whatever) is processing device code and the define (like pretty much every single arch based define in ccminer) is in the host code, meaning you'll always hit the else branch on #if __CUDA_ARCH__ >= something. I might be wrong on this, haven't had the time to see what actually goes in the final product but on the other hand I've never seen any improvement on these compute based specific macros either which kind of supports the fact that they tend to default to the "ah well, we'll use the crappy default thingy then" branch.
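
A small host-only C++ sketch of the effect described above. It assumes the file is compiled as host code (a plain C++ compiler, or nvcc's host pass), where `__CUDA_ARCH__` is not defined; the macro `HYPOTHETICAL_TUNING` and the function `selected_tuning` are invented for the illustration and do not come from ccminer.

```cpp
#include <string>

// In host code __CUDA_ARCH__ is undefined. An undefined identifier inside
// #if evaluates to 0, so the comparison is false and the #else branch wins.
#if __CUDA_ARCH__ >= 500
#define HYPOTHETICAL_TUNING "maxwell"   // hypothetical arch-specific setting
#else
#define HYPOTHETICAL_TUNING "default"   // what host code always ends up with
#endif

// Reports which branch the preprocessor picked for this translation unit.
inline std::string selected_tuning() {
    return HYPOTHETICAL_TUNING;
}
```

Built as host code, `selected_tuning()` returns "default" whatever GPU is installed. An arch check like this only takes effect in the device compilation pass, so it has to guard code that is actually compiled for the device.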

I think my last commit was making the scratchpad pointer restricted - seemed to provide a small boost even on Maxwell.

EDIT: Also, any thoughts on the memory allocation bug?

I prefer to think of it more as a feature :D

Yea, I know it's a bitch. One of the last ideas I've got left for the first and third main loops involves rearranging the scratchpad layout for better access patterns instead of 2MB strides between hashes, and at first glance I think having separate scratchpads would fuck that up. If that doesn't pan out I'll probably turn the single 2MB*hashcount allocation into hashcount 2MB allocations like you did, maybe make it an option. I tend to forget how much the single alloc can mess with stuff because I'm running on a headless rig that simply doesn't use the video memory for anything and lets me pretty much use the full 2GB.
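
For readers following along, the two layouts being compared can be sketched on the host side like this. It is a stand-in only: it uses ordinary host memory (`std::vector`) instead of device allocations, and the type names `SingleAlloc` and `SeparateAllocs` are hypothetical, not taken from either miner.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// 2MB of scratchpad per hash, as mentioned in the post.
constexpr std::size_t kScratchpadBytes = 2u * 1024u * 1024u;

// Layout 1: one allocation of 2MB * hashcount. Hash i starts at a fixed
// 2MB stride from the base, but the whole block must be contiguous.
struct SingleAlloc {
    std::vector<std::uint8_t> buf;
    explicit SingleAlloc(std::size_t hashcount)
        : buf(kScratchpadBytes * hashcount) {}
    std::uint8_t* scratchpad(std::size_t i) {
        return buf.data() + i * kScratchpadBytes;
    }
};

// Layout 2: hashcount independent 2MB allocations. No large contiguous
// block is needed, but there is no fixed stride between scratchpads.
struct SeparateAllocs {
    std::vector<std::vector<std::uint8_t>> bufs;
    explicit SeparateAllocs(std::size_t hashcount)
        : bufs(hashcount, std::vector<std::uint8_t>(kScratchpadBytes)) {}
    std::uint8_t* scratchpad(std::size_t i) {
        return bufs[i].data();
    }
};
```

The trade-off in the sketch mirrors the one in the post: the single block gives a predictable stride that a rearranged layout could build on, while separate blocks are presumably easier to fit when part of the memory is already in use.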
full member
Activity: 137
Merit: 100
Pulled Wolf's keccak code, not seeing any difference on Maxwell but if it helps Kepler/Fermi, I'm all for it.

Win32 binary at https://github.com/tsiv/ccminer-cryptonight/releases/tag/v0.16 and source obviously at https://github.com/tsiv/ccminer-cryptonight/
full member
Activity: 348
Merit: 102

I get
Jul 25 21:46:41 minera4 cpuminer[2945]: GPU #0: GeForce GTX 750, 2118 khash/s
Jul 25 21:46:41 minera4 cpuminer[2945]: GPU #1: GeForce GTX 750, 2130 khash/s

on a 750 non-Ti, with a bit of overclocking.
No, I did not alter the BIOS.


Intel® Core™ i7-930, ASUS SABERTOOTH X58, 16GB DDR3, Win7x64, 335.23.

GPU #0: MSI GeForce GTX 750Ti, N750TI-2GD5/OC, 2GB
GPU #2: ASUS GeForce GTX 750, GTX750-PHOC-1GD5, 1GB

MaxCoin Keccak
[2014-07-01 16:16:47] GPU #0: GeForce GTX 750 Ti, 26015 khash/s
[2014-07-01 16:16:48] GPU #2: GeForce GTX 750, 11929 khash/s
[2014-07-01 16:16:48] Total: 37943 khash/s

VTC Adaptive N factor Scrypt
[2014-07-01 16:12:34] GPU #0: GeForce GTX 750 Ti, 134.23 khash/s
[2014-07-01 16:12:35] GPU #2: GeForce GTX 750, 111.58 khash/s
[2014-07-01 16:12:35] Total: 245.81 khash/s

Monero CryptoNight
[2014-07-06 03:02:52] GPU #0: GeForce GTX 750 Ti, 273.49 H/s
[2014-07-06 03:02:52] GPU #2: GeForce GTX 750, 257.16 H/s
[2014-07-06 03:02:52] accepted: 1391/1391 (100.00%), 530.65 H/s (yay!!!)

x11
[2014-07-03 20:08:18] GPU #0: GeForce GTX 750 Ti, 2701 khash/s
[2014-07-03 20:08:18] GPU #2: GeForce GTX 750, 2042 khash/s
[2014-07-03 20:08:18] Total: 4742 khash/s


sr. member
Activity: 378
Merit: 250
Time is Money - Benjamin Franklin
snipp..

djm, wondering if there will be any chance of adding cryptonite to ccminer?