Topic: [ANN] cudaMiner & ccMiner CUDA based mining applications [Windows/Linux/MacOSX] - page 213. (Read 3426989 times)

sr. member
Activity: 252
Merit: 250
I'm only getting 80% accepted shares with doomcoin. Is this normal for this algo?
legendary
Activity: 1400
Merit: 1050


Grin

tsiv, you solo-mined 150 of the 600 blocks so far? LOL, the title of fastest instaminer in the cryptoworld should pass from djm34 to you  Grin
yep, I don't have that many  Grin
I solo-mined only 5950 (it took me a while to get the R9 working, strangely...), but blocks are still coming...
(actually there wasn't any instamine, or I arrived late to that coin... and blocks weren't coming that easily even when I had 1/2 of the net hashrate)

ok, there are still 331 unaccounted blocks  Grin
full member
Activity: 212
Merit: 100
Hey TSIV, thanks for your XMR miner and all your contributions!

I'm getting 230KH/s at the moment with an overclocked 750 Ti and the -l 8x60 parameter.

Any advice for getting a bit more, or does that seem good enough?

legendary
Activity: 1400
Merit: 1050
everything fine, that pesky ccminer.rc is there again
yes I know... it doesn't want to go away Grin
(it must be referenced in some vcxproj files... but the problem is that the Visual Studio search won't look in those files... so it is difficult to catch...)
hero member
Activity: 938
Merit: 1000


Grin

tsiv, you solo-mined 150 of the 600 blocks so far? LOL, the title of fastest instaminer in the cryptoworld should pass from djm34 to you  Grin
legendary
Activity: 3248
Merit: 1070
everything fine, that pesky ccminer.rc is there again
full member
Activity: 168
Merit: 100

Mine. Butchered up a quick SPH port pre-launch and it just happened to work right out of the box. Hadn't been following the other algos and didn't realize there was a variation of the luffa512 kernel in djm's repo that handled 80 input bytes, couldn't figure out how to mod the one in x11 to do 80 instead of 64 for input so yet another ugly SPH copy&paste job it was. I expect djm's will be the faster of the two, compiling it as we speak to confirm.

Which speeds did you reach with the 750 Ti? Smiley

290 MH/s with a 6 card rig so about 48.3 MH/s each on average. Still stuck compiling djm's version, I did read it compiles slow on CUDA 5.5 but dddddddddddddaaaaaaaaaaaaaamn....

Edit: Compilation finished. No surprises there, djm's is indeed faster by roughly 17%

1 card gives you 48 MH/s??? Isn't it supposed to be 8 MH/s? You mean one rig gives you 48 MH/s?
48 MH/s sounds good. I am doing roughly 170 MH/s with the 780 Ti and the 750 Ti combined (110 for the 780 Ti).
But the R9-290x is doing 145 MH/s all alone, and I had to decrease the TDP by 20% (otherwise the temp goes to 94°C).

regarding compilation, I compiled compute 3.0, 3.5, 5.0 all together and it takes only 1/2 hours with cuda 6.5... I am really done with cuda 5.5

funny thing is that i can compile in 35 min with cuda 5.5 lol dunno why

i'm trying cuda 6.0 but it doesn't work with visual studio 2010
you have to right-click on the project and click on build customization, and there you can choose the cuda version

i can't even load the project

If not, and you're on Windows, open the project file in Notepad. Search for 5.5 and replace it with 6.0.

You will find it 2 or 3 times.

That changes the CUDA version from 5.5 to 6.0. You can do the same for 6.5.

Re-open the project and see if that works.

Carlo

PS You can have multiple versions of the CUDA toolkit installed, as they use different directories (which is what we just edited above).
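
For anyone who would rather script the edit than do it by hand in Notepad, here is a minimal sketch in C of the same search-and-replace, applied to one line of text. The helper name `replace_all` is made up for illustration, and the sample string in the usage note assumes the version shows up in the project file as part of a name like `CUDA 5.5.props`, which may not match every project. Matching on `CUDA 5.5` rather than a bare `5.5` is a design choice here to avoid touching unrelated numbers.

```c
#include <stddef.h>
#include <string.h>

/* Copy src into dst (capacity dst_size), replacing every occurrence of
   `from` with `to`. Returns the number of replacements made, or -1 if
   `from` is empty or dst is too small to hold the result. */
int replace_all(const char *src, const char *from, const char *to,
                char *dst, size_t dst_size)
{
    size_t from_len = strlen(from);
    size_t to_len = strlen(to);
    size_t out = 0;
    int count = 0;

    if (from_len == 0 || dst_size == 0)
        return -1;

    while (*src != '\0') {
        if (strncmp(src, from, from_len) == 0) {
            /* Match: emit the replacement and skip past the match. */
            if (out + to_len >= dst_size)
                return -1;
            memcpy(dst + out, to, to_len);
            out += to_len;
            src += from_len;
            count++;
        } else {
            /* No match at this position: copy one character through. */
            if (out + 1 >= dst_size)
                return -1;
            dst[out++] = *src++;
        }
    }
    dst[out] = '\0';
    return count;
}
```

Called on a line such as `BuildCustomizations\CUDA 5.5.props` with `from = "CUDA 5.5"` and `to = "CUDA 6.0"`, it reports one replacement. Keep a backup of the project file before rewriting it this way.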
legendary
Activity: 3248
Merit: 1070

yeah i thought it wasn't necessary anymore, i'm re-installing now
legendary
Activity: 1400
Merit: 1050

did you uninstall cuda 5.5?
legendary
Activity: 3248
Merit: 1070

i can't even load the project
legendary
Activity: 1400
Merit: 1050

you have to right-click on the project and click on build customization, and there you can choose the cuda version
full member
Activity: 137
Merit: 100
Code:
/* Every operand is written by the PTX below, so all five are declared
   read-write ("+r") rather than as plain inputs. a4 is scratch. */
#define SUBCRUMB(a0,a1,a2,a3,a4)\
    asm( \
        "mov.b32    %4, %0;\n\t" \
        "or.b32     %0, %0, %1;\n\t" \
        "xor.b32    %2, %2, %3;\n\t" \
        "not.b32    %1, %1;\n\t" \
        "xor.b32    %0, %0, %3;\n\t" \
        "and.b32    %3, %3, %4;\n\t" \
        "xor.b32    %1, %1, %3;\n\t" \
        "xor.b32    %3, %3, %2;\n\t" \
        "and.b32    %2, %2, %0;\n\t" \
        "not.b32    %0, %0;\n\t" \
        "xor.b32    %2, %2, %1;\n\t" \
        "or.b32     %1, %1, %3;\n\t" \
        "xor.b32    %4, %4, %1;\n\t" \
        "xor.b32    %3, %3, %2;\n\t" \
        "and.b32    %2, %2, %1;\n\t" \
        "xor.b32    %1, %1, %0;\n\t" \
        "mov.b32    %0, %4;\n\t" \
        : "+r"(a0), "+r"(a1), "+r"(a2), "+r"(a3), "+r"(a4))

Massive +1 MH/s on 750 Ti  Cheesy

Well, at least with CUDA 5.5. No idea how that can actually be even a single bit faster than straight C, the compiler seems to do a piss poor job sometimes with simple statements like in that define.
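
For comparison, here is a rough sketch of what the "straight C" form of that define might look like. It is the same 17 operations in the same order as the PTX listing, written as a host-side C function so it can be checked on a CPU. The function name `sub_crumb` and the pointer-based signature are assumptions made for illustration, not code from anyone's repo, and the local `tmp` stands in for the scratch operand `a4`.

```c
#include <stdint.h>

/* Plain C form of the SUBCRUMB step. Each statement corresponds to one
   PTX instruction in the macro; tmp plays the role of a4. The four
   pointers are assumed to refer to distinct words. */
void sub_crumb(uint32_t *a0, uint32_t *a1, uint32_t *a2, uint32_t *a3)
{
    uint32_t tmp = *a0;   /* mov  %4, %0     */
    *a0 |= *a1;           /* or   %0, %0, %1 */
    *a2 ^= *a3;           /* xor  %2, %2, %3 */
    *a1 = ~*a1;           /* not  %1, %1     */
    *a0 ^= *a3;           /* xor  %0, %0, %3 */
    *a3 &= tmp;           /* and  %3, %3, %4 */
    *a1 ^= *a3;           /* xor  %1, %1, %3 */
    *a3 ^= *a2;           /* xor  %3, %3, %2 */
    *a2 &= *a0;           /* and  %2, %2, %0 */
    *a0 = ~*a0;           /* not  %0, %0     */
    *a2 ^= *a1;           /* xor  %2, %2, %1 */
    *a1 |= *a3;           /* or   %1, %1, %3 */
    tmp ^= *a1;           /* xor  %4, %4, %1 */
    *a3 ^= *a2;           /* xor  %3, %3, %2 */
    *a2 &= *a1;           /* and  %2, %2, %1 */
    *a1 ^= *a0;           /* xor  %1, %1, %0 */
    *a0 = tmp;            /* mov  %0, %4     */
}
```

Since every operation is bitwise, the step behaves like a small substitution applied independently at each of the 32 bit positions, so a handful of all-zeros and all-ones inputs is enough for a quick sanity check against the asm version. Whether the compiler turns this into slower code than the hand-written PTX presumably depends on the CUDA version.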
legendary
Activity: 3248
Merit: 1070

i'm trying cuda 6.0 but it doesn't work with visual studio 2010
legendary
Activity: 1512
Merit: 1000
quarkchain.io
confirmed - 55-56 MH/s average speed with a 750 Ti
legendary
Activity: 1400
Merit: 1050

48 MH/s sounds good. I am doing roughly 170 MH/s with the 780 Ti and the 750 Ti combined (110 for the 780 Ti).
But the R9-290x is doing 145 MH/s all alone, and I had to decrease the TDP by 20% (otherwise the temp goes to 94°C).

regarding compilation, I compiled compute 3.0, 3.5, 5.0 all together and it takes only 1/2 hours with cuda 6.5... I am really done with cuda 5.5
full member
Activity: 137
Merit: 100

Pretty sure you're confusing it with some other coin. DOOM is just a single round of luffa-512 and runs at about 48-49 MH/s using a poor implementation on my 750 Ti. Djm's version bumped it up to the 56 MH/s area. Per card.
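
The per-card figures quoted in this exchange can be cross-checked with a couple of lines of arithmetic. The sketch below is only a convenience for that check: the function names are invented here, and the inputs (290 MH/s across 6 cards, a roughly 17% speedup) are the numbers reported in the thread, not new measurements.

```c
/* Average hashrate per card for a rig, in the same unit as rig_rate. */
double per_card_rate(double rig_rate, int cards)
{
    return rig_rate / cards;
}

/* Rate after a relative speedup, e.g. 0.17 for "roughly 17% faster". */
double with_speedup(double rate, double speedup)
{
    return rate * (1.0 + speedup);
}
```

With the thread's numbers, `per_card_rate(290.0, 6)` comes out at about 48.3 MH/s, and applying a 0.17 speedup to that gives about 56.5 MH/s, which lines up with the 55-56 MH/s reported for a single 750 Ti.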
legendary
Activity: 3248
Merit: 1070

1 card gives you 48 MH/s??? Isn't it supposed to be 8 MH/s? You mean one rig gives you 48 MH/s?
newbie
Activity: 27
Merit: 0
Is it OK to compile the latest miners on Ubuntu with CUDA 5.5, or should I install a newer CUDA version?

I compiled djm34's github-source successfully today. Mint 17 / Ubuntu 14.04, Cuda 5.5, drivers 331.38

full member
Activity: 137
Merit: 100

290 MH/s with a 6 card rig so about 48.3 MH/s each on average. Still stuck compiling djm's version, I did read it compiles slow on CUDA 5.5 but dddddddddddddaaaaaaaaaaaaaamn....

Edit: Compilation finished. No surprises there, djm's is indeed faster by roughly 17%
legendary
Activity: 1512
Merit: 1000
quarkchain.io

Which speeds did you reach with the 750 Ti? Smiley