Topic: [ANN] cudaMiner & ccMiner CUDA based mining applications [Windows/Linux/MacOSX] - page 318. (Read 3426936 times)

full member
Activity: 182
Merit: 100
Yeah... I mined your mom last night.
I'll just leave this here...



Damn, that's slow. Seems to scale almost perfectly with hardware memory bandwidth when comparing with Claymore's AMD miner. R9 290X has 3.7x the theoretical memory bandwidth of the 750 Ti and does around 600 H/s. Surprise, 600 / 3.7 comes to around 162. Same story with the 270X and its roughly 2x mem bandwidth. Guess that's not entirely unexpected, since there's a whole lot of global memory access going on with the cryptonight algo. Still poking at it, but I doubt it'll improve much without C&C level voodoo magic, and that's well beyond my skillset :)


Can you compile this for Compute 2.1 (Fermi)?

I'd like to test it. Still waiting on the funds to order my 750ti cards.
full member
Activity: 137
Merit: 100
What kind of problems did you run into on Windows?

The kind where it now builds successfully but seems to nuke the display driver almost instantly after starting :D
hero member
Activity: 482
Merit: 500
I'll just leave this here...

....

Damn, that's slow. Seems to scale almost perfectly with hardware memory bandwidth when comparing with Claymore's AMD miner. R9 290X has 3.7x the theoretical memory bandwidth of the 750 Ti and does around 600 H/s. Surprise, 600 / 3.7 comes to around 162. Same story with the 270X and its roughly 2x mem bandwidth. Guess that's not entirely unexpected, since there's a whole lot of global memory access going on with the cryptonight algo. Still poking at it, but I doubt it'll improve much without C&C level voodoo magic, and that's well beyond my skillset :)
Is this something you're coding on your own, then, or has Christian given you early access to his Cryptonight code? I'm guessing the former. Anyway, I've got some NVIDIA laptops as well as a GTX 770 and GTX 780 I could run it on to see how it scales, if you're interested. Waiting with bated breath for this.... :-)
hero member
Activity: 868
Merit: 1000
I'll just leave this here.

Damn, that's slow. Seems to scale almost perfectly with hardware memory bandwidth when comparing with Claymore's AMD miner. R9 290X has 3.7x the theoretical memory bandwidth of the 750 Ti and does around 600 H/s. Surprise, 600 / 3.7 comes to around 162. Same story with the 270X and its roughly 2x mem bandwidth. Guess that's not entirely unexpected, since there's a whole lot of global memory access going on with the cryptonight algo. Still poking at it, but I doubt it'll improve much without C&C level voodoo magic, and that's well beyond my skillset :)

Nice work! What sort of power usage do you see?

Ah, right. Forgot to mention that. That particular rig is pulling around 270W from the wall when running cryptonight. I guess that qualifies as mini-good-news.
Are you open sourcing? (Otherwise I don't see the point of telling us it exists when we can't use it... this is becoming a peculiarity of this thread somehow ;D)

That's the idea, still needs some work though. Just added a command line switch to set how many blocks and threads per block to use in the kernel launch as it was hard coded for what my 750 Ti seems to like best. Odds are that it wouldn't run at its best on other cards with the same settings. Next up is looking at building on Windows, didn't look too hot on first try.

Is it possible to include GPU / CPU / RAM hybrid coding to improve speed? Maybe a certain part of the hash can be run by the CPU / RAM to boost speed. Perhaps cbuchner1 can give some pointers on increasing speed. 8) It seems certain that the speed can be improved, since your rig is using very little power running it.
full member
Activity: 168
Merit: 100
What kind of problems did you run into on Windows?
full member
Activity: 137
Merit: 100
I'll just leave this here.

Damn, that's slow. Seems to scale almost perfectly with hardware memory bandwidth when comparing with Claymore's AMD miner. R9 290X has 3.7x the theoretical memory bandwidth of the 750 Ti and does around 600 H/s. Surprise, 600 / 3.7 comes to around 162. Same story with the 270X and its roughly 2x mem bandwidth. Guess that's not entirely unexpected, since there's a whole lot of global memory access going on with the cryptonight algo. Still poking at it, but I doubt it'll improve much without C&C level voodoo magic, and that's well beyond my skillset :)

Nice work! What sort of power usage do you see?

Ah, right. Forgot to mention that. That particular rig is pulling around 270W from the wall when running cryptonight. I guess that qualifies as mini-good-news.
Are you open sourcing? (Otherwise I don't see the point of telling us it exists when we can't use it... this is becoming a peculiarity of this thread somehow ;D)

That's the idea, still needs some work though. Just added a command line switch to set how many blocks and threads per block to use in the kernel launch as it was hard coded for what my 750 Ti seems to like best. Odds are that it wouldn't run at its best on other cards with the same settings. Next up is looking at building on Windows, didn't look too hot on first try.
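For anyone curious what such a switch could look like, here is a minimal C sketch that parses a "blocks x threads" value such as 8x40 before the kernel launch. The BxT syntax, the struct and the function name are all assumptions made up for illustration, not the actual option in the miner. The 1024 cap is the CUDA per-block thread limit on compute capability 2.0 and later.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* CUDA limit on threads per block for compute capability 2.0 and later. */
#define MAX_THREADS_PER_BLOCK 1024

/* Hypothetical launch configuration: grid size and block size. */
struct launch_cfg {
    int blocks;
    int threads;
};

/* Parse a string such as "8x40" into cfg.
 * Returns 0 on success, -1 on malformed or out-of-range input.
 * The "BxT" syntax is an assumption, not the miner's real option format. */
static int parse_launch_cfg(const char *arg, struct launch_cfg *cfg)
{
    char *end;
    const char *second;
    long blocks, threads;

    if (arg == NULL || cfg == NULL)
        return -1;

    errno = 0;
    blocks = strtol(arg, &end, 10);
    if (end == arg || *end != 'x')      /* no digits, or missing separator */
        return -1;

    second = end + 1;
    threads = strtol(second, &end, 10);
    if (end == second || *end != '\0')  /* no digits, or trailing junk */
        return -1;

    if (errno == ERANGE || blocks <= 0 || blocks > INT_MAX ||
        threads <= 0 || threads > MAX_THREADS_PER_BLOCK)
        return -1;

    cfg->blocks = (int)blocks;
    cfg->threads = (int)threads;
    return 0;
}
```

A call like parse_launch_cfg("8x40", &cfg) would then feed cfg.blocks and cfg.threads into the kernel launch. The best values will most likely differ from card to card, as said above.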
hero member
Activity: 644
Merit: 500
I'll just leave this here...



Damn, that's slow. Seems to scale almost perfectly with hardware memory bandwidth when comparing with Claymore's AMD miner. R9 290X has 3.7x the theoretical memory bandwidth of the 750 Ti and does around 600 H/s. Surprise, 600 / 3.7 comes to around 162. Same story with the 270X and its roughly 2x mem bandwidth. Guess that's not entirely unexpected, since there's a whole lot of global memory access going on with the cryptonight algo. Still poking at it, but I doubt it'll improve much without C&C level voodoo magic, and that's well beyond my skillset :)

Nice work!
Is this OC'd? ;) Probably not :D
Can't wait to see it released ^^" And will you release that "Pool set diff to" line too? I like it 8)

Damn, that's slow. Seems to scale almost perfectly with hardware memory bandwidth when comparing with Claymore's AMD miner. R9 290X has 3.7x the theoretical memory bandwidth of the 750 Ti and does around 600 H/s. Surprise, 600 / 3.7 comes to around 162. Same story with the 270X and its roughly 2x mem bandwidth. Guess that's not entirely unexpected, since there's a whole lot of global memory access going on with the cryptonight algo. Still poking at it, but I doubt it'll improve much without C&C level voodoo magic, and that's well beyond my skillset :)

Out of curiosity, what does a CPU miner push through for Cryptonight?  If you are down in the hash per second vs. kilohash per second range then I'm left wondering if we have an algorithm which is truly better on CPU than GPU.

Kind of falls back on my Proof of Blockchain concept (linked below) to make it too hard for GPU (and especially ASIC) miners to outrun a basic CPU miner.
Depends a lot on CPU and miner used, but you can find some hashrates here: https://bitcointalksearch.org/topic/wolfs-xmrbcndsh-cpuminer-2x-speed-compared-to-lucasjones-new-06202014-632724
sr. member
Activity: 401
Merit: 250
Damn, that's slow. Seems to scale almost perfectly with hardware memory bandwidth when comparing with Claymore's AMD miner. R9 290X has 3.7x the theoretical memory bandwidth of the 750 Ti and does around 600 H/s. Surprise, 600 / 3.7 comes to around 162. Same story with the 270X and its roughly 2x mem bandwidth. Guess that's not entirely unexpected, since there's a whole lot of global memory access going on with the cryptonight algo. Still poking at it, but I doubt it'll improve much without C&C level voodoo magic, and that's well beyond my skillset :)

Out of curiosity, what does a CPU miner push through for Cryptonight?  If you are down in the hash per second vs. kilohash per second range then I'm left wondering if we have an algorithm which is truly better on CPU than GPU.

Kind of falls back on my Proof of Blockchain concept (linked below) to make it too hard for GPU (and especially ASIC) miners to outrun a basic CPU miner.
full member
Activity: 168
Merit: 100
I'll just leave this here.

Damn, that's slow. Seems to scale almost perfectly with hardware memory bandwidth when comparing with Claymore's AMD miner. R9 290X has 3.7x the theoretical memory bandwidth of the 750 Ti and does around 600 H/s. Surprise, 600 / 3.7 comes to around 162. Same story with the 270X and its roughly 2x mem bandwidth. Guess that's not entirely unexpected, since there's a whole lot of global memory access going on with the cryptonight algo. Still poking at it, but I doubt it'll improve much without C&C level voodoo magic, and that's well beyond my skillset :)

Nice work! What sort of power usage do you see?

Ah, right. Forgot to mention that. That particular rig is pulling around 270W from the wall when running cryptonight. I guess that qualifies as mini-good-news.
Are you open sourcing? (Otherwise I don't see the point of telling us it exists when we can't use it... this is becoming a peculiarity of this thread somehow ;D)

tsiv is the fabulous :) person who gave us x13. So...
legendary
Activity: 1400
Merit: 1050
I'll just leave this here.

Damn, that's slow. Seems to scale almost perfectly with hardware memory bandwidth when comparing with Claymore's AMD miner. R9 290X has 3.7x the theoretical memory bandwidth of the 750 Ti and does around 600 H/s. Surprise, 600 / 3.7 comes to around 162. Same story with the 270X and its roughly 2x mem bandwidth. Guess that's not entirely unexpected, since there's a whole lot of global memory access going on with the cryptonight algo. Still poking at it, but I doubt it'll improve much without C&C level voodoo magic, and that's well beyond my skillset :)

Nice work! What sort of power usage do you see?

Ah, right. Forgot to mention that. That particular rig is pulling around 270W from the wall when running cryptonight. I guess that qualifies as mini-good-news.
Are you open sourcing? (Otherwise I don't see the point of telling us it exists when we can't use it... this is becoming a peculiarity of this thread somehow ;D)
hero member
Activity: 672
Merit: 501
member
Activity: 117
Merit: 10
I'll just leave this here...



Damn, that's slow. Seems to scale almost perfectly with hardware memory bandwidth when comparing with Claymore's AMD miner. R9 290X has 3.7x the theoretical memory bandwidth of the 750 Ti and does around 600 H/s. Surprise, 600 / 3.7 comes to around 162. Same story with the 270X and its roughly 2x mem bandwidth. Guess that's not entirely unexpected, since there's a whole lot of global memory access going on with the cryptonight algo. Still poking at it, but I doubt it'll improve much without C&C level voodoo magic, and that's well beyond my skillset :)
Awesome! You're becoming this thread's hero!
full member
Activity: 137
Merit: 100
I'll just leave this here.

Damn, that's slow. Seems to scale almost perfectly with hardware memory bandwidth when comparing with Claymore's AMD miner. R9 290X has 3.7x the theoretical memory bandwidth of the 750 Ti and does around 600 H/s. Surprise, 600 / 3.7 comes to around 162. Same story with the 270X and its roughly 2x mem bandwidth. Guess that's not entirely unexpected, since there's a whole lot of global memory access going on with the cryptonight algo. Still poking at it, but I doubt it'll improve much without C&C level voodoo magic, and that's well beyond my skillset :)

Nice work! What sort of power usage do you see?

Ah, right. Forgot to mention that. That particular rig is pulling around 270W from the wall when running cryptonight. I guess that qualifies as mini-good-news.
member
Activity: 80
Merit: 10
I'll just leave this here.

Damn, that's slow. Seems to scale almost perfectly with hardware memory bandwidth when comparing with Claymore's AMD miner. R9 290X has 3.7x the theoretical memory bandwidth of the 750 Ti and does around 600 H/s. Surprise, 600 / 3.7 comes to around 162. Same story with the 270X and its roughly 2x mem bandwidth. Guess that's not entirely unexpected, since there's a whole lot of global memory access going on with the cryptonight algo. Still poking at it, but I doubt it'll improve much without C&C level voodoo magic, and that's well beyond my skillset :)

Nice work! What sort of power usage do you see?
full member
Activity: 137
Merit: 100
I'll just leave this here...



Damn, that's slow. Seems to scale almost perfectly with hardware memory bandwidth when comparing with Claymore's AMD miner. R9 290X has 3.7x the theoretical memory bandwidth of the 750 Ti and does around 600 H/s. Surprise, 600 / 3.7 comes to around 162. Same story with the 270X and its roughly 2x mem bandwidth. Guess that's not entirely unexpected, since there's a whole lot of global memory access going on with the cryptonight algo. Still poking at it, but I doubt it'll improve much without C&C level voodoo magic, and that's well beyond my skillset :)
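To redo the arithmetic above for other cards, here is a minimal C sketch. It assumes hashrate scales linearly with theoretical memory bandwidth, which is only the observation from this post and not a measured law. The function name is made up for illustration.

```c
/* Estimate the hashrate of a slower card from a faster reference card,
 * assuming throughput is limited purely by memory bandwidth.
 * bandwidth_ratio = reference card bandwidth / target card bandwidth. */
static double scaled_hashrate(double ref_hashrate, double bandwidth_ratio)
{
    if (bandwidth_ratio <= 0.0)
        return 0.0;                 /* a non-positive ratio makes no sense */
    return ref_hashrate / bandwidth_ratio;
}
```

With the numbers from the post, scaled_hashrate(600.0, 3.7) gives roughly 162 H/s for the 750 Ti.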
full member
Activity: 263
Merit: 100
Just wanted to add that I was having the same 'fan RPM' reporting CRAZY high numbers on my 4 MSI 750ti OC versions, but the 2 MSI 750ti TF versions worked fine.
Not sure if that in some way helps you track it down, but I am willing to test newer versions as well if you think anything has been addressed.
:)
Sad that I have only 3 cards (2 x MSI 750ti 2GD5/OC TF and an MSI 660ti 2GD5/OC) on my rig, and all of them show me the RPM without any problem. :(
I'm using an NVAPI function to get the RPM:

NvAPI_GPU_GetTachReading http://docs.nvidia.com/gameworks/content/gameworkslibrary/coresdk/nvapi/group__gpucooler.html

And that function returns only one pointer to a variable... So I will add a check: if it returns null, the string will be printed without the RPM info.

You could also just chop it down to the same 4-5 digits that a normal report would give and leave it at that...

Won't mess up the spacing/readability of all the cards in columns that way as well.

Care to share the window sizing you have it set for 'normally'?

Maybe another thing to add if you feel like it would be to only have the top section large enough for the amount of cards reporting (no idea how that works with that split screen library...)

I'm using formatted string output with a fixed size for each variable:
Code:
 "GPU #%1d[%1d]: %18s %6.0f/%-6.0fkhash/s %2u/%2uC %4uRPM(%2u%%) %4uMHz %2u%% %4uMHz %2u%% %4uMB(%2u%%)"
For the RPM variable I set 4 chars, but now I see that doesn't work with wrong numbers (it doesn't cut them).
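The reason is that a printf field width is only a minimum, so a bogus ten-digit reading simply widens the column. One possible way to keep the columns aligned, sketched in C under the assumption that anything above 9999 RPM is a bad reading: print a placeholder instead. The function name and the 9999 cutoff are made up for illustration.

```c
#include <stdio.h>

/* Assumption: anything above this is a bogus tach reading. It is also
 * the largest value that fits in a 4-character column. */
#define RPM_MAX_PLAUSIBLE 9999UL

/* Write the RPM into buf as exactly 4 characters: the right-aligned value
 * if it is plausible, otherwise "----" so the columns stay aligned. */
static void format_rpm(char *buf, size_t size, unsigned long rpm)
{
    if (rpm > RPM_MAX_PLAUSIBLE)
        snprintf(buf, size, "----");
    else
        snprintf(buf, size, "%4lu", rpm);
}
```

The result can then go into the info string through a %4s field instead of %4u.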

Update:
Re-uploaded the binary to GitHub.
All functions now return their values as DWORD (unsigned long).

%u    unsigned int     decimal number
%lu   unsigned long    decimal number

Changed %u to %lu in the info string; maybe this will help resolve the problem with wrong numbers in the variables.
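A minimal C sketch of the matching specifier, using a stand-in typedef for DWORD so it builds anywhere. The function name and the shortened format string are made up for illustration. As far as I know, unsigned int and unsigned long are both 32 bits on Windows, so this change makes the format string correct but may not be what causes the crazy numbers.

```c
#include <stdio.h>

/* Stand-in for the Windows DWORD type (unsigned long), defined here only
 * so the sketch compiles on any platform. */
typedef unsigned long dword_t;

/* Format the fan part of the info string. %lu matches unsigned long;
 * %u would expect unsigned int, which is a mismatch in standard C.
 * Returns the number of characters snprintf wanted to write. */
static int format_fan(char *buf, size_t size, dword_t rpm, dword_t percent)
{
    return snprintf(buf, size, "%4luRPM(%2lu%%)", rpm, percent);
}
```

For example, format_fan(buf, sizeof buf, 1500, 45) fills buf with "1500RPM(45%)".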
hero member
Activity: 526
Merit: 500
Its all about the Gold
Just wanted to add that I was having the same 'fan RPM' reporting CRAZY high numbers on my 4 MSI 750ti OC versions, but the 2 MSI 750ti TF versions worked fine.
Not sure if that in some way helps you track it down, but I am willing to test newer versions as well if you think anything has been addressed.
:)
Sad that I have only 3 cards (2 x MSI 750ti 2GD5/OC TF and an MSI 660ti 2GD5/OC) on my rig, and all of them show me the RPM without any problem. :(
I'm using an NVAPI function to get the RPM:

NvAPI_GPU_GetTachReading http://docs.nvidia.com/gameworks/content/gameworkslibrary/coresdk/nvapi/group__gpucooler.html

And that function returns only one pointer to a variable... So I will add a check: if it returns null, the string will be printed without the RPM info.

Just curious, could it be that certain cards are not fully enclosed and are maybe using an extra external fan source?
full member
Activity: 238
Merit: 100
Medichain: The Medical Big-Data Platform
Just wanted to add that I was having the same 'fan RPM' reporting CRAZY high numbers on my 4 MSI 750ti OC versions, but the 2 MSI 750ti TF versions worked fine.
Not sure if that in some way helps you track it down, but I am willing to test newer versions as well if you think anything has been addressed.
:)
Sad that I have only 3 cards (2 x MSI 750ti 2GD5/OC TF and an MSI 660ti 2GD5/OC) on my rig, and all of them show me the RPM without any problem. :(
I'm using an NVAPI function to get the RPM:

NvAPI_GPU_GetTachReading http://docs.nvidia.com/gameworks/content/gameworkslibrary/coresdk/nvapi/group__gpucooler.html

And that function returns only one pointer to a variable... So I will add a check: if it returns null, the string will be printed without the RPM info.

You could also just chop it down to the same 4-5 digits that a normal report would give and leave it at that...

Won't mess up the spacing/readability of all the cards in columns that way as well.

Care to share the window sizing you have it set for 'normally'?

Maybe another thing to add if you feel like it would be to only have the top section large enough for the amount of cards reporting (no idea how that works with that split screen library...)
full member
Activity: 238
Merit: 100
Medichain: The Medical Big-Data Platform
That would be okay, but that would just kill cryptos... 99% of the population of the world could not code to save face. All you're doing is encouraging more of "people with get more and people without get nothing"... not the way you want to start going.

On the other hand, I do think people need to be more open and donating to devs for the work. I just started to do that myself because I understand many things come with a cost, and the devs take time to make such a good item for all of us.

If you don't already think there are probably more people with their own code improvements and tweaks who are not releasing or even saying anything publicly, you are crazy.

Now you may be right at this level of the average 100 sat low end coin, they aren't investing the time or effort on a wide basis, but I guarantee you that anyone running a million dollar bitcoin server farm isn't releasing their code if they find something that makes theirs faster.

The info slowly gets out and becomes common knowledge for the most part, and then to support the masses they would release their code to move the 90% up to all the same level, but there is no way this isn't already the 'norm'.

They will release it and make sure the public at large has the tools to keep enough support for the market to keep it profitable for them, because you are right that a broader market is healthier overall, but it is in their best interests to do so only once they have accumulated enough coins and are going to profit from it.
full member
Activity: 263
Merit: 100
Just wanted to add that I was having the same 'fan RPM' reporting CRAZY high numbers on my 4 MSI 750ti OC versions, but the 2 MSI 750ti TF versions worked fine.
Not sure if that in some way helps you track it down, but I am willing to test newer versions as well if you think anything has been addressed.
:)
Sad that I have only 3 cards (2 x MSI 750ti 2GD5/OC TF and an MSI 660ti 2GD5/OC) on my rig, and all of them show me the RPM without any problem. :(
I'm using an NVAPI function to get the RPM:

NvAPI_GPU_GetTachReading http://docs.nvidia.com/gameworks/content/gameworkslibrary/coresdk/nvapi/group__gpucooler.html

And that function returns only one pointer to a variable... So I will add a check: if it returns null, the string will be printed without the RPM info.