Topic: [ANN] cudaMiner & ccMiner CUDA based mining applications [Windows/Linux/MacOSX] - page 271. (Read 3426936 times)

sr. member
Activity: 291
Merit: 250
8x60 gives me 261h/s per 750 ti....EVGA SC.
Was 240-243h/s
Thanks!
ivan, how many coins (Monero) can I expect from a 750 Ti?
Thanks for the reply!
That's kinda low on profit..
Anything worth mining now? Smiley
legendary
Activity: 1154
Merit: 1001
With 260 H/s you should get 0.34 XMR per Day
Maybe 20% more, as the BOT is pessimistic and accounts for 20% orphans, which is not accurate with the recent pool performance.

@tsiv: Me owes you a pizza! Let's exchange PMs on this.

Cheers,
~ Myagui
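
For anyone who wants to redo the estimate with a different orphan rate, here is a minimal sketch of the arithmetic. It assumes the bot simply multiplies an ideal payout by (1 - orphan rate); that model and the function name `adjust_for_orphans` are assumptions for illustration, not a description of how the bot actually works.

```python
def adjust_for_orphans(bot_estimate, assumed_orphan_rate, actual_orphan_rate):
    """Rescale a payout estimate from one orphan rate to another.

    Assumption (hypothetical model): estimate = ideal * (1 - orphan_rate).
    """
    ideal = bot_estimate / (1.0 - assumed_orphan_rate)
    return ideal * (1.0 - actual_orphan_rate)


if __name__ == "__main__":
    bot = 0.34  # XMR/day quoted in the post for roughly 260 H/s
    for actual in (0.20, 0.10, 0.0):
        adjusted = adjust_for_orphans(bot, 0.20, actual)
        print(f"orphans {actual:.0%}: {adjusted:.3f} XMR/day")
```

Under this model, dropping the orphan rate from 20% to zero gives 0.34 / 0.8 = 0.425 XMR per day, which is about 25% above the bot's figure, so "maybe 20% more" is the right ballpark.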

sr. member
Activity: 291
Merit: 250
8x60 gives me 261h/s per 750 ti....EVGA SC.
Was 240-243h/s
Thanks!
ivan, how many coins (Monero) can I expect from a 750 Ti?
sr. member
Activity: 329
Merit: 250
Aaand I'm back. Had to focus a bit on shit I've been neglecting lately, including but not limited to my actual job and sleep  Grin

Say bye to (most of) the lag on Windows if you're willing to take a slight hit in performance. Basically I'm now splitting the heaviest part of the Cryptonight core into smaller batches with a little sleep between launching the batches. The default is still no splitting and no sleeping on Linux, on Windows it gets split into 64 batches with 100 microseconds sleep between the batches. Both values can be set on the command line per-device. I'm running it as I type this and while there is still noticeable lag, it's not too bad. The performance hit is pretty negligible on the defaults, something like 5 H/s. Well, on my system anyway.

Latest win32 binary at https://github.com/tsiv/ccminer-cryptonight/releases/tag/v0.14 and source at https://github.com/tsiv/ccminer-cryptonight as you might expect.
This commit broke mining under Linux, giving a lot of:
Code:
GPU #0: result for nonce $00000069 does not validate on CPU!
Reverting the commit and building again fixed it.
full member
Activity: 252
Merit: 102
OPEN Platform - Powering Blockchain Acceptance
Does anyone have a guide on how to set up a Monero wallet on Linux?
full member
Activity: 137
Merit: 100
... I might be wrong on this but 2 GB should be enough for 8x96. I can do 8x120 on my Linux rig with a 2 GB 750 Ti:

...both work with kopiemtu too.
8x60 & 8x120 have nearly the same performance. 8x96 is a lot worse.
What gives you the most performance on your Linux rig?

thanks for the update. 

I'm running on 8x60, haven't found anything that works better. And like you said, 8x120 is practically the same, but just a little bit slower.
hero member
Activity: 812
Merit: 1000
8x60 gives me 261h/s per 750 ti....EVGA SC.
Was 240-243h/s
Thanks!
hero member
Activity: 676
Merit: 500
Installing the new NVIDIA beta driver crashed ccminer no matter what algo; I had to roll back the driver to fix it. Anyone with the same problem?
sr. member
Activity: 330
Merit: 252
... I might be wrong on this but 2 GB should be enough for 8x96. I can do 8x120 on my Linux rig with a 2 GB 750 Ti:

...both work with kopiemtu too.
8x60 & 8x120 have nearly the same performance. 8x96 is a lot worse.
What gives you the most performance on your Linux rig?

thanks for the update. 
full member
Activity: 137
Merit: 100
Not gonna lie, working on this algo is making a huge dent on the whiskey fund. Cheers mate Smiley

Here's 0.5 BTC in hope the excessive Whiskey supply will keep you short of our own performance benchmark Wink

Transaction-ID 7fdaf9602034832a8045887c7b592b62d53b74377ddbf3d958129b9ad8d4ed55-000

seriously, great work on your ccminer forks. Keep it up!

Christian


Wow, didn't see that coming. It's not every day you take a man's work, more or less turn it against him and then get paid by him. Always wondered how much I was stepping on your toes with my release, guess that answers that question. Thank you, good sir, thank you very much Smiley

Anyone played around with the launch config stuff for the TSIV version? I'm finding that 4x80 is far from optimal on certain systems. 6x60 gave me about a 25% boost on a GTX 770 and GTX 780, while on a GTX 860M 4x40 basically tripled my performance (from 50 H/s to 170 H/s). It would be great to hear what others are seeing with the -l parameter.

someone needs to come up with an autotune. Just sayin'...

NOTE: separate autotuning would be required for the 3 kernels of the algorithm.


Thought of that on the side, might be doable but some configs do so badly it might be TDR city all over again. I've managed values that give like 30 H/s compared to the inexplicably optimal ones that give around 280 H/s. Should probably take a poke at it anyway, at some point.
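
A rough idea of what such an autotune could look like, as a sketch only. The `benchmark` callable is hypothetical: it stands in for "run the miner briefly with this -l value and report H/s", and it is assumed to raise `RuntimeError` when a config fails to launch. The sample figures are the 780 Ti numbers reported elsewhere in this thread. This tunes a single launch config; tuning the 3 kernels separately would mean running the same search once per kernel.

```python
# 780 Ti hash rates (H/s) as reported by a poster in this thread.
REPORTED_780TI = {"6x120": 475, "6x128": 450, "4x120": 440, "4x128": 430}


def fake_benchmark(cfg):
    """Stand-in for a real short mining run; hypothetical helper."""
    if cfg not in REPORTED_780TI:
        raise RuntimeError(f"config {cfg} failed to launch")
    return REPORTED_780TI[cfg]


def autotune(candidates, benchmark, min_hashrate=0.0):
    """Try each launch config and return (best_config, best_hashrate).

    Configs that raise RuntimeError are skipped, as are configs that
    do not beat min_hashrate.
    """
    best_cfg, best_rate = None, min_hashrate
    for cfg in candidates:
        try:
            rate = benchmark(cfg)
        except RuntimeError:
            continue  # treat a failed launch as "not a candidate"
        if rate > best_rate:
            best_cfg, best_rate = cfg, rate
    return best_cfg, best_rate


if __name__ == "__main__":
    configs = ["4x120", "4x128", "6x120", "6x128", "8x200"]
    print(autotune(configs, fake_benchmark))
```

A real version would also need a timeout per run, since the really bad configs are exactly the ones that risk tripping the driver watchdog.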

Tried to pay for a few beers and long drinks in bitcoins tonight. Didn't work because the stupid Windows Phone wallet software confused the decimal dot and comma in the German version of Windows Phone. After 2 embarrassing attempts that ended in a failure message, I shelled out 30 Euros in cash.

Oh well... the sad sad state of Windows Phone.  Of course switching the entire phone over to US English localization would have worked.



The slight relief of seeing I'm not the only one getting fsck'd by Microsoft software, priceless  Grin

anyone actually compare CUDA 5.5 to CUDA 6.0 compiles? see if there really is a speed difference?

I actually compiled ccminer using 6.0 for quite some time, until I finally got fed up with editing the VC project files every time a new version came out. Nothing gained and nothing lost on going 5.5 -> 6.0 as far as I could tell.

Obviously this setting is too much for the 2 GB of RAM on the 750 Ti. But the new miner works fine with the old settings, 40 blocks of 8 threads.

Should be fine in theory, but if I'm not mistaken cudaMalloc requires a contiguous chunk of memory or it fails. So if there is even the tiniest allocation somewhere in the middle, the big allocation fails unless there is a contiguous chunk of around 1.5 GB on either side of the smaller allocation. I might be wrong on this but 2 GB should be enough for 8x96. I can do 8x120 on my Linux rig with a 2 GB 750 Ti:

Code:
    FB Memory Usage
        Total                       : 2047 MiB
        Used                        : 1970 MiB
        Free                        : 77 MiB
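
The numbers above line up with a simple back-of-the-envelope calculation. CryptoNight uses a 2 MiB scratchpad per hash; the sketch below assumes one hash in flight per GPU thread, which is an assumption about the miner rather than something stated in this thread. The function name is made up for illustration.

```python
SCRATCHPAD_MIB = 2  # CryptoNight scratchpad size per hash


def long_state_mib(threads_per_block, blocks):
    """Estimated size of the big 'long state' allocation in MiB.

    Assumes one hash (one scratchpad) per GPU thread.
    """
    return threads_per_block * blocks * SCRATCHPAD_MIB


if __name__ == "__main__":
    for threads, blocks in [(8, 40), (8, 60), (8, 96), (8, 120)]:
        size = long_state_mib(threads, blocks)
        print(f"-l {threads}x{blocks}: {size} MiB")
```

With these assumptions 8x96 needs 1536 MiB (the "around 1.5 GB" mentioned above) and 8x120 needs 1920 MiB, which fits the 1970 MiB shown as used once the smaller buffers are added on top.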
full member
Activity: 137
Merit: 100
Aaand I'm back. Had to focus a bit on shit I've been neglecting lately, including but not limited to my actual job and sleep  Grin

Say bye to (most of) the lag on Windows if you're willing to take a slight hit in performance. Basically I'm now splitting the heaviest part of the Cryptonight core into smaller batches with a little sleep between launching the batches. The default is still no splitting and no sleeping on Linux, on Windows it gets split into 64 batches with 100 microseconds sleep between the batches. Both values can be set on the command line per-device. I'm running it as I type this and while there is still noticeable lag, it's not too bad. The performance hit is pretty negligible on the defaults, something like 5 H/s. Well, on my system anyway.

Latest win32 binary at https://github.com/tsiv/ccminer-cryptonight/releases/tag/v0.14 and source at https://github.com/tsiv/ccminer-cryptonight as you might expect.
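
For the curious, the batching idea can be sketched roughly like this. This is not the actual ccminer code, just an illustration of "split the work, sleep a little between launches"; `run_in_batches` and the `launch` callback are hypothetical names, and the real thing launches CUDA kernels rather than calling a Python function.

```python
import time


def run_in_batches(total_items, batches=1, sleep_us=0, launch=None):
    """Split total_items into `batches` chunks, pausing between launches.

    Returns the list of (start, count) chunks that were launched.
    batches=1 and sleep_us=0 means no splitting and no sleeping.
    """
    if batches < 1:
        raise ValueError("batches must be at least 1")
    base, extra = divmod(total_items, batches)
    chunks = []
    start = 0
    for i in range(batches):
        count = base + (1 if i < extra else 0)
        if count == 0:
            break  # fewer items than batches, nothing left to launch
        if launch is not None:
            launch(start, count)  # stand-in for one kernel launch
        chunks.append((start, count))
        start += count
        if sleep_us > 0 and start < total_items:
            time.sleep(sleep_us / 1_000_000)  # give the display a turn
    return chunks


if __name__ == "__main__":
    # 64 batches with 100 microseconds between them, as in the Windows defaults
    chunks = run_in_batches(1000, batches=64, sleep_us=100)
    print(len(chunks), "batches,", sum(c for _, c in chunks), "items")
```

Shorter launches give the display driver a chance to service the desktop between batches, which is where the reduced lag comes from; the sleeps are also where the small performance hit comes from.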
hero member
Activity: 676
Merit: 500
Obviously this setting is too much for the 2 GB of RAM on the 750 Ti. But the new miner works fine with the old settings, 40 blocks of 8 threads.
member
Activity: 77
Merit: 10
Also getting the same errors regardless the setting:

[2014-07-05 00:09:42] 1 miner threads started, using 'cryptonight' algorithm.
[2014-07-05 00:09:42] GPU #0: GeForce GTX 750 Ti, using 96 blocks of 8 threads
[2014-07-05 00:09:42] Starting Stratum on stratum+tcp://xmr.extremehash.com:9999

[2014-07-05 00:09:42] GPU #0: FATAL: failed to allocate device memory for long state
legendary
Activity: 2002
Merit: 1051
ICO? Not even once.
I keep getting this with random GPUs:


I'm using windows, but I'm also using the onboard GPU so exactly 0 RAM is being used on the cards.

Not sure if there is a memory backoff value in ccminer like there was in cudaminer:
Code:
#if WIN32
device_backoff[thr_id] = 12;
#else
device_backoff[thr_id] = 2;
#endif

The only way I don't get that problem is if I use a much lower than optimal kernel setting.
hero member
Activity: 494
Merit: 500
Carlo, it crashes my card in CryptoNight for me too. 750 Ti, Win7 64, using the stock command line; other algos work OK.
newbie
Activity: 29
Merit: 0
Attached is nvMiner or otherwise known as ccminer 1.2U-B (U=unified)

Same as the last version but with possible improvements to CryptoNight
https://ayarscloud.tonidoid.com/urljfz5fy

If you run CryptoNight can you try this version with and without your usual -l and let me know what hash speeds you get?
Also what cards you have?

Have fun,
Carlo

Thanks for putting this all together in one package!

For CryptoNight without the launch config, nvminer gave me 64 blocks of 8 threads (8x64), which gave me about 385 h/s on my 780 Ti Classified. -l 4x120 gives me 475 h/s, for reference.

Nice.  So 4x120 on the 780 TI gives you the highest rate?  Could you do me a favor?  Please do a "cut and paste" of the specific way the card is shown. Examples:
GeForce GTX 750 Ti
GeForce GTX 660
GeForce GT 740M

Could you also try 4x128 and let me know the hash rate you get with that?  I think block sizes of multiples of 32 are closer to optimal.
Here is what I've found from testing:
GeForce GTX 750 Ti  8x96
GeForce GTX 660  7x96
GeForce GT 740M  9x64
Default for unknown cards 8x64

I'm still testing different variations but those are the best I've found so far.  If anyone has any better configs I'm all for trying it and can make it the default for those cards.

Carlo
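
As an illustration of why the exact device string matters, per-card defaults like the ones listed above could be looked up with something as simple as the sketch below. This is a guess at the general approach, not nvMiner's actual code; the names `KNOWN_CARDS` and `default_launch_config` are made up for the example.

```python
DEFAULT_LAUNCH = "8x64"  # fallback for unknown cards

# Best configs found so far, keyed by the exact name the card reports.
KNOWN_CARDS = {
    "GeForce GTX 750 Ti": "8x96",
    "GeForce GTX 660": "7x96",
    "GeForce GT 740M": "9x64",
}


def default_launch_config(device_name):
    """Return the -l value to use when the user does not pass one."""
    return KNOWN_CARDS.get(device_name.strip(), DEFAULT_LAUNCH)


if __name__ == "__main__":
    for name in ("GeForce GTX 750 Ti", "GeForce GTX 780 Ti"):
        print(name, "->", default_launch_config(name))
```

Since this is an exact string match, a card whose name is spelled even slightly differently falls through to the 8x64 default.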

Whoops!  475 h/s was actually at -l 6x120.

Did some short-run testing (FWIW, my rig is running on 8 GB of RAM and on an i5-4670k @ 4.5 GHz.):
GeForce GTX 780 Ti 6x128 : 450 h/s
GeForce GTX 780 Ti 4x128 : 430 h/s
GeForce GTX 780 Ti 4x120 : 440 h/s

I might do some more testing at a later date for higher thread numbers, but, from what it looks like so far, other choices are still more profitable for higher-end cards (like the 780 Ti) to mine.
full member
Activity: 168
Merit: 100

I must be missing something: mine crashes the driver (ver 340.43) within 3 seconds (forgot to add: GPU is a 750 Ti FTW, stock).

What algo/kernel are you trying to run?

If CryptoNight, did you previously run the reg hack that changes the display driver timeout?

How much memory?

How many GPU cards? 

What is the command line you are using?

Did you try a different algo just to see if that would work?

Carlo
sr. member
Activity: 434
Merit: 250
"The mass of men lead lives of quiet desperation."
Attached is nvMiner or otherwise known as ccminer 1.2U-B (U=unified)

Same as the last version but with possible improvements to CryptoNight
https://ayarscloud.tonidoid.com/urljfz5fy

If you run CryptoNight can you try this version with and without your usual -l and let me know what hash speeds you get?
Also what cards you have?

Have fun,
Carlo

Thanks for putting this all together in one package!

For CryptoNight without the launch config, nvminer gave me 64 blocks of 8 threads (8x64), which gave me about 385 h/s on my 780 Ti Classified. -l 4x120 gives me 475 h/s, for reference.

Nice.  So 4x120 on the 780 TI gives you the highest rate?  Could you do me a favor?  Please do a "cut and paste" of the specific way the card is shown. Examples:
GeForce GTX 750 Ti
GeForce GTX 660
GeForce GT 740M

Could you also try 4x128 and let me know the hash rate you get with that?  I think block sizes of multiples of 32 are closer to optimal.
Here is what I've found from testing:
GeForce GTX 750 Ti  8x96
GeForce GTX 660  7x96
GeForce GT 740M  9x64
Default for unknown cards 8x64

I'm still testing different variations but those are the best I've found so far.  If anyone has any better configs I'm all for trying it and can make it the default for those cards.

Carlo

I must be missing something: mine crashes the driver (ver 340.43) within 3 seconds (forgot to add: GPU is a 750 Ti FTW, stock).
full member
Activity: 168
Merit: 100
Attached is nvMiner or otherwise known as ccminer 1.2U-B (U=unified)

Same as the last version but with possible improvements to CryptoNight
https://ayarscloud.tonidoid.com/urljfz5fy

If you run CryptoNight can you try this version with and without your usual -l and let me know what hash speeds you get?
Also what cards you have?

Have fun,
Carlo

Thanks for putting this all together in one package!

For CryptoNight without the launch config, nvminer gave me 64 blocks of 8 threads (8x64), which gave me about 385 h/s on my 780 Ti Classified. -l 4x120 gives me 475 h/s, for reference.

Nice.  So 4x120 on the 780 TI gives you the highest rate?  Could you do me a favor?  Please do a "cut and paste" of the specific way the card is shown. Examples:
GeForce GTX 750 Ti
GeForce GTX 660
GeForce GT 740M

Could you also try 4x128 and let me know the hash rate you get with that?  I think block sizes of multiples of 32 are closer to optimal.
Here is what I've found from testing:
GeForce GTX 750 Ti  8x96
GeForce GTX 660  7x96
GeForce GT 740M  9x64
Default for unknown cards 8x64

I'm still testing different variations but those are the best I've found so far.  If anyone has any better configs I'm all for trying it and can make it the default for those cards.

Carlo
legendary
Activity: 3164
Merit: 1003
Be patient, there will be more opportunities. Right now I can't complain, I made ROI with JPC and MIN. Let's start piling up some BTC for those 880s. Tongue

how many kh/s in scrypt do you think it will do?
not enough for its power usage compared to asic.
based on the 750ti, it should do around 1.5MHash/s

That would be great on a 6 card rig with X11 etc.  Grin