Topic: CCminer(SP-MOD) Modded NVIDIA Maxwell / Pascal kernels. - page 953. (Read 2347664 times)

legendary
Activity: 1260
Merit: 1008
Has anybody got the standard and sp_ mod hashrates for Monero on the new 950?

And does anyone know if there's a way in Linux to get this card to overclock?

http://www.newegg.com/Product/Product.aspx?Item=N82E16814121980&ignorebbr=1&cm_re=PPSSKQKDFPGTFN-_-14-121-980-_-Product

sr. member
Activity: 427
Merit: 250
@myagui, yes, you are on point about enabling the fake X screens on multiple cards (which then allows you to set fan speeds, memory clocks and core clocks with the NVIDIA X Server Settings program). Details here: https://litecointalk.org/index.php?topic=16800.msg266088#msg266088

I am running an older driver (346.59 with CUDA 6.5.19, Lubuntu 14.04), so the attribute names and procedure may be slightly different.
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
It may be time for me to use multiple work items on one Lyra2 execution... but I'm worried the increased memory accesses for the matrix (which I've worked to eliminate) will eat me alive.

In my latest fixes I added more memory accesses and more speed. :-)

Perhaps this is the opposite of what you are supposed to do when writing a modded kernel, but I think it was fun.
Only latency wankers working for NVIDIA it seems...


Still slow though. You know the open source is slow, because the market makes it unprofitable to mine.

Your optimized kernels might end up here, wolf0, and I think they can afford to pay you enough:

sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
TPB52 7 is best for mining. My GTX 980 makes 11.2-11.6 MH/s. The GTX 960 makes 6 MH/s. Super!
For the rig (2x980 + 1x960) the hashrate is 28500!
PS. It is LYRA2v2! +100 git 1052 works!
Would you share your command line? I am running a 1x980 and 1x960 in a box. I have the hardest time configuring both together.
Thanks

I have submitted a fix to auto-configure better launch configs on the GTX 980 and the GTX 960. The 980 Ti and the 950 are probably off. @latest github
legendary
Activity: 2002
Merit: 1051
ICO? Not even once.
frames? ... i build them custom ...

the frame ( 1.5mm square aluminium tubing ) - edge connectors - screws and lugs - angle iron ( 90 degree edging ) ... all this comes to approx $70.00aud ...

but initially i had to buy the cutting and measuring equipment as well as the files and mallets ( yes plastic head mallets - not hammers ) ...

parts alone - not counting labour - if you are looking at adding the countless hours to design the thing in the first place ...

prototype 1 - prototype 2 and now prototype 3 is probably the design we are going with ...

hope that helps bathrobehero ...

#crysx

Thanks, it does. I see similar prices, but personally I don't find that to be worth it, and I'm much more comfortable putting that ~50 USD into better PSUs.

I must say that I didn't have any problem benchmarking AMD cards: if the room was hot, I'd put the fans at high speed, run the miner, wait a couple of minutes for it to stabilize, and that's it. I could make 100 changes to a kernel in a day and check them all accurately.

On NVIDIA I have throttling problems I can't easily fix (the cards reduce clock speed in a number of situations I just can't predict), overclocking/downclocking is more difficult because the cards tend to change clocks by themselves, and the hashrates fluctuate wildly and even change between ccminer runs.
The rig is headless so I only have nvidia-smi to work with, and it can't set the fan speed.
So when I make a little kernel speedup, I spend more time benchmarking it (to be sure it's indeed an improvement), than making the improvement itself :-/
Maybe there are some nvidia-smi settings to make it more stable?
Or maybe on windows it's different...
In the end I may need a workstation with an NVIDIA card as the main card, and work on that.
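
One way to separate a real kernel speedup from run-to-run noise is to collect several hashrate readings per build and compare the means against their spread. A minimal Python sketch of that idea; the sample readings and the two-standard-error threshold are illustrative assumptions, not measurements from this thread:

```python
import math
import statistics

def summarize(samples):
    """Return (mean, standard error of the mean) for a list of hashrate readings."""
    mean = statistics.mean(samples)
    sem = statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, sem

def looks_faster(baseline, candidate, k=2.0):
    """True if candidate's mean beats baseline's by more than k combined standard errors."""
    b_mean, b_sem = summarize(baseline)
    c_mean, c_sem = summarize(candidate)
    return (c_mean - b_mean) > k * math.sqrt(b_sem ** 2 + c_sem ** 2)

# Hypothetical kH/s readings from repeated runs of two builds.
old_build = [5980.0, 6010.0, 5995.0, 6020.0, 5990.0]
new_build = [6080.0, 6100.0, 6070.0, 6110.0, 6090.0]
print(looks_faster(old_build, new_build))
```

The more the card throttles, the wider the spread gets and the more readings are needed before a small gain stands out, which is why fixed clocks and fans make benchmarking so much quicker.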

You should probably disable boost in the BIOS of the card. Or even set custom fan speeds/clock speeds if you want.
hero member
Activity: 677
Merit: 500
TPB52 7 is best for mining. My GTX 980 makes 11.2-11.6 MH/s. The GTX 960 makes 6 MH/s. Super!
For the rig (2x980 + 1x960) the hashrate is 28500!
PS. It is LYRA2v2! +100 git 1052 works!

Would you share your command line? I am running a 1x980 and 1x960 in a box. I have the hardest time configuring both together.

Thanks
All works by default, card OC'd to 1380 MHz. Ccminer recompiled with TPB52=7 set in lyra2REv2.cu. Thanks to _SP!
full member
Activity: 181
Merit: 100
TPB52 7 is best for mining. My GTX 980 makes 11.2-11.6 MH/s. The GTX 960 makes 6 MH/s. Super!
For the rig (2x980 + 1x960) the hashrate is 28500!
PS. It is LYRA2v2! +100 git 1052 works!

Would you share your command line? I am running a 1x980 and 1x960 in a box. I have the hardest time configuring both together.

Thanks
legendary
Activity: 1154
Merit: 1001
@t-nelson: extending the invite that every cuda miner developer has probably received at some point:
Some of the usual suspects (both developers & community) hang out at #ccminer @freenode. It is not the most active channel in freenode, but it is a great place to exchange ideas with the other developers and the few of us there always try to help each other out (testing, etc).

In case IRC is not your usual thing, maybe webchat will get you there: #ccminer webchat
member
Activity: 70
Merit: 10
I remember getting the number of SMM/SMXs in the CryptoNight miner... can't remember how, though.

Yep. TSIV's Cryptonight ccminer complains whenever the blocks/threads (don't recall which one now) is not a multiple of the SMX/SMM on the card. IIRC, it does this also accurately for cards that did not exist when it was released, so the approach is not LUT based.

Right - I can't remember if I wrote that or not.

Nice, I'll have to dig into these APIs a bit deeper once I get all of the cleanup I have planned done.
sr. member
Activity: 506
Merit: 252
Final release of CUDA 7.5 toolkit (including some fixes)!
legendary
Activity: 1797
Merit: 1028
Or maybe on windows it's different...

Yeps. On Windows it is extremely simple to set fixed clocks, fans, overclocking, so you can easily have a "benchmark platform".

With a headless Linux system I don't think there's any solution for fixed fans yet, others might know differently. I previously had fixed fans and clocks on Linux, but I perfectly recall that I specifically had to configure/attach a monitor in order to get that working at the time.

to set the fan on linux just test this

https://gist.github.com/squadbox/e5b5f7bcd86259d627ed

Thanks, but I fear it needs a monitor or the x session will not start... Or will it?

NOT ANY MORE--

If you have one of the latest driver packages, you can use nvidia-xconfig to allow cards without monitors to be adjusted:

"sudo nvidia-xconfig -a --cool-bits=28 --allow-empty-initial-configuration" (Source: https://bitcointalksearch.org/topic/m.12279696)

The number "28" is a bitsum that signifies clocks, fanspeed, and power controls.

This was in this thread, earlier. Note that "-a" and the longer "--allow-empty-initial-configuration" are NOT equivalent: "-a" is short for "--enable-all-gpus".
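
The value 28 breaks down as 4 + 8 + 16. A small Python sketch of that decoding follows; the bit-to-feature mapping is an assumption based on the Coolbits description in NVIDIA's driver README (in particular, bit 16 is taken to be the overvoltage control, which is presumably what "power controls" refers to), so check the README for your driver version:

```python
# Coolbits bit meanings (assumed from NVIDIA's driver README; they can
# differ between driver versions).
COOLBITS = {
    4: "manual fan control",
    8: "clock offsets (overclocking)",
    16: "overvoltage",
}

def decode_coolbits(value):
    """List the features enabled by a Coolbits value, lowest bit first."""
    return [name for bit, name in sorted(COOLBITS.items()) if value & bit]

print(decode_coolbits(28))
```

A smaller value such as 12 would, under the same assumed mapping, enable fan and clock control without the overvoltage bit.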

--scryptr
legendary
Activity: 2716
Merit: 1094
Black Belt Developer
Thanks everyone, I'll give all this a try asap.
Hopefully I'll then be able to test small kernel enhancements :-)
legendary
Activity: 1154
Merit: 1001
I dug around the Kopiemtu thread for the relevant bits, and seems this is the most important one:
In xorg.conf, just before the coolbits statement:
Code:
Option         "AllowEmptyInitialConfiguration" "True"

If I read right, this is how one sets up fake monitors on headless systems.

This will not work for every driver version. After a driver update you will probably need this command to set up a fresh xorg.conf:
Code:
sudo nvidia-xconfig --enable-all-gpus

I dunno, since I don't have a Linux system currently to test this on. I'd already been able to set up fake monitors (manually) before, so I know I didn't use this at the time, but I did have 1 monitor attached to the 1st GPU, so mine was not a fully headless setup.

Edit: better yet, check the source. Hashbrown looks to be doing good stuff over there! :-)
https://litecointalk.org/index.php?topic=16800.msg266088#msg266088

Cheers!
newbie
Activity: 29
Merit: 0
Or maybe on windows it's different...

Yeps. On Windows it is extremely simple to set fixed clocks, fans, overclocking, so you can easily have a "benchmark platform".

With a headless Linux system I don't think there's any solution for fixed fans yet, others might know differently. I previously had fixed fans and clocks on Linux, but I perfectly recall that I specifically had to configure/attach a monitor in order to get that working at the time.

to set the fan on linux just test this

https://gist.github.com/squadbox/e5b5f7bcd86259d627ed

Thanks, but I fear it needs a monitor or the x session will not start... Or will it?

I have no monitor connected,
but I am running a Linux 14.04 desktop version.
legendary
Activity: 1470
Merit: 1114
Or maybe on windows it's different...

Yeps. On Windows it is extremely simple to set fixed clocks, fans, overclocking, so you can easily have a "benchmark platform".

With a headless Linux system I don't think there's any solution for fixed fans yet, others might know differently. I previously had fixed fans and clocks on Linux, but I perfectly recall that I specifically had to configure/attach a monitor in order to get that working at the time.

to set the fan on linux just test this

https://gist.github.com/squadbox/e5b5f7bcd86259d627ed

Thanks, but I fear it needs a monitor or the x session will not start... Or will it?

Look back about 10 pages in this thread: I got an X session, OC and fan control on a headless card. Hashbrown pointed to another thread with a detailed procedure for doing this. Incidentally, it only offers fixed fan settings.
The OC is a fixed offset from the base clock, so the card can still throttle.
legendary
Activity: 2716
Merit: 1094
Black Belt Developer
Or maybe on windows it's different...

Yeps. On Windows it is extremely simple to set fixed clocks, fans, overclocking, so you can easily have a "benchmark platform".

With a headless Linux system I don't think there's any solution for fixed fans yet, others might know differently. I previously had fixed fans and clocks on Linux, but I perfectly recall that I specifically had to configure/attach a monitor in order to get that working at the time.

to set the fan on linux just test this

https://gist.github.com/squadbox/e5b5f7bcd86259d627ed

Thanks, but I fear it needs a monitor or the x session will not start... Or will it?
legendary
Activity: 1154
Merit: 1001
I remember getting the number of SMM/SMXs in the CryptoNight miner... can't remember how, though.

Yep. TSIV's Cryptonight ccminer complains whenever the blocks/threads (don't recall which one now) is not a multiple of the SMX/SMM on the card. IIRC, it does this also accurately for cards that did not exist when it was released, so the approach is not LUT based.
member
Activity: 70
Merit: 10
These settings should be taken out of the hardcoded kernel and made adjustable on the command line, like in scrypt: -l 7x19 (7 threads per block with -X intensity 19).
Because on the 970, 9 seems to be better.

@sp_:
Shouldn't the optimal blocks/threads values always be relative to the number of SMM/SMXs in any given card? It might be possible to have ccminer automatically detect - and adjust - to the number of SMM/SMXs on the available cards, and then just use the intensity parameter for the fine tuning.

Seems reasonable.  I don't think we can directly query the physical configuration of the chipset.  But it wouldn't be much to maintain a LUT.  The major hurdle would probably be reliably parsing the compilation output for the kernel.
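
For what it's worth, the CUDA runtime does report the multiprocessor count (the `multiProcessorCount` field of `cudaDeviceProp`, filled in by `cudaGetDeviceProperties`), so a LUT may not be needed. Given that number, aligning the grid could look roughly like the Python sketch below. The function names are hypothetical, and treating intensity as "2**intensity threads in total" is an assumption about how a miner might interpret it, not a description of ccminer's code:

```python
def round_to_sm_multiple(blocks, sm_count):
    """Round a block count up to the next multiple of the SM count (at least one per SM)."""
    if sm_count <= 0:
        raise ValueError("sm_count must be positive")
    multiples = max(1, -(-blocks // sm_count))  # ceiling division
    return multiples * sm_count

def grid_for_intensity(intensity, threads_per_block, sm_count):
    """Pick a block count covering 2**intensity threads, aligned to the SM count."""
    total_threads = 1 << intensity
    blocks = -(-total_threads // threads_per_block)  # ceiling division
    return round_to_sm_multiple(blocks, sm_count)

# Hypothetical card: 16 SMMs, 128 threads per block, intensity 19.
print(grid_for_intensity(19, 128, 16))
```

With this approach the intensity parameter stays the only knob for fine tuning, and the rounding quietly absorbs the difference between cards with 13, 16 or 22 multiprocessors.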
newbie
Activity: 29
Merit: 0
Or maybe on windows it's different...

Yeps. On Windows it is extremely simple to set fixed clocks, fans, overclocking, so you can easily have a "benchmark platform".

With a headless Linux system I don't think there's any solution for fixed fans yet, others might know differently. I previously had fixed fans and clocks on Linux, but I perfectly recall that I specifically had to configure/attach a monitor in order to get that working at the time.

to set the fan on linux just test this

https://gist.github.com/squadbox/e5b5f7bcd86259d627ed
legendary
Activity: 1154
Merit: 1001
These settings should be taken out of the hardcoded kernel and made adjustable on the command line, like in scrypt: -l 7x19 (7 threads per block with -X intensity 19).
Because on the 970, 9 seems to be better.

@sp_:
Shouldn't the optimal blocks/threads values always be relative to the number of SMM/SMXs in any given card? It might be possible to have ccminer automatically detect - and adjust - to the number of SMM/SMXs on the available cards, and then just use the intensity parameter for the fine tuning.

I lost this bit of information somewhere along the thread history, but seeing as you now have -X, is -i still doing anything on your fork? If both parameters can be used, what the F#%@ is each one doing? Thanks for clearing this up :-)

It's starting to look a bit mad to find the best settings for one's card, if there are 3 distinct parameters to tune (-X, -i, -l).
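
If the three knobs were folded into a single "-l 7x19" style value, as suggested in the quote above, splitting it could be as simple as the Python sketch below. The function name and the reading of the two numbers (threads-per-block setting first, intensity second) follow the quoted post and are assumptions, not how ccminer actually parses -l:

```python
def parse_launch_config(text):
    """Split a 'TxI' string such as '7x19' into (threads_per_block_setting, intensity)."""
    first, sep, second = text.lower().partition("x")
    if not sep or not first.isdigit() or not second.isdigit():
        raise ValueError(f"expected something like '7x19', got {text!r}")
    return int(first), int(second)

print(parse_launch_config("7x19"))
```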