Topic: CCminer(SP-MOD) Modded NVIDIA Maxwell / Pascal kernels. - page 709. (Read 2347641 times)

legendary
Activity: 3164
Merit: 1003
1. 0.1BTC: Pentablake +100-120% (3 releases)
4. 0.1BTC: All nicehash algos optimized. 0-10% (5 releases)(x11,x13,x15,nist5,quark,lyra2v2,neoscrypt)
5. 0.1BTC: decred +18-25% (7 releases) (Full sourcecode(linux) 0.4BTC)
6. 0.2BTC: Vcash (+13%) + decred (+18-25%) (0.1 BTC discount for the decred buyers) (4 releases)
I will put these 4 in one build (1.7.6 refork). The price will be 0.4BTC. It includes the unpublished decred #8, Vcash #5 and Quark #6. Discounts for the donators, so if you have already bought decred and VCASH you only need 0.2BTC more.
I just sent you .1 BTC for
4. 0.1BTC: All nicehash algos optimized. 0-10% (5 releases)(x11,x13,x15,nist5,quark,lyra2v2,neoscrypt)
I can't afford .4 BTC

But you have already donated 0.3 BTC, haven't you? (From the beginning of the SP-MOD project.) Donators get discounts.

Yes I did and I sent you another .1 BTC  Smiley
Net amount: -0.10010213 BTC
Transaction ID: ce6839a959c74297112b0c24b34859223cfffeffd4016ea4c72c42bf3dac887a-000
Thank you
Sent you a PM.
Sp, how are you doing on the final build? Waiting for the quark, x11 and x13 nicehash algos  Smiley
And yes, now it's a total of .4 BTC I sent... thx
EDIT: There is an increase in the last one you sent for DECRED, but I only tested it for a few minutes because I'm working on my first conf file for something and it's taking me a long time. Smiley
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
Decred #4 is the fastest if you run it with -i 29.6. But the hashrate on the pool is lower because of a uint32_t overflow in the nonce per thread loop. I have corrected this in #5.
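
To illustrate the kind of uint32_t overflow described above, here is a generic sketch. It is not sp_'s actual kernel code: the helper names and the idea of a fixed number of nonces per thread are assumptions made for the example.

```cpp
#include <cstdint>

// Hypothetical helper: last nonce one launch would touch, computed in 32 bits.
// The multiplication wraps silently modulo 2^32 when the range is too large.
uint32_t last_nonce_wrapped(uint32_t first_nonce, uint32_t threads,
                            uint32_t nonces_per_thread) {
    return first_nonce + threads * nonces_per_thread - 1u;
}

// Same value computed in 64 bits, so nothing wraps.
uint64_t last_nonce_exact(uint32_t first_nonce, uint32_t threads,
                          uint32_t nonces_per_thread) {
    return static_cast<uint64_t>(first_nonce) +
           static_cast<uint64_t>(threads) * nonces_per_thread - 1u;
}

// True when the launch would run past the 32-bit nonce space.
bool launch_overflows(uint32_t first_nonce, uint32_t threads,
                      uint32_t nonces_per_thread) {
    return last_nonce_exact(first_nonce, threads, nonces_per_thread) > UINT32_MAX;
}
```

When the range wraps, some threads end up rescanning nonces that were already covered, so the miner can report a high local hashrate while the pool sees fewer valid shares.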

Default intensity for the 980ti is 31, but it runs faster with -i 31.9. With overclocking you can reach 3.3GHASH.
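
A rough sketch of how an intensity value could translate into a thread count. It assumes 2^intensity threads, with the fractional part interpolated linearly toward the next power of two. The mapping and the function name are assumptions for illustration, not copied from the ccminer sources.

```cpp
#include <cmath>
#include <cstdint>

// Assumed mapping: 2^floor(i) threads plus a linear share of the next doubling.
// Only meaningful for 0 <= intensity < 32, so the result fits in 32 bits.
uint32_t threads_for_intensity(double intensity) {
    const int whole = static_cast<int>(std::floor(intensity));
    const double frac = intensity - whole;
    const uint64_t base = 1ull << whole;
    return static_cast<uint32_t>(base + static_cast<uint64_t>(frac * base));
}
```

Under this assumption -i 31 means 2^31 threads per launch, and -i 31.9 comes close to the 32-bit limit, which is why small fractional steps can still change the hashrate.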

1. The difference between a Linux build and a Windows build can be as much as 5%.
2. An optimization on the gtx 970 can be slower than on the 750ti.
3. I compare my builds to the cryptomining blog builds, and sometimes I build myself. (cuda 7.5 32 bit)
member
Activity: 61
Merit: 10
So Sp_
How fast is your DCR miner for the 970 now?
He reached my throughput with his SP_MOD#8
Run the 1.5% fee miner and you'll get a good approximation on your machines
https://bitcointalksearch.org/topic/m.14476625

I released it yesterday because, up to yesterday, I thought SP's results were legit and he was indeed 25-30% faster than tpruvot's git, while he is only 15-20% faster,
and after the latest commits by pallas and me, he is 6% faster than the publicly available code.
Probably he used the default intensities there too, in order to increase his sales and not the actual throughput Cry
member
Activity: 61
Merit: 10
xvc@1466 -i 30 #5
970 ~3170
980 ~3920

xvc@1466 -i 30 #6
970 ~3260
980 ~4030

All on windows 7 x64. Results with p0.
Here is a screenshot of my latest vcash code (Compiled less than an hour ago on CUDA7.5 CP5.2,5.0,3.5,3.0) which gave me a +0.7% boost



It reports an average of 3.303Gh/s on an average of 1453.3MHZ on my GTX970

And here are the latest binaries:
http://s000.tinyupload.com/index.php?file_id=07038368028057919944 from the above screenshot

I'd be glad if testers could validate my results

Also, for historical purposes and to make things clear, i'd be glad if testers could grab the ccminer for vcash compiled on 24th of March for CUDA7.5 CP5.2, 5.0, 3.5 from here:
https://v.cash/forum/threads/ccminer-faster-8-round-blake-algo-1-16x.282/page-2
(It's the last post)
And report their results on this one too

Edit2:
Testers should find their own intensities (i use 31 for 970 on vcash)
@antantti
Can you validate my kernel's hashrate on your machine too?
Some basic math makes me expect that you should see (1466/1453.3)*3.303 ~= 3.33Gh/s  (That's not the case for memory-bound algos.)

which is almost 70MH/s more than SP_MOD#6, or 2.1% faster

Another thing is that I saw everybody using my first build (13th of February, for cuda 6.5 - CP 3.5) to compare against the private kernels.
I asked a friend / forum member to run my latest build in comparison to cuda6.5 cp3.5, and he found it 1.48% faster on his 2 x 980tis (running at 1466MHz).
His hashrate actually went from 10.7-10.8Gh/s to 10.91Gh/s.
legendary
Activity: 882
Merit: 1000
So Sp_
How fast is your DCR miner for the 970 now?
legendary
Activity: 1176
Merit: 1015
Pools can be picky sometimes, quick tests showed me that xvc gives reliable results on some major pools.

xvc@1466 -i 30 #5

970 ~3170
980 ~3920

xvc@1466 -i 30 #6

970 ~3260
980 ~4030

All on windows 7 x64.

-edit-

Results with p0.

member
Activity: 61
Merit: 10
@antantti
i'm on suprnova in order to check the reported hashrate through the graphs. Been using xCore's pool too (https://pool.v.cash/)
There should be no problem choosing a pool
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
You can use this pool

ccminer -a vanilla -o stratum+tcp://pool.v.cash:3001 -u  VafRjT9uiVUasRPVdQi4KiKJJYW81BuDLQ -p d=10
legendary
Activity: 1176
Merit: 1015
About xvc, your latest (cuda 7.5 32bit exe) on windows 7 x64,
clocks ~1466:
970 ~3160
980 ~3890

hmm. slower than version #3. Can you try #6?

It is faster, numbers soon.

-edit-

What pool are you guys using?

member
Activity: 61
Merit: 10
Can you disable L1 cache with -dlcm=cg while compiling? I dont know if it's already done since i dont use windows and the compilation parameters on my git (for windows) are broken. Thanks in advance

Interesting. I will try this.
Always here to help SP Smiley
I didnt come with a bad attitude, you brought me here with a bad attitude by promising inexistent speeds.
Also, you can try -abi=no , though it's deprecated, it performs slightly better here (edit: by slightly i mean less than 0.02%)
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
Can you disable L1 cache with -dlcm=cg while compiling? I dont know if it's already done since i dont use windows and the compilation parameters on my git (for windows) are broken. Thanks in advance

Interesting. I will try this.
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
About xvc, your latest (cuda 7.5 32bit exe) on windows 7 x64,
clocks ~1466:
970 ~3160
980 ~3890

hmm. slower than version #3. Can you try #6?
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
Vanilla (V-cash) sp-mod #6 sent to the donators. Please test and report your numbers compared to the alexis kernel.

Compute 5.0 has a default intensity of 29.
Compute 5.2 has a default intensity of 30.

I get faster hashrates on all my cards. gtx 750,750ti,950,960,970,980,980ti

Compiled with the 364.51 driver. Cuda 7.5 32 bit.
member
Activity: 61
Merit: 10
Here is a screenshot of my latest vcash code (Compiled less than an hour ago on CUDA7.5 CP5.2,5.0,3.5,3.0) which gave me a +0.7% boost



It reports an average of 3.303Gh/s on an average of 1453.3MHZ on my GTX970

And here are the latest binaries:
http://s000.tinyupload.com/index.php?file_id=07038368028057919944 from the above screenshot

I'd be glad if testers could validate my results

Also, for historical purposes and to make things clear, i'd be glad if testers could grab the ccminer for vcash compiled on 24th of March for CUDA7.5 CP5.2, 5.0, 3.5 from here:
https://v.cash/forum/threads/ccminer-faster-8-round-blake-algo-1-16x.282/page-2
(It's the last post)
And report their results on this one too

Edit:

SP, I just read your post above about intensity. So, what exactly did you sell to all these people?
A better intensity option for an already open sourced code?

Edit2:
Testers should find their own intensities (i use 31 for 970 on vcash)
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
That's not a 6.8% increase between SP_MOD#3 and SP_MOD#5 and certainly not a 30% faster than Provos Alexis kernal, nor 15% faster than Provos Alexis kernal, neither 8% faster than Provos Alexis kernal
I'm waiting for more results from his other GPUS

Your miner is using default intensity of 24 on windows. If you compile and run it without the -i parameter my version is 30% faster on the same clocks..  Your fastest version is compiled with cuda 6.5. I haven't tested my version on anything else than cuda 7.5 32bit. I will release another faster version soon.
legendary
Activity: 1176
Merit: 1015
Decred #8 and vanilla #5 windows exe (cuda 7.5 32bit exe) sent to the 0.2BTC donators.

Vanilla kernel source code for sale for 0.4 BTC. (so you can run it on linux and learn from the 3lit3)

Quick test run on dcr:

+1.7% on 970
+1.1% on 980

About xvc, your latest (cuda 7.5 32bit exe) on windows 7 x64,

clocks ~1466:

970 ~3160
980 ~3890

clocks ~1521:

970 ~3280
980 ~4040

Alexis 6.5 + 3.5 version compiled by CMB a month ago is actually faster.


member
Activity: 61
Merit: 10
Hi again SP_

I just got a message from a user of your SP_MODs.
He said that, for VCASH, none of your versions are as fast as CUDA6.5/CP3.5 (I don't know yet if he ran my latest sources for cuda7.5 cp5.2), and that your SP_MOD#5 just started reaching my numbers. Since I see that you are unwilling to share a proper screenshot, I asked him to make a comparison screenshot on the same PC, using the same gpu, under the same clocks.

Edit: U see why i said there is something odd here?

Edit2:
First reply from the user:
"I did a quick test on the 750ti - you guys are even. 1100mh/s on 750ti. Vcash"

That's not a 6.8% increase between SP_MOD#3 and SP_MOD#5 and certainly not a 30% faster than Provos Alexis kernal, nor 15% faster than Provos Alexis kernal, neither 8% faster than Provos Alexis kernal
I'm waiting for more results from his other GPUS
legendary
Activity: 3164
Merit: 1003
SP, if I buy the Decred+Vanilla miner, I want to get only the Decred #8 + Vanilla #5 miner. I don't want to pay 0.2 for algorithms I don't need! I think the majority of donators who bought a miner for a specific purpose would understand and agree. Your tactics look like luring BTC from the buyers.

I want to collect all my kernals into one miner and increase the price.. Less admin. Easier for the miner.
And then I will quit. Let the young and promising do the pascal mod..

Happy hashing!
Sp  are you still working on that?  All kernels into one miner?   thx
Ps don't quit.  Undecided
Or at least get us started on pascal.
member
Activity: 61
Merit: 10
When I run your latest kernel@github on windows I get 2.7 GHASH on the same clocks where I get 3.52 GHASH with my kernel, when running without the -i parameter.

You should remove the  !is_windows()

in the line:

int intensity = (device_sm[dev_id] > 500 && !is_windows()) ? 30 : 24;

From 2.7->3.52 is a 30% increase.

Windows cuda 7.5 32bit build compute 5.2

The intensity 24 was discussed with Epsylon3, and we thought it's better to start the miner with low intensity in case a user is trying to mine on his primary gpu, since we don't want them to freeze their systems by accident.
Edit: Now that you mentioned that, i should probably decrease the intensity on linux too
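
For reference, a minimal sketch of the default-intensity choice being discussed. The ternary expression comes from the line quoted above. The function wrapper, the parameter names, and the convention that 0 means "no -i given" are assumptions made for the example.

```cpp
// Picks the intensity for one device.
// user_intensity == 0 is taken to mean "no -i parameter given" (an assumption).
int pick_intensity(int device_sm, bool on_windows, int user_intensity) {
    if (user_intensity > 0) {
        return user_intensity;  // an explicit -i always wins
    }
    // Conservative default of 24 on Windows or older devices,
    // so a primary (display) GPU does not freeze the desktop.
    return (device_sm > 500 && !on_windows) ? 30 : 24;
}
```

A comparison run without -i on Windows therefore measures intensity 24 against whatever default the other miner uses, which is why both miners should be run with the same explicit -i value.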


Anyway, can you run it with the same intensity you do on your kernel and post a screenshot? And since you are compiling on windows,
Can you disable L1 cache with -dlcm=cg while compiling? I dont know if it's already done since i dont use windows and the compilation parameters on my git (for windows) are broken. Thanks in advance

Edit2: On the 2nd of April you stated that you get 3.3Gh/s at 1466MHz on a 970 with SP_MOD#3, which was (3.3/3.292)-1 = 0.0024 or about 0.24% faster than the version publicly available since the 20th of March.
Is this 3.525Gh/s performance measured on the same clocks? Because if it is, you claim a (3.525/3.3)-1 = 0.068 or 6.8% increase between SP_MOD#3 (the publicly available version) and SP_MOD#5.
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
When I run your latest kernel@github on windows I get 2.7 GHASH on the same clocks where I get 3.52 GHASH with my kernel, when running without the -i parameter.

You should remove the  !is_windows()

in the line:

int intensity = (device_sm[dev_id] > 500 && !is_windows()) ? 30 : 24;

From 2.7->3.52 is a 30% increase.

Windows cuda 7.5 32bit build compute 5.2