Topic: CCminer(SP-MOD) Modded NVIDIA Maxwell / Pascal kernels. - page 26. (Read 2347588 times)

full member
Activity: 728
Merit: 106
From what I've seen of SP's updates, their miners (especially enemy) are updated about a day later with the changes SP makes. That's why open-source software doesn't go anywhere.
So Enemy adopts sp_'s "improvements" and releases a miner that is faster than sp_'s slow miner with all those "improvements"?
Nice logic.

For anyone with normal logic it is evident that sp_ can't compete with the z-enemy/t-rex miners and that he makes his "improvements" based on what z/t have already implemented. That's why this miner is not sold like the others: no one will buy a slow miner. That's why in this case sp_ decided to be a "knight in white clothes" and release a slow but open-source miner.

P.S. Anyway, it is good work from a certain point of view.
member
Activity: 392
Merit: 27
http://radio.r41.ru
with changes SP makes
If the others were taking SP_'s work and using it, the coin income from SP_'s miner would be higher. At the moment, SP_'s miner shows good numbers in the console, but its coin income loses to everyone else's.
legendary
Activity: 1764
Merit: 1024
If coin devs give two shits about the actual mining community (i.e. GPU miners), they really need to be more proactive about protecting their coins against ASICs and especially FPGAs: changing things up on a regular schedule, including forking to a new algo at the first signs of ASICs. There are more than enough GPU developers to keep up with evolving algos, especially after this year. More advanced ASICs are functioning more and more like GPUs (and FPGAs) to fill in the gap with evolving algos as well. Keep it fresh if you want to keep your community around. Not everyone is a BTC or ETH.


X16r/X16s/C11/x17 Spmod-git #11 has been released

- Added support for X17 (+10-15% faster than the ccminer alexis 1.0 open-source fork) (24-25 MH/s on the 1080 Ti)
- Faster x16r/x16s, +3-5% over sp-mod #10 (optimizations added to Fugue and Whirlpool)

This miner is free, open source, and has no fee.

CUDA 9.2 builds:

https://github.com/sp-hash/suprminer/releases

Source code:

https://github.com/sp-hash/suprminer/commits/master

Your free product will not let t-rex, Cryptodredge and Enemy sleep. We will definitely see their new versions in a few days.

From what I've seen of SP's updates, their miners (especially enemy) are updated about a day later with the changes SP makes. That's why open-source software doesn't go anywhere.
sp_
legendary
Activity: 2926
Merit: 1087
Team Black developer
"The maximum shared memory per block remains limited at 48KB as with prior architectures".

So you need 2 blocks.
sp_
legendary
Activity: 2926
Merit: 1087
Team Black developer
Thanks - your contributions have made it easier for even a relative novice to make further modifications to the code and come out (slightly) ahead of enemy and trex, even before their fees.

Convert SIMD to use shared memory instead of a big memory buffer (d_temp4[thr_id]) and you get a good speedup. This has already been done in the other private miners. I might open-source it later on.

Here is an excerpt from the enemy 1.20 SIMD implementation, disassembled and converted to PTX.

enemy 1.20 is using 16 KB of shared memory.

Code:
..
st.shared.u32 [%r1671], %r1666;
shfl.sync.up.b32 %r1672|%p370, %r162, %r304, %r1653, %r302;
selp.b32 %r1673, %r144, %r1672, %p92;
shfl.sync.up.b32 %r1674|%p371, %r154, %r304, %r1653, %r302;
selp.b32 %r1675, %r136, %r1674, %p92;
mul.lo.s32 %r1676, %r1673, 185;
mul.lo.s32 %r1677, %r1675, 185;
prmt.b32 %r1678, %r1676, %r1677, %r1661;
shfl.sync.idx.b32 %r1679|%p372, %r1678, %r1665, %r301, %r302;
st.shared.u32 [%r1671+4], %r1679;
shfl.sync.up.b32 %r1680|%p373, %r126, %r304, %r1653, %r302;
selp.b32 %r1681, %r108, %r1680, %p92;
shfl.sync.up.b32 %r1682|%p374, %r118, %r304, %r1653, %r302;
selp.b32 %r1683, %r100, %r1682, %p92;
mul.lo.s32 %r1684, %r1681, 185;
mul.lo.s32 %r1685, %r1683, 185;
prmt.b32 %r1686, %r1684, %r1685, %r1661;
shfl.sync.idx.b32 %r1687|%p375, %r1686, %r1665, %r301, %r302;
st.shared.u32 [%r1671+8], %r1687;
shfl.sync.up.b32 %r1688|%p376, %r1613, %r304, %r1653, %r302;
selp.b32 %r1689, %r180, %r1688, %p92;
shfl.sync.up.b32 %r1690|%p377, %r190, %r304, %r1653, %r302;
selp.b32 %r1691, %r172, %r1690, %p92;
mul.lo.s32 %r1692, %r1689, 185;
mul.lo.s32 %r1693, %r1691, 185;
prmt.b32 %r1694, %r1692, %r1693, %r1661;
shfl.sync.idx.b32 %r1695|%p378, %r1694, %r1665, %r301, %r302;
st.shared.u32 [%r1671+12], %r1695;
shfl.sync.up.b32 %r1696|%p379, %r91, %r304, %r1653, %r302;
selp.b32 %r1697, %r73, %r1696, %p92;
shfl.sync.up.b32 %r1698|%p380, %r3433, %r304, %r1653, %r302;
selp.b32 %r1699, %r3432, %r1698, %p92;
mul.lo.s32 %r1700, %r1697, 185;
mul.lo.s32 %r1701, %r1699, 185;
prmt.b32 %r1702, %r1700, %r1701, %r1661;
ld.const.u8 %r1703, [%r1664+8];
shfl.sync.idx.b32 %r1704|%p381, %r1702, %r1703, %r301, %r302;
st.shared.u32 [%r1671+128], %r1704;
...
legendary
Activity: 2716
Merit: 1094
Black Belt Developer
Streebog can only fit 2 (3) copies of the tables in shared mem, unless you can compute them on the fly. Did you manage to apply the "trick" to it?

The open-source Streebog implementation uses 8 KB of shared memory, and the Pascal chip has 96 KB of shared memory, but there are some limitations. I see that cryptodredge 0.9 is using 48 KB of shared memory. You can use more shared memory if you disable the L1 cache.


256 x 8 x 8 = 16 KB (256 entries of 8 bytes in each of 8 tables).
You can use up to 48 KB of shared mem, so 2 or 3 copies.
As for Pascal, "The maximum shared memory per block remains limited at 48KB as with prior architectures".
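
For anyone who wants to check the arithmetic above, here is a small host-side C++ sketch. It only restates the figures quoted in this thread (16 KB per table set, 48 KB per block, 96 KB per Pascal SM); the constant and function names are made up for illustration, and nothing here queries a real device.

```cpp
#include <cstddef>

// Figures quoted in the thread; treated as assumptions, not measured values.
constexpr std::size_t kEntriesPerTable = 256;        // one entry per byte value
constexpr std::size_t kBytesPerEntry   = 8;          // 64-bit table words
constexpr std::size_t kTables          = 8;          // Streebog lookup tables
constexpr std::size_t kBlockLimit      = 48 * 1024;  // shared memory limit per block
constexpr std::size_t kSmShared        = 96 * 1024;  // per-SM figure quoted for Pascal

// Bytes needed for one full set of lookup tables.
constexpr std::size_t table_set_bytes() {
    return kEntriesPerTable * kBytesPerEntry * kTables;
}

// Upper bound on table copies inside one block (leaves no room for anything else).
constexpr std::size_t copies_per_block() {
    return kBlockLimit / table_set_bytes();
}

// Blocks per SM needed before the whole 96 KB can be in use.
constexpr std::size_t blocks_to_fill_sm() {
    return kSmShared / kBlockLimit;
}

static_assert(table_set_bytes() == 16 * 1024, "one table set is 16 KB");
```

Three copies is the hard ceiling per block; if the kernel needs any other shared memory, that leaves two, which matches the "2 (3)" in the posts.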
sp_
legendary
Activity: 2926
Merit: 1087
Team Black developer
Streebog can only fit 2 (3) copies of the tables in shared mem, unless you can compute them on the fly. Did you manage to apply the "trick" to it?

The open-source Streebog implementation uses 8 KB of shared memory, and the Pascal chip has 96 KB of shared memory, but there are some limitations. I see that cryptodredge 0.9 is using 48 KB of shared memory. You can use more shared memory if you disable the L1 cache.
legendary
Activity: 2716
Merit: 1094
Black Belt Developer
Pushed a 20% faster shavite_final implementation to github. Another #include "cuda_x11_aes_sp.cuh" modification...

The sp-modded optimized AES has been applied to the following hashing functions:

echo (done)
fugue (done)
shavite (only the final function / if shavite is the last algo in the chain)
whirlpool (not done)
streebog (not done)



Streebog can only fit 2 (3) copies of the tables in shared mem, unless you can compute them on the fly. Did you manage to apply the "trick" to it?
sp_
legendary
Activity: 2926
Merit: 1087
Team Black developer
Pushed a 20% faster shavite_final implementation to github. Another #include "cuda_x11_aes_sp.cuh" modification...

The sp-modded optimized AES has been applied to the following hashing functions:

echo (done)
fugue (done)
shavite (only the final function / if shavite is the last algo in the chain)
whirlpool (not done)
streebog (not done)

jr. member
Activity: 213
Merit: 3
z-enemy wins by 10-12% (net income of the z-enemy miner)

My open-source miner is improving and is now running at the same speed as the enemy miner did a few weeks ago.
Every algo has a limit, and the competition can't add 10-15% a week forever.

Thanks - your contributions have made it easier for even a relative novice to make further modifications to the code and come out (slightly) ahead of enemy and trex, even before their fees.

I've bought a couple mods from you - and been rejected when I asked for a discount on multiple, oh well, but I do have to say that your contributions have been very valuable, and make me more willing to pay full price in the future if it makes sense.

Are you saying you've taken SP's x16r free miner and have modified it to make it faster than Z-enemy's miner? As for his contribution, I agree, he has contributed a lot towards the development of good miners, regardless of whether I agree with his fee schedule or not.
full member
Activity: 209
Merit: 100
z-enemy wins by 10-12% (net income of the z-enemy miner)

My open-source miner is improving and is now running at the same speed as the enemy miner did a few weeks ago.
Every algo has a limit, and the competition can't add 10-15% a week forever.

Thanks - your contributions have made it easier for even a relative novice to make further modifications to the code and come out (slightly) ahead of enemy and trex, even before their fees.

I've bought a couple mods from you - and been rejected when I asked for a discount on multiple, oh well, but I do have to say that your contributions have been very valuable, and make me more willing to pay full price in the future if it makes sense.
sp_
legendary
Activity: 2926
Merit: 1087
Team Black developer
z-enemy wins by 10-12% (net income of the z-enemy miner)

My open-source miner is improving and is now running at the same speed as the enemy miner did a few weeks ago.
Every algo has a limit, and the competition can't add 10-15% a week forever.
legendary
Activity: 3248
Merit: 1070
any speed test for the 2080?
member
Activity: 392
Merit: 27
http://radio.r41.ru
Looking at the timing, SP_ is late with the release of the new miner: z-enemy was first, T-Rex second.

My miner is not new, and it is by far the fastest open-source miner for the following algos: C11/X16R/X16S/X17.
If you compare income in coins, z-enemy wins by 10-12% (net income of the z-enemy miner).
If you look at the pools, almost no one uses your miner.
sp_
legendary
Activity: 2926
Merit: 1087
Team Black developer
Looking at the timing, SP_ is late with the release of the new miner: z-enemy was first, T-Rex second.

My miner is not new, and it is by far the fastest open-source miner for the following algos: C11/X16R/X16S/X17.
member
Activity: 392
Merit: 27
http://radio.r41.ru
X16r/X16s/C11/x17 Spmod-git #11 has been released

- Added support for X17 (+10-15% faster than the ccminer alexis 1.0 open-source fork) (24-25 MH/s on the 1080 Ti)
- Faster x16r/x16s, +3-5% over sp-mod #10 (optimizations added to Fugue and Whirlpool)

This miner is free, open source, and has no fee.

CUDA 9.2 builds:

https://github.com/sp-hash/suprminer/releases

Source code:

https://github.com/sp-hash/suprminer/commits/master

Your free product will not let t-rex, Cryptodredge and Enemy sleep. We will definitely see their new versions in a few days.
Looking at the timing, SP_ is late with the release of the new miner: z-enemy was first, T-Rex second.
sr. member
Activity: 954
Merit: 250
X16r/X16s/C11/x17 Spmod-git #11 has been released

- Added support for X17 (+10-15% faster than the ccminer alexis 1.0 open-source fork) (24-25 MH/s on the 1080 Ti)
- Faster x16r/x16s, +3-5% over sp-mod #10 (optimizations added to Fugue and Whirlpool)

This miner is free, open source, and has no fee.

CUDA 9.2 builds:

https://github.com/sp-hash/suprminer/releases

Source code:

https://github.com/sp-hash/suprminer/commits/master

Your free product will not let t-rex, Cryptodredge and Enemy sleep. We will definitely see their new versions in a few days.
sp_
legendary
Activity: 2926
Merit: 1087
Team Black developer
X16r/X16s/C11/x17 Spmod-git #11 has been released

- Added support for X17 (+10-15% faster than the ccminer alexis 1.0 open-source fork) (24-25 MH/s on the 1080 Ti)
- Faster x16r/x16s, +3-5% over sp-mod #10 (optimizations added to Fugue and Whirlpool)

This miner is free, open source, and has no fee.

CUDA 9.2 builds:

https://github.com/sp-hash/suprminer/releases

Source code:

https://github.com/sp-hash/suprminer/commits/master
sp_
legendary
Activity: 2926
Merit: 1087
Team Black developer
Submitted a 40% faster whirlpool80 to GitHub / +6% whirlpool.
sp_
legendary
Activity: 2926
Merit: 1087
Team Black developer
Submitted a 10% faster fugue implementation to GitHub. It uses the same tricks as my open-source echo implementation, with 8 copies of the mixtab table in shared memory to avoid bank conflicts.
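
To see why extra table copies can help, here is a rough host-side C++ model of one warp reading a 256-word table. It is only a sketch under stated assumptions: 32 banks of 4-byte words, a 32-lane warp, and a hypothetical interleaved layout where lane L reads replica L % 8 at word idx * 8 + (L % 8). The real layout in the fugue/echo kernels may differ, and all names below are invented for illustration.

```cpp
#include <cstdint>

constexpr int kWarpSize = 32;  // lanes per warp (assumed)
constexpr int kBanks    = 32;  // 4-byte shared memory banks (assumed)
constexpr int kCopies   = 8;   // table replicas, as in the post

struct Indices { std::uint32_t v[kWarpSize]; };

// Table index per lane: lane * stride, wrapped into a 256-entry table.
constexpr Indices strided_indices(std::uint32_t stride) {
    Indices r{};
    for (int lane = 0; lane < kWarpSize; ++lane)
        r.v[lane] = (static_cast<std::uint32_t>(lane) * stride) % 256u;
    return r;
}

// Word address with a single table: every lane reads word idx.
constexpr std::uint32_t single_copy_addr(int, std::uint32_t idx) { return idx; }

// Hypothetical interleaved replicas: lane picks replica (lane % kCopies).
constexpr std::uint32_t replicated_addr(int lane, std::uint32_t idx) {
    return idx * kCopies + static_cast<std::uint32_t>(lane % kCopies);
}

// Largest number of distinct words that fall into one bank for one warp-wide load.
// Lanes reading the same word count once (served by a broadcast).
template <typename AddrFn>
constexpr int conflict_degree(AddrFn addr, const Indices& idx) {
    int worst = 0;
    for (int bank = 0; bank < kBanks; ++bank) {
        std::uint32_t seen[kWarpSize] = {};
        int distinct = 0;
        for (int lane = 0; lane < kWarpSize; ++lane) {
            const std::uint32_t a = addr(lane, idx.v[lane]);
            if (a % kBanks != static_cast<std::uint32_t>(bank)) continue;
            bool dup = false;
            for (int k = 0; k < distinct; ++k)
                if (seen[k] == a) dup = true;
            if (!dup) seen[distinct++] = a;
        }
        if (distinct > worst) worst = distinct;
    }
    return worst;
}
```

With `strided_indices(32)` every lane of the single table lands in bank 0, so the model reports an 8-way conflict, while the replicated layout spreads the same reads over 8 banks and reports no conflict. In this model replication does not remove conflicts for every access pattern: a bank can still be hit by the 4 lanes that share a replica, so the worst case drops from 8-way to 4-way.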