
Topic: [ANN] sgminer v5 - optimized X11/X13/NeoScrypt/Lyra2RE/etc. kernel-switch miner - page 77. (Read 877903 times)

legendary
Activity: 1400
Merit: 1050
New addition to sgminer (github/djm34/sgminer): the yescrypt algo
(binaries here: http://ge.tt/5AAmOfF2/v/0?c)

There are 2 implementations of the algo:
--kernel yescrypt for AMD
--kernel yescrypt-multi for newer NVIDIA cards (compute 5.2, i.e. the 900 series; for other cards ccminer should be better)

See example.bat for an example of how to use it.
member
Activity: 74
Merit: 10
3.8 MH/s with my R9 280 (without the X). How much do you get? Which hardware, driver, and settings?
hero member
Activity: 968
Merit: 624
Still a manic miner
Thanks for your answers, but I still can't get decent speeds :(
member
Activity: 74
Merit: 10
There is a better quark kernel, but the person who wrote it doesn't make it public. Look at the last few pages here and you'll see it.
member
Activity: 98
Merit: 10
Could somebody plz share the config settings for quarkcoin with 7850/7950 cards?
thanks a lot!

I just use the same settings as x11 etc, but I change gpu-threads to 1, and worksize 256.

I wish there was a better quark kernel, it's still wayyyy slower than nvidia cards.
hero member
Activity: 968
Merit: 624
Still a manic miner
Could somebody plz share the config settings for quarkcoin with 7850/7950 cards?
thanks a lot!
member
Activity: 93
Merit: 11
It would be much appreciated if someone can confirm as to which algorithms/miners are supported by my Compute Capability 1.1 GPUs such as 9800GTX+/GTS250.  I've tried mining x11 w/ Cudaminer& ccminer on 340.52, the latest driver w/ support for my GPUs but unfortunately, GPU utilization is ~ 1%, making Scrypt my only viable yet unprofitable option. 

Thanks to anyone in advance for offering their help!

sgminer is for AMD GPU architectures...

I am well aware that sgminer is optimized for AMD GPUs but given that Cudaminer& ccminer work only w/ Scrypt which is currently dominated by ASICs, the difficulty is now too high for these GPUs to submit any shares even when connected to low difficulty servers.  Despite the performance hit which OpenCL will introduce compared to CUDA, it seems to remain my best option unless someone can point me in the right direction.

It would be much appreciated if someone can confirm as to which algorithms/miners are supported by my Compute Capability 1.1 GPUs such as 9800GTX+/GTS250.  I've tried mining x11 w/ Cudaminer& ccminer on 340.52, the latest driver w/ support for my GPUs but unfortunately, GPU utilization is ~ 1%, making Scrypt my only viable yet unprofitable option. 

Thanks to anyone in advance for offering their help!

You're looking for CudaMiner.  You can find the thread here...

https://bitcointalksearch.org/topic/ann-cudaminer-ccminer-cuda-based-mining-applications-windowslinuxmacosx-167229


Thanks for your input!
legendary
Activity: 952
Merit: 1002
It would be much appreciated if someone can confirm as to which algorithms/miners are supported by my Compute Capability 1.1 GPUs such as 9800GTX+/GTS250.  I've tried mining x11 w/ Cudaminer& ccminer on 340.52, the latest driver w/ support for my GPUs but unfortunately, GPU utilization is ~ 1%, making Scrypt my only viable yet unprofitable option. 

Thanks to anyone in advance for offering their help!

You're looking for CudaMiner.  You can find the thread here...

https://bitcointalksearch.org/topic/ann-cudaminer-ccminer-cuda-based-mining-applications-windowslinuxmacosx-167229
sr. member
Activity: 547
Merit: 250
It would be much appreciated if someone can confirm as to which algorithms/miners are supported by my Compute Capability 1.1 GPUs such as 9800GTX+/GTS250.  I've tried mining x11 w/ Cudaminer& ccminer on 340.52, the latest driver w/ support for my GPUs but unfortunately, GPU utilization is ~ 1%, making Scrypt my only viable yet unprofitable option. 

Thanks to anyone in advance for offering their help!

sgminer is for AMD GPU architectures...
member
Activity: 93
Merit: 11
It would be much appreciated if someone can confirm as to which algorithms/miners are supported by my Compute Capability 1.1 GPUs such as 9800GTX+/GTS250.  I've tried mining x11 w/ Cudaminer& ccminer on 340.52, the latest driver w/ support for my GPUs but unfortunately, GPU utilization is ~ 1%, making Scrypt my only viable yet unprofitable option. 

Thanks to anyone in advance for offering their help!
hero member
Activity: 672
Merit: 500
I agree. However, that's what most people use.
hero member
Activity: 672
Merit: 500
Very interesting experiment... curious to see CubeHash to be so slow.
However, I would suggest iterating over at least 1024 calls to rule out possible I$ (instruction cache) effects.
Also, clock() does not do what people really want (though it might be equivalent in this context); the main problem is that its granularity is implementation-dependent. I'm surprised it can be used to benchmark instructions like these nowadays.

In any case, don't use it. Use the performance counter or even better C++11 std::chrono.
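
A minimal sketch of the std::chrono approach suggested above, assuming a C++11 compiler. `time_calls` is a hypothetical helper and `toy_mix` is a stand-in workload, not a real hash function; neither comes from sgminer or sphlib.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: call fn() `iterations` times and return the elapsed
// wall-clock time in seconds, measured with a monotonic clock.
template <typename Fn>
double time_calls(Fn fn, int iterations) {
    assert(iterations > 0);
    using steady = std::chrono::steady_clock;
    const auto start = steady::now();
    for (int i = 0; i < iterations; ++i) {
        fn();
    }
    const std::chrono::duration<double> elapsed = steady::now() - start;
    return elapsed.count();
}

// Stand-in workload (NOT a real hash). The volatile sink keeps the compiler
// from optimizing the loop body away.
static volatile std::uint64_t g_sink = 0;

void toy_mix() {
    std::uint64_t x = g_sink + 0x9E3779B97F4A7C15ULL;
    x ^= x >> 30;
    x *= 0xBF58476D1CE4E5B9ULL;
    x ^= x >> 27;
    g_sink = x;
}

// Example use: a short warm-up pass first, then the timed run.
double benchmark_toy_mix() {
    time_calls(toy_mix, 1024);
    return time_calls(toy_mix, 1 << 20);
}
```

To time a real function, one would pass a lambda that wraps its init/update/close calls in place of `toy_mix`.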

Quote
Groestl should be faster than SIMD on a CPU with AES-NI, I would think.
Are the AES-NI instructions applicable to the bigger Groestl round? I think it's most likely the compiler didn't emit correct code, it seems likely it wouldn't even try. It might be intrinsics or nothing.
newbie
Activity: 86
Merit: 0
OK, I did some measurements on CPU and GPU and here are the results below:

Detailed CPU results can be obtained from eBACS for multiple architectures, but I used my own quick code below on Core i7:

Code:
#include "stdafx.h"
#include <stdio.h>
#include <time.h>

#include "sha3/sph_blake.h"
#include "sha3/sph_bmw.h"
#include "sha3/sph_groestl.h"
#include "sha3/sph_jh.h"
#include "sha3/sph_keccak.h"
#include "sha3/sph_skein.h"
#include "sha3/sph_luffa.h"
#include "sha3/sph_cubehash.h"
#include "sha3/sph_shavite.h"
#include "sha3/sph_simd.h"
#include "sha3/sph_echo.h"
#include "sha3/sph_hamsi.h"
#include "sha3/sph_fugue.h"
#include "sha3/sph_shabal.h"
#include "sha3/sph_whirlpool.h"

sph_blake512_context     ctx_blake;
sph_bmw512_context       ctx_bmw;
sph_groestl512_context   ctx_groestl;
sph_jh512_context        ctx_jh;
sph_keccak512_context    ctx_keccak;
sph_skein512_context     ctx_skein;
sph_luffa512_context     ctx_luffa;
sph_cubehash512_context  ctx_cubehash;
sph_shavite512_context   ctx_shavite;
sph_simd512_context      ctx_simd;
sph_echo512_context      ctx_echo;
sph_hamsi512_context     ctx_hamsi;
sph_fugue512_context     ctx_fugue;
sph_shabal512_context    ctx_shabal;
sph_whirlpool_context    ctx_whirlpool;

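// Opens a timed loop: prints the name, fills a 64-byte input and starts 1,000,000 iterations.
#define TEST_PRE(na) printf("%s: ", na); \
unsigned char hash[256]; \
for(int i=0; i<64; i++) hash[i] = (unsigned char)i; \
clock_t t = clock(); \
for(int i=0; i<1000000; i++) \
{

// Closes the loop and prints the elapsed CPU time in seconds.
#define TEST_POST() } \
printf ("\t\t%g seconds\n", ((float)(clock() - t)) / CLOCKS_PER_SEC);

int main()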
{
{//blake512
TEST_PRE("blake512");
sph_blake512_init(&ctx_blake);
sph_blake512(&ctx_blake, hash, 64);
sph_blake512_close(&ctx_blake, hash);
TEST_POST();
}

{//bmw512
TEST_PRE("bmw512");
sph_bmw512_init(&ctx_bmw);
sph_bmw512(&ctx_bmw, hash, 64);
sph_bmw512_close(&ctx_bmw, hash);
TEST_POST();
}

{//groestl512
TEST_PRE("groestl512");
sph_groestl512_init(&ctx_groestl);
sph_groestl512(&ctx_groestl, hash, 64);
sph_groestl512_close(&ctx_groestl, hash);
TEST_POST();
}

{//skein512
TEST_PRE("skein512");
sph_skein512_init(&ctx_skein);
sph_skein512(&ctx_skein, hash, 64);
sph_skein512_close(&ctx_skein, hash);
TEST_POST();
}

{//jh512
TEST_PRE("jh512");
sph_jh512_init(&ctx_jh);
sph_jh512(&ctx_jh, hash, 64);
sph_jh512_close(&ctx_jh, hash);
TEST_POST();
}

{//keccak512
TEST_PRE("keccak512");
sph_keccak512_init(&ctx_keccak);
sph_keccak512(&ctx_keccak, hash, 64);
sph_keccak512_close(&ctx_keccak, hash);
TEST_POST();
}

{//luffa512
TEST_PRE("luffa512");
sph_luffa512_init(&ctx_luffa);
sph_luffa512(&ctx_luffa, hash, 64);
sph_luffa512_close(&ctx_luffa, hash);
TEST_POST();
}

{//cubehash512
TEST_PRE("cubehash512");
sph_cubehash512_init(&ctx_cubehash);
sph_cubehash512(&ctx_cubehash, hash, 64);
sph_cubehash512_close(&ctx_cubehash, hash);
TEST_POST();
}

{//shavite512
TEST_PRE("shavite512");
sph_shavite512_init(&ctx_shavite);
sph_shavite512(&ctx_shavite, hash, 64);
sph_shavite512_close(&ctx_shavite, hash);
TEST_POST();
}

{//simd512
TEST_PRE("simd512");
sph_simd512_init(&ctx_simd);
sph_simd512(&ctx_simd, hash, 64);
sph_simd512_close(&ctx_simd, hash);
TEST_POST();
}

{//echo512
TEST_PRE("echo512");
sph_echo512_init(&ctx_echo);
sph_echo512(&ctx_echo, hash, 64);
sph_echo512_close(&ctx_echo, hash);
TEST_POST();
}

{//hamsi512
TEST_PRE("hamsi512");
sph_hamsi512_init(&ctx_hamsi);
sph_hamsi512(&ctx_hamsi, hash, 64);
sph_hamsi512_close(&ctx_hamsi, hash);
TEST_POST();
}

{//fugue512
TEST_PRE("fugue512");
sph_fugue512_init(&ctx_fugue);
sph_fugue512(&ctx_fugue, hash, 64);
sph_fugue512_close(&ctx_fugue, hash);
TEST_POST();
}

{//shabal512
TEST_PRE("shabal512");
sph_shabal512_init(&ctx_shabal);
sph_shabal512(&ctx_shabal, hash, 64);
sph_shabal512_close(&ctx_shabal, hash);
TEST_POST();
}

{//whirlpool
TEST_PRE("whirlpool");
sph_whirlpool_init(&ctx_whirlpool);
sph_whirlpool(&ctx_whirlpool, hash, 64);
sph_whirlpool_close(&ctx_whirlpool, hash);
TEST_POST();
}
}

All results are scaled to be relative to the fastest hash function, which is shabal.

And here's the graph:
https://i.imgur.com/ySLE1xK.png


GPU results were obtained on an R9 280X with sgminer 5.1 and driver 13.X using AMD CodeXL, and scaled to be relative to the fastest function, shabal:
https://i.imgur.com/k6lif5s.png


Notice that CPU speed difference is 1X - 8X, while GPU speed difference is 1X - 45X.


PS: Note, you cannot compare values from CPU graph and GPU graph directly. Only algorithm's relative performance in respective graph matters.
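
The scaling used for both graphs can be sketched as follows, assuming each measurement is an elapsed time and the fastest function (the smallest time) maps to 1X. `relative_slowdown` is a hypothetical helper, not part of the benchmark code above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: divide every timing by the smallest one, so the
// fastest function becomes 1.0 and each other entry shows how many times
// slower it is.
std::vector<double> relative_slowdown(const std::vector<double>& seconds) {
    assert(!seconds.empty());
    const double fastest = *std::min_element(seconds.begin(), seconds.end());
    assert(fastest > 0.0);
    std::vector<double> scaled;
    scaled.reserve(seconds.size());
    for (double s : seconds) {
        scaled.push_back(s / fastest);
    }
    return scaled;
}
```

With placeholder timings of 2, 4 and 16 seconds (not the measured results), this yields 1X, 2X and 8X. Because each graph is normalized against its own fastest entry, the CPU and GPU values are not comparable with each other.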
newbie
Activity: 86
Merit: 0
Has anybody done per-algorithm* comparison of CPU and GPU megahashes? And/or the wattage on GPU.

Like for example: BMW 100MHS on R9 280X and 20MHS on Core i7 8 threads.

Which of the SPH algorithms are the best for CPU?

* By algorithm here I mean the primitive algos, like BMW, skein, luffa, etc. Not the combined ones like Quark, X11, X15, etc.

Best for CPU compared to GPU? None. You want a KDF, not a hash function.

No, not better on CPU. I'd say "less bad" on CPU than on GPU. Perhaps on some algorithms CPU is 100X worse, while on others it is 5X worse?

For example, CPUs have built in AES NI, so algorithms like Fugue should be not as lousy on CPU?
newbie
Activity: 86
Merit: 0
Has anybody done per-algorithm* comparison of CPU and GPU megahashes? And/or the wattage on GPU.

Like for example: BMW 100MHS on R9 280X and 20MHS on Core i7 8 threads.

Which of the SPH algorithms are the best for CPU?

* By algorithm here I mean the primitive algos, like BMW, skein, luffa, etc. Not the combined ones like Quark, X11, X15, etc.
sr. member
Activity: 439
Merit: 250


Why bother with Windows if you're a Linux guy? Get this: http://www.getpimp.org/

Wow, that makes life really easy.

Using my own pool, but I was going to set up a coinking profile since PiMP has that as a default listed pool. For some reason, though, I don't think the MSI 290X Lightnings are used much for mining. I don't see any pre-configured settings for them. Coinking.io and others never seem to have my card listed. Of course, I was thinking about just getting a handful of 280Xs or maybe even 7950s. I've seen some 280Xs pretty cheap and, tell me if I'm wrong, but they seem to be the easiest and most miner-friendly right now.
sr. member
Activity: 434
Merit: 250
Wolf0, when are you going to release it? How much is the fee?
hero member
Activity: 896
Merit: 1000
Not in SGMiner. When my own, fully custom-written one is in a usable state for end users, then it may be in there.

Are you going to release your miner publicly, or privately for a fee?
sr. member
Activity: 434
Merit: 250

Just ran a couple of quick tests, the first on SGMiner 5, right from Github. The source I used was a bit old, because I forgot to pull before I did the test, so for reference, I used commit f27f8dd544a107523435363e7b26bfc294719542, pushed on Tue Jan 13 11:00:57 2015 +0100. I highly doubt anything in Qubit has changed since then, however. Anyways, the rig I tested on, Freya, currently has a 270X, 280X, 290X, and 7950. The cards are OC'd - card types, clocks, and memory types are in the notepad.

Screenshot of stock SG5 on qubit (NSFW): https://ottrbutt.com/miner/qubitstock-03162015.png

I also spent maybe 2 - 3 hours or less on Qubit a while ago one afternoon. As such, the results are approaching barely acceptable in my opinion, but I did it just for the hell of it. Maybe I'll finish up with all the more obvious hashrate increases to be had at some point, then clean and polish the code, but not now. Nevertheless, here it is:

Screenshot of modified SG5 with completely rewritten qubit kernel (NSFW): https://ottrbutt.com/miner/qubitwolf-03162015.png


Wolf, do you have plans to release the optimized miner so that AMD cards can compete with NVIDIA cards?
legendary
Activity: 952
Merit: 1002
I'm getting 5.8 MH/s on qubit at 1100/1500 ;)

What config do you use for the 5.8?

I am getting 6.72mh/s on a 290x
also 6.90mh/s on 290

Nice config.  Try compiling with 14.9 drivers.  My R9-290 gets 7.3 @ 1150/1375.

Edit: Also a bit better with worksize 256.