
Topic: [ANN] sgminer v5 - optimized X11/X13/NeoScrypt/Lyra2RE/etc. kernel-switch miner - page 101. (Read 877859 times)

member
Activity: 81
Merit: 1002
It was only the wind.
Below are kernels and generated bins for the X algos:

http://www.filedropper.com/x111315kernels
http://www.filedropper.com/x111314bins

A complete solution, including the latest version of sgminer plus all of these kernels, bins, and a configuration file, is available here:
http://www.filedropper.com/sgminer-51-dev-2014-mod

If you'd like to compile it yourself, you can download the source from:
https://github.com/sgminer-dev

These kernels and bins are for Tahiti and Hawaii. I have no Cayman or Pitcairn cards to test on.
After the test if you're satisfied you can tip me here: 13FykK1WoEwXV1WvrjT1hDRi1E1gFaBT8y
And Wolf0, of course!


Hang on a minute, where did you get it?

Some old, shitty source was leaked with the bins.
legendary
Activity: 1092
Merit: 1004
MIXED DRIVER VERSIONS--

I have read various posts about putting the driver files "in the miner directory".  How does a person do this?

Here is my situation.  I have a uATX mother board with on-board graphics.  The video chip will run with AMD drivers 13.x, but no later.  The MB BIOS can be set to use additional video cards.  What I want to do is install Ubuntu on the rig, use AMD 13.x drivers, and run the monitor with the installed AMD 13.x drivers.  I then want to put AMD 14.6 drivers "in the mining directory", and compile and mine with the additional video cards.

How to do this?  Any help?       --scryptr

Talking to myself --

Sorry, don't mean to be a bother, but I have read several posts about installing one driver for the system, and then placing 14.x drivers in the mining directory for miner compilation.  How does one do this?  Is there a description or how-to on the web?  I have been googling my eyes out...       --scryptr

Pretty easy: you install the driver needed to create the bin file (13.12, as an example). Once the bin has been created, you shut sgminer down and uninstall the 13.12 driver Grin

Then you install whichever GPU driver is fastest for mining or gaming, depending on your priorities Roll Eyes

You don't need to do this when using Wolf0's leaked x11 mod, which hashes 50% faster, or the x13 mod, which is 50% faster than the official sgminer release; the bin files are already made to work with a modded kernel Wink
legendary
Activity: 1797
Merit: 1028
MIXED DRIVER VERSIONS--

I have read various posts about putting the driver files "in the miner directory".  How does a person do this?

Here is my situation.  I have a uATX mother board with on-board graphics.  The video chip will run with AMD drivers 13.x, but no later.  The MB BIOS can be set to use additional video cards.  What I want to do is install Ubuntu on the rig, use AMD 13.x drivers, and run the monitor with the installed AMD 13.x drivers.  I then want to put AMD 14.6 drivers "in the mining directory", and compile and mine with the additional video cards.

How to do this?  Any help?       --scryptr

Talking to myself --

Sorry, don't mean to be a bother, but I have read several posts about installing one driver for the system, and then placing 14.x drivers in the mining directory for miner compilation.  How does one do this?  Is there a description or how-to on the web?  I have been googling my eyes out...       --scryptr
member
Activity: 81
Merit: 1002
It was only the wind.
In more recent news, 290kh/s+ Neoscrypt on a 270X!  Grin

https://ottrbutt.com/miner/neoscryptwolf-12172014.png (nsfw)

No improvement yet for 280x? Smiley


No, I just tested on 270X first. By the way, I got the salsa permutation in main working, it sucks. The answer is, the index you need to look up is also permuted, it is not 48, but 60.

280X does... 475kh/s max, 290X hits 600+.

EDIT: Nope, passed 480 on 280X - but let's see how stable it is.

Oh, I had the index 60 in some earlier version, but it spurted HW errors.. Seems to work now. Smiley

Sucks that you need to readjust every unroll after each optimization. Smiley I'll see what I get.

Okay, got her stable: https://ottrbutt.com/miner/neoscryptwolf-12172014-2.png
legendary
Activity: 1092
Merit: 1004
Has anyone tested the new AMD APP SDK 3.0, and if so, what about GPU speeds? Thanks...

I don't think there is much point in testing them out, as the fastest drivers are  the 14.7 RC1 to RC3 Shocked

Those drivers were released last year and will be incompatible with AMD APP SDK 3.0.

Secondly, with 14.7 RC1 to RC3 it is better not to install the AMD APP SDK at all, because it makes no speed difference Tongue

legendary
Activity: 1316
Merit: 1021
2009 Alea iacta est
Has anyone tested the new AMD APP SDK 3.0, and if so, what about GPU speeds? Thanks...
member
Activity: 81
Merit: 1002
It was only the wind.
In more recent news, 290kh/s+ Neoscrypt on a 270X!  Grin

https://ottrbutt.com/miner/neoscryptwolf-12172014.png (nsfw)

No improvement yet for 280x? Smiley


No, I just tested on 270X first. By the way, I got the salsa permutation in main working, it sucks. The answer is, the index you need to look up is also permuted, it is not 48, but 60.

280X does... 475kh/s max, 290X hits 600+.

EDIT: Nope, passed 480 on 280X - but let's see how stable it is.
sr. member
Activity: 406
Merit: 250
.....This is worth 20KH/s on my 280X......from 343KHs to 363KH/s at 1020MHz clock
.....now somebody needs to find 20KH/s more for me....  Smiley

change the XORBytesInPlace call from
Code:
XORBytesInPlace(B + bufidx, input, BLAKE2S_OUT_SIZE);
to
Code:
     XORBytesInPlace(B + bufidx, input, bufidx);
and change the function itself to perform some byte alignment checking
Code:
//
// A bit of byte alignment checking goes a long way...
// Each case XORs the same 32 bytes of src into dst; only the access
// width changes, chosen from the buffer offset passed in as 'mod'.
//
void XORBytesInPlace(void *restrict dst, const void *restrict src, uint mod)
{
  switch(mod % 4)
  {
  case 0:
    // offset is a multiple of 4: 4 x uint2 (8 bytes each) = 32 bytes
    #pragma unroll 2
    for(int i = 0; i < 4; i+=2)
    {
      ((uint2 *)dst)[i]   ^= ((uint2 *)src)[i];
      ((uint2 *)dst)[i+1] ^= ((uint2 *)src)[i+1];
    }
    break;

  case 2:
    // offset is even but not a multiple of 4: 16 x uchar2 = 32 bytes
    #pragma unroll 8
    for(int i = 0; i < 16; i+=2)
    {
      ((uchar2 *)dst)[i]   ^= ((uchar2 *)src)[i];
      ((uchar2 *)dst)[i+1] ^= ((uchar2 *)src)[i+1];
    }
    break;

  default:
    // odd offset: fall back to single bytes, 32 x uchar = 32 bytes
    #pragma unroll 8
    for(int i = 0; i < 32; i+=4)
    {
      ((uchar *)dst)[i]   ^= ((uchar *)src)[i];
      ((uchar *)dst)[i+1] ^= ((uchar *)src)[i+1];
      ((uchar *)dst)[i+2] ^= ((uchar *)src)[i+2];
      ((uchar *)dst)[i+3] ^= ((uchar *)src)[i+3];
    }
  }
}


Later you said
Quote
Very interesting.   I get about 2% gain on 7950 and need to use (mod % 2) with the case statements adjusted accordingly.
My 280X gains almost 6% as is, but the gain difference between (mod % 2) and (mod % 4) is pretty small, like 1-2 KHs
When you used (mod %2, same as mod &1), what are the case statements inside XORBytesInPlace(void *restrict dst, const void *restrict src, uint mod)?
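
The (mod % 2) version is never shown in the thread, so the following is only a guess at what "case statements adjusted accordingly" could look like: a host-side sketch in plain C (not OpenCL) with two cases instead of three. Even offsets use 2-byte accesses, odd offsets fall back to single bytes. The names xor_bytes_ref, xor_bytes_mod2 and xor_variants_agree are hypothetical, and the 32-byte XOR_LEN is assumed from the byte counts the kernel loops above cover.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define XOR_LEN 32 /* assumed: same 32 bytes the kernel loops cover */

/* Reference: plain byte-wise XOR of XOR_LEN bytes of src into dst. */
static void xor_bytes_ref(uint8_t *dst, const uint8_t *src)
{
    for (size_t i = 0; i < XOR_LEN; i++)
        dst[i] ^= src[i];
}

/* Hypothetical two-case variant: dispatch on the low bit of the offset.
 * (mod & 1u) gives the same result as (mod % 2) for unsigned mod. */
static void xor_bytes_mod2(uint8_t *dst, const uint8_t *src, unsigned mod)
{
    switch (mod & 1u) {
    case 0:
        /* even offset: 16 two-byte words; memcpy keeps the access
         * well-defined in portable C regardless of alignment */
        for (size_t i = 0; i < XOR_LEN; i += 2) {
            uint16_t d, s;
            memcpy(&d, dst + i, sizeof d);
            memcpy(&s, src + i, sizeof s);
            d ^= s;
            memcpy(dst + i, &d, sizeof d);
        }
        break;
    default:
        /* odd offset: single bytes */
        for (size_t i = 0; i < XOR_LEN; i++)
            dst[i] ^= src[i];
        break;
    }
}

/* Returns 1 if both variants produce identical buffers for every
 * offset in [0, 64), and 0 otherwise. */
static int xor_variants_agree(void)
{
    uint8_t buf_a[96], buf_b[96], input[XOR_LEN];

    for (unsigned off = 0; off < 64; off++) {
        for (size_t i = 0; i < sizeof buf_a; i++)
            buf_a[i] = buf_b[i] = (uint8_t)(i * 7u + off);
        for (size_t i = 0; i < XOR_LEN; i++)
            input[i] = (uint8_t)(0xA5u ^ i);

        xor_bytes_ref(buf_a + off, input);
        xor_bytes_mod2(buf_b + off, input, off);

        if (memcmp(buf_a, buf_b, sizeof buf_a) != 0)
            return 0;
    }
    return 1;
}
```

Calling xor_variants_agree() from a small main is enough to confirm that the dispatch changes only how the bytes are touched, not the result. Whether the narrower or wider access is actually faster is a per-GPU question that only a test in the real kernel can answer.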
legendary
Activity: 1400
Merit: 1050
(number & (power_of_two - 1)) of course.
A bit more detail for non programmer?
Are you saying you want to program crypto algorithms in OpenCL without understanding "(number & (power_of_two - 1))"?
I kinda feel the same way, but I'll be a little gentler. BitmoreCoin, you may want to look up bitwise operators and things like that - an AND operation is far, far faster than modulus (modulus and division are ouch slow, as a general rule).

I think he means how to adjust the statements inside the case 0 and case 1.
hu ?
He means N % 2^n <=> N & (2^n - 1) for unsigned N, where the AND form gets computed faster if the compiler fails to make that substitution itself...
Similarly, N / 2^n <=> N >> n
and        N * 2^n <=> N << n
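
For the non-programmers asking for more detail, here is a minimal C sketch of those three identities, checked against the ordinary operators. The helper names mod_pow2, div_pow2, mul_pow2 and pow2_identities_hold are made up for illustration. Note that the equivalences are only guaranteed for unsigned values: in C, -7 % 4 is -3, while -7 & 3 is 1.

```c
#include <stdint.h>

/* x % 2^n, computed with a mask instead of a modulus (requires n < 32). */
static uint32_t mod_pow2(uint32_t x, unsigned n)
{
    return x & ((UINT32_C(1) << n) - 1u);
}

/* x / 2^n, computed with a right shift (requires n < 32). */
static uint32_t div_pow2(uint32_t x, unsigned n)
{
    return x >> n;
}

/* x * 2^n, computed with a left shift (requires n < 32). */
static uint32_t mul_pow2(uint32_t x, unsigned n)
{
    return x << n;
}

/* Returns 1 if all three shortcuts match %, / and * for every
 * x in [0, 100000) and every power of two from 2^0 to 2^7. */
static int pow2_identities_hold(void)
{
    for (unsigned n = 0; n < 8; n++) {
        uint32_t p = UINT32_C(1) << n;
        for (uint32_t x = 0; x < 100000; x++) {
            if (mod_pow2(x, n) != x % p) return 0;
            if (div_pow2(x, n) != x / p) return 0;
            if (mul_pow2(x, n) != x * p) return 0;
        }
    }
    return 1;
}
```

Applied to the kernel above, (mod % 4) would become (mod & 3) and (mod % 2) would become (mod & 1). A good compiler normally does this on its own for unsigned operands, so any gain depends on the compiler in use.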
member
Activity: 81
Merit: 1002
It was only the wind.
Below are kernels and generated bins for the X algos:

http://www.filedropper.com/x111315kernels
http://www.filedropper.com/x111314bins

A complete solution, including the latest version of sgminer plus all of these kernels, bins, and a configuration file, is available here:
http://www.filedropper.com/sgminer-51-dev-2014-mod

If you'd like to compile it yourself, you can download the source from:
https://github.com/sgminer-dev

These kernels and bins are for Tahiti and Hawaii. I have no Cayman or Pitcairn cards to test on.
After the test if you're satisfied you can tip me here: 13FykK1WoEwXV1WvrjT1hDRi1E1gFaBT8y
And Wolf0, of course!

You should let people know that those sources are extremely old, and the performance is very poor.

Is this a source leak or approved by Wolf0?

Nearly impossible that my current sources have leaked, or anything near. These are extremely old.

In more recent news, 290kh/s+ Neoscrypt on a 270X!  Grin

https://ottrbutt.com/miner/neoscryptwolf-12172014.png (nsfw)
hero member
Activity: 896
Merit: 1000
(number & (power_of_two - 1)) of course.
A bit more detail for non programmer?
Are you saying you want to program crypto algorithms in OpenCL without understanding "(number & (power_of_two - 1))"?
I kinda feel the same way, but I'll be a little gentler. BitmoreCoin, you may want to look up bitwise operators and things like that - an AND operation is far, far faster than modulus (modulus and division are ouch slow, as a general rule).

I think he means how to adjust the statements inside the case 0 and case 1.
legendary
Activity: 2716
Merit: 1094
Black Belt Developer
(number & (power_of_two - 1)) of course.
A bit more detail for non programmer?
Are you saying you want to program crypto algorithms in OpenCL without understanding "(number & (power_of_two - 1))"?
sr. member
Activity: 406
Merit: 250
Very interesting.   I get about 2% gain on 7950 and need to use (mod % 2) with the case statements adjusted accordingly.
My 280X gains almost 6% as is, but the gain difference between (mod % 2) and (mod % 4) is pretty small, like 1-2 KHs

My SMix call is a bit different, I simply put the sub-calls inline so it doesn't bother with ScratchpadStore and ScratchpadMix.
Perhaps this fits nicer into the core and needs less swapping.  

I have tried, unsuccessfully, to further streamline the SMix, but any other way I do it, it's either all HW errors or vastly slower. Any guidance here would be appreciated.


How do you adjust the case 2 accordingly? 

(number % power_of_two) should be a shooting offense when coding for a brain dead compiler.

Do you mind telling us how to do it?

(number & (power_of_two - 1)) of course.

A bit more detail for non programmer?
sr. member
Activity: 406
Merit: 250
Very interesting.   I get about 2% gain on 7950 and need to use (mod % 2) with the case statements adjusted accordingly.
My 280X gains almost 6% as is, but the gain difference between (mod % 2) and (mod % 4) is pretty small, like 1-2 KHs

My SMix call is a bit different, I simply put the sub-calls inline so it doesn't bother with ScratchpadStore and ScratchpadMix.
Perhaps this fits nicer into the core and needs less swapping.  

I have tried, unsuccessfully, to further streamline the SMix, but any other way I do it, it's either all HW errors or vastly slower. Any guidance here would be appreciated.


How do you adjust the case 2 accordingly? 

(number % power_of_two) should be a shooting offense when coding for a brain dead compiler.

Do you mind telling us how to do it?
sr. member
Activity: 406
Merit: 250
Very interesting.   I get about 2% gain on 7950 and need to use (mod % 2) with the case statements adjusted accordingly.
My 280X gains almost 6% as is, but the gain difference between (mod % 2) and (mod % 4) is pretty small, like 1-2 KHs

My SMix call is a bit different, I simply put the sub-calls inline so it doesn't bother with ScratchpadStore and ScratchpadMix.
Perhaps this fits nicer into the core and needs less swapping.  

I have tried, unsuccessfully, to further streamline the SMix, but any other way I do it, it's either all HW errors or vastly slower. Any guidance here would be appreciated.


How do you adjust the case 2 accordingly? 
member
Activity: 81
Merit: 1002
It was only the wind.


I'm not near the computer right now so I can't run sgminer via cmd but here are the steps I took to get the miner software:

-Extracted nicehash's sgminer-5.1-dev-2014-11-13-win32.zip
-From the thread you referenced earlier I grabbed the optimized sgminer kernels and put those in the sgminer directory
-Then added the fixed marucoin-mod.cl to the sgminer directory
-Created an x11 bin, then substituted Wolf0's bin
-and Profit (Very minimal profit)

I disabled CGWatcher creating temporary configs and deleted the bins.  Still had some weird bins created overnight so it appears that wasn't the problem.  Maybe the dailies have fixed this.






Try with my compilation.
https://drive.google.com/file/d/0B3TH7a-0opyWVXgtRkRaM1NTNFE/view?usp=sharing
Run sgminer and let it generate new bins.
Look at the speeds.
If you are not satisfied with the speed, try replacing the newly generated bins with yours, renaming your *.bin files to match the new bin names if the names differ.


Thanks, I'll try your sgminer later, even though that's kinda scary.  What changes did you make?

Nothing, except adding the optimized kernels found in forum threads:
marucoin-mod.cl, darkcoin-mod.cl, neocoin.cl, groestlcoin-v1.cl aka groestlcoin.cl, fresh.cl...
The source is:
https://github.com/sgminer-dev
Building guide:
https://github.com/sgminer-dev/sgminer/tree/master/winbuild
This is the simple part Wink
Feel free to scan it with online antivirus tools or whatever else you like.
If I wanted to profit from you, I would have to rewrite the code and close the source Wink


I did a quick hack job on Fresh, got around 9.1MH/s on 290X. Shame it's not worth anything.
member
Activity: 96
Merit: 10
Hello everyone, I've been away from this for quite some time (the last time I was up to date was when X13 support had just been added), so please forgive me if the answer is hidden somewhere in the 100+ pages of this thread. I tried doing a search, but that task has become a whole project in itself. Can someone please explain, or point me in the right direction for the answer? I have compiled sgminer v5 (for Ubuntu 14.04) from the git source. Everything is working fine, except... where the heck is the Lyra2RE kernel? Is it something I'm supposed to enable at compile time? Am I supposed to pull it from another sgminer? Please advise. Thanks!
Odd I thought lyra2re had been added to the official version but I guess not....
My branch is up to date and has lyra2re added.
Source and binaries are in my sig.
Ok, thanks. I'm compiling for Nicehash automatic multi-algo switching. Their "instructions" still point to https://github.com/sgminer-dev/sgminer/tree/develop. I remember they used to say that only the "development" version supports the automatic switching, but that was a while back. Yet they still refer to the development git source. Is https://github.com/badman74/sgminer now current with the automatic switching support for Nicehash?
Master branch of sgminer-dev works with nicehash switching now.
Also yes my version will work as well.
Awesome.   Thanks for the help and the confirmation.
hero member
Activity: 658
Merit: 500
Hello everyone, I've been away from this for quite some time (the last time I was up to date was when X13 support had just been added), so please forgive me if the answer is hidden somewhere in the 100+ pages of this thread. I tried doing a search, but that task has become a whole project in itself. Can someone please explain, or point me in the right direction for the answer? I have compiled sgminer v5 (for Ubuntu 14.04) from the git source. Everything is working fine, except... where the heck is the Lyra2RE kernel? Is it something I'm supposed to enable at compile time? Am I supposed to pull it from another sgminer? Please advise. Thanks!
Odd I thought lyra2re had been added to the official version but I guess not....
My branch is up to date and has lyra2re added.
Source and binaries are in my sig.
Ok, thanks. I'm compiling for Nicehash automatic multi-algo switching. Their "instructions" still point to https://github.com/sgminer-dev/sgminer/tree/develop. I remember they used to say that only the "development" version supports the automatic switching, but that was a while back. Yet they still refer to the development git source. Is https://github.com/badman74/sgminer now current with the automatic switching support for Nicehash?
Master branch of sgminer-dev works with nicehash switching now.
Also yes my version will work as well.
member
Activity: 96
Merit: 10
Hello everyone, I've been away from this for quite some time (the last time I was up to date was when X13 support had just been added), so please forgive me if the answer is hidden somewhere in the 100+ pages of this thread. I tried doing a search, but that task has become a whole project in itself. Can someone please explain, or point me in the right direction for the answer? I have compiled sgminer v5 (for Ubuntu 14.04) from the git source. Everything is working fine, except... where the heck is the Lyra2RE kernel? Is it something I'm supposed to enable at compile time? Am I supposed to pull it from another sgminer? Please advise. Thanks!
Odd I thought lyra2re had been added to the official version but I guess not....
My branch is up to date and has lyra2re added.
Source and binaries are in my sig.
Ok, thanks. I'm compiling for Nicehash automatic multi-algo switching. Their "instructions" still point to https://github.com/sgminer-dev/sgminer/tree/develop. I remember they used to say that only the "development" version supports the automatic switching, but that was a while back. Yet they still refer to the development git source. Is https://github.com/badman74/sgminer now current with the automatic switching support for Nicehash?
hero member
Activity: 658
Merit: 500
Hello everyone, I've been away from this for quite some time (the last time I was up to date was when X13 support had just been added), so please forgive me if the answer is hidden somewhere in the 100+ pages of this thread. I tried doing a search, but that task has become a whole project in itself. Can someone please explain, or point me in the right direction for the answer? I have compiled sgminer v5 (for Ubuntu 14.04) from the git source. Everything is working fine, except... where the heck is the Lyra2RE kernel? Is it something I'm supposed to enable at compile time? Am I supposed to pull it from another sgminer? Please advise. Thanks!
Odd I thought lyra2re had been added to the official version but I guess not....
My branch is up to date and has lyra2re added.
Source and binaries are in my sig.