
Topic: [ANN] sgminer v5 - optimized X11/X13/NeoScrypt/Lyra2RE/etc. kernel-switch miner - page 87. (Read 877846 times)

legendary
Activity: 3808
Merit: 1723
Darkcoin is hitting new highs and X11 profits are pretty much the same as before.

Disappointing.
newbie
Activity: 1
Merit: 0
I'd just like to say that Wolf0 has been a major help to me.

He got my 7950 from 1.8 MH/s up to 5 MH/s with his builds! More than DOUBLE the efficiency! Thank you!

What did you do to get that hashrate? I would love to get my 7950s to that level! What algo?

Here is a good list of settings:
https://github.com/sgminer-dev/sgminer/blob/master/doc/configuration.md
legendary
Activity: 1428
Merit: 1001
Okey Dokey Lokey
I'd just like to say that Wolf0 has been a major help to me.

He got my 7950 from 1.8 MH/s up to 5 MH/s with his builds (X11, darkcoin-mod)! More than DOUBLE the efficiency! Thank you!
legendary
Activity: 1428
Merit: 1001
Okey Dokey Lokey
Where the hell can I find a README file for sgminer itself?! CGMiner and BFGMiner, for example, tell me everything I need to know, but I can't find shit all for info about sgminer. It seems to just assume that I know all the arguments and commands.
I'm having some issues setting up X11 GPU mining with two 6870s on my other mining computer.

sgminer loads up, tells me its version, then pegs a CPU core at max, pushes my RAM usage to 1 GB, and stops, never to do anything more.
What the fuck?

What's wrong here?
For my main machine (single 7950), it was very simple: I grabbed the miner, made a very simple .bat file and ran it. Bam, done.

Also, I am unable to copy/replace my .bin with the Wolf0 one. Where the heck do I set these?
"keccak-unroll" : "0", "hamsi-expand-big" : "4"

I would assume in a config file, but I'm one of those people who runs a .bat file. When I try to create a config file for sgminer, the program closes so fast that I can't really read the error.
It's something about being unable to build/make/create a config file.
legendary
Activity: 1050
Merit: 1293
Huh?
Quick question:
what hash speed can I get with AMD R9 290 on X11 algo?

With a 280X I get 6.6 MH/s with Wolf's bins; how much more can I get with a 290?

Thanks.
I get 8.5 MH/s on a 290 and 9.0 MH/s on a 290X with 1040/1500 clocks.

Can you give me your conf for the 290?

This will give you 8.5 MH/s:

      "worksize": "64",
      "xintensity": "64",
      "gpu-engine": "1050",
      "gpu-memclock": "1500",
      "gpu-threads": "2",
      "gpu-powertune": "50"

;)

Thanks!!! 8.3 MH/s on my R9 290, and thanks to good guy Wolf too.

Which drivers do you use?

The .bin is pre-generated, so it doesn't really matter, but to be safe, try 14.6-9 (or on Windows, 14.7 or something) :p
newbie
Activity: 21
Merit: 0
Which drivers do you use?
member
Activity: 81
Merit: 1002
It was only the wind.
This is worth 20 KH/s on my 280X: from 343 KH/s to 363 KH/s at a 1020 MHz clock.
Now somebody needs to find 20 KH/s more for me... :)

Change the XORBytesInPlace call from
Code:
  XORBytesInPlace(B + bufidx, input, BLAKE2S_OUT_SIZE);
to
Code:
  XORBytesInPlace(B + bufidx, input, bufidx);
and change the function itself to perform some byte alignment checking
Code:
//
// a bit of byte alignment checking goes a long way...
// every branch XORs the same 32 bytes; only the access width differs
//
void XORBytesInPlace(void *restrict dst, const void *restrict src, uint mod)
{
  switch(mod % 4)
  {
  case 0:
    // offset is a multiple of 4: 4 x uint2 (8 bytes each)
    #pragma unroll 2
    for(int i = 0; i < 4; i += 2)
    {
      ((uint2 *)dst)[i]   ^= ((uint2 *)src)[i];
      ((uint2 *)dst)[i+1] ^= ((uint2 *)src)[i+1];
    }
    break;

  case 2:
    // offset is even but not a multiple of 4: 16 x uchar2 (2 bytes each)
    #pragma unroll 8
    for(int i = 0; i < 16; i += 2)
    {
      ((uchar2 *)dst)[i]   ^= ((uchar2 *)src)[i];
      ((uchar2 *)dst)[i+1] ^= ((uchar2 *)src)[i+1];
    }
    break;

  default:
    // odd offset: 32 single bytes
    #pragma unroll 8
    for(int i = 0; i < 31; i += 4)
    {
      ((uchar *)dst)[i]   ^= ((uchar *)src)[i];
      ((uchar *)dst)[i+1] ^= ((uchar *)src)[i+1];
      ((uchar *)dst)[i+2] ^= ((uchar *)src)[i+2];
      ((uchar *)dst)[i+3] ^= ((uchar *)src)[i+3];
    }
  }
}
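
For anyone who wants to sanity-check the idea on the host before touching the kernel, here is a minimal plain-C sketch of the same dispatch: pick the access width from the low bits of the offset, and make sure every path XORs the same 32 bytes. It is not the OpenCL code above. The names `xor32_at_offset` and `XOR_LEN` are made up for illustration, `XOR_LEN` is assumed to match BLAKE2S_OUT_SIZE (32), and `memcpy` stands in for the vector types so the word path stays legal C whatever the pointer alignment.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define XOR_LEN 32u /* assumption: same length as BLAKE2S_OUT_SIZE */

/* Reference path: XOR one byte at a time. */
static void xor_bytes(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] ^= src[i];
}

/* XOR XOR_LEN bytes of src into base + offset.
 * Offsets that are a multiple of 4 take a 32-bit word path;
 * every other offset falls back to the byte path. */
static void xor32_at_offset(uint8_t *base, size_t offset, const uint8_t *src)
{
    uint8_t *dst = base + offset;

    if ((offset & 3u) == 0) {
        for (size_t i = 0; i < XOR_LEN; i += 4) {
            uint32_t d, s;
            memcpy(&d, dst + i, sizeof d); /* load 4 bytes of dst */
            memcpy(&s, src + i, sizeof s); /* load 4 bytes of src */
            d ^= s;
            memcpy(dst + i, &d, sizeof d); /* store the XORed word */
        }
    } else {
        xor_bytes(dst, src, XOR_LEN);
    }
}
```

Calling `xor32_at_offset(buf, off, in)` for a few values of `off` and comparing against `xor_bytes` is a cheap way to confirm that a faster path still touches exactly the same bytes.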


Modulus by a power of two. The compiler probably fixed it, but you should still feel bad.
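
To spell out the remark about the modulus: for an unsigned value, the remainder after dividing by a power of two is just the low bits, so `mod % 4` and `mod & 3` give the same result. A minimal C sketch (the function names are made up for illustration):

```c
#include <stdint.h>

/* Remainder of x divided by 4, written with the modulus operator. */
static inline uint32_t mod4_div(uint32_t x)
{
    return x % 4u;
}

/* Same remainder taken with a bit mask: keep the two lowest bits.
 * This only matches the modulus for unsigned values. */
static inline uint32_t mod4_mask(uint32_t x)
{
    return x & 3u;
}
```

The equivalence depends on the operand being unsigned, as `uint mod` is in the quoted function. With a signed int, `-1 % 4` is -1 in C99 while `-1 & 3` is 3.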
sr. member
Activity: 434
Merit: 250
Thanks!!! 8.3 MH/s on my R9 290, and thanks to good guy Wolf too.
member
Activity: 81
Merit: 1002
It was only the wind.
Very interesting. I get about a 2% gain on the 7950 and need to use (mod % 2) with the case statements adjusted accordingly.
My 280X gains almost 6% as is, but the gain difference between (mod % 2) and (mod % 4) is pretty small, like 1-2 KH/s.

My SMix call is a bit different: I simply put the sub-calls inline so it doesn't bother with ScratchpadStore and ScratchpadMix.
Perhaps this fits more nicely into the core and needs less swapping.

I have tried, unsuccessfully, to further streamline the SMix, but any other way I do it, it's either all HW errors or vastly slower. Any guidance here would be appreciated.

Code:
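void SMix(ulong16 *X, __global ulong16 *V, bool flag)
{
  int i = 0;
  int idx;

  // fill V: two ulong16 entries per step, until i reaches 256
  while (i ^ 256)
  {
    V[i++] = X[0];
    V[i++] = X[1];
    neoscrypt_blkmix(X, flag);
  }

  // mix: 128 rounds, counting i back down from 256 in steps of 2
  do {
    // even index in 0..254, taken from the low 7 bits of uint 48 of X
    idx = (((uint *)X)[48] & 0x7F) << 1;
    X[0] ^= V[idx];
    X[1] ^= V[idx+1];
    neoscrypt_blkmix(X, flag);
  } while (i -= 2);
}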
 



The calls for store and mix are just to clean the code, they're inlined anyways.
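
A quick host-side check of the two bits of arithmetic in the quoted SMix may help anyone trying to restructure it. The sketch below is plain C, not kernel code, and the names `smix_index`, `smix_mix_rounds` and `V_ENTRIES` are made up for illustration. `V_ENTRIES` is assumed to be 256, taken from the fill loop stopping at i == 256.

```c
#include <stdint.h>

#define V_ENTRIES 256u /* assumption: the fill loop writes V[0..255] */

/* Same expression as the kernel's idx: keep the low 7 bits, then double.
 * The result is always even and at most 254, so idx + 1 <= 255. */
static inline uint32_t smix_index(uint32_t word)
{
    return (word & 0x7Fu) << 1;
}

/* Counts how often the do/while body runs when i starts at 256
 * and the loop condition is (i -= 2). */
static unsigned smix_mix_rounds(void)
{
    unsigned i = 256;
    unsigned rounds = 0;
    do {
        rounds++;
    } while (i -= 2);
    return rounds;
}
```

So whatever value sits in that word of X, the reads of V[idx] and V[idx+1] stay inside the 256 entries written by the first loop, and the second loop runs 128 times. Note that `while (i ^ 256)` in the first loop is simply `while (i != 256)`.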
legendary
Activity: 1050
Merit: 1293
Huh?
This will give you 8.5 MH/s:

      "worksize": "64",
      "xintensity": "64",
      "gpu-engine": "1050",
      "gpu-memclock": "1500",
      "gpu-threads": "2",
      "gpu-powertune": "50"

;)
newbie
Activity: 21
Merit: 0
Can you give me your conf for 290?
hero member
Activity: 658
Merit: 500
I get 8.5 MH/s on a 290 and 9.0 MH/s on a 290X with 1040/1500 clocks.
sr. member
Activity: 434
Merit: 250
Finally got those bins working on pimp, but after 3 minutes the GPU goes dead. I have used the normal bins for over 3 weeks straight.

Would pimp say the GPU is dead if it's not getting enough voltage? I am undervolted from 1.2 V to 1.1 V.

You have to reduce the clock frequency. These bins stress the GPU more than the stock bins.

How low would you go?

It is mining, but then it starts rejecting and then goes dead.

I have put the memory down to 1400. Ideally I don't want to go back to stock volts due to heat.

Keep the memory at 1500 MHz and reduce the core frequency in 5 MHz steps. It is trial and error.
legendary
Activity: 1274
Merit: 1006
Quick question:
what hash speed can I get with AMD R9 290 on X11 algo?

With a 280X I get 6.6 MH/s with Wolf's bins; how much more can I get with a 290?

Thanks.
sr. member
Activity: 700
Merit: 250
How low would you go?

It is mining, but then it starts rejecting and then goes dead.

I have put the memory down to 1400. Ideally I don't want to go back to stock volts due to heat.
sr. member
Activity: 434
Merit: 250
You have to reduce the clock frequency. These bins stress the GPU more than the stock bins.
sr. member
Activity: 700
Merit: 250
Finally got those bins working on pimp :D but after 3 minutes the GPU goes dead. I have used the normal bins for over 3 weeks straight.

Would pimp say the GPU is dead if it's not getting enough voltage? I am undervolted from 1.2 V to 1.1 V.
hero member
Activity: 672
Merit: 500
So Wolf's post just confirmed that the stock code for Qubit support is hideously inefficient!

I had suspected this since mining it over the last 2 days, as my fans spin down considerably when mining Qubit compared to X11 (with Wolf's bins).

So Wolf, was your optimisation of Qubit a case of unhobbling a deliberately inefficient code base?

I'm not trying to go too conspiratorial, but it would not be the first time we have seen deliberate sabotage of publicly available GPU miners.

It's not deliberate, just old code from sph-sgminer, when all the algos were slow.

Pallas is right. The fact of the matter is, the original GPU miner for X11 (not sure about Qubit) was put together for a bounty - it's not malice that makes these GPU miners so shitty, it's incompetence.
I disagree.
PrettyHateMachine (sph-sgminer author) did a more than decent job, especially considering the code he had to deal with. He provided a lot of algorithms in a relatively short time frame. The precomputed SIMD_Q constants are a fairly good indicator that he knew what he was doing, in my opinion. It's obvious those kernels are CPU code; they are almost exactly SPHlib.

Honestly, why should anyone release hi-perf kernels from day one? He probably wasn't interested - bounties encourage first-come, low-quality submissions, which in turn encourage the development of elite kernels to be sold. As a side note, I haven't improved my Qubit either, and I think it has been published for over a year. I just have no interest in improving it at the moment.
alz
full member
Activity: 227
Merit: 100
OK,
Cheers for that clarification, Wolf.

And yes, I was referring to the CryptoNight morph into the hamstrung Monero (XMR) miner scenario.

Hope you are pointing your modded sgminer over at NiceHash atm; that 100% bump you are getting on Qubit must be nice.
newbie
Activity: 47
Merit: 0
Thanks for the post.
Glad to see I'm on the right track.

Quite impressive on that modded miner.