[ANN] Sia GPU Miner | Bitcointalksearch.org

antantti

legendary

Activity: 1176

Merit: 1015

Quote from: Genoil on July 13, 2016, 05:24:13 PM

Quote from: antantti on July 13, 2016, 04:06:42 PM

Quote from: Genoil on July 13, 2016, 01:56:23 PM

Hey,

Not in the near future. I'm close to taking a long break. If SC is still hot when I return and nobody else has done it, I'll consider it.

Genoil,

If you ask me (you don't), that 50 ETH could be much better spent during a long break instead of buying some lousy GPUs...

Bought camping gear ;D

There goes my baby... enjoy your break!!!

Genoil

sr. member

Activity: 438

Merit: 250

Quote from: antantti on July 13, 2016, 04:06:42 PM

Quote from: Genoil on July 13, 2016, 01:56:23 PM

Hey,

Not in the near future. I'm close to taking a long break. If SC is still hot when I return and nobody else has done it, I'll consider it.

Genoil,

If you ask me (you don't), that 50 ETH could be much better spent during a long break instead of buying some lousy GPUs...

Bought camping gear ;D

duts_bg

full member

Activity: 167

Merit: 100

Any ideas why I only get 420 MH/s on a Sapphire 280X?
What can I try to fix this?
GPU usage stays at 100%.

antantti

legendary

Activity: 1176

Merit: 1015

Quote from: Genoil on July 13, 2016, 01:56:23 PM

Hey,

Not in the near future. I'm close to taking a long break. If SC is still hot when I return and nobody else has done it, I'll consider it.

Genoil,

If you ask me (you don't), that 50 ETH could be much better spent during a long break instead of buying some lousy GPUs...

Genoil

sr. member

Activity: 438

Merit: 250

Quote from: miningpoolhub on July 13, 2016, 12:33:10 PM

Quote from: Genoil on July 13, 2016, 10:47:39 AM

Mind you, I'm not responsible for what the Cryptomining Blog posts. My fork is far from production-ready. It's more like a sketch that happens to mine blocks and shares. Go with the Go miner; it is far more complete.

Hi Genoil,
Happy to see you on this forum.
Is there any chance of a stratum-enabled miner in the near future?

The current polling (long-polling in some mods) hurts the performance of both server and client.
I would like to open a stratum-compatible Siacoin pool, but there's no miner with stratum support yet.

As there's no standard stratum protocol for Siacoin, simply adapting an Ethereum stratum protocol would be good. (Luckily, they are quite similar.)
(The "NiceHash-compatible stratum" or the "stratum-mining-proxy from Coinotron" protocol seems quite good.)

Hey,

Not in the near future. I'm close to taking a long break. If SC is still hot when I return and nobody else has done it, I'll consider it.

miningpoolhub

legendary

Activity: 1456

Merit: 1006

Mining Pool Hub

Quote from: Genoil on July 13, 2016, 10:47:39 AM

Mind you, I'm not responsible for what the Cryptomining Blog posts. My fork is far from production-ready. It's more like a sketch that happens to mine blocks and shares. Go with the Go miner; it is far more complete.

Hi Genoil,
Happy to see you on this forum.
Is there any chance of a stratum-enabled miner in the near future?

The current polling (long-polling in some mods) hurts the performance of both server and client.
I would like to open a stratum-compatible Siacoin pool, but there's no miner with stratum support yet.

As there's no standard stratum protocol for Siacoin, simply adapting an Ethereum stratum protocol would be good. (Luckily, they are quite similar.)
(The "NiceHash-compatible stratum" or the "stratum-mining-proxy from Coinotron" protocol seems quite good.)

Genoil

sr. member

Activity: 438

Merit: 250

Mind you, I'm not responsible for what the Cryptomining Blog posts. My fork is far from production-ready. It's more like a sketch that happens to mine blocks and shares. Go with the Go miner; it is far more complete.

ldw-com

full member

Activity: 229

Merit: 100

Quote from: navydude on July 13, 2016, 09:36:20 AM

Quote from: BTCBusinessConsult on July 13, 2016, 08:31:02 AM

So is there an EXE for those of us who want to hobby-mine this and have ZERO understanding of the last 3 pages of text? lol ;D

Ummm Yeah! Me too.

Gominer exe for pools

https://github.com/Eliovp/gominer/tree/Eliovp-binaries

Genoil's exe

http://cryptomining-blog.com/8089-new-sia-gpu-miner-for-opencl-forked-by-genoil-now-available/

Greetings

navydude

sr. member

Activity: 283

Merit: 250

Quote from: BTCBusinessConsult on July 13, 2016, 08:31:02 AM

So is there an EXE for those of us who want to hobby-mine this and have ZERO understanding of the last 3 pages of text? lol ;D

Ummm Yeah! Me too.

BTCBusinessConsult

sr. member

Activity: 546

Merit: 250

Active Trading on EPIC5k and Spectre.Ai

So is there an EXE for those of us who want to hobby-mine this and have ZERO understanding of the last 3 pages of text? lol ;D

Amph

legendary

Activity: 3248

Merit: 1072

Quote from: marvykkio on July 13, 2016, 08:13:23 AM

Why is everyone crazy about Siacoin? It's worth nothing.

1 SC = 0.00000091 BTC
???

Because you can mine a shitton of it: with one stupid rig of six GPUs you can make almost 30k coins a day.

Keep mining useless Ethereum, which isn't profitable by a long shot anymore...

marvykkio

hero member

Activity: 798

Merit: 1000

Why is everyone crazy about Siacoin? It's worth nothing.

1 SC = 0.00000091 BTC
???

sp_

legendary

Activity: 2954

Merit: 1087

Team Black developer

Quote from: sp_ on July 13, 2016, 01:59:47 AM

But the klaus_t version is getting a lot of rejects on the pool.

It looks like the rejects are gone with this small code change.
There seems to be a bug in the multinonce retrieval. Here I return only one solution per iteration, and it works.

Code:

if (*((uint32_t*)h) <= target)
{
    int i = 0; // single result slot; the multinonce loop below is disabled
    /* uint64_t tmp = devectorize(header[4]);
    for (i = 0; i < MAXRESULTS; i++)
    {
        tmp = atomicCAS(&((uint64_t *)nonceOut)[i], 0, tmp);
        if (tmp == 0)
            break;
    }
    */
    nonceOut[i] = header[4];
    // uint64_t tmp = devectorize(header[4])

    // Finish the remaining hash words only for a candidate solution.
    hashOut[i * 4] = h[0];
    v[1] = v[1] + v[6] + header[0]; v[12] = __swap_hilo(v[12] ^ v[1]); v[11] = v[11] + v[12];
    v[1] = v[1] + __byte_perm_64(v[6] ^ v[11], 0x6543, 0x2107) + header[2];
    v[3] = v[3] + v[4] + header[5]; v[14] = __swap_hilo(v[14] ^ v[3]); v[9] = v[9] + v[14];
    v[3] = v[3] + __byte_perm_64(v[4] ^ v[9], 0x6543, 0x2107) + header[3];
    hashOut[i * 4 + 1] = vectorize(0xbb67ae8584caa73b) ^ v[1] ^ (v[9] + __byte_perm_64(v[14] ^ v[3], 0x5432, 0x1076));
    hashOut[i * 4 + 2] = vectorize(0x3c6ef372fe94f82b) ^ v[2] ^ (v[10] + __byte_perm_64(v[15] ^ v[0], 0x5432, 0x1076));
    hashOut[i * 4 + 3] = vectorize(0xa54ff53a5f1d36f1) ^ v[3] ^ (v[11] + __byte_perm_64(v[12] ^ v[1], 0x5432, 0x1076));
    return;
}
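
One possible reading of the multinonce bug, offered as a guess and not a confirmed diagnosis: in the commented-out loop, atomicCAS returns the previous contents of the slot, so after a failed attempt tmp holds another thread's nonce instead of this thread's, and if every slot is taken i ends up at MAXRESULTS, one past the end. Below is a minimal host-side C++ sketch of slot claiming that avoids both problems. It uses std::atomic as a stand-in for atomicCAS; claim_slot and kMaxResults are hypothetical names, not taken from any of the miners.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kMaxResults = 4;  // hypothetical stand-in for MAXRESULTS

// Try to store `nonce` in the first free slot (0 means "empty").
// Returns the claimed slot index, or -1 if every slot is already taken.
inline int claim_slot(std::atomic<std::uint64_t> (&slots)[kMaxResults],
                      std::uint64_t nonce)
{
    assert(nonce != 0);  // 0 is reserved as the "empty" marker in this sketch
    for (std::size_t i = 0; i < kMaxResults; ++i) {
        std::uint64_t expected = 0;  // reset on every attempt
        // `nonce` itself is never overwritten, unlike `tmp` in the loop above.
        if (slots[i].compare_exchange_strong(expected, nonce))
            return static_cast<int>(i);
    }
    return -1;  // caller must not write any hash words in this case
}
```

On the GPU the same idea would keep the nonce in its own variable, compare the return value of atomicCAS against 0, and skip the hashOut writes when no slot was claimed.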

sp_

legendary

Activity: 2954

Merit: 1087

Team Black developer

Quote from: Genoil on July 13, 2016, 02:06:34 AM

Quote from: sp_ on July 13, 2016, 01:59:47 AM

Klaus_t's CUDA port is doing 355 MH/s on the 750 Ti (64-bit, CUDA 8.0). Genoil's OpenCL kernel does around 303 MH/s.
But the klaus_t version is getting a lot of rejects on the pool.
https://github.com/KlausT/Sia-CUDA-Miner

What does it do with the funnel shift asm? Or did you already paste that in?
(BTW, it is NOT my kernel. I only provided Windows build files and added pool support.)
My fork is also still single-threaded. I wasn't actually planning to build a proper miner around it; it was just a bit of kernel play.

I managed to gain some more hashrate by rewriting the kernel to use uint2, like this:

Code:

#undef rotr64
#undef __byte_perm_64
#undef __swap_hilo

// 64-bit rotate right on a uint2 (x = low word, y = high word),
// built from two 32-bit funnel shifts.
__device__ __forceinline__
uint2 rotr64(const uint2 a, const int offset)
{
    uint2 result;
    if (offset < 32) {
        asm("shf.r.wrap.b32 %0, %1, %2, %3;" : "=r"(result.x) : "r"(a.x), "r"(a.y), "r"(offset));
        asm("shf.r.wrap.b32 %0, %1, %2, %3;" : "=r"(result.y) : "r"(a.y), "r"(a.x), "r"(offset));
    }
    else {
        asm("shf.r.wrap.b32 %0, %1, %2, %3;" : "=r"(result.x) : "r"(a.y), "r"(a.x), "r"(offset));
        asm("shf.r.wrap.b32 %0, %1, %2, %3;" : "=r"(result.y) : "r"(a.x), "r"(a.y), "r"(offset));
    }
    return result;
}

// Byte permutation across both words. With selectors (0x6543, 0x2107) this is
// a rotate right by 24 bits; with (0x5432, 0x1076) a rotate right by 16 bits.
__device__ __forceinline__
uint2 __byte_perm_64(const uint2 source, const uint32_t grab1, const uint32_t grab2)
{
    uint2 r;
    asm("prmt.b32 %0, %1, %2, %3;" : "=r"(r.x) : "r"(source.x), "r"(source.y), "r"(grab1));
    asm("prmt.b32 %0, %1, %2, %3;" : "=r"(r.y) : "r"(source.x), "r"(source.y), "r"(grab2));
    return r;
}

// Swapping the two words is a rotate by 32 bits.
__device__ __forceinline__
uint2 __swap_hilo(const uint2 source)
{
    uint2 r;

    r.x = source.y;
    r.y = source.x;

    return r;
}
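
For anyone who wants to sanity-check these helpers without a GPU, here is a rough host-side C++ sketch that emulates the two PTX instructions and compares the results against a plain 64-bit rotate. The names U2, shf_r_wrap, prmt and check_rotations are made up for this sketch, and prmt only models the default mode with selector nibbles 0 to 7, which is all the helpers above use.

```cpp
#include <cassert>
#include <cstdint>

struct U2 { std::uint32_t x, y; };  // x = low word, y = high word, like CUDA's uint2

inline U2 split(std::uint64_t v) { return { static_cast<std::uint32_t>(v), static_cast<std::uint32_t>(v >> 32) }; }
inline std::uint64_t join(U2 v) { return (static_cast<std::uint64_t>(v.y) << 32) | v.x; }

// Reference 64-bit rotate right, valid for 0 < n < 64.
inline std::uint64_t rotr64_ref(std::uint64_t v, unsigned n) { return (v >> n) | (v << (64 - n)); }

// Emulates shf.r.wrap.b32: low 32 bits of the pair {hi:lo} shifted right by (n & 31).
inline std::uint32_t shf_r_wrap(std::uint32_t lo, std::uint32_t hi, unsigned n)
{
    const std::uint64_t pair = (static_cast<std::uint64_t>(hi) << 32) | lo;
    return static_cast<std::uint32_t>(pair >> (n & 31));
}

// Emulates prmt.b32 (default mode): each selector nibble picks one byte of {b:a}.
inline std::uint32_t prmt(std::uint32_t a, std::uint32_t b, std::uint32_t sel)
{
    const std::uint64_t pool = (static_cast<std::uint64_t>(b) << 32) | a;
    std::uint32_t r = 0;
    for (int i = 0; i < 4; ++i) {
        const unsigned idx = (sel >> (4 * i)) & 0x7;
        r |= static_cast<std::uint32_t>((pool >> (8 * idx)) & 0xff) << (8 * i);
    }
    return r;
}

inline U2 rotr64_u2(U2 a, unsigned n)
{
    return n < 32 ? U2{ shf_r_wrap(a.x, a.y, n), shf_r_wrap(a.y, a.x, n) }
                  : U2{ shf_r_wrap(a.y, a.x, n), shf_r_wrap(a.x, a.y, n) };
}
inline U2 byte_perm_64(U2 s, std::uint32_t g1, std::uint32_t g2) { return { prmt(s.x, s.y, g1), prmt(s.x, s.y, g2) }; }
inline U2 swap_hilo(U2 s) { return { s.y, s.x }; }

// The four BLAKE2b rotation amounts: 32, 24, 16 and 63.
inline void check_rotations(std::uint64_t v)
{
    assert(join(swap_hilo(split(v))) == rotr64_ref(v, 32));
    assert(join(byte_perm_64(split(v), 0x6543, 0x2107)) == rotr64_ref(v, 24));
    assert(join(byte_perm_64(split(v), 0x5432, 0x1076)) == rotr64_ref(v, 16));
    assert(join(rotr64_u2(split(v), 63)) == rotr64_ref(v, 63));
}
```

Calling check_rotations with a few arbitrary 64-bit values is enough to catch a swapped operand or a wrong selector before touching the kernel.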

Code:

__global__ void __launch_bounds__(blocksize, 4) nonceGrind_SP(const uint2 * __restrict__ headerIn, uint2 * __restrict__ hashOut, uint2 * __restrict__ nonceOut, const uint2 * __restrict__ v1, uint32_t target)
{
    uint2 header[10], h[4], v[16];

    // First nonce for this thread; each thread scans npt consecutive nonces.
    uint32_t id = (blockDim.x * blockIdx.x + threadIdx.x)*npt;

    #pragma unroll
    for (int i = 0; i < 10; i++)
        header[i] = headerIn[i];

    for (int n = id; n < id + npt; n++)
    {
        // The nonce goes into the low 32 bits of header[4].
        ((uint32_t*)header)[8] = n;
        v[2] = vectorize(0x5BF2CD1EF9D6B596u) + header[4]; v[14] = __swap_hilo(~vectorize(0x1f83d9abfb41bd6bu) ^ v[2]); v[10] = vectorize(0x3c6ef372fe94f82bu) + v[14]; v[6] = __byte_perm_64(vectorize(0x1f83d9abfb41bd6bu) ^ v[10], 0x6543, 0x2107);
        v[2] = v[2] + v[6] + header[5]; v[14] = __byte_perm_64(v[14] ^ v[2], 0x5432, 0x1076); v[10] = v[10] + v[14]; v[6] = rotr64(v[6] ^ v[10], 63);
        v[3] = vectorize(0x130C253729B586Au) + header[6]; v[15] = __swap_hilo(vectorize(0x5be0cd19137e2179u) ^ v[3]); v[11] = vectorize(0xa54ff53a5f1d36f1u) + v[15]; v[7] = __byte_perm_64(vectorize(0x5be0cd19137e2179u) ^ v[11], 0x6543, 0x2107);
        v[3] = v[3] + v[7] + header[7]; v[15] = __byte_perm_64(v[15] ^ v[3], 0x5432, 0x1076); v[11] = v[11] + v[15]; v[7] = rotr64(v[7] ^ v[11], 63);
        v[0] = v1[0] + v1[5] + header[8]; v[15] = __swap_hilo(v[15] ^ v[0]); v[10] = v[10] + v[15]; v[5] = __byte_perm_64(v1[5] ^ v[10], 0x6543, 0x2107);
        v[0] = v[0] + v[5] + header[9]; v[15] = __byte_perm_64(v[15] ^ v[0], 0x5432, 0x1076); v[10] = v[10] + v[15]; v[5] = rotr64(v[5] ^ v[10], 63);

....

etc
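
The nonce indexing in the kernel above (id = global thread index times npt, then npt consecutive nonces per thread) can be checked on the host. This is only a small C++ sketch with a made-up function name, nonce_coverage, and toy launch sizes; it replays the same index formula on the CPU and counts how often each nonce is visited.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Replays the kernel's index formula for grid_dim blocks of block_dim threads,
// with npt nonces per thread, and returns the visit count of every nonce.
inline std::vector<int> nonce_coverage(std::uint32_t grid_dim, std::uint32_t block_dim, std::uint32_t npt)
{
    assert(grid_dim > 0 && block_dim > 0 && npt > 0);
    std::vector<int> hits(static_cast<std::size_t>(grid_dim) * block_dim * npt, 0);
    for (std::uint32_t block = 0; block < grid_dim; ++block) {
        for (std::uint32_t thread = 0; thread < block_dim; ++thread) {
            const std::uint32_t id = (block_dim * block + thread) * npt;  // same formula as the kernel
            for (std::uint32_t n = id; n < id + npt; ++n)
                ++hits[n];
        }
    }
    return hits;
}
```

If every entry comes back as exactly 1, the threads cover the range without gaps or overlap, so no nonce is hashed twice within one launch.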

You also need to copy some helper functions from ccminer:

Code:

__device__ __forceinline__ uint64_t devectorize(uint2 x)
{
    uint64_t result;
    asm("mov.b64 %0,{%1,%2}; \n\t"
        : "=l"(result) : "r"(x.x), "r"(x.y));
    return result;
}

__device__ __forceinline__ uint2 vectorize(const uint64_t x)
{
    uint2 result;
    asm("mov.b64 {%0,%1},%2; \n\t"
        : "=r"(result.x), "=r"(result.y) : "l"(x));
    return result;
}

static __device__ __forceinline__ uint2 operator^ (uint2 a, uint32_t b) { return make_uint2(a.x^ b, a.y); }
static __device__ __forceinline__ uint2 operator^ (uint2 a, uint2 b) { return make_uint2(a.x ^ b.x, a.y ^ b.y); }
static __device__ __forceinline__ uint2 operator& (uint2 a, uint2 b) { return make_uint2(a.x & b.x, a.y & b.y); }
static __device__ __forceinline__ uint2 operator| (uint2 a, uint2 b) { return make_uint2(a.x | b.x, a.y | b.y); }
static __device__ __forceinline__ uint2 operator~ (uint2 a) { return make_uint2(~a.x, ~a.y); }
static __device__ __forceinline__ void operator^= (uint2 &a, uint2 b) { a = a ^ b; }
static __device__ __forceinline__ uint2 operator+ (uint2 a, uint2 b)
{
    uint2 result;
    asm(
        "add.cc.u32 %0,%2,%4; \n\t"
        "addc.u32 %1,%3,%5; \n\t"
        : "=r"(result.x), "=r"(result.y) : "r"(a.x), "r"(a.y), "r"(b.x), "r"(b.y));
    return result;
}

static __device__ __forceinline__ uint2 operator+ (uint2 a, uint32_t b)
{
    uint2 result;
    asm("add.cc.u32 %0,%2,%4; \n\t"
        "addc.u32 %1,%3,%5; \n\t"
        : "=r"(result.x), "=r"(result.y) : "r"(a.x), "r"(a.y), "r"(b), "r"(0));
    return result;
}

static __device__ __forceinline__ uint2 operator- (uint2 a, uint32_t b)
{
    uint2 result;
    asm("sub.cc.u32 %0,%2,%4; \n\t"
        "subc.u32 %1,%3,%5; \n\t"
        : "=r"(result.x), "=r"(result.y) : "r"(a.x), "r"(a.y), "r"(b), "r"(0));
    return result;
}

static __device__ __forceinline__ uint2 operator- (uint2 a, uint2 b)
{
    uint2 result;
    asm("sub.cc.u32 %0,%2,%4; \n\t"
        "subc.u32 %1,%3,%5; \n\t"
        : "=r"(result.x), "=r"(result.y) : "r"(a.x), "r"(a.y), "r"(b.x), "r"(b.y));
    return result;
}
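
The add.cc/addc and sub.cc/subc pairs above implement 64-bit addition and subtraction as two 32-bit operations linked by a carry or borrow. A plain host-side C++ sketch of the same idea, handy for checking expected values on the CPU, could look like this; U2, add_u2, sub_u2 and check_carry_chain are names invented for this sketch and are not part of ccminer.

```cpp
#include <cassert>
#include <cstdint>

struct U2 { std::uint32_t x, y; };  // x = low word, y = high word, like CUDA's uint2

inline U2 vectorize(std::uint64_t v) { return { static_cast<std::uint32_t>(v), static_cast<std::uint32_t>(v >> 32) }; }
inline std::uint64_t devectorize(U2 v) { return (static_cast<std::uint64_t>(v.y) << 32) | v.x; }

// Same effect as add.cc.u32 followed by addc.u32.
inline U2 add_u2(U2 a, U2 b)
{
    U2 r;
    r.x = a.x + b.x;                              // low words, wraps modulo 2^32
    const std::uint32_t carry = r.x < a.x ? 1u : 0u;  // wrapped sum is smaller on overflow
    r.y = a.y + b.y + carry;                      // high words plus carry-in
    return r;
}

// Same effect as sub.cc.u32 followed by subc.u32.
inline U2 sub_u2(U2 a, U2 b)
{
    U2 r;
    r.x = a.x - b.x;
    const std::uint32_t borrow = a.x < b.x ? 1u : 0u;
    r.y = a.y - b.y - borrow;
    return r;
}

// Compares the split arithmetic with native 64-bit arithmetic.
inline void check_carry_chain(std::uint64_t a, std::uint64_t b)
{
    assert(devectorize(add_u2(vectorize(a), vectorize(b))) == a + b);
    assert(devectorize(sub_u2(vectorize(a), vectorize(b))) == a - b);
}
```

The interesting inputs are the ones that cross the 32-bit boundary, for example 0x00000000ffffffff plus 1, where the carry has to reach the high word.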

sp_

legendary

Activity: 2954

Merit: 1087

Team Black developer

Power draw at the wall for the 750 Ti:

Genoil: 60 W @ 303 MH/s
Klaus_t: 40 W @ 356 MH/s
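
Put as efficiency, those two readings work out to roughly 5.05 MH/s per watt for the OpenCL kernel and 8.9 MH/s per watt for the CUDA port. A trivial C++ helper for the same arithmetic, in case anyone wants to compare their own cards (mh_per_watt is just a name picked for this sketch):

```cpp
#include <cassert>

// Hashrate per watt, in MH/s per W.
inline double mh_per_watt(double mhash_per_s, double watts)
{
    assert(watts > 0.0);  // a zero or negative reading makes no sense here
    return mhash_per_s / watts;
}
```

For example, mh_per_watt(303, 60) and mh_per_watt(356, 40) reproduce the two figures above.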

Genoil

sr. member

Activity: 438

Merit: 250

Quote from: sp_ on July 13, 2016, 01:59:47 AM

Klaus_t's CUDA port is doing 355 MH/s on the 750 Ti (64-bit, CUDA 8.0). Genoil's OpenCL kernel does around 303 MH/s.
But the klaus_t version is getting a lot of rejects on the pool.

https://github.com/KlausT/Sia-CUDA-Miner

What does it do with the funnel shift asm? Or did you already paste that in?

(BTW, it is NOT my kernel. I only provided Windows build files and added pool support.)

My fork is also still single-threaded. I wasn't actually planning to build a proper miner around it; it was just a bit of kernel play.

Amph

legendary

Activity: 3248

Merit: 1072

Klaus's version still needs more than one instance; it should be an easy fix for someone who knows how to code...

sp_

legendary

Activity: 2954

Merit: 1087

Team Black developer

Klaus_t's CUDA port is doing 355 MH/s on the 750 Ti (64-bit, CUDA 8.0). Genoil's OpenCL kernel does around 303 MH/s.
But the klaus_t version is getting a lot of rejects on the pool.

https://github.com/KlausT/Sia-CUDA-Miner

cijulangboy

sr. member

Activity: 406

Merit: 250

Quote from: go6ooo1212 on June 08, 2015, 09:57:40 AM

Quote from: Taek on June 08, 2015, 09:50:13 AM

Quote from: djm34 on June 07, 2015, 09:27:18 AM

Will it work with standard sgminer/ccminer (and getwork/stratum stuff), or is it something entirely different?

It's entirely different. You can find binaries here: https://github.com/NebulousLabs/Sia-GPU-Miner/releases

It's working for me now too, but the rig shows activity on only one of the cards. Speed is about 32 MH/s.

OK, thanks for the information.

miningpoolhub

legendary

Activity: 1456

Merit: 1006

Mining Pool Hub

Quote from: ?? on ??

Quote from: miningpoolhub on July 12, 2016, 07:31:47 PM

Any plans for stratum support?

miningpoolhub would like to support a new stratum protocol and open a Siacoin pool.

I have just one request for the spec: pass the NETWORK diff, as well as the share diff, in there.

If a working miner comes out, I can test it and open the pool.
Could you maybe implement that?
