
Topic: [ANN] Sia GPU Miner (Read 10889 times)

legendary
Activity: 1176
Merit: 1015
July 13, 2016, 06:00:56 PM
#48
Hey,

Not in the near future. I'm close to taking a long break. If SC is still hot when I return and nobody else did, I'll consider it.

Genoil,

If you ask me (you don't) that 50 eth could be much better spent during a long break instead of buying some lousy GPUs...

Bought camping gear Grin

There goes my baby... enjoy your break!!!
sr. member
Activity: 438
Merit: 250
July 13, 2016, 05:24:13 PM
#47
Hey,

Not in the near future. I'm close to taking a long break. If SC is still hot when I return and nobody else did, I'll consider it.

Genoil,

If you ask me (you don't) that 50 eth could be much better spent during a long break instead of buying some lousy GPUs...

Bought camping gear Grin
full member
Activity: 167
Merit: 100
July 13, 2016, 04:27:05 PM
#46
Any ideas why I get only 420 MH on a Sapphire x280?
What can I try to fix this?
It stays at 100% usage.
legendary
Activity: 1176
Merit: 1015
July 13, 2016, 04:06:42 PM
#45
Hey,

Not in the near future. I'm close to taking a long break. If SC is still hot when I return and nobody else did, I'll consider it.

Genoil,

If you ask me (you don't) that 50 eth could be much better spent during a long break instead of buying some lousy GPUs...
sr. member
Activity: 438
Merit: 250
July 13, 2016, 01:56:23 PM
#44
Mind you I'm not responsible for what Cryptomining blog posts. My fork is far from production ready. It's more like a sketch that actually happens to mine blocks and shares. Go with the GO miner, it is far more complete.

Hi Genoil
Happy to see you at this forum.
Is there any chance of implementing a stratum-enabled miner in the near future?

The current polling (long-polling from some mod) hurts the performance of both server and client.
I would like to open a stratum-compatible Siacoin pool, but there's no stratum-capable miner yet.

As there's no standard stratum protocol for Siacoin, just adapting the Ethereum stratum protocol would be good. (Luckily they are quite similar.)
(The "Nicehash compatible stratum" or the "stratum-mining-proxy from coinotron" protocol seems quite good.)

Hey,

Not in the near future. I'm close to taking a long break. If SC is still hot when I return and nobody else did, I'll consider it.
legendary
Activity: 1456
Merit: 1006
Mining Pool Hub
July 13, 2016, 12:33:10 PM
#43
Mind you I'm not responsible for what Cryptomining blog posts. My fork is far from production ready. It's more like a sketch that actually happens to mine blocks and shares. Go with the GO miner, it is far more complete.

Hi Genoil
Happy to see you at this forum.
Is there any chance of implementing a stratum-enabled miner in the near future?

The current polling (long-polling from some mod) hurts the performance of both server and client.
I would like to open a stratum-compatible Siacoin pool, but there's no stratum-capable miner yet.

As there's no standard stratum protocol for Siacoin, just adapting the Ethereum stratum protocol would be good. (Luckily they are quite similar.)
(The "Nicehash compatible stratum" or the "stratum-mining-proxy from coinotron" protocol seems quite good.)
sr. member
Activity: 438
Merit: 250
July 13, 2016, 10:47:39 AM
#42
Mind you I'm not responsible for what Cryptomining blog posts. My fork is far from production ready. It's more like a sketch that actually happens to mine blocks and shares. Go with the GO miner, it is far more complete.
full member
Activity: 229
Merit: 100
July 13, 2016, 09:45:01 AM
#41
So is there an EXE for those of us who want to hobby mine this and have ZERO understanding of the last 3 pages of text? lol  Grin Grin

Ummm Yeah! Me too.

Gominer exe for pools

https://github.com/Eliovp/gominer/tree/Eliovp-binaries


Genoil's exe

http://cryptomining-blog.com/8089-new-sia-gpu-miner-for-opencl-forked-by-genoil-now-available/


Greetings
sr. member
Activity: 283
Merit: 250
July 13, 2016, 09:36:20 AM
#40
So is there an EXE for those of us who want to hobby mine this and have ZERO understanding of the last 3 pages of text? lol  Grin Grin

Ummm Yeah! Me too.
sr. member
Activity: 546
Merit: 250
Active Trading on EPIC5k and Spectre.Ai
July 13, 2016, 08:31:02 AM
#39
So is there an EXE for those of us who want to hobby mine this and have ZERO understanding of the last 3 pages of text? lol  Grin Grin
legendary
Activity: 3248
Merit: 1070
July 13, 2016, 08:23:32 AM
#38
Why is everyone crazy about Siacoin? It's worth nothing.

1 SC = 0.00000091 BTC
 Huh Huh

Because you can mine a shitton: with one stupid six-GPU rig you can do almost 30k coins a day.

Keep mining useless Ethereum, which is not profitable by a long shot anymore...
hero member
Activity: 798
Merit: 1000
July 13, 2016, 08:13:23 AM
#37
Why is everyone crazy about Siacoin? It's worth nothing.

1 SC = 0.00000091 BTC
 Huh Huh
sp_
legendary
Activity: 2926
Merit: 1087
Team Black developer
July 13, 2016, 02:30:24 AM
#36
But the klaus_t version is getting a lot of rejects on the pool.

Looks like the rejects are gone with this small code change.
There seems to be a bug in the multinonce retrieval. Here I return only one solution per iteration, and it works.

Code:

if (*((uint32_t*)h) <= target)
{
    // Single-result path: always use slot 0. (As posted, i was left
    // uninitialized once the loop below was commented out.)
    int i = 0;
    /* uint64_t tmp = devectorize(header[4]);
    for (i = 0; i < MAXRESULTS; i++)
    {
        tmp = atomicCAS(&((uint64_t *)nonceOut)[i], 0, tmp);
        if (tmp == 0)
            break;
    }
    */
    nonceOut[i] = header[4];
    // uint64_t tmp = devectorize(header[4])

    hashOut[i * 4] = h[0];
    v[1] = v[1] + v[6] + header[0]; v[12] = __swap_hilo(v[12] ^ v[1]); v[11] = v[11] + v[12];
    v[1] = v[1] + __byte_perm_64(v[6] ^ v[11], 0x6543, 0x2107) + header[2];
    v[3] = v[3] + v[4] + header[5]; v[14] = __swap_hilo(v[14] ^ v[3]); v[9] = v[9] + v[14];
    v[3] = v[3] + __byte_perm_64(v[4] ^ v[9], 0x6543, 0x2107) + header[3];
    hashOut[i * 4 + 1] = vectorize(0xbb67ae8584caa73b) ^ v[1] ^ (v[9] + __byte_perm_64(v[14] ^ v[3], 0x5432, 0x1076));
    hashOut[i * 4 + 2] = vectorize(0x3c6ef372fe94f82b) ^ v[2] ^ (v[10] + __byte_perm_64(v[15] ^ v[0], 0x5432, 0x1076));
    hashOut[i * 4 + 3] = vectorize(0xa54ff53a5f1d36f1) ^ v[3] ^ (v[11] + __byte_perm_64(v[12] ^ v[1], 0x5432, 0x1076));
    return;
}
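
One possible reading of the commented-out loop, offered as a guess and not a confirmed diagnosis: `atomicCAS` returns the value that was already in the slot, and the loop writes that return value back into `tmp`. After hitting an occupied slot, the thread would then try to publish the other thread's nonce instead of its own. The host-side C++ sketch below imitates the slot-claiming pattern with `std::atomic` and keeps the nonce in its own variable. `kMaxResults` and `claim_slot` are hypothetical names for illustration, not taken from either miner.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for MAXRESULTS in the kernel above.
constexpr int kMaxResults = 4;

// Try to store `nonce` in the first empty slot (a slot is empty while it holds 0).
// Returns the slot index, or -1 if every slot is already taken.
// The nonce is never replaced by the value read back from an occupied slot.
inline int claim_slot(std::array<std::atomic<std::uint64_t>, kMaxResults>& slots,
                      std::uint64_t nonce)
{
    assert(nonce != 0 && "0 marks an empty slot, so it cannot be stored");
    for (int i = 0; i < kMaxResults; ++i) {
        std::uint64_t expected = 0;
        if (slots[i].compare_exchange_strong(expected, nonce))
            return i;  // this caller now owns slot i
        // Slot i was occupied; `expected` holds the other nonce. Try the next slot.
    }
    return -1;  // caller must not write any hash output in this case
}
```

The returned index is what the hash output would be written under, so a caller that gets -1 has to skip the write instead of indexing past the end of the buffer.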
sp_
legendary
Activity: 2926
Merit: 1087
Team Black developer
July 13, 2016, 02:22:03 AM
#35
Klaus_t's CUDA port is doing 355 MHASH on the 750 Ti (64-bit CUDA 8.0). Genoil's OpenCL kernel does around 303 MHASH.
But the klaus_t version is getting a lot of rejects on the pool.
https://github.com/KlausT/Sia-CUDA-Miner
What does it do with the funnel shift asm? Or did you already paste that in?
(btw it is NOT my kernel. i only provided windows build files and added pool support.)
my fork also is still single threaded. i wasn't actually planning to build a proper miner around it. just a bit of kernel play.

I managed to add some more hash by rewriting to uint2 like this:

Code:
#undef rotr64
#undef __byte_perm_64
#undef __swap_hilo

__inline__ __device__ uint2 rotr64(const uint2 a, const int offset)
{
    uint2 result;
    if (offset < 32) {
        asm("shf.r.wrap.b32 %0, %1, %2, %3;" : "=r"(result.x) : "r"(a.x), "r"(a.y), "r"(offset));
        asm("shf.r.wrap.b32 %0, %1, %2, %3;" : "=r"(result.y) : "r"(a.y), "r"(a.x), "r"(offset));
    }
    else {
        asm("shf.r.wrap.b32 %0, %1, %2, %3;" : "=r"(result.x) : "r"(a.y), "r"(a.x), "r"(offset));
        asm("shf.r.wrap.b32 %0, %1, %2, %3;" : "=r"(result.y) : "r"(a.x), "r"(a.y), "r"(offset));
    }
    return result;
}

__device__ __forceinline__
uint2 __byte_perm_64(const uint2 source, const uint32_t grab1, const uint32_t grab2)
{
    uint2 r;
    asm("prmt.b32 %0, %1, %2, %3;" : "=r"(r.x) : "r"(source.x), "r"(source.y), "r"(grab1));
    asm("prmt.b32 %0, %1, %2, %3;" : "=r"(r.y) : "r"(source.x), "r"(source.y), "r"(grab2));
    return r;
}

__device__ __forceinline__
uint2 __swap_hilo(const uint2 source)
{
    uint2 r;

    r.x = source.y;
    r.y = source.x;

    return r;
}
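
For anyone wanting to sanity-check these helpers without a GPU, here is a host-side C++ sketch (not part of either miner) that emulates them on a pair of 32-bit words and compares against a plain 64-bit rotate. It shows that the selector pair `0x6543, 0x2107` is a rotate right by 24, `0x5432, 0x1076` is a rotate right by 16, and `__swap_hilo` is a rotate right by 32. Together with `rotr64(..., 63)` these are the four rotations of the BLAKE2b mixing function. `U2`, `split`, `join`, `pick`, `prmt`, `shf_r_wrap` and the other names are made up for this sketch, and the `prmt` emulation only covers selector nibbles 0..7.

```cpp
#include <cstdint>

// Host-side stand-in for CUDA's uint2: x is the low word, y the high word.
struct U2 { std::uint32_t x, y; };

constexpr U2 split(std::uint64_t v)
{
    return U2{static_cast<std::uint32_t>(v), static_cast<std::uint32_t>(v >> 32)};
}
constexpr std::uint64_t join(U2 v)
{
    return (static_cast<std::uint64_t>(v.y) << 32) | v.x;
}

// Plain 64-bit rotate right, valid for 0 < n < 64.
constexpr std::uint64_t rotr64_ref(std::uint64_t v, unsigned n)
{
    return (v >> n) | (v << (64u - n));
}

// Pick byte `sel` (0..7) from the pair: a holds bytes 0..3, b holds bytes 4..7.
constexpr std::uint32_t pick(std::uint32_t a, std::uint32_t b, unsigned sel)
{
    return ((sel & 4u) ? (b >> (8u * (sel & 3u))) : (a >> (8u * (sel & 3u)))) & 0xFFu;
}

// Emulates prmt.b32 for selectors whose four nibbles are all in 0..7.
constexpr std::uint32_t prmt(std::uint32_t a, std::uint32_t b, std::uint32_t s)
{
    return pick(a, b, s & 0xFu)
         | (pick(a, b, (s >> 4) & 0xFu) << 8)
         | (pick(a, b, (s >> 8) & 0xFu) << 16)
         | (pick(a, b, (s >> 12) & 0xFu) << 24);
}

constexpr U2 byte_perm_64(U2 v, std::uint32_t g1, std::uint32_t g2)
{
    return U2{prmt(v.x, v.y, g1), prmt(v.x, v.y, g2)};
}
constexpr U2 swap_hilo(U2 v) { return U2{v.y, v.x}; }

// Emulates shf.r.wrap.b32: low 32 bits of (hi:lo) shifted right by (n & 31).
constexpr std::uint32_t shf_r_wrap(std::uint32_t lo, std::uint32_t hi, unsigned n)
{
    return static_cast<std::uint32_t>(((static_cast<std::uint64_t>(hi) << 32) | lo) >> (n & 31u));
}

// Same branch structure as the rotr64 above, on host words.
constexpr U2 rotr64_halves(U2 a, unsigned n)
{
    return n < 32u ? U2{shf_r_wrap(a.x, a.y, n), shf_r_wrap(a.y, a.x, n)}
                   : U2{shf_r_wrap(a.y, a.x, n), shf_r_wrap(a.x, a.y, n)};
}
```

Since everything is `constexpr`, the equivalences can be checked with `static_assert` at compile time, for example `join(swap_hilo(split(k))) == rotr64_ref(k, 32)` for any test constant `k`.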


Code:
__global__ void __launch_bounds__(blocksize, 4) nonceGrind_SP(const uint2 * __restrict__ headerIn, uint2 * __restrict__ hashOut, uint2 * __restrict__ nonceOut, const uint2 * __restrict__ v1, uint32_t target)
{
    uint2 header[10], h[4], v[16];

    uint32_t id = (blockDim.x * blockIdx.x + threadIdx.x)*npt;

    #pragma unroll
    for (int i = 0; i < 10; i++)
        header[i] = headerIn[i];

    for (int n = id; n < id + npt; n++)
    {
        ((uint32_t*)header)[8] = n;
        v[2] = vectorize(0x5BF2CD1EF9D6B596u) + header[4]; v[14] = __swap_hilo(~vectorize(0x1f83d9abfb41bd6bu) ^ v[2]); v[10] = vectorize(0x3c6ef372fe94f82bu) + v[14]; v[6] = __byte_perm_64(vectorize(0x1f83d9abfb41bd6bu) ^ v[10], 0x6543, 0x2107);
        v[2] = v[2] + v[6] + header[5]; v[14] = __byte_perm_64(v[14] ^ v[2], 0x5432, 0x1076); v[10] = v[10] + v[14]; v[6] = rotr64(v[6] ^ v[10], 63);
        v[3] = vectorize(0x130C253729B586Au) + header[6]; v[15] = __swap_hilo(vectorize(0x5be0cd19137e2179u) ^ v[3]); v[11] = vectorize(0xa54ff53a5f1d36f1u) + v[15]; v[7] = __byte_perm_64(vectorize(0x5be0cd19137e2179u) ^ v[11], 0x6543, 0x2107);
        v[3] = v[3] + v[7] + header[7]; v[15] = __byte_perm_64(v[15] ^ v[3], 0x5432, 0x1076); v[11] = v[11] + v[15]; v[7] = rotr64(v[7] ^ v[11], 63);
        v[0] = v1[0] + v1[5] + header[8]; v[15] = __swap_hilo(v[15] ^ v[0]); v[10] = v[10] + v[15]; v[5] = __byte_perm_64(v1[5] ^ v[10], 0x6543, 0x2107);
        v[0] = v[0] + v[5] + header[9]; v[15] = __byte_perm_64(v[15] ^ v[0], 0x5432, 0x1076); v[10] = v[10] + v[15]; v[5] = rotr64(v[5] ^ v[10], 63);

....

etc
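
The indexing at the top of the kernel gives each thread `npt` consecutive nonces, starting at its global thread index times `npt`. A small host-side C++ sketch of that partitioning, with `NonceRange` and `thread_range` as made-up names and the launch parameters passed in as plain integers:

```cpp
#include <cstdint>

// Half-open range [first, end) of nonces one thread would try.
struct NonceRange { std::uint32_t first, end; };

// Mirrors: id = (blockDim.x * blockIdx.x + threadIdx.x) * npt; n in [id, id + npt)
constexpr NonceRange thread_range(std::uint32_t block_dim, std::uint32_t block_idx,
                                  std::uint32_t thread_idx, std::uint32_t npt)
{
    return NonceRange{(block_dim * block_idx + thread_idx) * npt,
                      (block_dim * block_idx + thread_idx) * npt + npt};
}
```

Neighbouring threads get back-to-back ranges, so one launch of `grid * block` threads covers `grid * block * npt` nonces with no gaps or overlap, as long as that product fits in 32 bits.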


You also need to copy some functions from ccminer.

Code:

__device__ __forceinline__ uint64_t devectorize(uint2 x)
{
    uint64_t result;
    asm("mov.b64 %0,{%1,%2}; \n\t"
        : "=l"(result) : "r"(x.x), "r"(x.y));
    return result;
}


__device__ __forceinline__ uint2 vectorize(const uint64_t x)
{
    uint2 result;
    asm("mov.b64 {%0,%1},%2; \n\t"
        : "=r"(result.x), "=r"(result.y) : "l"(x));
    return result;
}

static __device__ __forceinline__ uint2 operator^ (uint2 a, uint32_t b) { return make_uint2(a.x^ b, a.y); }
static __device__ __forceinline__ uint2 operator^ (uint2 a, uint2 b) { return make_uint2(a.x ^ b.x, a.y ^ b.y); }
static __device__ __forceinline__ uint2 operator& (uint2 a, uint2 b) { return make_uint2(a.x & b.x, a.y & b.y); }
static __device__ __forceinline__ uint2 operator| (uint2 a, uint2 b) { return make_uint2(a.x | b.x, a.y | b.y); }
static __device__ __forceinline__ uint2 operator~ (uint2 a) { return make_uint2(~a.x, ~a.y); }
static __device__ __forceinline__ void operator^= (uint2 &a, uint2 b) { a = a ^ b; }
static __device__ __forceinline__ uint2 operator+ (uint2 a, uint2 b)
{
    uint2 result;
    asm(
        "add.cc.u32 %0,%2,%4; \n\t"
        "addc.u32 %1,%3,%5;   \n\t"
        : "=r"(result.x), "=r"(result.y) : "r"(a.x), "r"(a.y), "r"(b.x), "r"(b.y));
    return result;
}

static __device__ __forceinline__ uint2 operator+ (uint2 a, uint32_t b)
{
    uint2 result;
    asm("add.cc.u32 %0,%2,%4; \n\t"
        "addc.u32 %1,%3,%5;   \n\t"
        : "=r"(result.x), "=r"(result.y) : "r"(a.x), "r"(a.y), "r"(b), "r"(0));
    return result;
}


static __device__ __forceinline__ uint2 operator- (uint2 a, uint32_t b)
{
    uint2 result;
    asm("sub.cc.u32 %0,%2,%4; \n\t"
        "subc.u32 %1,%3,%5;   \n\t"
        : "=r"(result.x), "=r"(result.y) : "r"(a.x), "r"(a.y), "r"(b), "r"(0));
    return result;
}


static __device__ __forceinline__ uint2 operator- (uint2 a, uint2 b)
{
    uint2 result;
    asm("sub.cc.u32 %0,%2,%4; \n\t"
        "subc.u32 %1,%3,%5;   \n\t"
        : "=r"(result.x), "=r"(result.y) : "r"(a.x), "r"(a.y), "r"(b.x), "r"(b.y));
    return result;
}
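
What the `add.cc`/`addc` and `sub.cc`/`subc` pairs do is ordinary 64-bit arithmetic carried out on two 32-bit halves: the low words are combined first, and the carry (or borrow) is fed into the high words. A host-side C++ sketch of the same idea, checkable against native `uint64_t` arithmetic. `U2`, `split`, `join`, `add_u2` and `sub_u2` are names invented for this sketch.

```cpp
#include <cstdint>

// Host-side stand-in for CUDA's uint2: x is the low word, y the high word.
struct U2 { std::uint32_t x, y; };

constexpr U2 split(std::uint64_t v)
{
    return U2{static_cast<std::uint32_t>(v), static_cast<std::uint32_t>(v >> 32)};
}
constexpr std::uint64_t join(U2 v)
{
    return (static_cast<std::uint64_t>(v.y) << 32) | v.x;
}

// Low words first; a wrapped low sum means a carry into the high words.
constexpr U2 add_u2(U2 a, U2 b)
{
    return U2{static_cast<std::uint32_t>(a.x + b.x),
              static_cast<std::uint32_t>(a.y + b.y + ((static_cast<std::uint32_t>(a.x + b.x) < a.x) ? 1u : 0u))};
}

// Low words first; a.x < b.x means a borrow from the high words.
constexpr U2 sub_u2(U2 a, U2 b)
{
    return U2{static_cast<std::uint32_t>(a.x - b.x),
              static_cast<std::uint32_t>(a.y - b.y - ((a.x < b.x) ? 1u : 0u))};
}
```

Both functions wrap modulo 2^64 exactly like the native type, so `join(add_u2(split(a), split(b)))` should equal `a + b` for any pair of inputs.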
sp_
legendary
Activity: 2926
Merit: 1087
Team Black developer
July 13, 2016, 02:07:00 AM
#34
Wall timings for the 750ti.

Genoil: 60 watt @ 303 MHASH
Klaus_t: 40 watt @ 356 MHASH
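
Put as efficiency, those two wall readings work out to roughly 5.05 MHASH per watt for Genoil's kernel and 8.9 MHASH per watt for Klaus_t's port. A tiny C++ helper for the division, with `mhash_per_watt` as a made-up name:

```cpp
// Hashrate divided by wall power; both inputs are the figures quoted above.
constexpr double mhash_per_watt(double mhash, double watts)
{
    return mhash / watts;
}

constexpr double genoil_eff = mhash_per_watt(303.0, 60.0);  // about 5.05
constexpr double klaus_eff  = mhash_per_watt(356.0, 40.0);  // about 8.9
```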
sr. member
Activity: 438
Merit: 250
July 13, 2016, 02:06:34 AM
#33
Klaus_t's CUDA port is doing 355 MHASH on the 750 Ti (64-bit CUDA 8.0). Genoil's OpenCL kernel does around 303 MHASH.
But the klaus_t version is getting a lot of rejects on the pool.

https://github.com/KlausT/Sia-CUDA-Miner


What does it do with the funnel shift asm? Or did you already paste that in?

(btw it is NOT my kernel. i only provided windows build files and added pool support.)

my fork also is still single threaded. i wasn't actually planning to build a proper miner around it. just a bit of kernel play.
legendary
Activity: 3248
Merit: 1070
July 13, 2016, 02:04:12 AM
#32
The Klaus version still needs more than one instance; that should be an easy fix for someone who knows how to code...
sp_
legendary
Activity: 2926
Merit: 1087
Team Black developer
July 13, 2016, 01:59:47 AM
#31
Klaus_t's CUDA port is doing 355 MHASH on the 750 Ti (64-bit CUDA 8.0). Genoil's OpenCL kernel does around 303 MHASH.
But the klaus_t version is getting a lot of rejects on the pool.

https://github.com/KlausT/Sia-CUDA-Miner

sr. member
Activity: 406
Merit: 250
July 12, 2016, 11:00:56 PM
#30
will it work with standard sgminer/ccminer (and getwork/stratum stuff) or is it something entirely different ?



It's entirely different. You can find binaries here: https://github.com/NebulousLabs/Sia-GPU-Miner/releases

It's working for me now too, but the rig shows activity on only one of the cards. Speed is about 32 MH/s.


ok thanks for information
legendary
Activity: 1456
Merit: 1006
Mining Pool Hub
July 12, 2016, 08:55:43 PM
#29
Any plan for stratum support?

miningpoolhub would like to support a new stratum protocol and open a Siacoin pool.

I just request one thing for the spec. Pass the NETWORK diff, as well as the share diff, in there.

If a working miner comes out, I can test it and open the pool.
Could you maybe implement that?
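
As a rough illustration of why a miner might want both values: with the share difficulty alone it can only tell whether a hash is worth submitting, while with the network difficulty too it can also tell when it has found a block. The C++ sketch below is purely hypothetical. No Siacoin stratum spec exists in this thread, so `Result`, `classify` and all parameter names are invented. It also assumes difficulties have already been converted to targets (lower target means harder), that the network target is no higher than the share target, and that comparing the leading 64 bits of the hash is enough.

```cpp
#include <cstdint>

enum class Result { Reject, Share, Block };

// hash_prefix: leading 64 bits of the hash, read so that smaller is better.
// share_target / network_target: thresholds the pool would send (assumed fields).
constexpr Result classify(std::uint64_t hash_prefix,
                          std::uint64_t share_target,
                          std::uint64_t network_target)
{
    return hash_prefix <= network_target ? Result::Block
         : hash_prefix <= share_target   ? Result::Share
                                         : Result::Reject;
}
```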