Topic: CCminer(SP-MOD) Modded NVIDIA Maxwell / Pascal kernels. - page 993. (Read 2347664 times)

sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
sorry

Add it after cudaDeviceReset()

in ethash_cu_miner.cpp:

Code:
cudaDeviceReset();
// cudaDeviceSetSharedMemConfig(cudaSharedMemBankSizeEightByte);
cudaSetDeviceFlags(cudaDeviceScheduleBlockingSync);
cudaDeviceSetCacheConfig(cudaFuncCachePreferL1);

For Maxwell, comment out the cudaSharedMemBankSizeEightByte line and add the cudaFuncCachePreferL1 line.
sr. member
Activity: 329
Merit: 250
Nice work SP, I wish to test it, but can't compile unfortunately...
Copy cuda_helper.h from sp's ccminer into the libethash-cu directory and add the following line at the top of keccak.cuh:
Code:
__constant__ uint64_t keccak_round_constants[24];
Unfortunately no improvement on a 6x 750 Ti rig (~52 MH/s)

Use CUDA 6.5 and remember to compile for compute 5.0 (it seems compute 5.0 is already included in CMakeLists.txt). Use the second version I posted, with #pragma unroll and bitselect.

And add cudaSetDeviceFlags(cudaDeviceScheduleBlockingSync); if your CPU is weak, high CPU usage will slow things down a lot.
Never moved away from CUDA 6.5, and yes, I'm sure it's compiled for compute 5.0...
sr. member
Activity: 329
Merit: 250
To reduce the CPU usage you can try adding:

cudaSetDeviceFlags(cudaDeviceScheduleBlockingSync);

like this:

Code:
void run_ethash_search(
    uint32_t blocks,
    uint32_t threads,
    cudaStream_t stream,
    uint32_t* g_output,
    hash32_t const* g_header,
    hash128_t const* g_dag,
    uint64_t start_nonce,
    uint64_t target
)
{
    cudaSetDeviceFlags(cudaDeviceScheduleBlockingSync);
    // The launch configuration inside <<< >>> was lost from the posted code.
#if __CUDA_ARCH__ >= SHUFFLE_MIN_VER
    ethash_search <<>>(g_output, g_header, g_dag, start_nonce, target);
#else
    ethash_search <<>>(g_output, g_header, g_dag, start_nonce, target);
#endif
}

This change causes the following error:
Code:
  ✘  17:49:23|cudaminer1  Error CUDA mining: cannot set while device is active in this process
  ✘  17:49:23|cudaminer2  Error CUDA mining: cannot set while device is active in this process
  ✘  17:49:23|cudaminer3  Error CUDA mining: cannot set while device is active in this process
  ✘  17:49:23|cudaminer0  Error CUDA mining: cannot set while device is active in this process
  ✘  17:49:23|cudaminer4  Error CUDA mining: cannot set while device is active in this process
  ✘  17:49:23|cudaminer5  Error CUDA mining: cannot set while device is active in this process
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
Hey there,
Thanks for having a look at my CUDA kernel. I actually took the keccak and ROTL64 code from your ccminer fork (or was it someone else's fork... not sure) a few months ago, so I'm surprised it can even be faster :).

It must have been another fork. In my latest fork of ccminer (sp-mod 60), Keccak256 (maxcoin, uint2/bitselect) is running at:

400 MH/s on the GTX 970
155 MH/s on the GTX 750 Ti

In the old cudaminer it was something like 70 MH/s on the 750 Ti and 200 MH/s on the 970.

I found a doc with the old keccak hashrates (maxcoin) in cudaminer:

In the old cudaminer a 780 Ti does 218 MH/s (250 W)

https://docs.google.com/spreadsheets/d/1BIaD-12rmsoz3t64k3_hU79morafOURkEUjXKSdJVLo/edit?pli=1#gid=0


These algos cannot be compared directly.

The Ether algo does a double round of modded keccak512 (24 iterations) plus some other stuff.
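For readers wondering what the "uint2" variant refers to: the idea is to hold each 64-bit Keccak lane as two 32-bit halves, so rotations become 32-bit shifts. Below is a minimal host-side C++ sketch of that idea. The names `Lane`, `split`, `join` and `rol2_sketch` are made up for illustration, and the layout (x = low half, y = high half) is an assumption; this is not the ROL2 from cuda_helper.h.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for CUDA's uint2.
// Assumed layout: x = low 32 bits, y = high 32 bits of the lane.
struct Lane { uint32_t x, y; };

static inline Lane split(uint64_t v) {
    return { static_cast<uint32_t>(v), static_cast<uint32_t>(v >> 32) };
}

static inline uint64_t join(Lane l) {
    return (static_cast<uint64_t>(l.y) << 32) | l.x;
}

// Reference 64-bit rotate left, valid for 0 < n < 64.
static inline uint64_t rotl64(uint64_t v, unsigned n) {
    return (v << n) | (v >> (64 - n));
}

// Rotate left by n (0 < n < 64) using only 32-bit operations.
static inline Lane rol2_sketch(Lane a, unsigned n) {
    if (n >= 32) {              // rotating by 32 just swaps the halves
        uint32_t t = a.x; a.x = a.y; a.y = t;
        n -= 32;
    }
    if (n == 0) return a;       // avoid an undefined shift by 32 below
    Lane r;
    r.x = (a.x << n) | (a.y >> (32 - n));
    r.y = (a.y << n) | (a.x >> (32 - n));
    return r;
}

// True if the split rotate matches the 64-bit rotate for every n in 1..63.
static bool rol2_matches_rotl64(uint64_t v) {
    for (unsigned n = 1; n < 64; ++n) {
        if (join(rol2_sketch(split(v), n)) != rotl64(v, n)) return false;
    }
    return true;
}
```

Calling `rol2_matches_rotl64` with a few arbitrary values is a quick way to convince yourself that the two forms agree.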
legendary
Activity: 1764
Merit: 1024
Ethereum has no GUI wallet, little windows support, difficult configurations..
I think it's too early to call it a coin, but it seems to have some money support on the exchanges. (pump)
A waste of time.

(Is Ethereum a volunteer botnet?) Join us and we will give you some cash every day.

lol... a lot of promises and hype, little to show for it. It reminds me of Star Citizen. But I go where the money does, so it's what I've been mining.

Also this coin really sucks without pools. Solo mining is for big miners or you get shafted hard. The one pool that is available has closed source code. If they wanted to be fair they should kick off everyone who has more than 1 GH/s. They can effectively solo mine quite easily. You know, if they actually cared about 51% attacks and being fair.
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
Nice work SP, I wish to test it, but can't compile unfortunately...
Copy cuda_helper.h from sp's ccminer into the libethash-cu directory and add the following line at the top of keccak.cuh:
Code:
__constant__ uint64_t keccak_round_constants[24];
Unfortunately no improvement on a 6x 750 Ti rig (~52 MH/s)

Use CUDA 6.5 and remember to compile for compute 5.0 (it seems compute 5.0 is already included in CMakeLists.txt). Use the second version I posted, with #pragma unroll and bitselect.

And add cudaSetDeviceFlags(cudaDeviceScheduleBlockingSync); if your CPU is weak, high CPU usage will slow things down a lot.
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
To reduce the CPU usage you can try adding:

cudaSetDeviceFlags(cudaDeviceScheduleBlockingSync);

like this:

Code:
void run_ethash_search(
    uint32_t blocks,
    uint32_t threads,
    cudaStream_t stream,
    uint32_t* g_output,
    hash32_t const* g_header,
    hash128_t const* g_dag,
    uint64_t start_nonce,
    uint64_t target
)
{
    cudaSetDeviceFlags(cudaDeviceScheduleBlockingSync);
    // The launch configuration inside <<< >>> was lost from the posted code.
#if __CUDA_ARCH__ >= SHUFFLE_MIN_VER
    ethash_search <<>>(g_output, g_header, g_dag, start_nonce, target);
#else
    ethash_search <<>>(g_output, g_header, g_dag, start_nonce, target);
#endif
}
sr. member
Activity: 329
Merit: 250
Hey there,

Thanks for having a look at my CUDA kernel. I actually took the keccak and ROTL64 code from your ccminer fork (or was it someone else's fork... not sure) a few months ago, so I'm surprised it can even be faster :).

Unfortunately I don't have a working Maxwell development environment to build and test, only Kepler.


i think making ethminer less cpu intensive (full cpu usage on a G3240 processor) would boost performance considerably as ccminer barely uses the cpu...
sr. member
Activity: 292
Merit: 250
Hey there,

Thanks for having a look at my CUDA kernel. I actually took the keccak and ROTL64 code from your ccminer fork (or was it someone else's fork... not sure) a few months ago, so I'm surprised it can even be faster :).

Unfortunately I don't have a working Maxwell development environment to build and test, only Kepler.



I hope the bug causing the low hashrate on Windows can be solved.
member
Activity: 111
Merit: 10
For those living on the edge (at least with Linux)... the current git pull won't compile. You need to remove or comment out the lines below from util.c (lines 1522-1524) and util.cpp (lines 1624-1626):
Code:
memset(hash, 0, sizeof hash);
animehash(&hash[0], &buf[0]);
printpfx("anime", hash);
Sorry sp_, I submitted two pull requests for this fix (it was late). The second pull request is the edit to util.cpp.
sr. member
Activity: 329
Merit: 250
Nice work SP, I wish to test it, but can't compile unfortunately...
Copy cuda_helper.h from sp's ccminer into the libethash-cu directory and add the following line at the top of keccak.cuh:
Code:
__constant__ uint64_t keccak_round_constants[24];
Unfortunately no improvement on a 6x 750 Ti rig (~52 MH/s)
sr. member
Activity: 438
Merit: 250
Hey there,

Thanks for having a look at my CUDA kernel. I actually took the keccak and ROTL64 code from your ccminer fork (or was it someone else's fork... not sure) a few months ago, so I'm surprised it can even be faster :).

Unfortunately I don't have a working Maxwell development environment to build and test, only Kepler.

legendary
Activity: 1400
Merit: 1000
Ethereum has no GUI wallet, little windows support, difficult configurations..
I think it's too early to call it a coin, but it seems to have some money support on the exchanges. (pump)
A waste of time.

(Is Ethereum a volunteer botnet?) Join us and we will give you some cash every day.

I am still new to Ethereum but I have read on separate occasions that somehow IBM and Samsung are tied to this.

Don't take that as 100% truth as I have not found anything personally to confirm it.

sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
Ethereum has no GUI wallet, little windows support, difficult configurations..
I think it's too early to call it a coin, but it seems to have some money support on the exchanges. (pump)
A waste of time.

(Is Ethereum a volunteer botnet?) Join us and we will give you some cash every day.
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
But there is more than an algo at play, I think.  The Ethereum suite is big and exists for more than mining. CCminer will have to link with and communicate with Geth, at least.  Since solo mining is the only option until a second pool opens up, or the first pool upgrades its capacity and re-opens to new addresses, CCminer will have to speak the language of the wallet.
--scryptr
Since there is no stratum support in Ethereum, no pool software seems to work. This means that in order to include Ethereum in ccminer I need to port a lot of code, and it will be a mess.
sounds like the same sort of issue with sia ...
#crysx

And monero...
legendary
Activity: 2940
Merit: 1091
--- ChainWorks Industries ---
But there is more than an algo at play, I think.  The Ethereum suite is big and exists for more than mining. CCminer will have to link with and communicate with Geth, at least.  Since solo mining is the only option until a second pool opens up, or the first pool upgrades its capacity and re-opens to new addresses, CCminer will have to speak the language of the wallet.
--scryptr

Since there is no stratum support in Ethereum, no pool software seems to work. This means that in order to include Ethereum in ccminer I need to port a lot of code, and it will be a mess.

sounds like the same sort of issue with sia ...

#crysx
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
But there is more than an algo at play, I think.  The Ethereum suite is big and exists for more than mining. CCminer will have to link with and communicate with Geth, at least.  Since solo mining is the only option until a second pool opens up, or the first pool upgrades its capacity and re-opens to new addresses, CCminer will have to speak the language of the wallet.
--scryptr

Since there is no stratum support in Ethereum, no pool software seems to work. This means that in order to include Ethereum in ccminer I need to port a lot of code, and it will be a mess.
legendary
Activity: 2940
Merit: 1091
--- ChainWorks Industries ---
is this now in git sp? ...
#crysx

Not on git, only on bitcointalk here in this thread. It would be nice if somebody could test the speed for me, as I can't compile on Windows. It should run a lot faster on Maxwell boards (compute 5.0 or better). Not sure about the other models.

The source code I modded can be found here:

https://github.com/Genoil/cpp-ethereum/tree/cudaminer

im still quite ill at the moment - so i havent been to the office ( been bedridden - except for the occasional foray here ) for about 12 days now ...

i can compile existing git clones - but only in linux ...

setting remote work is almost impossible from this home system - frustrates me ... and so ill be here for the next couple of days still until im well enough to travel to the office again ...

are there any instructions of how to compile and mine eth? ... i can try and give it a go here ...

tanx ...

edit - i missed the additional edit you added while i was typing the response :) ...

unless others can try this - i will have a go tomorrow ...

#crysx
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
is this now in git sp? ...
#crysx

Not on git, only on bitcointalk here in this thread. It would be nice if somebody could test the speed for me, as I can't compile on Windows (too lazy). It should run a lot faster on Maxwell boards (compute 5.0 or better). Not sure about the other models.

The source code I modded can be found here:

https://github.com/Genoil/cpp-ethereum/tree/cudaminer

You need to copy the file cuda_helper.h (it's in the ccminer project) into the source code folder and replace the two files I linked further up in the thread. The modded files live in the libethash-cu/ folder:

keccak.cuh
ethash_cu_miner_kernel.cu
legendary
Activity: 2940
Merit: 1091
--- ChainWorks Industries ---
Here is some more speed (untested):

keccak.cuh

Code:
#define bitselect(a, b, c) ((a) ^ ((c) & ((b) ^ (a))))

__device__ __forceinline__ void keccak_f1600_block(uint2* s, uint32_t out_size)//, uint32_t in_size, uint32_t out_size)
{
    uint2 t[5], u, v;

    #pragma unroll 3
    for (int i = 0; i < 24; i++)
    {
        /* theta: c = a[0,i] ^ a[1,i] ^ .. a[4,i] */
        t[0] = s[0] ^ s[5] ^ s[10] ^ s[15] ^ s[20];
        t[1] = s[1] ^ s[6] ^ s[11] ^ s[16] ^ s[21];
        t[2] = s[2] ^ s[7] ^ s[12] ^ s[17] ^ s[22];
        t[3] = s[3] ^ s[8] ^ s[13] ^ s[18] ^ s[23];
        t[4] = s[4] ^ s[9] ^ s[14] ^ s[19] ^ s[24];

        /* theta: d[i] = c[i+4] ^ rotl(c[i+1],1) */
        /* theta: a[0,i], a[1,i], .. a[4,i] ^= d[i] */
        u = t[4] ^ ROL2(t[1], 1);
        s[0] ^= u; s[5] ^= u; s[10] ^= u; s[15] ^= u; s[20] ^= u;
        u = t[0] ^ ROL2(t[2], 1);
        s[1] ^= u; s[6] ^= u; s[11] ^= u; s[16] ^= u; s[21] ^= u;
        u = t[1] ^ ROL2(t[3], 1);
        s[2] ^= u; s[7] ^= u; s[12] ^= u; s[17] ^= u; s[22] ^= u;
        u = t[2] ^ ROL2(t[4], 1);
        s[3] ^= u; s[8] ^= u; s[13] ^= u; s[18] ^= u; s[23] ^= u;
        u = t[3] ^ ROL2(t[0], 1);
        s[4] ^= u; s[9] ^= u; s[14] ^= u; s[19] ^= u; s[24] ^= u;

        /* rho pi: b[..] = rotl(a[..], ..) */
        u = s[1];

        s[1] = ROL2(s[6], 44);
        s[6] = ROL2(s[9], 20);
        s[9] = ROL2(s[22], 61);
        s[22] = ROL2(s[14], 39);
        s[14] = ROL2(s[20], 18);
        s[20] = ROL2(s[2], 62);
        s[2] = ROL2(s[12], 43);
        s[12] = ROL2(s[13], 25);
        s[13] = ROL2(s[19], 8);
        s[19] = ROL2(s[23], 56);
        s[23] = ROL2(s[15], 41);
        s[15] = ROL2(s[4], 27);
        s[4] = ROL2(s[24], 14);
        s[24] = ROL2(s[21], 2);
        s[21] = ROL2(s[8], 55);
        s[8] = ROL2(s[16], 45);
        s[16] = ROL2(s[5], 36);
        s[5] = ROL2(s[3], 28);
        s[3] = ROL2(s[18], 21);
        s[18] = ROL2(s[17], 15);
        s[17] = ROL2(s[11], 10);
        s[11] = ROL2(s[7], 6);
        s[7] = ROL2(s[10], 3);
        s[10] = ROL2(u, 1);

        // squeeze this in here
        /* chi: a[i,j] ^= ~b[i,j+1] & b[i,j+2] */
        u = s[0]; v = s[1]; s[0] ^= (~v) & s[2];

        /* iota: a[0,0] ^= round constant */
        s[0] ^= vectorize(keccak_round_constants[i]);
        if (i == 23 && out_size == 1) return;

        // continue chi
        s[1] ^= (~s[2]) & s[3]; s[2] ^= (~s[3]) & s[4]; s[3] ^= (~s[4]) & u; s[4] ^= (~u) & v;
        u = s[5]; v = s[6]; s[5] ^= (~v) & s[7]; s[6] ^= (~s[7]) & s[8]; s[7] ^= (~s[8]) & s[9];

        if (i == 23) return;
        s[8] ^= (~s[9]) & u; s[9] ^= (~u) & v;

        // chi for the remaining three rows, written with bitselect
        u = s[10]; v = s[11];
        s[10] = bitselect(s[10] ^ s[12], s[10], s[11]);
        s[11] = bitselect(s[11] ^ s[13], s[11], s[12]);
        s[12] = bitselect(s[12] ^ s[14], s[12], s[13]);
        s[13] = bitselect(s[13] ^ u, s[13], s[14]);
        s[14] = bitselect(s[14] ^ v, s[14], u);

        u = s[15]; v = s[16];
        s[15] = bitselect(s[15] ^ s[17], s[15], s[16]);
        s[16] = bitselect(s[16] ^ s[18], s[16], s[17]);
        s[17] = bitselect(s[17] ^ s[19], s[17], s[18]);
        s[18] = bitselect(s[18] ^ u, s[18], s[19]);
        s[19] = bitselect(s[19] ^ v, s[19], u);

        u = s[20]; v = s[21];
        s[20] = bitselect(s[20] ^ s[22], s[20], s[21]);
        s[21] = bitselect(s[21] ^ s[23], s[21], s[22]);
        s[22] = bitselect(s[22] ^ s[24], s[22], s[23]);
        s[23] = bitselect(s[23] ^ u, s[23], s[24]);
        s[24] = bitselect(s[24] ^ v, s[24], u);
    }
}
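
For anyone checking the kernel above: the bitselect lines in the last three rows should compute the same chi step as the plain `a ^= ~b & c` form used for the first rows. Here is a small host-side C++ sketch that checks this on plain 64-bit lanes instead of uint2. The helper names `chi_plain`, `chi_bitselect` and `chi_forms_agree` are made up for illustration; only the macro is copied from the post.

```cpp
#include <cassert>
#include <cstdint>

// Same macro as in the posted keccak.cuh.
#define bitselect(a, b, c) ((a) ^ ((c) & ((b) ^ (a))))

// Plain chi for one lane: a ^ (~b & c), with b = next lane, c = lane after that.
static inline uint64_t chi_plain(uint64_t a, uint64_t b, uint64_t c) {
    return a ^ (~b & c);
}

// chi written the way the kernel does it, e.g.
// s[10] = bitselect(s[10] ^ s[12], s[10], s[11]).
static inline uint64_t chi_bitselect(uint64_t a, uint64_t b, uint64_t c) {
    return bitselect(a ^ c, a, b);
}

// Both forms are purely bitwise, so checking all 8 combinations of
// all-zero / all-one lanes covers every possible bit pattern.
static bool chi_forms_agree() {
    for (int m = 0; m < 8; ++m) {
        uint64_t a = (m & 1) ? ~0ULL : 0ULL;
        uint64_t b = (m & 2) ? ~0ULL : 0ULL;
        uint64_t c = (m & 4) ? ~0ULL : 0ULL;
        if (chi_plain(a, b, c) != chi_bitselect(a, b, c)) return false;
    }
    return true;
}
```

Whether the bitselect form is actually faster on the GPU depends on what the compiler emits for it, so the sketch only confirms that the results match, not the speed.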

is this now in git sp? ...

#crysx