Krnlx Nvidia xevan miner - 3.3+ mh on 1070, ~6mh on 80ti FREE, OPENSOURCE - page 13.

palgin

sr. member

Activity: 266

Merit: 250

Quote from: sp_ on October 05, 2017, 03:25:41 PM

I don't undestand why you spend so much time on a non profitable shitcoin. So many coders, and no progress. I buildt the shit on windows in 30 minutes and it works perfectly fine... 5 pages already with puppet postings.

So, sell it for 200 bucks/copy and claim bounties, make yourself richer, you know it Grin

It's just fun, debugging always train your skills Wink

anorganix

copper member

Activity: 970

Merit: 287

Per aspera ad astra

Dear sp_,

Stop being an ass for once.
Seriously now, is it worth it?!

ans34

full member

Activity: 161

Merit: 100

Quote from: sp_ on October 05, 2017, 03:25:41 PM

I don't undestand why you spend so much time on a non profitable shitcoin. So many coders, and no progress. I buildt the shit on windows in 30 minutes and it works perfectly fine... 5 pages already with puppet postings.

If it is shitcoin and non profit why don't you share binaries?

sp_

legendary

Activity: 2954

Merit: 1087

Team Black developer

I don't undestand why you spend so much time on a non profitable shitcoin. So many coders, and no progress. I buildt the shit on windows in 30 minutes and it works perfectly fine... 5 pages already with puppet postings.

palgin

sr. member

Activity: 266

Merit: 250

Quote from: ZenFr on October 05, 2017, 02:32:06 PM

Quote from: car1999 on October 04, 2017, 09:22:56 PM

it works fine with cuda8, ubuntu16,suprnova, gtx1080ti get 5.6m @core clock+100, 200watts.

Compilation with no problem under Ubuntu 16.04 and Cuda 8.
But, when I start the ccminer, he ask me for the Cuda 7.5 library...
How do you make to use the Cuda 8 library (Cuda 7.5 is not installed on this new RIG installation) ?

Check /usr/local for cuda folders. Default coming in configure.sh is cuda 7.5

ZenFr

legendary

Activity: 1260

Merit: 1046

Quote from: car1999 on October 04, 2017, 09:22:56 PM

it works fine with cuda8, ubuntu16,suprnova, gtx1080ti get 5.6m @core clock+100, 200watts.

Compilation with no problem under Ubuntu 16.04 and Cuda 8.
But, when I start the ccminer, he ask me for the Cuda 7.5 library...
How do you make to use the Cuda 8 library (Cuda 7.5 is not installed on this new RIG installation) ?

milliza

newbie

Activity: 1

Merit: 0

i try to build it on visual studio 2013 but i get next error

error LNK1181: cannot open input file 'libcurl.x64.lib' ccminer

can someone help me ?

magnacartic

newbie

Activity: 15

Merit: 0

Somehow I got stuck at the api.cpp stage during compilation:

ccminer-util.o: In function `stratum_handle_method':
util.cpp:(.text+0x4af3): undefined reference to `gpu_power(cgpu_info*)'
util.cpp:(.text+0x4b51): undefined reference to `nvml_get_current_clocks(int, unsigned int*, unsigned int*)'
util.cpp:(.text+0x4b59): undefined reference to `gpu_info(cgpu_info*)'
ccminer-api.o: In function `gpustatus(int)':
api.cpp:(.text+0x554): undefined reference to `gpu_busid(cgpu_info*)'
api.cpp:(.text+0x560): undefined reference to `gpu_temp(cgpu_info*)'
api.cpp:(.text+0x56b): undefined reference to `gpu_fanpercent(cgpu_info*)'
api.cpp:(.text+0x57b): undefined reference to `gpu_fanrpm(cgpu_info*)'
api.cpp:(.text+0x587): undefined reference to `gpu_power(cgpu_info*)'
ccminer-api.o: In function `gpuhwinfos(int)':
api.cpp:(.text+0xbf0): undefined reference to `gpu_busid(cgpu_info*)'
api.cpp:(.text+0xbfc): undefined reference to `gpu_temp(cgpu_info*)'
api.cpp:(.text+0xc07): undefined reference to `gpu_fanpercent(cgpu_info*)'
api.cpp:(.text+0xc17): undefined reference to `gpu_fanrpm(cgpu_info*)'
api.cpp:(.text+0xc23): undefined reference to `gpu_pstate(cgpu_info*)'
api.cpp:(.text+0xc2f): undefined reference to `gpu_power(cgpu_info*)'
api.cpp:(.text+0xc42): undefined reference to `gpu_info(cgpu_info*)'
collect2: error: ld returned 1 exit status
make[2]: *** [ccminer] Error 1

Any ideas on how to fix this? Compiling on Ubuntu 14.04 & CUDA 8.0

anorganix

copper member

Activity: 970

Merit: 287

Per aspera ad astra

It has something to do with platform target (x86/x64) for sure.
See below how it behaves for x86.

palgin

sr. member

Activity: 266

Merit: 250

Quote from: sickofscamcoins on October 05, 2017, 01:23:26 PM

Quote from: palgin on October 05, 2017, 01:01:08 PM

Next part of check, now GPU part. All the same, x17.cu is following:

http://paste.ubuntu.com/25681179/

Thank you, you're really helping, will check now.

UPDATE: Hmm, behaviour is the same, but what is more strange, pool accepts shares? That shouldn't be this way, my mind blows off, think everybody should wait for krnlx comments, he's the code author Grin

sickofscamcoins

newbie

Activity: 7

Merit: 0

Quote from: palgin on October 05, 2017, 01:01:08 PM

Next part of check, now GPU part. All the same, x17.cu is following:

http://paste.ubuntu.com/25681179/

Nesp

full member

Activity: 124

Merit: 100

Quote from: anorganix on October 05, 2017, 12:59:06 PM

Quote from: Nesp on October 05, 2017, 12:57:04 PM

Any binaries?

Yes, we have 0 and 1.

Hahah, nice one

palgin

sr. member

Activity: 266

Merit: 250

Next part of check, now GPU part. All the same, x17.cu is following:

Code:

/**
* X17 algorithm (X15 + sha512 + haval256)
*/

extern "C" {
#include "sph/sph_blake.h"
#include "sph/sph_bmw.h"
#include "sph/sph_groestl.h"
#include "sph/sph_skein.h"
#include "sph/sph_jh.h"
#include "sph/sph_keccak.h"

#include "sph/sph_luffa.h"
#include "sph/sph_cubehash.h"
#include "sph/sph_shavite.h"
#include "sph/sph_simd.h"
#include "sph/sph_echo.h"

#include "sph/sph_hamsi.h"
#include "sph/sph_fugue.h"

#include "sph/sph_shabal.h"
#include "sph/sph_whirlpool.h"

#include "sph/sph_sha2.h"
#include "sph/sph_haval.h"
}

#include "miner.h"
#include "cuda_helper.h"
#include "x11/cuda_x11.h"

#define NBN 2

// Memory for the hash functions
static uint32_t *d_hash[MAX_GPUS];
static uint32_t *d_resNonce[MAX_GPUS];
static uint32_t *h_resNonce[MAX_GPUS];

extern void x13_hamsi_fugue512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);

extern void x14_shabal512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);

extern void x15_whirlpool_cpu_init(int thr_id, uint32_t threads, int mode);
extern void x15_whirlpool_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x15_whirlpool_cpu_free(int thr_id);

extern void x17_sha512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);

extern void x17_haval256_cpu_hash_64_final(int thr_id, uint32_t threads, uint32_t *d_hash, uint32_t* resNonce, uint64_t target);
extern void bmw256_cpu_hash_32_full(int thr_id, uint32_t threads, uint32_t *g_hash);
extern void quark_bmw512_cpu_hash_64x(int thr_id, uint32_t threads, uint32_t *d_nonceVector, uint32_t *d_hash);
extern void quark_groestl512(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void groestl512_cpu_init(int thr_id, uint32_t threads);
extern void groestl512_cpu_hash(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_skein512(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void keccak_xevan_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void qubit_luffa512_cpu_hash_80(int thr_id, uint32_t threads, uint32_t startNounce, uint32_t *d_outputHash);
extern void x11_cubehash512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_shavite512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_shavite512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_echo512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_echo512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_simd512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_simd512_cpu_init(int thr_id, uint32_t threads);
extern void xevan_simd512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x13_hamsi512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x13_fugue512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_whirlpool_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_sha512_cpu_hash_64(int thr_id, int threads, uint32_t *d_hash);
extern void xevan_haval512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void quark_blake512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_nonceVector, uint32_t *d_outputHash);
extern void xevan_blake512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_haval512_cpu_hash_64_final(int thr_id, uint32_t threads, uint32_t *d_hash, uint32_t *resNonce, uint64_t target);
extern void xevan_groestl512_cpu_hash(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void keccak_xevan_cpu_hash_64_A(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void quark_blake512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_nonceVector, uint32_t *d_outputHash);
extern void quark_blake512_cpu_hash_128(int thr_id, uint32_t threads, uint32_t *d_outputHash);
extern void quark_groestl512_cpu_hash_128(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_luffa512_cpu_hash_128(int thr_id, uint32_t threads,uint32_t *d_hash);

// X17 CPU Hash (Validation)
extern "C" void x17hash(void *output, const void *input)
{
uint32_t _ALIGN(64) hash[32]; // 128 bytes required
const int dataLen = 128;
//return;
sph_blake512_context ctx_blake;
sph_bmw512_context ctx_bmw;
sph_groestl512_context ctx_groestl;
sph_skein512_context ctx_skein;
sph_jh512_context ctx_jh;
sph_keccak512_context ctx_keccak;
sph_luffa512_context ctx_luffa;
sph_cubehash512_context ctx_cubehash;
sph_shavite512_context ctx_shavite;
sph_simd512_context ctx_simd;
sph_echo512_context ctx_echo;
sph_hamsi512_context ctx_hamsi;
sph_fugue512_context ctx_fugue;
sph_shabal512_context ctx_shabal;
sph_whirlpool_context ctx_whirlpool;
sph_sha512_context ctx_sha512;
sph_haval256_5_context ctx_haval;

//print_hash(input,20);
sph_blake512_init(&ctx_blake);
sph_blake512(&ctx_blake, input, 80);
sph_blake512_close(&ctx_blake, hash);
//print_hash(hash,32);
memset(&hash[16], 0, 64);

sph_bmw512_init(&ctx_bmw);
sph_bmw512(&ctx_bmw, hash, dataLen);
sph_bmw512_close(&ctx_bmw, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
sph_groestl512_init(&ctx_groestl);
sph_groestl512(&ctx_groestl, hash, dataLen);
sph_groestl512_close(&ctx_groestl, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;

sph_skein512_init(&ctx_skein);
sph_skein512(&ctx_skein, hash, dataLen);
sph_skein512_close(&ctx_skein, hash);

//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
sph_jh512_init(&ctx_jh);
sph_jh512(&ctx_jh, hash, dataLen);
sph_jh512_close(&ctx_jh, hash);
//print_hash(hash,32);

sph_keccak512_init(&ctx_keccak);
sph_keccak512(&ctx_keccak, hash, dataLen);
sph_keccak512_close(&ctx_keccak, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
sph_luffa512_init(&ctx_luffa);
sph_luffa512(&ctx_luffa, hash, dataLen);
sph_luffa512_close(&ctx_luffa, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
sph_cubehash512_init(&ctx_cubehash);
sph_cubehash512(&ctx_cubehash, hash, dataLen);
sph_cubehash512_close(&ctx_cubehash, hash);
//print_hash(hash,32);
sph_shavite512_init(&ctx_shavite);
sph_shavite512(&ctx_shavite, hash, dataLen);
sph_shavite512_close(&ctx_shavite, hash);
//print_hash(hash,32);
sph_simd512_init(&ctx_simd);
sph_simd512(&ctx_simd, hash, dataLen);
sph_simd512_close(&ctx_simd, hash);
//print_hash(hash,32);
sph_echo512_init(&ctx_echo);
sph_echo512(&ctx_echo, hash, dataLen);
sph_echo512_close(&ctx_echo, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
sph_hamsi512_init(&ctx_hamsi);
sph_hamsi512(&ctx_hamsi, hash, dataLen);
sph_hamsi512_close(&ctx_hamsi, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
sph_fugue512_init(&ctx_fugue);
sph_fugue512(&ctx_fugue, hash, dataLen);
sph_fugue512_close(&ctx_fugue, hash);
//print_hash(hash,32);
sph_shabal512_init(&ctx_shabal);
sph_shabal512(&ctx_shabal, hash, dataLen);
sph_shabal512_close(&ctx_shabal, hash);
//print_hash(hash,32);
sph_whirlpool_init(&ctx_whirlpool);
sph_whirlpool(&ctx_whirlpool, hash, dataLen);
sph_whirlpool_close(&ctx_whirlpool, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
sph_sha512_init(&ctx_sha512);
sph_sha512(&ctx_sha512,(const void*) hash, dataLen);
sph_sha512_close(&ctx_sha512,(void*) hash);
//print_hash(hash,32);
sph_haval256_5_init(&ctx_haval);
sph_haval256_5(&ctx_haval,(const void*) hash, dataLen);
sph_haval256_5_close(&ctx_haval, hash);
//print_hash(hash,32);

memset(&hash[8], 0, dataLen - 32);

sph_blake512_init(&ctx_blake);
sph_blake512(&ctx_blake, hash, dataLen);
sph_blake512_close(&ctx_blake, hash);

//print_hash(hash,32);

sph_bmw512_init(&ctx_bmw);
sph_bmw512(&ctx_bmw, hash, dataLen);
sph_bmw512_close(&ctx_bmw, hash);

sph_groestl512_init(&ctx_groestl);
sph_groestl512(&ctx_groestl, hash, dataLen);
sph_groestl512_close(&ctx_groestl, hash);

sph_skein512_init(&ctx_skein);
sph_skein512(&ctx_skein, hash, dataLen);
sph_skein512_close(&ctx_skein, hash);

sph_jh512_init(&ctx_jh);
sph_jh512(&ctx_jh, hash, dataLen);
sph_jh512_close(&ctx_jh, hash);

sph_keccak512_init(&ctx_keccak);
sph_keccak512(&ctx_keccak, hash, dataLen);
sph_keccak512_close(&ctx_keccak, hash);

sph_luffa512_init(&ctx_luffa);
sph_luffa512(&ctx_luffa, hash, dataLen);
sph_luffa512_close(&ctx_luffa, hash);

sph_cubehash512_init(&ctx_cubehash);
sph_cubehash512(&ctx_cubehash, hash, dataLen);
sph_cubehash512_close(&ctx_cubehash, hash);

sph_shavite512_init(&ctx_shavite);
sph_shavite512(&ctx_shavite, hash, dataLen);
sph_shavite512_close(&ctx_shavite, hash);

sph_simd512_init(&ctx_simd);
sph_simd512(&ctx_simd, hash, dataLen);
sph_simd512_close(&ctx_simd, hash);

sph_echo512_init(&ctx_echo);
sph_echo512(&ctx_echo, hash, dataLen);
sph_echo512_close(&ctx_echo, hash);

sph_hamsi512_init(&ctx_hamsi);
sph_hamsi512(&ctx_hamsi, hash, dataLen);
sph_hamsi512_close(&ctx_hamsi, hash);

sph_fugue512_init(&ctx_fugue);
sph_fugue512(&ctx_fugue, hash, dataLen);
sph_fugue512_close(&ctx_fugue, hash);

sph_shabal512_init(&ctx_shabal);
sph_shabal512(&ctx_shabal, hash, dataLen);
sph_shabal512_close(&ctx_shabal, hash);

sph_whirlpool_init(&ctx_whirlpool);
sph_whirlpool(&ctx_whirlpool, hash, dataLen);
sph_whirlpool_close(&ctx_whirlpool, hash);

sph_sha512_init(&ctx_sha512);
sph_sha512(&ctx_sha512,(const void*) hash, dataLen);
sph_sha512_close(&ctx_sha512,(void*) hash);

//print_hash(hash,32);
sph_haval256_5_init(&ctx_haval);
sph_haval256_5(&ctx_haval,(const void*) hash, dataLen);
sph_haval256_5_close(&ctx_haval, hash);
//print_hash(hash,8);
memcpy(output, hash, 32);
}

static bool init[MAX_GPUS] = { 0 };

void print_hash(unsigned int *data,int size){
for(int i=0;i   gpulog(LOG_WARNING, 0,"%x ",data[i]);
gpulog(LOG_WARNING, 0,"-------------");
}

extern "C" int scanhash_x17(int thr_id, struct work* work, uint32_t max_nonce, unsigned long *hashes_done){

int dev_id = device_map[thr_id];

uint32_t *pdata = work->data;
uint32_t *ptarget = work->target;
const uint32_t first_nonce = pdata[19];
/*
uint32_t default_throughput = 1<<20;

if (strstr(device_name[dev_id], "GTX 970")) default_throughput+=256*256*6;
if (strstr(device_name[dev_id], "GTX 980")) default_throughput =1<<22;

uint32_t throughput = cuda_default_throughput(thr_id, default_throughput); // 19=256*256*8;
*/
uint32_t default_throughput;
if(device_sm[dev_id]<=500) default_throughput = 1<<20;
else if(device_sm[dev_id]<=520) default_throughput = 1<<21;
else if(device_sm[dev_id]>520) default_throughput = (1<<22) + (1<<21);

if((strstr(device_name[dev_id], "1070")))default_throughput = 1<<20;
if((strstr(device_name[dev_id], "1080")))default_throughput = 1<<20;

uint32_t throughput = cuda_default_throughput(thr_id, default_throughput); // 19=256*256*8;
if (init[thr_id]) throughput = min(throughput, max_nonce - first_nonce);

throughput&=0xFFFFFF70; //multiples of 128 due to simd_echo kernel

if (opt_benchmark)
((uint32_t*)ptarget)[7] = 0xff;

gpulog(LOG_INFO,thr_id,"target %x %x %x",ptarget[5], ptarget[6], ptarget[7]);
   gpulog(LOG_INFO,thr_id,"target %llx",*(uint64_t*)&ptarget[6]);

if (!init[thr_id])
{
cudaSetDevice(device_map[thr_id]);
if (opt_cudaschedule == -1 && gpu_threads == 1) {
cudaDeviceReset();
// reduce cpu usage
cudaSetDeviceFlags(cudaDeviceScheduleBlockingSync);
// cudaDeviceSetCacheConfig(cudaFuncCachePreferShared);
}
gpulog(LOG_INFO,thr_id, "Intensity set to %g, %u cuda threads", throughput2intensity(throughput), throughput);

// x11_simd_echo_512_cpu_init(thr_id, throughput);
x15_whirlpool_cpu_init(thr_id, throughput, 0);
groestl512_cpu_init(thr_id, throughput);
x11_simd512_cpu_init(thr_id, throughput);
//for(;;);
CUDA_SAFE_CALL(cudaMalloc(&d_hash[thr_id], 8 * sizeof(uint64_t) * throughput));
CUDA_SAFE_CALL(cudaMalloc(&d_resNonce[thr_id], NBN * sizeof(uint32_t)));
h_resNonce[thr_id] = (uint32_t*) malloc(NBN * 8 * sizeof(uint32_t));
if(h_resNonce[thr_id] == NULL){
gpulog(LOG_ERR,thr_id,"Host memory allocation failed");
exit(EXIT_FAILURE);
}
init[thr_id] = true;
}

uint32_t _ALIGN(64) endiandata[20];
for (int k=0; k < 20; k++)
be32enc(&endiandata[k], pdata[k]);
// endiandata[k]=0;
// print_hash(endiandata,20);
quark_blake512_cpu_setBlock_80(thr_id, endiandata);
cudaMemset(d_resNonce[thr_id], 0xff, NBN*sizeof(uint32_t));
// x11_simd512_cpu_init(thr_id, throughput);
// for(;;);
do {
// Hash with CUDA

quark_blake512_cpu_hash_80(thr_id, throughput, pdata[19], d_hash[thr_id]);//A

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

quark_groestl512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16*sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id],16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

quark_skein512_cpu_hash_64(thr_id, throughput, NULL, d_hash[thr_id]);

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

quark_jh512_cpu_hash_64(thr_id, throughput, NULL, d_hash[thr_id]);//A //fast

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));
// keccak_xevan_cpu_hash_64_A(thr_id, throughput, d_hash[thr_id]);//A

//cudaMemset(d_hash[thr_id], 0x00, 16*sizeof(uint32_t));
// x11_luffa512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //P
//cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16*sizeof(uint32_t), cudaMemcpyDeviceToHost);
//print_hash(h_resNonce[thr_id],16);
//cudaMemset(d_hash[thr_id], 0x00, 16*sizeof(uint32_t));

x11_luffa512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);//A

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

//cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16*sizeof(uint32_t), cudaMemcpyDeviceToHost);
//print_hash(h_resNonce[thr_id],16);
//for(;;);

x11_cubehash512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //A 256

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

xevan_shavite512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);//P slow r2

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

   x11_simd512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //A slow r3

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

// cudaMemset(d_hash[thr_id], 0x00, 16*sizeof(uint32_t));

// xevan_echo512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //slow r1

// cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16*sizeof(uint32_t), cudaMemcpyDeviceToHost);
// print_hash(h_resNonce[thr_id],16);

  // cudaMemset(d_hash[thr_id], 0x00, 16*sizeof(uint32_t));

x11_echo512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);//A

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

// cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16*sizeof(uint32_t), cudaMemcpyDeviceToHost);
// print_hash(h_resNonce[thr_id],16);

//for(;;);

   x13_hamsi512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

x13_fugue512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast ++

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

x14_shabal512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

xevan_whirlpool_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //opt2

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

xevan_sha512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

xevan_haval512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

// xevan_blake512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);//BAD
quark_blake512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);//BAD

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

//
   quark_bmw512_cpu_hash_64x(thr_id, throughput, NULL, d_hash[thr_id]);

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

// xevan_groestl512_cpu_hash(thr_id, throughput, d_hash[thr_id]);
quark_groestl512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);

// xevan_skein512(thr_id, throughput, d_hash[thr_id]);
   quark_skein512_cpu_hash_64(thr_id, throughput, NULL, d_hash[thr_id]);

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

   quark_jh512_cpu_hash_64(thr_id, throughput, NULL, d_hash[thr_id]);

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

// keccak_xevan_cpu_hash_64_A(thr_id, throughput, d_hash[thr_id]);
// x11_luffa512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
   x11_luffa512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);//A

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

   x11_cubehash512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

   xevan_shavite512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);//move to shared

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

   x11_simd512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

// xevan_echo512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
   x11_echo512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

   x13_hamsi512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

   x13_fugue512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

   x14_shabal512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

   xevan_whirlpool_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

   xevan_sha512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);

cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id], 16);
cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

/*
for(int i = 10000;i< 10016;i++){
   cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][16*i], 16*sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id],8);
}
for(;;);

*/
xevan_haval512_cpu_hash_64_final(thr_id, throughput, d_hash[thr_id],d_resNonce[thr_id],*(uint64_t*)&ptarget[6]);

cudaMemcpy(h_resNonce[thr_id], d_resNonce[thr_id], NBN*sizeof(uint32_t), cudaMemcpyDeviceToHost);

print_hash(h_resNonce[thr_id], 16);

if (h_resNonce[thr_id][0] != UINT32_MAX){
const uint32_t Htarg = ptarget[7];
const uint32_t startNounce = pdata[19];
uint32_t vhash64[8];
be32enc(&endiandata[19], startNounce + h_resNonce[thr_id][0]);
x17hash(vhash64, endiandata);
// *hashes_done = pdata[19] - first_nonce + throughput + 1;
// pdata[19] = startNounce + h_resNonce[thr_id][0];
gpulog(LOG_WARNING, 0,"NONCE FOUND ");
// return 1;
if (vhash64[7] <= Htarg && fulltest(vhash64, ptarget)) {
int res = 1;
*hashes_done = pdata[19] - first_nonce + throughput + 1;
work_set_target_ratio(work, vhash64);
pdata[19] = startNounce + h_resNonce[thr_id][0];
if (h_resNonce[thr_id][1] != UINT32_MAX) {
pdata[21] = startNounce+h_resNonce[thr_id][1];
if(!opt_quiet)
gpulog(LOG_BLUE,dev_id,"Found 2nd nonce: %08x", pdata[21]);
be32enc(&endiandata[19], pdata[21]);
x17hash(vhash64, endiandata);
if (bn_hash_target_ratio(vhash64, ptarget) > work->shareratio[0]){
work_set_target_ratio(work, vhash64);
xchg(pdata[19],pdata[21]);
}
res++;
}
return res;
}
else {
gpulog(LOG_WARNING, thr_id, "result for %08x does not validate on CPU!", h_resNonce[thr_id][0]);
cudaMemset(d_resNonce[thr_id], 0xff, NBN*sizeof(uint32_t));
}
}

pdata[19] += throughput;
} while (!work_restart[thr_id].restart && ((uint64_t)max_nonce > (uint64_t)throughput + pdata[19]));

*hashes_done = pdata[19] - first_nonce + 1;

return 0;
}

// cleanup
extern "C" void free_x17(int thr_id)
{
if (!init[thr_id])
return;

cudaDeviceSynchronize();

free(h_resNonce[thr_id]);
cudaFree(d_resNonce[thr_id]);
cudaFree(d_hash[thr_id]);

x11_simd_echo_512_cpu_free(thr_id);
x15_whirlpool_cpu_free(thr_id);
cudaDeviceSynchronize();
init[thr_id] = false;
}

Thanks for everybody who helps in investigation.

UPDATE: think I've found the bug, it's in xevan_haval512_cpu_hash_64_final function, so h_resNonce[0] and h_resNonce[1] are always random on Win with exactly the same input data... Need confirmation that behaviour is different on *nix, of course. CUDA memcpy bug?

anorganix

copper member

Activity: 970

Merit: 287

Per aspera ad astra

Quote from: Nesp on October 05, 2017, 12:57:04 PM

Any binaries?

Yes, we have 0 and 1.

Nesp

full member

Activity: 124

Merit: 100

Any binaries?

sickofscamcoins

newbie

Activity: 7

Merit: 0

welp i tried for my cookie anyways. Anything else I can do to help lmk

palgin

sr. member

Activity: 266

Merit: 250

UPDATE: Hashing is exactly the same as on Win, so it comes out to be very weird problem. The one thing I suspect is ulong and uint compiler handling difference on Win and on *nix, for example, ulong will be equal to uint64_t on 64-bit build, but different on 32-bit build. Will look into this way, maybe will come up with something.

sickofscamcoins

newbie

Activity: 7

Merit: 0

Quote from: palgin on October 05, 2017, 11:37:47 AM

Quote from: sickofscamcoins on October 05, 2017, 11:36:25 AM

http://paste.ubuntu.com/25680610/

this is results of output, replacing x17.cu and recompile with one posted palgin.

Thank you, will check it!

http://paste.ubuntu.com/25680643/

no escape characters this time

integrale

full member

Activity: 144

Merit: 100

Eager to learn

copied from my Linux build x17.cu hopefully it helps

Code:

/**
* X17 algorithm (X15 + sha512 + haval256)
*/

extern "C" {
#include "sph/sph_blake.h"
#include "sph/sph_bmw.h"
#include "sph/sph_groestl.h"
#include "sph/sph_skein.h"
#include "sph/sph_jh.h"
#include "sph/sph_keccak.h"

#include "sph/sph_luffa.h"
#include "sph/sph_cubehash.h"
#include "sph/sph_shavite.h"
#include "sph/sph_simd.h"
#include "sph/sph_echo.h"

#include "sph/sph_hamsi.h"
#include "sph/sph_fugue.h"

#include "sph/sph_shabal.h"
#include "sph/sph_whirlpool.h"

#include "sph/sph_sha2.h"
#include "sph/sph_haval.h"
}

#include "miner.h"
#include "cuda_helper.h"
#include "x11/cuda_x11.h"

#define NBN 2

// Memory for the hash functions
static uint32_t *d_hash[MAX_GPUS];
static uint32_t *d_resNonce[MAX_GPUS];
static uint32_t *h_resNonce[MAX_GPUS];

extern void x13_hamsi_fugue512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);

extern void x14_shabal512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);

extern void x15_whirlpool_cpu_init(int thr_id, uint32_t threads, int mode);
extern void x15_whirlpool_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x15_whirlpool_cpu_free(int thr_id);

extern void x17_sha512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);

extern void x17_haval256_cpu_hash_64_final(int thr_id, uint32_t threads, uint32_t *d_hash, uint32_t* resNonce, uint64_t target);
extern void bmw256_cpu_hash_32_full(int thr_id, uint32_t threads, uint32_t *g_hash);
extern void quark_bmw512_cpu_hash_64x(int thr_id, uint32_t threads, uint32_t *d_nonceVector, uint32_t *d_hash);
extern void quark_groestl512(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void groestl512_cpu_init(int thr_id, uint32_t threads);
extern void groestl512_cpu_hash(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_skein512(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void keccak_xevan_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void qubit_luffa512_cpu_hash_80(int thr_id, uint32_t threads, uint32_t startNounce, uint32_t *d_outputHash);
extern void x11_cubehash512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_shavite512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_shavite512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_echo512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_echo512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_simd512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_simd512_cpu_init(int thr_id, uint32_t threads);
extern void xevan_simd512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x13_hamsi512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x13_fugue512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_whirlpool_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_sha512_cpu_hash_64(int thr_id, int threads, uint32_t *d_hash);
extern void xevan_haval512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void quark_blake512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_nonceVector, uint32_t *d_outputHash);
extern void xevan_blake512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_haval512_cpu_hash_64_final(int thr_id, uint32_t threads, uint32_t *d_hash, uint32_t *resNonce, uint64_t target);
extern void xevan_groestl512_cpu_hash(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void keccak_xevan_cpu_hash_64_A(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void quark_blake512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_nonceVector, uint32_t *d_outputHash);
extern void quark_blake512_cpu_hash_128(int thr_id, uint32_t threads, uint32_t *d_outputHash);
extern void quark_groestl512_cpu_hash_128(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_luffa512_cpu_hash_128(int thr_id, uint32_t threads,uint32_t *d_hash);

// X17 CPU Hash (Validation)
extern "C" void x17hash(void *output, const void *input)
{
uint32_t _ALIGN(64) hash[32]; // 128 bytes required
const int dataLen = 128;
//return;
sph_blake512_context ctx_blake;
sph_bmw512_context ctx_bmw;
sph_groestl512_context ctx_groestl;
sph_skein512_context ctx_skein;
sph_jh512_context ctx_jh;
sph_keccak512_context ctx_keccak;
sph_luffa512_context ctx_luffa;
sph_cubehash512_context ctx_cubehash;
sph_shavite512_context ctx_shavite;
sph_simd512_context ctx_simd;
sph_echo512_context ctx_echo;
sph_hamsi512_context ctx_hamsi;
sph_fugue512_context ctx_fugue;
sph_shabal512_context ctx_shabal;
sph_whirlpool_context ctx_whirlpool;
sph_sha512_context ctx_sha512;
sph_haval256_5_context ctx_haval;

//print_hash(input,20);
sph_blake512_init(&ctx_blake);
sph_blake512(&ctx_blake, input, 80);
sph_blake512_close(&ctx_blake, hash);
//print_hash(hash,32);
memset(&hash[16], 0, 64);

sph_bmw512_init(&ctx_bmw);
sph_bmw512(&ctx_bmw, hash, dataLen);
sph_bmw512_close(&ctx_bmw, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
sph_groestl512_init(&ctx_groestl);
sph_groestl512(&ctx_groestl, hash, dataLen);
sph_groestl512_close(&ctx_groestl, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;

sph_skein512_init(&ctx_skein);
sph_skein512(&ctx_skein, hash, dataLen);
sph_skein512_close(&ctx_skein, hash);

//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
sph_jh512_init(&ctx_jh);
sph_jh512(&ctx_jh, hash, dataLen);
sph_jh512_close(&ctx_jh, hash);
//print_hash(hash,32);

sph_keccak512_init(&ctx_keccak);
sph_keccak512(&ctx_keccak, hash, dataLen);
sph_keccak512_close(&ctx_keccak, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
sph_luffa512_init(&ctx_luffa);
sph_luffa512(&ctx_luffa, hash, dataLen);
sph_luffa512_close(&ctx_luffa, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
sph_cubehash512_init(&ctx_cubehash);
sph_cubehash512(&ctx_cubehash, hash, dataLen);
sph_cubehash512_close(&ctx_cubehash, hash);
//print_hash(hash,32);
sph_shavite512_init(&ctx_shavite);
sph_shavite512(&ctx_shavite, hash, dataLen);
sph_shavite512_close(&ctx_shavite, hash);
//print_hash(hash,32);
sph_simd512_init(&ctx_simd);
sph_simd512(&ctx_simd, hash, dataLen);
sph_simd512_close(&ctx_simd, hash);
//print_hash(hash,32);
sph_echo512_init(&ctx_echo);
sph_echo512(&ctx_echo, hash, dataLen);
sph_echo512_close(&ctx_echo, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
sph_hamsi512_init(&ctx_hamsi);
sph_hamsi512(&ctx_hamsi, hash, dataLen);
sph_hamsi512_close(&ctx_hamsi, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
sph_fugue512_init(&ctx_fugue);
sph_fugue512(&ctx_fugue, hash, dataLen);
sph_fugue512_close(&ctx_fugue, hash);
//print_hash(hash,32);
sph_shabal512_init(&ctx_shabal);
sph_shabal512(&ctx_shabal, hash, dataLen);
sph_shabal512_close(&ctx_shabal, hash);
//print_hash(hash,32);
sph_whirlpool_init(&ctx_whirlpool);
sph_whirlpool(&ctx_whirlpool, hash, dataLen);
sph_whirlpool_close(&ctx_whirlpool, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
sph_sha512_init(&ctx_sha512);
sph_sha512(&ctx_sha512,(const void*) hash, dataLen);
sph_sha512_close(&ctx_sha512,(void*) hash);
//print_hash(hash,32);
sph_haval256_5_init(&ctx_haval);
sph_haval256_5(&ctx_haval,(const void*) hash, dataLen);
sph_haval256_5_close(&ctx_haval, hash);
//print_hash(hash,32);

memset(&hash[8], 0, dataLen - 32);

sph_blake512_init(&ctx_blake);
sph_blake512(&ctx_blake, hash, dataLen);
sph_blake512_close(&ctx_blake, hash);

//print_hash(hash,32);

sph_bmw512_init(&ctx_bmw);
sph_bmw512(&ctx_bmw, hash, dataLen);
sph_bmw512_close(&ctx_bmw, hash);

sph_groestl512_init(&ctx_groestl);
sph_groestl512(&ctx_groestl, hash, dataLen);
sph_groestl512_close(&ctx_groestl, hash);

sph_skein512_init(&ctx_skein);
sph_skein512(&ctx_skein, hash, dataLen);
sph_skein512_close(&ctx_skein, hash);

sph_jh512_init(&ctx_jh);
sph_jh512(&ctx_jh, hash, dataLen);
sph_jh512_close(&ctx_jh, hash);

sph_keccak512_init(&ctx_keccak);
sph_keccak512(&ctx_keccak, hash, dataLen);
sph_keccak512_close(&ctx_keccak, hash);

sph_luffa512_init(&ctx_luffa);
sph_luffa512(&ctx_luffa, hash, dataLen);
sph_luffa512_close(&ctx_luffa, hash);

sph_cubehash512_init(&ctx_cubehash);
sph_cubehash512(&ctx_cubehash, hash, dataLen);
sph_cubehash512_close(&ctx_cubehash, hash);

sph_shavite512_init(&ctx_shavite);
sph_shavite512(&ctx_shavite, hash, dataLen);
sph_shavite512_close(&ctx_shavite, hash);

sph_simd512_init(&ctx_simd);
sph_simd512(&ctx_simd, hash, dataLen);
sph_simd512_close(&ctx_simd, hash);

sph_echo512_init(&ctx_echo);
sph_echo512(&ctx_echo, hash, dataLen);
sph_echo512_close(&ctx_echo, hash);

sph_hamsi512_init(&ctx_hamsi);
sph_hamsi512(&ctx_hamsi, hash, dataLen);
sph_hamsi512_close(&ctx_hamsi, hash);

sph_fugue512_init(&ctx_fugue);
sph_fugue512(&ctx_fugue, hash, dataLen);
sph_fugue512_close(&ctx_fugue, hash);

sph_shabal512_init(&ctx_shabal);
sph_shabal512(&ctx_shabal, hash, dataLen);
sph_shabal512_close(&ctx_shabal, hash);

sph_whirlpool_init(&ctx_whirlpool);
sph_whirlpool(&ctx_whirlpool, hash, dataLen);
sph_whirlpool_close(&ctx_whirlpool, hash);

sph_sha512_init(&ctx_sha512);
sph_sha512(&ctx_sha512,(const void*) hash, dataLen);
sph_sha512_close(&ctx_sha512,(void*) hash);

//print_hash(hash,32);
sph_haval256_5_init(&ctx_haval);
sph_haval256_5(&ctx_haval,(const void*) hash, dataLen);
sph_haval256_5_close(&ctx_haval, hash);
//print_hash(hash,8);
memcpy(output, hash, 32);
}

static bool init[MAX_GPUS] = { 0 };

void print_hash(unsigned int *data,int size){
for(int i=0;i gpulog(LOG_WARNING, 0,"%x ",data[i]);
gpulog(LOG_WARNING, 0,"-------------");
}

extern "C" int scanhash_x17(int thr_id, struct work* work, uint32_t max_nonce, unsigned long *hashes_done){

int dev_id = device_map[thr_id];

uint32_t *pdata = work->data;
uint32_t *ptarget = work->target;
const uint32_t first_nonce = pdata[19];
/*
uint32_t default_throughput = 1<<20;

if (strstr(device_name[dev_id], "GTX 970")) default_throughput+=256*256*6;
if (strstr(device_name[dev_id], "GTX 980")) default_throughput =1<<22;

uint32_t throughput = cuda_default_throughput(thr_id, default_throughput); // 19=256*256*8;
*/
uint32_t default_throughput;
if(device_sm[dev_id]<=500) default_throughput = 1<<20;
else if(device_sm[dev_id]<=520) default_throughput = 1<<21;
else if(device_sm[dev_id]>520) default_throughput = (1<<22) + (1<<21);

if((strstr(device_name[dev_id], "1070")))default_throughput = 1<<20;
if((strstr(device_name[dev_id], "1080")))default_throughput = 1<<20;

uint32_t throughput = cuda_default_throughput(thr_id, default_throughput); // 19=256*256*8;
if (init[thr_id]) throughput = min(throughput, max_nonce - first_nonce);

throughput&=0xFFFFFF70; //multiples of 128 due to simd_echo kernel

if (opt_benchmark)
((uint32_t*)ptarget)[7] = 0xff;

gpulog(LOG_INFO,thr_id,"target %x %x %x",ptarget[5], ptarget[6], ptarget[7]);
gpulog(LOG_INFO,thr_id,"target %llx",*(uint64_t*)&ptarget[6]);

if (!init[thr_id])
{
cudaSetDevice(device_map[thr_id]);
if (opt_cudaschedule == -1 && gpu_threads == 1) {
cudaDeviceReset();
// reduce cpu usage
cudaSetDeviceFlags(cudaDeviceScheduleBlockingSync);
// cudaDeviceSetCacheConfig(cudaFuncCachePreferShared);
}
gpulog(LOG_INFO,thr_id, "Intensity set to %g, %u cuda threads", throughput2intensity(throughput), throughput);

// x11_simd_echo_512_cpu_init(thr_id, throughput);
x15_whirlpool_cpu_init(thr_id, throughput, 0);
groestl512_cpu_init(thr_id, throughput);
x11_simd512_cpu_init(thr_id, throughput);
//for(;;);
CUDA_SAFE_CALL(cudaMalloc(&d_hash[thr_id], 8 * sizeof(uint64_t) * throughput));
CUDA_SAFE_CALL(cudaMalloc(&d_resNonce[thr_id], NBN * sizeof(uint32_t)));
h_resNonce[thr_id] = (uint32_t*) malloc(NBN * 8 * sizeof(uint32_t));
if(h_resNonce[thr_id] == NULL){
gpulog(LOG_ERR,thr_id,"Host memory allocation failed");
exit(EXIT_FAILURE);
}
init[thr_id] = true;
}

uint32_t _ALIGN(64) endiandata[20];
for (int k=0; k < 20; k++)
be32enc(&endiandata[k], pdata[k]);
// endiandata[k]=0;
// print_hash(endiandata,20);
quark_blake512_cpu_setBlock_80(thr_id, endiandata);
cudaMemset(d_resNonce[thr_id], 0xff, NBN*sizeof(uint32_t));
// x11_simd512_cpu_init(thr_id, throughput);
// for(;;);
do {
// Hash with CUDA

quark_blake512_cpu_hash_80(thr_id, throughput, pdata[19], d_hash[thr_id]);//A
quark_groestl512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);

quark_skein512_cpu_hash_64(thr_id, throughput, NULL, d_hash[thr_id]);
quark_jh512_cpu_hash_64(thr_id, throughput, NULL, d_hash[thr_id]);//A //fast
// keccak_xevan_cpu_hash_64_A(thr_id, throughput, d_hash[thr_id]);//A

//cudaMemset(d_hash[thr_id], 0x00, 16*sizeof(uint32_t));
// x11_luffa512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //P
//cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16*sizeof(uint32_t), cudaMemcpyDeviceToHost);
//print_hash(h_resNonce[thr_id],16);
//cudaMemset(d_hash[thr_id], 0x00, 16*sizeof(uint32_t));

x11_luffa512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);//A
//cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16*sizeof(uint32_t), cudaMemcpyDeviceToHost);
//print_hash(h_resNonce[thr_id],16);
//for(;;);

x11_cubehash512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //A 256
xevan_shavite512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);//P slow r2
x11_simd512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //A slow r3

// cudaMemset(d_hash[thr_id], 0x00, 16*sizeof(uint32_t));

// xevan_echo512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //slow r1

// cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16*sizeof(uint32_t), cudaMemcpyDeviceToHost);
// print_hash(h_resNonce[thr_id],16);

// cudaMemset(d_hash[thr_id], 0x00, 16*sizeof(uint32_t));

x11_echo512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);//A

// cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16*sizeof(uint32_t), cudaMemcpyDeviceToHost);
// print_hash(h_resNonce[thr_id],16);

//for(;;);

x13_hamsi512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast
x13_fugue512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast ++
x14_shabal512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast
xevan_whirlpool_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //opt2
xevan_sha512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast
xevan_haval512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast

// xevan_blake512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);//BAD
quark_blake512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);//BAD

//
quark_bmw512_cpu_hash_64x(thr_id, throughput, NULL, d_hash[thr_id]);
// xevan_groestl512_cpu_hash(thr_id, throughput, d_hash[thr_id]);
quark_groestl512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);

// xevan_skein512(thr_id, throughput, d_hash[thr_id]);
quark_skein512_cpu_hash_64(thr_id, throughput, NULL, d_hash[thr_id]);

quark_jh512_cpu_hash_64(thr_id, throughput, NULL, d_hash[thr_id]);
// keccak_xevan_cpu_hash_64_A(thr_id, throughput, d_hash[thr_id]);
// x11_luffa512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
x11_luffa512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);//A

x11_cubehash512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
xevan_shavite512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);//move to shared
x11_simd512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);

// xevan_echo512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
x11_echo512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);

x13_hamsi512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
x13_fugue512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
x14_shabal512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
xevan_whirlpool_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
xevan_sha512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);

/*
for(int i = 10000;i< 10016;i++){
cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][16*i], 16*sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id],8);
}
for(;;);

*/
xevan_haval512_cpu_hash_64_final(thr_id, throughput, d_hash[thr_id],d_resNonce[thr_id],*(uint64_t*)&ptarget[6]);

cudaMemcpy(h_resNonce[thr_id], d_resNonce[thr_id], NBN*sizeof(uint32_t), cudaMemcpyDeviceToHost);

if (h_resNonce[thr_id][0] != UINT32_MAX){
const uint32_t Htarg = ptarget[7];
const uint32_t startNounce = pdata[19];
uint32_t vhash64[8];
be32enc(&endiandata[19], startNounce + h_resNonce[thr_id][0]);
x17hash(vhash64, endiandata);
// *hashes_done = pdata[19] - first_nonce + throughput + 1;
// pdata[19] = startNounce + h_resNonce[thr_id][0];
gpulog(LOG_WARNING, 0,"NONCE FOUND ");
// return 1;
if (vhash64[7] <= Htarg && fulltest(vhash64, ptarget)) {
int res = 1;
*hashes_done = pdata[19] - first_nonce + throughput + 1;
work_set_target_ratio(work, vhash64);
pdata[19] = startNounce + h_resNonce[thr_id][0];
if (h_resNonce[thr_id][1] != UINT32_MAX) {
pdata[21] = startNounce+h_resNonce[thr_id][1];
if(!opt_quiet)
gpulog(LOG_BLUE,dev_id,"Found 2nd nonce: %08x", pdata[21]);
be32enc(&endiandata[19], pdata[21]);
x17hash(vhash64, endiandata);
if (bn_hash_target_ratio(vhash64, ptarget) > work->shareratio[0]){
work_set_target_ratio(work, vhash64);
xchg(pdata[19],pdata[21]);
}
res++;
}
return res;
}
else {
gpulog(LOG_WARNING, thr_id, "result for %08x does not validate on CPU!", h_resNonce[thr_id][0]);
cudaMemset(d_resNonce[thr_id], 0xff, NBN*sizeof(uint32_t));
}
}

pdata[19] += throughput;
} while (!work_restart[thr_id].restart && ((uint64_t)max_nonce > (uint64_t)throughput + pdata[19]));

*hashes_done = pdata[19] - first_nonce + 1;

return 0;
}

// cleanup
extern "C" void free_x17(int thr_id)
{
if (!init[thr_id])
return;

cudaDeviceSynchronize();

free(h_resNonce[thr_id]);
cudaFree(d_resNonce[thr_id]);
cudaFree(d_hash[thr_id]);

x11_simd_echo_512_cpu_free(thr_id);
x15_whirlpool_cpu_free(thr_id);
cudaDeviceSynchronize();
init[thr_id] = false;
}

palgin

sr. member

Activity: 266

Merit: 250

Quote from: sickofscamcoins on October 05, 2017, 11:36:25 AM

http://paste.ubuntu.com/25680610/

this is results of output, replacing x17.cu and recompile with one posted palgin.

Thank you, will check it!

Topic: Krnlx Nvidia xevan miner - 3.3+ mh on 1070, ~6mh on 80ti FREE, OPENSOURCE - page 13. (Read 30142 times)