
Topic: [ANN] cudaMiner & ccMiner CUDA based mining applications [Windows/Linux/MacOSX] - page 22. (Read 3426944 times)

sp_
legendary
Activity: 2926
Merit: 1087
Team Black developer
I'm pretty sure the output of the last algorithm is used as the input of the next one for X11, precisely so you can't do that.

Yes you can. If each thread is working on a different hash.

example
4 threads 4 hashes

HASH1: x1->x2->x3->
HASH2: x4->x5->x6->
HASH3: x7->x8->x9->
HASH4: x10->x11

Swap the 4 hashes

HASH4: x1->x2->x3->
HASH1: x4->x5->x6->
HASH2: x7->x8->x9->
HASH3: x10->x11

Swap the 4 hashes

HASH3: x1->x2->x3->
HASH4: x4->x5->x6->
HASH1: x7->x8->x9->
HASH2: x10->x11

Swap the 4 hashes

HASH2: x1->x2->x3->
HASH3: x4->x5->x6->
HASH4: x7->x8->x9->
HASH1: x10->x11
Complete
Have you tried this wolf0?
No, because it doesn't make sense for GPU - it WOULD, however, make TONS of sense for FPGA or ASIC.

What if you have 4 GPUs in your rig and each thread is executed on a separate GPU? x11 is then reduced to x2+.

advantages:
-Smaller kernels, better register usage, less memory needed, more cache hits, more parallel threads
-Hybrid mining is possible. (run AES algos on the AMD, and the rest on NVIDIA)

disadvantages:

-The intermediate output must be passed from GPU to GPU through PCI-E to memory and back.
-You need 4 GPUs (but the algorithm can be scaled to support x GPUs)
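
The rotation in the example above reads as a software pipeline in steady state: the 11 chained functions are split into four stage groups, each worker (thread or GPU) always runs the same group, and every in-flight hash moves on to the next worker after each step. Below is a minimal sketch in plain C. It assumes a toy mixing function in place of the real X11 algorithms, and the workers are simulated by a loop rather than real threads or GPUs. All names (`toy_stage`, `run_group`, `run_sequential`, `run_pipelined`, `pipeline_matches_sequential`) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_WORKERS 4    /* one stage group per worker */
#define MAX_BATCH   64   /* upper bound for the self-check below */

/* First stage of each group (0-based): x1-x3, x4-x6, x7-x9, x10-x11. */
static const int group_start[NUM_WORKERS + 1] = {0, 3, 6, 9, 11};

/* Toy stand-in for one chained function; NOT a real X11 algorithm. */
static uint64_t toy_stage(int stage, uint64_t v) {
    v ^= (uint64_t)(stage + 1) * 0x9E3779B97F4A7C15ULL;
    v = (v << 13) | (v >> 51);                 /* rotate left by 13 */
    return v * 0xBF58476D1CE4E5B9ULL + 1;
}

/* Run every stage of one group on one value, in chain order. */
static uint64_t run_group(int group, uint64_t v) {
    for (int s = group_start[group]; s < group_start[group + 1]; s++)
        v = toy_stage(s, v);
    return v;
}

/* Reference: all 11 stages back to back on a single worker. */
static uint64_t run_sequential(uint64_t nonce) {
    uint64_t v = nonce;
    for (int g = 0; g < NUM_WORKERS; g++)
        v = run_group(g, v);
    return v;
}

/* Pipelined: slot[g] is the hash currently held by worker g,
 * id[g] is its index in nonces[] (-1 means the slot is empty). */
static void run_pipelined(const uint64_t *nonces, uint64_t *out, int n) {
    uint64_t slot[NUM_WORKERS] = {0};
    int id[NUM_WORKERS] = {-1, -1, -1, -1};
    assert(n >= 0);
    for (int step = 0; step < n + NUM_WORKERS - 1; step++) {
        /* "Swap the hashes": hand each one to the next worker. */
        for (int g = NUM_WORKERS - 1; g > 0; g--) {
            slot[g] = slot[g - 1];
            id[g] = id[g - 1];
        }
        /* Feed a fresh nonce into the first worker while any remain. */
        if (step < n) { slot[0] = nonces[step]; id[0] = step; }
        else          { id[0] = -1; }
        /* The groups are independent within a step, so on real
         * hardware these four calls could run concurrently. */
        for (int g = 0; g < NUM_WORKERS; g++)
            if (id[g] >= 0)
                slot[g] = run_group(g, slot[g]);
        /* The last worker emits a finished hash. */
        if (id[NUM_WORKERS - 1] >= 0)
            out[id[NUM_WORKERS - 1]] = slot[NUM_WORKERS - 1];
    }
}

/* Returns 1 if the pipelined results equal the sequential ones. */
static int pipeline_matches_sequential(int n) {
    uint64_t nonces[MAX_BATCH], out[MAX_BATCH];
    if (n < 0 || n > MAX_BATCH) return 0;
    for (int i = 0; i < n; i++) nonces[i] = 1000u + (uint64_t)i;
    run_pipelined(nonces, out, n);
    for (int i = 0; i < n; i++)
        if (out[i] != run_sequential(nonces[i])) return 0;
    return 1;
}
```

Calling `pipeline_matches_sequential(8)` checks that passing hashes from worker to worker gives the same result as running the whole chain in one place. The chain order inside each hash is never broken; only different hashes overlap. The sketch ignores the cost of moving data between workers, which is the PCI-E disadvantage listed above.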
hero member
Activity: 524
Merit: 500
but... that requires changes in each kernel's host launch code, and it is not 50% faster, only a few percent
On AMD, for Whirlpool it seems the ALUs sit idle for about half of the time, waiting for data from the constant cache and the local data share (LDS).
legendary
Activity: 1484
Merit: 1082
ccminer/cpuminer developer
That's called CUDA streams :p and it seems to be handled better on Linux here



http://devblogs.nvidia.com/parallelforall/how-overlap-data-transfers-cuda-cc/


but... that requires changes in each kernel's host launch code, and it is not 50% faster, only a few percent
hero member
Activity: 524
Merit: 500
No, because it doesn't make sense for GPU - it WOULD, however, make TONS of sense for FPGA or ASIC.
It does. Well, just change the wording a bit ;)
sp_
legendary
Activity: 2926
Merit: 1087
Team Black developer
I'm pretty sure the output of the last algorithm is used as the input of the next one for X11, precisely so you can't do that.

Yes you can. If each thread is working on a different hash.

example
4 threads 4 hashes

HASH1: x1->x2->x3->
HASH2: x4->x5->x6->
HASH3: x7->x8->x9->
HASH4: x10->x11

Swap the 4 hashes

HASH4: x1->x2->x3->
HASH1: x4->x5->x6->
HASH2: x7->x8->x9->
HASH3: x10->x11

Swap the 4 hashes

HASH3: x1->x2->x3->
HASH4: x4->x5->x6->
HASH1: x7->x8->x9->
HASH2: x10->x11

Swap the 4 hashes

HASH2: x1->x2->x3->
HASH3: x4->x5->x6->
HASH4: x7->x8->x9->
HASH1: x10->x11

Complete

Have you tried this wolf0?
sr. member
Activity: 506
Merit: 252
Why do you want to run cudaminer? All the algos in cudaminer are slow and not profitable.

ccminer is the best miner for NVIDIA


Hi sp_ ,

could you look into scrypt-jane algos and include them into ccminer?

It's the last algo in cudaminer which is still profitable to mine with.

sp_
legendary
Activity: 2926
Merit: 1087
Team Black developer
My modded xmr miner does more than 500 on the 980x. Donate 0.2BTC and I will send it to you. Windows 32bit exe


member
Activity: 75
Merit: 10
Why do you want to run cudaminer? All the algos in cudaminer are slow and not profitable.

ccminer is the best miner for NVIDIA

http://cryptomining-blog.com/4546-updated-windows-binary-of-the-ccminer-1-5-43-git-fork-by-sp-for-maxwell/

bitcointalk thread:

https://bitcointalksearch.org/topic/ccminersp-mod-modded-nvidia-maxwell-pascal-kernels-826901


Download and mine quark or qubit.

QUARK algo @yaamp multipool: (13-april-2015)

AMD GPU miners: 0.72%
CPU miners: 0.09%
ccminer: 99.19% (sp-mod: 95.79%)




I am currently running the ccminer that was mentioned in this thread https://bitcointalksearch.org/topic/m.7429706 (ccminer cryptonight by tsiv), but I am only getting 500 h/s on a GTX 980 and I think it should be more. I have played with -l (launch config: number of threads x number of thread blocks) and can't find a combo that gets me above the 500 mark.


I am trying to mine Monero, not Bitcoin.
legendary
Activity: 1400
Merit: 1050
I am trying to load cudaminer.sln in Visual Studio Community 2013. Will this work?
It fails every time, so I am guessing not... or maybe I missed a key step?

the precompiled dependencies in the OP will only work for VS 2010.

You would have to build all the dependencies with VS 2013 if you want to use this newer IDE version.

Christian


If someone else runs the project through their VS 2010 and then shares it, will it work on my PC?

If so, is there a source somewhere?

I found one link to a downloadable program, but it would not pass my antivirus. I couldn't even open the webpage with Chrome, so I disabled its protection, and then the resulting download was infected as well. It seems most stuff is nowadays. :(
There is a "convert" option in VS 2013 to convert a VS 2010 project to VS 2013; it should work more or less on its own (there might be a couple of errors to correct).
Regarding the dependencies, I don't think there will be any problem (I still use the ones I downloaded last year).
sp_
legendary
Activity: 2926
Merit: 1087
Team Black developer
Why do you want to run cudaminer? All the algos in cudaminer are slow and not profitable.

ccminer is the best miner for NVIDIA

http://cryptomining-blog.com/4546-updated-windows-binary-of-the-ccminer-1-5-43-git-fork-by-sp-for-maxwell/

bitcointalk thread:

https://bitcointalksearch.org/topic/ccminersp-mod-modded-nvidia-maxwell-pascal-kernels-826901


Download and mine quark or qubit.

QUARK algo @yaamp multipool: (13-april-2015)

AMD GPU miners: 0.72%
CPU miners: 0.09%
ccminer: 99.19% (sp-mod: 95.79%)


member
Activity: 75
Merit: 10
I am trying to load cudaminer.sln in Visual Studio Community 2013. Will this work?
It fails every time, so I am guessing not... or maybe I missed a key step?

the precompiled dependencies in the OP will only work for VS 2010.

You would have to build all the dependencies with VS 2013 if you want to use this newer IDE version.

Christian


If someone else runs the project through their VS 2010 and then shares it, will it work on my PC?

If so, is there a source somewhere?

I found one link to a downloadable program, but it would not pass my antivirus. I couldn't even open the webpage with Chrome, so I disabled its protection, and then the resulting download was infected as well. It seems most stuff is nowadays. :(
hero member
Activity: 756
Merit: 502
I am trying to load cudaminer.sln in Visual Studio Community 2013. Will this work?
It fails every time, so I am guessing not... or maybe I missed a key step?

the precompiled dependencies in the OP will only work for VS 2010.

You would have to build all the dependencies with VS 2013 if you want to use this newer IDE version.

Christian
member
Activity: 75
Merit: 10
I am trying to load cudaminer.sln in Visual Studio Community 2013. Will this work?
It fails every time, so I am guessing not... or maybe I missed a key step?

sr. member
Activity: 506
Merit: 252
Using good old cudaminer for nfactor 16 yacoin.

The reason?
It's an Asus 750ti 4GB card.

-L 8 -l t64x1 -b 16384 -i 0

These settings give about 0.36 to 0.38 khash/s.

The interesting fact, though, is that a standard 2GB model reaches the same hash rate with the same settings.

Setting -L any lower is possible and VRAM usage goes up, but the hashing is way slower.

Any insight?

Code:
typedef struct scrypt_aligned_alloc_t {
    uint8_t *mem, *ptr;
} scrypt_aligned_alloc;

#if defined(SCRYPT_TEST_SPEED)
static uint8_t *mem_base = (uint8_t *)0;
static size_t mem_bump = 0;

/* allocations are assumed to be multiples of 64 bytes and total allocations not to exceed ~1.01gb */
static scrypt_aligned_alloc
scrypt_alloc(uint64_t size) {
    scrypt_aligned_alloc aa;
    if (!mem_base) {
        mem_base = (uint8_t *)malloc((1024 * 1024 * 1024) + (1024 * 1024) + (SCRYPT_BLOCK_BYTES - 1));
        if (!mem_base)
            scrypt_fatal_error("scrypt: out of memory");
        mem_base = (uint8_t *)(((size_t)mem_base + (SCRYPT_BLOCK_BYTES - 1)) & ~(SCRYPT_BLOCK_BYTES - 1));
    }
    aa.mem = mem_base + mem_bump;
    aa.ptr = aa.mem;
    mem_bump += (size_t)size;
    return aa;
}

static void
scrypt_free(scrypt_aligned_alloc *aa) {
    mem_bump = 0;
}
#else
static scrypt_aligned_alloc
scrypt_alloc(uint64_t size) {
    static const size_t max_alloc = (size_t)-1;
    scrypt_aligned_alloc aa;
    size += (SCRYPT_BLOCK_BYTES - 1);
    if (size > max_alloc)
        scrypt_fatal_error("scrypt: not enough address space on this CPU to allocate required memory");
    aa.mem = (uint8_t *)malloc((size_t)size);
    aa.ptr = (uint8_t *)(((size_t)aa.mem + (SCRYPT_BLOCK_BYTES - 1)) & ~(SCRYPT_BLOCK_BYTES - 1));
    if (!aa.mem)
        scrypt_fatal_error("scrypt: out of memory");
    return aa;
}

static void
scrypt_free(scrypt_aligned_alloc *aa) {
    free(aa->mem);
}
#endif

Why is it limited to 1 GB?
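
Reading the snippet as posted, the roughly 1 GB pool appears only in the `SCRYPT_TEST_SPEED` branch, where one fixed block is allocated once and handed out by bumping an offset. The `#else` branch calls `malloc` with whatever size is requested plus alignment padding. Both branches round the pointer up to a multiple of `SCRYPT_BLOCK_BYTES` with the same mask trick. Here is a minimal sketch of that rounding, assuming a 64-byte block size (taken from the "multiples of 64 bytes" comment). `BLOCK_BYTES`, `align_up` and `alloc_aligned` are illustrative names, not part of scrypt-jane.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK_BYTES 64u  /* assumed block size; must be a power of two */

/* Round addr up to the next multiple of BLOCK_BYTES.
 * Adding (BLOCK_BYTES - 1) and clearing the low bits only works
 * when BLOCK_BYTES is a power of two. */
static uintptr_t align_up(uintptr_t addr) {
    assert((BLOCK_BYTES & (BLOCK_BYTES - 1)) == 0);
    return (addr + (BLOCK_BYTES - 1)) & ~(uintptr_t)(BLOCK_BYTES - 1);
}

/* Over-allocate by BLOCK_BYTES - 1 so an aligned pointer always fits.
 * *raw receives the pointer that must later be passed to free().
 * Returns NULL on overflow or allocation failure. */
static uint8_t *alloc_aligned(size_t size, void **raw) {
    *raw = NULL;
    if (size > SIZE_MAX - (BLOCK_BYTES - 1))
        return NULL;                      /* padded size would overflow */
    *raw = malloc(size + (BLOCK_BYTES - 1));
    if (*raw == NULL)
        return NULL;                      /* check before aligning */
    return (uint8_t *)align_up((uintptr_t)*raw);
}
```

The caller keeps two pointers, as `scrypt_aligned_alloc` does with `mem` and `ptr`: the raw one for `free()` and the aligned one for actual use.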
sr. member
Activity: 506
Merit: 252
Using good old cudaminer for nfactor 16 yacoin.

The reason?
It's an Asus 750ti 4GB card.

-L 8 -l t64x1 -b 16384 -i 0

These settings give about 0.36 to 0.38 khash/s.

The interesting fact, though, is that a standard 2GB model reaches the same hash rate with the same settings.

Setting -L any lower is possible and VRAM usage goes up, but the hashing is way slower.

Any insight?
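
Some back-of-the-envelope arithmetic may help with the VRAM side of the question. The sketch below assumes conventions that are not stated in this thread: N = 2^(Nfactor + 1), r = 1 for YaCoin, a scratchpad chunk of 128 * r bytes, and -L being a lookup gap that stores only every L-th scratchpad entry. The helper name `scratchpad_bytes` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Approximate scratchpad size in bytes for ONE hash.
 * Assumptions (not confirmed in the thread):
 *   N = 2^(nfactor + 1), chunk size = 128 * r bytes,
 *   lookup_gap keeps only every lookup_gap-th chunk. */
static uint64_t scratchpad_bytes(unsigned nfactor, unsigned r,
                                 unsigned lookup_gap) {
    assert(lookup_gap > 0);
    uint64_t n = 1ULL << (nfactor + 1);
    uint64_t full = n * 128ULL * r;
    /* Round up so a partial final group still gets a stored chunk. */
    return (full + lookup_gap - 1) / lookup_gap;
}
```

Under these assumptions, N-factor 16 needs 16 MiB per hash with a gap of 1 and 2 MiB per hash with a gap of 8, so VRAM use going up as -L goes down is expected. Total usage is this figure multiplied by the number of hashes the launch configuration keeps in flight.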
full member
Activity: 126
Merit: 100
hero member
Activity: 556
Merit: 501
hi,
where can I find the latest version of djm34's ccminer for Windows?
Thanks!

here you can find various builds
http://cryptomining-blog.com/tag/ccminer-fork/page/1/
Thanks, but it's not what I want. ;)
full member
Activity: 126
Merit: 100
hi,
where can I find the latest version of djm34's ccminer for Windows?
Thanks!

here you can find various builds
http://cryptomining-blog.com/tag/ccminer-fork/page/1/
newbie
Activity: 2
Merit: 0
Please cudaminer for collecting coppelak or double keccak
hero member
Activity: 556
Merit: 501
hi,
where can I find the latest version of djm34's ccminer for Windows?
Thanks!