Topic: [Working] Improved CryptoNight CUDA Miner (based on tsiv's work) - page 2. (Read 18388 times)

sp_
legendary
Activity: 2912
Merit: 1087
Team Black developer
I think I can optimize this. Who wants a faster miner? Anybody willing to donate 0.1 BTC?
Note that tsiv merged my changes and worked more on it, I think. But I'd be interested to see what you come up with.

My uint keccak implementation does 172 MH/s on a standard 750 Ti. Yours/tsiv's does 115 MH/s (in the cryptonight fork).
So mine is 50% faster. (The source code is in my mod.)
sp_
legendary
Activity: 2912
Merit: 1087
Team Black developer
I think I can optimize this. Who wants a faster miner? Anybody willing to donate 0.1 BTC?

hero member
Activity: 644
Merit: 500
Can you make a Windows build of your miner, Wolf0?

And does it support CUDA compute capability 2.1?

Tsiv merged the code changes from Wolf0 ;) So his latest miner will have these boosts.
And it's also compiled for older cards here: https://github.com/tsiv/ccminer-cryptonight/releases
full member
Activity: 155
Merit: 100
Can you make a Windows build of your miner, Wolf0?

And does it support CUDA compute capability 2.1?
hero member
Activity: 687
Merit: 500
novag
Where is a working boosted miner for the 750 Ti?
legendary
Activity: 1106
Merit: 1014
Been following all the posts about recent updates from tsiv and Wolf0, but I still can't figure out whether there's any actual performance boost for the 750 Ti? :) Is it better for older cards only?
full member
Activity: 168
Merit: 100
Wolf0,

You still working on this or are you waiting for published code from tsiv before continuing?

He already did ;) https://github.com/tsiv/ccminer-cryptonight/commit/96b2cedd2206311231bbda7e32709584b20e6ade
You guys should work together :p

I "swear" I've refreshed this a few times and didn't see the changes. Guess it's time for glasses. :)

Man, and I just got done testing nvMiner and was about to push it on http://cudamining.cc. :)

I had made some changes to CN based on ideas from Wolf0.  I'll set them aside and put in tsiv's new code and see how it works.
member
Activity: 81
Merit: 1002
It was only the wind.
I updated keccak with a better rotate implementation for newer GPUs, used 8 threads for copying to shared memory instead of just four, and removed the overly complex hash validation from the GPU code in favor of the far simpler one done in cpuminer.

EDIT: Doing some testing on my 860M, which is quite close to a 750 or 750Ti, it seems 16x24 works well, too.
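
For readers wondering what a "rotate implementation" in keccak looks like: below is a minimal C sketch of a 64-bit rotate-left, once as a plain 64-bit operation and once built from two 32-bit halves. The halves form is the shape that tends to map onto 32-bit GPU registers (CUDA exposes funnel-shift intrinsics for this on newer hardware). The names `rotl64` and `rotl64_halves` are made up for illustration, and this is not claimed to be the code in the commit.

```c
#include <stdint.h>

/* Plain 64-bit rotate-left. Masking keeps both shift counts in 0..63,
   so n == 0 does not trigger an undefined 64-bit shift. */
uint64_t rotl64(uint64_t x, unsigned n)
{
    n &= 63;
    return (x << n) | (x >> ((64 - n) & 63));
}

/* Same result, computed on two 32-bit words. */
uint64_t rotl64_halves(uint64_t x, unsigned n)
{
    uint32_t lo = (uint32_t)x;
    uint32_t hi = (uint32_t)(x >> 32);

    n &= 63;
    if (n >= 32) {
        /* Rotating by 32 is just swapping the halves. */
        uint32_t t = lo;
        lo = hi;
        hi = t;
        n -= 32;
    }
    if (n == 0)
        return ((uint64_t)hi << 32) | lo;

    /* 0 < n < 32: each half takes its top bits from the other half. */
    uint32_t new_hi = (hi << n) | (lo >> (32 - n));
    uint32_t new_lo = (lo << n) | (hi >> (32 - n));
    return ((uint64_t)new_hi << 32) | new_lo;
}
```

Both functions should agree for every rotation count from 0 to 63; which one is faster depends on the card and compiler, so it is worth benchmarking rather than assuming.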
hero member
Activity: 687
Merit: 500
novag
Who has compiled this miner?
hero member
Activity: 644
Merit: 500
Wolf0,

You still working on this or are you waiting for published code from tsiv before continuing?

He already did ;) https://github.com/tsiv/ccminer-cryptonight/commit/96b2cedd2206311231bbda7e32709584b20e6ade
You guys should work together :p
member
Activity: 81
Merit: 1002
It was only the wind.
Ok, seems like tsiv's original was better. I'm testing them OC'd now, and I thought yours was better (latest source, but similar results for prev), but that's because I was comparing it to non-OC tsiv-ccminer.
I tried other settings, like "6x48,8x64,8x64", "12x40,16x60,16x60", etc... but they all ended up worse. I can't even launch "16x64" on the 750ti's as it crashes upon not being able to allocate that many threads. Maybe something wrong on my end? Windows 7.  

Image: http://i.imgur.com/t7DEsKq.png (left is tsiv, right is wolf)

From your screenshot, they seem about the same - which is about what I expected from Maxwell. I would always use 8 for the number of threads - for thread blocks, just try different ones.

EDIT: Looking at that screenshot again, it looks like you have a 760. Try comparing just that one, I'm interested to see how it does on non-Maxwell cards.
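
A rough way to see why a config like 16x64 can fail to launch: CryptoNight uses a 2 MiB scratchpad per hash, so if the miner keeps one hash in flight per launched thread (an assumption about the allocation, not something taken from the source), scratchpad memory grows with the product of the two numbers. The helper names below are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* CryptoNight scratchpad size per hash: 2 MiB. */
#define CN_SCRATCHPAD_BYTES (2ULL * 1024 * 1024)

/* Parse a launch string of the form "AxB" (for example "8x64").
   Returns 0 on success, -1 if the string is malformed or a value is 0. */
int parse_launch(const char *s, unsigned *a, unsigned *b)
{
    char tail;
    /* Exactly two numbers must match; a third match means trailing junk. */
    if (sscanf(s, "%ux%u%c", a, b, &tail) != 2)
        return -1;
    if (*a == 0 || *b == 0)
        return -1;
    return 0;
}

/* Scratchpad bytes needed, ASSUMING one scratchpad per launched thread. */
uint64_t scratchpad_bytes(unsigned a, unsigned b)
{
    return (uint64_t)a * b * CN_SCRATCHPAD_BYTES;
}
```

Under that assumption, 16x64 is 1024 threads and about 2 GiB of scratchpad alone, while 8x64 needs about 1 GiB. That would be consistent with the allocation failure if the card has 2 GB.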
full member
Activity: 168
Merit: 100
Wolf0,

You still working on this or are you waiting for published code from tsiv before continuing?
sr. member
Activity: 329
Merit: 250
on Kepler, around 15% better.
on my kepler card (12x48) it's ~10h/s worse...

What hash rate are you getting with that card? I have a GTX 660 SC 2GB that's getting ~220 H/s using 16x40. That's on Tsiv's latest; Wolf0's latest is ~230 H/s.
the hash rate varies a lot over time on both miners but with tsiv's i get ~240h/s peaks while with wolf's ~230h/s...

edit: after latest changes i'm getting additional ~40h/s with tsiv's miner while wolf's stayed about the same...
member
Activity: 81
Merit: 1002
It was only the wind.
Your cpuminer.config is still broken for me. I had to re-use the one from tsiv. Error on alloca.h not being found.
Tsiv undeffed it, maybe that's why? https://github.com/tsiv/ccminer-cryptonight/blob/master/cpuminer-config.h

Okay, I'll copy the one from his and push it - I re-ran autogen, probably why. Also, I just pushed another commit - re-wrote the AES key expansion.
Ooh man, and I've been here trying to make nice screenshots with freshly compiled builds to show off what you've already accomplished, but you're way too fast to keep up with :D

EDIT: Oh lol, just saw this. Is the .config supposed to be HTML? :p https://github.com/wolf9466/ccminer-cryptonight/blob/master/cpuminer-config.h
Sorry for being a PITA ;) I know how to fix it easily, but maybe not everyone does.
This might be easier copypaste, bottom of the page: http://pastebin.com/b7g3DLGy

I just downloaded his and pushed it. But go ahead and show screenshots, even if they're old versions - might want to include the commit ID, though.
newbie
Activity: 22
Merit: 0
on Kepler, around 15% better.
on my kepler card (12x48) it's ~10h/s worse...

What hash rate are you getting with that card? I have a GTX 660 SC 2GB that's getting ~220 H/s using 16x40. That's on Tsiv's latest; Wolf0's latest is ~230 H/s.

Good to hear some good news, finally :D

EDIT: That new middle loop seems to perform about the same, sadly.

Just a quick question Wolf. What version of Cuda Toolkit are you using? I have it compiled with 5.5

6.0. Also, I made the scratchpad pointer in the second loop restricted, it seems to provide a small hashrate bump.

Okay, I will pull it and give it a test on my card.

Update: The new one did give me a boost to an avg. of 235 H/s in the miner. Average at the pool is 250 H/s.

Update 2: I decided to compile a new version of tsiv's. I am getting 255 H/s on the miner and 280 H/s at the pool.
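
On the "restricted" scratchpad pointer mentioned above: C99 `restrict` (CUDA spells it `__restrict__`) is a promise to the compiler that, inside the function, the pointed-to memory is only accessed through that pointer. That lets it keep other loaded values in registers instead of reloading them after every store. The loop below is a hypothetical stand-in, not the miner's actual second loop, and `mix_pad` is an invented name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inner loop: mixes two key words into every scratchpad word.
   Because both pointers are restrict-qualified, the compiler may assume
   that stores to pad[] never change key[0] or key[1], so it can load them
   once and keep them in registers for the whole loop. */
void mix_pad(uint64_t *restrict pad, const uint64_t *restrict key,
             size_t n_words)
{
    for (size_t i = 0; i < n_words; i++) {
        pad[i] = (pad[i] ^ key[0]) + key[1];
    }
}
```

The promise is on the caller: passing overlapping buffers to a function like this is undefined behavior, so the qualifier only belongs on pointers that really are the sole route to their memory. Any gain is compiler- and card-dependent, which fits the "small bump" reported.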
newbie
Activity: 22
Merit: 0
on Kepler, around 15% better.
on my kepler card (12x48) it's ~10h/s worse...

What hash rate are you getting with that card? I have a GTX 660 SC 2GB that's getting ~220 H/s using 16x40. That's on Tsiv's latest; Wolf0's latest is ~230 H/s.

Good to hear some good news, finally :D

EDIT: That new middle loop seems to perform about the same, sadly.

Just a quick question Wolf. What version of Cuda Toolkit are you using? I have it compiled with 5.5
member
Activity: 81
Merit: 1002
It was only the wind.
Your cpuminer.config is still broken for me. I had to re-use the one from tsiv. Error on alloca.h not being found.
Tsiv undeffed it, maybe that's why? https://github.com/tsiv/ccminer-cryptonight/blob/master/cpuminer-config.h

Okay, I'll copy the one from his and push it - I re-ran autogen, probably why. Also, I just pushed another commit - re-wrote the AES key expansion.
newbie
Activity: 22
Merit: 0
on Kepler, around 15% better.
on my kepler card (12x48) it's ~10h/s worse...

What hash rate are you getting with that card? I have a GTX 660 SC 2GB that's getting ~220 H/s using 16x40. That's on Tsiv's latest; Wolf0's latest is ~230 H/s.
sr. member
Activity: 329
Merit: 250
on Kepler, around 15% better.
on my kepler card (12x48) it's ~10h/s worse...
member
Activity: 81
Merit: 1002
It was only the wind.
Pushed my re-written versions of keccak() and keccakf(). I'm told it gives a 15% speed bump or so on Kepler, but fuck if I know.