Not gonna lie, working on this algo is making a huge dent in the whiskey fund. Cheers mate
Here's 0.5 BTC in hope the excessive Whiskey supply will keep you short of our own performance benchmark
Transaction-ID 7fdaf9602034832a8045887c7b592b62d53b74377ddbf3d958129b9ad8d4ed55-000
seriously, great work on your ccminer forks. Keep it up!
Christian
Wow, didn't see that coming. It's not every day you take a man's work, more or less turn it against him and then get paid by him. Always wondered how much I was stepping on your toes with my release, guess that answers that question. Thank you, good sir, thank you very much
Anyone played around with the launch config stuff for the TSIV version? I'm finding that 4x80 is far from optimal on certain systems. 6x60 gave me about a 25% boost on a GTX 770 and GTX 780, while on a GTX 860M, 4x40 basically tripled my performance (from 50 H/s to 170 H/s). It would be great to hear what others are seeing with the -l parameter.
someone needs to come up with an autotune. Just sayin'...
NOTE: separate autotuning would be required for the 3 kernels of the algorithm.
Thought of that on the side, might be doable but some configs do so badly it might be TDR city all over again. I've managed values that give like 30 H/s compared to the inexplicably optimal ones that give around 280 H/s. Should probably take a poke at it anyway, at some point.
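For what it's worth, the search part of an autotune is simple; the hard part is the measuring. Here is a minimal sketch in Python, not anything that exists in ccminer. The measure callback is hypothetical: it is assumed to run one short benchmark for a launch config and return H/s, or None if the config failed (allocation error, or a run aborted before it could trip the TDR watchdog). Since each of the 3 kernels needs its own tuning, the idea would be to call this once per kernel with that kernel's own measure function.

```python
def autotune(measure, configs):
    """Pick the fastest launch config from a list of (a, b) pairs.

    measure(a, b) is a hypothetical callback: it returns the hashrate in
    H/s for a short trial run, or None if the config could not be run.
    Returns (best_config, best_rate), or (None, 0.0) if nothing worked.
    """
    best_config, best_rate = None, 0.0
    for a, b in configs:
        rate = measure(a, b)
        if rate is None:
            # Failed or aborted trial: skip it instead of crashing the search.
            continue
        if rate > best_rate:
            best_config, best_rate = (a, b), rate
    return best_config, best_rate
```

Feeding it a candidate list such as [(4, 80), (6, 60), (4, 40)] returns whichever pair measured fastest. Keeping every trial short is the design point that matters here, because the bad configs are the ones most likely to hang the driver.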
Tried to pay for a few beers and longdrinks in bitcoins tonight. Didn't work because the stupid Windows Phone wallet software confused decimal dot and comma in the German version of Windows Phone. After 2 embarrassing attempts that ended in a failure message, I shelled out 30 Euros in cash.
Oh well... the sad sad state of Windows Phone. Of course switching the entire phone over to US English localization would have worked.
The slight relief of seeing I'm not the only one getting fsck'd by Microsoft software, priceless
anyone actually compare CUDA 5.5 to CUDA 6.0 compiles? see if there really is a speed difference?
I actually compiled ccminer using 6.0 for quite some time, until I finally got fed up with editing the VC project files every time a new version came out. Nothing gained and nothing lost on going 5.5 -> 6.0 as far as I could tell.
Obviously this setting is too much for the 2 GB RAM of the 750 Ti. But the new miner works fine with the old settings: 40 blocks, 8 threads.
Should be fine in theory, but if I'm not mistaken cudaMalloc requires a contiguous chunk of memory or it fails. So if there is even the tiniest allocation somewhere in the middle, the big allocation fails unless there is a contiguous chunk of around 1.5 GB on either side of the smaller allocation. I might be wrong on this but 2 GB should be enough for 8x96. I can do 8x120 on my Linux rig with a 2 GB 750 Ti:
FB Memory Usage
Total : 2047 MiB
Used : 1970 MiB
Free : 77 MiB
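As a rough check on those numbers: the "around 1.5 GB" figure for 8x96 works out if each thread needs about 2 MiB of device memory. That per-thread size is an assumption inferred from the figures in this post, not something read out of the miner's source. A quick sketch of the arithmetic in Python:

```python
def scratchpad_mib(a, b, per_thread_mib=2):
    """Estimated device memory in MiB for an AxB launch config.

    The total thread count is a * b; per_thread_mib=2 is an assumption
    inferred from the "around 1.5 GB for 8x96" figure quoted above.
    """
    return a * b * per_thread_mib

for a, b in [(8, 96), (8, 120)]:
    print(f"{a}x{b}: {scratchpad_mib(a, b)} MiB")
# prints:
# 8x96: 1536 MiB
# 8x120: 1920 MiB
```

Under that assumption, 8x120 needs about 1920 MiB, which lines up with the 1970 MiB shown as used above once a bit of overhead is counted.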