It was a nightmare for me, but what does that mean for you? Christmas may have come early.
I've compiled my optimized code for each of the following architectures: generic, core2, corei7, corei7-avx, k8, k8-sse3, and barcelona (k10).
The Windows x64 binaries are available from here:
https://mega.co.nz/#F!YsZSHYKA!IC8LK_MBGwqC-gWOpO7zoQ
https://www.dropbox.com/sh/wtxvxvkirxax2vj/9P_Rxb9V1y (Dropbox mirror)
Use whichever corresponds to your processor architecture.
If you're unsure, here's a quick guide based on your CPU manufacturer and age:
Intel: [Older] core2 -> corei7 -> corei7-avx [Newer]
AMD: [Older] k8 -> k8-sse3 -> barcelona (k10) [Newer]
Worst case scenario, use generic.
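If you'd rather not guess from the CPU's age, here is a minimal sketch of the same decision in Python. It assumes you already know your CPU's vendor string and feature flags (CPU-Z lists them on Windows, /proc/cpuinfo on Linux). The function name pick_build and the flag thresholds are my own assumptions, based on what GCC's -march options of the same names enable. They are not taken from the miner's code.

```python
# Hypothetical helper: maps a CPU vendor string and a set of feature flags
# to one of the build names listed above. The thresholds are assumptions
# based on the instruction sets GCC enables for each -march value.

def pick_build(vendor, flags):
    """Return the build name to try first for the given CPU."""
    flags = {f.lower() for f in flags}
    # Linux reports SSE3 as "pni"; other tools call it "sse3".
    has_sse3 = "sse3" in flags or "pni" in flags

    if vendor == "GenuineIntel":
        if "avx" in flags:
            return "corei7-avx"
        if "sse4_2" in flags:
            return "corei7"
        if "ssse3" in flags:
            return "core2"
    elif vendor == "AuthenticAMD":
        if "sse4a" in flags:
            return "barcelona"
        if has_sse3:
            return "k8-sse3"
        if "sse2" in flags:
            return "k8"
    # Unknown vendor or no matching feature: fall back to the safe build.
    return "generic"


if __name__ == "__main__":
    # Example: an Intel CPU with SSE4.2 but no AVX.
    print(pick_build("GenuineIntel", {"sse2", "pni", "ssse3", "sse4_1", "sse4_2"}))  # corei7
```

Treat the result as the build to try first. If it crashes or runs slower, step down to the next older build or to generic.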
I will not be releasing any 32-bit builds. "High performance 32-bit" is an oxymoron.
Let me know what sort of improvements you see!
Some builds may be slightly slower than the stock jh00 miner. It all depends on your CPU.
I've squeezed another 5-10% of speed out of the code and hopefully fixed the issue with the AMD builds.
It seems that mingw-gcc thinks AMD processors don't support SSE, so it left the SSE instructions disabled in those builds. I've fixed that.
The code optimizations have been pushed to the GitHub repo.
The new binaries have been uploaded to the same URLs as above.