might have a look at your cuckoo... if the bounty is still there...
Yep, still there at
https://github.com/tromp/cuckoo
"GPU Speed Parity Bounty
$500 for an open source implementation for a consumer GPU that is as fast as a high-end Intel Core i7 running 16 threads. Again with N ranging over {2^28,2^30,2^32}.
Note that there is already a cuda_miner.cu, my attempted port of the edge trimming part of cuckoo_miner.c to CUDA, but while it seems to run ok for medium graph sizes, it crashes my computer at larger sizes, and I haven't had the chance to debug the issue yet (I prefer to let my computer work on solving 8x8 connect-4 :-). Anyway, this could make a good starting point.
These bounties are to expire at the end of 2015. They are admittedly modest in size, but then claiming them might only require one or two insightful tweaks to my existing implementations."
It could be that my existing CUDA code is actually correct, and only crashes at larger graph sizes because my iMac doesn't have enough free GPU memory. So, if you could just run my code on a discrete NVIDIA GPU card and let me know whether it works at the larger sizes, that would be helpful.
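
If you want to test the out-of-memory theory directly, something like the sketch below could be called before the big allocations. It is not part of cuda_miner.cu; the helper names `query_gpu_memory` and `fits_in_free_memory` are made up for illustration, and the only CUDA runtime calls used are `cudaMemGetInfo` and `cudaGetErrorString`.

```cuda
#include <cstdio>
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical helper (not in cuda_miner.cu): asks the CUDA runtime how much
// memory is free and how much exists in total on the current device.
// Returns false and prints the CUDA error if the query fails, for example
// when no CUDA-capable device is present.
bool query_gpu_memory(size_t *free_bytes, size_t *total_bytes) {
  cudaError_t err = cudaMemGetInfo(free_bytes, total_bytes);
  if (err != cudaSuccess) {
    fprintf(stderr, "cudaMemGetInfo failed: %s\n", cudaGetErrorString(err));
    return false;
  }
  return true;
}

// Hypothetical helper: true if a request of `needed` bytes does not exceed
// the reported free memory. This is only a rough check; an allocation can
// still fail because of fragmentation.
bool fits_in_free_memory(size_t needed, size_t free_bytes) {
  return needed <= free_bytes;
}
```

Calling `query_gpu_memory` at startup and printing both numbers next to the size the miner is about to request should show whether the larger graph sizes simply don't fit. Checking the return value of every `cudaMalloc` against `cudaSuccess` would catch the same problem at the point of failure.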