Honestly, I can't understand all the secrecy around the CUDA / OpenCL / whatever GPU-enabled versions of bitcoin. Sure, it's nice to keep the extra edge if you want to build an efficient mining farm, but really, what's up with everyone on this?
So, despite not really having the time to do it, I decided to take my first cup of CUDA (all the hardware I have to test on is a MacBook Pro with an NVIDIA GPU inside), and I'm attaching the initial version here. Yes, the source, for you to do with as you please! Just remember the reason you got it in the first place: someone didn't take it and hide it!
Anyways, it's *really* crude: the makefile has static compile instructions, only for OS X with the CUDA dev SDK in its default location, compiled for x86_64. But I'm sure you can quickly tweak it to compile for your system.
The code that runs on the GPU is a 1:1 copy of the Cryptopp source, only slightly tweaked. There's a lot of room for improvement. I have some 20 hours total of working with CUDA, so I don't have the faintest idea of what optimizations I could achieve, but at least in the memory layout there's a lot to do. I compared the resulting hashes with Cryptopp's and they match, so I'm assuming it can generate blocks. I'm tired and thus haven't tried it on a local network, but I will soon.
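For anyone who wants to repeat that kind of cross-check, here is a minimal sketch of a CPU-side reference. It is not taken from the attached source: Python's hashlib stands in for Cryptopp, all function names are made up for illustration, and it assumes the GPU result has already been copied back to the host as 32 raw bytes. The header layout is the standard 80-byte block header, and the genesis block's publicly known values are used as test data.

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice, as used for block header hashing."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def build_header(version, prev_hash_hex, merkle_root_hex, timestamp, bits, nonce):
    """Serialize an 80-byte block header (hypothetical helper).

    Integers are little-endian; the two hashes are given in the usual
    display order and byte-reversed for serialization.
    """
    return (
        struct.pack("<I", version)
        + bytes.fromhex(prev_hash_hex)[::-1]
        + bytes.fromhex(merkle_root_hex)[::-1]
        + struct.pack("<III", timestamp, bits, nonce)
    )


def block_hash_hex(header: bytes) -> str:
    """Hash of the header in the byte-reversed hex form normally displayed."""
    return double_sha256(header)[::-1].hex()


def matches_reference(candidate_digest: bytes, header: bytes) -> bool:
    """Compare a digest from another implementation (e.g. the GPU kernel)
    against the CPU reference for the same header."""
    return candidate_digest == double_sha256(header)


# Genesis block header fields, used here only as known-good test data.
GENESIS_HEADER = build_header(
    1,
    "00" * 32,
    "4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b",
    1231006505,
    0x1D00FFFF,
    2083236893,
)

if __name__ == "__main__":
    # Prints the genesis block hash, which starts with ten zero hex digits.
    print(block_hash_hex(GENESIS_HEADER))
```

Feeding the same header and nonce to both implementations and calling matches_reference on the GPU's output is enough to catch byte-order or padding mistakes, which are the likely failure modes when porting the hash code.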
I get 1400 khash/s from the CPU with all cores combined, and 1800 khash/s from the GPU alone, so it's pretty nice. It still takes a lot of CPU, for some reason, so run it with cores-1 or you'll lose performance. I *think* the hashes-per-second calculation is correct, but it may not be.
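As a sanity check on numbers like these, a rough sketch of how a hash rate could be measured and how the cores-1 rule could be applied. This is not the calculation used in the attached code. It assumes one "hash" means one nonce attempt (one double SHA-256 of an 80-byte header), and the function names and the dummy all-zero header are made up for illustration.

```python
import hashlib
import os
import time


def khash_per_sec(hash_count: int, elapsed_seconds: float) -> float:
    """Convert a hash count and a wall-clock duration to khash/s."""
    return hash_count / elapsed_seconds / 1000.0


def measure_cpu_khash_per_sec(n_hashes: int = 20000) -> float:
    """Time n_hashes nonce attempts on one CPU thread.

    Uses a dummy all-zero header; only the 4-byte nonce field
    (bytes 76..79) changes between attempts.
    """
    header = bytearray(80)
    start = time.perf_counter()
    for nonce in range(n_hashes):
        header[76:80] = nonce.to_bytes(4, "little")
        hashlib.sha256(hashlib.sha256(header).digest()).digest()
    elapsed = time.perf_counter() - start
    return khash_per_sec(n_hashes, elapsed)


def cpu_worker_count(reserved_for_gpu: int = 1) -> int:
    """Number of CPU mining threads, leaving cores free to feed the GPU."""
    cores = os.cpu_count() or 1
    return max(1, cores - reserved_for_gpu)


if __name__ == "__main__":
    print("single-thread rate: %.1f khash/s" % measure_cpu_khash_per_sec())
    print("CPU workers to start:", cpu_worker_count())
```

If a reported rate comes out about double or half of what a measurement like this suggests, the likely cause is counting each SHA-256 pass separately instead of each nonce attempt.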
The first processor is always the CUDA one, and it will not run if you don't have the GPU / kernel. There's no error handling and ugly hackish code throughout, but it serves as a start. Let's get this production grade to include in the main clients, shall we?
Now, if you want to thank me for doing this, just head to http://taabl.datlatec.com and place a few bets. I'm starting to think the time I spent developing the lottery wasn't worth it, as it has gotten too little interest so far, so it would be super if you all gave it a try.
If you want me to continue to pursue this, then toss a good amount of coins my way or send me some hardware. I have Linux and Windows machines, but no GPU for them, so *wink* *wink*
Most of all, share your code!