If you are still hiring, I got this working some time ago. The implementation works according to your specs (Linux+GPU+NVIDIA CUDA). We can either:
- provide the source code for 10k
- provide a compiled binary for 5k
- generate each of the addresses you need for a fee (not sure how much it would be yet; we'd have to calculate it from the prefix length, the estimated computing time and the actual computing time)
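
To give a rough idea of how the fee for the last option could be worked out, here is a minimal sketch in Python. It is not the pricing we have settled on. It assumes every character of the prefix is uniformly distributed over the address alphabet (a simplification), and the alphabet size of 58, the search rate and the hourly price are all placeholder assumptions, as are the function names.

```python
# Rough fee estimate for a prefix search. All numbers and names here are
# illustrative assumptions, not measured values or agreed prices.

def expected_attempts(prefix_length: int, alphabet_size: int = 58) -> int:
    """Average number of candidates tried before one matches a fixed prefix.

    Assumes each prefix character is uniform over `alphabet_size` symbols,
    so a match has probability 1 / alphabet_size**prefix_length per try.
    """
    return alphabet_size ** prefix_length


def estimated_hours(prefix_length: int, candidates_per_second: float,
                    alphabet_size: int = 58) -> float:
    """Expected search time in hours at a given (assumed) search rate."""
    seconds = expected_attempts(prefix_length, alphabet_size) / candidates_per_second
    return seconds / 3600.0


def estimated_fee(prefix_length: int, candidates_per_second: float,
                  price_per_hour: float, alphabet_size: int = 58) -> float:
    """Expected fee: estimated hours times a hypothetical hourly price."""
    return estimated_hours(prefix_length, candidates_per_second, alphabet_size) * price_per_hour
```

Each extra prefix character multiplies the expected time by the alphabet size, and the real computing time can land well above or below the average. That is why the final fee would also have to take the actual computing time into account.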
Unfortunately, this was done for private use and the code base is pretty rudimentary, so I would need to clean it up a bit to deliver it without lots of hacks. Probably a weekend or two of work.
Unfortunately, I cannot PM you. Also, you'd have to come down on the price.