A GPU miner for Primecoin means parallelizing the two calls to BN_mod_exp(&r, &a, &e, &n, pctx). Parallelizing the sieve would be an improvement, but it is not the central part; other OpenCL/CUDA primality tests leave the sieve on the CPU.
So it is telling when someone opens a thread about parallelizing code and the reports are all about MinGW, side issues, and anecdotes about building on Windows.
Well, to parallelize BN_mod_exp(&r, &a, &e, &n, pctx), this guy would have had to rewrite (at least) the BigNum lib for OpenCL...
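For reference, that call computes r = a^e mod n. Here is a minimal sketch of the same operation on plain 64-bit words, just to show the shape of the loop that would have to run on the GPU. The function names (mulmod_u64, modexp_u64, fermat_probable_prime) are made up for illustration and are not from the Primecoin source or OpenSSL, and it assumes a compiler with the `unsigned __int128` extension (GCC/Clang). The real miner works on numbers hundreds of bits long, which is why a whole BigNum layer is needed instead of this.

```c
#include <assert.h>
#include <stdint.h>

/* (a * b) mod n using a 128-bit intermediate (GCC/Clang extension). */
uint64_t mulmod_u64(uint64_t a, uint64_t b, uint64_t n)
{
    return (uint64_t)(((unsigned __int128)a * b) % n);
}

/* a^e mod n by binary square-and-multiply, scanning e from the low bit up. */
uint64_t modexp_u64(uint64_t a, uint64_t e, uint64_t n)
{
    assert(n != 0);
    uint64_t r = 1 % n;   /* handles n == 1 */
    a %= n;
    while (e > 0) {
        if (e & 1)
            r = mulmod_u64(r, a, n);   /* multiply step */
        a = mulmod_u64(a, a, n);       /* square step */
        e >>= 1;
    }
    return r;
}

/* Base-2 Fermat test: returns 1 when 2^(n-1) mod n == 1.
 * Passing only means "probable prime"; some composites pass too. */
int fermat_probable_prime(uint64_t n)
{
    if (n < 3)
        return n == 2;
    return modexp_u64(2, n - 1, n) == 1;
}
```

Every multiply here is followed by a reduction mod n, so almost all of the time goes into mulmod, and that is the part a GPU BigNum library would have to make fast.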
Good luck with the project, and to the people who donated!
Yup, if someone manages to use some level of parallelism for mod_exp on a GPU, then that will be a big thing. I'm personally still a bit doubtful whether it can be done efficiently because doing the mod operation requires division. Usually those algorithms involve unpredictable branching and that's going to slow things down on the GPU.
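To make the branching concern concrete, here is a rough sketch of a reduction done by binary shift-and-subtract, on 64-bit words only. The name mod_shift_subtract is made up for illustration, and this is not how OpenSSL does its reduction; it is just the simplest division-style loop that shows the problem.

```c
#include <assert.h>
#include <stdint.h>

/* a mod n by binary shift-and-subtract (long division without the quotient). */
uint64_t mod_shift_subtract(uint64_t a, uint64_t n)
{
    assert(n != 0);
    int shift = 0;

    /* Line n up under the top bits of a. Comparing against a >> 1
     * guarantees that n << (shift + 1) still fits in 64 bits. */
    while ((n << shift) <= (a >> 1))
        shift++;

    for (; shift >= 0; shift--) {
        uint64_t m = n << shift;
        if (a >= m)     /* data-dependent branch: taken or not per operand */
            a -= m;
    }
    return a;
}
```

Both the number of loop iterations and the `if` depend on the operand values, so GPU threads that run in lockstep on different candidates would take different paths here.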