Too early to tell. The current GPU solver writes + reads about 27GB of data within 1 second on a 1080Ti, which has a bandwidth of 484GB/s. So there are still other bottlenecks to overcome...
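To put those two numbers side by side, here is a rough back-of-the-envelope sketch. It only uses the figures quoted above (27GB moved in about 1 second, 484GB/s peak); the function and variable names are made up for illustration, and it assumes the 27GB is spread evenly over the whole second.

```python
# Figures taken from the post above; names are illustrative only.
DATA_MOVED_GB = 27.0          # total reads + writes per solver run
RUN_TIME_S = 1.0              # approximate wall time of one run
PEAK_BANDWIDTH_GB_S = 484.0   # quoted 1080Ti memory bandwidth


def bandwidth_utilization(data_gb, seconds, peak_gb_per_s):
    """Fraction of peak memory bandwidth actually used, assuming even traffic."""
    achieved_gb_per_s = data_gb / seconds
    return achieved_gb_per_s / peak_gb_per_s


def ideal_transfer_time(data_gb, peak_gb_per_s):
    """Seconds needed to move the data if bandwidth were the only limit."""
    return data_gb / peak_gb_per_s


utilization = bandwidth_utilization(DATA_MOVED_GB, RUN_TIME_S, PEAK_BANDWIDTH_GB_S)
ideal_s = ideal_transfer_time(DATA_MOVED_GB, PEAK_BANDWIDTH_GB_S)

print(f"utilization: {utilization:.1%} of peak")
print(f"bandwidth-bound time: {ideal_s * 1000:.0f} ms")
```

So under these assumptions the solver uses only a small fraction of the card's peak bandwidth, which is why memory bandwidth alone cannot yet be called the limiting factor.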
Ahh, I haven't read enough to know whether it is CUDA dependent. I'll try to catch up in the next week; what's the best link for that? I think this was the last thing I read on it.
http://cryptorials.io/beyond-hashcash-proof-work-theres-mining-hashing/
EDIT: I just remembered this is hardware agnostic, isn't it? It should be, IIRC. Were you just using the Nvidia as an example?
Pretty agnostic indeed.
My benching was done on Intel Core i5/i7 and that Nvidia, and I happen to have the memory bandwidth data available for the latter.
I'm rewriting my Cuckoo Cycle paper to have more details on the mean miner. For now, it's best to study its source code.
I wish I still could; unfortunately, those days are past for me.
I'll just keep an eye on grin and assume (yeah I know) that we will see the latest mirroring there as well.
BTW this thread's OP needs some +M.