If you port the scrypt CPU code directly to a GPU, the CPU version is faster because it works out of the L2 cache. That is why Artforz thought scrypt was faster on CPU than on GPU. With an optimized algorithm that reads sequentially and reconstructs the lookup table on the fly, that is not true. Artforz was pretty embarrassed by this lapse in technical prowess and left the scene, which has been rife with conspiracy theories ever since.
There is no easy way to make a CPU-only coin either; see my "memcoin" thread, where I learned why it was a bad idea with scrypt.
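For anyone wondering what "on the fly lookup table reconstruction" means, here is a rough Python sketch of the idea. It is not real scrypt and not GPU code. It assumes SHA-256 as a stand-in for scrypt's BlockMix, a simplified integerify that reads the first 8 bytes, and made-up names (`romix_full`, `romix_gap`, `gap`). The point is only that you can store every `gap`-th table entry and recompute the rest when needed, trading memory for extra hashing while getting the same result.

```python
import hashlib

def H(block: bytes) -> bytes:
    # Stand-in mixing function (assumption): real scrypt uses Salsa20/8 BlockMix.
    return hashlib.sha256(block).digest()

def integerify(block: bytes, n: int) -> int:
    # Simplified (assumption): derive a table index from the first 8 bytes.
    return int.from_bytes(block[:8], "little") % n

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def romix_full(seed: bytes, n: int) -> bytes:
    # Straightforward version: keep all n table entries in memory.
    x = H(seed)
    v = []
    for _ in range(n):
        v.append(x)
        x = H(x)
    for _ in range(n):
        j = integerify(x, n)
        x = H(xor(x, v[j]))
    return x

def romix_gap(seed: bytes, n: int, gap: int) -> bytes:
    # Reduced-memory version: keep only every gap-th entry.
    x = H(seed)
    v = []
    for i in range(n):
        if i % gap == 0:
            v.append(x)
        x = H(x)
    for _ in range(n):
        j = integerify(x, n)
        # Rebuild entry j from the nearest stored entry at or before it.
        y = v[j // gap]
        for _ in range(j % gap):
            y = H(y)
        x = H(xor(x, y))
    return x
```

With `gap=4` the table is a quarter of the size, and each lookup costs at most 3 extra calls to `H`. Whether that trade pays off depends on how cheap compute is relative to memory on the hardware in question.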
Thanks. I found some of your old posts about it. You had some Intel data in one post that showed a radix sort or a tree search algorithm ran better on CPUs. Did ATI/NVIDIA wind up beating those too?
https://bitcointalksearch.org/topic/m.786472