The idea behind Scrypt is that it requires lots of low-latency, high-bandwidth memory. That makes it completely uneconomical on an FPGA. An ASIC using external DRAM wouldn't be much faster than a CPU with external DRAM, and an ASIC with on-die RAM would be very expensive in terms of silicon and therefore monetary cost.
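To make the "needs lots of memory" point concrete, here is a toy sketch of the fill-then-random-read pattern that scrypt's core loop is built around. It is not real scrypt: SHA-256 stands in for the Salsa20/8-based mixing function, and the name toy_romix and its parameters are made up for illustration.

```python
import hashlib


def toy_romix(seed: bytes, n: int) -> bytes:
    """Toy memory-hard mix. NOT real scrypt: SHA-256 replaces the real mixing step."""
    x = hashlib.sha256(seed).digest()

    # Phase 1: fill a table with n sequentially derived blocks.
    table = []
    for _ in range(n):
        table.append(x)
        x = hashlib.sha256(x).digest()

    # Phase 2: n reads at indices that depend on the current state,
    # so the access order cannot be predicted ahead of time.
    for _ in range(n):
        j = int.from_bytes(x[:8], "little") % n
        mixed = bytes(a ^ b for a, b in zip(x, table[j]))
        x = hashlib.sha256(mixed).digest()

    return x


if __name__ == "__main__":
    print(toy_romix(b"example", 1024).hex())
```

Because the phase 2 indices depend on the data, hardware either keeps the whole table in fast memory or recomputes entries on demand. That trade-off is what makes on-die RAM the expensive part of a hypothetical ASIC.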
In other words, Scrypt is already pretty close to a CPU-optimized hash. GPUs can accelerate it somewhat, but not tremendously, because memory is still the bottleneck. ASIC versions wouldn't come out enough cheaper in $/Hps or J/H to justify the development cost; at almost any scale it's easier to use cheap commodity hardware. I'm not saying it's impossible; you just won't see the massive speedups that are possible with SHA.
Scrypt would probably benefit from an increased memory requirement. Adjusting the memory requirement based on difficulty would probably keep it an adequately CPU-friendly hash (and, to a lesser degree, a GPU-friendly one, since GPUs are just specialized CPUs after all) for the foreseeable future.
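As a rough sketch of what "adjust the memory requirement based on difficulty" could look like: scrypt's main buffer takes about 128 * N * r bytes, so raising N raises the memory needed. The schedule below (double N each time difficulty doubles, up to a cap) is purely hypothetical, as are the function names and default values; it is not something any existing coin is known to do.

```python
def scrypt_memory_bytes(n: int, r: int) -> int:
    """Approximate size of scrypt's main buffer: N blocks of 128 * r bytes."""
    return 128 * n * r


def n_for_difficulty(difficulty: float,
                     base_n: int = 1024,
                     base_difficulty: float = 1.0,
                     max_n: int = 2 ** 20) -> int:
    """Hypothetical schedule: double N for every doubling of difficulty."""
    n = base_n
    d = base_difficulty
    while d * 2 <= difficulty and n < max_n:
        n *= 2
        d *= 2
    return n


if __name__ == "__main__":
    for difficulty in (1, 2, 16, 1024):
        n = n_for_difficulty(difficulty)
        kib = scrypt_memory_bytes(n, r=1) // 1024
        print(f"difficulty {difficulty}: N = {n}, about {kib} KiB per hash")
```

Keeping N a power of two matches what scrypt itself requires of N. The cap is there only so the sketch cannot ask for unbounded memory.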
Increasing the memory requirement of scrypt will just push it out to either the DDR3 on the motherboard or the GDDR on the graphics card. Obviously the GDDR is faster. The reason the current implementation runs quickly on CPUs is that it fits well into the L2 cache, which has about 30% of the bandwidth of a GPU.
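A back-of-the-envelope way to see the bandwidth argument: if each hash writes its buffer once and reads it once, throughput is capped at bandwidth divided by bytes moved per hash. The sketch below assumes N = 1024 and r = 1 purely for illustration (the thread does not state the parameters), and the GPU bandwidth figure is a placeholder, not a measurement. It also ignores latency and parallelism, which matter a lot in practice.

```python
def max_hashes_per_second(bandwidth_bytes_per_s: float, n: int, r: int) -> float:
    """Upper bound on hash rate if memory bandwidth were the only limit."""
    # One pass to fill the buffer plus one pass of reads over it.
    bytes_per_hash = 2 * 128 * n * r
    return bandwidth_bytes_per_s / bytes_per_hash


if __name__ == "__main__":
    N, R = 1024, 1                  # illustrative parameters (assumed)
    gpu_bandwidth = 100e9           # placeholder value in bytes/s
    l2_bandwidth = 0.30 * gpu_bandwidth  # the "about 30%" figure from the post

    gpu_cap = max_hashes_per_second(gpu_bandwidth, N, R)
    l2_cap = max_hashes_per_second(l2_bandwidth, N, R)
    print(f"GPU memory bound: {gpu_cap:,.0f} H/s")
    print(f"CPU L2 bound:     {l2_cap:,.0f} H/s")
```

Under this model the ratio between the two bounds is just the ratio of the bandwidths. So once the buffer no longer fits in cache and the CPU falls back to DDR3, the gap to GDDR can only widen.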
Writing a hashing algorithm that runs faster on a CPU than on a GPU is a much more difficult task than most people realize. Intel is constantly trying to battle the GPU companies and show that a CPU is at least as fast as a GPU at many parallelizable tasks, but it has had great difficulty actually demonstrating this for any algorithm, even well-studied ones like sorting or pathfinding.