Scrypt ASICs are on the horizon, but we'll see how long they actually take to arrive, and how much of a price premium they carry relative to the efficiency they deliver.
At the moment, ATI stream processors do quite well at mining. A full chart of the cards can be found at
https://litecoin.info/Mining_hardware_comparison
As for selecting a card to run, the 280X, 7970, and 7950 give a nice high hash density and are fairly well priced at times. They need larger power supplies in multi-GPU rigs, but in exchange you need fewer processors and boards per MH/s.
this:
Firstly, AMD designs GPUs with many simple ALUs/shaders (VLIW design) that run at a relatively low frequency clock (typically 1120-3200 ALUs at 625-900 MHz), whereas Nvidia's microarchitecture consists of fewer more complex ALUs and tries to compensate with a higher shader clock (typically 448-1024 ALUs at 1150-1544 MHz). Because of this VLIW vs. non-VLIW difference, Nvidia uses up more square millimeters of die space per ALU, hence can pack fewer of them per chip, and they hit the frequency wall sooner than AMD which prevents them from increasing the clock high enough to match or surpass AMD's performance. This translates to a raw ALU performance advantage for AMD:
AMD Radeon HD 6990: 3072 ALUs x 830 MHz = 2550 billion 32-bit instructions per second
Nvidia GTX 590: 1024 ALUs x 1214 MHz = 1243 billion 32-bit instructions per second
This approximate 2x-3x performance difference exists across the entire range of AMD and Nvidia GPUs. It is very visible in all ALU-bound GPGPU workloads such as Bitcoin, password bruteforcers, etc.
Secondly, another difference favoring Bitcoin mining on AMD GPUs instead of Nvidia's is that the mining algorithm is based on SHA-256, which makes heavy use of the 32-bit integer right rotate operation. This operation can be implemented as a single hardware instruction on AMD GPUs (BIT_ALIGN_INT), but requires three separate hardware instructions to be emulated on Nvidia GPUs (2 shifts + 1 add). This alone gives AMD another 1.7x performance advantage (~1900 instructions instead of ~3250 to execute the SHA-256 compression function).
Combined together, these 2 factors make AMD GPUs overall 3x-5x faster when mining Bitcoins.
... was taken from
https://en.bitcoin.it/wiki/Why_a_GPU_mines_faster_than_a_CPU
edit: that was for SHA-256 based coins, like Bitcoin. Even though scrypt is a different algorithm, the advantage for the Red Team originates from the same source.
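For anyone who wants to check the numbers in the quote, here is a rough Python sketch. It only redoes the arithmetic quoted above (ALUs x clock, and the ~3250 vs ~1900 instruction counts) and shows what "2 shifts + 1 add" looks like for a 32-bit rotate right. The function names are made up for illustration, and this is not how any actual miner kernel is written.

```python
# Back-of-the-envelope check of the figures quoted above.
# Function names are hypothetical; the inputs are the ones from the quote.

MASK32 = 0xFFFFFFFF


def alu_throughput_ginstr(alus, clock_mhz):
    """Peak 32-bit instructions per second, in billions (ALUs x clock)."""
    return alus * clock_mhz / 1000.0


def rotr32_three_ops(x, n):
    """32-bit rotate right emulated with 2 shifts + 1 add (for 0 < n < 32).

    The two shifted halves never share a set bit, so adding them gives
    the same result as OR-ing them together.
    """
    low = x >> n                        # shift 1: bits that stay in range
    high = (x << (32 - n)) & MASK32     # shift 2: bits that wrap around
    return (low + high) & MASK32        # add: merge the two halves


amd = alu_throughput_ginstr(3072, 830)       # Radeon HD 6990
nvidia = alu_throughput_ginstr(1024, 1214)   # GTX 590

alu_ratio = amd / nvidia          # raw ALU advantage
rotate_ratio = 3250 / 1900        # instruction-count advantage from the quote
combined = alu_ratio * rotate_ratio

print(f"AMD:    {amd:.0f} billion instr/s")
print(f"Nvidia: {nvidia:.0f} billion instr/s")
print(f"ALU ratio {alu_ratio:.2f}x, rotate ratio {rotate_ratio:.2f}x, "
      f"combined {combined:.2f}x")
print(hex(rotr32_three_ops(0x12345678, 8)))
```

For this particular pair of cards the combined figure lands inside the 3x-5x range from the quote. Other card pairings will give different ratios.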