watt (W) = power; kW = power
watt-hours (Wh) = energy; kWh = energy
If you pay 5.7 cents per kWh then FPGAs are a non-issue (at least in the near term).
The true metric is total cost of ownership (TCO) over the expected lifespan: (capital cost + all energy cost) / number of hashes produced in, say, 18 months or 3 years. With very low power rates you can keep a low TCO without FPGAs.
TL;DR version: all that matters is the total cost (equipment + electricity + repairs + labor/management) to produce 1 PH (1 quadrillion hashes) over the lifetime (either actual or economic) of the equipment (or the life of under-warranty replacements).
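A rough sketch of that calculation in Python, in case it helps. The function name, parameter names, and the example numbers in the comments are all made up for illustration; plug in your own hardware price, power draw, hashrate, and electricity rate.

```python
def cost_per_petahash(capital_usd, power_watts, hashrate_mhs,
                      usd_per_kwh, lifetime_hours, other_usd=0.0):
    """Total cost of ownership per PH (1e15 hashes) over the rig's lifetime.

    other_usd covers repairs, labor/management, etc. (hypothetical lump sum).
    Assumes the rig runs continuously at constant power and hashrate.
    """
    # Energy used over the lifetime: W -> kW, times hours = kWh
    energy_kwh = power_watts / 1000.0 * lifetime_hours
    energy_cost = energy_kwh * usd_per_kwh

    total_cost = capital_usd + energy_cost + other_usd

    # Hashes produced: MH/s -> H/s, times seconds in the lifetime
    total_hashes = hashrate_mhs * 1e6 * lifetime_hours * 3600
    petahashes = total_hashes / 1e15

    return total_cost / petahashes


# Hypothetical example: $1000 rig, 500 W, 1000 MH/s, 5.7 cents/kWh, 10,000 hours
example = cost_per_petahash(1000, 500, 1000, 0.057, 10_000)
```

Comparing two rigs is then just a matter of calling the function once per rig with the same electricity rate and lifetime.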
To answer your direct question (some ballpark figures, where MH/W means MH/s per watt):
Unoptimized non-dedicated GPU rig: <2 MH/W (gaming computer w/ 1 or 2 GPUs that someone used for mining)
Optimized GPU rig: 2 MH/W to 3 MH/W (requires good planning and part selection)
Underclocked & undervolted rig: 4 MH/W to 5 MH/W (requires good planning and part selection)
BFL Single: ~9 MH/W (unknown chip, my guess is it is a "last gen" 60nm FPGA)
ztex, 6500, Icarus boards: ~20 MH/W (all based on 45nm Spartan-6 FPGA)
RigBox: ~20 MH/W (untested specs, unknown chip)
Artix-7 based board: ~30 MH/W to 40 MH/W (estimate based on theoretical improvement from 45nm to 32nm FPGA)
LargeCoin: ~200 MH/W (sASIC design, untested spec)
Full Custom ASIC: ~500 MH/W+ (theoretical estimate based on SHA-2 "testbed" processor)
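To tie the efficiency figures back to electricity cost: 1 MH/s per watt is the same as 1 MH per joule, so the energy needed for 1 PH follows directly from the MH/W number. A small Python sketch (function names are my own, and this covers electricity only, not hardware cost):

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 1000 W * 3600 s


def energy_kwh_per_ph(mh_per_watt):
    """kWh of electricity needed to compute 1 PH (1e15 hashes)."""
    hashes_per_joule = mh_per_watt * 1e6  # MH/s per W == MH per joule
    joules = 1e15 / hashes_per_joule
    return joules / JOULES_PER_KWH


def energy_cost_per_ph(mh_per_watt, usd_per_kwh):
    """Electricity cost in USD to compute 1 PH at a given efficiency."""
    return energy_kwh_per_ph(mh_per_watt) * usd_per_kwh


# Ballpark efficiencies from the list above, at 5.7 cents per kWh
for name, eff in [("Optimized GPU rig", 2.0),
                  ("Spartan-6 FPGA board", 20.0),
                  ("Full custom ASIC (estimate)", 500.0)]:
    print(f"{name}: ${energy_cost_per_ph(eff, 0.057):.4f} per PH")
```

At a fixed electricity rate the energy cost per PH scales inversely with MH/W, which is why cheap power narrows the gap between GPUs and FPGAs.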