If the bottleneck for producing the proof of work weren't energy, then it wouldn't be so "wasteful". For example, switching mining to FPGAs and ASICs changes the bottleneck to engineering resources instead of energy. The problem is that a determined rogue government would have no problem acquiring a lot of either.
The ultimate resource that ought to go into creating proof of work would have to be individual human attention.
Anything that strives to minimize the amount of proof of work needed would have to be something along the lines of having blocks digitally signed, with network participants consciously giving more weight to blocks signed by trusted signers. This way, someone creating disruptive blocks could have their blocks voted out more efficiently than by just hoping they don't control most of the CPU power.
If the adversary is a government with the capacity to commandeer resources from others by force, such an adversary will always have an advantage. The only way to level out that kind of advantage would be for there to be a democratic force to take it away.
Which is why a proof of stake requirement could be used to directly increase the monetary cost without consuming anything.
Consider a protocol that required a miner to put up 30 days of output in order to mine at a specific speed. Speed could be tracked in a decentralized way by a difficulty-1 share chain. The details aren't important at this stage; just accept that there is a method to ensure every miner has funds at risk when they mine. Say the "proof of stake" was 30 days of output. A 1 GH/s miner will produce (at current difficulty) ~1 BTC per day, so when they mine a block, 30 BTC would be taken from an address they provide and added to the reward (50 BTC), and the entire thing "escrowed" by the protocol rules which prohibit coinbase transactions from being spent for 120 blocks.
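To make the escrow idea concrete, here is a minimal sketch of the bookkeeping, not a protocol design. The names `EscrowedCoinbase` and `mine_with_stake` are hypothetical, and the block height in the example is arbitrary; only the 50 BTC reward, the 30-day stake and the 120-block lock come from the post.

```python
from dataclasses import dataclass

BLOCK_REWARD_BTC = 50     # block reward quoted in the post
COINBASE_MATURITY = 120   # blocks a coinbase stays unspendable, per the post


@dataclass
class EscrowedCoinbase:
    amount_btc: float      # block reward plus the miner's stake
    unlock_height: int     # height at which the 120-block lock has elapsed


def mine_with_stake(height, daily_output_btc, stake_days=30):
    """Hypothetical rule: lock the miner's stake together with the reward."""
    stake = daily_output_btc * stake_days
    return EscrowedCoinbase(
        amount_btc=BLOCK_REWARD_BTC + stake,
        unlock_height=height + COINBASE_MATURITY,
    )


# A 1 GH/s miner earning ~1 BTC per day finds a block at an arbitrary height.
coinbase = mine_with_stake(height=150_000, daily_output_btc=1.0)
print(coinbase)
```

For that miner, 80 BTC (50 reward + 30 stake) stays locked until 120 blocks later.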
This in effect makes the up-front capital costs HIGHER, and as a result energy costs are a smaller portion of the lifecycle costs. Say a 1 GH/s rig costs about 200 BTC. At 2 MH/W and 0.025 BTC per kWh, over its life cycle (say 3 years) it will consume about 330 BTC in power. Total cost for 3 years of hashing power is 200 BTC + 330 BTC = 530 BTC. A 30 BTC escrow raises the "cost" of the hardware by 15% (although the miner gets it all back if there is no attack). Prior to proof of stake, energy makes up 62% of total network cost. With a 30 day proof of stake requirement, energy makes up only about 59%.
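The arithmetic above can be checked with a few lines of Python. This is only a sketch using the figures from the post (1 GH/s rig, 2 MH/W, 0.025 BTC per kWh, 3 years, 200 BTC hardware, 30 BTC stake); the variable names are mine.

```python
HASHRATE_MH = 1000                 # 1 GH/s rig, in MH/s
EFFICIENCY_MH_PER_W = 2
POWER_PRICE_BTC_PER_KWH = 0.025
LIFETIME_HOURS = 3 * 365 * 24      # 3 years of continuous hashing
HARDWARE_BTC = 200
STAKE_BTC = 30

# Power draw in kW, then lifetime energy cost in BTC.
power_kw = HASHRATE_MH / EFFICIENCY_MH_PER_W / 1000
energy_btc = power_kw * LIFETIME_HOURS * POWER_PRICE_BTC_PER_KWH


def energy_share(capital_btc):
    """Fraction of lifecycle cost that goes to electricity."""
    return energy_btc / (capital_btc + energy_btc)


print(f"energy cost: {energy_btc:.1f} BTC")
print(f"energy share without stake: {energy_share(HARDWARE_BTC):.1%}")
print(f"energy share with stake:    {energy_share(HARDWARE_BTC + STAKE_BTC):.1%}")
```

The unrounded energy cost comes out slightly under the 330 BTC used above, so the shares land close to, but not exactly on, the percentages quoted in the post.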
Another way to look at it is from the attacker's perspective. 1 GH/s of hardware no longer costs 200 BTC. It costs 230 BTC, a 15% premium. In essence a 30 day proof of stake raises the cost to attack the network by 15%. The network is 15% "stronger". A larger proof of stake (say 90 days) would put a larger premium on capital costs (45%). Using a method similar to difficulty, the network could adapt the proof of stake based on how much funds miners have available. Miners could make the network stronger simply by keeping funds available.
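The premium scales linearly with the length of the stake period. A small sketch, assuming the post's figures of ~1 BTC per day of output and 200 BTC of hardware per GH/s (`stake_premium` is just an illustrative name):

```python
def stake_premium(stake_days, daily_output_btc=1.0, hardware_btc=200.0):
    """Stake as a fraction of the hardware cost for the same hashing power."""
    return stake_days * daily_output_btc / hardware_btc


for days in (30, 90):
    print(f"{days}-day stake: {stake_premium(days):.0%} premium on capital costs")
```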
TL;DR version:
Today the cost to attack the network is:
Hardware Capital Costs <- equally shared by defenders and attackers
Electrical Costs <- since an attack is short lived and hashing continues forever, this cost is mostly borne by defenders
With a proof of stake it is:
(Hardware Capital Costs + Proof of Stake Costs) <- equally shared by defenders and attackers
Electrical Costs <- since an attack is short lived and hashing continues forever, this cost is mostly borne by defenders
While it doesn't "solve" the OP's problem, nor does it "solve" the threat of nations, it does make the network more efficient (less energy consumed for a given amount of security) and makes any attack by a rogue government (or other non-economic attack) more expensive. It also has the effect of making economic double spends (double spending with intent to profit) a non-issue. To have 51% of hashing power if Bitcoin has a 30 day "proof of stake" would require an attacker to put ~100K coins ($400K USD) at risk. A 90 day proof of stake would raise the cost of such an attack by $1.2M. In any double spend those "proof of stake funds" would be locked for 120 blocks, meaning the attacker is guaranteed to lose a significant portion as the value of Bitcoin crashes.
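For anyone who wants to see where the ~100K coin figure comes from, here is a rough sketch. It assumes 144 blocks per day at 50 BTC each and that the stake is the attacker's share of 30 (or 90) days of network output; the $4 per BTC price is only what the post's "100K coins = $400K" implies, not a quoted rate.

```python
BLOCKS_PER_DAY = 144      # one block roughly every 10 minutes
BLOCK_REWARD_BTC = 50
USD_PER_BTC = 4.0         # assumption implied by ~100K coins ~ $400K


def attacker_stake_btc(hash_share, stake_days):
    """Coins an attacker must put at risk to mine with `hash_share` of the network."""
    daily_network_output = BLOCKS_PER_DAY * BLOCK_REWARD_BTC
    return hash_share * daily_network_output * stake_days


for days in (30, 90):
    btc = attacker_stake_btc(0.51, days)
    print(f"{days}-day stake: {btc:,.0f} BTC (~${btc * USD_PER_BTC:,.0f})")
```

The results are a little above the rounded numbers in the post, which is expected since the post rounds down to ~100K coins.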