Consider this attack: I have the ability to isolate the network connectivity of many different nodes and I can pretend to be all of their peers (this isn't an especially hard attack; e.g. it's one people at large ISPs can pull off).
I make a one-time investment to fork the chain someplace after the highest of the currently used checkpoints, and mine enough blocks to get it down to a difficulty I can sustain on my own. Once I've done this I can isolate fresh nodes and get them onto my fantasy chain (the fact that it's the sum-difficulty used for comparison is irrelevant because I've isolated them), and I can satisfy their prudent must-have-n-confirms requirement before they consider a txn safe.
Say I need to reduce it by 1024x to get it to where I can mine it on my own (which is about right, 1/1024 puts 10GH at ~11 minutes/block). This would currently cost 134,268.75 BTC (simply the forgone income from the same amount of computation: 2016*(50/4^0) + 2016*(50/4^1) + 2016*(50/4^2) + 2016*(50/4^3) + 2016*(50/4^4)).
If you switch to a sliding window with the same overall behavior, your change clamp at each block will need to be 0.25^(1/2016). You would still need to mine ~10080 blocks, but the total cost would be ~72,615 BTC because far fewer blocks are calculated at the 'too high' difficulty.
So it would ~halve the cost of this attack.
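For anyone who wants to check the arithmetic, here is a rough sketch of both totals. The function and constant names are mine, and the cost model is just the forgone-income one from above (a block mined at relative difficulty d costs 50*d BTC). For the sliding window I'm assuming the first forged block is already one clamp step below full difficulty, which is the convention that reproduces the figure quoted; starting the first block at full difficulty comes out about 50 BTC higher.

```python
BLOCK_REWARD = 50.0   # BTC per block at the time of writing
WINDOW = 2016         # blocks per retarget period
MAX_STEP = 0.25       # largest downward change per retarget (the 4x clamp)
PERIODS = 5           # 4**5 == 1024, the difficulty reduction wanted


def stepped_cost():
    """Forgone income with a 2016-block retarget and a 4x clamp per period."""
    return sum(WINDOW * BLOCK_REWARD * MAX_STEP ** p for p in range(PERIODS))


def sliding_cost():
    """Forgone income with a per-block clamp of 0.25^(1/2016).

    Assumption: block k (1-based) is mined at relative difficulty r**k.
    """
    r = MAX_STEP ** (1.0 / WINDOW)
    n = WINDOW * PERIODS  # 10080 blocks, same as the stepped case
    return sum(BLOCK_REWARD * r ** k for k in range(1, n + 1))


if __name__ == "__main__":
    stepped = stepped_cost()
    sliding = sliding_cost()
    print(f"stepped window: {stepped:.2f} BTC")
    print(f"sliding window: {sliding:.2f} BTC")
    print(f"ratio:          {sliding / stepped:.3f}")
```

The ratio comes out a little over one half, which is where the "~halve" comes from.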
The clamps in bitcoin are what make these attacks costly, but the clamps also represent exciting non-linearities in the payout of the system which miners could exploit for increased profits. The fact that the clamps are hard to reach currently makes it a non-issue, but with a sliding window the clamps would have to be very near a factor of 1.0 to preserve the resistance to these forged chain attacks, so the system would almost always be operating in the non-linear region.
Tightening the clamps to keep the attack cost the same would only worsen the non-linearity.
Moreover (ignoring the screwed up calculation), the window plus node timestamp enforcement (the limitation against blocks from the future) limits the maximum gain miners can get from lying about the time to a couple percent. A sliding window would make this worse because it would provide an incentive to lie for every block, and not just the final ones in a cycle.
People need to stop fixating on weirdness in other chains which are more or less irrelevant for bitcoin and realize that the design of bitcoin isn't an accident. Every one of the features of the distributed algorithm has a complicated relationship with all the others.