For the entirety of Bitcoin's history, it has produced blocks smaller than the protocol limit.
Why didn't the average size of blocks shoot up to 1 MB and stay there the instant Satoshi added a block size limit to the protocol?
I'm not sure what you're getting at. Clearly there just hasn't been the demand for 1 MB worth of transactions per block thus far, but that could change relatively soon, hence the debate over lifting the 1 MB cap before we get to that point. If the block limit were suddenly to drop to 50 KB, I think we'd start seeing a whole lot of 50 KB blocks, no?
Justus is, I believe, pointing out that until very recently bitcoin has effectively had no block size limit, as blocks near the protocol limit were almost non-existent. More recently we tend to get a few a day, mostly from F2Pool.
Those claiming we'll have massive runaway blocks full of one satoshi / free transactions have never adequately explained why it wasn't true historically when the average block size was 70k, and why people still felt the need to pay fees then.
Anyone trying to send free / very low fee transactions recently will know from having it backfire that they have to think long and hard about taking the risk if they want confirmation in a reasonable time, and that's the way it should be and likely always will be. Each incremental transaction increases miner risk, and therefore has a cost, and that's natural and good, and enough for an equilibrium to be found.
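To put a rough shape on "each incremental transaction increases miner risk", here is a back-of-the-envelope sketch. It assumes block discovery is a Poisson process with a 600-second mean interval, so a block that takes an extra t seconds to propagate is orphaned with probability roughly 1 - exp(-t/600). The propagation delay per byte is a made-up parameter for illustration, not a measurement, and the function names are hypothetical.

```python
import math

BLOCK_INTERVAL_S = 600.0  # Bitcoin's target average time between blocks


def orphan_probability(delay_s: float) -> float:
    """Chance a competing block is found while ours is still propagating.

    Assumes Poisson block arrivals; a simplification, not a network model.
    """
    return 1.0 - math.exp(-delay_s / BLOCK_INTERVAL_S)


def marginal_orphan_cost_sat(tx_bytes: int, block_bytes: int,
                             reward_sat: int, delay_per_byte_s: float) -> float:
    """Expected block reward lost by adding one more transaction.

    delay_per_byte_s is a hypothetical propagation cost per byte.
    Fees already collected in the block are ignored for simplicity.
    """
    before = orphan_probability(block_bytes * delay_per_byte_s)
    after = orphan_probability((block_bytes + tx_bytes) * delay_per_byte_s)
    return reward_sat * (after - before)


# Illustrative inputs only: a 250-byte transaction added to a 70 KB block,
# a 25 BTC reward, and an assumed 10 seconds of delay per megabyte.
cost = marginal_orphan_cost_sat(250, 70_000, 2_500_000_000, 1e-5)
print(f"marginal cost: {cost:.0f} satoshis")
```

Under these assumptions a rational miner would want at least that much in fees before including the transaction, which is the equilibrium argument in miniature. The actual figure depends entirely on the assumed delay per byte.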
Heck, suppose the cap were completely removed and some major pools concerned about spam (aren't we all?) stated that, for their own values of X, Y and Z, they'd not relay blocks larger than (say) 500 KB that pay total fees of less than X satoshis per kilobyte, and would not even build on blocks paying fees of less than Y per kilobyte unless those blocks had managed to become Z blocks deep. That would have a huge deterrent effect, making it expensive to try to spam the network. Not many people are willing to risk 25 BTC to make a point, never mind do so repeatedly. X, Y and Z wouldn't need to be uniform across pools, and of course could change with time and technology. An equilibrium would be found and blocks would achieve a natural growth rate that no central planner can properly plan.
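For what it's worth, the X / Y / Z policy described above could be sketched roughly like this. This is not how any pool or client actually works; the class, field and function names are hypothetical, and I'm assuming the "don't build on it" rule only applies to blocks over the size threshold, which the description leaves open.

```python
from dataclasses import dataclass


@dataclass
class PoolPolicy:
    """One pool's own choice of thresholds (all names are hypothetical)."""
    size_threshold_bytes: int   # e.g. 500 KB; smaller blocks are never penalised
    relay_min_fee_rate: float   # X, in satoshis per kilobyte
    build_min_fee_rate: float   # Y, in satoshis per kilobyte
    build_depth: int            # Z, depth at which the pool gives in and builds


def fee_rate(total_fees_sat: int, size_bytes: int) -> float:
    """Total fees in the block divided by its size in kilobytes."""
    return total_fees_sat / (size_bytes / 1000)


def should_relay(policy: PoolPolicy, size_bytes: int, total_fees_sat: int) -> bool:
    """Relay small blocks always; relay large ones only if they pay at least X."""
    if size_bytes <= policy.size_threshold_bytes:
        return True
    return fee_rate(total_fees_sat, size_bytes) >= policy.relay_min_fee_rate


def should_build_on(policy: PoolPolicy, size_bytes: int,
                    total_fees_sat: int, depth: int) -> bool:
    """Build on a large low-fee block only once it is at least Z blocks deep."""
    if size_bytes <= policy.size_threshold_bytes:
        return True  # assumption: the rule targets oversized blocks only
    if fee_rate(total_fees_sat, size_bytes) >= policy.build_min_fee_rate:
        return True
    return depth >= policy.build_depth


# Illustrative values only; each pool would pick its own X, Y and Z.
policy = PoolPolicy(size_threshold_bytes=500_000, relay_min_fee_rate=10_000,
                    build_min_fee_rate=5_000, build_depth=3)
print(should_relay(policy, 900_000, 1_000_000))        # large block, low fees
print(should_build_on(policy, 900_000, 1_000_000, 1))  # same block, 1 deep
```

Because each pool holds its own `PoolPolicy`, a spammer can't know in advance which thresholds a giant low-fee block has to clear, and that uncertainty is what puts the block reward at risk.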