There's no real reason the dataset cannot be pruned - I've been playing with a DB copy of a blockchain, looking at ways of "removing" the records for accounts with a nil balance (total amounts out = total amounts in) whose last activity was more than 30 days ago.
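A rough sketch of that pruning rule, assuming the DB copy can be flattened into simple `(address, amount, timestamp)` rows with amounts in satoshis (positive = in, negative = out). The row layout and the name `prunable_addresses` are made up for illustration; this is not the real blockchain schema.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def prunable_addresses(entries, now, min_age_days=30):
    """Return addresses with a nil balance and no activity in the last min_age_days.

    entries: iterable of (address, amount, timestamp) rows (hypothetical layout),
             amount is an integer in satoshis, positive for in, negative for out.
    """
    balance = defaultdict(int)
    last_seen = {}
    for address, amount, ts in entries:
        balance[address] += amount
        # Track the most recent activity per address.
        if address not in last_seen or ts > last_seen[address]:
            last_seen[address] = ts

    cutoff = now - timedelta(days=min_age_days)
    return {
        address
        for address, total in balance.items()
        if total == 0 and last_seen[address] < cutoff
    }
```

An address that was emptied yesterday stays put; only the ones that are both empty and stale become candidates for removal.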
As a _user_ of bitcoins I don't *need* the whole blockchain, if I could get "balances at a point in time" and the journal entries after that.
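In ledger terms that is just an opening balance plus the journal since. A minimal sketch, assuming a snapshot is a plain `address -> balance` mapping and the journal is a list of `(address, amount)` entries in satoshis; both shapes and the name `balances_from_snapshot` are hypothetical, not how any real client stores things.

```python
def balances_from_snapshot(snapshot, journal):
    """Rebuild current balances from a point-in-time snapshot plus later entries.

    snapshot: dict of address -> balance at the checkpoint (hypothetical shape)
    journal:  iterable of (address, amount) entries made after the checkpoint,
              positive for in, negative for out
    """
    balances = dict(snapshot)  # copy so the snapshot itself is left untouched
    for address, amount in journal:
        balances[address] = balances.get(address, 0) + amount
    return balances
```

Everything before the snapshot never needs to be read, which is the whole point for a lightweight client.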
I similarly don't really *need* (although I might _choose_) to have all my addresses maintain a chain of completed transactions - if I had a way to "move" the balance to another address within my wallet, I could then "discard" that address.
My "real life" wallet doesn't care which ATM each of the notes came out of; I have a "balance" (occasionally) in there that I can spend - in fact having that "tracking" decreases anonymity significantly.
A lightweight client is a key to mass adoption (amongst a number of other things).
It's great that my children can empty a moneybox and see a £2 coin and know from the year "that was the one from Grandma on my 6th birthday". It rapidly becomes irrelevant when it's a pile of coins getting spent on a beachball - getting the sand out of your shoes becomes much more pressing. In the same way I don't need to know that 4mBTC came from me testing -QT to a non-QT client - it's just 4mBTC to be spent, which, due to the size and age of the bitcoins, will probably cost me more in fees to use than it's worth.
Imagine buying a car for £5000 and taking 500x£10 notes to the dealer, but you find they can't sell it to you because the notes came from 702 different amounts of change from your wages, and some are "worth" less than £10 when spending because they're notes only printed that morning, or were made up of 200x5p transactions... in the "real" world £5k is £5k is £5k, not some variable equivalent that might eventually be 5k.