IMHO, the short answer is that Bitcoin is a distributed network. If you want lots of people around the world to work on the same problem, you need to parallelize it one way or another.
You are aware that they are not working on the same problem? Every miner is working on a different problem (with a different bounty transaction, different time stamp etc.). This cannot be the reason.
This question has already been asked in many ways. If you can limit mining to a single network node (CPU), dedicated miners will just set up multiple nodes for themselves.
Exactly. This improves the probabilistic convergence speed of the algorithm. That's just my claim. Hopefully I can come around to produce a more formal proof for this the next days.
Moreover, limiting generation per person throws away all incentives for developing the network. In its current form, the network is stronger against attacks, because people are rewarded for spending computational power on it. In other words, who is going to do any work, if you just give everyone a big chunk of money?
The point is: The current proof of work scheme makes it possible to parallelize and have pools. A pool could thus become a very strong adversary which is not what we want - right? A non-parallelizable proof of work scheme has the consequence that nobody can become stronger than a, say, 4.5 GHz overclocked single core pentium. This is what we want.