The number of transactions included in a block is entirely up to the miner; the upper limit is simply the number of transactions available to it.
The number of pending transactions can indeed be very large. Some transactions may never confirm and can potentially stay in the pending pool forever; it depends on the implementation. As far as I remember, Bitcoin-Qt only forgets a pending transaction when it is restarted. I imagine that blockchain.info has its own implementation, and maybe they persist pending transactions in a database, so they may have transactions around that no current miner knows about.
One of my nodes currently has 1560 pending transactions, 500 connections, and has been running for months. I cannot really tell how old the oldest ones are though.
So probably some of the transactions will never be accepted by any miner. But I cannot believe that, when a block with only one transaction is found right after another block, there were no transactions in the pool whatsoever. Or can it be a strategy to skip checking the previous block's integrity and the transactions it contains, and try to mine an empty block as fast as possible?
Speed is essential, so maybe some miners don't check integrity but simply hope that the previous block is valid. If it isn't, both blocks will be rejected by validating nodes. There is also an argument for not including any transactions at all. When you create a block, it has to propagate quickly to competing miners so that they base their work on your block instead of a competing block generated at about the same time. A small block propagates faster than a large one (bandwidth, latency, verification time, etc.). A large block could carry more fees (profit). So it all comes down to whether the transaction fees make up for the higher probability of losing the propagation race.
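To make the fee versus propagation trade-off concrete, here is a rough sketch. It is my own simplification and not the calculation from the mailing list: it assumes blocks arrive as a Poisson process with a 600 second average interval, that your block is lost whenever a competing block is found while yours is still propagating, and it ignores your own share of the hash rate. The function names and the delay figures in the usage note are hypothetical.

```python
import math

# Assumption: average time between blocks on the network, in seconds.
BLOCK_INTERVAL_S = 600.0


def orphan_probability(delay_s: float) -> float:
    """Chance that a competing block is found during delay_s seconds,
    assuming Poisson block arrivals (a simplifying assumption)."""
    return 1.0 - math.exp(-delay_s / BLOCK_INTERVAL_S)


def expected_revenue(subsidy: float, fees: float, delay_s: float) -> float:
    """Block reward weighted by the chance of winning the propagation race."""
    return (subsidy + fees) * (1.0 - orphan_probability(delay_s))


def break_even_fees(subsidy: float, empty_delay_s: float, full_delay_s: float) -> float:
    """Total fees a full block must carry to match, in expectation,
    the revenue of a near-empty block that propagates faster."""
    # Solve (subsidy + fees) * exp(-full/T) = subsidy * exp(-empty/T) for fees.
    extra_delay = full_delay_s - empty_delay_s
    return subsidy * (math.exp(extra_delay / BLOCK_INTERVAL_S) - 1.0)
```

For example, `break_even_fees(25.0, 2.0, 10.0)` gives the fees a full block would need under the purely illustrative assumptions of a 25 BTC subsidy, 2 seconds to propagate a near-empty block and 10 seconds for a full one. If the fees actually on offer are below that figure, the small block is the more profitable choice under this model.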
Michael Grønager posted this excellent description on the dev mailing list some time ago:
Secondly, don't transactions that are never going to be accepted disappear from the network after some timeout period?
They disappear as they are removed from the memory pool of the various nodes out in the wild. As far as I remember, Bitcoin-Qt only removes them when it is restarted. Note that a node that receives a transaction spending some of the same inputs as one already in its memory pool will discard the new tx as a double spend, and will not relay or mine it.
And third, I understand that miners can implement their own way of making blocks. But when, for example, a miner only accepts blocks with 10 transactions, is there some regulating force that counterbalances that?
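The double-spend rule described above can be illustrated with a toy memory pool. This is only a sketch of the idea, not Bitcoin-Qt's actual code: the `Mempool` class, its method names and the representation of an input as a `(previous_txid, output_index)` tuple are all assumptions made for the example.

```python
class Mempool:
    """Toy in-memory pool: the first transaction seen for an input wins."""

    def __init__(self):
        self.txs = {}    # txid -> set of inputs (outpoints) it spends
        self.spent = {}  # outpoint -> txid of the pooled tx spending it

    def accept(self, txid, inputs):
        """Return True if the tx was added, False if it was discarded."""
        if txid in self.txs:
            return False  # already known
        if any(outpoint in self.spent for outpoint in inputs):
            # Conflicts with a pooled tx: treated as a double spend,
            # so it is neither stored nor relayed nor mined.
            return False
        self.txs[txid] = set(inputs)
        for outpoint in inputs:
            self.spent[outpoint] = txid
        return True

    def remove(self, txid):
        """Forget a tx (e.g. once confirmed) and free the inputs it claimed."""
        for outpoint in self.txs.pop(txid, ()):
            del self.spent[outpoint]
```

With this model, a conflicting transaction only gets a chance on a given node once the earlier one has left that node's pool, which with restart-only eviction means after a restart.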
I think you mean "creates" instead of "accepts". The only "regulating" forces are that the miner has an incentive to keep the network healthy, and an incentive to increase his profits by collecting the fees.
According to Michael's calculations, the fees are currently too low compared to the probability of losing the propagation race.