Miners start searching for a nonce the moment the previous block has been created.
Nothing stops a miner from finding a nonce right away and creating a block with no transactions in it other than the coinbase (the only downside is losing the transaction fees, but it is still worth it, because they win the 12.5 BTC reward for creating the block).
I wrote about this in more detail in this thread:
summary on Proof of Work

"surely they need a static list of txs in order to make guessing effective?"
Miners search for the nonce at an absurd rate of hashes per second. They keep updating their candidate block from the mempool, roughly every second, and point their hashing power at whatever the current candidate is. Changing the transaction list changes the Merkle root in the block header, but no work is lost, because every hash is an independent attempt. Every second counts. They are racing to find the nonce and create the block, no matter how many transactions are in the mempool. It makes almost no difference to them, because the reward for finding the block is enormous.
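
To make this concrete, here is a toy sketch in Python. It is not real Bitcoin code: the header layout (previous hash + Merkle root + 8-byte nonce), the big-endian target comparison, the tiny difficulty, and the names mine, snapshot, and refresh_every are all simplifications I made up for illustration. It only shows that swapping in a fresh transaction list mid-search costs nothing, and that a block containing only the coinbase is mined the same way.

```python
import hashlib
from typing import Callable, Optional


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice, as Bitcoin does for block hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(txids: list[bytes]) -> bytes:
    """Simplified Merkle root: hash pairs, duplicating the last item on odd levels."""
    if not txids:
        raise ValueError("a block always contains at least the coinbase")
    level = list(txids)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [double_sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def mine(prev_hash: bytes,
         get_txids: Callable[[], list[bytes]],
         target: int,
         refresh_every: int = 500,
         max_tries: int = 1_000_000) -> Optional[tuple[int, bytes, int]]:
    """Try nonces until the header hash is below target.

    Every refresh_every attempts the transaction list is re-read and the
    Merkle root recomputed. The search simply continues with the new root.
    Returns (nonce, merkle_root_used, attempts) or None if max_tries runs out.
    """
    root = merkle_root(get_txids())
    nonce = 0
    for tries in range(1, max_tries + 1):
        header = prev_hash + root + nonce.to_bytes(8, "little")  # toy layout
        if int.from_bytes(double_sha256(header), "big") < target:
            return nonce, root, tries
        nonce += 1
        if tries % refresh_every == 0:
            root = merkle_root(get_txids())  # pick up new transactions
    return None


# Hypothetical inputs, for illustration only.
PREV_HASH = double_sha256(b"previous block")
COINBASE = double_sha256(b"coinbase paying the miner")
TARGET = 1 << 244  # toy difficulty: roughly 1 hash in 4096 succeeds

mempool: list[bytes] = []


def snapshot() -> list[bytes]:
    """Pretend one new transaction has arrived since the last refresh."""
    mempool.append(double_sha256(f"tx {len(mempool)}".encode()))
    return [COINBASE] + mempool


empty = mine(PREV_HASH, lambda: [COINBASE], TARGET)  # coinbase-only block
busy = mine(PREV_HASH, snapshot, TARGET)             # block that keeps refreshing

print("coinbase-only block:", empty)
print("refreshing block:   ", busy)
```

Both searches need about the same number of attempts on average, since the chance of success per hash depends only on the target, not on which transactions are in the candidate block.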