If the attacker decides to fork much earlier, say 2,500 blocks earlier, he has to catch up with and eventually overtake the main chain. However, the attacker cannot do so, because the difficulty is adjusted upward if he mines blocks too fast. Since 2016 blocks are mined every 2 weeks on average, and this applies to both chains, the attacker will fight a losing battle rather than catch up with the main chain: the other chain eventually grows as fast as his.
I know this is only a theoretical problem, but I was wondering whether the block time itself (or whatever you want to call it) imposes limitations on the 51 percent attack. Thanks!
I wrote a basic explanation of how 51% attacks work: How does a double spend 51% attack work? Explanation and examples.
Article by Vitalik Buterin that covers some of what you are interested in: https://blog.ethereum.org/2014/07/11/toward-a-12-second-block-time/
The blockchain follows the heaviest chain. This means the chain with the most accumulated proof-of-work (the sum of the difficulties of its blocks), which is usually the longest chain as well.
This means that the difficulty plays an important role in which chain is accepted as the "winning chain". Depending on the code, the difficulty is adjusted every block or every x blocks. A higher-difficulty chain therefore has preference over a lower-difficulty chain. Over time, the higher-difficulty chain will also be longer than the lower-difficulty chain, provided that its hashrate remains constant (or increases).
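To make the heaviest-chain rule concrete, here is a minimal sketch of a fork-choice function that compares chains by summed difficulty rather than by block count. The `Block` class, `chain_work` and `pick_chain` are hypothetical names made up for illustration, and difficulty is treated as an abstract "expected work per block" number; real clients derive work from the block header's target.

```python
from dataclasses import dataclass


@dataclass
class Block:
    # Expected work needed to find this block, in arbitrary units (assumption).
    difficulty: float


def chain_work(chain):
    """Total work of a chain: the sum of its blocks' difficulties."""
    return sum(block.difficulty for block in chain)


def pick_chain(a, b):
    """Return the heavier of two chains; ties go to the first one."""
    return a if chain_work(a) >= chain_work(b) else b


# Ten easy blocks versus three hard blocks.
long_easy = [Block(1.0) for _ in range(10)]   # total work 10
short_hard = [Block(4.0) for _ in range(3)]   # total work 12

winner = pick_chain(long_easy, short_hard)
print(len(winner), chain_work(winner))
```

In this toy case the shorter chain wins because it carries more total work, which is why "longest" and "heaviest" only coincide when the difficulties are comparable.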
If you followed the progress of the BCH versus BSV split, you can see how that worked out publicly: https://cash.coin.dance/
The BCH & BSV fork effectively had many of the characteristics of a very public 51% attack (ignoring the differences in code) and has provided a lot of interesting data.
Despite the increase in difficulty due to higher hashrate, a higher-hashrate chain in general adds blocks to the blockchain faster than a lower-hashrate chain (the longest / heaviest chain is accepted as the majority chain).
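As a rough check on how difficulty adjustment interacts with a fork started 2,500 blocks back, the following sketch runs a deterministic, expected-value model of two chains. Everything in it is an assumption for illustration: the 60/40 hashrate split, the starting difficulty, the `simulate` helper, and the simplification that every block takes exactly `difficulty / hashrate` seconds. Only the 2016-block period, the 10-minute target and the 4x adjustment clamp are taken from Bitcoin's rules.

```python
RETARGET = 2016          # blocks per difficulty adjustment period
TARGET_SPACING = 600.0   # target seconds per block


def simulate(hashrate, start_difficulty, duration):
    """Mine for `duration` seconds; return (blocks, total_work).

    Difficulty is the expected number of hashes per block, so each block
    takes difficulty / hashrate seconds in this averaged model.
    """
    t = 0.0
    period_start = 0.0
    difficulty = start_difficulty
    blocks = 0
    work = 0.0
    while t + difficulty / hashrate <= duration:
        t += difficulty / hashrate
        blocks += 1
        work += difficulty
        if blocks % RETARGET == 0:
            # Rescale so the next period would take two weeks at this pace.
            factor = (RETARGET * TARGET_SPACING) / (t - period_start)
            difficulty *= min(max(factor, 0.25), 4.0)
            period_start = t
    return blocks, work


START_DIFFICULTY = 600.0   # 1.0 hash/s total gives 600 s blocks (assumed units)
HEAD_START_BLOCKS = 2500   # blocks the main chain already has past the fork
DURATION = 1.0e7           # seconds simulated after the attack begins

attacker_blocks, attacker_work = simulate(0.6, START_DIFFICULTY, DURATION)
honest_blocks, honest_work = simulate(0.4, START_DIFFICULTY, DURATION)
honest_blocks += HEAD_START_BLOCKS
honest_work += HEAD_START_BLOCKS * START_DIFFICULTY

print("attacker:", attacker_blocks, "blocks,", round(attacker_work), "work")
print("honest:  ", honest_blocks, "blocks,", round(honest_work), "work")
```

Under these assumptions both chains settle at roughly one block per 10 minutes after their first retarget, so the attacker stays behind in block count, yet his chain ends up with more accumulated work because work grows in proportion to hashrate. With less than half of the hashrate the same model would never close the gap.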