He's touched on some of the topics, and given a reasonable overview, but he has his own personal understanding about the limitations of hard forks and soft forks and he has an agenda that he tries to push: to discourage soft forks and encourage hard forks. As such, he's left out important details, and he's included some propaganda in his response.
As an example of where Franky has made a mistake...
A softfork uses a flag indicating that a certain type of transaction should NOT be checked. Instead the transaction is blindly passed on by any node that is not running the new software. This trick then allows anything to happen simply by using the flag that tells old nodes not to check whatever data appears after it.
Soft forks absolutely can be completely compatible with old nodes. The general rule is that a soft fork needs enough hashpower behind the change to outpace any competing blocks generated by miners that don't agree. Technically this only requires a simple majority of the global hashpower (the usual "51%"), but in practice it is usually triggered at a much larger percentage (to keep the change stable and behavior predictable).
Here's a theoretical example of the difference between a soft fork and a hard fork:
Currently the block subsidy is 12.5 BTC. However, the protocol does not require that the miner or pool collect the full 12.5 BTC if they don't want to. A miner could just collect 10 new BTC from a solved block, and the entire network would be perfectly happy with that. The protocol only prevents a miner from collecting MORE THAN the sum of the block subsidy plus the transaction fees of all the transactions in the block.
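The rule above boils down to a single comparison. Here is a minimal sketch of it, not Bitcoin Core's actual code: the function name `coinbase_is_valid`, the fee amount, and the use of integer satoshis are all assumptions made for the example.

```python
COIN = 100_000_000        # satoshis per BTC
SUBSIDY = 1_250_000_000   # 12.5 BTC, the subsidy used in this post


def coinbase_is_valid(claimed, total_fees, subsidy=SUBSIDY):
    """Hypothetical check: a miner may claim anything up to subsidy + fees."""
    return 0 <= claimed <= subsidy + total_fees


fees = 20_000_000  # 0.2 BTC of fees, an arbitrary illustrative value

underpaid = coinbase_is_valid(10 * COIN, fees)          # miner takes less than allowed
exact = coinbase_is_valid(SUBSIDY + fees, fees)         # miner takes the maximum
overpaid = coinbase_is_valid(SUBSIDY + fees + 1, fees)  # one satoshi too many
```

Only the last case is rejected. Claiming less than the maximum is allowed, which is what the soft fork example further down relies on.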
So, if we wanted to increase the block subsidy from 12.5 BTC to 15 BTC...
That would be a HARD FORK. Every node on the network that is still running the old software (both miners and non-miners) would reject those blocks as being invalid. If there was only 1 node and 1 miner left running the old software, then the old rules would still be enforced by that 1 node and 1 miner. The blockchain would split and there would be 2 blockchains with 2 types of blocks (those with the 12.5 BTC subsidy and those with the 15 BTC subsidy).
Both chains would share a common history since the new software would still accept the old (pre-fork) blocks that had LESS than the 15 BTC subsidy. Assuming the miners that are running the new software have more hash power than the miner that is running the old software, the "new block type" chain will always be longer. This longer chain means that even though the 12.5 BTC blocks could be considered "valid" by the new rules, they will be orphaned and ignored in favor of the longer chain. Meanwhile, since the miner running the old software sees the new blocks from the other miners as "invalid", they will never add those blocks to their own chain no matter how long the competing chain gets. A long chain of invalid blocks is still invalid.
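The asymmetry in the hard fork case can be sketched in a few lines. This is a toy model under stated assumptions: a chain is just a list of coinbase rewards in satoshis, fees are ignored, and `chain_is_valid` is a hypothetical helper, not a real node API.

```python
COIN = 100_000_000
OLD_MAX = 1_250_000_000   # 12.5 BTC limit enforced by the old software
NEW_MAX = 15 * COIN       # 15 BTC limit enforced by the new software


def chain_is_valid(chain, max_reward):
    """A chain is valid only if every single block respects the limit."""
    return all(0 <= reward <= max_reward for reward in chain)


shared_history = [OLD_MAX] * 3                   # pre-fork blocks, common to both
new_chain = shared_history + [NEW_MAX] * 100     # long chain of 15 BTC blocks
old_chain = shared_history + [OLD_MAX] * 2       # short chain of 12.5 BTC blocks

old_node_accepts_new_chain = chain_is_valid(new_chain, OLD_MAX)
new_node_accepts_old_chain = chain_is_valid(old_chain, NEW_MAX)
```

Here `old_node_accepts_new_chain` is `False` even though the new chain is far longer, while `new_node_accepts_old_chain` is `True`: the new software merely prefers its own longer chain, whereas the old software cannot follow the new chain at all.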
The only way to prevent a mess like the Ethereum / Ethereum Classic split would be to have such an overwhelming amount of support for the new rules that the few nodes and few miners that refuse to switch become meaningless and powerless. They would have their own coins that the rest of the world doesn't care about and won't accept. This overwhelming lack of interest in the "old bitcoin" results in a complete collapse of its value and it becomes one of those millions of ignored and worthless altcoins, while the "new bitcoin" becomes the coin that the world recognizes as the "real" bitcoin.
On the other hand, if we wanted to decrease the block subsidy from 12.5 BTC to 10 BTC...
That would be a SOFT FORK. All the existing nodes and miners running the old software will just see it as a perfectly acceptable (although silly) decision by miners to pay themselves less than the maximum. Meanwhile the nodes and miners running the new software will refuse to accept any blocks with a subsidy larger than 10 BTC.
As long as the new software has less than half of the hash power, the miners running the old software will be able to continue to create blocks with 12.5 BTC and the blockchain will fork. The miners running the new software can ignore those blocks, but since they have less than half the hash power, the "old block type" chain will be the longer one, and the old miners and nodes will keep following it.
Once the new software has more than half of the hash power (the usual "51%"), its blockchain will be longer than the blockchain created by the old software. Since the new blockchain consists of blocks that BOTH the new and old software recognize as "valid", this longer chain will win on both forks and the entire network will settle on this chain, even if NONE of the non-mining nodes upgrade and 49% of the hash power doesn't upgrade. All the 12.5 BTC blocks created by miners running the old software will end up getting orphaned, and the miners with the new software (and 51% of the hashpower) will effectively enforce the new rules on the entire network.
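Both outcomes of the soft fork case can be sketched with a toy chain selector. The assumptions are loud ones: each chain lists only the rewards of blocks mined after the rule change, fees are ignored, and "longest valid chain" stands in for the real rule, which compares accumulated proof of work. The names `best_chain` and `views` are hypothetical.

```python
COIN = 100_000_000
OLD_MAX = 1_250_000_000   # 12.5 BTC limit enforced by the old software
NEW_MAX = 10 * COIN       # 10 BTC limit enforced by the new software


def best_chain(chains, max_reward):
    """Longest chain whose every block respects max_reward (None if none do)."""
    valid = [c for c in chains if all(0 <= r <= max_reward for r in c)]
    return max(valid, key=len, default=None)


def views(strict_chain, loose_chain):
    """Return (chain followed by old software, chain followed by new software)."""
    chains = [strict_chain, loose_chain]
    return best_chain(chains, OLD_MAX), best_chain(chains, NEW_MAX)


# New software has a minority of the hash power: its chain is the shorter one.
strict_short, loose_long = [NEW_MAX] * 4, [OLD_MAX] * 6
old_view_minority, new_view_minority = views(strict_short, loose_long)

# New software has a majority of the hash power: its chain is the longer one.
strict_long, loose_short = [NEW_MAX] * 6, [OLD_MAX] * 4
old_view_majority, new_view_majority = views(strict_long, loose_short)
```

In the minority case the two views differ (old software follows `loose_long`, new software follows `strict_short`), which is the split. In the majority case both views are `strict_long`, so the old software ends up following the 10 BTC chain without ever upgrading.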
As you can see, there is no "flag" and no "transactions" involved in this. It's simply a matter of:
If the change to the consensus rules can be made in a way that the old software doesn't mind, then it is a soft fork and requires only 51% of the global hashpower to enforce if you want to prevent a split chain.
If the change to the consensus rules is not acceptable to the old software, then it is a hard fork and requires an overwhelming majority of the economy (nodes, merchants, miners, etc) to enforce if you want to prevent a split chain.
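Put another way, a change is a soft fork when everything the new rules accept is also accepted by the old rules. A rough sketch of that test follows, assuming rules can be modelled as simple predicates on a block reward. `classify_change` and `max_reward_rule` are hypothetical names, and checking a handful of samples only illustrates the idea; it proves nothing about a real rule change.

```python
COIN = 100_000_000
OLD_MAX = 1_250_000_000   # 12.5 BTC


def max_reward_rule(limit):
    """Build a toy consensus rule: the reward must not exceed the limit."""
    return lambda reward: 0 <= reward <= limit


def classify_change(old_is_valid, new_is_valid, sample_blocks):
    """'hard fork' if any sample passes the new rule but fails the old one."""
    for block in sample_blocks:
        if new_is_valid(block) and not old_is_valid(block):
            return "hard fork"
    return "soft fork"


samples = [0, 10 * COIN, OLD_MAX, 15 * COIN, 20 * COIN]
old_rule = max_reward_rule(OLD_MAX)

raise_to_15 = classify_change(old_rule, max_reward_rule(15 * COIN), samples)
lower_to_10 = classify_change(old_rule, max_reward_rule(10 * COIN), samples)
```

Raising the limit to 15 BTC comes out as a hard fork, because a 15 BTC block passes the new rule but not the old one. Lowering it to 10 BTC comes out as a soft fork, because nothing the new rule allows is rejected by the old rule.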