EDIT: improved based on suggestions of XertroV
To me it seems that the fundamental problem is that the entire block reward (coinbase + fees) is going to a single miner. Note I'm using the word "miner" in the strict sense of the word, i.e., the entity which decides what transactions to put in a block and gets to distribute the reward, i.e., a mining pool operator.
A solution therefore might be to have multiple PoWs and multiple rewards per block, for instance three (that's quite an arbitrary number, perhaps five would work better, but for the sake of the example I'll go with three PoWs per block).
A way to split up the PoWs could be as follows. Let there be PoW1, PoW2 and PoW3. Any one PoW is valid for only a subset of transactions (1/3 of all transactions in the case of three PoWs). However the miner must not know in advance to which subset of transactions it is searching a solution for. So, each miner selects all transactions from the mempool just as-is, and includes transactions to distribute the block reward. If a miner finds a solution (a nonce), then it is broadcast so other miners know that a PoW has been completed, so they can switch to other PoWs that have not yet been found.
The miner that finds the final PoW can broadcast the block. The protocol enforces that the first miner that broadcasts a solution locks in the reward for that PoW. This as an incentive for a miner to immediately broadcast a solution once it finds it. If two or more miners broadcast a solution for a PoW at the same time, the protocol may randomly choose one.
It must be impossible for a miner to know which transactions its PoW is going to be valid for. Therefore, establishing the set of transactions approved by each PoW should be done as late as possible, so only when the block is finalized by the miner that found the last missing PoW and includes all PoWs in the block and broadcasts it. The mapping between a PoW and a set of transactions should be random, deterministic and impossible to manipulate. Since the exact value of a nonce is impossible to manipulate, the mapping should somehow be derived from it. Also, the transaction inputs and outputs can not be manipulated by the miner, so these are also useful. We could take the hash mod 3 of such immutable property of the transaction plus the final nonce, rendering 0, 1 or 2 for each transaction. So hash(tx_property, final_nonce) %3 gives 0, 1 or 2 for each transaction, which in turn dictates if a transaction belongs to PoW1, 2 or 3. This way, a PoW controls only 1/3 of all transactions.
Example
Suppose the miner that found PoW1 included transactions 1, 2, 3, 5, the miner that found PoW2 included 1, 2, 3, 4, 5, 6 and the miner that found PoW3 included 1, 2, 5. Suppose as well, after applying hash(tx_property, final_nonce) %3 to each transaction, it turns out that PoW1 is valid for tx 1, 2, PoW2 is valid for 3, 4 and PoW3 is valid for 5, 6. The set of valid transactions will then be 1, 2 (PoW1), 3, 4 (PoW2) and 5 (PoW3). Transaction 6 is not included in the block as valid, since it belongs to PoW3, and PoW3 did not include transaction 6.
The effect of this will be that controlling > 50% of the network hash rate is not enough to consistently control all transactions. Increasing the number of PoWs per block from three to five reduces the number of transactions under the control of a single large miner even further. Of course, for a large enough share all of hash power, the miner will again be able to control all blocks, but this number will be 80% for five PoWs, plus that it can never be sure it finds the PoW for a malicious self-inserted transaction before another miner, as it can not predict which set of transactions the PoW it finds is valid for. This reduces the ability to abuse its mining power.