Miners have an incentive to reorganize the blockchain if the previous block has high miner fees. For example, if the previous block has 50 BTC in fees and the current block has 12.5 BTC in fees, then a miner with 33% of the hashing power has a greater expected return from re-mining the previous block, even counting only the previous block's fees (50 * .33^2 = 5.445 versus 12.5 * .33 = 4.125). This is not likely to erode trust in the network, because small block conflicts already happen frequently (something like 2% of blocks share a parent with another block) and the miner does not need to change the transaction ordering at all. They are effectively only stealing from other miners, not from the users of the network.
For simplicity, I assume that the only block reward is from mining fees. Having a block subsidy makes the problem much less significant. I also assume that blocks do not have size limits, which means that a miner reorging the previous block can include all of the previous fees and all of the current fees into the previous block. Constrained block sizes also make the problem less significant.
The expected return on mining the current block is (current fees) * (hashing percent).
The expected return on mining the previous block is (current fees + previous fees) * (hashing percent)^2, because you not only need to mine the previous block, but you also need to mine an additional block to become the longest fork.
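The two formulas above can be written out as a small sketch. The function names are mine, purely for illustration, and the model is only the simplified one described here (fees only, no block size limit).

```python
def honest_return(current_fees, hash_share):
    """Expected return from extending the tip: one block must be found."""
    return current_fees * hash_share


def reorg_return(previous_fees, current_fees, hash_share):
    """Expected return from re-mining the previous block.

    Two blocks must be found (the replacement plus one more to become
    the longest fork), and with no size limit the replacement block
    collects both the previous and the current fees.
    """
    return (previous_fees + current_fees) * hash_share ** 2


if __name__ == "__main__":
    # Numbers from the opening example: 50 in previous fees,
    # 12.5 in current fees, 33% of the hashing power.
    prev, cur, share = 50.0, 12.5, 0.33
    print("extend tip:", honest_return(cur, share))
    print("reorg:     ", reorg_return(prev, cur, share))
```

With both sets of fees counted, the reorg strategy comes out further ahead than when only the previous block's fees are counted.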
In a continuous network with no inflation, immediately after the first block is mined, the fees available to the second block are 0. The only miner directly incentivized to mine on the longest fork is the miner that found the block, because the new block has 0 value in fees, and therefore 0 expected return for the other miners. They would instead continue mining on the parent, in hopes of reorganizing the chain and claiming the fees for themselves. As transactions with fees start to roll in, it becomes more profitable to start mining on the new block, because the expected return on mining the new block will be growing much faster than the expected return on mining the previous block.
When hashing power is 10%, it remains valuable to mine on the previous block when (prev fees + current fees) > 10 * current fees
When hashing power is 40%, the tradeoff is at (prev fees + current fees) > 2.5 * current fees, i.e. prev fees > 1.5 * current fees
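For reference, the break-even point follows directly from the two expected-return formulas. The symbols are shorthand introduced here: P for previous fees, C for current fees, h for the miner's share of hashing power (with h > 0).

```latex
\begin{align*}
(P + C)\,h^2 &> C\,h \\
(P + C)\,h   &> C \\
P + C        &> \frac{C}{h}
\end{align*}
```

At h = 0.1 the right-hand side is 10 C. At h = 0.4 it is 2.5 C, which is the same as saying P > 1.5 C.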
My proposed solution is to have a 'mining fee pool', which gets all of the mining fees from each block, and then each block pays a constant fraction of the pool's contents to the block miner. Take for example a pool that pays out 50% each block:
Pool starts at 0 coins
Block 1 has a fee of 8 coins
Block 2 has a fee of 20 coins
Block 3 has a fee of 0 coins
For block 1, 8 coins go into the pool, and then 50% of the pool goes to the miner of block 1. At the end, the miner is rewarded 4 coins, and the pool has 4 coins remaining.
For block 2, 20 coins go into the pool, and then 50% of the pool goes to the miner of block 2. At the end, the miner is rewarded 12 coins, and the pool has 12 coins remaining.
For block 3, 0 coins go into the pool, then the miner is rewarded 6 coins, and the pool has 6 coins remaining.
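The three-block walkthrough above can be reproduced with a short simulation. This is only a sketch of the bookkeeping; the function name and the `payout_fraction` parameter are illustrative choices, not part of any existing implementation.

```python
def simulate_fee_pool(block_fees, payout_fraction=0.5):
    """Return one (miner_reward, pool_after) tuple per block.

    The pool starts empty. For each block, its fees go into the pool,
    then the miner is paid a fixed fraction of the pool.
    """
    pool = 0.0
    history = []
    for fees in block_fees:
        pool += fees                     # all fees enter the pool first
        reward = pool * payout_fraction  # miner takes a constant fraction
        pool -= reward                   # the remainder carries over
        history.append((reward, pool))
    return history


if __name__ == "__main__":
    # Fees from the example: 8, 20, then 0 coins.
    results = simulate_fee_pool([8, 20, 0])
    for height, (reward, pool) in enumerate(results, start=1):
        print(f"block {height}: miner gets {reward}, pool keeps {pool}")
```

Note that block 3 still pays its miner 6 coins despite carrying no fees of its own, which is the property the rest of the argument relies on.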
With a 50% decay, the expected return on mining the current block is (previous reward/2 + new fees/2) * (hashing percent).
The expected return on mining the previous block is (previous reward + new fees / 2) * (hashing percent)^2.
Even when there are no new fees, the expected return on mining the current block will be greater when hashing percent is lower than 50%.
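A quick way to check the 50% crossover is to code up the two pool-based formulas and compare them at a few hash shares. The names below are hypothetical, and the model is just the one stated above with a 50% payout.

```python
def honest_return_with_pool(previous_reward, new_fees, hash_share):
    """Extend the tip: the pool holds what the previous payout left behind
    (equal to previous_reward at a 50% payout) plus the new fees, and the
    miner receives half of it."""
    return (previous_reward / 2 + new_fees / 2) * hash_share


def reorg_return_with_pool(previous_reward, new_fees, hash_share):
    """Re-mine the previous block: claim the previous reward plus half of
    the new fees, but two blocks must be found."""
    return (previous_reward + new_fees / 2) * hash_share ** 2


if __name__ == "__main__":
    previous_reward, new_fees = 10.0, 0.0  # worst case: no new fees at all
    for share in (0.10, 0.33, 0.49, 0.60):
        honest = honest_return_with_pool(previous_reward, new_fees, share)
        reorg = reorg_return_with_pool(previous_reward, new_fees, share)
        better = "extend tip" if honest > reorg else "reorg"
        print(f"share={share:.2f}: honest={honest:.3f} reorg={reorg:.3f} -> {better}")
```

With no new fees the comparison reduces to (previous reward / 2) * h against (previous reward) * h^2, so extending the tip wins exactly when h is below 1/2.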
This comes at a cost. The immediate reward for miners introducing a transaction into the blockchain is reduced by half. The share of the other half of the transaction fee that a miner can expect to claim is equal to the miner's share of the hashing power. A miner with 10% hashing power including a transaction with a fee of 10 coins can expect to get 5.5 of those coins: the first half (5 coins) is awarded immediately, and of the remaining 5 coins the miner's expected reward is 10%, or 0.5 coins. Miners with less hashing power are less incentivized to include low fee transactions in their blocks because of the risks caused by increased propagation times.
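The 5.5-coin figure can be sketched as follows. This uses the same simplification as the paragraph above (the part of the fee left in the pool is treated as being claimed in proportion to hashing power); the function name is made up for illustration.

```python
def expected_fee_share(fee, hash_share, payout_fraction=0.5):
    """Expected coins a miner eventually collects from a fee it includes."""
    immediate = fee * payout_fraction  # paid out in the including block
    deferred = fee - immediate         # left in the pool for later blocks
    return immediate + deferred * hash_share


if __name__ == "__main__":
    # A 10-coin fee included by miners of different sizes.
    for share in (0.01, 0.10, 0.34):
        print(f"{share:.0%} miner expects {expected_fee_share(10, share):.2f} coins")
```

The larger the miner, the more of the deferred half it expects to recover, which is why the cost falls hardest on small miners.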
Interestingly enough, this seems to actually increase decentralization, because the miners accepting the most risk for low-value transactions are the ones with more hashing power. Miners with little hashing power are not going to include low value transactions because their long term expected payout from doing so is less than their long term expected payout from higher value transactions. Their total expected reward per hash is higher, because they still get to reap the benefits from the bigger miners adding the low value transactions:
Assume a world where there is one miner with 34% hashing power, and 66 miners with 1% hashing power each. All transactions are the same size. The risk of including a transaction in a block is 10 coins. Each block, 1 transaction with a 20 coin fee is added, and 1 transaction with a 15 coin fee is added. The 1% miners are incentivized to add only the high fee transaction, because for that transaction the expected rewards outweigh the costs. The 34% miner is incentivized to include all transactions, because the expected long term reward for the miner outweighs the risk even with the low fee transaction. The collective expected return for the miners is:
How to build the equation: (expected payout of including transaction fee in blocks you find + expected payout of pool decay from other miners including those transactions - expected cost of including a transaction because it's heavy), done for each transaction.
66 miners with 1%: (1/2 * 66/100 * 20 + 1/2 * 66/100 * 20 + 1/2 * 66/100 * 15 - 10 * 66/100) = 11.55, or .175 per hashing %
1 miner with 34%: (1/2 * 34/100 * 20 + 1/2 * 34/100 * 20 + 1/2 * 34/100 * 15 + 1/2 * 100/100 * 15 - 10 * 34/100 - 10 * 100/100 ) = 3.45, or .101 per hashing %
The large miner is happy including the heavy transactions because the total amount of fees is going up, and the total expected payout is going up. The little miners are benefiting from the big miner adding these small transactions because they get to share in the wealth created by the decay, even though they are not participating in the cost of including the transaction.
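The two totals in the example can be re-derived term by term. The sketch below simply transcribes the expressions given for the 1% miners and the 34% miner using exact fractions; the constant and function names are mine, and no modelling beyond what is written above is assumed.

```python
from fractions import Fraction

HALF = Fraction(1, 2)  # 50% pool payout
RISK = 10              # cost of including one transaction, in coins
HIGH_FEE = 20
LOW_FEE = 15


def small_miners_total(share):
    """Combined expected return of the 1% miners (high fee transaction only)."""
    return (HALF * share * HIGH_FEE
            + HALF * share * HIGH_FEE
            + HALF * share * LOW_FEE
            - RISK * share)


def large_miner_total(share):
    """Expected return of the 34% miner, who also includes the low fee one."""
    whole = Fraction(1)
    return (HALF * share * HIGH_FEE
            + HALF * share * HIGH_FEE
            + HALF * share * LOW_FEE
            + HALF * whole * LOW_FEE
            - RISK * share
            - RISK * whole)


if __name__ == "__main__":
    small = small_miners_total(Fraction(66, 100))
    large = large_miner_total(Fraction(34, 100))
    print("66 x 1% miners:", float(small), "per hashing %:", float(small / 66))
    print("1 x 34% miner: ", float(large), "per hashing %:", float(large / 34))
```

Using `Fraction` avoids floating-point noise, so the totals match the hand-computed 11.55 and 3.45 exactly.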