1- A pool is paid to include a transaction with a small fee (or no fee at all).
2- A pool reserves a small percentage of each block for low-fee transactions.
3- Someone messes around with a tx and forgets to put the change output back into it, so the fee goes to the moon.
I'm going to deal with them:
1- That's something I've already seen and asked about: https://bitcoin.stackexchange.com/questions/93471/ive-found-two-mined-txs-with-no-fee.
But no problem: as you can see in this section, https://mempoolexplorer.com/block/last/BITCOIND, I can easily see which txs had not been relayed to my node when the block arrived.
I already discount them in the column "Lost reward excl. not in our mempool txs" here: https://mempoolexplorer.com/block/BITCOIND, but not yet here: https://mempoolexplorer.com/miner. I'll do it.
2- If a pool reserves a small percentage for low-fee txs, there's nothing that can be done. But I think this must be taken into account in the "final result", to see how much money they are spending on it.
3- Easy: check whether a transaction deviates a lot from the block average. The threshold will have to be high, since I regularly see txs paying a lot more than is needed.
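To make point 3 concrete, here is a minimal sketch of that check, not what mempoolexplorer.com actually runs. It assumes each tx is a `(txid, fee_sats, vsize_vbytes)` tuple, and the function name `flag_fee_outliers` and the 10x threshold are placeholders I made up. One detail: each tx is compared against the fee rate of the rest of the block, because a single huge fee would otherwise inflate the average it is being compared to.

```python
def flag_fee_outliers(txs, threshold=10.0):
    """Return txids whose fee rate exceeds `threshold` times the
    fee rate of all the *other* txs in the block.

    txs: list of (txid, fee_sats, vsize_vbytes) tuples (assumed layout).
    threshold: hypothetical multiplier; would need tuning on real blocks.
    """
    total_fee = sum(fee for _, fee, _ in txs)
    total_vsize = sum(vsize for _, _, vsize in txs)
    flagged = []
    for txid, fee, vsize in txs:
        rest_vsize = total_vsize - vsize
        if rest_vsize <= 0:
            continue  # single-tx block: nothing to compare against
        rest_rate = (total_fee - fee) / rest_vsize
        # Skip the degenerate case where every other tx pays zero fee.
        if rest_rate > 0 and fee / vsize > threshold * rest_rate:
            flagged.append(txid)
    return flagged


# Made-up block: "c" looks like a forgotten change output.
block = [("a", 1000, 200), ("b", 1200, 250), ("c", 500000, 220), ("d", 900, 180)]
print(flag_fee_outliers(block))  # ['c']
```

Whether a plain average is the right baseline at all is a separate question; this only shows the mechanics of the threshold.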
I have a lot to do, since this needs more graphs and there are a lot of other things I need to solve first.
But I'd like somebody to answer this question: would running several instances in different geographical locations and comparing the results be a good way to start, and would it make the problem of block propagation time less of a concern?
Thanks in advance.
3 - A pool that considers only the on-chain tx fee when deciding which transactions to include in its blocks is solving something similar to a knapsack problem. CPFP transactions make this a little trickier, especially if there is a chain of transactions, some of which may cause the earlier transactions in the chain to get confirmed (for example, if the tx fee values were: Low, Low, Low, High, Low, Low, High, High, Low, Low, Very High). In general, pools choose which transactions to include in a greedy manner, taking the highest fee rate transactions first until there is very little block space left. At that point they may bypass a higher fee rate transaction in favor of slightly lower fee rate transactions if that increases the total fee collected. For example, a pool may choose not to include a transaction that would pay one unit of fees in favor of two transactions that together pay two units of fees. This is possible if the former would leave some block space unused and the latter would leave less block space unused.
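A toy sketch of that end-of-block effect, under stated assumptions: txs are `(txid, fee, size)` tuples with made-up numbers, CPFP/ancestor dependencies are ignored entirely, and `greedy_select` and `best_select` are illustrative names, not any pool's real block assembly code.

```python
from itertools import combinations


def greedy_select(txs, capacity):
    """Take txs in descending fee-rate order, skipping any that no longer fit."""
    chosen, used = [], 0
    for txid, fee, size in sorted(txs, key=lambda t: t[1] / t[2], reverse=True):
        if used + size <= capacity:
            chosen.append(txid)
            used += size
    return chosen


def best_select(txs, capacity):
    """Exhaustive knapsack search; only feasible for a handful of txs."""
    best, best_fee = [], 0
    for r in range(1, len(txs) + 1):
        for combo in combinations(txs, r):
            if sum(size for _, _, size in combo) <= capacity:
                fee = sum(fee for _, fee, _ in combo)
                if fee > best_fee:
                    best, best_fee = [txid for txid, _, _ in combo], fee
    return best


# 10 units of space left: "h" has the best fee rate but wastes 4 units.
leftover = [("h", 7, 6), ("x", 5, 5), ("y", 5, 5)]
print(greedy_select(leftover, 10))  # ['h']       -> 7 in fees
print(best_select(leftover, 10))    # ['x', 'y']  -> 10 in fees
```

Pure greedy keeps the highest fee rate tx and earns 7, while skipping it for the two lower fee rate txs fills the space exactly and earns 10.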
If you do some research on the knapsack problem, you will discover that the average tx fee is not going to tell you much about whether a transaction has an unusually high tx fee. There may be other measures that hint at whether a transaction has an unusually high tx fee.