This is a clever hack but the cost to implement is too high.
If miners restricted themselves to a power-of-2 number of transactions, then the cost would be zero (well, 1 extra transaction).
In the period where minting fees dominate, it doesn't cause that big a problem.
Miners could just stop when they hit a power of 2, unless they have enough to get to the next power of 2.
Even when tx fees dominate, if it was a network rule, then miners could optimize to try to get the block up to 1MB.
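The "stop at a power of 2" rule can be sketched roughly as below. This is only an illustration: power_of_two_cap is a made-up helper name, and I am assuming the count includes the coinbase.

```python
def power_of_two_cap(n_available: int) -> int:
    """Largest power of 2 that is <= n_available (0 if nothing is available).

    Hypothetical helper: n_available is assumed to be the total number of
    transactions the miner could put in the block, coinbase included.
    """
    if n_available < 1:
        return 0
    # bit_length() - 1 is the index of the highest set bit
    return 1 << (n_available.bit_length() - 1)


if __name__ == "__main__":
    for n in (1, 3, 100, 128, 4000):
        print(n, "->", power_of_two_cap(n))
```

So a miner with 100 candidate transactions would stop at 64, and would only go to 128 once it had enough to fill the next level.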
The cost is a tradeoff. I am thinking of a situation where a node just wants to download and verify the headers.
If the extra field is in the coinbase transaction, then you have to provide the entire merkle path down to the coinbase.
If a block has 65 - 128 transactions, then the tree depth is 7. The path to the coinbase is 32 * 7 = 224 bytes.
You also need to provide the coinbase.
I guess it isn't that big a cost. The "extended" header would be 80 + 224 + coinbase_size.
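As a quick check of that arithmetic, here is a small sketch. The function names are mine, and the 100-byte coinbase in the example is just an assumed figure, not a measured one.

```python
HEADER_SIZE = 80  # bytes in a standard block header
HASH_SIZE = 32    # bytes per hash on the merkle path


def merkle_depth(n_tx: int) -> int:
    """Number of hashes on the path from a leaf to the root, i.e. ceil(log2(n_tx))."""
    if n_tx < 1:
        raise ValueError("a block has at least the coinbase transaction")
    return (n_tx - 1).bit_length()


def extended_header_size(n_tx: int, coinbase_size: int) -> int:
    """80-byte header + merkle path to the coinbase + the coinbase itself."""
    return HEADER_SIZE + HASH_SIZE * merkle_depth(n_tx) + coinbase_size


if __name__ == "__main__":
    # Any block with 65 to 128 transactions has depth 7, so the path is 224 bytes.
    print(merkle_depth(65), merkle_depth(128))
    print(HASH_SIZE * merkle_depth(100))
    # Assumed 100-byte coinbase, purely for illustration.
    print(extended_header_size(100, coinbase_size=100))
```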
A compromise would be to add an extra transaction to the block right after the coinbase.
0: coinbase
1: tx-pad
This would allow the coinbase to be large without making the virtual header larger.
With 4096 transactions, that is a depth of 12, and the total header size would be:
Path: 11 * 32
Input tx: 32
Aux header: 32
Extra length: 1
That gives 417 bytes (plus the extra fields). The original proposal would give 145-byte headers (plus extra fields).
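The breakdown above can be reproduced with a short sketch. The names are hypothetical and the item labels are taken straight from the list. Note that the listed items sum to 417 without the 80-byte block header, while 145 matches 80 + 32 + 32 + 1, so the two figures may not be counted on the same basis; that reading of 145 is my assumption.

```python
BASE_HEADER = 80  # bytes in a standard block header
HASH_SIZE = 32    # bytes per hash


def tx_pad_overhead(n_tx: int) -> int:
    """Sum of the listed items for the tx-pad layout (80-byte header not included)."""
    depth = (n_tx - 1).bit_length()   # 12 for 4096 transactions
    path = (depth - 1) * HASH_SIZE    # "Path: 11 * 32"
    input_tx = HASH_SIZE              # "Input tx: 32"
    aux_header = HASH_SIZE            # "Aux header: 32"
    extra_length = 1                  # "Extra length: 1"
    return path + input_tx + aux_header + extra_length


if __name__ == "__main__":
    overhead = tx_pad_overhead(4096)
    print(overhead)                # listed items only
    print(BASE_HEADER + overhead)  # with the 80-byte header added
    # Assumed breakdown of the 145-byte figure: header + two hashes + length byte.
    print(BASE_HEADER + HASH_SIZE + HASH_SIZE + 1)
```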
Could you name some cases in which we need extra header fields, and where putting the info in the coinbase (like block v2) is not enough? (e.g. your sub-block proposal)
The sub-block thing doesn't require a header change (the latest proposal definitely doesn't). It just requires distribution of headers for blocks that didn't quite meet POW.
I was thinking in general.
I think it would be a good idea to reserve 32 bytes in the coinbase transaction as the hash of the extra header info.
The "high hash highway" is a pretty good idea but it adds a 2nd link into the block header.