If a transactions is broadcasted and everyone receives the transaction, why not add a hash (256 bits) of the transaction in a block body? Propagation time of a block would be similar, the number of transactions send in a block is increased by a factor 8-ish.
The problem is that not everyone receives the transaction.
And even if they do, they may not store it in their memory pool.
Having learning about the new block (that these days usually contains over thousand txs), you will most likely
always be missing some of them - even if your node has been online for weeks.
You need each single transaction to verify the new block and that is why the protocol sends you all the txs along with each block.
There is obviously a space for some optimization here, but the problem is actually quite complex and I think that is why nobody has implemented any solution for it yet.
Just think of a specific scenario:
Your node receives a notification announcing a new block. You know it is a new block, but you still don't know what transactions are inside it. So you need to ask the peer who knows that block already for the list of the tx_ids inside it - best case scenario, it will come back after the ping time, plus the time needed to transfer the data (which often would exceed 100KB).
Now, having the list of tx_ids already, the odds are that you will miss some of the txs and you will need to ask the peer for them. This requires another ping time plus the time to transfer back the data - and that is the problem.
What I am saying is that in this scenario you need two request to acquire the content of the block: first for txs list, second for txs you didn't know.
You cannot do these two in parallel - need to proocess the answer for the first request, to issue the second one.
And currently you need only one request: asking for the entire block, with all the transactions.
It obviously depends on the specifics of the connection to the peer (ping vs bandwidth), but in general it is questionable whether the scenario with two requests would give you the entire content of the block faster (aka block propagation speed). It would most likely require to send less data, but causing an additional ping-long delay for the additional request.