Erlay is something different, dealing with the duplication of relaying transaction invs to/from multiple peers.
Each node receives each block once. So in aggregate, compact blocks reduce block-related network bandwidth usage to roughly half (it would be exactly half if compact blocks were perfectly efficient, taking zero bytes per txn).
The halved bandwidth itself isn't very important, but it made people more confident that the increased effective block size limit from segwit would not make things much worse than they had been. (Also, block-related bandwidth is only a small part of a node's usage. As the erlay writeup notes, nodes spend a lot of bandwidth on INV messages because, unlike blocks and transaction bodies, they need to be exchanged with every peer instead of being received only once.)
There isn't really any complexity in figuring out the exact bandwidth savings for your own node: the debug logging for compact blocks is sufficient to work it out. In practice the saving per block is pretty close to the size of the block minus the marginal size of the compact block (6 bytes per transaction).
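The accounting above can be sketched as a few lines of arithmetic. This is only a rough model: the function names and the example block (2 MB, 2,200 transactions) are made up for illustration, and it ignores the header, prefilled transactions, and any transactions that have to be fetched because they were missing from the mempool.

```python
SHORT_ID_BYTES = 6  # marginal compact block cost per transaction

def compact_block_savings(block_bytes, n_tx):
    """Approximate bytes saved on one block: full size minus 6 bytes per txn.

    Simplification: header and prefilled/missing transactions are ignored.
    """
    return block_bytes - SHORT_ID_BYTES * n_tx

def aggregate_fraction(block_bytes, n_tx):
    """Transaction-plus-block bandwidth with compact blocks, relative to without."""
    without_cb = 2 * block_bytes                   # txn data once loose, once in the block
    with_cb = block_bytes + SHORT_ID_BYTES * n_tx  # txn data once, plus short IDs
    return with_cb / without_cb

# Hypothetical block: 2 MB holding 2,200 transactions.
print(compact_block_savings(2_000_000, 2_200))
print(round(aggregate_fraction(2_000_000, 2_200), 4))
```

With these numbers the compact block costs about 13 kB, so the aggregate lands just above one half rather than exactly on it.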
The bigger effect, however, is on latency. The latency to relay a block is the time it takes to transfer it plus processing. The transmission serialization delay goes from that of two megabytes to that of ~13kb, which is a substantial speedup. Because the information needed to relay a block is so small, nodes can ask a limited number of peers to send them new blocks without the peer first asking whether the node already has them, resulting in a bit of waste but eliminating a half round trip time.
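To put rough numbers on the serialization delay, here is a minimal sketch. The link speeds are assumptions chosen for illustration, and the model covers only the time to push bytes onto the wire, not propagation delay or processing.

```python
def serialization_delay_s(size_bytes, link_mbps):
    """Seconds needed to push size_bytes onto a link of link_mbps megabits/second."""
    return size_bytes * 8 / (link_mbps * 1_000_000)

FULL_BLOCK_BYTES = 2_000_000  # "two megabytes" from the text
COMPACT_BYTES = 13_000        # "~13kb" from the text

for mbps in (10, 100):  # assumed uplink speeds
    full_ms = serialization_delay_s(FULL_BLOCK_BYTES, mbps) * 1000
    compact_ms = serialization_delay_s(COMPACT_BYTES, mbps) * 1000
    print(f"{mbps} Mbit/s: full block {full_ms:.1f} ms, compact block {compact_ms:.2f} ms")
```

On the assumed 10 Mbit/s link that is roughly 1.6 seconds against roughly 10 milliseconds per hop, and the delay is paid again at every hop the block crosses.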
Block transmission latency is important because delays in transmission create an advantage for higher hashpower miners over lower hashpower miners, a source of centralization pressure.
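One way to see why delay favors larger miners is a toy model, which is my own simplification and not something from BIP152: treat block discovery as a Poisson process with a 600 second average interval, and assume a miner never races against its own block. The chance that the rest of the network finds a competing block while a new block is still propagating then shrinks as the miner's own hashpower share grows.

```python
import math

BLOCK_INTERVAL_S = 600  # average time between blocks

def race_probability(delay_s, own_share):
    """Chance that other miners find a competing block during delay_s.

    Toy model: Poisson block discovery, and the miner with hashpower
    fraction own_share is already mining on its own block, so only the
    remaining (1 - own_share) of the hashpower can produce a competitor.
    """
    competing_rate = (1 - own_share) / BLOCK_INTERVAL_S
    return 1 - math.exp(-competing_rate * delay_s)

for share in (0.01, 0.30):    # hypothetical small and large miner
    for delay in (10.0, 0.5):  # hypothetical propagation delays in seconds
        p = race_probability(delay, share)
        print(f"share {share:.0%}, delay {delay:>4} s: race chance {p:.4%}")
```

In this model, cutting the delay shrinks the race chance for everyone, which also shrinks the gap between the small and the large miner.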
The reduced size also allows getting the block from multiple peers concurrently without waiting for a long timeout, which improves robustness to some attacks.
Even when BIP152 was created we knew how to reduce the size much further. For example, the original writeup that led to BIP152,
https://nt4tn.net/tech-notes/201512.efficient.block.xfer.txt (and the related
https://nt4tn.net/tech-notes/201512.lowlatency.block.xfer.txt), describes additional techniques that bring sizes down much further (the writeup says <2kb, but subsequent prototyping showed under 900 bytes is realistic, though getting to that size requires miners to construct blocks in a predictable order, which they usually do). But these extra steps come at a considerable cost in code and computational complexity, and might not even reduce latency much except on the fastest computers because of the extra cpu time needed to decode.
(Those techniques showed up as part of erlay, for transaction relay, which isn't latency-limited, so the fact that they can be slow to decode doesn't negate their benefits.)