The solution to your problem is to get a more powerful CPU and a lot more RAM. To be quite honest, that's the only solution to your problem.
It is not the only solution. For example: if non-interactive transaction joining were available at the protocol level, then the same number of transactions per second could be written on-chain in fewer bytes. That means if you can batch things, you can keep the block size quite small while still confirming a lot of transactions. And the same is true for data compression: if you compress, for example, historical address reuse, then the Initial Blockchain Download gets faster.
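To put some rough numbers on the joining part, here is a back-of-the-envelope sketch in Python. The vbyte constants are approximate P2WPKH figures and the functions are purely illustrative, not any real API; with today's protocol, joining mainly shares the per-transaction overhead, and bigger savings would need something like cross-input signature aggregation, which does not exist on-chain yet.

```python
# Rough sketch: how many vbytes do N separate payments take, compared to the
# same N payments joined into a single transaction? The size constants are
# approximate P2WPKH figures; exact numbers depend on script types.

TX_OVERHEAD = 10.5   # version, locktime, input/output counts, segwit marker
INPUT_SIZE  = 68.0   # outpoint + witness signature + public key
OUTPUT_SIZE = 31.0   # amount + P2WPKH scriptPubKey

def separate_txs(n_payers: int) -> float:
    # Each payer broadcasts their own 1-input, 2-output transaction
    # (payment + change).
    return n_payers * (TX_OVERHEAD + INPUT_SIZE + 2 * OUTPUT_SIZE)

def joined_tx(n_payers: int) -> float:
    # All payers join into one transaction: n inputs, n payments, n change
    # outputs, but only one copy of the per-transaction overhead.
    return TX_OVERHEAD + n_payers * (INPUT_SIZE + 2 * OUTPUT_SIZE)

for n in (2, 10, 100):
    sep, joined = separate_txs(n), joined_tx(n)
    print(f"{n:3} payers: {sep:8.1f} vB separate, {joined:8.1f} vB joined "
          f"({100 * (1 - joined / sep):.1f}% saved)")
```

The savings here come only from sharing the transaction envelope; the point is that protocol-level batching shrinks the bytes per payment without reducing the number of payments confirmed.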
but how is downloading the entire blockchain apparently some kind of CPU/RAM/SSD/HDD-intensive process?
Because it is done in a download-and-verify, trustless style. If you just select your bitcoin directory and copy-paste it on your hard drive, it could take minutes, maybe hours, depending on your hardware. However, if you want to get that data from another peer over the Internet, make it trustless, and assume that some node may be malicious and serve invalid data, then guess what: you have to verify it! And that verification is the main bottleneck.
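Here is a toy sketch (not real consensus code) of why the bytes themselves are cheap but accepting them trustlessly is not: every header has to commit to its predecessor and meet a proof-of-work target, so a verifying node has to do the hashing (and, for full blocks, the signature checks) itself instead of just writing whatever a peer sent to disk.

```python
# Toy "download and verify" sketch. ToyHeader and verify_chain are
# illustrative stand-ins, not Bitcoin's real header format or consensus rules.
import hashlib
from dataclasses import dataclass

@dataclass
class ToyHeader:
    prev_hash: bytes    # hash of the previous header
    merkle_root: bytes  # commitment to the block's transactions
    nonce: int

    def hash(self) -> bytes:
        data = self.prev_hash + self.merkle_root + self.nonce.to_bytes(8, "little")
        return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def verify_chain(headers, target: int) -> bool:
    prev = b"\x00" * 32                       # toy genesis
    for h in headers:
        if h.prev_hash != prev:
            return False                      # broken linkage: reject
        if int.from_bytes(h.hash(), "big") > target:
            return False                      # not enough proof of work
        prev = h.hash()
    return True

def mine(prev_hash: bytes, merkle_root: bytes, target: int) -> ToyHeader:
    nonce = 0                                 # brute-force a valid nonce
    while True:
        h = ToyHeader(prev_hash, merkle_root, nonce)
        if int.from_bytes(h.hash(), "big") <= target:
            return h
        nonce += 1

TARGET = 1 << 250                             # very easy toy difficulty
chain, prev = [], b"\x00" * 32
for i in range(5):
    block = mine(prev, hashlib.sha256(f"txs {i}".encode()).digest(), TARGET)
    chain.append(block)
    prev = block.hash()

print(verify_chain(chain, TARGET))            # True
chain[2].merkle_root = b"\x11" * 32           # a peer serves tampered data...
print(verify_chain(chain, TARGET))            # False: verification catches it
```

In the real client the expensive part is not the header hashing but checking every signature and updating the UTXO set, which is why CPU, RAM, and disk access matter so much more than raw bandwidth.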
And of course, you can skip verification if you really want. Many altcoin users have shared their bitcoin directory with an already created and verified database. But guess what: that way you don't verify, you just trust. And the whole problem of verification time is not really about existing users and nodes, which already know that the history is correct. It is about new nodes, which have to verify everything for the first time, and about some pruned nodes, which may need to re-download and re-verify the chain if their pruned node crashes for whatever reason (or if there is a chain reorganization deeper than their pruning point).
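The pruning case can be made concrete with a small sketch. The 288-block figure matches Bitcoin Core's minimum pruning depth; everything else here (function names, parameters) is just illustrative:

```python
# Hedged sketch of the constraint described above: if a reorg forks below the
# oldest block a pruned node still keeps on disk, the node cannot rewind
# locally and has to re-download and re-verify those old blocks.

MIN_BLOCKS_TO_KEEP = 288   # Bitcoin Core keeps at least ~2 days of blocks

def can_reorg_locally(tip_height: int, fork_height: int,
                      keep_blocks: int = MIN_BLOCKS_TO_KEEP) -> bool:
    lowest_kept = max(0, tip_height - keep_blocks + 1)
    return fork_height >= lowest_kept

print(can_reorg_locally(tip_height=800_000, fork_height=799_900))  # True
print(can_reorg_locally(tip_height=800_000, fork_height=790_000))  # False: re-download needed
```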
Why not break it into steps:
1) download all the raw data
2) once the download is complete, process the data
To me, that seems more reasonable than trying to do both things at once.
Because then you may waste a lot of resources if you encounter a malicious node. For example: some ASIC user may want to test "the longest chain rule". It is easy to produce one million blocks at minimal difficulty. If you don't verify anything, you will download a lot of gigabytes, only to notice that since block 123,456 something is wrong, and that there is another chain which is shorter but heavier, containing more total chainwork (there is a rough sketch of that comparison below). And it is not only about that: there are many attacks where something could go wrong, and where you may end up downloading a lot of data from some peer only to conclude in the end that the whole chain has been invalid since block number N. Another example: the sigops limit per block. Even some mining pools sometimes get it wrong:
https://bitcointalksearch.org/topic/error-connectblock-too-many-sigops-invalidchainfound-invalid-block-5447129
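Coming back to the chainwork example, here is a quick sketch of why block count does not decide anything. The "real" difficulty value is just a ballpark, and block_work uses the standard 2^256 / (target + 1) approximation:

```python
# Rough illustration of "most chainwork wins, not most blocks": a short chain
# of high-difficulty blocks outweighs a million blocks mined at difficulty 1.

MAX_TARGET = 0xFFFF * 2**208      # difficulty-1 target (nBits 0x1d00ffff)

def block_work(target: int) -> int:
    # Same idea as Bitcoin Core's GetBlockProof(): expected hashes per block.
    return 2**256 // (target + 1)

min_diff_chain = 1_000_000 * block_work(MAX_TARGET)

# Ballpark difficulty of ~80 trillion (illustrative, not the current value):
real_target = MAX_TARGET // 80_000_000_000_000
real_chain  = 100 * block_work(real_target)          # only 100 blocks

print(f"1,000,000 minimum-difficulty blocks: {min_diff_chain:.3e} work")
print(f"      100 real-difficulty blocks:    {real_chain:.3e} work")
print("shorter chain wins:", real_chain > min_diff_chain)
```

So a node that downloads everything first and verifies later can be tricked into fetching gigabytes of cheap blocks, while verifying headers as they arrive lets it reject that chain after only kilobytes of data.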