Hold on... I think I misunderstood what you said at first.
You say the header is calculated first, then the rest of the header is calculated with the nonce. What I meant was: can the whole of the header be calculated separately, and the nonce separately, and then the two combined? It seems unlikely, but I thought I'd ask here in case someone has figured out how to.
Please explain what you mean when you say "compute the header". The header is not "computed", it is built. The only piece of the header that is "computed" is the Merkle Root. The nonce is also not "computed", it is chosen.
So, first the header is built, then the nonce is chosen, then a double SHA256 hash is computed of the assembled header and nonce together. If the hash is lower than the current difficulty target, then the block is complete and gets broadcast. If it is not lower than the current difficulty target, then a new nonce is chosen and the double SHA256 hash is computed again. This process of choosing a nonce and computing the hash is repeated until all possible nonce values have been tried or a hash is found to be lower than the current difficulty target.
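To make that loop concrete, here is a rough Python sketch of it. The function names (`build_header_prefix`, `mine`) and the sample field values are made up for illustration, and real miners do this in optimized hardware, not Python. It assumes the usual 80-byte header layout (version, previous block hash, Merkle root, timestamp, bits, nonce) and that the hash is read as a little-endian number when compared with the target. As far as I know the actual rule is "less than or equal to the target", so the comparison is written that way.

```python
import hashlib
import struct

def double_sha256(data: bytes) -> bytes:
    """SHA256 applied twice, as described in the post above."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def build_header_prefix(version: int, prev_hash: bytes, merkle_root: bytes,
                        timestamp: int, bits: int) -> bytes:
    """Build (not compute) the first 76 bytes of the header: everything but the nonce."""
    return (struct.pack("<I", version) + prev_hash + merkle_root
            + struct.pack("<II", timestamp, bits))

def mine(prefix: bytes, target: int, max_nonce: int = 2**32):
    """Choose nonces one by one until a hash meets the target or nonces run out."""
    for nonce in range(max_nonce):
        header = prefix + struct.pack("<I", nonce)  # nonce is the last 4 bytes
        digest = double_sha256(header)
        if int.from_bytes(digest, "little") <= target:
            return nonce, digest
    return None  # every nonce in the range was tried without success

if __name__ == "__main__":
    # Placeholder field values and a deliberately easy target, so this finishes quickly.
    prefix = build_header_prefix(1, bytes(32), bytes(32), 0, 0x1d00ffff)
    print(mine(prefix, 2**248))
```

Note that nothing about the nonce is hashed on its own: each attempt hashes the assembled header with the nonce already in place.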
Since the nonce is at the very end of the header, and since SHA256 processes its input 64 bytes at a time, it is possible to get partway through the first SHA256 calculation and store that value. The header is 80 bytes, so its first 64 bytes form a complete chunk that does not contain the nonce, and the intermediate state after that chunk can be saved. Then each time a new nonce is chosen, it's possible to start from the stored state rather than needing to recompute the entire hash from the beginning.
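Here is a small Python sketch of that shortcut. It is only an approximation of what mining software does: Python's `hashlib` doesn't expose the raw SHA256 state, so this leans on the hash object's `copy()` method, which clones the state after the nonce-free part of the header has been fed in. The function name `hash_with_saved_state` is made up for the example.

```python
import hashlib
import struct

def hash_with_saved_state(prefix: bytes, nonces):
    """Yield (nonce, double-SHA256 digest) pairs, reusing the work done on `prefix`.

    `prefix` is the 76 header bytes that come before the nonce.
    """
    base = hashlib.sha256()
    base.update(prefix)  # done once; this part does not depend on the nonce
    for nonce in nonces:
        h = base.copy()                      # start from the stored state
        h.update(struct.pack("<I", nonce))   # only the 4 nonce bytes are new
        first = h.digest()
        yield nonce, hashlib.sha256(first).digest()  # second SHA256 is always done in full

if __name__ == "__main__":
    prefix = bytes(76)  # placeholder header bytes
    for nonce, digest in hash_with_saved_state(prefix, range(3)):
        print(nonce, digest.hex())
```

The result is identical to hashing the full 80 bytes from scratch each time. Only the first of the two SHA256 passes benefits, since the second pass takes the first pass's output as its input and that changes with every nonce.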