Regarding CPLD:
There are no CPLDs available that combine a small pin count with many macrocells.
The smallest I have found is from Lattice in a QFN-32 package, which has enough space to fit an I2C slave and the receive logic.
But if it's possible to go without a CPLD, that would be the better way, though it will need some rework of the ISR handling.
I have another idea I wanted to test, but due to workload and how long it takes me to get chips here I haven't pursued it. I wanted to use an 8-pin serial SRAM and 4 normal I/O pins on the PIC. In this setup the ASIC output needs to go through a NOR gate and a Schmitt-input tri-state buffer with its OE hooked to the PIC.
SI - goes to PIC and output of RES_N buffer
SO - goes to PIC
SCK - goes to PIC and output of NOR via buffer
OE (of buffer) - goes to PIC
In operation the PIC disables the ASIC output and sends data to the SRAM to configure it for writing. Then it enables the ASIC output and waits, monitoring the SI line for a change. When data comes from the ASIC it is written straight into the SRAM (sequential mode, address auto-increments), which can handle up to 16 MHz. Once 32 bits are written, the PIC takes control, sends the signals to read the data back, and re-initializes the SRAM for write mode again.
This method obviously has a brief dead zone after each nonce, but it allows high-speed capture with no timing issues and uses only a small 8-pin part with no programming necessary. Regarding the dead zone, we have that with the current method too, since while the first nonce is being sent to the host the PIC won't be able to respond quickly to result interrupts. The dead zone could be removed by using 2 SRAM chips, but it starts getting complicated.
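The command traffic the PIC would send around each capture can be sketched roughly as below. This is only an illustration under assumptions: the helper names are made up, the opcode and mode values are what I recall from the 23K640 datasheet and should be checked against it, and the nonce is assumed to arrive MSB first. The actual pin toggling on the PIC is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Instruction and mode values as read from the 23K640 datasheet;
   verify them against your part before relying on them. */
#define SRAM_CMD_WRSR  0x01u  /* write status (mode) register */
#define SRAM_CMD_WRITE 0x02u  /* start a write at the given address */
#define SRAM_CMD_READ  0x03u  /* start a read at the given address */
#define SRAM_MODE_SEQ  0x40u  /* sequential mode: address auto-increments */

/* Hypothetical helper: frame that puts the SRAM into sequential mode. */
size_t sram_frame_set_sequential(uint8_t *buf)
{
    assert(buf != NULL);
    buf[0] = SRAM_CMD_WRSR;
    buf[1] = SRAM_MODE_SEQ;
    return 2;
}

/* Hypothetical helper: command byte followed by a 16-bit address, MSB first.
   After a WRITE header the PIC would enable the buffer (OE) and let the
   ASIC clock its 32 bits into the SRAM. */
size_t sram_frame_header(uint8_t cmd, uint16_t addr, uint8_t *buf)
{
    assert(buf != NULL);
    buf[0] = cmd;
    buf[1] = (uint8_t)(addr >> 8);
    buf[2] = (uint8_t)(addr & 0xFFu);
    return 3;
}

/* Assemble the nonce from the 4 bytes read back after a READ header.
   Assumes the ASIC shifts the nonce out most significant bit first. */
uint32_t nonce_from_bytes(const uint8_t *b)
{
    assert(b != NULL);
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}
```

Per nonce the order would be: mode frame (once at start-up), WRITE header at address 0, hand the bus to the ASIC, then READ header at address 0 and four bytes back.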
Here's a low cost serial SRAM - 0.57 in qty 100. That was the cheapest I found.
http://mouser.com/ProductDetail/Microchip-Technology/23K640-I-SN/?qs=sGAEpiMZZMs6Aik9Fp479oRJ8qzeMKM7vL%2fWRv1ed7o%3d
Currently, I have to rewrite the code so that as much as possible is handled in the main polling loop, outside ISR time. This means that, e.g., the I2C code will have to run in the main loop, with state changes triggered by interrupts. Any interrupt needs to take only a couple of µs. USB will need to use the polling method.
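A minimal sketch of that split, assuming a simple flag-based design: the ISRs only set a bit and return, and the main loop does the real work. All names here (isr_i2c, isr_result, main_loop_step, the three I2C states) are hypothetical and not taken from the actual firmware. On a real PIC the flag read-and-clear would need interrupts disabled for a moment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EV_I2C    0x01u  /* I2C peripheral needs service */
#define EV_RESULT 0x02u  /* ASIC result line fired */

typedef enum { I2C_IDLE, I2C_ADDRESSED, I2C_DATA } i2c_state_t;

typedef struct {
    i2c_state_t i2c;           /* state machine advanced in the main loop */
    unsigned results_handled;  /* result events processed so far */
} app_t;

/* Written by the ISRs, consumed by the main loop. */
static volatile uint8_t pending_events;

/* ISRs stay tiny: record the event and return. */
void isr_i2c(void)    { pending_events |= EV_I2C; }
void isr_result(void) { pending_events |= EV_RESULT; }

/* Fetch and clear the pending flags. On real hardware, disable
   interrupts around these two statements so no event is lost. */
static uint8_t take_events(void)
{
    uint8_t ev = pending_events;
    pending_events = 0;
    return ev;
}

/* One pass of the main polling loop. Note that a single flag bit
   coalesces repeated interrupts of the same kind between passes. */
void main_loop_step(app_t *app)
{
    assert(app != NULL);
    uint8_t ev = take_events();

    if (ev & EV_I2C) {
        /* The slow I2C handling runs here, outside ISR time. */
        switch (app->i2c) {
        case I2C_IDLE:      app->i2c = I2C_ADDRESSED; break;
        case I2C_ADDRESSED: app->i2c = I2C_DATA;      break;
        case I2C_DATA:      app->i2c = I2C_IDLE;      break;
        }
    }
    if (ev & EV_RESULT) {
        app->results_handled++;
    }
}
```

The main loop would call main_loop_step() continuously, with the USB polling call sitting in the same loop.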