Originally I disabled the USB interrupts during result capture (32 bits, 4 bytes, 8 µs/byte) because I wanted to ensure that nothing disturbed it. That was unacceptable to the USB stack, presumably because it has to react to bus hardware conditions. After I removed that code the disconnects stopped, but I feared the results would overrun if the USB stack took too much time. That doesn't appear to be an issue. The UART interrupt used to handle one byte per entry; I have now modified it to handle both bytes the PIC could have in its FIFO before letting go. So as long as the USB stack doesn't take more than 16 µs right in the middle of a result nonce it won't overrun, and if it does, the overrun gets detected and counted. So far I've not seen any counted overruns, so overruns don't seem to be the reason for the HW errors, which still occur sometimes.
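For anyone following along, here is a rough sketch of the "drain both FIFO bytes, count overruns" idea in plain C. It is not the actual firmware: the two-byte FIFO is modelled in software, and every name in it (uart_model_t, nonce_rx_t, uart_model_receive, uart_rx_isr) is made up for illustration. On the real part the FIFO and the overrun flag would be hardware registers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of a 2-byte-deep UART receive FIFO. */
typedef struct {
    uint8_t fifo[2];
    uint8_t count;     /* bytes currently waiting in the FIFO */
    bool    overrun;   /* set when a byte arrives while the FIFO is full */
} uart_model_t;

/* Receive state for one 32-bit (4-byte) result nonce. */
typedef struct {
    uint8_t  bytes[4];
    uint8_t  have;     /* bytes of the current nonce collected so far */
    uint32_t overruns; /* overruns are counted, not silently ignored */
} nonce_rx_t;

/* Model of the serial line delivering one byte into the FIFO. */
void uart_model_receive(uart_model_t *u, uint8_t b)
{
    if (u->count == 2) {   /* FIFO already full: this byte is lost */
        u->overrun = true;
        return;
    }
    u->fifo[u->count++] = b;
}

/* ISR body: note any overrun, then drain every waiting byte before
   returning. Returns true once four bytes have been collected; the
   caller is expected to reset rx->have to 0 after using the nonce. */
bool uart_rx_isr(uart_model_t *u, nonce_rx_t *rx)
{
    if (u->overrun) {
        rx->overruns++;
        u->overrun = false;
    }
    while (u->count > 0) {
        uint8_t b = u->fifo[0];
        u->fifo[0] = u->fifo[1];   /* shift the second byte forward */
        u->count--;
        if (rx->have < 4)
            rx->bytes[rx->have++] = b;
    }
    return rx->have == 4;
}
```

With two bytes buffered at 8 µs per byte, the ISR can be held off for roughly 16 µs before a byte is lost, which is where the figure above comes from.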
Looking at a screenshot on the forums of a USB Erupter running under cgminer, it showed around 10% HW errors, so I wonder if a higher HW error rate is normal compared to GPU mining, where I can go for weeks with no HW errors. If Avalons typically run with HW errors then maybe I'm wasting my time trying to get it down to GPU levels? Any input from Avalon owners about typical HW error counts?
My 3-module Avalons vary between 1.1% and 2.2% HW errors according to CK's formula: 100 * HW / (diff1shares + HW)
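In case it helps to compare numbers, the formula is simple to put in a small C helper. The function name hw_error_percent and the choice to return 0 when both counters are zero are my own assumptions, not something taken from cgminer.

```c
/* HW error percentage: 100 * HW / (diff1shares + HW).
   hw          = number of hardware errors reported
   diff1shares = number of difficulty-1 shares found */
double hw_error_percent(unsigned long hw, unsigned long diff1shares)
{
    unsigned long total = diff1shares + hw;

    /* Assumption: report 0% rather than divide by zero when no work
       has been returned yet. */
    if (total == 0)
        return 0.0;

    return 100.0 * (double)hw / (double)total;
}
```

As a made-up example, 11 HW errors against 989 diff1 shares gives 100 * 11 / 1000, so about 1.1%.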
I am not trying to be smart, just trying to help with my common sense (and a long-gone history of assembly hacking).
I had a quick look at the PIC16(L)F1454/5/9 preliminary data sheet, focussing on interrupts. If I understand correctly, when an interrupt occurs the chip saves the context automatically and jumps to the interrupt service routine (ISR). There you need to determine the source of the interrupt by polling the interrupt flag bits, handle the interrupt, clear the flag for that source, and return from interrupt (RETFIE), which restores the context and sets the global interrupt enable bit (GIE) again. If a new interrupt is pending, it will be handled after one instruction has been executed after returning from the interrupt.
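To make the flow concrete (enter the ISR, poll the flags to find the source, handle it, return), here is a small sketch in plain C. It is only a model: irq_model_t and its fields are hypothetical stand-ins for the hardware flag bits, and it does not represent context saving, GIE, or RETFIE. How a flag actually gets cleared depends on the peripheral, so check the data sheet for each source.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for two hardware interrupt flag bits,
   plus counters so the effect of the ISR can be observed. */
typedef struct {
    bool     usb_flag;
    bool     uart_rx_flag;
    unsigned usb_handled;
    unsigned uart_handled;
} irq_model_t;

/* Single ISR entry point: every source lands here, so each flag is
   polled in turn. Checking all flags in one pass also services a
   second source that became pending while the first was handled. */
void isr(irq_model_t *irq)
{
    if (irq->usb_flag) {
        irq->usb_handled++;          /* handle the USB event */
        irq->usb_flag = false;       /* model: flag is now cleared */
    }
    if (irq->uart_rx_flag) {
        irq->uart_handled++;         /* handle the received byte(s) */
        irq->uart_rx_flag = false;   /* model: flag is now cleared */
    }
    /* On hardware, the return from interrupt would happen here. */
}
```

If a flag is left set on exit, the same interrupt would be expected to fire again straight away, which is why each branch clears its own flag before falling through.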
Maybe your 'unacceptable to the USB stack' problem was caused by a second interrupt becoming pending while you were still handling the first one?