Create a tiny interrupt handling routine that gets one byte, stores it in a buffer, increments the store pointer, and returns from the interrupt. Mean and lean!
Something like:
const int dataBufferLength = 32;
byte interruptDataBuffer[dataBufferLength];
byte* interruptDataBufferAddress = interruptDataBuffer;  // next position the handler writes to
byte* lastReadBufferAddress = interruptDataBuffer;
handleInterrupt:
    reg a = getDataByteFromInterrupt();
    get ix from interruptDataBufferAddress;
    store a at ix;
    increment ix;
    // wrap to the start of the buffer, not to the stored write pointer
    if (ix == interruptDataBuffer + dataBufferLength) ix = interruptDataBuffer;
    store ix at interruptDataBufferAddress;
    returnFromInterrupt;
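For anyone who wants to try the idea off-target, here is a minimal C sketch of the same ring buffer. It is only an illustration under assumptions: `fake_data_register` and `get_data_byte_from_interrupt()` are hypothetical stand-ins for whatever register the real hardware exposes, `read_pending_byte()` is an invented name for the main-loop side, and indices are used instead of pointers. Like the pseudocode, it has no overflow check, so if the handler laps the reader the unread bytes are lost.

```c
#include <stdint.h>

#define DATA_BUFFER_LENGTH 32u  /* same size as in the pseudocode */

static volatile uint8_t interrupt_data_buffer[DATA_BUFFER_LENGTH];
static volatile uint8_t write_index = 0;  /* advanced only by the handler */
static uint8_t read_index = 0;            /* advanced only by the main loop */

/* Hypothetical stand-in for the hardware data register. */
static volatile uint8_t fake_data_register;

static uint8_t get_data_byte_from_interrupt(void)
{
    return fake_data_register;
}

/* Handler body: store one byte, advance the write index, wrap at the end. */
void handle_interrupt(void)
{
    uint8_t next = (uint8_t)((write_index + 1u) % DATA_BUFFER_LENGTH);
    interrupt_data_buffer[write_index] = get_data_byte_from_interrupt();
    write_index = next;
}

/* Main-loop side: returns 1 and fills *out if a byte is pending, else 0. */
int read_pending_byte(uint8_t *out)
{
    if (read_index == write_index) {
        return 0;
    }
    *out = interrupt_data_buffer[read_index];
    read_index = (uint8_t)((read_index + 1u) % DATA_BUFFER_LENGTH);
    return 1;
}
```

The main loop would call `read_pending_byte()` until it returns 0 and do all the slow work (verification, logging) there, so the handler itself stays at a few instructions.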
I know I am simplifying things, but interrupt handlers should be as short as possible.
I also have a question:
It is possible, although unlikely, that more than one chip has a result at the same time. Do you know which chip the data is coming from? If so, this should be taken into account in the above pseudo code. If not, could it be the reason for your data issue?
Good luck solving the issue at hand!
My biggest score today was realizing that since the data is inverted I need to initialize the FIFO with 0xFF instead of 0x00, so that any missed bits are shifted in as inverted zeros, not ones. Once I added a reset to 0xFF after every result, the error rate dropped way down. But even after all this it's still running around 3-5%. When I scoped the bad result cases I saw that the data appeared shifted by 1 bit - so a single bit was being left over from one result to another, or captured sometime in the space between results (noise?). Resetting to 0xFF after each capture primes the FIFO and ensures that one error bit doesn't create a long string of error results by staying out of sync.
The case of two results happening at once is sufficiently rare that it doesn't matter. Either the ASIC has circuitry to detect busy (which I doubt, given how they've gone minimal on anything related to comm.) or the collision just nukes both results. The probability of that happening is so low that it has no bearing on error rates in the > 0.1% range. If the ASICs are actually in sync due to being driven by the same clock source (which I doubt), then it is possible that one result has priority over the other and succeeds. Anyway, the probability right now with 4 chips is about 4 x 128/(nonce range size), or 1:8388608 for a 2^32 nonce range. (128 because the result clock appears to be hashclock/128, so 1 result takes the time of 128 hashes.)
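As a quick check of that figure, here is a small C sketch of the arithmetic. The function name `collision_one_in` is made up for illustration, and it assumes a 2^32 nonce range and that the chip count simply multiplies the 128-hash window, which is what reproduces the 1:8388608 number.

```c
#include <stdint.h>

/* Returns N such that an overlap is expected roughly 1 time in N.
   Assumption: each chip's result occupies hashes_per_result hash times,
   and the windows of all chips are simply added together. */
uint64_t collision_one_in(uint64_t chips, uint64_t hashes_per_result,
                          uint64_t nonce_range)
{
    return nonce_range / (chips * hashes_per_result);
}
```

Calling `collision_one_in(4, 128, 1ULL << 32)` gives 8388608, far too rare to explain an error rate of a few percent.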
I believe something else is at work here to create the errors. Either PLL noise, instability, or some error on shifting data into the ASIC. Remember that any corruption on data in is going to give error results out when the host tries to verify back to the correct original data. I've spent a lot of time looking at scope traces. The only thing I've definitely been able to detect that way so far is that sometimes a result is captured one bit out of sync, i.e. it had an extra first bit that pushed all the subsequent bits off. But visually that first bit doesn't look different from cases where it captured fine. There is no extra clock bit and it's not out of position. I only know it's one bit off by writing down the scoped bit pattern and comparing it to the result captured. There isn't even noise, and it doesn't happen during a capture but sometime before a capture starts, or after it ends, corrupting the next one. Are my antenna-like wires connecting the red board causing spurious clk edges?
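The by-hand comparison of scoped bits against the captured result could be automated with something like the C sketch below. It is a guess at the check, not what the firmware does: the name `is_delayed_by_one_bit` is invented, and it assumes 32-bit results shifted in MSB first, so an extra leading bit lands in the top position and the last real bit is lost.

```c
#include <stdint.h>

/* Returns 1 if `captured` looks like `expected` delayed by one bit:
   an extra bit arrived first, every real bit moved one position later,
   and the final real bit fell off the end.
   Assumption: 32-bit results, first received bit ends up as the MSB. */
int is_delayed_by_one_bit(uint32_t expected, uint32_t captured)
{
    /* Ignore the top bit (the stray one) and compare the remaining 31. */
    return (captured & 0x7FFFFFFFu) == (expected >> 1);
}
```

Counting how many bad results pass this check would show whether the one-bit slip accounts for most of the 3-5% or only part of it. Patterns that look the same when shifted (all zeros, all ones) also pass the check, so it only means something on results already known to be wrong.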
Writing this just gave me an idea.