I'm using the July 20th FW.
Regarding the guy with the bad hash unit: I don't know how much you can trust the voltage reported by the HW. It's connected to an ADC, which will have inaccuracies of its own. Best to measure VDD_CORE with a precise meter instead. I don't want to take my only unit apart, but for those who do, I posted instructions on where to probe a few posts up. Even if the ADC is inaccurate, it's good to see that whatever voltage it is measuring is very stable.
Also, a few pages back I went into why I don't think we will be able to overclock the A9s much more, based on the information in the A9 chip datasheet.
If it were an SPI issue, lowering the speed may or may not solve the problem, because it could be a clocking issue (i.e. the setup and hold times for clock/data don't comply with the spec, which of course is not listed in the datasheet). Lowering the speed might also add more issues by itself.
I don't trust it, but like you said, it's good to see it's stable. I don't have a precise multimeter with me, and it wouldn't give me enough information to make the teardown worthwhile. I need a good oscilloscope so I can see fluctuations and watch the SPI data move down the chain.
I'm not actually trying to overclock it. I'm trying to affect the HW error rate in any way I can (under/overclocking and under/overvolting) to get a foot in the door on the nature of the problem. Same with SPI: I wanted to see whether slowing it down affected the HW error rate.
Can you elaborate more on your POV regarding clocking issues?
My current understanding, from the datasheet and from reading about other ASICs, is that these chips need 3 things to function:
1. Good, stable voltage levels (Vcore 0.9 V +/-10%, SPI/reset 1.8 V)
2. Clean clock source
3. Functional SPI
That's about it. Voltage levels and the clock signal will be super easy to verify. Good SPI signals will be trickier, which is why I'm doing some testing. From what I gather, SPI commands are sent to chip 1, passed from one chip to the next down to chip 12, and then back up to the controller. I honestly think SPI is working fine, or there would be data reporting issues and SPI errors. Some chips are just returning no results or bad results, but those same chips seem happy to pass SPI data along for good chips further down the chain. Unless these aren't set up in an SPI daisy chain. I'm pretty sure they are, but it will be easy to verify when I tear it down again. My gut is telling me that some of these chips are just crap, but we will see.
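To make that reasoning concrete, here is a toy model of the daisy chain in Python. Everything in it is an assumption for illustration: the `Chip` class, the `core_ok`/`spi_ok` flags, and the idea that a reply retraces the same path it came down are all made up, not taken from the datasheet or the firmware. It only shows why a chip with a dead hashing core but a working SPI pass-through would look different from a chip with broken SPI.

```python
from dataclasses import dataclass


@dataclass
class Chip:
    index: int
    core_ok: bool = True  # hypothetical: hashing core returns valid results
    spi_ok: bool = True   # hypothetical: SPI pass-through to the next chip works


def make_chain(n=12, bad_cores=(), bad_spi=()):
    """Build a chain of n chips, numbered from 1 (closest to the controller)."""
    return [
        Chip(i, core_ok=i not in bad_cores, spi_ok=i not in bad_spi)
        for i in range(1, n + 1)
    ]


def poll_chain(chips):
    """Return {chip index: status} as the controller would see it.

    Assumption: traffic for chip N has to pass through chips 1..N-1,
    so one broken SPI link hides every chip behind it.
    """
    results = {}
    link_alive = True
    for chip in chips:
        if link_alive and not chip.spi_ok:
            link_alive = False  # this chip and everything after it goes silent
        if not link_alive:
            results[chip.index] = "unreachable"
        elif chip.core_ok:
            results[chip.index] = "ok"
        else:
            results[chip.index] = "bad/no result"
    return results


# Chip 4 has a bad core but still forwards SPI traffic.
bad_core = poll_chain(make_chain(bad_cores={4}))
# Chip 4 has broken SPI pass-through.
bad_spi = poll_chain(make_chain(bad_spi={4}))

print(bad_core)
print(bad_spi)
```

In the first case only chip 4 looks bad and chips 5 to 12 still report. In the second case everything from chip 4 onward disappears. Scattered bad chips with good chips behind them fit the first pattern, at least under this model.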
It seems the hash rate is directly correlated to Vcore and PLL speed. There is a chart in the datasheet which shows 3 different voltage and PLL combinations that hit a specific hash rate.
Vcore: I think the silicon would just have a spec for voltage ripple on Vcore. +/-10% is a HUGE margin for a Vcore voltage spec. There is a chance the ADC is averaging samples, which would make the voltage readout look that clean, so yes, you might want to get a cheapo scope to see what the ripple on Vcore looks like. I want to say it won't be more than +/-20 mV unless it's a really badly designed board. I'm not sure how much high-frequency noise on Vcore affects hashing power at a given PLL setting, as long as the average voltage is stable and you are not going over the VMAX limit (which would damage the chip). I think a simple cheapo multimeter is actually perfectly fine for this exercise, but a scope would be nice to quantify the ripple.
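Here is a rough Python sketch of why an averaging ADC (or a multimeter) can show a rock-steady number while a scope would still show ripple. The 0.9 V nominal and +/-10% window come from the figures quoted earlier in the thread, and the 20 mV ripple is my guess from above. The ripple frequency, the sample rate, and the pure sine shape are made-up values just for illustration.

```python
import math

V_NOM = 0.9                    # nominal Vcore in volts
TOL = 0.10                     # +/-10% window
V_MIN = V_NOM * (1 - TOL)      # 0.81 V
V_MAX = V_NOM * (1 + TOL)      # 0.99 V


def simulate_vcore(n_samples=1000, ripple_v=0.020,
                   ripple_hz=500e3, sample_hz=10e6):
    """Nominal Vcore plus a sine-wave ripple (all parameters hypothetical)."""
    return [
        V_NOM + ripple_v * math.sin(2 * math.pi * ripple_hz * i / sample_hz)
        for i in range(n_samples)
    ]


samples = simulate_vcore()

# What an averaging ADC or a multimeter would report.
average = sum(samples) / len(samples)

# What a scope would let you measure.
peak_to_peak = max(samples) - min(samples)

# Does the rail stay inside the +/-10% window at every instant?
in_window = V_MIN <= min(samples) and max(samples) <= V_MAX

print(f"average      = {average:.4f} V")
print(f"peak-to-peak = {peak_to_peak * 1000:.1f} mV")
print(f"inside {V_MIN:.2f}-{V_MAX:.2f} V window: {in_window}")
```

The average comes out at the nominal value even though the rail is swinging 40 mV peak to peak, which is why a stable readout alone doesn't rule out ripple. It also shows that +/-20 mV is nowhere near the edges of the +/-10% window.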
SPI: this is such a low-speed bus that the only thing you probably need to be concerned with is setup and hold timing. It looks like you are getting error readouts from every single chip in the miner, so I agree with your statement that SPI is probably working fine. Otherwise the system wouldn't be able to spit numbers back out at you.
Clean clock: these chips have an internal PLL, and each chip's PLL will run at a slightly different speed from the "ideal". If you are concerned about the signal the PLL uses to generate the clock, probe the CLOCK_L pin on the chip. I suspect it's fine.
Temperature: I haven't found much of a correlation with temperature. Dropping my unit by 20C, from 70C to 50C, didn't affect stability or hash rate. I'm also 99% sure the fan speed does not affect hashing power, as the fan runs off the 12V rail that is connected to the controller card, not the ASIC cards. If the increased current from a faster fan speed were causing a voltage drop, it would be on the controller card, which has no power connection to the ASICs.