Hmm, I'm thinking the main firmware on the hash board is fine, and it's the individual chain / chip firmware that's corrupted. Each ASIC chip looks like it has its own ICSP test connection. I think a copy of the app is sent to each chip and runs in its own memory space on that chip's hardware.
Not sure why they would all fail at the same time, or whether it's the chain itself that failed somehow. I need to do a deeper analysis of the data on these ICSP connectors soon.
The test software currently reads the main PIC, which appears to be just fine. Then it tries to "restore" the 3200 block "app" package to the chains / ASICs, and that's where it fails. After that it goes on to test the ASICs and reads nothing back, because their operating software is corrupt or badly written.
Chain[J4] has 0 asic
retry Chain[J4] has 0 asic
retry Chain[J4] has 0 asic
retry Chain[J4] has 0 asic
retry Chain[J4] has 0 asic
retry Chain[J4] has 0 asic
retry Chain[J4] has 0 asic
miner total rate=0GH/s fixed rate=0GH/s
I think that's the sign that the asic app software is corrupted / munged up.
It's another avenue of attack anyway, will post results when I have the time.
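For anyone wanting to check their own log for the same symptom, here is a minimal Python sketch that counts the attempts per chain and lists the chains that end up at 0 ASICs. The pattern is based only on the lines quoted above, and the function names are made up for illustration; other firmware versions may word the log differently.

```python
import re
from collections import defaultdict

# Assumption: log lines look like "Chain[J4] has 0 asic", optionally
# prefixed with "retry ", exactly as in the excerpt quoted above.
CHAIN_RE = re.compile(r"^(?:retry\s+)?Chain\[(\w+)\] has (\d+) asic")


def summarize_chains(log_text):
    """Return {chain: {"attempts": n, "last_count": k}} for each chain seen."""
    summary = defaultdict(lambda: {"attempts": 0, "last_count": None})
    for line in log_text.splitlines():
        match = CHAIN_RE.match(line.strip())
        if match:
            entry = summary[match.group(1)]
            entry["attempts"] += 1
            entry["last_count"] = int(match.group(2))
    return dict(summary)


def dead_chains(log_text):
    """Return the sorted chain names whose last reported ASIC count was 0."""
    return sorted(
        chain
        for chain, entry in summarize_chains(log_text).items()
        if entry["last_count"] == 0
    )
```

Feeding it the excerpt above should report a single chain, J4, with seven attempts and a final count of 0.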
I have a center hash board that is failing in exactly the same manner. The log shows it can connect to the hash board and read PIC firmware 0x02, but it says the chain has 0 asic. The board never lights up red and fails to show up in miner status. This all happened a few days after updating to the new auto-tune firmware as well...
All the components on the hash board seem fine, but unlike on my working hash boards, no voltage is seen past the inductor. I believe either the software in the PIC controller has become corrupted, or a safety mechanism detects a short circuit in the ASIC chain and prevents it from powering the chips up.
I've also ordered a PICkit 3 to try to restore the PIC firmware, and should know by the end of the week whether it works out or not. Please keep me updated on your situation as well!
Will do! I'm planning on ordering a PICkit 3 as well. I think that's what's needed at this point, as the software Bitmain publicly released isn't sufficient to diagnose or repair software issues.
I initially thought there was a way to access the hash boards through SSH directly from the controller, and while this is possible, the software needed hasn't been and likely won't ever be released by Bitmain. Someone with more programming experience might be able to do it "in situ" without any disassembly or external hardware, but I lack that experience.
The controller, which we have root access to, has direct read / write capabilities to the PIC and the hash board chain firmware. The right piece of software could make this an easy process. The included single-board-test software DOES have the exact read / write code needed, but it's locked up in a closed-source file.
The newest source code that Bitmain released is not the new autotune version, so it contains no data or files relevant to PIC writing, as that was introduced later.
(UPDATE)
Looks like someone was trying a similar thing, as single-board-test shows up in online disassembler
https://onlinedisassembler.com/odaweb/HP7GAkIN
Looks like it's from a very early version of single-board-test; it looks older than anything I have.
So, I got the PICkit 3 and it's pretty straightforward to back up and restore the PIC firmware on these cards. I copied the firmware from a working card to the "dead" card and the card still doesn't light up ;(. The PIC controller seems fine; I'm just not sure why the voltage isn't going through. I strongly suspect a short circuit in one of the ASIC chains is triggering some kind of protection mechanism, but I have no idea which chip ;(. Also, one thing I forgot to mention is that I let these run way too hot for a few days, > 130C. Let's just say the temperature monitoring on these isn't the most accurate, and I noticed thermal adhesive melted onto the ASIC pins. The adhesive seems to be non-conductive though, and I see the same signs on my two other hash boards, with no other damage. To be continued...
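One way to double-check that a copy like this really took is to read the "dead" card back and compare it byte for byte against the dump from the working card. The sketch below assumes both dumps were saved as Intel HEX files, which is an assumption about the export format and not something confirmed in this thread; the function names are hypothetical, and the programmer software's own verify step may already cover this.

```python
def parse_intel_hex(text):
    """Parse Intel HEX text into a {absolute_address: byte_value} dict."""
    memory = {}
    base = 0  # upper address bits set by record types 0x02 / 0x04
    for lineno, line in enumerate(text.splitlines(), 1):
        line = line.strip()
        if not line:
            continue
        if not line.startswith(":"):
            raise ValueError(f"line {lineno}: record does not start with ':'")
        raw = bytes.fromhex(line[1:])
        # All bytes of a valid record, checksum included, sum to 0 mod 256.
        if sum(raw) & 0xFF != 0:
            raise ValueError(f"line {lineno}: bad checksum")
        count, addr, rtype = raw[0], (raw[1] << 8) | raw[2], raw[3]
        data = raw[4:4 + count]
        if rtype == 0x00:      # data record
            for offset, value in enumerate(data):
                memory[base + addr + offset] = value
        elif rtype == 0x01:    # end-of-file record
            break
        elif rtype == 0x02:    # extended segment address
            base = int.from_bytes(data, "big") << 4
        elif rtype == 0x04:    # extended linear address
            base = int.from_bytes(data, "big") << 16
    return memory


def diff_images(a, b):
    """Return the sorted addresses where the two images differ."""
    return sorted(addr for addr in a.keys() | b.keys() if a.get(addr) != b.get(addr))
```

An empty list from diff_images would mean the two dumps are identical, which would point away from the PIC firmware and towards something else on the board.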
There is a second piece of software that gets written to the ASIC chains during the single-board-test process, which I believe is the actual hashing app itself. It is this sub-software that I believe to be faulty, not the main PIC firmware. And the Bitmain software meant to correct this very issue doesn't seem to work. Its built-in restore process either fails or is insufficient to repair whatever software damage occurred.
The file it tries to write to the hash board mining chain is the "hash_s8_app.txt" file. However, once a write has failed for whatever reason, it seems unable to flash properly on subsequent attempts. The red light won't blink at all if this software is kaput.
My units haven't been much above 100C, even at their highest. I don't believe it's a short in my case, and going by other people who have had a similar failure after the firmware update, I wouldn't suspect a hardware issue in most cases.
My best guess is that without a functional hashing app, the board just sits idle; it won't draw much power unless it's actually hashing. All of the ASICs are essentially idling without anything to do.
The oldest autotune firmwares (<11/08) kept rerunning the restore process until it succeeded, which to my mind means they knew it was a problem. Later firmwares run it a few times, then fail over and start mining with whatever boards passed the test. That would leave failed boards failed, but let the good boards continue running. At least in theory.
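The difference between the two behaviours can be sketched roughly like this. It is only an illustration of the retry logic as seen from the outside, not Bitmain's code: restore_boards, try_restore and the attempt limit are all hypothetical, and the real number of retries is unknown.

```python
import itertools


def restore_boards(boards, try_restore, max_attempts=None):
    """Return the boards whose restore eventually succeeded.

    max_attempts=None models the older firmware described above (keep
    retrying until the restore succeeds); an integer models the later
    firmware (give up on a board after a few tries and move on).
    try_restore(board, attempt) is a hypothetical callback returning
    True on success.
    """
    passed = []
    for board in boards:
        if max_attempts is None:
            attempts = itertools.count(1)           # retry forever
        else:
            attempts = range(1, max_attempts + 1)   # bounded retries
        for attempt in attempts:
            if try_restore(board, attempt):
                passed.append(board)
                break
    return passed
```

With max_attempts=None, a board that can never be restored keeps the loop spinning forever, which matches the older behaviour; with a small limit, that board is simply left out of the returned list and the rest carry on.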