Topic: Cheap and simple repair of S7 hash board - page 3. (Read 28524 times)

legendary
Activity: 3374
Merit: 1859
Curmudgeonly hardware guy
S5 are unregulated; since this is a repair for the regulator it can't be applied to S5.
hero member
Activity: 496
Merit: 500
My S7 and my friend's S7 stopped hashing many times, even with boards that had just come back from RMA, so I decided to stop paying to ship boards to Bitmain and then waiting so long without mining. This mod is only suitable for the 135-chip version.

After the repair I can tune the S7 to run very efficiently: my B8 runs at 600MHz (4.05TH/s) at 0.22W/GHs DC, which is around 0.235W/GHs at the wall.
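
For readers who want the efficiency figures as absolute power, here is a minimal sketch of the arithmetic. It uses only the three numbers quoted above; the variable names are made up for illustration, and the "implied PSU efficiency" is simply the ratio of the two W/GHs figures, not something the poster measured separately.

```python
# Figures quoted in the post above.
hashrate_ths = 4.05        # TH/s at 600 MHz
dc_w_per_ghs = 0.22        # W per GH/s on the DC side
wall_w_per_ghs = 0.235     # W per GH/s at the wall

hashrate_ghs = hashrate_ths * 1000.0  # 1 TH/s = 1000 GH/s

dc_power_w = hashrate_ghs * dc_w_per_ghs
wall_power_w = hashrate_ghs * wall_w_per_ghs

# Ratio of DC power to wall power, i.e. the PSU efficiency these
# two figures imply (an inference, not a measured value).
implied_psu_efficiency = dc_w_per_ghs / wall_w_per_ghs

print(f"DC power:   {dc_power_w:.0f} W")
print(f"Wall power: {wall_power_w:.0f} W")
print(f"Implied PSU efficiency: {implied_psu_efficiency:.1%}")
```

With these figures that works out to roughly 891 W on the DC side and roughly 952 W at the wall, which implies a PSU efficiency of about 94%.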

The repair is very simple. Every board I've seen had a malfunctioning or dead PIC microcontroller that adjusts the voltage for the chips, so I decided to override it with a 50k potentiometer. Now I can adjust the voltage for each board manually.

This is the original board without the fix. You need to connect the potentiometer between the U2--R17 connection and GND, which can be found on C76.

http://pantin.cz/20160209_155344.jpg

First, use silicone or any other suitable glue to fix the potentiometer to the board. After it dries, you can solder its pins to R17 and C76.

http://www.pantin.cz/20160401_095409.jpg

Once you are done, use a small screwdriver: turning the potentiometer clockwise brings it to the lowest voltage, about 9.3V, which should be enough to start the miner at 500MHz.
You can adjust the voltage for each board even while the miner is running and instantly check the number of HW errors during operation.
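
Why a pot to GND can move the output voltage at all can be sketched with the textbook feedback-divider formula. This is only a generic model: it assumes (the post does not confirm this) that the U2--R17 node is the regulator's feedback node with a conventional resistive divider. The reference voltage and both divider resistors below are hypothetical placeholders, not values measured on an S7 board; only the 50k pot value comes from the post.

```python
def parallel(r1_ohm: float, r2_ohm: float) -> float:
    """Equivalent resistance of two resistors in parallel."""
    return (r1_ohm * r2_ohm) / (r1_ohm + r2_ohm)


def regulator_vout(v_ref: float, r_top_ohm: float,
                   r_bottom_ohm: float, r_pot_ohm: float) -> float:
    """Output of a regulator with a resistive feedback divider when a
    pot is added from the feedback node to GND, i.e. in parallel with
    the lower divider resistor."""
    r_lower = parallel(r_bottom_ohm, r_pot_ohm)
    return v_ref * (1.0 + r_top_ohm / r_lower)


# HYPOTHETICAL values, chosen only to show the trend.
V_REF = 0.8          # volts, assumed reference
R_TOP = 100e3        # ohms, assumed upper divider resistor
R_BOTTOM = 10e3      # ohms, assumed lower divider resistor

for pot_ohm in (50e3, 25e3, 10e3):   # 50k is the pot's full resistance
    vout = regulator_vout(V_REF, R_TOP, R_BOTTOM, pot_ohm)
    print(f"pot = {pot_ohm / 1e3:4.0f}k -> Vout = {vout:.2f} V")
```

In this model the output is lowest with the pot at its full resistance and rises as the pot resistance is reduced, so the trimmer has one low-voltage end stop and a continuous range above it.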


I hope it helps you. My opinion is that the PIC malfunction is intentional on Bitmain's part, to lower diff after the RMA period.

Hello RadekG
Can you please update the links to your pics? They are not working :(
Does this fix work on Antminer S5 hash boards too?
legendary
Activity: 3752
Merit: 2667
Evil beware: We have waffles!
I thought that was the inductor (copper coil with a magnet in the middle) and the PIC is lower down to the left?

Re your pic: that is not a magnet, it is a ferrite core for the inductor. The regulators run at around 50-100kHz, and iron laminations are useless at those frequencies.

I believe the PIC is on the other side of the board, off to the right of 3 other larger chips. Follow the traces from the 6-pin programming connection. I have an S7 board pulled and will take it into work tomorrow to look at the actual chip numbers with a microscope. DAMN, that writing is tiny.

EDIT: Just looked at the chips under a video scope and you are right. That little chip U3 on the inductor side of the board is the PIC.
full member
Activity: 279
Merit: 107
You may want to look at Sidehack's Modding s7 thread https://bitcointalksearch.org/topic/m.15360994
Lots of good info in there including how to change Vcore with a simple firmware mod.

I thought that was the inductor (copper coil with a magnet in the middle) and the PIC is lower down to the left?

legendary
Activity: 3752
Merit: 2667
Evil beware: We have waffles!
Hi guys,

How do I check that the PIC is faulty? At what two points do you test?
Is it a resistance test, or do you power the board and test the voltage?
Any help much appreciated.
Thanks
You may want to look at Sidehack's Modding s7 thread https://bitcointalksearch.org/topic/m.15360994
Lots of good info in there including how to change Vcore with a simple firmware mod.
full member
Activity: 279
Merit: 107
Hi guys,

How do I check that the PIC is faulty? At what two points do you test?

Is it a resistance test, or do you power the board and test the voltage?

Any help much appreciated.

Thanks
newbie
Activity: 56
Merit: 0
Hi, are there any repair guides for S7 boards that show 48 ASICs? If I'm not wrong, we have only the 54-chip version of the board. Thank you

There is essentially not a lot you can do without QFN rework facilities and understanding the board. If it's a chip fallen off situation (desoldered), you may be able to get away with a DIY fix, but even then it's not easy. Likely faults in high-temp environments on the 54 chip boards from what I've heard are:
 Cap failure - which is not impossible to fix, the heatsinks on the back come off fairly easily.
 Failure of the tiny boost converter which does the I/O voltages for the last (couple?) of sets of chips. Hard to fix unless you know what you're doing.
 Failure of an ASIC (solder joint issue). Possible to flux and reflow, but cleaning without softening the adhesive holding all the rest of the heatsinks on is not fun.
 Failure of an actual ASIC chip. Find the chip and replace. Not going to happen if you don't already do fine-pitch SMD work.

There is no PIC or voltage control on a 54 chip board.

I haven't seen a 54 chip board with black adhesive, but I don't have a large sample size.

You can get some diagnostic information by reading the voltages between each "set of 3" heatsinks - there should be around 0.666 volts per chip (some small variance is normal). However, even then, there's not a lot you can do without SMD rework.
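
As a rough aid for that multimeter check, the sketch below lists what each heatsink set would read relative to ground if everything were healthy. It assumes the sets of 3 are wired in series as a voltage ladder (which is how the thread describes these boards) and takes the "around .666 volts" figure at face value. The set count of 18 is just 54 chips divided into sets of 3; the function and constant names are made up for illustration.

```python
CHIPS_PER_BOARD = 54     # 54-chip board, per the post
CHIPS_PER_SET = 3        # "set of 3" heatsinks, per the post
VOLTS_PER_SET = 0.666    # approximate per-set voltage, per the post

num_sets = CHIPS_PER_BOARD // CHIPS_PER_SET


def expected_tap_voltages(sets: int, volts_per_set: float) -> list[float]:
    """Nominal voltage of each set's top rail relative to ground,
    ASSUMING the sets are stacked in series from ground upward."""
    return [round(volts_per_set * (i + 1), 3) for i in range(sets)]


taps = expected_tap_voltages(num_sets, VOLTS_PER_SET)
for index, volts in enumerate(taps, start=1):
    print(f"set {index:2d}: about {volts:.2f} V above ground")
```

The top of the ladder lands just under 12 V, which is at least consistent with a 12 V supply feeding the chain directly. Treat the numbers as nominal only; the post itself says small variance is normal.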

Check the kernel log - I would expect to see timeouts because it can't see any of the chain - hence why it's defaulting to 48 or 30 chips.

newbie
Activity: 56
Merit: 0
np, easy to mis-read.
Of course from a repair standpoint, that makes pulling chips a royal PITA. Thass why Sidehack has no interest in doing a stick using the s7/s9 chips.

It's not _that_ bad. As long as it's up to solder melting temperature, the black stuff is soft enough. Problem is, at that temp I think the package has started to weaken a little too. And you need some way to remove the black stuff from each chip which is... Less than fun. That combined with the pin-pitch makes this definitely not a hobby job.

Cheers,

Allan.
member
Activity: 135
Merit: 11
Hi, are there any repair guides for S7 boards that show 48 ASICs? If I'm not wrong, we have only the 54-chip version of the board. Thank you
full member
Activity: 279
Merit: 107
The second board is fixed. I have it running at 500 freq / 1.2 TH; I don't have my voltmeter to set the voltage to run it at 700, so I'll take care of that later tonight.

This isn't an easy fix by any means. Soldering to R17 is a royal pain; C76 isn't too bad, but I'd imagine you could just solder to GND on the PCIe plug instead of the side of C76. That would make it a little easier.

That being said, I'm sending the remainder of my boards to sidehack for PIC replacement instead of soldering the rest myself.

[...]

Who is 'sidehack', mate?

Are they better than an RMA to Bitmain?

Thanks
legendary
Activity: 3752
Merit: 2667
Evil beware: We have waffles!
np, easy to mis-read.
Of course from a repair standpoint, that makes pulling chips a royal PITA. Thass why Sidehack has no interest in doing a stick using the s7/s9 chips.
full member
Activity: 279
Merit: 107
Read it again.

There *is* thermal contact between the chip and heatsink. The thermal epoxy. To minimize chance of the heatsinks coming off they use a LOT of it. So much that yes there is excess actually bonding the board to the sinks. Assuming that the heat sinks were properly pressed down onto the chips to give a very thin even layer, the added contact between the board and sink actually helps move a little more heat away from the package.

Sorry, missed that. Thanks
legendary
Activity: 3752
Merit: 2667
Evil beware: We have waffles!
What!!! There is no paste between the heatsinks and ASICs? Why? Even a bit of '1mm TIM' pad and then high-temp glue to hold the heatsink in place, surely that would be a better upgrade?

Read it again.

There *is* thermal contact between the chip and heatsink. The thermal epoxy. To minimize chance of the heatsinks coming off they use a LOT of it. So much that yes there is excess actually bonding the board to the sinks. Assuming that the heat sinks were properly pressed down onto the chips to give a very thin even layer, the added contact between the board and sink actually helps move a little more heat away from the package.
full member
Activity: 279
Merit: 107
Unfortunately, the bm1385 is 0.4mm pitch QFNish - with an exposed pad used as ground. So it's a bit of a nightmare to solder. Also if you have a newer board, it'll come with wonderful black adhesive holding the heatsinks onto the board. Not onto the chips, the board - they've used so much adhesive that it has squished out and forms a sort of poor-man's infill around the chip.

What!!! There is no paste between the heatsinks and ASICs? Why? Even a bit of '1mm TIM' pad and then high-temp glue to hold the heatsink in place, surely that would be a better upgrade?
full member
Activity: 279
Merit: 107
This fix does not apply to S7 Batch 1 boards, since it deals with the board's voltage regulator and there is none on the 54-chip boards, just the 45-chip boards from later batches.

Hi

Where is this faulty IC located, and where exactly do I measure the voltage?

I have 4 faulty S7 600MHz boards, batch 5. All show fine: no faults, all ASICs OK, temps low, just NO GH.

I want to mark components especially the 'faulty' pic voltage controller with string UV paste ( i can get from work) and send it off for RMA so we ALL can see what they are actually replacing.

Cheers guys, great thread !!!!
full member
Activity: 279
Merit: 107
This fix does not apply to S7 Batch 1 boards, since it deals with the board's voltage regulator and there is none on the 54-chip boards, just the 45-chip boards from later batches.

Is there a list of boards/batches available that someone can post? I have 4 600MHz boards that showed no temp or hash speed, all 45 ASICs, no x or -. After a F/W upgrade they all now show temp but no hash.

The ONLY chart I can find is this -



Cheers
newbie
Activity: 56
Merit: 0

wow, this is much more extensive than I was expecting. thanks! I do have a scope and will probably be checking this out at some point.

Sort of too bad that... I imagine you couldn't put a big diode or something with a similar IV curve in its place and route the CLK and D around the dead chip?

As long as your diode can cope with 45+A through it continuously, possibly (assuming you're jumpering out one voltage level). However, I am unsure if you can match the on-load requirements closely enough.

I am unsure if you can jumper using the exposed diagnostic points, my feeling is probably no but would be interesting to try. Removing the chips and soldering jumper wires does not appeal at that scale! Also, I don't have enough hardware to test all of this properly, so some of this is just guesswork. It'd be great if someone who had played with this in real life would confirm some of this stuff. Or if I could get hold of a few more burnt / dead boards!

Interestingly (for me), the LEDs are connected at the end of the chain and seem to be just the busy lines from the last chip.
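
To put a number on "cope with 45+A", here is a back-of-envelope sketch of the heat a bypass element would have to shed. The 45 A figure is from the post above. The voltage drop is an assumption: it borrows the roughly 0.666 V per-level figure quoted elsewhere in this thread for the 54-chip boards, and a real diode's forward drop at that current could differ a lot. The names are made up for illustration.

```python
def bypass_dissipation_w(current_a: float, drop_v: float) -> float:
    """Continuous power dissipated in an element carrying current_a
    amps with drop_v volts across it (P = V * I)."""
    return current_a * drop_v


CHAIN_CURRENT_A = 45.0   # "45+A" from the post
LEVEL_DROP_V = 0.666     # ASSUMED drop, borrowed from the per-level figure

power_w = bypass_dissipation_w(CHAIN_CURRENT_A, LEVEL_DROP_V)
print(f"Bypass element would dissipate about {power_w:.1f} W continuously")
```

That is around 30 W in a single part, so on top of matching the IV curve the bypass would need serious heatsinking of its own.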
member
Activity: 117
Merit: 10
The bus is serial, with a couple of status pins. Easiest way to locate the bad one is to initially look at the board status - as far as I've seen, splits only happen "up" the voltage ladder (you can't get a failed chip near ground then have anything work after it). This assumes that the entire chip has gone bad - sometimes it's just the hashing elements. In this case, you'll get an X in the status display. Once this happens, the voltage level of the entire chain becomes suspect, so...

Take a multimeter and measure the voltage on each heatsink set of 3 (either reference ground or the set of chips prior). You'll usually get one or two sets of "different" voltages. The challenge from there is figuring out what's wrong. If you have a scope, have a look on the power rail to each chip group (VCCIO is hardly ever the issue - it's done from what I recall with a simple resistive dropper). CLKOut from each chip is also a good troubleshooting step - when the chip is working, it passes a regenerated clock out of that port. When it's not hashing due to power issues, you'll see what looks like a sawtooth on that pin (never gets to a good clock, just oscillates at the switching frequency or so)

It's all quite difficult to diagnose as a unit, because generally once one or two chips die in interesting ways, the power performance of the chain in its entirety is suspect. Even when a chip is just not hashing ("scission of chips" caused by inter-chip comms failures), it can bring the entire chain down or make it unreliable.

You can also follow the reset / busy lines down the board as well as the CI/CO data.

Short version: It's difficult. :)

wow, this is much more extensive than I was expecting. thanks! I do have a scope and will probably be checking this out at some point.

Sort of too bad that... I imagine you couldn't put a big diode or something with a similar IV curve in its place and route the CLK and D around the dead chip?
newbie
Activity: 56
Merit: 0
The bus is serial, with a couple of status pins. Easiest way to locate the bad one is to initially look at the board status - as far as I've seen, splits only happen "up" the voltage ladder (you can't get a failed chip near ground then have anything work after it). This assumes that the entire chip has gone bad - sometimes it's just the hashing elements. In this case, you'll get an X in the status display. Once this happens, the voltage level of the entire chain becomes suspect, so...

Take a multimeter and measure the voltage on each heatsink set of 3 (either reference ground or the set of chips prior). You'll usually get one or two sets of "different" voltages. The challenge from there is figuring out what's wrong. If you have a scope, have a look on the power rail to each chip group (VCCIO is hardly ever the issue - it's done from what I recall with a simple resistive dropper). CLKOut from each chip is also a good troubleshooting step - when the chip is working, it passes a regenerated clock out of that port. When it's not hashing due to power issues, you'll see what looks like a sawtooth on that pin (never gets to a good clock, just oscillates at the switching frequency or so)

It's all quite difficult to diagnose as a unit, because generally once one or two chips die in interesting ways, the power performance of the chain in its entirety is suspect. Even when a chip is just not hashing ("scission of chips" caused by inter-chip comms failures), it can bring the entire chain down or make it unreliable.

You can also follow the reset / busy lines down the board as well as the CI/CO data.

Short version: It's difficult. :)
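
The multimeter step above (measure each set of 3, look for the one or two "different" voltages) can be sketched as a small script. It assumes you probe each heatsink set relative to ground, working up the ladder, and then flags sets whose own drop differs from the typical one. The readings and the 0.05 V tolerance are made up for illustration and are not measurements from a real board.

```python
from statistics import median


def per_set_drops(tap_voltages: list[float]) -> list[float]:
    """Convert ground-referenced readings into the voltage across
    each individual set (difference from the set below it)."""
    drops = []
    previous = 0.0
    for volts in tap_voltages:
        drops.append(volts - previous)
        previous = volts
    return drops


def suspect_sets(tap_voltages: list[float],
                 tolerance_v: float = 0.05) -> list[int]:
    """Return 1-based indices of sets whose drop deviates from the
    median drop by more than tolerance_v (an assumed threshold)."""
    drops = per_set_drops(tap_voltages)
    typical = median(drops)
    return [i for i, drop in enumerate(drops, start=1)
            if abs(drop - typical) > tolerance_v]


# Made-up readings for the first six sets, in volts above ground.
readings = [0.62, 1.24, 1.86, 2.70, 3.10, 3.72]
print("Suspect sets:", suspect_sets(readings))
```

With these example readings, sets 4 and 5 are flagged: one is sitting high and its neighbour low, which is the sort of "different" pair the post describes. It only tells you where to point the scope next, not what has failed.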
member
Activity: 117
Merit: 10
Unfortunately, the bm1385 is 0.4mm pitch QFNish - with an exposed pad used as ground. So it's a bit of a nightmare to solder. Also if you have a newer board, it'll come with wonderful black adhesive holding the heatsinks onto the board. Not onto the chips, the board - they've used so much adhesive that it has squished out and forms a sort of poor-man's infill around the chip.

ah. crap. any idea on how to identify bad ones? I guess I could use a thermal camera, but i'd guess that the whole chain never comes up when one is shorted... I imagine maybe they look at the (presumably I2C) bus that all the chips sit on to see which doesn't say 'hi'?