Topic: Hacking GPU cards back into operation because I need something to do.... - page 3. (Read 3956 times)

legendary
Activity: 2002
Merit: 1051
ICO? Not even once.
Getting back to my previous comment about certain models sharing the same issue: I also used to have a bunch of Asus GTX 780 Ti cards designed in such a way that their VRMs would go well above 100°C, as they had absolutely no heat dissipation (they just sat under the heatsink with no contact). I bought a few thermal pads and placed them so that they connected the VRMs to the heatsink, and the temps dropped drastically.

Also, when I used to mine Ethereum I noticed the memory modules would go slightly above 100°C (GTX 970) even without overclocking, while the GPU itself sat at about 60°C. I expect a lot of those cards, coming from miners who mined Eth for a long time or might even still be mining it, will end up dying.

So my point is that each exact card model probably has its own expected way of dying. And eBay is probably full of faulty cards that were already checked by someone experienced like OP and deemed FUBAR.
legendary
Activity: 1108
Merit: 1005
I have 5 or 6 dead boards: 7950, 7970, 280X, 6990s.
legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
Nice. I'd be interested to see how many of the cards you attempt to repair can actually be repaired. 
I have a friend in the component-level repair industry and he says that most GPUs die because their VRMs are either terribly designed or badly cooled. The graphics chip itself rarely dies. He doesn't actually repair graphics cards, though, due to the lack of schematics, which he says makes the process more time-consuming.
Good luck, I'll be following this thread.
Indeed. Lack of cooling on VRMs will cause the FETs to go; my guess is that overclocking can do it (current will avalanche as temps go up). As for schematics, there never seem to be any anymore, especially for Bitcoin miners; no one wants to take on the liability, I suppose. However, these things are pretty simple at heart: get power into them, get work into the chip and out, and put the heat somewhere.
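
One rough way to picture the "temps go up, losses go up" loop is a fixed-point iteration on FET temperature. This is only a sketch under stated assumptions, not a model of any specific card: it assumes constant current, an on-resistance that rises linearly with temperature (`alpha_per_c`), and a single lumped thermal resistance to ambient (`r_th_c_per_w`). The function name and all numbers are hypothetical.

```python
def fet_steady_state_temp(i_amps, rds_25c_ohm, r_th_c_per_w,
                          t_ambient_c=40.0, alpha_per_c=0.005,
                          t_max_c=175.0, max_iter=1000):
    """Iterate FET temperature until it settles or exceeds t_max_c.

    Assumed model (illustrative only):
      Rds(T) = Rds(25C) * (1 + alpha * (T - 25))
      P      = I^2 * Rds(T)
      T      = T_ambient + R_th * P
    Returns (temperature_c, exceeded_limit).
    """
    t = t_ambient_c
    for _ in range(max_iter):
        rds = rds_25c_ohm * (1.0 + alpha_per_c * (t - 25.0))
        power_w = i_amps ** 2 * rds
        t_next = t_ambient_c + r_th_c_per_w * power_w
        if t_next > t_max_c:
            # Heating outpaces cooling before the limit is reached.
            return t_next, True
        if abs(t_next - t) < 0.01:
            # Temperature has settled at a stable operating point.
            return t_next, False
        t = t_next
    return t, False


# Same FET and current, two different (made-up) cooling paths.
cooled = fet_steady_state_temp(30.0, 0.004, r_th_c_per_w=10.0)
uncooled = fet_steady_state_temp(30.0, 0.004, r_th_c_per_w=40.0)
print(cooled, uncooled)
```

With the lower thermal resistance the loop settles at a stable temperature; with the higher one the same part at the same current blows past the assumed limit, which is the general shape of the failure described above.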

Now I need some dead boards to start working on. Anyone got a box of old dead boards?
legendary
Activity: 1007
Merit: 1000
I'm highly intrigued by this idea and have thought of the same myself, but I don't have the low-level hardware background to make it a possibility. Personally, I have a 280X that's driving me nuts. Hopefully you end up working with a card with a similar issue.

Just to put it out there: the card mines just fine, but no matter what drivers or GPU-reading software I use (GPU-Z, AB, Trixx), I can never get this thing to show a temperature! In fact, I paid the shipping and sent it back to Gigabyte under warranty, and after they claimed to have fixed it, it still shows no temp!

All that said, I love the idea of this thread and will be following very closely. Good luck, and thank you in advance for any tips and tricks you find. :-)
legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
I presume that if the VRM gets shorted, the PCB will be damaged, at least on mid/high-end cards.

In the past, like 6+ years ago, most problems were due to bad solder between the GPU and the PCB, so a reflow helped. Now I think it's mostly the VRM or the GPU memory going bad.
Typically the high-side FETs on reasonable VRMs will have an RC circuit or an op-amp comparator across them to measure current flow and shut down the VRM if the current goes too high (i.e. a burned FET) before there is a cut-through short to ground. Low-side FETs rarely fail because their on time is much higher than the high side's, so they don't have as much switching loss.
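
To put rough numbers on the on-time split, here is a minimal sketch using the usual first-order loss estimates for a synchronous buck phase. Everything in it is an assumption for illustration: the function name, the example values (12 V in, 1.0 V out, 30 A per phase), identical on-resistance for both FETs, and the approximation that hard-switching loss lands on the high-side FET while the low-side FET switches at close to zero volts.

```python
def buck_fet_losses(v_in, v_out, i_out, rds_on_ohm, f_sw_hz, t_edge_s):
    """First-order loss split for one synchronous buck phase.

    t_edge_s is the combined rise + fall time of the high-side FET.
    All formulas are textbook approximations, not measured behavior.
    """
    duty = v_out / v_in                       # fraction of time high side is on
    cond_high = i_out ** 2 * rds_on_ohm * duty
    cond_low = i_out ** 2 * rds_on_ohm * (1.0 - duty)
    # Hard-switching loss: full input voltage and load current overlap
    # during each edge of the high-side FET.
    sw_high = 0.5 * v_in * i_out * t_edge_s * f_sw_hz
    return {
        "duty": duty,
        "cond_high_w": cond_high,
        "cond_low_w": cond_low,
        "sw_high_w": sw_high,
    }


# Hypothetical phase: 12 V -> 1.0 V, 30 A, 4 mOhm FETs, 300 kHz, 20 ns edges.
losses = buck_fet_losses(12.0, 1.0, 30.0, 0.004, 300e3, 20e-9)
for name, value in losses.items():
    print(f"{name}: {value:.3f}")
```

The sketch only shows how the low duty cycle pushes conduction time onto the low side and switching stress onto the high side; which FET actually runs hotter depends on the parts chosen and the cooling, which it does not model.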

If the GPU shorts internally you're sunk, of course, but that can be tested by pulling the high-side FETs and looking for shorts. Hm.
legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
Interesting. Power subsystems are one of my specialties, it's surprisingly hard to build a good one and easy to screw it up.

My first thought was that overheating the GPU chip could cause the solder balls to go high resistance, thus causing it to fail, however the problem is most GPUs are a very high density BGA mounted on a board to a pitch that will mate to a rational PCB. The high density BGA isn't the issue, it's that they glue the die to the carrier and if you overheat the chip too much the solder balls "blow out" and short under the die. That's sunk.

I'll take a look into the VRMs.

C
legendary
Activity: 2002
Merit: 1051
ICO? Not even once.
Cool.

Out of dozens of GPUs over the years, I only ever had one particular model (GV-N75TOC-2GI) die because it had weak VRMs. I think 5 out of 6 died within months.
After the RMA repair process the same cards still work flawlessly.
legendary
Activity: 1901
Merit: 1024
I presume that if the VRM gets shorted, the PCB will be damaged, at least on mid/high-end cards.

In the past, like 6+ years ago, most problems were due to bad solder between the GPU and the PCB, so a reflow helped. Now I think it's mostly the VRM or the GPU memory going bad.
sr. member
Activity: 420
Merit: 251
Nice. I'd be interested to see how many of the cards you attempt to repair can actually be repaired. 
I have a friend in the component-level repair industry and he says that most GPUs die because their VRMs are either terribly designed or badly cooled. The graphics chip itself rarely dies. He doesn't actually repair graphics cards, though, due to the lack of schematics, which he says makes the process more time-consuming.
Good luck, I'll be following this thread.
legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
Reserved for status. Let's roll....
legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
Reserved for tips and tricks
legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
So I've been fixing Titans, Neptunes, Monarchs, Singles, Avalons, and a whole bunch of other mining hardware over the years, but for some reason never really fiddled around much with GPU cards. They blow up too, and I see them on sale on eBay all the time. I need a challenge, so I thought I would start a thread on my observations in fixing them where possible, developing techniques that work, and figuring out how to tell a card that can be fixed from a brick.

As normal, I will post my thoughts below and see what I can come up with. First up I need to find some dead cards to practice on....

Background: Years of doing SMD repair on electric car power controllers (400v/500a) as well as miners (.6 volts, 1000 amps) and other small things. I prefer to use hot air rework tools, and I like to use pre-heat to keep from roasting components. I don't use the toaster to repair boards. :-)

Let's see where this goes.