
Topic: Hacking KNC Titan / Jupiter / Neptune miners back to life. Why not? - page 24. (Read 76793 times)

legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
I had a Neptune whose controller blew: the light would flash orange, not white, when it came on, and the FPGA felt hot, I mean hot enough to burn your finger. The second controller the guy sent did the same thing: the chip got roasting hot, and it would see the cubes but they would not start hashing.
I wonder why this happens. Note I unplugged both the cube and the display so that is not it. There's something going on in the FPGA.

Maybe it's the lack of driver buffers, maybe the chips just suck. Regardless it's odd.

I'll order 4 more FPGAs, replace the two bad ones I have here and fix two other boards someone sent in to me. Need to get this stuff done :-)
sr. member
Activity: 364
Merit: 250
I had a Neptune whose controller blew: the light would flash orange, not white, when it came on, and the FPGA felt hot, I mean hot enough to burn your finger. The second controller the guy sent did the same thing: the chip got roasting hot, and it would see the cubes but they would not start hashing.
legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
Something else I am noticing: Have a reference Titan plugged into a controller here. Just noticed the FPGA is getting hot. Pulled the Titan, put nothing else on the controller, still hot. Odd. I think it's going to blow the FPGA soon, no biggie as I can fix it, but is this a symptom of some sort?

Anyone else have a hot FPGA? I've seen a few repairs come in where people put heat sinks on the thing, but it's not supposed to run warm. Something else is happening.

C
legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
No problem. That front screw spot is not very strong, the torque from the heat sink coupled with the slight bending needed to put down the sides to get to the screw can cause it to come loose. Then there is no pressure on the die.

The second die probably has a bad power supply. Note that I have seen a number of people try to "repair" the supplies; they are very difficult to remove without a lot of bottom heat, air, and patience. Otherwise you wind up pulling up the traces, which can wreck the cube. Run it at 150 MHz (half of 300) on the remaining supply and feel good knowing that it is keeping the overall power draw at a reasonable level.

C
hero member
Activity: 808
Merit: 502
I just wanted to report on a problem that I was having with a titan cube. It was purchased in new condition as the person/company that bought it never used the Titans.
Unfortunately, when I received the single cube, the top two dies did not work properly. I opened the cube and found that the heat sink had popped loose: KNC must have overtightened the screw on the end, and the base that holds the screw to the circuit board popped loose. I bolted the end back down with parts from the hardware store.

Before I fixed it, the symptoms were that the top die would hash for about 10 seconds, overheat, and shut down. The second die would shut down immediately. I could run the top die at 100 MHz and it would run steadily, though still a little hot. After properly clamping down the single-screw end of the heat sink, the top die runs at 300 MHz with no heating problem. The second die runs at the 175 MHz setting but is using more current than it should. Amazingly, the cube still hashes at over 70 MH/s even though the second die is underclocked. I got the cube hashing at full power. Thanks Chris for the help.

I have another cube with the same problem that I never could get fully hashing, as the two top dies are shut off because the heat sink is not bolted down properly. I will work on that one today and see if I can get it fully operational. I figured out a way to bolt the heat sink down properly: slightly longer screws, plus a washer, lock washer, and nut on the front end of the heat sink. That is my mechanical solution.
newbie
Activity: 3
Merit: 0
Lightfoot, have you traced out pin 6 of the 10 pin connector on a Titan cube? I am reading 39 ohms on a cube here. I thought this was a good working cube, as I was solid on all four dies before I blew the controller board with another cube. I am afraid to fire it up as I only have one working controller until I get the others fixed. I have additional tools arriving over the weekend and then can work on the controllers. Gonna try replacing the FPGA chip again. Will follow your advice on replacing that. Got a preheat table. What do you think?
copper member
Activity: 2898
Merit: 1465
Clueless!
Well, I have a cube with 1 working die... so of course I threw like $50 at it (dumb): the long spiky copper heatsinks and the a15-ippc 3000 rpm fan (33% louder than the 2000 rpm ones on Titans, I was told... er, and 33% more wind too) :-)

So, ASSUMING (always a bad idea to assume with KNC stuff) that does NOT work and it is still a cube with only 1 working die (currently at 24 on putty... they run great all alone at 325 on the adv page lol):

Anyway, if the above does not work, I MAY just get one of these water coolers for the fun of seeing how fast I can get one die to hash... (I need a life) :-)

HOPEFULLY with the repaste/heatsinks/3000 rpm fan I can get another die working out of this cube. I will likely never make my money back, but it is like an 'itch' to try :-)

legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
Nice. I ran a Neptune for a while with a Corsair H100 water block secured to the top; it worked very well for a few months keeping the chip cool until the water block sprang a leak. Then all hell broke loose :-)

Leak detection is a big problem: the stupid things don't have temp sensing in the chip dies and have little to no thermal mass, so the chip can overheat very quickly, before software manages to shut things down. One of the leading causes of Titan die death is the die inside the chip carrier overheating: as soon as that happens the solder BGA grid under the chip melts, extrudes as solder balls, and either shorts the +0.8 V supply lines (if you are very lucky) or shorts the signal lines (which are shared by all dies), which kills the whole board.
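
To put a rough number on "very quickly", here is a minimal back-of-envelope sketch in Python that treats the die as a lump with no cooling at all (the worst case after a leak or a lifted heat sink). Every number in it is an assumption picked for illustration, not a measurement from a Titan or Neptune: the power, the thermal mass, the starting temperature, and the 217 C melting point (typical of lead-free SAC solder; the alloy actually used on these boards is not stated in this thread).

```python
def seconds_to_reach(limit_c, start_c, power_w, heat_capacity_j_per_k):
    """Time for an uncooled lump to heat from start_c to limit_c.

    Simple energy balance: power * time = heat_capacity * temperature_rise.
    """
    if power_w <= 0:
        raise ValueError("power_w must be positive")
    if heat_capacity_j_per_k <= 0:
        raise ValueError("heat_capacity_j_per_k must be positive")
    return (limit_c - start_c) * heat_capacity_j_per_k / power_w


# ASSUMED values, for illustration only:
#   60 W dissipated in one die
#   2 J/K of effective thermal mass
#   70 C starting temperature
#   217 C solder melting point (lead-free SAC, assumed)
t = seconds_to_reach(217.0, 70.0, 60.0, 2.0)
print(f"{t:.1f} s until the solder could start to melt")
```

With these made-up inputs the die reaches solder-melting temperature in about five seconds. Plug in your own estimates; the point is only that with so little thermal mass, software has seconds, not minutes, to react.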

So you might want to continue running that big heat sink thing: It actually cools very well and as long as the compound they used (which is crap) is secure you should be in good shape. Just remember that if you have an air gap on a corner of the chip, that die will extrude, melt, and fail.

Meantime I tried to source the displays and failed (got the wrong part) for the blown display boards. Anyone know what the right DigiKey or Mouser part number is?
hero member
Activity: 868
Merit: 517

This water cooling project looks like it would be a lot of fun. If you had a Jupiter case lying around, it would be kinda cool to retrofit the Titans into the Jupiter case, which would give you the room to set up the radiators and plumbing for the cooling. It is good to know that the Titan board has a standard 115X mount for coolers. That would be very doable, I think, if you happened to have a case lying around that you could configure for the new cooling options. The article didn't show it mounted in a case, but I don't think it would be too hard to do.

copper member
Activity: 2898
Merit: 1465
Clueless!

OK, now we are talking :-) Article on water cooling a Titan :-)

http://cryptomining-blog.com/tag/knc-titan-cooling/

the cooler (i think this is the one) on Amazon below

http://www.amazon.com/Silverstone-Durable-High-Performance-Adjustable-TD02-E/dp/B00U8IS8F8/ref=sr_1_1/176-2500252-4794240?ie=UTF8&qid=1463481063&sr=8-1&keywords=tundra+td02


I MIGHT try this on my cube with only ONE working die....assuming any mods I do will bring none of the others back


er maybe not ...still too scary ...shudder (ping.....pop Titan Cube Sprinkler System)

Anyway, for those braver than myself, for your consideration :-)

full member
Activity: 253
Merit: 100
I was thinking that you could use MEK (methyl ethyl ketone) in a syringe and inject it all around the sides of the chip to try to get it to penetrate the glue. We use this stuff at work and it will loosen up most bonding.

Anyway, I thought I would add this; it might help with spreader removal.
legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
In the meantime I worked on the unit with the lid off. Die 0 seems to be the problem, and sure enough it's shorted according to the output of the dc-dc supply. And the board had 0 ohms on the pins 4,6. So I heated that die to super hot then let it cool down.

The resistances changed. Both on the 4-6 and the dc-dc. Still way too low, but they did change, first evidence a specific die is apparently the cause. Heated it again, back to shorted both places.

So I pulled the die. Real royal bitch. Under it the solder was smeared and there was a bronze ball, which is caused by a really high overtemp. Moral, chip overheated, melted the solder under the die (not the chip) and blew up the board.
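
For anyone repeating this kind of resistance check, here is a small Python sketch of the comparison logic. It assumes you have a reading from a known good cube to compare against. The `reference_ohms` argument, the 1 ohm short threshold, and the 25% tolerance are all hypothetical values chosen for illustration, not KNC specs.

```python
def classify_reading(measured_ohms, reference_ohms,
                     short_below_ohms=1.0, tolerance=0.25):
    """Compare a resistance reading against a known good reference.

    short_below_ohms and tolerance are ASSUMED defaults, not KNC figures.
    """
    if measured_ohms < 0 or reference_ohms <= 0:
        raise ValueError("resistances must be non-negative, reference positive")
    if measured_ohms < short_below_ohms:
        # Effectively 0 ohms, like the shorted pins 4,6 described above.
        return "shorted"
    if measured_ohms < reference_ohms * (1.0 - tolerance):
        # Not a dead short, but well under the known good value.
        return "too low"
    return "in range"


# Hypothetical reference of 100 ohms, used only to exercise the function.
for reading in (0.0, 50.0, 90.0):
    print(reading, "->", classify_reading(reading, 100.0))
```

The useful part is the habit rather than the thresholds: take the same measurement on a cube you know is good, then compare, and re-measure after heating to see whether the reading moves the way it did here.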

legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
So an update: I tried the same trick with a chip that was not bent like a pretzel and the glue would not loosen. Problem is if you push too hard you will rip the plastic underlay that carries all the signals. Too much heat and you lift the balls under the die.

Hm.
legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
Tarkin: The first one was a Neptune, it had the smaller dies. I've only taken apart one Titan, this is it.

Another note: DO NOT BURN THE CONNECTORS. If the ground burns out first the cube will try to find ground through the 10 pin ribbon cable. This will either blow up the cube's die, the controller, or every cube on the controller. I've pretty much got the failure mode mapped out: If the 12v side lifts first the cube shuts down. If the main ground lifts first all of the other cube dies basically have ground dragged through them causing a serious failure.
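
The failure mode described above can be written down as a tiny lookup, sketched here in Python. The names (`FirstToFail`, `expected_outcome`) are made up for illustration; the outcomes are just the two cases from this post.

```python
from enum import Enum


class FirstToFail(Enum):
    """Which side of the cube's power connector burns open first."""
    TWELVE_VOLT = "12v"
    GROUND = "ground"


def expected_outcome(first: FirstToFail) -> str:
    """Return the outcome described in the post for each case."""
    if first is FirstToFail.TWELVE_VOLT:
        # Supply side opens: the cube simply loses power.
        return "cube shuts down"
    # Ground opens: the cube tries to find ground through the ribbon cable.
    return ("ground pulled through the 10 pin ribbon cable: "
            "can blow the die, the controller, or every cube on it")


for case in FirstToFail:
    print(f"{case.value} lifts first -> {expected_outcome(case)}")
```

Either way the practical advice is the same: check the power connectors for heat damage before plugging a suspect cube into a controller that has other cubes on it.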

copper member
Activity: 2898
Merit: 1465
Clueless!



Well, on another note: the Gen Tarkin software went ACK! and switched another die to OFF on the cube with my original dead die from KNC, so it is now down 2 dies.

The Gen Tarkin software goes all SKYNET on me when I try to tweak it... heat issue... do not pass go... do not muck with the die... heh, not a good sign.

When you try to turn a die on and reboot, it gets slapped down during the reboot and put back to OFF after 15 seconds or so lol :-) Tarky Titan software NOT amused :-)

So... I guess I should re-paste stuff and add the heatsinks, better fan, etc. (the stuff from the Swedish guy's video and his parts).

For 4 of the cubes on my 6-cube unit it is 1 week to go until 18 months. The consensus of Maxumark is that my paste is turning to ASIC faerie dust as we speak... and summer heat is coming... break it now trying to fix it, or burn it out later (it's like Sophie's Choice for ASIC dies) :-(

Ack! Guess I'll have to "MAN UP" and just try to start with the cube with only 1 working die... it's one step away from a paperweight anyway... (this could end badly) :-(

I would imagine there are a bunch of folks with Nov 2014 or so Titans who, with summer coming, are also going to learn the joys of 're-paste and heatsinks' on Titan cubes.

Good to have goals :-)

legendary
Activity: 2450
Merit: 1002
Wait, WHAT?!?!? The dies shown here and the dies shown a couple of pages back on the forum are really, really, really freaking different in size! Are these both Titans?
If so, it looks almost like they could have used 2 different node sizes. It's such a substantial difference in size that the revision would have been an insane undertaking. This is weird.

If these 2 pics are different products though (i.e. Neptune vs Titan), then such differences in size are normal.
legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
On a related note, I *HATE* replacing TPS chips. And note that a Rpi can short out the CPU chip, which puts 5 and 12 volts on the lines to the TPS chip and FPGA. The resulting damage sucks to repair.

But I sank the day today fixing this board. It's up and running; now the owner just needs to send over money for a Rpi, I'll order a new one, and send it on back....

C
legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
Is this good news for -> ?

It's interesting, and a solid day's worth of work. However thanks to someone sending over a literal taco board, I was able to work the lid off with the right amount of heat without breaking the chip free.

I'll work on it more next weekend. I need to see if I can do this on a Neptune, then run the unit without the lid (sinking directly to the dies). If so, I can try lobotomizing a die and see if I can clear the solder and get a Nep to work. If so, then I try someone's volunteer.

legendary
Activity: 915
Merit: 1005
Well, this is a step forward: I managed to get the top off a die without destroying the carrier.


Is this good news for -> ?

Quote
Titans, but some of the issues, especially the ones where the Titan shorts the controller are out of reach. Vegas, do you have a stencil to reball a Titan die? The big problem with them is related to the 1.2 volt lines to the die controller shorting out, if you can remove those pads or cut the traces under the board you would win big time.
legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
Well, this is a step forward: I managed to get the top off a die without destroying the carrier.

The dies are larger on this board, I think KNC did two revs:



More interesting, someone tried to fix this board. Bad job!



The secret is to use full board heat, then heat around the side of the chip, then very very gentle pressure to lever up the lid. Bitch on wheels to do, but in this case the top is off.

Hm.