
Topic: Hacking KNC Titan / Jupiter / Neptune miners back to life. Why not? - page 35. (Read 76793 times)

hero member
Activity: 895
Merit: 504
lightfoot,

I got the controller and bridges back. Right after turning on the PSU, I knew the controller would work from the red blinking LED on the RPi, and it did. My other controller started working too. I am guessing the PSU was at fault before. Thanks.

BTW, I have 2 heavy duty bridges from qberty for sale. 0.5BTC each including shipping. If anybody needs one, please PM me.
legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
Update: Parts in, boards being fixed, always nice. Will tackle the one controller board that got totally burned later.

In the meantime, speaking of "burn" this is why you really should run your miners at a bit less than "full fucking blast" (yes, that is a Y connector. No, the house is not in danger of burning down. Yes... Well yes.)


Must have been some dirt. Hint: Don't unplug a cube with power on or plug in with power on. Sad things happen.

Sad:
Pulling the power molex sockets off is a bitch on wheels by the way. I recommend cutting the top three connectors, and melting normal solder onto the 3 bottom pins, then getting the connector off, then getting the cut top pins off.

More later.
copper member
Activity: 2898
Merit: 1465
Clueless!
Got a PM from Lightfoot he says 'go for it' I don't have time (my rant on stuff related on knc swedish thread link below) only 2 days off every 12 days so anyone
want to beat me to test this feel free ..in link below

https://bitcointalksearch.org/topic/m.13751378

but looking like 'spaghetti cables' may work also (no need for titan daughter board)

but I have all the parts old pi B+ 512mb and LCD to use...the 'cloned qbert supplied daughter board' and MY old Oct 2013 KNC 4 port Jupiter board

but again may not get to it ...anyone with same parts from an old Jupiter/Saturn/Mercury/Neptune feel free to give the above cable option a try (hey if its sitting about wtf)

me I'll get to it when I can and let folk know if the daughter board option worked....but gbert who made the clone titan bridges has the schematics for such someone w/o

a cloned daughter board and an old jupiter or whatever card laying about and a pi could give the spaghetti cable idea a shot


again if it comes up on the LCD and/or the GUI as a Titan with no cubes we are probably golden


also edited the above to my post with pics for clarity check out the reply on the other thread above for more details (ramblings or whatever)


hero member
Activity: 895
Merit: 504
lightfoot,

Need your address to send in my sick controller. Changing bridge and RPi didn't work. Controller still restarts every 2-3 mins, doesn't recognize any cube, LCD screen stays dimly lit w/ no bright LED or any LED.
Hey Hawk, bad news:

There's nothing wrong with your board.

More specifically I put on a test beaglebone and checked it with a neptune. Happy light, green light, mines fine, display works. Ran for a few hours, FPGA cool.

Popped your titan code into a sample titan. Works fine. Code ok.

Popped a default Titan code into it, works fine. Hooked it up to hash, works fine.

HOWEVER I did notice something: Your Pi doesn't have the standoff to the controller board. If I prop up the Pi by having the network cable on something, nothing works. My guess is the Pi loses contact on its header pins when not parallel to the bridge board. So you need a standoff there to keep it parallel to the board underneath it.

I'll let it run on test overnight, but sometimes nothing is really wrong....

That sucks, but I am happy. For some reason, I couldn't get it to work; the Pi would constantly restart. Have you tried using the image on the SD card I sent? Is the LCD screen working, showing IP address and Titan firmware etc.? I have the plastic spacer which I could install. But I'm curious why it was acting up when I tried to use it with or without any cubes. Please PM me your BTC address for shipping cost and your time.
Thanks.
legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
lightfoot,

Need your address to send in my sick controller. Changing bridge and RPi didn't work. Controller still restarts every 2-3 mins, doesn't recognize any cube, LCD screen stays dimly lit w/ no bright LED or any LED.
Hey Hawk, bad news:

There's nothing wrong with your board.

More specifically I put on a test beaglebone and checked it with a neptune. Happy light, green light, mines fine, display works. Ran for a few hours, FPGA cool.

Popped your titan code into a sample titan. Works fine. Code ok.

Popped a default Titan code into it, works fine. Hooked it up to hash, works fine.

HOWEVER I did notice something: Your Pi doesn't have the standoff to the controller board. If I prop up the Pi by having the network cable on something, nothing works. My guess is the Pi loses contact on its header pins when not parallel to the bridge board. So you need a standoff there to keep it parallel to the board underneath it.

I'll let it run on test overnight, but sometimes nothing is really wrong....
copper member
Activity: 2898
Merit: 1465
Clueless!
I keep repeating to myself: "I will not build a miner, I will not build a miner". While seductive, that way lies DOOOOOOM!!!!!!

Anyway, with nothing else to do but things I should be doing I'm screwing with the neptune/titan firmware. KNC has all sorts of weirdness in these config files, I got a Neptune to think it was a Raspberry pi but it doesn't survive a reboot. System goes "WHAT THE HELL AM I???" and resets the files.

Need a bigger melon-baller here to do a proper lobotomy....




yeah ...getting a titan board to work with a bbb would be 'epic' esp if glen tarkin supported such with a version of his firmware


by the by still trying to be 'brave' and see if that 'clone bridge' i have will work with a pi and the below (see pics) Oct Jupiter 550gh 4 port board.

I figure I could just try it.....put the LCD in the slot..NO CUBES and if the LCD yelled it was OK and a Titan and the GUI came up with no cubes

I'd be golden

(got to be out of my frigging mind)

OR

I could simply send the works to YOU ....the proper Raspberry Pi B+ 512mb for a Titan (me thinks that is correct or at least what I got from KNC rep
for having about as a spare with NOV 2014 PI fail rates on shipping etc)....the 'clone titan bridge (identical in all ways to KNC version incl silk screen logo lol)
and the LCD screen I don't use

Then once it is in your hot little hands perhaps you could look it over and just use 'pin cables' WITHOUT THE NEED FOR ANY proper daughter board placement to the PI and just
plain 'get around' needing a 'daughter board clone' at all....lots of spaghetti cables but hey maybe we are 'overthinking' stuff ..it is NOT like KNC put any real thought in this (IMHO)
easy, cheap and lazy could be to our 'advantage' on such a unicorn project.......would be ugly as hell but wtf ..if it worked a lot of folks would be very happy Smiley..ju

but anyway options...or I can just break down someday when I get time and just 'blow it up' myself Smiley (ah hobbies the joy of despair yet joy of destruction from cool sparky explosions)


let me know here are the pics for all to see ..just in case they have an ugly OCT 2013 Jupiter/Saturn/Mercury BBB setup the same ....still from back in the day...
if the links don't take look under the Jupiter 550gh Album on the link below ...All should be public ...my somewhat 'dubious' evil KNC Shrine ..don't ya know Smiley

lostgonzo.imgur.com





An 'uglier' KNC 4 port card ..has likely never been shown on here.... ugly duckling that it is....I'd like to see if we could make it into a Swan' (or roast duck whichever) Smiley

If nothing else this works..we probably will have a 'good idea' of why the 4 month delay on Neptune's (going into data halls as they yanked out the Jupiter's) and on
a side note what happened to the controllers for these ...Jupiter units and likely Titans units..thus their also 4 month delay.....why make new port boards when you
can recycle! (assuming the above works and I don't blow up everything ..big assumption that this works) Sad

if that is the case (conjecture on my part) you have to give KNC credit they really, really do have this whole 'Evil Genius Empire' thing down pat! Smiley


EDIT: Got a PM from Lightfoot he says 'go for it' I don't have time (my rant on stuff related on knc swedish thread link below) only 2 days off every 12 days so anyone
want to beat me to test this feel free ..in link below

https://bitcointalksearch.org/topic/m.13751378

but looking like 'spaghetti cables' may work also (no need for titan daughter board)

but I have all the parts old pi B+ 512mb and LCD to use...the 'cloned qbert supplied daughter board' and MY old Oct 2013 KNC 4 port Jupiter board

but again may not get to it ...anyone with same parts from an old Jupiter/Saturn/Mercury/Neptune feel free to give the above cable option a try (hey if its sitting about wtf)

me I'll get to it when I can and let folk know if the daughter board option worked....but gbert who made the clone titan bridges has the schematics for such someone w/o

a cloned daughter board and an old jupiter or whatever card laying about and a pi could give the spaghetti cable idea a shot


again if it comes up on the LCD and/or the GUI as a Titan with no cubes we are probably golden










newbie
Activity: 11
Merit: 0
Hey, if u have the ability to build a miner more power to you!... u would make a fortune, thats the only reason why ASIC companies exist. Those who supply the shovels in a gold rush era are the only ones who really profit =)
No no no no no no nonononononono.

I would try to make the damn thing reliable, solid, stable, stuff like that. Which would make it totally uneconomical to sell in any quantity when the fucking shysters are selling junk that works at first, but when it blows up they're gone (see: exit strategy). Oh and I would be up against people who raise millions in "venture capital" with either pre-orders, groupons, or fuck knows what else is used to finance today...

That's the #1 problem with the concept of building miners: It's a race to the absolute bottom in quality. Not worth fortunes.

Instead I'll sharpen shovels, supply tips on making the shovels better, stuff like that. Titans can take a very interesting edge.... :-)

LOL
legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
Hey, if u have the ability to build a miner more power to you!... u would make a fortune, thats the only reason why ASIC companies exist. Those who supply the shovels in a gold rush era are the only ones who really profit =)
No no no no no no nonononononono.

I would try to make the damn thing reliable, solid, stable, stuff like that. Which would make it totally uneconomical to sell in any quantity when the fucking shysters are selling junk that works at first, but when it blows up they're gone (see: exit strategy). Oh and I would be up against people who raise millions in "venture capital" with either pre-orders, groupons, or fuck knows what else is used to finance today...

That's the #1 problem with the concept of building miners: It's a race to the absolute bottom in quality. Not worth fortunes.

Instead I'll sharpen shovels, supply tips on making the shovels better, stuff like that. Titans can take a very interesting edge.... :-)
legendary
Activity: 2450
Merit: 1002
I keep repeating to myself: "I will not build a miner, I will not build a miner". While seductive, that way lies DOOOOOOM!!!!!!

Anyway, with nothing else to do but things I should be doing I'm screwing with the neptune/titan firmware. KNC has all sorts of weirdness in these config files, I got a Neptune to think it was a Raspberry pi but it doesn't survive a reboot. System goes "WHAT THE HELL AM I???" and resets the files.

Need a bigger melon-baller here to do a proper lobotomy....



Hey, if u have the ability to build a miner more power to you!... u would make a fortune, thats the only reason why ASIC companies exist. Those who supply the shovels in a gold rush era are the only ones who really profit =)
legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
I keep repeating to myself: "I will not build a miner, I will not build a miner". While seductive, that way lies DOOOOOOM!!!!!!

Anyway, with nothing else to do but things I should be doing I'm screwing with the neptune/titan firmware. KNC has all sorts of weirdness in these config files, I got a Neptune to think it was a Raspberry pi but it doesn't survive a reboot. System goes "WHAT THE HELL AM I???" and resets the files.

Need a bigger melon-baller here to do a proper lobotomy....

newbie
Activity: 11
Merit: 0
Well if the failure is under the chip in the chip itself then one is fucked. Removing the hashing chip is possible (did it with a Neptune) but reballing it is impossible for a person with 80mm reballing tools. Reballing a simple BGA is enough of a pain, trying to do a Titan would be damn near impossible. Still, I am wondering, maybe there is something else on the board that could be shorted. If we had a damn schematic this would be simple. But we don't, unless someone does........

I suppose I could try applying a large current voltage to blow apart the short, but that is a serious one way trip. I'll talk to the client first and get their guidance.

Yes, that might be a bit extreme, but if the unit cannot function at all, then it is already toast. There are basically only two effective ways to map out the layers and traces. You can sand down a donor board, layer by layer, photographing each layer. Or, you can use x-ray and have each layer mapped that way. In any case, with the new 3D PCB printers coming out, we are getting closer now to being able to replicate these at home. You really only need a Gerber file to upload to the new printers for the printing process. The costs have to be computed however. If it costs $150 to replicate a board, without the ASICs, then it is a viable project, solving almost all of the inherent problems. I suspect that the traces on the existing boards are too thin for the amperage coursing through them.

The only other problem is the ASIC chips. Those can be backward engineered. I have searched for the patents on the ASICs themselves. I have yet to find a single patent for these in Sweden, or anywhere else for that matter. I am sure there are Chinese companies willing to backward engineer them, but now we are back to cost.

http://www.ast.co.il/#!knc-alchip-ast-titan-ltc/cdt1  I reached out to these companies (except KnC) to see if there were leftover ASIC chips which could be purchased in bulk. I heard nothing back from them.

All other components are readily available on the market.

Finally, there is the risk associated with Litecoin price (basically tied to Bitcoin performance). By the time you get through the whole process, it could be another year and, depending on the number of first-run chips pre-purchased, the demand for re-manufactured boards (ASIC and all) could be more than what they would produce on a profitable basis, through mining. Based on current Titan costs on eBay, and the present profitability of the units mining, it could effectively be very profitable, provided the technology is properly upgraded, with the board architecture tweaked to higher performance standards, especially the lower grade traces on the layers, which are probably not enough for the amperage load they presently see.

Anyway.... it is only a thought at the moment. I know how sensitive KnC would be to have their products re-manufactured by a third party, even though they are out of the consumer market, and no longer have these produced for themselves. More or less, this would be more easily accomplished under a licensing agreement with KnC, but would need to include some re-engineering of the architecture, regardless.
legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
Well if the failure is under the chip in the chip itself then one is fucked. Removing the hashing chip is possible (did it with a Neptune) but reballing it is impossible for a person with 80mm reballing tools. Reballing a simple BGA is enough of a pain, trying to do a Titan would be damn near impossible. Still, I am wondering, maybe there is something else on the board that could be shorted. If we had a damn schematic this would be simple. But we don't, unless someone does........

I suppose I could try applying a large current voltage to blow apart the short, but that is a serious one way trip. I'll talk to the client first and get their guidance.
newbie
Activity: 11
Merit: 0
And a sad, sad little epilogue here...

Started removing the erickson power supplies. One came up a mess, it's literally got a smoke mark under the supply, right where the power supply's control chip is. Sure enough the chip is cracked and burned.

My guess is the back-feed from the failed 12v supply took out the 3.3 line and burned out things on it. It's shorted, but even after removal the board is still shorted. My guess is the reason these boards have burns under the main chip is that's where the 3.3v line goes to power the hotel circuits on the main hashing chip. And with that shorted, board no work no more.

Crap. Must have been a really big power supply. Oh well, will take a break for awhile and think about next steps with the remaining units.

So you are saying the entire board is completely toast, not repairable? If there is a short between the 3.3V and the 12V power feeds, it would affect the entire chain, back to the 10-pin connector. Next question: Is it possible to sever/isolate the entire chain, from ASIC, through associated VRMs, to 10-pin to keep that path from affecting the other operating pathways/ASICs, in effect disabling that entire branch of the board? Are these particular pathways accessible on layer 1 (surface layer) or, are there other layers of the board involved, which are not accessible (much work, much risk)? In effect, if you can isolate the entire branch, severing the connection with the rest of the functioning branches/board, you would have recoverable branches allowing some degree of continuing functionality.

Yep, it sure sounds like KNC just took as many short-cuts as possible with these units. There were no provisions made for fail-safes to protect the various branches of the board, should a terminal failure occur with any of the branches independently. :/

This thread is highly educational and well-valued. Smiley
legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
And a sad, sad little epilogue here...

Started removing the erickson power supplies. One came up a mess, it's literally got a smoke mark under the supply, right where the power supply's control chip is. Sure enough the chip is cracked and burned.

My guess is the back-feed from the failed 12v supply took out the 3.3 line and burned out things on it. It's shorted, but even after removal the board is still shorted. My guess is the reason these boards have burns under the main chip is that's where the 3.3v line goes to power the hotel circuits on the main hashing chip. And with that shorted, board no work no more.

Crap. Must have been a really big power supply. Oh well, will take a break for awhile and think about next steps with the remaining units.
legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
Ok, tighten up this is where it gets *really* interesting.

So I'm fuzzing both this board and a dead Neptune. The question is where the fuck is everything and what the fuck is going on?

Let's start with the basics: SCL requires four things:

An SCL clock
An SCL signal line
A vcc (typically 3.3v)
A ground

We're going to assume here for laughs that SCL is what these clowns are using. Ok. So what does the 10 pin connector do?

Well, we can reverse engineer things. We know that on the Titans pins 4 and 6 are shorted. Great. We also know that pin 8 is broken and pin 2 "works". And we know what an LM75 and an EEPROM look like.

Here is where things go on a Neptune:
SCL clk: pin 2 on 10 pin. Ok that makes sense as a clock.
SCL signal: pin 5 on 10 pin. Once again, ok.
Ground: Frame ground and pin 10 on 10 pin. Fair.
vcc: Here is the weirdness. On a Neptune it goes NOWHERE! None of the 10 pins register.
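
The Neptune tracing above can be jotted down as a small lookup. This is only a sketch: the four-line description (clock, signal, vcc, ground) matches a two-wire I2C-style bus, where the lines are usually called SCL and SDA, but that reading is an assumption and is not confirmed for these boards. The role names and the helper `unrouted_roles` are hypothetical, made up for illustration.

```python
# Neptune 10-pin ribbon as traced in the post above.
# Role names are illustrative; "signal" is the data line
# (it would be SDA if this really is an I2C bus, which is an assumption).
NEPTUNE_RIBBON = {
    "clk": 2,        # clock on pin 2
    "signal": 5,     # signal/data on pin 5
    "ground": 10,    # pin 10 (also frame ground)
    "vcc": None,     # none of the 10 pins register as vcc
}


def unrouted_roles(pinmap):
    """Return the bus roles that have no pin on the ribbon."""
    return sorted(role for role, pin in pinmap.items() if pin is None)


print(unrouted_roles(NEPTUNE_RIBBON))
```

Anything this reports has to be supplied from somewhere other than the ribbon, which is the question the next paragraph chases.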

So how the fuck does a Neptune generate the power? It doesn't come from the ribbon and there is no other power supply.... Aw fuck-tarts.... Checking the 14 pin connectors on the back I see that pin "2" on the ericksons which is flagged as a "remote" or something. My guess is KNC pulled the 3.3v for the hotel stuff from there. Explains why some Neptunes don't appear till powered up unlike Titans.

 Well, if that is bussed over, how about clk and signal? Checking shows yup, they show up on the Ericksons as well which means that everything is on one SCL bus. Great.

So what is with pins 4,6,8 on a Titan? My guess is they separated things a bit. Maybe. However if they are connected then a blown supply could cause a ground loop and bye bye circuits.

More importantly since the power is generated by the 3.3v supplies on the Titans, the failure could be anywhere along that supply chain. Time to remove power supplies! AAARRRGGGHHH!
legendary
Activity: 1098
Merit: 1000
The second one is a bit more interesting. Try going to -.0366 or so and 50mh on the chip. Watch to see if the supplies turn on for awhile then turn off or start at 0 volts. Also check the heat sink connection, and if there is junk/crap all over the power supplies especially the ones on the sides.

Will have a play on Monday when I'm at the location my machine is at and let you know.

Question, on ASIC 4 DIE 1 ... does that DCDC work fine after a power cycle? or is it permanently like that?
Also, due to only having one DCDC powering the ASIC I would suggest lowering voltage a bit more or lowering ur clock, that remaining DCDC is pumping over 46A which is way beyond spec for these DCDC's ... 43A is bout the max I would recommend for longevity. My upcoming firmware will attempt to power cycle the DCDC's to bring back the DCDC 1 thats messed up ... so in ur case if its permanently damaged u may have to turn off that ASIC entirely w/ my upcoming firmware release to prevent constant bfgminer restarts.

ASIC 5 DIE 2, I theorize the voltage is higher because there is barely any load on the DCDC's at that clock setting...compared to the load of a normal ASIC clock setting.

Re Asic 4: All the dies that do that have been like that since I got the machine. I have tried lowering voltages/speeds but it doesn't seem to make any difference to the Amps so just settled on the compromise of slightly over.  Server room is air conditioned so not really worried about temps and has probably helped to keep them going.

I intend to do as above and mess around with them all on Monday to see if any of the ones I currently have off can be brought back to life through your firmware.

Thanks to you both Smiley
legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
My wife wonders why I spend the day slaving over a hot iron (soldering iron).....

Ok, more research on the shorting Titans: Pin 4 and 6 are very interesting. They go to not only the LM75 and the EEPROM, but also to the four itty bitty chips U19,18,17,9 and the big cap C150 and I would bet that weird cap C47 as well.

Each of these chips has a small capacitor between that line and frame ground. What I'm beginning to think here is that the power supply failure blew apart the (non) isolation between the 12 volt rail and this little 3.3 volt somewhat isolated rail which went between boards and blew up a lot of stuff.

Unfortunately removing those caps and chips did not clear the short. More knowledge but still stuck. I was hoping it was a shorted cap, that could have been enough to sink everything. Drat.

And for those following along, here is a map of the 10 pin connector along with what you should see to ground and what I *do* see to ground on bad boards that blow up fpgas....
Pin      Good                Bad        Connects to
pin 1    open                open
pin 2    6.6k                5.6k       pin 8 on rom, pin 1 on lm75
pin 3    open                open
pin 4    .9k                 short      pin 2 on lm75, pin 2 on rom
pin 5    open                open
pin 6    .9k                 short      same as pin 4
pin 7    6.4k                5.59k
pin 8    hopped to megohms   39 ohms
pin 9    open                open
pin 10   short               short

Edit: Cutting the lines to pins 2,4,8 results in a controller that doesn't shut down but still doesn't work even with 12 volts off. Which means those power the pins to the LM75 and EEPROM to wake those up. Damn!
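
The good/bad readings in the table above can be turned into a quick bench check. This is a rough sketch, not a diagnostic tool: the 50% threshold, the choice to treat "megohms" as at least 1 MΩ, and the function name `collapsed_pins` are all assumptions made for illustration. It only flags pins whose resistance to ground has dropped far below the good-board value.

```python
import math

OPEN = math.inf   # meter shows open circuit
SHORT = 0.0       # meter shows a dead short

# Resistance to ground in ohms, taken from the table above.
# Pin 8 "megohms" is entered as 1e6 (assumed lower bound).
GOOD = {1: OPEN, 2: 6600.0, 3: OPEN, 4: 900.0, 5: OPEN,
        6: 900.0, 7: 6400.0, 8: 1e6, 9: OPEN, 10: SHORT}
BAD_BOARD = {1: OPEN, 2: 5600.0, 3: OPEN, 4: SHORT, 5: OPEN,
             6: SHORT, 7: 5590.0, 8: 39.0, 9: OPEN, 10: SHORT}


def collapsed_pins(measured, good=GOOD, ratio=0.5):
    """Return pins reading below `ratio` times the good-board value."""
    suspects = []
    for pin in sorted(good):
        expected = good[pin]
        if expected == SHORT:
            continue  # pin is ground, a short here is normal
        if measured[pin] < expected * ratio:
            suspects.append(pin)
    return suspects


print(collapsed_pins(BAD_BOARD))
```

With the bad-board column this picks out pins 4, 6 and 8. The smaller drops on pins 2 and 7 stay under the threshold, so they are not flagged; whether those matter is not settled by the readings alone.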
legendary
Activity: 2450
Merit: 1002
@ Lightfoot ...

hey dude, give the tech docs a read for the DCDC's ... they are highly configurable, switching frequency and a whole host of settings like 90+ settings can be configured on these DCDC's ... u know more about the electrical side of things than myself. Maybe you can figure out some more optimal settings to run these DCDC's at, I believe KNC largely just runs them stock cuz I dont see much in the code that really configures the DCDC's at all.

But all the safety precautions Ive coded into my firmware are based off the tech docs, such as overtemp & overcurrent situations etc...
full member
Activity: 133
Merit: 100
I must say - Lightfoot is an amazing contributor and an incredible person.  I sent him 3 broken KNC controllers (1 FUBAR) he got them fixed in a VERY timely manner, Always professional in conversation.  A TRUE asset to the BTC world.  I have been mining for about a year and I have to admit it was a little weird sending someone I have never met some equipment to evaluate and repair.  But I will vouch for this dude forever!  I wish all of my "chances" that I have taken in the BTC world would have gone this smooth.  

Keep doing what you are doing Lightfoot.  All others could learn from your example!

Sincerely, Smiley

Boomin
legendary
Activity: 2450
Merit: 1002
Further to my post a few pages ago, https://bitcointalksearch.org/topic/m.13519222

I turned my machine off shortly after as wasn't really profitable to run anymore in the UK, but the recent spike in rates on nicehash brought it back to life, after talking with GenTarkin I bought his custom firmware and it did cure the problem.

Here are a couple of other problems that my machine has, I'm sure you've come across them already but just for info in case you haven't



Asic 4, Die 1 (The Half Running Die)
I have quite a few of these across my 6 cubes, I presume this is just a faulty asic chip and nothing can be done

Asic 5, Die 2 (The Half Speed Die)
Have a couple of these as well, they will only run at much reduced speeds as the voltages are very high, these are a mystery and am curious on the problem.


Question, on ASIC 4 DIE 1 ... does that DCDC work fine after a power cycle? or is it permanently like that?
Also, due to only having one DCDC powering the ASIC I would suggest lowering voltage a bit more or lowering ur clock, that remaining DCDC is pumping over 46A which is way beyond spec for these DCDC's ... 43A is bout the max I would recommend for longevity. My upcoming firmware will attempt to power cycle the DCDC's to bring back the DCDC 1 thats messed up ... so in ur case if its permanently damaged u may have to turn off that ASIC entirely w/ my upcoming firmware release to prevent constant bfgminer restarts.

ASIC 5 DIE 2, I theorize the voltage is higher because there is barely any load on the DCDC's at that clock setting...compared to the load of a normal ASIC clock setting.
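
For a rough feel of how far over the line that lone DCDC is, here is a back-of-envelope sketch using only the two figures quoted above (about 46 A drawn, about 43 A recommended). The function name `load_cut_needed` is hypothetical, and the result says nothing about which voltage or clock setting would actually get there.

```python
RECOMMENDED_MAX_AMPS = 43.0  # "43A is bout the max" from the post above


def load_cut_needed(amps, limit=RECOMMENDED_MAX_AMPS):
    """Fraction of the present current that must be shed to reach the limit."""
    if amps <= limit:
        return 0.0
    return (amps - limit) / amps


# 46 A on the surviving DCDC works out to roughly a 6.5% cut.
print(f"{load_cut_needed(46.0):.1%}")
```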