*nod* I'd agree, the Monarch is not going to work very well in racks in a data center. The more I watch reports of failures, and the more I think about it, the more I see that the big failure here is that it's throwing heat in 3 directions:
- There's the heat coming off the radiator on the end. A lot, but well managed; it blows out and away.
- There's the heat coming from the back of the board. Also a lot, but going sideways.
- And there's the heat coming off the tops of those FETs. Just drifting off the other side.
That's probably the killer: if you put a Monarch next to another Monarch, the heat from the back of one set of FETs blows onto the front of the one next to it, making those FETs hotter. And then the next one you put down is going to get even hotter air blown on it from its neighbors. Eventually the FETs overheat, conduct, and short out. Or they throttle a lot and run slow. But with limited heat transfer ability I could see a top FET overheating before the temp sensor on the bottom picks it up.
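To put rough numbers on the stacking problem, here's a back-of-envelope sketch. Everything in it is an assumption for illustration: the 25 C room air, the 8 C each board adds to the air it passes along, and the function name itself. None of it is measured from a Monarch.

```python
def inlet_temps(ambient_c, rise_per_board_c, boards):
    """Air temperature hitting each board in a side-by-side row.

    Assumes (hypothetically) that every board dumps its sideways exhaust
    straight onto its neighbor, warming that air by a fixed amount.
    """
    return [ambient_c + i * rise_per_board_c for i in range(boards)]


# Made-up numbers: 25 C room air, each board adds 8 C to the air it passes on.
for slot, temp in enumerate(inlet_temps(25.0, 8.0, 4), start=1):
    print(f"board {slot}: inlet air {temp:.0f} C")
```

With those made-up numbers the fourth board in the row is already breathing 49 C air before its own FETs add anything.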
I think that's the #1 problem with these things (technically): If you don't have some air flow, a heat sink is nothing but a metal blanket on a part. And heat sink efficiency depends completely on the temperature difference between the part and the ambient air. Blow hot air on a FET with a heat sink and it's not going to work. Blow air front to back, and you will just cavitate the air on the back heat sink (essentially fighting with the fan), and totally miss the tops of the FETs (due to the water block).
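The temperature-difference point is just the usual steady-state rule of thumb: part temperature is roughly air temperature plus power times the sink's thermal resistance. A minimal sketch, where the 5 W per FET and the 10 C/W sink-to-air figure are hypothetical numbers I picked, not Monarch specs:

```python
def part_temp_c(air_c, watts, theta_c_per_w):
    """Steady-state part temperature for a sink with thermal resistance
    theta (C per watt) sitting in air at air_c."""
    return air_c + watts * theta_c_per_w


# Hypothetical FET: 5 W dissipated, 10 C/W from part to air.
print(part_temp_c(25.0, 5.0, 10.0))  # cool inlet air -> 75.0
print(part_temp_c(50.0, 5.0, 10.0))  # neighbor's exhaust -> 100.0
```

Same sink, same power; the only thing that changed is the air, and the part follows it degree for degree. Killing the airflow shows up as a bigger C/W number, which is the "metal blanket" case.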
Top to bottom fans would be better, but then you need to move the heat into the hot aisle with baffles or something. And the bottom of the tray will limit the air flow. But in any case, the bottom FETs are going to get hotter and conduct more than the top ones. Which will result in the bottom ones conducting too much, going into thermal runaway, and boom.
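Here's a toy model of that runaway loop, strictly as a sketch. It assumes a FET's dissipation creeps up linearly with its own temperature (a simplification), and every number in it (5 W at 25 C, 10 C/W, 1% more heat per degree, a 150 C failure limit) is made up for illustration.

```python
def settle_temp_c(air_c, p0_w, theta_c_per_w, alpha_per_c,
                  limit_c=150.0, steps=200):
    """Iterate T = air + theta * P(T) until it settles.

    P(T) = p0_w * (1 + alpha_per_c * (T - 25)) is an assumed linear model.
    Returns the settled temperature, or None if it passes limit_c first.
    """
    temp = air_c
    for _ in range(steps):
        power = p0_w * (1.0 + alpha_per_c * (temp - 25.0))
        new_temp = air_c + theta_c_per_w * power
        if new_temp > limit_c:
            return None  # treated as runaway in this toy model
        if abs(new_temp - temp) < 0.01:
            return new_temp
        temp = new_temp
    return temp


print(settle_temp_c(25.0, 5.0, 10.0, 0.01))  # top FET, cool air: settles near 125
print(settle_temp_c(45.0, 5.0, 10.0, 0.01))  # bottom FET, preheated air: None
```

With these invented numbers a 20 C difference in inlet air is all it takes to go from "hot but stable" to "never settles under the limit", which is the top-versus-bottom problem in a nutshell.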
The second thing is a basic problem: Those FETs are just packed in too damn close. On the Singles you had six FETs in a small area, but then you had board space for the heat to dissipate. Here you have 18 FETs on one side, 18 on the other between two of the hottest things on the board, and the FET drivers in the middle (which generate heat because they have to switch the gates under a 24:1 voltage ratio). There's just nowhere for the heat to go but out the back, and the high density means that a lot of heat needs to be moved off the board quickly.
The back heat sink doesn't get too warm, which sounds nice except that it means the thermal transfer from the board to the sink is limited. That's why the FETs went to 100 C while the heat sink on the back was cool to the touch. A front fan fixed that, but that's still not optimal if you have more than one Monarch.
Kind of reminds me of the Singles: Putting half the FETs on the ends of the board was a good idea. But putting the other half in the center of the board, between the heat sinks for the chips, was a bad idea. That's why you could either run them like banshees with full fans in the cases, or more quietly if you took them out of the case and put a single fan blowing air in from the *side* against those center heat sinks.
Ugh.
I'm getting a blown one in tomorrow: it shorts power supplies and the pump was leaking. I'll post what I find as I fix it. Could just be a shorted FET; if so, removing the FETs on that side should get half of it going, and putting on a new FET should fix it. We'll see.
Actually, if this one's plumbing is bad as well, I might try to splice in one of the sedion water blocks I used on a jally and put that on the back of the board while purging the water loop. Then *all* of the heat would go to the radiator and out; that might do it. Or just put a water block with its own radiator on the back and see what happens to the FETs on the front.
Hm.