
Topic: Dual PSU rig, one PSU dies for no reason! (Read 2802 times)

sr. member
Activity: 251
Merit: 250
February 28, 2014, 10:07:31 AM
#19
I don't think the software uv will make a difference, unless you flashed the wrong bios; but if you dumped your own bios and modified it, I don't think that is the issue.

If the secondary psu shuts off, it is a matter of time before the whole system becomes unresponsive. So the system also becomes unresponsive sometimes without the psu shutting off?

Some of your risers burnt? I don't think that backfeeding is what burned your risers, but I've never burnt a riser before. I've gotten some risers that were dead on arrival. Are you sure this is not the case?

The CPU and ram seem fine to me.

It would be interesting if you enabled logging in gpu-z so we can check the values at the moment of the crash. Perhaps the voltage is dropping for some reason and the card becomes unstable.
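A minimal sketch of pulling the last few voltage readings out of a gpu-z sensor log. It assumes the log is comma-separated with a header row and that the voltage column is named "VDDC [V]". The column name, the sample rows and the function name are assumptions for illustration only, so check the header of the actual log file first.

```python
import csv
import io

def last_readings(log_text, column="VDDC [V]", count=5):
    """Return the last `count` numeric values from one column of a sensor log."""
    rows = csv.reader(io.StringIO(log_text))
    header = [name.strip() for name in next(rows)]
    idx = header.index(column)  # raises ValueError if the column is missing
    values = []
    for row in rows:
        if len(row) <= idx:
            continue  # skip short or blank lines
        try:
            values.append(float(row[idx].strip()))
        except ValueError:
            continue  # skip cells that are not numbers
    return values[-count:]

# Hypothetical sample rows, only to show the expected shape of the input.
sample = (
    "Date , VDDC [V] ,\n"
    "2014-02-28 10:00:00 , 1.175 ,\n"
    "2014-02-28 10:00:01 , 1.031 ,\n"
)
print(last_readings(sample))  # [1.175, 1.031]
```

With a real log, the text would come from reading the file, and the rows just before the crash are the interesting ones.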

Also, these cards run the vrms really hot, so it might be useful to set the fan speed to a fixed value and see if it makes any difference. Remove --auto-fan and add --gpu-fan 80
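As a rough sketch, the fan change can be applied to an existing cgminer argument string like this. The flags are the ones quoted in this thread (the settings from post #16); the helper name is made up for illustration, and the pool options are left out because the thread doesn't show them.

```python
import shlex

def use_fixed_fan(arg_string, fan_percent=80):
    """Drop --auto-fan and append a fixed --gpu-fan value."""
    args = [a for a in shlex.split(arg_string) if a != "--auto-fan"]
    return args + ["--gpu-fan", str(fan_percent)]

# Settings as posted in the thread (pool URL and credentials not shown there).
posted = ("-I 13 -g 2 -w 256 --thread-concurrency 11200 --gpu-engine 1060 "
          "--gpu-memclock 1500 --temp-target 75 --auto-fan")

new_args = use_fixed_fan(posted)
print(" ".join(new_args))
```

The printed line can be pasted back into the batch file in place of the old options.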

And finally, for now, are you using the latest stable driver (13.12) or the latest beta? Sdk 2.9? Does the driver fail and recover sometimes while you are mining?
newbie
Activity: 43
Merit: 0
February 28, 2014, 07:37:06 AM
#18
Hey!

I tried flashing because I thought it would be the best and most stable. But I'll try software today. However, if flashing made the rig unstable, I'm not sure if software will be much better :(

So the crash takes quite a while actually! Sometimes the rig will run for two days (zero HW errors, less than 3% rejects, running smoothly) and then drop off! I'll go and check on it and all the GPU fans are on, CPU fan is on but the system is not responding so I have to hard reboot it...then it runs again!

Or sometimes the rig drops off but when I go check on it, the OS is still working but cgminer failed because the 2nd PSU died causing three cards to turn off.

I did run a couple rigs before without the splitter using the paperclip method. But I feared that backfeeding is what burned a couple risers after turning one psu on before the other.

The CPU is a socket 1150 3.0GHz G3220 and I have 8GB (2x4GB) G.Skill ram in each.

Thank you so much for helping! :)
sr. member
Activity: 251
Merit: 250
February 28, 2014, 02:22:04 AM
#17
You can uv sapphires by software in windows, I don't know why that doesn't work for you. I think in linux the only way is to flash them.

Now, if a rig is drawing 1525w at the wall, then it is actually drawing 1327w. Divided by 5 cards that's about 265w each, and you should also deduct the power of the ram and cpu, so it is even less. If the crash doesn't take long after you start mining, you should check the psu temperature. If it is not hot to the touch then it was not being pushed to its limits.
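A quick sketch of that estimate, using the 1525w wall reading and the 87% efficiency figure quoted elsewhere in the thread. The 60w allowance for cpu, ram and disk is a placeholder assumption, not a measurement, and the function name is made up for illustration.

```python
def per_card_dc_watts(wall_watts, efficiency, cards, other_dc_watts=0.0):
    """Estimate DC watts per card from a wall reading.

    other_dc_watts is whatever the rest of the system (cpu, ram, disk) uses.
    """
    dc_total = wall_watts * efficiency
    return (dc_total - other_dc_watts) / cards

# Ignoring the rest of the system: 1525 * 0.87 = 1326.75, then / 5.
print(round(per_card_dc_watts(1525, 0.87, 5)))        # 265

# With a hypothetical 60w allowance for cpu, ram and disk.
print(round(per_card_dc_watts(1525, 0.87, 5, 60.0)))  # 253
```

Either way the per-card figure comes out below the ~300w seen at the wall, since the wall reading includes the psu losses.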

If you want to know for certain whether they can handle 3 280x, measure the wattage of a single psu, as I said before.

Another test you can do is to run the secondary psu without the splitter. Just switch it on by shorting the green and black wires, and turn that one on before turning on the primary.

Can you also specify how much ram each rig has and what cpu model (1150 is the socket)?
newbie
Activity: 43
Merit: 0
February 27, 2014, 08:48:48 PM
#16
Alright people, here's an update:

Please see if there's something I can do or something I'm missing. I'm losing my mind and sleep over this.

I'm ready to reward someone in some way if you help me out with this...

Let me re-iterate what I've got setup with some more details:

I've got 10 rigs all setup with five Sapphire DUAL-X 280x cards and I've followed The Stilt's instructions after reading his very informative and educated guides.

My rigs are setup as follows:

- 5x Sapphire 280x DUAL-X Tahiti Memory (1020Mhz clock stock, 1.225V stock voltage)
- Cards show ASIC quality anywhere from 65%-70%
- 5x powered risers (ribbon)
- 2.8Ghz 1150 CPU
- ASrock H81 Pro BTC Mobo (since I'm powering risers, I omitted powering the two extra molex on the mobo)
- Basic sata HD
- 2x Corsair HX850 Gold+ PSU's
- Windows 8.1, latest Catalyst
- each rig on its own 15A circuit!
- Measured ~1525W at the wall with a Kill A Watt (which is already close to the max 1600W the circuit can handle)

Power is split as follows:
- Primary HX850 PSU:    Mobo, CPU, HD, 2x 280x cards, ALL risers
- Secondary HX850 PSU: 3x 280x cards
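A rough sketch of how this split loads each PSU. The 850W rating is from the list above and the 265W per card is the DC estimate given elsewhere in the thread; the 30W per riser and 60W for mobo/CPU/HD are placeholder assumptions, not measurements, and the function name is made up for illustration.

```python
PSU_RATED_WATTS = 850  # HX850 rating from the post

def psu_loads(card_watts, slot_watts, platform_watts,
              primary_cards=2, secondary_cards=3):
    """Estimate the DC load on each PSU for the split described above.

    slot_watts is the part of each card's draw that comes through its
    powered riser; every riser is fed by the primary PSU.
    """
    total_cards = primary_cards + secondary_cards
    primary = (platform_watts
               + primary_cards * (card_watts - slot_watts)
               + total_cards * slot_watts)
    secondary = secondary_cards * (card_watts - slot_watts)
    return primary, secondary

# 265W per card from the thread; 30W per riser and 60W platform are guesses.
primary, secondary = psu_loads(card_watts=265, slot_watts=30, platform_watts=60)
print(primary, secondary)           # 680 705
print(PSU_RATED_WATTS - secondary)  # 145
```

Plugging in 300W per card with nothing coming through the risers gives 900W on the secondary, which is over the rating, so the result depends heavily on the real per-card and per-riser numbers.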

My settings are as follows:

-I 13 -g 2 -w 256 --thread-concurrency 11200 --gpu-engine 1060 --gpu-memclock 1500  --temp-target 75 --auto-fan

I get ~735kh/s & 75C at these settings and this is the MAX I can squeeze out. I tried TC 8192 like Stilt suggests but these cards are much more stable and don't get sick at 11200.

Here's my problem...
At these settings, the cards are pulling 300W EACH (I'm including the riser in this measurement as well), which is way too much for dual 850W PSU's to handle. So I thought I could undervolt them; before buying them I double checked online, and people had success with this model. Running these cards at stock voltage with my settings eventually causes rigs to shut down on their own, so I know it's the power. I ruled out risers and cards by swapping them out. I recently even had one of the secondary PSU's shut down and puff smoke due to me testing three cards on it (3x300W = way more than one 850W PSU can handle).

I extracted the BIOS: https://www.dropbox.com/s/kycqagwn3fdjx2l/Tahiti.rom
I edited the bios to drop the voltage using VBE7 and tried every possible configuration and flashed the cards using atiflash successfully (these cards have a button where I flashed the secondary BIOS).

I edited fields #6 and #0 in "State 1 - Performance" with values 1200, 1188, 1150, 1163, 1088, 1131, 1100. I then ran my batch file to test. The rig power pull was great! I got it down to 1400W or even 1360W at some of these voltages...but.....ALL of these values caused either drivers to crash, some cards to not hash at all or cards eventually go sick. I also dropped my engine clock to 1020 (stock) at these lower voltages to see if that would help...but it didn't.

The ONLY "undervolt" that runs fine is 1200, but that's incredibly close to the stock 1225 and doesn't make a difference in power draw :(

So my options are: take one card out, but I still don't know how I would split up the load. Get bigger power supplies, but I'm already maxing out the circuit. Drop the hash rate: I dropped the core to 1020Mhz to see how much power I'd save, but it's barely 80W for the entire rig. Or hopefully you can help me with some magic to find the happy configuration.

TL;DR: Please help me find a way to comfortably run 5x 280X GPUs on dual 850W PSUs on these rigs.

Sorry for the novel, but you guys are honestly my only hope. I don't know what else to do or who to turn to. I tried sending The Stilt a PM but understandably his message box is locked.

I've spent so much time and energy building this that I would hate to stop all this and sell off my parts....

Thank you for reading this and I really hope I'll hear from you. :)

Sincerely,
llamabucket
sr. member
Activity: 395
Merit: 250
February 27, 2014, 03:55:57 PM
#15
I just don't get along with that adapter.
When I turned on the rig and noticed the second psu switch hadn't been turned on, I turned it on and something popped inside the 2nd psu with a burnt smell.
Maybe you did what I did? Turning on the psu switch after the first one was already on?
sr. member
Activity: 251
Merit: 250
February 27, 2014, 02:06:33 PM
#14
Well, if it is guaranteed to be at least 87%, then the rig is pulling at least that value. It could be more or less. Also, psus work more efficiently on 220 than on 110, so the % may vary for different voltages.

Also, if your electricity supply is not stable and you are receiving less than 110 or 220, then the psu will actually be drawing more watts and producing more heat. I have a 1200w psu that was running really hot with 3 280x, mobo and cpu, and it was working with just 195v. If you are in a developed country this is very unlikely though.

Place the kill-a-watt on just the psu that normally shuts off and see that value.

Also, with which psu are you powering the risers? You might be mixing the psu's there, since the primary psu is connected to all the pcie and you are connecting the secondary as well there.
newbie
Activity: 43
Merit: 0
February 27, 2014, 01:26:07 PM
#13
Alright so here's the lowdown. At the settings I'm running here is what it looks like at the wall:

rig with 1GPU @ full load = 320W
rig with 2GPU @ full load = 623W
rig with 3GPU @ full load = 914W
rig with 4GPU @ full load = 1200W
rig with 5GPU @ full load = 1525W

623-320 = 303W
914-623 = 291W
1200-914 = 286W
1525-1200 = 325W

So it's fair to say that my cards are drawing about ~300W per card...which is more than I anticipated!
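The same subtraction as a small sketch, using the wall readings listed above. These are watts at the wall, so they include PSU losses; the function name is made up for illustration.

```python
# Wall readings (watts) from the post, keyed by number of GPUs at full load.
wall_watts = {1: 320, 2: 623, 3: 914, 4: 1200, 5: 1525}

def incremental_draw(readings):
    """Return the extra wall watts added by each additional card."""
    counts = sorted(readings)
    return [readings[b] - readings[a] for a, b in zip(counts, counts[1:])]

steps = incremental_draw(wall_watts)
average = sum(steps) / len(steps)
print(steps)    # [303, 291, 286, 325]
print(average)  # 301.25
```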

Based on what you guys have said, my dual HX850 Gold+ PSU's are 87% efficient at full load.

So taking the wall reading: 1525W*0.87= 1327Watts

Does this mean that the rig is pulling 1327W TOTAL then?
hero member
Activity: 546
Merit: 500
February 27, 2014, 12:19:56 PM
#12
You are correct, I was completely backwards on the calcs. Efficiency is how much power it needs to draw from the wall to achieve the wattage rating!
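To make the direction of the calculation concrete, here is a small sketch. It assumes the 80 Plus Gold percentages quoted in this thread are the actual efficiency at that load, which a real unit may not match exactly; the function names are made up for illustration.

```python
def wall_draw(dc_load_watts, efficiency):
    """Watts pulled from the wall to deliver a given DC load."""
    return dc_load_watts / efficiency

def dc_load(wall_watts, efficiency):
    """DC watts delivered to the rig for a given wall reading."""
    return wall_watts * efficiency

# An 850W unit at full load and 87% efficiency pulls about 977W from the wall.
print(round(wall_draw(850, 0.87)))  # 977

# A 1550W wall reading at 90% efficiency means about 1395W reaches the rig.
print(round(dc_load(1550, 0.9)))    # 1395
```

So the rated wattage is what the PSU can deliver, and the wall reading will always be higher than that.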
sr. member
Activity: 251
Merit: 250
February 27, 2014, 11:12:39 AM
#11
Use trixx to undervolt them. In settings tick the option force constant voltage. It is normal to see a lower value in gpu-z than the one you set on trixx.

My first guess was one of your gpus died. But 75C is a good temp, so I'm not so sure about that now.

But yes you may be underpowering them and the reason the 1020 watts psu died is just because it was faulty. Also some psus don't have the same power as advertised, I haven't checked your model, but reading a good review that tests the psu on full load for a long time can help you see the psu performance.

Another option would be that the rail the gpus were connected to didn't support that power draw. That is, if the psu doesn't have a single rail for its 12v outputs.

Or if you mixed up the power connectors from 2 psus into one gpu it may cause one psu to discharge on the second one. It can also provide 2 different voltage to the gpu and that can make it unstable.

Also I understand the efficiency works the other way around. A 90% efficient psu still draws more electricity at the wall than it delivers to the rig (though less than an 80% one would). So if you are drawing 1550w at the wall the rig is drawing 1395w (1550*.9). A good psu should be able to run at full load if room temp is below 40 C.

You should test how much each psu is drawing individually to check none are getting overpowered. You should also log the voltage in gpu-z to see if the voltage supply is stable or it oscillates to dangerous values.
newbie
Activity: 43
Merit: 0
February 27, 2014, 10:48:00 AM
#10
So my HX850W PSU's are only 87% at full load, correct?
If that's the case, then .87*850 = 739.5W

Then 739.5*2=1479W total for two units in a rig. Basically, I should figure out how to run my cards so the rig stays below this level and all should be well? :)
hero member
Activity: 546
Merit: 500
February 27, 2014, 10:36:24 AM
#9
http://en.wikipedia.org/wiki/80_Plus

Have a read there to understand rating and efficiency.

When a Power Supply "says" it's a 1050 watt 80 Plus Gold, that means the power on whatever rail is rated at the wattage stated. Your HX1050 is actually rated at 12V@87.5A, which = 1050 watts.  So far so good.  Next, 80+ Gold means that at 50% load the power supply will achieve 90% efficiency, or .9*1050 = 945 watts; at FULL 100% load it will achieve .87*1050 = 913.5 watts.  These are the rates the specific power supplies have been "Certified and tested" to operate at.  Can they do more than that?  Yes, some are better than others, but the 80+ rating is all that is guaranteed from a manufacturer.

Stats on the HX1050
+3.3V@25A, +5V@25A, [email protected], [email protected], +5VSB@3A
80 PLUS GOLD Certified

Percentage of rated load    20%   50%   100%
80 Plus Gold               87%   90%   87%

newbie
Activity: 43
Merit: 0
February 27, 2014, 09:19:53 AM
#8
Sorry, yes the 1050W is what I meant.

I'm not well educated about all this % efficiency stuff so this is interesting. So you're saying that at the wall is going to be different than what it's actually drawing?

I'm going to test out what one card pulls right now by testing the rig with one card, then adding another and taking the difference. I have the GPU's running at these settings:

i13, GPU 1060, MEM 1500, Powertune +20

The cards are maxed out with these settings so I'm guessing I'm REALLY pushing them! Also, I'm having a really hard time figuring out how to properly undervolt. cgminer shows one value and GPU-Z shows another (while mining I see cgminer showing 1.175V and GPU-Z fluctuating between 1.031-1.035V VDDC, are these supposed to be the same?).

When I change voltage in afterburner or trixx or whatever, where should the value change?

I really appreciate your help nachius and money! I'm losing sleep over this :(
hero member
Activity: 546
Merit: 500
February 27, 2014, 09:02:03 AM
#7
I'm going to guess your power supplies are under powered.  The HX1050?  I think you said 1020, but I didn't see that model. The HX1050 has 1050 watts on the 12v, which at 88% efficiency is 924 watts.  IF the cards are in fact pulling near 300 watts each and the 2 PSUs are BOTH plugged into the motherboard, it will also be drawing power from the 2nd power supply.  That being the case, you are drawing more than the PSUs can steadily supply.  You said nearly 1550 at the wall; two 850 watt gold rated PSUs at 90% efficiency is 1530 watts.  I think you're under powered!
newbie
Activity: 43
Merit: 0
February 27, 2014, 08:34:10 AM
#6
MoneyMorpheus, thanks for the help.

The 1st PSU isn't dead, it's the 2nd one that turns off! All my cards are running at ~75C so that shouldn't be a problem. I'd LOVE to undervolt these cards but it seems none of them are reacting to my undervolt settings? I tried Afterburner, Trixx and in cgminer with no visible change in voltage within cgminer!?

Maybe I'm doing it incorrectly. I'll try again today with all those apps and see if I have better luck, and I'm also going to try and measure how much one GPU draws at the current settings and get back to you.

Also, MoneyMorpheus, how do you explain the more powerful 1020W PSU also turning off with three 280x cards?! It should be more than adequate!

I just woke up to other rigs starting to turn off...sigh.
full member
Activity: 210
Merit: 100
February 27, 2014, 08:31:28 AM
#5
Thanks for the reply dude...I have a feeling it's something to do with the powered riser I'm using on the pci-e slot....I'm reading that people are cutting these to avoid using too much/mixing power...I need to read a little further into this...
sr. member
Activity: 251
Merit: 250
February 27, 2014, 01:33:26 AM
#4
Are you sure the 1st psu is dead? With nothing plugged into the psu, short the green wire to any black wire on the mobo connector; it should start if it's working.

As for the second psu, if you are positive the smell came from it and not from one of your gpus, then its short protection, if it had any, didn't work and it failed.

At what temperature do you run your cards? Dual-x don't have temp sensors on the vrms and they can overheat, especially without undervolting. A capacitor may have blown in one of them.

I wouldn't trust these cards above 85 C without some serious undervolting, and I've run gpus at 110 C before without breaking them. It's easier to stick some fans in between them. If your cards are running dangerously hot you will notice some white spots on the back of the pcb over the tracks near where it connects to the psu.

If you are not sure or scared about undervolting then just undervolt it to 1175. It is less than stock, not much, but it will reduce your temperature a bit and I haven't seen a dual-x that is not stable at this voltage.


As for quark same goes for you. If your card died it was probably due to heat and not the psu. Check your temperatures and add more space in between gpus.



full member
Activity: 210
Merit: 100
February 27, 2014, 01:06:12 AM
#3
Hey man, let me know if you get to the bottom of this, as I'm having a similar problem, except I'm powering 4 sapphire dual X off 2 750w bronze psus (1 is powering 2 cards, 1 powered riser and the mobo, and 1 is powering 2 cards). These had been running fine for a good week or so, until today the same thing happened to me...but when I switched psu to test it...as soon as I turned it back on...smoke fizzed out of 1 of the cards and that burning smell filled the room :'( I wanted to cry....luckily the shop swapped both psu and gpu today as well..however the same thing is starting to happen again...I'm scared to keep trying it as I don't want to mess any more cards up. Even if they were drawing 300w each..I should still have a bit of leeway...I undervolted one rig using Trixx and tried some sensible settings but still 1 psu turned itself off? Help plssss!!
newbie
Activity: 43
Merit: 0
February 26, 2014, 09:38:44 PM
#2
After chatting with some guys on IRC, some are suggesting that the 2x cards running on the secondary psu may be overloading it. People say that an OC'd 280x can draw close to 300W...in which case that would make sense.

However, why did my 1020Watt PSU ALSO die with 3x 280x cards plugged into it?

Any input?
newbie
Activity: 43
Merit: 0
February 26, 2014, 09:27:01 PM
#1
Hey all!

I've got a problem. I successfully run 10+ rigs and they're all mostly the same config:

- 5x Sapphire 280x DUAL-X (NOT undervolted, as I can't figure out how to. Tried many ways)
- 5x powered risers (ribbon)
- 1150 CPU
- ASrock H81 Pro BTC Mobo (since I'm powering risers, I omitted powering the two extra molex on the mobo)
- Basic sata HD
- 2x Corsair HX850 Gold PSU
- Windows 8.1
- Measured running ~1550W at the wall

PSU's configured like this:

Primary HX850 PSU:    Mobo, CPU, HD, 2x 280x cards, ALL risers
Secondary HX850 PSU: 3x 280x cards
Both PSU's connected to each other and mobo using a simple splitter: https://www.hashratestore.com/shop/cables/dual-power-supply-adapter-cable-24-pin-atx/

All the rigs work beautifully and hash nicely. However, one rig started acting up and started to drop out of the pool so I went down to check on it. I look and the three cards connected to the secondary PSU aren't spinning and the PSU is cold. But the system is on, and all components connected to the primary PSU are running.

I got concerned that maybe 850W aren't enough for three 280x cards hashing at ~740kh/s, but I believe they are rated at around ~250W each, making three cards roughly 750W...so that should be okay.

So, I thought the secondary PSU simply died! I just switched it out with a spare one I had (HX1020, more power too!). The rig ran for a little while and then I get a notice that it also dropped off. I go down to check and same thing. Secondary PSU is off but primary is on?! AARGH!

I turn it back on only to smell a burning component smell (not good) coming out of the NEW replacement HX1020 PSU.

I'm at a loss. I run many rigs with the same configuration with no issues, what could be the cause? I thought maybe the splitter could cause an issue, but I don't see why? Maybe the power cable? Maybe the extension cord both PSUs are connected to? I don't know...

Anyways, before I plug in another PSU and burn it, I'd love to hear if anyone has any input for me.

Any help would be appreciated :)

Thank you,
llama