
Topic: 3 of 8 Rigs down - GPU Fault Detected 147. HELP PLEASE!!! (Read 341 times)

newbie
Activity: 4
Merit: 0
SMOS. So I think the problem is solved - I needed to move up a few versions of Claymore.
newbie
Activity: 6
Merit: 0
What OS are you using and what mining software?
jr. member
Activity: 279
Merit: 1
Check air circulation. Half of my GPUs are down for that reason.
legendary
Activity: 3136
Merit: 1233
Bitcoin Casino Est. 2013
You need to apply basic troubleshooting to your rigs. Take one rig that is getting the GPU FAULT error (assuming HiveOS or Linux?) and test the basics: boot it on integrated graphics with just RAM/mobo etc. and no GPUs. If it boots, move on: add 1 GPU, see if you get the error, and so on. IMHO it seems like you may be having riser issues/faults.

I think it is totally a riser fault. I manage a rig for some of my colleagues, and when I checked the risers (007c was the version they were using) I saw strange behaviour: some of the GPUs couldn't mine, and NiceHash always ended with a "terminated" and couldn't benchmark.

At first I suspected a bad BIOS or a card fault, but after testing everything I decided to swap the risers for the latest version, and after that everything went back to normal.
member
Activity: 246
Merit: 24
You need to apply basic troubleshooting to your rigs. Take one rig that is getting the GPU FAULT error (assuming HiveOS or Linux?) and test the basics: boot it on integrated graphics with just RAM/mobo etc. and no GPUs. If it boots, move on: add 1 GPU, see if you get the error, and so on. IMHO it seems like you may be having riser issues/faults.
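
The one-GPU-at-a-time approach above can be sketched roughly like this. It is only an illustration: `find_first_bad_gpu` and the `rig_is_stable` callback are made-up names, and in real life the "test" is you powering down, plugging in the next card and riser, and watching the miner for a while.

```python
from typing import Callable, List, Optional, Sequence


def find_first_bad_gpu(
    gpus: Sequence[str],
    rig_is_stable: Callable[[List[str]], bool],
) -> Optional[str]:
    """Add GPUs one at a time and return the first one that breaks the rig.

    Returns None if the rig stays stable with every GPU installed.
    Raises RuntimeError if the rig is unstable even with no GPUs,
    which points at the board, RAM, PSU, or OS rather than a card.
    """
    installed: List[str] = []
    if not rig_is_stable(installed):
        raise RuntimeError("rig fails with no GPUs installed")
    for gpu in gpus:
        installed.append(gpu)
        if not rig_is_stable(installed):
            return gpu
    return None


if __name__ == "__main__":
    # Hypothetical example: pretend the card labelled "gpu2" is the faulty one.
    cards = ["gpu0", "gpu1", "gpu2", "gpu3"]
    print(find_first_bad_gpu(cards, lambda installed: "gpu2" not in installed))
```

If a card is flagged, swap its riser and retest before blaming the card itself, since the same loop cannot tell a bad card from a bad riser.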
newbie
Activity: 11
Merit: 0
This is really weird.

I know the GPU Lost error, and it seems like you have a similar problem... but on 28 cards at the "same time"?

Did you try NVSMI? It's just a monitoring tool for the GPU (an easy .bat). If this tool reports GPU Lost, you can only RMA the card and HOPE they WILL accept the RMA.
And there is still a problem: I tried checking those bad GPUs with common tests like FurMark, Heaven Benchmark, and 3DMark (note that you need to restore default clocks and TDP first!!! The seller can send you to hell if they discover the GPU was OC'd and/or under/overvolted), and all of these programs ran without any errors or artifacts.

The only way to report the error and hope for a positive RMA was to describe it as something they are "unable" to test, like: "during AI testing with heavy use of CUDA cores, the card produces a GPU Lost error". I think you can use that wording for a "positive RMA" on all cards at once. I did this about 3 weeks ago and am still waiting for the RMA (it should come in another week).

But still check the PSUs, and test those bad GPUs in any PC with FurMark etc. If the cards produce artifacts or similar issues, then RMA them the normal way (but don't forget to reset clocks and TDP to default).
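
For the monitoring part, here is a rough sketch of polling the cards from a script instead of a .bat. It assumes "NVSMI" means NVIDIA's `nvidia-smi` command-line tool and that the NVIDIA driver is installed. The helper names (`parse_smi_csv`, `query_gpus`, `missing_indices`) are invented for this example, and the sample output in the usage note is made up.

```python
import csv
import io
import subprocess
from typing import Dict, List

QUERY_FIELDS = ["index", "name", "temperature.gpu", "power.draw"]


def parse_smi_csv(text: str) -> List[Dict[str, str]]:
    """Parse `nvidia-smi --format=csv,noheader,nounits` output into dicts."""
    rows = []
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(record) != len(QUERY_FIELDS):
            continue  # skip blank or unexpected lines (e.g. error messages)
        rows.append(dict(zip(QUERY_FIELDS, (v.strip() for v in record))))
    return rows


def missing_indices(rows: List[Dict[str, str]], expected: int) -> List[int]:
    """Return GPU indices in range(expected) that did not report."""
    seen = {int(r["index"]) for r in rows}
    return [i for i in range(expected) if i not in seen]


def query_gpus() -> List[Dict[str, str]]:
    """Run nvidia-smi once; only works on a machine with the NVIDIA driver."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    return parse_smi_csv(result.stdout)
```

With a made-up two-card output such as `"0, GeForce GTX 1070, 61, 118.45\n2, GeForce GTX 1070, 64, 121.02\n"` and three cards expected, `missing_indices(parse_smi_csv(sample), 3)` flags index 1 as the card that stopped reporting. Logging this every minute gives you a timestamped record to attach to an RMA request.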
jr. member
Activity: 150
Merit: 3
I found this video explaining the same error you are getting: https://www.youtube.com/watch?v=lflY1BzE5JY
He said he changed the core clock frequency and the problem was solved.
This guy said he solved his problem by updating ethminer: https://community.amd.com/thread/203158
You can try these two solutions; maybe one of them helps.
Does this error appear as soon as you start the miner, or only after some time?
newbie
Activity: 4
Merit: 0
All of the rigs had been working flawlessly for over 6 months at this point. Had three go down yesterday, all with the GPU Fault 147 issue. Now have another one down today with the same issue...

Would be weird to lose 4 PSUs or all 28 GPUs within 24 hours, right?
copper member
Activity: 62
Merit: 2
AIOMiner.com
And now a 4th is down with the same issue... all have been running for months, no issues.

Same problem - GPU Fault Detected. Do I have bad PSUs? Faults driven by power distribution issues?

How long have they been up and running previously? Or are you trying to get this set up? It's unlikely that you destroyed all of those GPUs at the same time with a power surge as your PSU would be the first thing to short out before it gets to everything else.
newbie
Activity: 4
Merit: 0
And now a 4th is down with the same issue... all have been running for months, no issues.

Same problem - GPU Fault Detected. Do I have bad PSUs? Faults driven by power distribution issues?
sr. member
Activity: 2142
Merit: 353
Xtreme Monster
And they do not honor the warranty if there was a surge. Mining is extremely dangerous: you can theoretically lose it all if a short circuit happens in your house. Even with every kind of protection, it is still possible to fry everything. My friend has amazing ground line protection, and even that is not 100% fryproof.

Try the gpu on a windows pc.
newbie
Activity: 4
Merit: 0
Hi guys,

Need some help - I have two rigs down, and a friend's rig is down too. All went down around the same time, all with the same issue: GPU Fault Detected 147 0x03ca8802, GPU Fault Detected 147 0x0f824802, etc.

Have tried the following:

1. Format USB drive, and refresh SMOS
2. Flash original bios back to GPU
3. Tried loading single card on PC with fresh SMOS stick

All with same results.......Anyone know a solution or did I just lose 18 cards to some kind of surge?

Thanks in advance for the help!
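
To see whether the faults follow particular cards or slots, the error lines can be counted per PCI address. This is a minimal sketch that assumes each log line carries a PCI address before the "GPU fault detected" text, as in a typical Linux kernel log. The function name `count_faults` and the sample log lines in the usage note are made up for illustration and are not taken from the rigs in this thread.

```python
import re
from collections import Counter
from typing import Iterable

# Assumed line shape: "<pci address> ... GPU fault detected: <src id> <hex data>"
FAULT_RE = re.compile(
    r"(?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])"
    r".*?GPU fault detected:?\s+(?P<src>\d+)\s+(?P<data>0x[0-9a-f]+)",
    re.IGNORECASE,
)


def count_faults(lines: Iterable[str]) -> Counter:
    """Count fault lines per PCI address; non-matching lines are ignored."""
    counts: Counter = Counter()
    for line in lines:
        match = FAULT_RE.search(line)
        if match:
            counts[match.group("pci")] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (assumption): save the kernel log to a file, then pipe it in.
    for pci, n in count_faults(sys.stdin).most_common():
        print(pci, n)
```

For example, three made-up lines with two faults on `0000:03:00.0` and one on `0000:06:00.0` give a count of 2 and 1 respectively. If every slot on every rig shows up at about the same rate, a software or driver cause looks more likely than 18 cards dying at once.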