
Topic: Why my rig keeps hanging [BAMT with 4 280x GPU's]? (Read 2504 times)

trc
full member
Activity: 164
Merit: 100
Come on man, it has nothing to do with AMD or even ASRock. Faulty mobo, BIOS, bad batch, etc., you know... I've personally had more trouble with Intel boards and I'm not blaming them either, because most of the time it's a badly programmed BIOS on the board or just that "shit happens" kind of bad luck on my part.
newbie
Activity: 44
Merit: 0
The other 4 rigs I set up were all Z77 Extreme4, never an issue. I only bought this one as it was $50 cheaper with a rebate from Newegg. I figured, what harm could going AMD instead of Intel do? Great thought, lol.
newbie
Activity: 33
Merit: 0
I've only just noticed that your mobo is an ASRock 970 Extreme4; the one giving me grief was an ASRock 970 Extreme3. I wonder if there is a connection there somehow?

DOGE received with thanks, will spend it on a beer at the weekend.

UM
newbie
Activity: 44
Merit: 0
Yeah, I had set up 4 other rigs with identical setups, same version of BAMT, but this specific motherboard had this issue and it was brutal to diagnose. I was going to RMA the mobo until this fix worked. Thanks again, and the doge is sent. Thanks for your help.
newbie
Activity: 33
Merit: 0
Glad to hear that your problem is solved. I don't understand what happened on mine: my rig was running fine and then all of a sudden started playing this silly game, and wouldn't stop till I had removed the network managers. Another rig running from the same BAMT .img has no problems at all.

Well, at least it's fixed, which is the main thing.

That's very kind of you, sir. My address is DRfF5scEP7CU6c6rdk1hkRfGzkYWmSfXhc

UM
newbie
Activity: 44
Merit: 0
Your suggestion worked perfectly, 48+ hours straight without a glitch. If you have a dogecoin address, PM or post it here as I'd like to make a donation to thank you for your help. Thanks again!
newbie
Activity: 33
Merit: 0
fingers are crossed for you, let me know how you get on

UM
newbie
Activity: 44
Merit: 0
I made the changes suggested in the thread, rebooted and disabled my cronjob. I'll know in 12 hours if it worked, but it looks very promising, as I'm convinced it's a network/NIC issue and not the mobo. Thanks for your help, uncle_muddy.
newbie
Activity: 33
Merit: 0
After some hunting I have found it:

http://www.bitcointrading.com/forum/linux-distros/bamt-version-0-5-easy-usb-based-mining-linux-with-farm-wide-management-tools/

Look at the 2nd reply in that thread; these are the instructions I used, and the problem went away.

I didn't replace the network manager with anything, as there was no instruction to do so. I have not had the problem since doing this though, so I'm guessing that this was the cause and that whatever is removed is not required.

Hope this helps you out

UM
newbie
Activity: 44
Merit: 0
I finally got crontab to coldreboot after 12 hours, but the server hung at around 8 hours. The server was no longer connected to my router, although the USB stick was still flashing (showing the OS was running). After 12 hours, the server successfully rebooted and restarted on its own. Clearly the OS is still functioning, but something is up with its internet connectivity; it seems to be dropping the connection. I'm not using any wireless, just a wired cable between the onboard NIC and the router, and I also tried a separate 1 Gb NIC instead of the mobo NIC.

I can't tell if the NIC is maybe going to sleep because of the mobo (no idea why that would happen) or if the OS is causing it. I've set up this type of rig a few times with other ASRock mobos and never seen this issue before, so I'm a little baffled.

Uncle_muddy, if you can find that link I'd really appreciate it, as it seems what you had is exactly the issue I'm having. Once you removed the network manager, did you replace it with something else?

Thank you!
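
For anyone trying to tell a dropped network link apart from a full hang, a rough sketch of a link logger follows. It is only an illustration under assumptions: the router address 192.168.1.1, the helper names link_state and log_link_state, and the overridable PROBE variable are all made up for this example and are not part of BAMT.

```shell
#!/bin/sh
# Sketch only: record whether the router answers a ping, so a network drop
# can be told apart from a full hang. GATEWAY is a placeholder address.
GATEWAY="${GATEWAY:-192.168.1.1}"
# The probe command is overridable so the logic can be tried without a network.
PROBE="${PROBE:-ping -c 1 -W 2}"

# Prints "up" if the probe succeeds against the given host, otherwise "down".
link_state() {
    if $PROBE "$1" >/dev/null 2>&1; then
        echo up
    else
        echo down
    fi
}

# Prints one timestamped log line for the gateway.
log_link_state() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') gateway $GATEWAY is $(link_state "$GATEWAY")"
}
```

Calling log_link_state once a minute from cron and appending the output to a file on the USB stick would show, after the next "hang", whether the OS kept running while the gateway went unreachable.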
sr. member
Activity: 840
Merit: 255
After you RMA'd the mobo, did it work for you? Other than swapping the mobo, did anything else change (any other different hardware)?
Yes. The store tested it with Prime95 and it failed with both my FX-6100 and a Phenom T1100. Then I told them I wanted a new one, unfortunately the same ASRock, because a better board (some Asus Sabertooth) cost a lot more. However, this new one has worked flawlessly for 3 months now. Nothing else changed. I can throw RAM of different speeds and brands at it, single-channel 6GB RAM, some overclock, power saving options, lots of weird crap enabled in the BIOS, an FX-6100 with a basic heatsink, Windows 7 booting from a USB pen, a fake Alfa USB wifi dongle. All this shit works!

But make sure it really is the mobo, because I have a 7870 card that does precisely the same thing. Or even that it's a hardware issue at all.
newbie
Activity: 33
Merit: 0
I had exactly the same problem with my 4 x 280X rig; it would just hang after a few hours for no reason whatsoever. It showed the same symptoms as you're getting.

In the end, after a lot of looking around, the problem turned out to be the IP stack falling over. No idea why or what caused it. One day it was working fine, then all of a sudden it started to play up.

I carried out all the same steps that you have (reformat of the USB key, new USB key, etc.) and still nothing, to the point where I set my rig to reboot every 4 hrs to try and keep mining whilst I investigated the problem.

The resolution was to hardcode the IP into the rig and remove the network managers. I'm hunting for the link that I found with all the detail on it but can't find it right now; the command was something along the lines of

Code:
apt-get remove 'network-manager*'

Hope this helps, and if I can find the link I will post it up for you
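
For anyone landing here later, hardcoding the IP on a Debian-based image like BAMT would look roughly like the stanza this sketch prints. Everything in it is an assumption and not taken from the lost link: the interface name eth0, the 192.168.1.x addresses and the helper name print_static_stanza are placeholders. It only prints text; nothing is written to /etc/network/interfaces.

```shell
#!/bin/sh
# Sketch only: prints a static-IP stanza in /etc/network/interfaces format.
# print_static_stanza is a made-up helper; all addresses below are placeholders.
print_static_stanza() {
    iface="$1"; addr="$2"; mask="$3"; gw="$4"
    printf 'auto %s\n' "$iface"
    printf 'iface %s inet static\n' "$iface"
    printf '    address %s\n' "$addr"
    printf '    netmask %s\n' "$mask"
    printf '    gateway %s\n' "$gw"
}

# Example call with placeholder values for a typical home LAN:
print_static_stanza eth0 192.168.1.50 255.255.255.0 192.168.1.1
```

With DHCP out of the picture, the rig would most likely also need a nameserver line in /etc/resolv.conf so that pool hostnames still resolve.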
newbie
Activity: 44
Merit: 0
I've read the thread you linked me to and it sounds like we have the same problem. The idea of RMA'ing this mobo is a real pain though as it seems to work for 8 hours at a time. It doesn't make sense to me from a logic point of view that it would be the mobo as it works solidly for hours. I'm currently trying to do a soft reboot via crontab to reboot the system before it crashes hoping that will fix it. At least I see that as a better solution than a hard power cut, but I'm having problems getting the reboot command to execute right now.

After you RMA'd the mobo, did it work for you? Other than swapping the mobo, did anything else change (any other different hardware)?
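
On the reboot command not executing from crontab: one common cause on Debian-style systems (a guess, not confirmed for this rig) is that the job is not in root's crontab, or that cron's short default PATH does not include /sbin, where the reboot tools live. Below is a sketch that prints a crontab line using a full path. The helper name cron_reboot_line is made up, and the 6-hour interval is a placeholder.

```shell
#!/bin/sh
# Sketch only: prints a crontab line that reboots every N hours.
# cron_reboot_line is a made-up helper; /sbin/shutdown is the usual Debian path.
cron_reboot_line() {
    hours="$1"
    # Minute 0 of every Nth hour, with a full path so cron's PATH does not matter.
    printf '0 */%s * * * /sbin/shutdown -r now\n' "$hours"
}

# Example: a line for a reboot every 6 hours, to paste into root's crontab.
cron_reboot_line 6
```

Note that a `*/6` schedule fires on the wall clock (00:00, 06:00, 12:00, 18:00), not 6 hours after the last boot, so the first reboot after a restart can come sooner than expected.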
trc
full member
Activity: 164
Merit: 100
Check the switch. Most low-quality ones need a power cycle at least once a week, and some of the worse ones once a day.
sr. member
Activity: 840
Merit: 255
As I said before, I had to send my ASRock 970 Extreme4 back for RMA. I've been there: ASRock (970) is crap, or at least there are many defective MBs on the market. I wasted time, money and peace of mind. See if this looks familiar: https://bitcointalksearch.org/topic/random-lockupsfreezes-on-mining-rig-01-btc-bounty-for-a-fix-381626

Now, of course it could be something else, like, yes, something that goes to sleep. But be prepared for the worst.

For a few days, I worked around those random lockups with a timer between the wall socket and the PSU plug that cuts the power every x hours, for a minute. You set "Power On" on power failure in the BIOS, and set the machine to start mining on boot. This is half-assed, and the power cuts and restarts are not healthy at all for the rig components.

Did you try with Windows? I use Windows 7 64-bit, where I can disable all power saving/sleeping options. Also, in the BIOS, disable everything not needed for mining, like audio, FireWire, SATA when using a USB pen, etc.
newbie
Activity: 44
Merit: 0
Anyone know if it's possible that the mobo is causing a sleep? This is the only ASRock 970 I've bought; all the other ones I've gotten were Intel-based and I can't for the life of me figure out what's going on! Help!
newbie
Activity: 44
Merit: 0
This was a new build so I haven't done any optimizations yet; it's running stock settings for the GPUs and the BIOS too. It takes about 8 hours for the rig to hang, so it's not so easy to run just 1 GPU at a time to try and isolate it. But it's clearly not the USB drive and not the NIC. Since it runs fine for 8 hours at a time, I can't see this being a CPU/RAM issue. I'm really baffled as to what else it could be. The only fix I can come up with is writing a reboot script to reboot every 6 hours or so and hoping that circumvents the hanging, but I'd rather fix the issue than reboot 4x/day.
newbie
Activity: 44
Merit: 0
What's your PSU?


PSU: 1300W EVGA Supernova Gold

It runs at 1120W as checked by a wattage meter.
hero member
Activity: 718
Merit: 500
What's your PSU?
sr. member
Activity: 840
Merit: 255
I have a 7870 card that locks up whatever rig it's on. I also had to RMA an ASRock 970 E4.
My thread is somewhere on the main Bitcoin mining forum.

Try running your rig underclocked, like 500 MHz core and 1000 MHz RAM, with everything not needed for mining disabled in the BIOS, power saving settings disabled, etc. Then mine with only one card at a time.
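
If the rig mines with cgminer (an assumption, since the thread does not say which miner is in use), the underclock-and-isolate test could be sketched as the command lines this prints. The flags --device, --gpu-engine and --gpu-memclock are from cgminer's GPU-capable releases. The helper name underclock_cmd is made up, and pool settings are left out.

```shell
#!/bin/sh
# Sketch only: prints a cgminer command line for testing one underclocked card.
# underclock_cmd is a made-up helper; it assumes cgminer with GPU support.
underclock_cmd() {
    dev="$1"; core="$2"; mem="$3"
    echo "cgminer --device $dev --gpu-engine $core --gpu-memclock $mem"
}

# Print one command per card for a 4-GPU rig at 500 MHz core / 1000 MHz memory.
for dev in 0 1 2 3; do
    underclock_cmd "$dev" 500 1000
done
```

Running each printed command for longer than the usual time-to-hang, one card at a time, would be one way to see whether a single card is responsible.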