Topic: Redundant backup host machines for custom USB mining hardware? (Read 934 times)

sr. member
Activity: 319
Merit: 250
I have it in a box with an 8 way power extension, each plug is controlled by one of the relays.

I have ModMiner Quads and found, when experimenting with bfgminer and cgminer, that the ModMiner had to be rebooted when switching between them, so I rigged the relays up to the Pi, the ModMiner Quad and some GPU rigs.

I'm going to have to add some more power feeds in though for a new GPU rig as it'll be pulling close to the max output of a single plug at the wall soon.

Really handy when you're remotely tweaking and hard crash a system. Set the BIOS on the system to power on after a power outage, so when you switch it back on it starts up again.

Interesting. Could I get a relay board to control the ATX switch? One that doesn't require doing sorcery with breadboards or soldering? Hardware noob here.

If the mining devices are powered by an ATX PSU, then the jumper to be shorted could be controlled through a relay.

Apologies to OP for going off topic.

You could do that, but you could also just plug the ATX PSU into a socket controlled by the relay; less hassle that way. The beauty of that board is that it's got ethernet on it, so you can control it from anywhere as long as you have internet access. I would think you could rig up a way to use it via SMS text too.

I have some simple scripts on two UPS-connected servers which control all of them, both automatically and manually.
sr. member
Activity: 322
Merit: 250
Supersonic
I have it in a box with an 8 way power extension, each plug is controlled by one of the relays.

I have ModMiner Quads and found, when experimenting with bfgminer and cgminer, that the ModMiner had to be rebooted when switching between them, so I rigged the relays up to the Pi, the ModMiner Quad and some GPU rigs.

I'm going to have to add some more power feeds in though for a new GPU rig as it'll be pulling close to the max output of a single plug at the wall soon.

Really handy when you're remotely tweaking and hard crash a system. Set the BIOS on the system to power on after a power outage, so when you switch it back on it starts up again.

Interesting. Could I get a relay board to control the ATX switch? One that doesn't require doing sorcery with breadboards or soldering? Hardware noob here.

If the mining devices are powered by an ATX PSU, then the jumper to be shorted could be controlled through a relay.

Apologies to OP for going off topic.
sr. member
Activity: 319
Merit: 250
I have it in a box with an 8 way power extension, each plug is controlled by one of the relays.

I have ModMiner Quads and found, when experimenting with bfgminer and cgminer, that the ModMiner had to be rebooted when switching between them, so I rigged the relays up to the Pi, the ModMiner Quad and some GPU rigs.

I'm going to have to add some more power feeds in though for a new GPU rig as it'll be pulling close to the max output of a single plug at the wall soon.

Really handy when you're remotely tweaking and hard crash a system. Set the BIOS on the system to power on after a power outage, so when you switch it back on it starts up again.
cp1
hero member
Activity: 616
Merit: 500
Stop using brainwallets
I think there was another thread about this.  I still think a hardware watchdog timer with a relay would be best.  If used with a raspberry pi as host you don't need a very big relay.
sr. member
Activity: 322
Merit: 250
Supersonic
I have one of these: http://www.robot-electronics.co.uk/htm/eth_rly16tech.htm

When the miner fails due to a crashed fpga or when the miner itself crashes it cycles power then a script restarts the miner.

That's a very handy thing to have. I've been thinking of getting something cheaper to control using GPIO or a Raspberry Pi, more so to start the units one by one rather than put the full load on the ATX PSU at the same time.

Which fpga board are you using that needs power cycle to recover?

I've recently been mining with ztex 1.15y, and so far the only failures I've seen are when starting cgminer/bfgminer and it fails to configure the FPGAs. A power cycle would fix it, but I've figured out that cycling the USB on the host also fixes that problem, so no physical power cycle is needed.
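
For reference, one way to "cycle the USB" from a Linux host without touching the power is the kernel's per-device `authorized` file in sysfs. This is only a sketch of that approach and may not be the method used above; the device directory (something like `/sys/bus/usb/devices/1-1.2`) is a made-up example, the 2 second pause is a guess, and writing to sysfs needs root. Note that this makes the kernel drop and re-configure the device; it does not cut power on the port.

```python
import time
from pathlib import Path


def cycle_usb(device_dir: Path, off_seconds: float = 2.0) -> None:
    """De-authorize, then re-authorize, one USB device.

    device_dir is the device's sysfs directory (hypothetical example:
    /sys/bus/usb/devices/1-1.2). Writing "0" to its `authorized` file makes
    the kernel stop using the device; writing "1" makes it configure the
    device again.
    """
    authorized = device_dir / "authorized"
    authorized.write_text("0")   # kernel drops the device
    time.sleep(off_seconds)      # assumed settle time, tune for your hardware
    authorized.write_text("1")   # kernel sets the device up again
```

You would call it as `cycle_usb(Path("/sys/bus/usb/devices/1-1.2"))` before launching the miner, with the path swapped for wherever your board actually shows up.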

Since we are on the topic of preventing outages, is it true to assume that using the "normally open" connector will continue to give power to the mining device if the relay board itself fails?
sr. member
Activity: 319
Merit: 250
I have one of these: http://www.robot-electronics.co.uk/htm/eth_rly16tech.htm

When the miner fails due to a crashed fpga or when the miner itself crashes it cycles power then a script restarts the miner.
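
For anyone wanting to try the same idea, here's a rough sketch of that cycle-then-restart step in Python. It is not the actual script described above: `is_healthy`, `set_relay` and `start_miner` are hypothetical callables you would supply yourself (a process or hashrate check, a function that talks to your relay board, and whatever launches cgminer/bfgminer), and the delays are guesses.

```python
import time
from typing import Callable


def recover(
    is_healthy: Callable[[], bool],
    set_relay: Callable[[bool], None],
    start_miner: Callable[[], None],
    off_seconds: float = 10.0,
    boot_seconds: float = 30.0,
) -> bool:
    """Power-cycle the mining device and restart the miner if it looks dead.

    Returns True if a recovery was performed, False if nothing was wrong.
    All three callables are placeholders for your own setup.
    """
    if is_healthy():
        return False
    set_relay(False)          # cut power to the device
    time.sleep(off_seconds)   # assumed time for the device to fully power down
    set_relay(True)           # restore power
    time.sleep(boot_seconds)  # assumed time for the device to come back up
    start_miner()             # relaunch the mining software
    return True
```

Run it from cron or a loop every minute or so. Keeping the relay call behind a plain function means the same logic works whether the relay sits on an ethernet board or on a GPIO pin.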
sr. member
Activity: 322
Merit: 250
Supersonic
Personally I'd just drastically overbuild the machine controlling the miners. I've basically got a dual-CPU rack server with a RAID 5 setup for this purpose. It's just a smallish webserver that I've repurposed.

More things to go wrong imho. Now you have introduced a new point of failure: the RAID 5 controller itself.

Keep it as simple as possible, and use more machines. If you have to use a full-blown computer, just load everything into memory and run from there. That makes it immune to disk failure as well... you typically wouldn't care about running changes being persistent... hmm, maybe even boot from LAN.
sr. member
Activity: 322
Merit: 250
Supersonic
If you have 10 mining units controlled by 10 Raspberry Pis (which are quite stable imho: no moving parts + Linux) and one of them breaks, you lose 1 miner. Say downtime of 24 hours (before you can physically fix it).

If you have 10 Pis and assume one of them FAILS once a month, your total loss per month would be 1 miner for 1 day. Calculate the amount. I'm sure investing in some sort of failover would be more expensive.

Personally I'm interested in your topic for academic reasons. I wouldn't consider investing in failover for actual money reasons.

Instead of one device that fails once a month, you now have 10 devices at 1/10 the hashing power, each failing once a month, 10 failures/mo total.

You're right back where you started, except you've bought 9 more devices, and you have to fix/reset them 10 times a month, instead of once.

So from a downtime perspective, there is no improvement.


But the "USB switch" lead you mentioned is a good one, especially if you can control it remotely somehow.

The difference is you're not completely out when the outage occurs. With rising difficulty, it is better to run at 90% performance for 12 hours than to risk 0%, minimizing the risk of a total outage while difficulty is lower. Think of it as if the mining unit came built with an ethernet port which you hook into the interwebs directly... the only difference is that the module is user replaceable. I'd say the odds of a mining unit crashing are higher than the odds of a well-supported embedded system on a chip crashing.

Embedded systems should be a lot more resilient to failure: far fewer components (and, more importantly, immune to mechanical failure). I bet that if I intentionally corrupt the SD card (or yank it out) of a running Raspberry Pi, it will continue mining for a few hours (or until the next reboot). And the SD card would be the weakest link in the host, since I've heard electrical surges on the USB port can corrupt it.
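
To put numbers on the disagreement above, here is the arithmetic as a quick Python sketch. The figures are just the ones thrown around in this thread (10 miners, 24 hours to fix a host, "once a month"); the 30-day month and the function name are my own assumptions.

```python
def lost_miner_days(failures_per_month: float, miners_per_failure: int,
                    downtime_days: float) -> float:
    """Hashing time lost per month, measured in miner-days."""
    return failures_per_month * miners_per_failure * downtime_days


MINERS = 10
DAYS_PER_MONTH = 30                 # assumed month length
capacity = MINERS * DAYS_PER_MONTH  # 300 miner-days available per month

scenarios = {
    # one host drives all 10 miners and fails once a month
    "one shared host, 1 failure/month": lost_miner_days(1, 10, 1),
    # 10 hosts, one failure per month across the whole group
    "ten hosts, 1 failure/month in total": lost_miner_days(1, 1, 1),
    # 10 hosts, every single one failing once a month
    "ten hosts, each failing 1/month": lost_miner_days(10, 1, 1),
}

for name, lost in scenarios.items():
    print(f"{name}: {lost:.0f} miner-days lost ({lost / capacity:.1%} of capacity)")
```

So which side is right depends entirely on whether "once a month" is meant per host or for the whole group. Even in the pessimistic per-host case the monthly total only ties with the single shared host, and the worst you ever lose at one moment is 1 miner out of 10 instead of all of them.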

sr. member
Activity: 420
Merit: 250
Personally I'd just drastically overbuild the machine controlling the miners. I've basically got a dual-CPU rack server with a RAID 5 setup for this purpose. It's just a smallish webserver that I've repurposed.


Set up your monitoring software to watch things and make it text or call you with any issues. Automation is great and all... but being able to know there's a problem quickly and fix it yourself is always the fastest option.
hero member
Activity: 560
Merit: 500
If you have 10 mining units controlled by 10 Raspberry Pis (which are quite stable imho: no moving parts + Linux) and one of them breaks, you lose 1 miner. Say downtime of 24 hours (before you can physically fix it).

If you have 10 Pis and assume one of them FAILS once a month, your total loss per month would be 1 miner for 1 day. Calculate the amount. I'm sure investing in some sort of failover would be more expensive.

Personally I'm interested in your topic for academic reasons. I wouldn't consider investing in failover for actual money reasons.

Instead of one device that fails once a month, you now have 10 devices at 1/10 the hashing power, each failing once a month, 10 failures/mo total.

You're right back where you started, except you've bought 9 more devices, and you have to fix/reset them 10 times a month, instead of once.

So from a downtime perspective, there is no improvement.


But the "USB switch" lead you mentioned is a good one, especially if you can control it remotely somehow.
sr. member
Activity: 322
Merit: 250
Supersonic
There has to be a better method available than just taking a shotgun approach, as that just staggers your down time occurrences.

I've also found that the host device is usually the weakest link in the mining setup (next to the PSU/brick), so introducing a new point of failure is still an improvement if that point is more reliable than the hosts connected to it.

Assume, "What can go wrong will go wrong"

Now for each single point of failure, calculate the probability of it failing, and also calculate the damage caused by its downtime.

If you have 10 mining units controlled by 10 Raspberry Pis (which are quite stable imho: no moving parts + Linux) and one of them breaks, you lose 1 miner. Say downtime of 24 hours (before you can physically fix it).

If you have 10 Pis and assume one of them FAILS once a month, your total loss per month would be 1 miner for 1 day. Calculate the amount. I'm sure investing in some sort of failover would be more expensive.

Personally I'm interested in your topic for academic reasons. I wouldn't consider investing in failover for actual money reasons.

Now if you still feel failover is cheaper than probability of host outage + downtime loss, then another method could be rigging up some kind of USB switch that can be controlled by multiple host computers which monitor each other and "elect" a master. The master commands the switch to point to itself as host. If no such hardware is available you need to get one made by an electrical engineer...
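
The "elect a master" part could be as simple as the Python sketch below. This is only one possible rule, not a tested failover design: each host records when it last heard a heartbeat from every host (itself included), and the alive host with the lowest name wins. The host names, the 15 second timeout and the function name are all made up for illustration, and if the hosts lose sight of each other two of them can both believe they are master.

```python
from typing import Mapping, Optional


def elect_master(last_seen: Mapping[str, float], now: float,
                 timeout: float = 15.0) -> Optional[str]:
    """Pick the master among hosts that sent a heartbeat recently.

    last_seen maps host name -> time of its last heartbeat (seconds).
    A host counts as alive if its heartbeat is at most `timeout` old.
    The alive host with the lowest name is the master; None if none are alive.
    """
    alive = sorted(host for host, seen in last_seen.items()
                   if now - seen <= timeout)
    return alive[0] if alive else None
```

Every host runs the same function on its own view of the heartbeats, e.g. `elect_master({"pi-a": 100.0, "pi-b": 103.0}, now=105.0)`, and only flips the USB switch toward itself when the result is its own name.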

BTW just realized the keyword to search for is "usb switch" :-
http://www.iogear.com/product/GUB231/
http://www.amazon.co.uk/USB-2-0-switching-hub-switch/dp/B000I3WV1U/ref=tag_stp_s2_edpp_url
Quote
The USB ports are switched from one user to another with the touch of the button
^ The "touch of the button" here can be rigged to the gpio of multiple raspberry pi...
http://www.amazon.com/Belkin-F1U200V-4-Port-USB-Switch/dp/B000EJUCVE


But all failure scenarios need to be tested. Often times the failover itself causes a fail if not properly tested/practiced.
hero member
Activity: 560
Merit: 500
There has to be a better method available than just taking a shotgun approach, as that just staggers your down time occurrences.

I've also found that the host device is usually the weakest link in the mining setup (next to the PSU/brick), so introducing a new point of failure is still an improvement if that point is more reliable than the hosts connected to it.

Was reading up about USB over ethernet a while ago. Basically you attach the USB devices to an ethernet thing...

Thanks for the link, but my understanding is that this only lets you use a USB device connected (by USB port) to PC#1 from a PC#2 on the network, so PC#1 has to be working properly first.
sr. member
Activity: 322
Merit: 250
Supersonic
Was reading up about USB over ethernet a while ago. Basically you attach the USB devices to an ethernet thing... and then send USB packets over ethernet to them. The 2 hosts could monitor each other... and possibly take over if one sees no USB activity or something... But now you have new points of failure. Perhaps some devices like this one..


A lot neater/cheaper way (assuming you have multiple pieces of hardware you wanna control) is to divide the devices between different host machines. Could be something as cheap as a Raspberry Pi or a TP-Link router with a USB port. That way if one of the host machines dies it only takes out the miner(s) attached to it. Horizontal scalability FTW.
hero member
Activity: 560
Merit: 500
When the host machine providing work to your miners stops working for whatever reason, the miners stop mining, obviously.

Is there a way to connect your USB hardware to multiple hosts, so the backup machine(s) takes over, thus reducing your downtime?