Within the next day I'll be adding support for customizing the thresholds for both failover and lockup detection (setting a threshold to 0 will disable that check). I'm at work all night, so I'm not sure whether I'll get a chance to work on it tonight from here (depends on how busy things get) or when I get home in the morning.
Here are the changes I am planning to make:
- Add --kill, --reload and --restart options to smartcoin (smartcoin.sh). This will let you take actions easily from custom scripts.
- On lockup detection, fire a custom script and restart smartcoin automatically (instead of killing smartcoin). If this happens repeatedly, you will know there is a real issue. (I'm still debating whether smartcoin should automatically send emails on these events, or whether that should be left up to the user with their custom script. Thoughts?)
- Regarding the lockup routine when there is a loss of Internet, I think the current scheme (after the above changes) will be perfectly acceptable. In this case, smartcoin will keep restarting itself about every 5 minutes (unless you do a 'smartcoin --kill' from your custom script) until the Internet comes back. This really won't waste any extra electricity, as the GPUs will be idle with no work available for them. It also has the advantage of dealing with miner software that has locked up (an example of which has already been shown here). And it fits the philosophy of smartcoin: once it's running, it should try to deal with things automatically in a sane manner, without manual intervention. (If I hadn't had to reboot my mining machine from time to time for testing purposes, I would wager that smartcoin would have ensured me 0% downtime with no intervention from me other than initially setting up failover. It has literally dealt with every problem automatically on my machine, and I've had to do nothing but check in every now and then.) My goal is to have the lockup detection offer this same robustness!
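As a rough illustration of the custom-script idea above, here is a minimal sketch of a script that counts recent lockup events and only gives up after too many of them. It assumes smartcoin runs the script once per lockup event and that the planned 'smartcoin --kill' option exists as described; STATE_FILE, MAX_RESTARTS, WINDOW_SECS and both function names are hypothetical and not part of smartcoin.

```shell
#!/bin/bash
# Hypothetical custom lockup script (a sketch, not part of smartcoin).
# Assumption: smartcoin calls this script once for every lockup event.

STATE_FILE="${STATE_FILE:-/tmp/smartcoin_lockups.log}"  # illustrative path
MAX_RESTARTS="${MAX_RESTARTS:-6}"    # give up after this many lockups...
WINDOW_SECS="${WINDOW_SECS:-3600}"   # ...inside this many seconds

# Record a lockup at time $1 (epoch seconds) and print how many
# lockups, including this one, fall inside the window.
count_recent_lockups() {
    local now="$1" cutoff
    cutoff=$(( now - WINDOW_SECS ))
    echo "$now" >> "$STATE_FILE"
    # Keep only timestamps newer than the cutoff.
    awk -v c="$cutoff" '$1 > c' "$STATE_FILE" > "$STATE_FILE.tmp"
    mv "$STATE_FILE.tmp" "$STATE_FILE"
    wc -l < "$STATE_FILE" | tr -d ' '
}

# Print "kill" once the limit is reached, otherwise "restart".
lockup_action() {
    local recent
    recent=$(count_recent_lockups "$1")
    if [ "$recent" -ge "$MAX_RESTARTS" ]; then
        echo kill
    else
        echo restart
    fi
}

# Possible use at the end of the script (planned option, not yet available):
#   [ "$(lockup_action "$(date +%s)")" = kill ] && smartcoin --kill
```

With the defaults shown, the script would let smartcoin keep restarting through a short Internet outage and would only stop it after six lockups within an hour.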
That will be pretty good.
My miners are still being killed about every 3 hours because of false lockup detection.
After restarting smartcoin, all GPUs are submitting proof of work again, so the GPUs are not actually locked up.
I guess there could be many reasons (Internet down, server partly down, or something we don't know about yet).
Can you make the log more verbose?
And why were 2 profiles running together for 8 minutes until they got killed?
Shouldn't it switch to profile 1 or 2 within seconds?
07/19/11 16:14:15 Update option selected
07/19/11 17:32:16 A change was detected in the failover system
07/19/11 17:32:16 Killing Miners....
07/19/11 17:32:20 Starting miner Miner.1!
07/19/11 17:32:20 Starting miner Miner.2!
07/19/11 17:32:20 Starting miner Miner.3!
07/19/11 17:32:20 Starting miner Miner.4!
07/19/11 17:32:20 Starting miner Miner.5!
07/19/11 17:32:20 Starting miner Miner.6!
07/19/11 17:32:20 Starting miner Miner.7!
07/19/11 17:32:20 Starting miner Miner.8!
07/19/11 17:32:20 Launching miner with launch string: python phoenix.py -v -u
07/19/11 17:32:20 Launching miner with launch string: python phoenix.py -v -u h
07/19/11 17:32:20 Starting miner Miner.9!
07/19/11 17:32:20 Launching miner with launch string: python phoenix.py -v -u
07/19/11 17:32:20 Starting miner Miner.10!
07/19/11 17:32:20 Launching miner with launch string: python phoenix.py -v -u
07/19/11 17:32:20 Launching miner with launch string: ./cgminer -a 4way -t 6 -o
07/19/11 17:32:20 Launching miner with launch string: python phoenix.py -v -u
07/19/11 17:32:20 Launching miner with launch string: python phoenix.py -v -u
07/19/11 17:32:20 Launching miner with launch string: python phoenix.py -
07/19/11 17:32:20 Launching miner with launch string: ./cgminer -a 4way -t 6
07/19/11 17:32:20 Launching miner with launch string: python phoenix.py -v -u htt
07/19/11 17:40:36 ERROR: It appears that one or more of your devices have locked up. This is most likely the result of extreme overclocking!
07/19/11 17:40:36 It is recommended that you reduce your overclocking until you regain stability of the system
07/19/11 17:40:36 Killing Miners....
So --kill, --reload and --restart will be nice.
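For anyone wondering how such options might be wired up, here is a small sketch of an option dispatcher in bash. It is only a guess at the shape: parse_action and the action names it prints are hypothetical, and nothing here reflects how smartcoin.sh actually handles its arguments.

```shell
#!/bin/bash
# Sketch only: map each planned command-line option to an action name.
# parse_action is a hypothetical helper, not taken from smartcoin.sh.
parse_action() {
    case "$1" in
        --kill)    echo "kill" ;;      # stop smartcoin and its miners
        --reload)  echo "reload" ;;    # reload without a full restart
        --restart) echo "restart" ;;   # stop, then start again
        "")        echo "run" ;;       # no option: normal startup
        *)         echo "unknown option: $1" >&2; return 1 ;;
    esac
}
```

A custom script could then call something like 'smartcoin --restart' and the main script would branch on the printed action name.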