Topic: [Avalon] How to automate restarting of Avalon/cgminer when it stops mining? (Read 5316 times)

sr. member
Activity: 266
Merit: 250
Hi all. cgminer seems to have a connect-timeout problem when the primary pool's port is in a filtered state: it fails to start and hangs waiting for the connection to the first pool to time out. I have seen this a number of times, but I can't set a connect timeout in cgminer, and I can't install tools like nmap to check the port because the image space is already used for other things.

So I coded this temporary workaround, which switches the first pool, until it becomes possible to set a connect timeout in the cgminer options. Insert it in /etc/init.d/cgminer after the checks of the user-supplied pool strings.

Please note that you will need another external host with HTTP and PHP, where you put simple-portscan.php.
Maybe it's not the best solution, but it works for me.

Quote
# this is to check the checker host
scan_host="freepc";
scan_host_test=`ping -w 1 -c 1 ${scan_host} | grep "64 bytes from ${scan_host}"`;

# set first pool to working one
if [ -n "$scan_host_test" ]
then
        CHECK_POOL1=`wget "http://${scan_host}/simple-portscan.php?url=${_pool1url}" -q -O - | grep open`;
        CHECK_POOL2=`wget "http://${scan_host}/simple-portscan.php?url=${_pool2url}" -q -O - | grep open`;
        CHECK_POOL3=`wget "http://${scan_host}/simple-portscan.php?url=${_pool3url}" -q -O - | grep open`;

        if [ -z "$CHECK_POOL1" ] && [ -n "$CHECK_POOL2" ]
        then
                POOL1=$POOL2;
                POOL2="";
        else
                if [ -z "$CHECK_POOL1" ] && [ -n "$CHECK_POOL3" ]
                then
                        POOL1=$POOL3;
                        POOL3="";
                fi
        fi
fi

echo "USING FIRST POOL: $POOL1";

simple-portscan.php
Quote
<?php
$url_string = $_GET["url"];
$url_arr = explode(":", $url_string);
$host = $url_arr[0];
$port = $url_arr[1];
$timeout = 1;

$fp = fsockopen($host,$port,$errno,$errstr,$timeout);
if($fp)
{
echo "port " . $port . " open on " . $host . "\n";
fclose($fp);
}
else
{
echo "port " . $port . " closed on " . $host . "\n";
}
flush();
?>
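The selection step above can also be written as a loop that returns the first pool whose port is reported open. This is only a sketch of that idea: `port_state` and `first_open_pool` are hypothetical names, the pool addresses are made up, and `port_state` reads canned results instead of calling simple-portscan.php via wget. Unlike the init script snippet, it only picks a pool and does not clear the promoted entry.

```shell
#!/bin/sh
# Sketch only: port_state is a hypothetical stand-in for the wget call to
# simple-portscan.php; here it reads canned results from PORT_STATES.
PORT_STATES="pool-a.example:3333=closed
pool-b.example:3333=open
pool-c.example:3333=open"

port_state() {
    # Print "open" or "closed" for the given host:port, "closed" if unknown.
    state=$(printf '%s\n' "$PORT_STATES" | grep "^$1=" | cut -d= -f2)
    echo "${state:-closed}"
}

first_open_pool() {
    # Echo the first argument whose port is reported open; fail if none is.
    for pool in "$@"; do
        if [ "$(port_state "$pool")" = "open" ]; then
            echo "$pool"
            return 0
        fi
    done
    return 1
}

first_open_pool pool-a.example:3333 pool-b.example:3333 pool-c.example:3333
```

In a real init script, `port_state` would wrap the wget line from the snippet above and the result would be assigned to POOL1.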
sr. member
Activity: 332
Merit: 250
Fixed it by flashing to 3-21, restoring the backup, then flashing to 3-25 while keeping settings.

also fixed the wwan not connecting issue:
- thanks go out to "senseless" and "\\\" from #avalon for this fix:

The IP for wifi CANNOT be on 192.168.0.xxx; it will conflict with the Avalon's internal IP, 192.168.0.100. Your subnet must be 10.x.x.x, or you can change it like I did to 192.168.1.x.

Once I made that change, all the other settings could stay at their defaults and I could connect over local wifi/WWAN.
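A quick way to check an address against that rule before applying it is sketched below. `conflicts_with_avalon` is a hypothetical helper name, and the check assumes a /24 netmask, which the post does not state.

```shell
#!/bin/sh
# Sketch only: conflicts_with_avalon is a hypothetical helper name.
# It assumes a /24 netmask, which the post does not state.
conflicts_with_avalon() {
    case "$1" in
        192.168.0.*) return 0 ;;   # same /24 as the internal 192.168.0.100
        *)           return 1 ;;
    esac
}

for ip in 192.168.0.23 192.168.1.23 10.0.0.5; do
    if conflicts_with_avalon "$ip"; then
        echo "$ip: conflicts with 192.168.0.100"
    else
        echo "$ip: ok"
    fi
done
```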
legendary
Activity: 1764
Merit: 1002
Leaving it plugged in does not work... well I haven't tried that actually, maybe I need to use a dummy plug ;)

No, this one unit is my problem child for sure.  Now it is averaging 50 GH/s with long idle times of 1 to 3 minutes every hour or so, so it is successfully restarting itself, but cgminer is stalling quite often, which reduces the average hash rate.  Not sure what the cause is.

I have not opened this case yet. Could this problem be caused by a faulty TP-Link or by USB hub problems?

I would definitely check all internal connections.  A number of people have reported cords that came loose or disconnected during shipping.
legendary
Activity: 3080
Merit: 1080
Check the firewall rules - yes, the router has iptables rules set. It may be programmed to ignore ssh/telnet connections on the wireless interface.

sr. member
Activity: 332
Merit: 250
It just happened on a different unit as well.  It stalls out, but as soon as you log in to it via a direct connection it hashes again.

BTW I still can't seem to log in over WWAN; I have to plug in directly with an ethernet cable.
sr. member
Activity: 332
Merit: 250
Leaving it plugged in does not work... well I haven't tried that actually, maybe I need to use a dummy plug ;)

No, this one unit is my problem child for sure.  Now it is averaging 50 GH/s with long idle times of 1 to 3 minutes every hour or so, so it is successfully restarting itself, but cgminer is stalling quite often, which reduces the average hash rate.  Not sure what the cause is.

I have not opened this case yet. Could this problem be caused by a faulty TP-Link or by USB hub problems?
hero member
Activity: 607
Merit: 500
I have it connected via ethernet.
After some days I can say the latest firmware, 3.25, is working great. Sometimes I see cgminer restarting, or the machine rebooting (I can't tell which), but cgminer is never idle. Also, lately the shares are counted correctly, so utility is back at 990 ;)
legendary
Activity: 1764
Merit: 1002
Still having trouble with 3.25 and the cgminer-monitor script fix both installed.  It happens about once per day with the more troubled units (some have been rock solid since day 1, others seem more trouble-prone).

After running fine for several hours, the unit stops mining and goes idle.  When I plug in the LAN cable to check the miner status, it restarts cgminer and gets right back to it like nothing happened.

So something about the LAN cable being plugged into the ethernet port on the TP-Link "wakes up" the system, or perhaps it causes some kind of process to run.  I want to report this as a bug; not sure if anyone else has had this problem.

Leave it plugged in? :D
sr. member
Activity: 332
Merit: 250
Still having trouble with 3.25 and the cgminer-monitor script fix both installed.  It happens about once per day with the more troubled units (some have been rock solid since day 1, others seem more trouble-prone).

After running fine for several hours, the unit stops mining and goes idle.  When I plug in the LAN cable to check the miner status, it restarts cgminer and gets right back to it like nothing happened.

So something about the LAN cable being plugged into the ethernet port on the TP-Link "wakes up" the system, or perhaps it causes some kind of process to run.  I want to report this as a bug; not sure if anyone else has had this problem.
legendary
Activity: 1764
Merit: 1002
Thanks for the script mills!  Worked like a charm for my perpetually stalling Avalon unit.  I can confirm that this is not included in the 3.25 firmware, because I updated to 3.25 and that did not fix it, but overwriting the cgminer in /usr/bin did.

Fantastic
sr. member
Activity: 332
Merit: 250
Thanks for the script mills!  Worked like a charm for my perpetually stalling Avalon unit.  I can confirm that this is not included in the 3.25 firmware, because I updated to 3.25 and that did not fix it, but overwriting the cgminer in /usr/bin did.
sr. member
Activity: 266
Merit: 250
I was using the following line just to get number of shares.
Code:
B=`cgminer-api | grep "\[Accepted\]" | cut -f2 -d">" | cut -f2 -d" "`;
(If you replace cut -f2 -d" " with sed "s/ //g", it does essentially the same thing.)
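To see that both variants give the same number, here is a small sketch run on a sample line rather than on live cgminer-api output. The sample line is an assumption based on the "   [Accepted] => X" layout described in this thread, and grep -F is used here only to match the literal text without escaping the brackets.

```shell
#!/bin/sh
# Sketch only: the sample line is an assumed cgminer-api output format,
# based on the "   [Accepted] => X" layout described in this thread.
sample='   [Accepted] => 1234'

# Variant 1: take the text after ">" and then the second space-separated field.
with_cut=$(printf '%s\n' "$sample" | grep -F "[Accepted]" | cut -f2 -d">" | cut -f2 -d" ")
# Variant 2: take the text after ">" and delete every space.
with_sed=$(printf '%s\n' "$sample" | grep -F "[Accepted]" | cut -f2 -d">" | sed "s/ //g")

echo "cut: $with_cut"
echo "sed: $with_sed"
```

The two differ only if the value itself contains spaces, which a share count should not.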

But the biggest problem begins when a pool goes offline in strange ways. For example,
nmap shows the pool's connection port as neither closed nor open, but filtered.
cgminer then gets stuck for a very long time on startup trying to test this pool, and
I can't configure the pool test timeout in the cgminer options. It seems the only option
is to build my own image with better timeout values or to add such an option to cgminer.
sr. member
Activity: 388
Merit: 250
You should check the status of cgminer.
You will see a restart of cgminer each time the cron job runs.
full member
Activity: 155
Merit: 100
Quasi fixed the issue with the miner quitting. The cgminer-monitor script has an error in it which writes out "   [ACCEPTED] => X" in the file it's comparing against "[ACCEPTED] => X". These extra spaces caused the files to not match which causes the script to think that cgminer is still mining correctly. This script below removes all spaces from the files when they are created and makes the checking accurate. Replace the contents of /usr/bin/cgminer-monitor with the script below and the cron job should once again be able to properly reset cgminer when it stops mining.



#!/bin/sh
# This file is for cron job

C=`pidof cgminer | wc -w`
if [ "$C" != "1" ]; then
   /etc/init.d/cgminer stop
   /etc/init.d/cgminer start
   exit 0;
fi

A=`cat /tmp/cm.log | sed "s/ //g"`
B=`cgminer-api  | grep "^   \[Accepted\]" | sed "s/ //g"`
echo $B > /tmp/cm.log
if [ "$A" == "$B" ]; then
   /etc/init.d/cgminer stop
   /etc/init.d/cgminer start
   exit 0;
fi


This is a good catch.  I've changed mine like this as well and will see if this does the trick.  Thanks!
This will restart cgminer each time the cron job runs.

Not true. It's working as intended and has actually saved me twice today on one of my Avalons. The first part of the script stops and starts cgminer if it can't detect a pid for it. The second part compares accepted shares from five minutes ago (if that's where your cron is scheduled) to the current count. If it's different, it's assumed that everything is working. If it's the same, it's assumed the miner has stalled but not quit, and a restart is initiated. The only difference between my script and the one already in there is the sed commands that remove spaces from the values it echoes out and compares, so there is no false negative.
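One plausible way the whitespace mismatch arises is an unquoted echo, which drops the leading spaces when the line is written to the log. The sketch below shows that effect and how stripping spaces on both sides makes the comparison reliable. The sample line and the temp file are stand-ins for cgminer-api output and /tmp/cm.log, and `strip` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: the sample line and the temp file stand in for cgminer-api
# output and /tmp/cm.log; both are assumptions for illustration.
set -f                          # no globbing, so [Accepted] is not a pattern
line='   [Accepted] => 42'
log=$(mktemp)

echo $line > "$log"             # unquoted on purpose: leading spaces are lost
saved=$(cat "$log")

# Direct comparison: the saved copy no longer matches the fresh line.
if [ "$saved" = "$line" ]; then raw=same; else raw=different; fi

# Comparison after removing all spaces from both values.
strip() { printf '%s\n' "$1" | sed 's/ //g'; }
if [ "$(strip "$saved")" = "$(strip "$line")" ]; then norm=same; else norm=different; fi

echo "raw comparison:        $raw"
echo "normalized comparison: $norm"
rm -f "$log"
```

With an unchanged share count, only the normalized comparison reports a match, which is the case where the monitor should restart cgminer.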
+1
newbie
Activity: 30
Merit: 0
sr. member
Activity: 388
Merit: 250
This will restart cgminer each time the cron job runs.
legendary
Activity: 3080
Merit: 1080
Hmm, so I guess that is indeed the fix. The latest testing firmware includes this fix:

http://downloads.qi-hardware.com/people/xiangfu/avalon/next-testing/

I think I shall wait until it's officially released out of the testing phase before updating. For now I've noticed no restarts.

The latest testing firmware does not include this fix.

"Update cgminer-monitor, fix [Accept] give null at the first few seconds of cgminer start
Fix a typo on /usr/bin/cgminer-monitor, which make it cannot restart cgminer when no Accept"

I understood that mention to mean that they fixed it, but to be honest I did not look at the code in the "NEXT" firmware.

There is also a note on how to fix the cgminer-monitor script for those who updated to 2013/03/21:

Code:
sed -i 's/ $B / "$B" /' /usr/bin/cgminer-monitor
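To see what that substitution does without touching the real file, it can be run on a single line through a pipe. The "before" line below is hypothetical; the actual line in the 2013/03/21 cgminer-monitor is not shown in this thread.

```shell
#!/bin/sh
# Sketch only: the "before" line is hypothetical; the actual line in the
# 2013/03/21 cgminer-monitor is not shown in this thread.
before='if [ "$A" == $B ]; then'

# Same expression as the one-liner above, minus -i, so it edits the stream only.
after=$(printf '%s\n' "$before" | sed 's/ $B / "$B" /')

echo "before: $before"
echo "after:  $after"
```

Quoting the variable matters because an unquoted value that is empty or contains spaces changes the number of arguments the test receives.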

hero member
Activity: 607
Merit: 500
How about the wrong share statistics in the 'Cgminer Status' window and the silly utility of 15?
Also, the need to enter the password whenever you change a tab is annoying :)
legendary
Activity: 1890
Merit: 1003
I am running the test firmware (3/25/2013 Next) as well. Nothing seems amiss at the moment.
full member
Activity: 155
Merit: 100
Hmm, so I guess that is indeed the fix. The latest testing firmware includes this fix:

http://downloads.qi-hardware.com/people/xiangfu/avalon/next-testing/

I think I shall wait until it's officially released out of the testing phase before updating. For now I've noticed no restarts.

The latest testing firmware does not include this fix.