And now, I announce the first ever automated block checking and restart script for your masternodes.
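#!/bin/bash
# Watchdog: restart dashd if the block height has not advanced between checks.

# Print the current block height, or nothing if dash-cli fails.
get_blocks() {
    /home/dash/dash-cli getinfo | grep "blocks" | grep -Eo '[0-9]{6,100}'
}

blockold=$(get_blocks)
while true
do
    sleep 1000
    blocknew=$(get_blocks)
    # An empty height (dashd not running or not answering) counts as a stall.
    if [ -n "$blocknew" ] && [ -n "$blockold" ] && [ "$blocknew" -gt "$blockold" ]
    then
        blockold=$blocknew
    else
        # Save the tail of the debug log before restarting.
        tail -n 30 /home/dash/.dash/debug.log >> crash.log
        # Try a clean shutdown first, then force-kill anything left.
        /home/dash/dash-cli stop
        sleep 30
        pkill -9 dashd
        sleep 10
        # Assumes daemon=1 in dash.conf; otherwise this call blocks the loop.
        /home/dash/dashd
        sleep 60
        blockold=$(get_blocks)
    fi
done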
This program grabs the current block height, then waits 1000 seconds (about 16 minutes; the longest block I found took about 13 minutes) and checks the height again. If the new height is bigger, it stores the new height as the old one and waits another 16 minutes. If the new height is not bigger, we assume dashd is locked up. The script first logs the last 30 lines of debug.log and tries to shut dashd down nicely. If that doesn't work, it kills the process. Then it restarts dashd and reads the block height again. This should also restart dashd if the check gets no output (for example, if dashd isn't running).
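As a rough sketch of the "no output" case described above, the height lookup could be split into a small helper that reports failure when no height comes back. The name parse_blocks is made up for illustration, and the sample input in the usage note only imitates the general shape of getinfo output; it is not copied from a real node.

```shell
# parse_blocks: read getinfo-style text on stdin and print the "blocks" value.
# Hypothetical helper name; returns non-zero when no height is found.
parse_blocks() {
    local height
    # Keep only the line with the "blocks" key, then pull out the first number.
    height=$(grep -E '"blocks"' | grep -Eo '[0-9]+' | head -n 1)
    [ -n "$height" ] || return 1
    printf '%s\n' "$height"
}

# Usage sketch (not run here):
#   if height=$(/home/dash/dash-cli getinfo | parse_blocks); then
#       echo "current height: $height"
#   else
#       echo "dashd did not answer"
#   fi
```

Unlike the 6-digit pattern in the script, this takes whatever number sits on the "blocks" line, so it would also work on a chain that is still below block 100000.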
For those with more than one masternode on a server, you should rename dashd to dashd1, dashd2, etc., and then use pkill -9 dashd1, pkill -9 dashd2, etc. If you just use pkill -9 dashd, it will kill every process with dashd in its name. If there is any interest, I could make a script for multiple nodes.
Start from crontab or with screen. Go back a dozen pages for more info on those options.
As of last night, I found 122 recent blocks (since 300000) over 1000 seconds apart.
(ranging from 17 to 40 minutes)
(block number, seconds since last, pasted here):
https://www.zerobin.net/?4ac778770e4c182a#sBpr6QYqHJSVW8tgPgTegOKgsgoVYpYqQ5/9Tvy9U9s=
Your 1000 second sleep isn't long enough. You'll end up restarting unnecessarily.
I'm sure there are better test criteria to use, but I don't have any
suggestions at this time.
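One possible tweak, offered only as a sketch and not something tested on a real masternode: keep polling as before, but restart only once the height has been unchanged for longer than the worst gap seen so far. The one-hour limit and the names STALL_LIMIT and is_stalled are assumptions for illustration; the only number taken from this thread is the 40-minute maximum gap.

```shell
# Hypothetical threshold: one hour, chosen only because it exceeds the
# 40-minute gaps mentioned above. Tune it to your own observations.
STALL_LIMIT="${STALL_LIMIT:-3600}"

# is_stalled LAST_CHANGE NOW
# Succeeds (exit 0) when the height has been unchanged for more than
# STALL_LIMIT seconds. Both arguments are Unix timestamps in seconds.
is_stalled() {
    local last_change=$1 now=$2
    [ $(( now - last_change )) -gt "$STALL_LIMIT" ]
}

# Usage sketch inside a polling loop (not run here):
#   height=$(/home/dash/dash-cli getinfo | grep "blocks" | grep -Eo '[0-9]+')
#   if [ "$height" != "$last_height" ]; then
#       last_height=$height; last_change=$(date +%s)
#   elif is_stalled "$last_change" "$(date +%s)"; then
#       ... restart dashd ...
#   fi
```

With this shape the poll interval can stay short, since a single slow block no longer triggers a restart on its own.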
Also, during shutoff 'dashd' renames its process to 'dash-shutoff'. I'm not
sure if pkill will pick that up; killall doesn't. I do 'dash-cli stop ;
sleep 20 ; killall -9 dashd dash-shutoff' to allow the daemon time to
shut down and to be sure I've killed the (probably hung-on-closing) dashd
(now named dash-shutoff).
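The stop sequence quoted above could be wrapped in a function along these lines. This is a minimal sketch: the function name stop_dashd and the DASH_CLI and GRACE_SECONDS variables are made up for illustration, and the dash-cli path is borrowed from the script earlier in the thread.

```shell
# Hypothetical settings; adjust the path and grace period for your own setup.
DASH_CLI="${DASH_CLI:-/home/dash/dash-cli}"
GRACE_SECONDS="${GRACE_SECONDS:-20}"

stop_dashd() {
    # Ask the daemon to shut down cleanly; ignore the error if it is already gone.
    "$DASH_CLI" stop || true
    # Give it time to exit on its own.
    sleep "$GRACE_SECONDS"
    # Force-kill anything left under either process name.
    killall -9 dashd dash-shutoff 2>/dev/null || true
}

# Usage sketch (not run here):
#   stop_dashd && /home/dash/dashd
```

Both names go to killall in one call, so a daemon that hung while closing (and is now named dash-shutoff) is caught as well.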
HTH
BTW, I run several hundred instances of dashd, and haven't found a need to
force-restart due to hangs. Version 12.0.53 has proven very stable for me.
If you are having recurring hangs, check your hosting/build environment.
I use the gitian-built distributed files from
https://www.dashpay.io/downloads/
I use DigitalOcean to host my masternodes