Topic: ethminer monitoring restart script (hanging, CUDA errors, etc)

I've put together a Python 3 process wrapper for `ethminer` that monitors its output and restarts it when necessary. Because it restarts on CUDA error output, you can push your overclock settings a bit further and be more hands-off with your rigs. I've only tested it on Ubuntu 16.04. It doesn't work on Windows; see the notes below - sorry!

Code:
import os
import signal
import subprocess
import sys

try:
    TIMEOUT_NO_ACTIVITY_SECONDS = int(os.getenv('TIMEOUT_NO_ACTIVITY_SECONDS', 60))
except ValueError:
    # Fall back to the default if the variable is not a valid integer
    TIMEOUT_NO_ACTIVITY_SECONDS = 60


class MinerException(Exception):
    pass


class TimeoutException(Exception):
    pass


def timeout_handler(signum, frame):
    raise TimeoutException("No activity from ethminer for {} seconds".format(TIMEOUT_NO_ACTIVITY_SECONDS))


def execute(cmd):
    signal.signal(signal.SIGALRM, timeout_handler)

    shutdown = False
    while not shutdown:
        proc = subprocess.Popen(cmd,
                                bufsize=0,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT,
                                universal_newlines=True)

        try:
            # Arm the watchdog; every line of output below re-arms it
            signal.alarm(TIMEOUT_NO_ACTIVITY_SECONDS)
            for line in iter(proc.stdout.readline, ""):
                line = line.strip()
                print(line)
                if line.startswith('CUDA error'):
                    raise MinerException('****** Restarting due to CUDA error')
                signal.alarm(TIMEOUT_NO_ACTIVITY_SECONDS)
        except (MinerException, TimeoutException) as e:
            print('\n\n', str(e), '\n\n')
        except KeyboardInterrupt:
            shutdown = True

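        # Cancel any pending alarm before shutting the miner down
        signal.alarm(0)

        # Ask ethminer to exit cleanly, as Ctrl-C would
        proc.send_signal(signal.SIGINT)
        proc.stdout.close()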

        try:
            proc.wait(timeout=15)
        except subprocess.TimeoutExpired:
            print("Miner didn't shut down within 15 seconds")
            proc.kill()


if __name__ == "__main__":
    execute(sys.argv[1:])

It starts ethminer in a subprocess; it then:

  • checks ethminer output for CUDA error (usually related to overclocking memory errors) and
  • detects long waits/delays for output (sometimes ethminer gets hung waiting for server to respond, etc)

If either happens, the wrapper will kill the ethminer subprocess and restart it. To use it, save the script as emwrapper.py and put it in front of your usual ethminer command line, like this:

Code:
python3 emwrapper.py ethminer -v 9 -U

To change how many seconds the wrapper waits for ethminer output before treating it as hung (i.e. hang/freeze/delay detection), set the TIMEOUT_NO_ACTIVITY_SECONDS environment variable (defaults to 60 seconds):

Code:
TIMEOUT_NO_ACTIVITY_SECONDS=300 python3 emwrapper.py ethminer -v 9 -U
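
To make the fallback behaviour explicit, here is a rough sketch of how the timeout lookup behaves. `read_timeout` and `DEFAULT_TIMEOUT_SECONDS` are hypothetical names used only for illustration; the script itself does the same thing inline at import time.

```python
import os

DEFAULT_TIMEOUT_SECONDS = 60  # same default the wrapper uses


def read_timeout(environ=os.environ, default=DEFAULT_TIMEOUT_SECONDS):
    """Return the no-activity timeout in seconds (hypothetical helper)."""
    raw = environ.get('TIMEOUT_NO_ACTIVITY_SECONDS')
    if raw is None:
        # Variable not set: use the default
        return default
    try:
        return int(raw)
    except ValueError:
        # Non-numeric values such as "5m" fall back to the default
        return default


if __name__ == "__main__":
    print(read_timeout({'TIMEOUT_NO_ACTIVITY_SECONDS': '300'}))
    print(read_timeout({'TIMEOUT_NO_ACTIVITY_SECONDS': '5m'}))
    print(read_timeout({}))
```

So a value the script can't parse as an integer is silently ignored rather than treated as an error; if your timeout doesn't seem to take effect, check the variable for typos.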

  • Note that Python2 is not supported as it does not have `subprocess.TimeoutExpired`
  • Note that Windows is not supported as it does not have `signal.alarm`
  • If this helped you, please send a small ETH donation to 0x4B005e68D323bdABD8eeD1D415117Ff1B57b3EC5
  • If you have any questions/comments/issues/fixes with this, post a comment below
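
For anyone who wants to experiment on a platform without `signal.alarm`, one possible direction is a timer thread instead of an alarm signal. This is only a sketch of a different technique, not what the script above does, and it has not been checked against ethminer or on Windows. `run_with_watchdog` is a hypothetical name, and the sketch leaves out the CUDA error check and the restart loop.

```python
import subprocess
import sys
import threading


def run_with_watchdog(cmd, timeout_seconds):
    """Run cmd, echo its output, and kill it after timeout_seconds of silence.

    Returns True if the watchdog fired, False if the process ended on its own.
    """
    proc = subprocess.Popen(cmd,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            universal_newlines=True)
    fired = threading.Event()

    def on_timeout():
        fired.set()
        proc.kill()  # the pipe then hits EOF, which ends the read loop below

    def arm():
        timer = threading.Timer(timeout_seconds, on_timeout)
        timer.daemon = True
        timer.start()
        return timer

    timer = arm()
    try:
        for line in proc.stdout:
            print(line.rstrip())
            # Output arrived: replace the running timer with a fresh one
            timer.cancel()
            timer = arm()
    finally:
        timer.cancel()
        proc.stdout.close()
        proc.wait()
    return fired.is_set()


if __name__ == "__main__":
    # Demo with a child that never prints anything
    silent = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_watchdog(silent, 2))
```

Unlike the wrapper above, this kills the process outright instead of sending SIGINT first, so the miner gets no chance to shut down cleanly.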