
Topic: [Guide] How to setup an automated headless LinuxCoin mining rig + watchdog - page 2

hero member
Activity: 711
Merit: 500
Code:
root@linuxcoin:/home/user# export DISPLAY=`cat /home/user/.display`
root@linuxcoin:/home/user# pc=`ps waxuf | grep miner1.sh -c`
root@linuxcoin:/home/user# ld=`aticonfig --odgc --adapter=0 | grep "GPU load" | cut -c 30-35 | cut -d % -f 1`
root@linuxcoin:/home/user# if [ $pc -lt "2" ] || [ $ld -lt "50" ] ; then
>  kill `ps -ef | grep miner1 | grep -v grep | awk '{print $2}'`
>  lxterminal --title miner1 --command sh /home/user/miner1.sh &
>  date +"%D %r miner1 restarted" >> /home/user/cron_job.log
> fi
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
[1] 3668
root@linuxcoin:/home/user#
(lxterminal:3668): Gtk-WARNING **: cannot open display:


I tried a fresh install...still doesn't want to work, using the default restart script.
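
For what it's worth, the "kill: usage" line in that output is what kill prints when it is handed no PIDs at all, which is what happens when the backtick pipeline finds no running miner1 process. A minimal sketch of a guard, assuming bash. The helper name kill_if_any is made up for illustration and is not part of the original restart script:

```shell
#!/bin/bash
# kill_if_any [PID...]
# Hypothetical helper: call kill only when at least one PID was found,
# so an empty process list no longer produces the "kill: usage" message.
kill_if_any() {
    [ "$#" -gt 0 ] || return 0   # nothing to kill is not an error
    kill "$@"
}

# Possible use inside restart.sh (same pipeline as in the script above):
#   kill_if_any `ps -ef | grep miner1 | grep -v grep | awk '{print $2}'`
```

The pipeline is left unquoted on purpose so that each PID becomes its own argument and an empty result becomes no arguments.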
full member
Activity: 133
Merit: 100
That's weird. I don't get that message. Try running the aticonfig line as root.
hero member
Activity: 711
Merit: 500
the restart script doesn't seem to be working for me... followed this exactly

is it working for others?
Try this:
Exit out one of the miners.
Then run sh restart.sh
See what happens.

It doesn't restart them. It logs it, but doesn't start the miners back up for some reason.
Any errors when you manually run it?


Code:
user@linuxcoin:~$ export DISPLAY=`cat /home/user/.display`
user@linuxcoin:~$ pc=`ps waxuf | grep miner1.sh -c`
user@linuxcoin:~$ ld=`aticonfig --odgc --adapter=0 | grep "GPU load" | cut -c 30-35 | cut -d % -f 1`
aticonfig: This program must be run as root when no X server is active
user@linuxcoin:~$ if [ $pc -lt "2" ] || [ $ld -lt "50" ] ; then
>  kill `ps -ef | grep miner1 | grep -v grep | awk '{print $2}'`
>  lxterminal --title miner1 --command "sh /home/user/miner1.sh" &
>  date +"%D %r miner1 restarted" >> /home/user/cron_job.log
> fi
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
[1] 3583
user@linuxcoin:~$ pc=`
(lxterminal:3583): Gtk-WARNING **: cannot open display:

[1]+  Exit 1                  lxterminal --title miner1 --command "sh /home/user/miner1.sh"
Code:
user@linuxcoin:~$ export DISPLAY=`cat /home/user/.display`
user@linuxcoin:~$ pc=`ps waxuf | grep miner1.sh -c`
user@linuxcoin:~$ ld=`aticonfig --odgc --adapter=0 | grep "GPU load" | cut -c 30-35 | cut -d % -f 1`
aticonfig: This program must be run as root when no X server is active
user@linuxcoin:~$ if [ $pc -lt "2" ] || [ $ld -lt "50" ] ; then
>  kill `ps -ef | grep miner1 | grep -v grep | awk '{print $2}'`
>  lxterminal --title miner1 --command sh /home/user/miner1.sh &
>  date +"%D %r miner1 restarted" >> /home/user/cron_job.log
> fi
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
[1] 3964
user@linuxcoin:~$ pc=`ps waxuf | grep miner2.sh -c`

(lxterminal:3964): Gtk-WARNING **: cannot open display:
[1]+  Exit 1                  lxterminal --title miner1 --command sh /home/user/miner1.sh
user@linuxcoin:~$ ld=`aticonfig --odgc --adapter=1 | grep "GPU load" | cut -c 30-35 | cut -d % -f 1`
aticonfig: This program must be run as root when no X server is active
user@linuxcoin:~$ if [ $pc -lt "2" ] || [ $ld -lt "50" ] ; then
>  kill `ps -ef | grep miner2 | grep -v grep | awk '{print $2}'`
>  lxterminal --title miner2 --command sh /home/user/miner2.sh &
>  date +"%D %r miner2 restarted" >> /home/user/cron_job.log
> fi
bash: [: -lt: unary operator expected
user@linuxcoin:~$ pc=`ps waxuf | grep miner3.sh -c`
user@linuxcoin:~$ ld=`aticonfig --odgc --adapter=2 | grep "GPU load" | cut -c 30-35 | cut -d % -f 1`
aticonfig: This program must be run as root when no X server is active
user@linuxcoin:~$ if [ $pc -lt "2" ] || [ $ld -lt "50" ] ; then
>  kill `ps -ef | grep miner3 | grep -v grep | awk '{print $2}'`
>  lxterminal --title miner3 --command sh /home/user/miner3.sh &
>  date +"%D %r miner3 restarted" >> /home/user/cron_job.log
> fi

?? not sure if I'm doing this right

thanks for the help
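
The "[: -lt: unary operator expected" error in the output above shows up when $ld (or $pc) is empty and unquoted, so the test collapses to `[ -lt "50" ]`. Here $ld is empty because aticonfig refused to run. A minimal sketch of a comparison that survives empty values, assuming bash. needs_restart is a hypothetical name. The thresholds 2 and 50 are the ones from the script. Treating an empty value as 0 is an assumption, and it means a failed aticonfig read counts as "low load" and triggers a restart:

```shell
#!/bin/bash
# needs_restart PROCESS_COUNT GPU_LOAD
# Hypothetical helper: succeeds (status 0) when the miner should be restarted,
# i.e. fewer than 2 matching process lines or GPU load under 50%.
needs_restart() {
    local pc="${1:-0}"   # assumption: an empty count is treated as 0
    local ld="${2:-0}"   # assumption: an empty load (aticonfig failed) is treated as 0
    [ "$pc" -lt 2 ] || [ "$ld" -lt 50 ]
}

# Possible use inside restart.sh:
#   if needs_restart "$pc" "$ld" ; then ... fi
```

Quoting "$pc" and "$ld" at the call site is what keeps an empty value from vanishing from the argument list.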
member
Activity: 98
Merit: 11
MiningMonitor is only 1BTC/month. Pete has added so many new features in the last two weeks that it is fully worth the cost. The proxy + miningmonitor has solved all of my uptime and notification issues. For example, if one of my 12 GPUs goes down (or all b/c a pool is offline) at 10am and I'm at the office and don't get home until 6pm - then it definitely makes sense to pay 1BTC/month to know about it prior to losing 8 hours of mining time. Even if I were to ssh into my home LAN (which I do anyway) to fix things, that's a whole bunch of micromanagement during the work day when I'm supposed to be doing things at work. Handling the miners should almost never happen - they should be online 24x7 and if the primary pool is down then the secondary and tertiary pool should be available. If my outgoing internet connection on the LAN goes down then that's another story - and, once I get my secondary cisco router online - the internet will automagically connect to my secondary provider.

Miners must be active 24x7!
I think you missed the point. When miningmonitor notifies you of a downed miner you still have to manually reset it. What the script does is detect the downed miner and restart it without you having to step in.

MiningMonitor
Miner down -> MiningMonitor notification -> You reset -> miner

Scripts
miner down -> script detects -> script resets -> miner

No middle man -> No money spent

Anyways, that's how I view it. If MiningMonitor works for your setup then by all means use it. I'm not here to make you change your setup, I'm just stating the way I see it.

Yes indeed 24x7 mining ftw!

I understand what SmartCoin does - but since I have actually spent time with both SmartCoin and BTC-mining-proxy I can say from *my* experience that the proxy accomplishes the purpose much more easily and cleanly than SmartCoin. I can switch pools for all 12 GPUs in one click of a web interface with BTC Mining Proxy, whereas with SmartCoin I have to SSH into my bastion host, then ssh into the rigs, then get into the smartcoin interface, then issue a couple of commands. The proxy has a protected web interface that I can access much more easily.

I'm not trying to bad-mouth smartcoin - it's a valid project and I'm glad it's being developed. I just don't think it works as well as the combination of miningmonitor (trending and graphs and email/sms notification) and the proxy (pool management, worker management web interface, dashboard to see the state of all 12 GPUs on one screen).

If smartcoin works for you then that's great. It frustrated me and I didn't like the way it manages workers/pools/profiles/etc.
full member
Activity: 133
Merit: 100
MiningMonitor is only 1BTC/month. Pete has added so many new features in the last two weeks that it is fully worth the cost. The proxy + miningmonitor has solved all of my uptime and notification issues. For example, if one of my 12 GPUs goes down (or all b/c a pool is offline) at 10am and I'm at the office and don't get home until 6pm - then it definitely makes sense to pay 1BTC/month to know about it prior to losing 8 hours of mining time. Even if I were to ssh into my home LAN (which I do anyway) to fix things, that's a whole bunch of micromanagement during the work day when I'm supposed to be doing things at work. Handling the miners should almost never happen - they should be online 24x7 and if the primary pool is down then the secondary and tertiary pool should be available. If my outgoing internet connection on the LAN goes down then that's another story - and, once I get my secondary cisco router online - the internet will automagically connect to my secondary provider.

Miners must be active 24x7!
I think you missed the point. When miningmonitor notifies you of a downed miner you still have to manually reset it. What the script does is detect the downed miner and restart it without you having to step in.

MiningMonitor
Miner down -> MiningMonitor notification -> You reset -> miner

Scripts
miner down -> script detects -> script resets -> miner

No middle man -> No money spent

Anyways, that's how I view it. If MiningMonitor works for your setup then by all means use it. I'm not here to make you change your setup, I'm just stating the way I see it.

Yes indeed 24x7 mining ftw!
full member
Activity: 133
Merit: 100
the restart script doesn't seem to be working for me... followed this exactly

is it working for others?
Try this:
Exit out one of the miners.
Then run sh restart.sh
See what happens.

It doesn't restart them. It logs it, but doesn't start the miners back up for some reason.
Any errors when you manually run it?
hero member
Activity: 711
Merit: 500
the restart script doesn't seem to be working for me... followed this exactly

is it working for others?
Try this:
Exit out one of the miners.
Then run sh restart.sh
See what happens.

It doesn't restart them. It logs it, but doesn't start the miners back up for some reason.
full member
Activity: 238
Merit: 100
Headless automation is great. Thanks for the write up.

From personal experience, it's way easier to manage pool H/A with Bitcoin Mining Proxy than with Smartcoin or some homegrown scripts that watch load. I have 4 boxes with 3 cards each and I'm using MiningMonitor to notify me of down'd workers -  having to ssh into a box to see Smartcoin information is a waste of my time. I have a separate web/db server running the proxy app so it doesn't get affected by rigs that need to reboot or have work done to them. YMMV.
Not to try to take focus away from the main topic of the thread here - but..
I think you have mistaken the purpose of smartcoin - it's not meant to be a monitor at all. It's meant to automatically control your mining operation so that you don't have to check on it all the time (failovers, lockup detection, automatic restarting of miners, etc.) - as well as give you features like profiles (I can change where I am mining to in less than 5 seconds just by changing to another profile that I have defined), easy multi-instance setups, etc. Coming versions will have multi-machine support, munin plugin support (trending graphs viewable over the internet), email/SMS notifications, etc. Sure, it has an overview "status" screen where you can look at a quick overview of the operation, but that's not the purpose of smartcoin.
member
Activity: 98
Merit: 11
Headless automation is great. Thanks for the write up.

From personal experience, it's way easier to manage pool H/A with Bitcoin Mining Proxy than with Smartcoin or some homegrown scripts that watch load. I have 4 boxes with 3 cards each and I'm using MiningMonitor to notify me of down'd workers -  having to ssh into a box to see Smartcoin information is a waste of my time. I have a separate web/db server running the proxy app so it doesn't get affected by rigs that need to reboot or have work done to them. YMMV.
You have a point, but this is designed to keep the GPUs online, not the connection to the pool alive. If the pool goes down then no matter how many restarts, the pool isn't going to come back up. And MiningMonitor is kinda expensive. Isn't it more profitable to let the worker stay down until you get home than to pay MiningMonitor's fees?

MiningMonitor is only 1BTC/month. Pete has added so many new features in the last two weeks that it is fully worth the cost. The proxy + miningmonitor has solved all of my uptime and notification issues. For example, if one of my 12 GPUs goes down (or all b/c a pool is offline) at 10am and I'm at the office and don't get home until 6pm - then it definitely makes sense to pay 1BTC/month to know about it prior to losing 8 hours of mining time. Even if I were to ssh into my home LAN (which I do anyway) to fix things, that's a whole bunch of micromanagement during the work day when I'm supposed to be doing things at work. Handling the miners should almost never happen - they should be online 24x7 and if the primary pool is down then the secondary and tertiary pool should be available. If my outgoing internet connection on the LAN goes down then that's another story - and, once I get my secondary cisco router online - the internet will automagically connect to my secondary provider.

Miners must be active 24x7!
full member
Activity: 133
Merit: 100
Headless automation is great. Thanks for the write up.

From personal experience, it's way easier to manage pool H/A with Bitcoin Mining Proxy than with Smartcoin or some homegrown scripts that watch load. I have 4 boxes with 3 cards each and I'm using MiningMonitor to notify me of down'd workers -  having to ssh into a box to see Smartcoin information is a waste of my time. I have a separate web/db server running the proxy app so it doesn't get affected by rigs that need to reboot or have work done to them. YMMV.
You have a point, but this is designed to keep the GPUs online, not the connection to the pool alive. If the pool goes down then no matter how many restarts, the pool isn't going to come back up. And MiningMonitor is kinda expensive. Isn't it more profitable to let the worker stay down until you get home than to pay MiningMonitor's fees?
full member
Activity: 133
Merit: 100
the restart script doesn't seem to be working for me... followed this exactly

is it working for others?
Try this:
Exit out one of the miners.
Then run sh restart.sh
See what happens.
full member
Activity: 133
Merit: 100
can the cron job work?


When I set up the cron job and added some debug output:

echo "miner1->$pc $ld" >>  /home/user/cron_job.log

$ld is always blank. It seems "aticonfig --odgc --adapter=1" can't be run from a cron job?

Hmm, I have no idea, I haven't tried. Maybe you should fix the $ld problem first?
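
One way to narrow down the blank $ld: cron runs jobs with a minimal environment and no X display, and the backticks only capture stdout, so whatever aticonfig complains about on stderr never reaches cron_job.log. A minimal sketch of a wrapper that appends a command's stderr and exit status to the log, assuming bash. run_logged is a hypothetical name, not part of the guide:

```shell
#!/bin/bash
# run_logged LOGFILE COMMAND [ARGS...]
# Hypothetical helper: passes the command's stdout through unchanged,
# appends its stderr to LOGFILE, and notes any non-zero exit status there too.
run_logged() {
    local log="$1"; shift
    "$@" 2>>"$log"
    local status=$?
    if [ "$status" -ne 0 ]; then
        echo "$(date +"%D %r") '$1' exited with status $status" >>"$log"
    fi
    return "$status"
}

# Possible use in the cron script (same aticonfig call as the guide):
#   ld=`run_logged /home/user/cron_job.log aticonfig --odgc --adapter=0 | grep "GPU load" | cut -c 30-35 | cut -d % -f 1`
```

If the log then shows the "must be run as root when no X server is active" message, the problem is the missing display or user in the cron environment rather than the parsing.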
full member
Activity: 133
Merit: 100
Quote
10. Now run AMDOverdriveCtrl -i 0. This opens up AMDOverdriveCtrl GUI for card 0. Overclock your card and setup your fan profiles. Export it as a file called gpu0.ovdr. Repeat for each card you have and change index numbers accordingly (ie AMDOverdriveCtrl -i 3)

11. Make the file /home/user/start.sh and paste in the following:

Code:
#!/bin/bash
sleep 20
AMDOverdriveCtrl -i 0 -b gpui0.ovdr
AMDOverdriveCtrl -i 3 -b gpui3.ovdr
lxterminal --title miner1 --command "sh /home/user/miner1.sh"
lxterminal --title miner2 --command "sh /home/user/miner2.sh"

This will make the computer wait 20 seconds to loadup/connect to the network/etc. Then it will load the overclock and fan profiles for each GPU and then start your miners.

Is the above correct?

No they need to match
Yeah, they need to match. I was copying/pasting my own script and for some reason I put an extra i in there. Fixed in the post. Thanks for pointing that out.
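
A mismatch like gpu0.ovdr versus gpui0.ovdr is easy to catch before the profiles are loaded. A minimal sketch, assuming bash and the gpuN.ovdr naming from step 10. missing_profiles is a hypothetical helper, and the directory and index list are whatever your rig uses:

```shell
#!/bin/bash
# missing_profiles DIR INDEX...
# Hypothetical helper: prints the name of every expected gpuN.ovdr profile
# that does not exist in DIR, one per line. Prints nothing if all are present.
missing_profiles() {
    local dir="$1"; shift
    local i
    for i in "$@"; do
        [ -f "$dir/gpu$i.ovdr" ] || echo "gpu$i.ovdr"
    done
}

# Possible use near the top of start.sh, with the card indexes from the guide:
#   missing_profiles /home/user 0 3
```

Any name it prints is a file that start.sh would ask AMDOverdriveCtrl to load but that was never exported under that name.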
hero member
Activity: 711
Merit: 500
the restart script doesn't seem to be working for me... followed this exactly

is it working for others?
member
Activity: 98
Merit: 11
Headless automation is great. Thanks for the write up.

From personal experience, it's way easier to manage pool H/A with Bitcoin Mining Proxy than with Smartcoin or some homegrown scripts that watch load. I have 4 boxes with 3 cards each and I'm using MiningMonitor to notify me of down'd workers -  having to ssh into a box to see Smartcoin information is a waste of my time. I have a separate web/db server running the proxy app so it doesn't get affected by rigs that need to reboot or have work done to them. YMMV.
hero member
Activity: 711
Merit: 500
Quote
10. Now run AMDOverdriveCtrl -i 0. This opens up AMDOverdriveCtrl GUI for card 0. Overclock your card and setup your fan profiles. Export it as a file called gpu0.ovdr. Repeat for each card you have and change index numbers accordingly (ie AMDOverdriveCtrl -i 3)

11. Make the file /home/user/start.sh and paste in the following:

Code:
#!/bin/bash
sleep 20
AMDOverdriveCtrl -i 0 -b gpui0.ovdr
AMDOverdriveCtrl -i 3 -b gpui3.ovdr
lxterminal --title miner1 --command "sh /home/user/miner1.sh"
lxterminal --title miner2 --command "sh /home/user/miner2.sh"

This will make the computer wait 20 seconds to loadup/connect to the network/etc. Then it will load the overclock and fan profiles for each GPU and then start your miners.

Is the above correct?

No they need to match
hero member
Activity: 504
Merit: 500
Quote
10. Now run AMDOverdriveCtrl -i 0. This opens up AMDOverdriveCtrl GUI for card 0. Overclock your card and setup your fan profiles. Export it as a file called gpu0.ovdr. Repeat for each card you have and change index numbers accordingly (ie AMDOverdriveCtrl -i 3)

11. Make the file /home/user/start.sh and paste in the following:

Code:
#!/bin/bash
sleep 20
AMDOverdriveCtrl -i 0 -b gpui0.ovdr
AMDOverdriveCtrl -i 3 -b gpui3.ovdr
lxterminal --title miner1 --command "sh /home/user/miner1.sh"
lxterminal --title miner2 --command "sh /home/user/miner2.sh"

This will make the computer wait 20 seconds to loadup/connect to the network/etc. Then it will load the overclock and fan profiles for each GPU and then start your miners.

Is the above correct?
sr. member
Activity: 322
Merit: 250
can the cron job work?


When I set up the cron job and added some debug output:

echo "miner1->$pc $ld" >>  /home/user/cron_job.log

$ld is always blank. It seems "aticonfig --odgc --adapter=1" can't be run from a cron job?
full member
Activity: 133
Merit: 100
Oh, don't worry about me -  I personally don't overclock to the point of instability. But, as a software author, a majority of my support questions and problems revolve around users that overclock too aggressively, which is why I know quite a bit about the problem.
My only point was, that the software watchdog really doesn't do much, as it A) doesn't help a system made unstable from aggressive overclocking, and B) I know of no reasons other than what I posted that would cause a GPU to go below 50% load. (I.e. restarting a miner that is purposely throttled will have no positive outcome, restarting a miner that went low load because of overtemp will have no positive outcome, and restarting a miner pointed to a ddos'd pool will have no positive outcome)
Perhaps there are more reasons to have a software watchdog than I'm thinking of?
Well, it's not designed to solve problem A, so that's moot. As for problem B: it handles "Miner is idle" messages and miner crashes (i.e. the command window disappears or the process dies).
full member
Activity: 238
Merit: 100
Oh, don't worry about me -  I personally don't overclock to the point of instability. But, as a software author, a majority of my support questions and problems revolve around users that overclock too aggressively, which is why I know quite a bit about the problem.
My only point was, that the software watchdog really doesn't do much, as it A) doesn't help a system made unstable from aggressive overclocking, and B) I know of no reasons other than what I posted that would cause a GPU to go below 50% load. (I.e. restarting a miner that is purposely throttled will have no positive outcome, restarting a miner that went low load because of overtemp will have no positive outcome, and restarting a miner pointed to a ddos'd pool will have no positive outcome)
Perhaps there are more reasons to have a software watchdog than I'm thinking of?