
Topic: Monitor GPU temp/load/fans and hashrate with munin (Read 4345 times)

sr. member
Activity: 252
Merit: 250
Sorry for not replying; somehow the notification did not work, or I simply did not see it.

But I think xen82 has the solution here: your miner scripts start under a certain user who logs on to X, and apparently that same user also has to execute the plugin scripts in your environment.

I wonder why it works without modification on my system, though.
newbie
Activity: 22
Merit: 0
Make sure that the munin plugins gpu_* (/etc/munin/plugin-conf.d/munin-node) are run as the user logged into X. That should help.
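
For anyone unsure what that looks like: a rough sketch that prints a plugin configuration stanza for all gpu_* plugins. The helper name gpu_plugin_conf and the account name "miner" are placeholders I made up; use whatever account is actually logged in to X on your machine.

```shell
#!/bin/sh
# Sketch only: "miner" below is a placeholder for the account logged in to X.

# Print a plugin configuration stanza that makes all gpu_* plugins run as user $1.
gpu_plugin_conf() {
    printf '[gpu_*]\nuser %s\n' "$1"
}

gpu_plugin_conf miner
```

Put the printed stanza into /etc/munin/plugin-conf.d/munin-node (or a separate file in that directory) as root, then restart munin-node.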
newbie
Activity: 12
Merit: 0
I tried this:

Quote
DISPLAY=:0 sudo aticonfig --odgt --adapter=0 | awk '/Sensor/ {print $5}'

and added munin to group sudo. Now munin-run gpu_temperature says
Code:
gpu0.value [sudo] password for nobody: 

which is nice. Not about to add nobody to sudo.
Instead I looked up the munin FAQ, which has an answer for running a job as a specific user. So here's what I did: I took out the sudo above and added a file gpu_temperature.auth to /etc/munin/plugin-conf.d/ containing

Code:
[gpu_temperature]
user root

Now munin-run gpu_temperature does this:
Code:
gpu0.value No protocol specified

Google informs me "No protocol specified" probably means I didn't forward my X session when ssh-ing in, but the message is the same whether or not I use -X.
Crs
member
Activity: 107
Merit: 10
I added munin to the admin group, modified the plugins with the sudo command, and restarted munin, but it's not working.
sr. member
Activity: 252
Merit: 250
That is strange, it really seems to require root access. Could you add sudo in front of the aticonfig commands and try again?
The commands then would read similar to this:

Quote
DISPLAY=:0 sudo aticonfig --odgt --adapter=0 | awk '/Sensor/ {print $5}'

You might also need to add munin to the list of allowed sudoers.

If this fixes the issue I'll investigate further; munin can be configured to execute certain plugins under other user accounts.
newbie
Activity: 12
Merit: 0
Hi, I'm having the same issue (empty graphs), also running Ubuntu (11.04), here is the output from sudo munin-run gpu_temperature:

Code:
gpu0.value No protocol specified
aticonfig: This program must be run as root when no X server is active

I'm pretty sure there is an X server running...
sr. member
Activity: 252
Merit: 250
Oh I am sorry. The correct command for testing the plugins is "munin-run " (note the dash). Also you might want to execute the test command with root permissions:
Code:
sudo munin-run gpu_temperature

This then executes the munin plugin as the munin user, and we should be able to see what's wrong.

Also, the munin user should not need admin permissions. The values can be queried from aticonfig as a normal user (at least in my environment).
Crs
member
Activity: 107
Merit: 10
nmc@nmc-GA-790FXTA-UD5:/etc/munin/plugins$ munin run if_err_eth0
munin: command not found
nmc@nmc-GA-790FXTA-UD5:/etc/munin/plugins$ sudo munin run if_err_eth0
sudo: munin: command not found
nmc@nmc-GA-790FXTA-UD5:/etc/munin/plugins$ sh munin run if_err_eth0
sh: Can't open munin
nmc@nmc-GA-790FXTA-UD5:/etc/munin/plugins$ sh munin run if_err_eth0

I have Ubuntu


root@nmc-GA-790FXTA-UD5:/etc/munin/plugins# echo -n "gpu0.value "; DISPLAY=:0 aticonfig --odgt --adapter=0 | awk '/Sensor/ {print $5}'
gpu0.value 74.50
root@nmc-GA-790FXTA-UD5:/etc/munin/plugins#

Should I change the munin user and add it to the sudoers file? Or to the root/admin group?
sr. member
Activity: 252
Merit: 250
The directory the plugins are in seems correct.
What happens if you execute "munin run "?
What is the output if you just execute the plugins themselves (change into /etc/munin/plugins and type ./

Please post the outputs.

Also be sure to allow up to 15 minutes for the first values to appear.

Regards
Crs
member
Activity: 107
Merit: 10
Thanks, this is useful.
I've installed munin and it works, but your plugins don't seem to be working.

Code:
root@nmc-GA-790FXTA-UD5:/etc/munin/plugins# ls -lh
total 12K
lrwxrwxrwx 1 root root  28 2011-06-02 00:56 cpu -> /usr/share/munin/plugins/cpu
lrwxrwxrwx 1 root root  33 2011-06-02 00:56 cpuspeed -> /usr/share/munin/plugins/cpuspeed
lrwxrwxrwx 1 root root  27 2011-06-02 00:56 df -> /usr/share/munin/plugins/df
-rwxrwxrwx 1 root root 418 2011-06-02 09:53 gpu_fans
-rwxrwxrwx 1 root root 388 2011-06-02 09:56 gpu_load
-rwxrwxrwx 1 root root 361 2011-06-02 09:55 gpu_temperature
lrwxrwxrwx 1 root root  32 2011-06-02 00:56 if_err_eth0 -> /usr/share/munin/plugins/if_err_
lrwxrwxrwx 1 root root  28 2011-06-02 00:56 if_eth0 -> /usr/share/munin/plugins/if_
lrwxrwxrwx 1 root root  29 2011-06-02 00:56 load -> /usr/share/munin/plugins/load
lrwxrwxrwx 1 root root  31 2011-06-02 00:56 memory -> /usr/share/munin/plugins/memory
lrwxrwxrwx 1 root root  29 2011-06-02 00:56 swap -> /usr/share/munin/plugins/swap
lrwxrwxrwx 1 root root  31 2011-06-02 00:56 uptime -> /usr/share/munin/plugins/uptime
lrwxrwxrwx 1 root root  30 2011-06-02 00:56 users -> /usr/share/munin/plugins/users
root@nmc-GA-790FXTA-UD5:/etc/munin/plugins#



sr. member
Activity: 252
Merit: 250
Hey there,

For my headless Linux rig I've configured munin so that I can watch the system status remotely. In addition to the default installation I've added munin plugins that show the current GPU temperature, load, fan speed and the hashrate.
It's no big deal, but maybe somebody has a use for them.

A working munin installation is required. If you do not have one, there are plenty of tutorials available on the net.
Simply copy these plugins to the /etc/munin/plugins directory, modify them for your needs, make them executable (chmod +x) and restart munin-node. After ~10-15 minutes you should see the graphs appearing.
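
The copy and chmod steps could be scripted roughly like this. It is only a sketch: install_plugins and the example source directory are names I made up, and the restart command in the comment may differ on your distribution.

```shell
#!/bin/sh
# Sketch only: install_plugins and its arguments are hypothetical helpers.

# Copy the named plugin files from directory $1 into directory $2
# and mark each copy executable.
install_plugins() {
    src=$1
    dest=$2
    shift 2
    for name in "$@"; do
        cp "$src/$name" "$dest/$name" || return 1
        chmod +x "$dest/$name" || return 1
    done
}

# Example (run as root, then restart munin-node so it picks up the plugins):
#   install_plugins "$HOME/munin-gpu" /etc/munin/plugins gpu_fans gpu_temperature gpu_load hashrate
#   service munin-node restart
```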

Disclaimer: I am not responsible for any damages caused by these scripts.

gpu_fans
Adjust the number of GPUs (add or delete lines) and change the labels to match your setup.
Code:
#!/bin/sh
case $1 in
   config)
        cat <<'EOM'
graph_title fan speed
graph_vlabel percent
gpu0.label 5830_0
gpu1.label 5830_1
gpu2.label 5830_2
gpu3.label 6870

EOM
        exit 0;;
esac

echo -n "gpu0.value "; DISPLAY=:0.0 aticonfig --pplib-cmd "get fanspeed 0" | awk '/Result/ {print $4}' | cut -d "%" -f1
echo -n "gpu1.value "; DISPLAY=:0.1 aticonfig --pplib-cmd "get fanspeed 0" | awk '/Result/ {print $4}' | cut -d "%" -f1
echo -n "gpu2.value "; DISPLAY=:0.2 aticonfig --pplib-cmd "get fanspeed 0" | awk '/Result/ {print $4}' | cut -d "%" -f1
echo -n "gpu3.value "; DISPLAY=:0.3 aticonfig --pplib-cmd "get fanspeed 0" | awk '/Result/ {print $4}' | cut -d "%" -f1

gpu_temperature
Adjust the number of GPUs (add or delete lines) and change the labels to match your setup.
Code:
#!/bin/sh
case $1 in
   config)
        cat <<'EOM'
graph_title temperature
graph_vlabel celsius
gpu0.label 5830_0
gpu1.label 5830_1
gpu2.label 5830_2
gpu3.label 6870

EOM
        exit 0;;
esac

echo -n "gpu0.value "; DISPLAY=:0 aticonfig --odgt --adapter=0 | awk '/Sensor/ {print $5}'
echo -n "gpu1.value "; DISPLAY=:0 aticonfig --odgt --adapter=1 | awk '/Sensor/ {print $5}'
echo -n "gpu2.value "; DISPLAY=:0 aticonfig --odgt --adapter=2 | awk '/Sensor/ {print $5}'
echo -n "gpu3.value "; DISPLAY=:0 aticonfig --odgt --adapter=3 | awk '/Sensor/ {print $5}'
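
If you would rather not edit one line per GPU, the value part can be written as a loop. This is a minimal sketch, not the plugin as posted above: GPU_COUNT, parse_temperature and print_temperatures are names made up for illustration, and it assumes aticonfig prints the temperature as the fifth field of the "Sensor" line, as the awk expression above implies. The config section stays as it is.

```shell
#!/bin/sh
# Sketch only: GPU_COUNT and both function names are hypothetical.
GPU_COUNT=${GPU_COUNT:-4}

# Extract the temperature from `aticonfig --odgt` output on stdin.
# Assumes a line like "Sensor 0: Temperature - 74.50 C" (field 5 is the value).
parse_temperature() {
    awk '/Sensor/ {print $5}'
}

# Print one "gpuN.value <temp>" line per adapter, as munin expects.
print_temperatures() {
    i=0
    while [ "$i" -lt "$GPU_COUNT" ]; do
        printf 'gpu%s.value %s\n' "$i" \
            "$(DISPLAY=:0 aticonfig --odgt --adapter="$i" | parse_temperature)"
        i=$((i + 1))
    done
}

# Only query the hardware when aticonfig is actually installed.
if command -v aticonfig >/dev/null 2>&1; then
    print_temperatures
fi
```

Setting GPU_COUNT=2 in the environment (or changing the default) would then report gpu0 and gpu1 only.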

gpu_load
Adjust the number of GPUs (add or delete lines) and change the labels to match your setup.
Code:
#!/bin/sh
case $1 in
   config)
        cat <<'EOM'
graph_title current load
graph_vlabel load
gpu0.label 5830_0
gpu1.label 5830_1
gpu2.label 5830_2
gpu3.label 6870

EOM
        exit 0;;
esac

echo -n "gpu0.value "; DISPLAY=:0 aticonfig --odgc --adapter=0 | awk '/load/ {print $4}' | cut -d "%" -f1
echo -n "gpu1.value "; DISPLAY=:0 aticonfig --odgc --adapter=1 | awk '/load/ {print $4}' | cut -d "%" -f1
echo -n "gpu2.value "; DISPLAY=:0 aticonfig --odgc --adapter=2 | awk '/load/ {print $4}' | cut -d "%" -f1
echo -n "gpu3.value "; DISPLAY=:0 aticonfig --odgc --adapter=3 | awk '/load/ {print $4}' | cut -d "%" -f1

hashrate
You have to insert your Deepbit API key.
This works for Deepbit. If you are using another pool you need to modify the URL accessed, and possibly also the commands that cut the raw value out of the JSON.

Code:
#!/bin/sh
case $1 in
   config)
        cat <<'EOM'
graph_title hashrate
graph_vlabel hashrate
rate.label hashrate
EOM
        exit 0;;
esac

# Insert your Deepbit API key into the URL below before using this plugin.
echo -n "rate.value "; curl -s http://deepbit.net/api/ | awk -F, '{print $2}' | cut -d":" -f2
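
For other pools, picking the value by key name instead of by position may be easier to adapt. A minimal sketch, assuming the response is a single-line JSON object with a numeric "hashrate" key; the function name extract_hashrate and the sample response are made up for illustration, not real pool output.

```shell
#!/bin/sh
# Sketch only: the JSON layout and the "hashrate" key are assumptions.

# Read a one-line JSON object on stdin and print the numeric value of "hashrate".
extract_hashrate() {
    sed -n 's/.*"hashrate":[[:space:]]*\([0-9.][0-9.]*\).*/\1/p'
}

# Made-up sample response, for illustration only.
printf '%s\n' '{"confirmed_reward":0.5,"hashrate":1234.56,"ipa":true}' | extract_hashrate
```

In the plugin, the curl output would be piped into extract_hashrate in place of the awk and cut commands. If the key is missing, nothing is printed.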

If everything works you will get graphs like this:








If you have any questions or something isn't working, just ask!