Topic: (Linux) Automated Nvidia GPU fancontrol

March 30, 2018, 04:14:36 AM
Maybe this is useful to someone.
I've written a small bash script to control the fan speed of the GPUs in my rigs.

You configure a maximum temperature (MAX_TEMP). When the hottest GPU reaches or exceeds it, the script raises the fan speed by STEP percentage points (STEP is configurable as well).
You can also configure a minimum fan speed (MIN_SPEED). The script lowers the speed when it can, but never below this minimum. The speed is only lowered if the current temperature is below the temp zone, which always starts at MAX_TEMP - 5 degrees.
For example, with MAX_TEMP=70, MIN_SPEED=40 and STEP=5, the fan speed is lowered by 5 percentage points only if the current temperature is below the temp zone, which in this case starts at 70-5=65. So if the highest temperature of any GPU in the rig is 64, the fan speed is lowered by 5.
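
The rule above can be tried out without touching any hardware. This is only a sketch: next_speed is a hypothetical helper that is not part of my script, and the cap at 100 is an assumption added for the illustration.

```shell
#!/bin/bash
# Sketch of the decision rule only: prints the fan speed that would be requested.
# next_speed is a hypothetical helper, not part of the original script.
MAX_TEMP=70
STEP=5
MIN_SPEED=40

next_speed() {
    temp=$1     # highest GPU temperature in the rig
    speed=$2    # current fan speed in percent
    zone=$((MAX_TEMP - 5))
    if [ "$temp" -ge "$MAX_TEMP" ]; then
        new=$((speed + STEP))
        # Assumption: never ask for more than 100 percent
        if [ "$new" -gt 100 ]; then
            new=100
        fi
    elif [ "$temp" -ge "$zone" ]; then
        # Inside the temp zone: leave the speed alone
        new=$speed
    else
        new=$((speed - STEP))
        # Never go below the configured minimum
        if [ "$new" -lt "$MIN_SPEED" ]; then
            new=$speed
        fi
    fi
    echo "$new"
}

next_speed 70 60   # max temp reached
next_speed 69 60   # inside the temp zone
next_speed 64 70   # below the temp zone
next_speed 64 40   # below the temp zone, already at minimum
```

The four calls print 65, 60, 65 and 40.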

You may also have to adjust the exec paths for
  • NVIDIA_EXEC
  • ECHO
  • HEAD
  • AWK
  • SORT
  • DATE
  • PRINTF
  • XINIT_CMD
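
A quick way to see which of these paths need adjusting on your system is a small check like the one below. It is a sketch: check_paths is a hypothetical helper, and the paths passed to it are simply the defaults from my script.

```shell
#!/bin/bash
# Hypothetical helper: prints every given path that is not an executable file.
check_paths() {
    missing=0
    for p in "$@"; do
        if [ ! -x "$p" ]; then
            echo "missing: $p"
            missing=1
        fi
    done
    return "$missing"
}

# The default paths from the script; adjust any that are reported as missing.
check_paths /usr/bin/nvidia-smi /usr/bin/echo /usr/bin/head /usr/bin/awk \
    /usr/bin/sort /usr/bin/date /usr/bin/printf /usr/bin/xinit \
    /usr/bin/nvidia-settings || echo "Adjust the paths listed above."
```

On most systems `command -v head` (and so on) shows where a tool actually lives.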

The script always uses the highest temperature of all cards in the rig as the current temperature. For example, if the rig this script runs on has 6 GPUs with the following temperatures
Code:
65
64
62
58
66
71
the logic is based on the highest temperature, 71, and every speed adjustment is applied equally to all GPUs.
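
Picking the highest value can be sketched with the six sample temperatures from above fed in by hand instead of coming from nvidia-smi. The -n flag of sort matters once a reading has three digits, because a plain reverse sort compares the values as text.

```shell
#!/bin/bash
# The six sample temperatures, one per line, as a stand-in for the nvidia-smi output.
temps="65
64
62
58
66
71"

# Numeric reverse sort, then take the first line.
highest=$(printf '%s\n' "$temps" | sort -rn | head -n 1)
echo "$highest"    # prints 71

# Why -n matters: compared as text, "99" sorts above "100".
text_top=$(printf '%s\n' 100 99 | sort -r | head -n 1)
num_top=$(printf '%s\n' 100 99 | sort -rn | head -n 1)
echo "$text_top $num_top"    # prints 99 100
```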

Code:
#!/bin/bash

#Set the temperature which should not be reached. If at least one of the GPUs reaches this temp the fan speed will be increased
MAX_TEMP=70
#Define fan speed increase step in percent
STEP=5
#Define minimum fan speed in %
MIN_SPEED=40
#Logfile to write to
LOGFILE=[YOUR_LOGFILE]

CURRENT_FAN_SPEED=0

NVIDIA_EXEC=/usr/bin/nvidia-smi
NVIDIA_OPTS_QUERY_TEMP="--query-gpu=temperature.gpu --format=csv,noheader"
NVIDIA_OPTS_QUERY_FANSPEED="--query-gpu=fan.speed --format=csv,noheader"

ECHO=/usr/bin/echo
HEAD=/usr/bin/head
AWK=/usr/bin/awk
DATE=/usr/bin/date
SORT=/usr/bin/sort
#PRINTF is used below to fill the %s in XINIT_CMD, so it has to be defined
PRINTF=/usr/bin/printf

XINIT_CMD="/usr/bin/xinit /usr/bin/nvidia-settings -a GPUFanControlState=1 -a GPUTargetFanSpeed=%s"

function log() {
        #Use the configured ECHO path and quote LOGFILE so paths with spaces work
        $ECHO "$($DATE)  -- $1" >> "$LOGFILE"
}

function queryCurrentFanSpeed() {
        CURRENT_FAN_SPEED=`$NVIDIA_EXEC $NVIDIA_OPTS_QUERY_FANSPEED | $HEAD -1 | $AWK '{print $1}'`
}

log "==== START ===="
queryCurrentFanSpeed
log "Current fan speed: $CURRENT_FAN_SPEED %"

#Numeric sort (-n) so that a three-digit reading like 100 ranks above 99
HIGHEST_TEMP="`$NVIDIA_EXEC $NVIDIA_OPTS_QUERY_TEMP | $SORT -rn | $HEAD -1`"
TEMP_ZONE=$((MAX_TEMP-5))

log "Highest current temperature: $HIGHEST_TEMP"
if [ "$HIGHEST_TEMP" -ge "$MAX_TEMP" ]; then
        #Only give up at 100% when the speed would have to go up; lowering from 100% must stay possible
        if [ "$CURRENT_FAN_SPEED" -ge "100" ]; then
                log "Not able to increase fan speed because max is reached. Exiting!"
                exit 0
        fi
        log "One GPU reached max temp ($MAX_TEMP). Increasing fan speed with $STEP% for all GPUs"
        NEW_FAN_SPEED=$((CURRENT_FAN_SPEED+STEP))
        #Never request more than 100%
        if [ "$NEW_FAN_SPEED" -gt "100" ]; then
                NEW_FAN_SPEED=100
        fi
        XINIT_CMD=$($PRINTF "$XINIT_CMD" $NEW_FAN_SPEED)
        log "Executing: $XINIT_CMD"
        NULL=`$XINIT_CMD 2>&1`
else
        log "GPU max temp not reached"
        if [ "$HIGHEST_TEMP" -ge "$TEMP_ZONE" ]; then
                log "Not lowering fan speed because current temp ($HIGHEST_TEMP) is in configured temp zone ($TEMP_ZONE <= $HIGHEST_TEMP < $MAX_TEMP)"
        else
                log "Checking if lowering speed will fall under minimum speed ($MIN_SPEED%)"
                NEW_SPEED=$((CURRENT_FAN_SPEED-STEP))
                if [ "$NEW_SPEED" -ge "$MIN_SPEED" ]; then
                        log "New speed will be $NEW_SPEED%. Lowering!"
                        XINIT_CMD=$($PRINTF "$XINIT_CMD" $NEW_SPEED)
                        log "Executing: $XINIT_CMD"
                        NULL=`$XINIT_CMD 2>&1`
                else
                        log "Not lowering fan speed"
                fi
        fi
fi
log "==== END ===="

Don't forget to configure LOGFILE.
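
To check the log format before pointing LOGFILE at a real file, the logging can be tried against a throwaway file. This is a sketch that assumes mktemp is available and uses the shell's own echo and date instead of the configured paths.

```shell
#!/bin/bash
# Throwaway logfile instead of a real LOGFILE (assumption: mktemp is available).
LOGFILE=$(mktemp)

# Same line format as the script: timestamp, two spaces, "--", then the message.
log() {
    echo "$(date)  -- $1" >> "$LOGFILE"
}

log "==== START ===="
log "Current fan speed: 60 %"
cat "$LOGFILE"
```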

Sample output in logfile:

Code:
Fri Mar 30 07:45:01 CEST 2018  -- ==== START ====
Fri Mar 30 07:45:01 CEST 2018  -- Current fan speed: 60 %
Fri Mar 30 07:45:01 CEST 2018  -- Highest current temperature: 69
Fri Mar 30 07:45:01 CEST 2018  -- GPU max temp not reached
Fri Mar 30 07:45:01 CEST 2018  -- Not lowering fan speed because current temp (69) is in configured temp zone (65 <= 69 < 70)
Fri Mar 30 07:45:01 CEST 2018  -- ==== END ====
Fri Mar 30 07:50:01 CEST 2018  -- ==== START ====
Fri Mar 30 07:50:01 CEST 2018  -- Current fan speed: 60 %
Fri Mar 30 07:50:01 CEST 2018  -- Highest current temperature: 70
Fri Mar 30 07:50:01 CEST 2018  -- One GPU reached max temp (70). Increasing fan speed with 5% for all GPUs
Fri Mar 30 07:50:01 CEST 2018  -- Executing: /usr/bin/xinit /usr/bin/nvidia-settings -a GPUFanControlState=1 -a GPUTargetFanSpeed=65
Fri Mar 30 07:50:07 CEST 2018  -- ==== END ====
Fri Mar 30 07:55:01 CEST 2018  -- ==== START ====
Fri Mar 30 07:55:01 CEST 2018  -- Current fan speed: 65 %
Fri Mar 30 07:55:01 CEST 2018  -- Highest current temperature: 68
Fri Mar 30 07:55:01 CEST 2018  -- GPU max temp not reached
Fri Mar 30 07:55:01 CEST 2018  -- Not lowering fan speed because current temp (68) is in configured temp zone (65 <= 68 < 70)
Fri Mar 30 07:55:01 CEST 2018  -- ==== END ====
.
.
.
Fri Mar 30 11:26:01 CEST 2018  -- ==== START ====
Fri Mar 30 11:26:01 CEST 2018  -- Current fan speed: 70 %
Fri Mar 30 11:26:01 CEST 2018  -- Highest current temperature: 64
Fri Mar 30 11:26:01 CEST 2018  -- GPU max temp not reached
Fri Mar 30 11:26:01 CEST 2018  -- Checking if lowering speed will fall under minimum speed (40%)
Fri Mar 30 11:26:01 CEST 2018  -- New speed will be 65%. Lowering!
Fri Mar 30 11:26:01 CEST 2018  -- Executing: /usr/bin/xinit /usr/bin/nvidia-settings -a GPUFanControlState=1 -a GPUTargetFanSpeed=65
Fri Mar 30 11:26:06 CEST 2018  -- ==== END ====
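
The "Executing:" lines are the result of filling the %s placeholder in XINIT_CMD with the new speed. A minimal sketch of just that step, which only prints the command and does not run xinit (CMD is a name used here for illustration):

```shell
#!/bin/bash
# Command template from the script; %s is replaced by the target fan speed.
XINIT_CMD="/usr/bin/xinit /usr/bin/nvidia-settings -a GPUFanControlState=1 -a GPUTargetFanSpeed=%s"

# Fill in a target speed of 65 and show the resulting command line.
CMD=$(printf "$XINIT_CMD" 65)
echo "$CMD"
```

This prints the same command that appears in the 11:26 log entry above.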

I am running this script using cron:
Code:
*/5 * * * *     [YOUR_PATH]/gpuTempControl.bash