Sebastian Nohn

Monitoring your ALIX I2C sensors with Nagios

Most ALIX system boards come with onboard I2C temperature sensors, and the Nagios Plugins package includes a check_sensors command.

However, the default lm-sensors configuration that ships with most distributions doesn't know about the ALIX sensors, and the default Nagios plugin doesn't report performance data.

Once lm-sensors is installed and you run sensors-detect, the sensors command will output something like this:

root@bnalrr01:~# sensors
lm86-i2c-0-4c
Adapter: CS5536 ACB0
temp1:       +30.0 C  (low  =  +0.0 C, high = +70.0 C)  
                      (crit = +85.0 C, hyst = +75.0 C)  
temp2:       +36.9 C  (low  =  +0.0 C, high = +70.0 C)  
                      (crit = +85.0 C, hyst = +75.0 C)

And the check_sensors probe would output something like this:

root@bnalrr01:~# ./check_sensors 
sensor ok
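
Besides the text it prints, a Nagios probe reports its result through its exit code. As a rough sketch of that mapping (state_name is a hypothetical helper, not part of the Nagios Plugins package), the codes defined by the plugin guidelines translate like this:

```shell
#!/bin/sh
# Hypothetical helper: translate a plugin exit code into the state name
# that the Nagios plugin guidelines assign to it.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        # Anything outside 0..3 is not defined by the guidelines.
        *) echo "INVALID" ;;
    esac
}
```

Running the probe and then calling `state_name $?` shows which state Nagios would record for that run.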

To make the sensors output more verbose, add this to your /etc/sensors3.conf:

chip "lm90-*" "adm1032-*" "lm86-*" "max6657-*" "adt7461-*"
   label temp1 "M/B Temp"
   label temp2 "CPU Temp"
   label tcrit1 "M/B Crit"
   label tcrit2 "CPU Crit"

Now the sensors command labels each reading:

root@bnalrr01:~# sensors
lm86-i2c-0-4c
Adapter: CS5536 ACB0
M/B Temp:    +30.0 C  (low  =  +0.0 C, high = +70.0 C)  
                      (crit = +85.0 C, hyst = +75.0 C)  
CPU Temp:    +36.6 C  (low  =  +0.0 C, high = +70.0 C)  
                      (crit = +85.0 C, hyst = +75.0 C)
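
With the labels in place, a reading can be picked out by name rather than by column position. This is only a sketch: extract_temp is a hypothetical helper, and it assumes the "Label:  +NN.N" layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: read sensors output on stdin and print the integer
# part of the temperature on the line whose label equals $1.
extract_temp() {
    awk -v label="$1" -F: '
        $1 == label {
            v = $2
            sub(/^[ \t]*\+?/, "", v)   # drop leading blanks and a "+" sign
            print int(v)               # keep only the integer part
            exit
        }'
}
```

For example, `sensors -A | extract_temp "CPU Temp"` would print the CPU temperature as a whole number, and it keeps working if the label width or the number of digits changes.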

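To add performance data to the check_sensors probe, replace its content with the following script (or save it under a new name such as check_temp_sensors, as in the example run further down):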

#! /bin/sh

PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin

PROGNAME=`basename $0`
PROGPATH=`echo $0 | sed -e 's,[\\/][^\\/][^\\/]*$,,'`
REVISION="1.4.15"

. $PROGPATH/utils.sh


print_usage() {
        echo "Usage: $PROGNAME"
}

print_help() {
        print_revision $PROGNAME $REVISION
        echo ""
        print_usage
        echo ""
        echo "This plugin checks hardware status using the lm_sensors package."
        echo ""
        support
        exit 0
}

case "$1" in
        --help)
                print_help
                exit 0
                ;;
        -h)
                print_help
                exit 0
                ;;
        --version)
                print_revision $PROGNAME $REVISION
                exit 0
                ;;
        -V)
                print_revision $PROGNAME $REVISION
                exit 0
                ;;
        *)
                sensordata=`sensors 2>&1`
                # Save the exit status of sensors before another command overwrites it
                status=$?
                CPUHEAT=`sensors -A | grep CPU | grep Temp | cut -c 15,16`
                MOBHEAT=`sensors -A | grep M/B | grep Temp | cut -c 15,16`
                # Perfdata items are separated by spaces; semicolons are reserved for thresholds
                PERFDATA="cpu_temp=$CPUHEAT mob_heat=$MOBHEAT"
                if test "$1" = "-v" -o "$1" = "--verbose"; then
                        echo ${sensordata}
                fi
                if test ${status} -eq 127; then
                        echo "SENSORS UNKNOWN - command not found (did you install lmsensors?)"
                        exit 3
                elif test ${status} -ne 0 ; then
                        echo "WARNING - sensors returned state $status |$PERFDATA"
                        exit 1
                fi
                if echo ${sensordata} | egrep ALARM > /dev/null; then
                        echo "SENSOR CRITICAL - Sensor alarm detected! |$PERFDATA"
                        exit 2
                else
                        echo "sensor ok |$PERFDATA"
                        exit 0
                fi
                ;;
esac

The probe now prints performance data, which pnp4nagios can graph:

root@bnalrr01:~# ./check_temp_sensors 
sensor ok |cpu_temp=36 mob_heat=30
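
The Nagios plugin guidelines describe performance data as a space-separated list of items of the form label=value[UOM];[warn];[crit];[min];[max]. If you also want the limits to show up in the graphs, something along these lines might work. It is a sketch only: format_perfdata is a hypothetical helper, and the 70 and 85 in the usage example are simply the high and crit values from the sensors output above.

```shell
#!/bin/sh
# Hypothetical helper: build a perfdata string with warn/crit thresholds.
# $1 = CPU temperature, $2 = board temperature, $3 = warn, $4 = crit
format_perfdata() {
    # Items are separated by a space; the fields inside an item by semicolons.
    printf 'cpu_temp=%s;%s;%s mob_heat=%s;%s;%s\n' \
        "$1" "$3" "$4" "$2" "$3" "$4"
}
```

In the probe this could replace the PERFDATA assignment, for example `PERFDATA=$(format_perfdata "$CPUHEAT" "$MOBHEAT" 70 85)`.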

Posted Mar 05, 2011 by Sebastian Nohn
Tagged as: ALIX, i2c, lm-sensors, Nagios