Friday, August 29, 2014

Monitor your SSL certificates through Nagios with shell scripts

Recently I needed to monitor the expiry dates of a few SSL certificates through Nagios. The easiest method was to define a service with check_http, using -S to enable SSL and -C to set the minimum number of days the certificate must still be valid before a warning is raised.

define service{
        use             generic-service         ; Inherit default values from a template
        host_name               XXXweb
        service_description     XXX.lk Certificate Expire
        check_command   check_http! -S -H XX.XX.XX.XX -C 90
        }

That worked for some hosts, but on others check_http ran into SSL issues.
For those hosts I could still get the certificate details with the openssl command:
echo | openssl s_client -connect XXX.lk:443 2>/dev/null | openssl x509 -noout -dates
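
To show how the dates printed by that command can become a number Nagios can act on, here is a minimal sketch. It assumes GNU date (for the -d option); the function name days_until is made up for illustration and is not part of openssl or Nagios.

```shell
# Sketch: convert a certificate end date into whole days remaining.
# Assumes GNU date; days_until is a hypothetical helper name.
#
# Usage: days_until END_DATE [NOW_EPOCH]
days_until() {
  end_epoch=$(date -d "$1" +%s) || return 3   # date could not parse the input
  now_epoch=${2:-$(date +%s)}                 # default to the current time
  echo $(( (end_epoch - now_epoch) / 86400 ))
}

# Example (needs network access; XXX.lk is the placeholder host from the post):
#   end=$(echo | openssl s_client -connect XXX.lk:443 2>/dev/null |
#         openssl x509 -noout -enddate | cut -d= -f2)
#   days_until "$end"
```

The optional second argument only exists so the calculation can be checked against a fixed point in time.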

So my option was to write a shell script that returns the appropriate exit codes to Nagios, and to configure Nagios to run it periodically.
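
Nagios reads a plugin's exit code as the service state: 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN. The sketch below shows one way to map days remaining onto those codes. The function name cert_status and the separate critical threshold are assumptions for illustration; the script later in this post only uses a single warning threshold.

```shell
# Sketch: map days remaining onto Nagios plugin exit codes.
#   0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN
# cert_status is a hypothetical helper, not part of Nagios.
#
# Usage: cert_status DAYS_LEFT WARN_DAYS CRIT_DAYS
cert_status() {
  case $1 in
    ''|*[!0-9-]*)
      # Empty or non-numeric input: the check itself failed
      echo "UNKNOWN - could not determine expiry"
      return 3 ;;
  esac
  if [ "$1" -le "$3" ]; then
    echo "CRITICAL - certificate expires in $1 days"
    return 2
  elif [ "$1" -le "$2" ]; then
    echo "WARNING - certificate expires in $1 days"
    return 1
  fi
  echo "OK - certificate expires in $1 days"
  return 0
}
```

Nagios shows the first line of output as the status text, so it helps to print exactly one descriptive line before returning.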
Step 1

I created a separate command definition in /etc/nagios/objects/commands.cfg:
define command{
        command_name    check_cus_command
        command_line    $ARG1$
        }

So I only need to call check_cus_command with my full command as its argument.
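
To make the $ARG1$ substitution concrete: in a check_command, the text after the "!" separator is what Nagios passes as $ARG1$. The snippet below only imitates that split in shell for illustration; it is not Nagios code, and first_arg is a made-up name.

```shell
# Illustration only, not Nagios code: imitates how "name!arg" in a
# check_command is split so that the text after "!" becomes $ARG1$.
first_arg() (
  rest=${1#*!}                  # drop the command name and the first "!"
  printf '%s\n' "${rest%%!*}"   # keep the text up to the next "!", if any
)

first_arg 'check_cus_command!sh /usr/lib64/nagios/plugins/expscript.sh XXX.YYY.lk:443'
```

Since "!" separates arguments, a command line passed this way would be cut at any further "!" it contains.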

Step 2

Create a custom script in /usr/lib64/nagios/plugins/ with permissions 755. I called mine expscript.sh.
Thanks to http://superuser.com/questions/618370/check-expiry-date-of-ssl-certificate-for-multiple-remote-servers, I found a script and only had to make some minor modifications to support Nagios exit codes.

#!/bin/bash

DEBUG=false
#DEBUG=true
warning_days=90 # Number of days to warn about soon-to-expire certs
#certs_to_check='XXX.YYY.lk:443'
certs_to_check=$1

for CERT in $certs_to_check
do
  $DEBUG && echo "Checking cert: [$CERT]"

  output=$(echo | openssl s_client -connect ${CERT} 2>/dev/null |\
  sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' |\
  openssl x509 -noout -subject -dates 2>/dev/null)
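  if [ "$?" -ne 0 ]; then
    # Report the failure to Nagios as UNKNOWN (3); a bare "continue" would
    # let the script end with exit code 0, which Nagios reads as OK.
    echo "UNKNOWN - could not retrieve certificate from [$CERT]"
    logger -p local6.warn "Error connecting to host for cert [$CERT]"
    exit 3
  fi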

  start_date=$(echo $output | sed 's/.*notBefore=\(.*\).*not.*/\1/g')
  end_date=$(echo $output | sed 's/.*notAfter=\(.*\)$/\1/g')

  start_epoch=$(date +%s -d "$start_date")
  end_epoch=$(date +%s -d "$end_date")

  epoch_now=$(date +%s)

  if [ "$start_epoch" -gt "$epoch_now" ]; then
    $DEBUG && echo "Certificate for [$CERT] is not yet valid"
    logger -p local6.warn "Certificate for $CERT is not yet valid"
  fi

  seconds_to_expire=$(($end_epoch - $epoch_now))
  days_to_expire=$(($seconds_to_expire / 86400))
  #$DEBUG && echo "Days to expiry: ($days_to_expire)"
  echo "Days to expiry: ($days_to_expire) - $end_date"

  warning_seconds=$((86400 * $warning_days))

  if [ "$seconds_to_expire" -lt "$warning_seconds" ]; then
    #$DEBUG && echo "Cert [$CERT] is soon to expire ($seconds_to_expire seconds)"
    echo "Cert [$CERT] is soon to expire ($seconds_to_expire seconds)"
    logger -p local6.warn "cert [$CERT] is soon to expire ($seconds_to_expire seconds)"
    exit 1

  else
    # At or above the warning threshold (covers the exactly-equal case too): OK
    exit 0
  fi
done
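
The two sed expressions in the script rely on the unquoted echo collapsing the openssl output onto one line. As an alternative, a line-oriented extraction could look like the sketch below. The helper name cert_field is an assumption, and the sample text is made up in the shape openssl prints; it does not come from a real certificate.

```shell
# Sketch: pull one field out of `openssl x509 -noout -subject -dates` output.
# cert_field is a hypothetical helper; it reads the output on stdin and
# prints the value of the line that starts with NAME= (notBefore, notAfter).
cert_field() {
  sed -n "s/^$1=//p"
}

# Made-up sample in the shape openssl prints; not from a real certificate.
sample='subject=CN = example.invalid
notBefore=Aug 29 00:00:00 2014 GMT
notAfter=Aug 29 00:00:00 2015 GMT'

printf '%s\n' "$sample" | cert_field notAfter
```

With this approach the output variable would need to be quoted when piped, so that the line breaks survive.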


Step 3

My service definition for the relevant host:

define service{
        use             generic-service         ; Inherit default values from a template
        host_name               XXX.YYYWeb
        service_description     XXX.YYY Cert
        check_command   check_cus_command!sh /usr/lib64/nagios/plugins/expscript.sh XXX.YYY.lk:443
        servicegroups   URL_Checks
        }

That’s it.