Integrating with Nagios XI

You can monitor Terracotta nodes using the Nagios XI monitoring solution (see http://www.nagios.com/). To do so, create a Nagios plugin. The plugin can query the Terracotta Management Server (TMS) for information through the TMS REST interface, or query a node directly through that node's REST interface.

Plugins can be written in a variety of languages and should follow the published Nagios plugin development guidelines.

This document provides an example using a shell script.

Monitoring the NODE_LEFT Event

When a node leaves a Terracotta cluster, it generates a node.left event. The following shell script can report this type of event in Nagios XI.

#!/bin/bash

# Parameters
# ----------
SERVER=$1    # The IP address or resolvable hostname of a Terracotta server.
PORT=$2      # The Terracotta server's tsa-group-port (9530 by default).
INTERVAL=$3  # How far back in time, in minutes, to search for the event. 


RESTURL="http://${SERVER}:${PORT}/tc-management-api/agents/operatorEvents?sinceWhen=${INTERVAL}m"

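# Fetch the recent operator events. If curl fails (connection error or HTTP
# error status), CHECK stays empty and the final "Check failed" branch runs.
CHECK=""
if RESPONSE=$(curl -s -f "$RESTURL"); then
   # Keep only the events that mention a node leaving the cluster.
   GET_INFO=$(echo "$RESPONSE" | grep left)
   # Test for non-empty output; "echo | wc -l" reports 1 even with no match.
   if [[ -n "$GET_INFO" ]]; then
      SERVER_LIST=''
      for i in $(echo "$GET_INFO" | sed 's/.*Node\(.*\)left the cluster.*/\1/g'); do SERVER_LIST="$SERVER_LIST $i"; done
      CHECK="NODE_LEFT"
   else
      CHECK="NO_EVENT"
   fi
fi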

if [[ "$CHECK" == "NODE_LEFT" ]]; then
   echo "NODE LEFT EVENT: $SERVER_LIST"
   exit 2
elif [[ "$CHECK" == "NO_EVENT" ]]; then
   echo "No NODE LEFT Event: ${SERVER}"
   exit 0
else
   echo "Check failed"
   exit 3
fi
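
The filtering and extraction steps can be tried without a running cluster by feeding them saved event text. The following is a minimal sketch under that assumption: the SAMPLE value and the node name server1 are made up for illustration, and the real operatorEvents response may be formatted differently.

```shell
#!/bin/bash

# Hypothetical event text standing in for the REST response (an assumption;
# the real payload may differ).
SAMPLE='{"message":"Node server1 left the cluster"}'

# Same filter as the plugin: keep only lines that mention "left".
GET_INFO=$(echo "$SAMPLE" | grep left)

# Same extraction as the plugin: capture the text between "Node" and
# "left the cluster", then rely on word splitting to trim the spaces.
SERVER_LIST=''
for i in $(echo "$GET_INFO" | sed 's/.*Node\(.*\)left the cluster.*/\1/g'); do
   SERVER_LIST="$SERVER_LIST $i"
done

echo "NODE LEFT EVENT:$SERVER_LIST"
```

Because the sed expression is greedy, a line that contains several matching events yields only the last node name, so it may be worth checking how events are laid out in the responses from your own servers.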

Note that the script's exit codes follow the standard required for Nagios plugins:

Value  Status    Description
0      OK        The plugin was able to check the service and it appeared to be functioning properly.
1      Warning   The plugin was able to check the service, but it appeared to be above some "warning" threshold or did not appear to be working properly.
2      Critical  The plugin detected that either the service was not running or it was above some "critical" threshold.
3      Unknown   Invalid command line arguments were supplied to the plugin or low-level failures internal to the plugin occurred (such as unable to fork or open a tcp socket) that prevent it from performing the specified operation. Higher-level errors (such as name resolution errors or socket timeouts) are outside of the control of plugins and should generally NOT be reported as UNKNOWN states.
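
One way to keep status text and exit codes consistent is to route both through a small helper. The sketch below is illustrative only: the function name nagios_status and the "STATUS - message" line format are assumptions, not part of the original script or a Nagios requirement.

```shell
#!/bin/bash

# Hypothetical helper: print a one-line status message and return the
# Nagios exit code that corresponds to the named status.
nagios_status() {
   local status=$1 message=$2
   case "$status" in
      OK)       echo "OK - $message";       return 0 ;;
      WARNING)  echo "WARNING - $message";  return 1 ;;
      CRITICAL) echo "CRITICAL - $message"; return 2 ;;
      *)        echo "UNKNOWN - $message";  return 3 ;;
   esac
}

# Example: capture the code instead of exiting, so the snippet can be
# pasted into an interactive shell.
code=0
nagios_status CRITICAL "NODE LEFT EVENT: server1" || code=$?
echo "plugin would exit with code $code"
```

In a real plugin the last step would be "exit $code" rather than an echo.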

After you create the script, install it in Nagios. A number of tutorials on installing Nagios XI plugins are available on the Internet.

You can generalize the script to find other events by editing the patterns passed to grep and sed, or edit RESTURL to return other types of information.
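
As a rough sketch of that generalization, the event pattern can be passed in as a parameter instead of being hard-coded. The names has_event and EVENT_PATTERN are hypothetical, and the EVENTS text is a made-up stand-in for the REST response.

```shell
#!/bin/bash

# Hypothetical helper: succeed when the event text matches the pattern.
# $1 = extended regular expression, $2 = text returned by the REST query
has_event() {
   grep -E -q -- "$1" <<< "$2"
}

# Made-up sample text (an assumption; the real payload may differ).
EVENTS='Node server1 left the cluster'
EVENT_PATTERN='left the cluster'

if has_event "$EVENT_PATTERN" "$EVENTS"; then
   echo "EVENT FOUND: $EVENT_PATTERN"
else
   echo "No matching event"
fi
```

With this shape, the pattern could be supplied as an additional command-line argument to the plugin, so one script can back several Nagios service checks.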