Nagios monitoring through SNMP

This article is part of our Academy Course titled Nagios Tutorial for IT Monitoring.

In this course, we provide a compilation of Nagios tutorials that will help you set up your own monitoring infrastructure. We cover a wide range of topics, from installation and configuration, to plugins and NRPE. With our straightforward tutorials, you will be able to get your own projects up and running in minimum time.

In the previous two articles (Nagios Core Installation and Configuration on Ubuntu Server and Using Nagios plugins and NRPE to check network services and metrics on remote hosts) we discussed how to install Nagios Core, plugins and NRPE to monitor host status (up / down), several network services running on Linux servers, and machine-specific metrics such as the number of logged-on users, processes, and CPU load, to name a few examples.

Additionally, you can use the Simple Network Management Protocol (SNMP) with Nagios to monitor other types of network devices, such as printers, routers, and switches. That is the topic of this guide.

SNMP allows you not only to collect information about a network device, but also to modify its behavior. Under the hood, the protocol exposes device data as variables, which controlling applications can query and/or set. Classic examples are retrieving print counts from network printers (a query) and changing their date and time (a set). However, not all variables are rw (read-write); most of them are read-only. If in doubt, refer to the device documentation.

Prerequisites

As explained earlier, you will typically use SNMP with Nagios to monitor network devices, as opposed to using plugins or NRPE to check services and system information associated with a Linux system. However, you can still use SNMP in the latter case as well. For simplicity, we will use the same CentOS 7 box we have been utilizing so far.

In order to use the CentOS 7 box to simulate a regular network device, we will need to perform some preliminary work before proceeding. This prework will consist of the following steps:

Step 0 – Review basic SNMP concepts. Although a thorough discussion about SNMP is out of the scope of this article, we will try to point out the basics as we go. However, should you need a more detailed explanation, feel free to take a look at this excellent question (with answers) in the Ubuntu forums. Bookmark that page in case you need to refer to it for clarifications later.

Our test environment consists of the following key components (essential in SNMP monitoring):

  • Managed device: CentOS 7 box.
  • Agent: the snmpd service running on the CentOS machine.
  • A Network Management System: Nagios running on the Ubuntu box.

Step 1 – Stop xinetd on the managed device and disable it, so that the service does not start automatically on subsequent boots:

systemctl stop xinetd && systemctl disable xinetd

Step 2 – Install the SNMP packages on the Ubuntu box (this is required before proceeding with Step 3):

sudo apt-get install snmp snmpd libsnmp-dev

and on the CentOS 7 machine (necessary to monitor this host via SNMP):

yum install net-snmp net-snmp-utils

Finally, make sure that the snmpd service is started in the current session and on subsequent boots (CentOS):

systemctl start snmpd && systemctl enable snmpd

Step 3 – Recompile the Nagios plugins on the Ubuntu box. Since the SNMP packages were not installed at the time when we first compiled the plugins (see Step 9 in Nagios Core Installation and Configuration on Ubuntu Server for your reference), the check_snmp plugin could not be added to /usr/local/nagios/libexec.

After completing Steps 2 and 3, verify that the check_snmp plugin is now present inside /usr/local/nagios/libexec, as indicated in Fig. 1:

ls -l /usr/local/nagios/libexec | grep snmp
Figure 1: Checking the presence of the check_snmp plugin

Step 4 – Remove (or comment out) everything in the /usr/local/nagios/etc/servers/centos7.cfg file except the host definition (we will add a few service definitions later).

define host {
host_name               centos7
alias                   My CentOS 7 server
address                 192.168.0.29
max_check_attempts      3
check_period            24x7
check_command           check-host-alive
contacts                nagiosadmin
notification_interval   60
notification_period     24x7
}

Step 5 – Open port 161/udp on the CentOS 7 host (this is the port on which the SNMP agent receives queries):

firewall-cmd --add-port=161/udp
firewall-cmd --add-port=161/udp --permanent

Steps 0 through 5, as outlined above, represent the essential preparations in order for Nagios to monitor the CentOS 7 system via SNMP. In the following sections we will get to the nitty-gritty of the corresponding configurations.

Configuring SNMP on the managed device

Once you have installed and started snmpd on the CentOS 7 box, the variables are only accessible from that host. We need to allow the Ubuntu box to query those variables. To do that, rename /etc/snmp/snmpd.conf to /etc/snmp/snmpd.conf.orig and, for simplicity, create a new snmpd.conf file containing only the following lines. The first line grants read-only access to the public community from the local network. The last three lines (beginning with disk) list the mount points of the existing root, projects, and backups logical volumes, whose percentage of disk usage we want to check via SNMP:

rocommunity public 192.168.0.0/24
disk /
disk /home/projects
disk /home/backups

After restarting snmpd, hosts in the 192.168.0.0/24 network will be allowed to query (ro: read-only) the SNMP variables from the CentOS 7 machine.

Please keep in mind that this is a basic configuration. You can also restrict access to the SNMP variables by host using the IP of the allowed machine. To explore further options, run the snmpconf command on the managed device after making a copy of snmpd.conf.
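As a rough sketch, the rename-and-recreate procedure described above could be wrapped in a small shell function. The function name write_snmpd_conf and its two optional arguments (configuration directory and allowed source) are our own assumptions for illustration; only the file contents come from this guide.

```shell
#!/bin/sh
# Sketch only: writes the minimal snmpd.conf used in this guide.
# $1 = config directory (defaults to /etc/snmp)
# $2 = source allowed to query (defaults to the whole 192.168.0.0/24 network)
write_snmpd_conf() {
    conf_dir=${1:-/etc/snmp}
    source_addr=${2:-192.168.0.0/24}

    # Keep the stock file as snmpd.conf.orig, unless a backup already exists
    if [ -f "$conf_dir/snmpd.conf" ] && [ ! -f "$conf_dir/snmpd.conf.orig" ]; then
        mv "$conf_dir/snmpd.conf" "$conf_dir/snmpd.conf.orig"
    fi

    # Unquoted here-document so that $source_addr is expanded
    cat > "$conf_dir/snmpd.conf" <<EOF
rocommunity public $source_addr
disk /
disk /home/projects
disk /home/backups
EOF
}
```

On the managed device you would call it as root and then restart snmpd. To restrict access to a single machine, pass its IP as the second argument, for example write_snmpd_conf /etc/snmp 192.168.0.30, where 192.168.0.30 is a hypothetical address standing in for the Ubuntu box.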

Configuring Nagios for SNMP

To read or set variables via SNMP, object identifiers (OIDs) are used. A list of common OIDs is available at http://www.oid-info.com/basic-search.htm.

Here are some sample OIDs that we are going to use in this guide:

  • System uptime: 1.3.6.1.2.1.25.1.1.0
  • Percentage of disk space usage (first mount point indicated in snmpd.conf): 1.3.6.1.4.1.2021.9.1.9.1
  • Percentage of disk space usage (second mount point indicated in snmpd.conf): 1.3.6.1.4.1.2021.9.1.9.2
  • Percentage of disk space usage (third mount point indicated in snmpd.conf): 1.3.6.1.4.1.2021.9.1.9.3
  • Total RAM installed: 1.3.6.1.4.1.2021.4.5.0

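Standard object identifiers mean the same thing across devices and vendors. In other words, the same information is accessible using the same OID. However, vendors may also define specific OIDs for their own devices under the 1.3.6.1.4.1 (enterprises) branch. The disk and memory OIDs listed above, for instance, belong to the 1.3.6.1.4.1.2021 (UCD-SNMP) branch served by the snmpd agent, so do not expect to find them on every printer or router.

On the Nagios server, make sure the following block is present in /usr/local/nagios/etc/objects/commands.cfg: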
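To make the distinction between standard and vendor-specific OIDs more concrete, here is a minimal sketch that classifies an OID by its prefix. The function name oid_kind is hypothetical. It relies only on two well-known branches: 1.3.6.1.2.1 (mib-2, the standard objects) and 1.3.6.1.4.1 (enterprises, followed by the vendor's number).

```shell
#!/bin/sh
# Classify an OID by its well-known prefix.
oid_kind() {
    case "$1" in
        1.3.6.1.2.1.*)
            echo "standard (mib-2)" ;;
        1.3.6.1.4.1.*)
            # Strip the enterprises prefix, then keep the first remaining number
            rest=${1#1.3.6.1.4.1.}
            echo "enterprise ${rest%%.*}" ;;
        *)
            echo "other" ;;
    esac
}

oid_kind 1.3.6.1.2.1.25.1.1.0       # system uptime
oid_kind 1.3.6.1.4.1.2021.9.1.9.1   # disk usage, first mount point
```

The first call prints standard (mib-2) and the second prints enterprise 2021, the number assigned to the UCD-SNMP project whose agent we installed on the CentOS 7 box.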

define command{
command_name    check_snmp
command_line    $USER1$/check_snmp -H $HOSTADDRESS$ $ARG1$
}

And append the following command definitions to the same file. They will be used to check the uptime, the percentage of disk usage, and the total RAM installed.

# Uptime via SNMP
define command{
command_name    SNMP-Uptime
command_line    $USER1$/check_snmp -o 1.3.6.1.2.1.25.1.1.0 -H $HOSTADDRESS$ $ARG1$
}
# Percentage of disk usage (/)
define command{
command_name    SNMP-DiskUsagePercentageRoot
command_line    $USER1$/check_snmp -o 1.3.6.1.4.1.2021.9.1.9.1 -H $HOSTADDRESS$ $ARG1$ -w 60 -c 80
}
# Percentage of disk usage (/home/projects)
define command{
command_name    SNMP-DiskUsagePercentageProjects
command_line    $USER1$/check_snmp -o 1.3.6.1.4.1.2021.9.1.9.2 -H $HOSTADDRESS$ $ARG1$ -w 60 -c 80
}
# Percentage of disk usage (/home/backups)
define command{
command_name    SNMP-DiskUsagePercentageBackups
command_line    $USER1$/check_snmp -o 1.3.6.1.4.1.2021.9.1.9.3 -H $HOSTADDRESS$ $ARG1$ -w 60 -c 80
}
# Total RAM installed
define command{
command_name    SNMP-TotalRAMInstalled
command_line    $USER1$/check_snmp -o 1.3.6.1.4.1.2021.4.5.0 -H $HOSTADDRESS$ $ARG1$
}

Finally, we will add the corresponding service definitions to apply the above commands to our CentOS box. To do that, insert the following lines in /usr/local/nagios/etc/servers/centos7.cfg:

define service{
use                     generic-service
host_name               centos7
service_description     System uptime
check_command           SNMP-Uptime!-C public
}
define service{
use                     generic-service
host_name               centos7
service_description     Disk used percentage of /
check_command           SNMP-DiskUsagePercentageRoot!-C public
}
define service{
use                     generic-service
host_name               centos7
service_description     Disk used percentage of /home/projects
check_command           SNMP-DiskUsagePercentageProjects!-C public
}
define service{
use                     generic-service
host_name               centos7
service_description     Disk used percentage of /home/backups
check_command           SNMP-DiskUsagePercentageBackups!-C public
}
define service{
use                     generic-service
host_name               centos7
service_description     Total RAM installed
check_command           SNMP-TotalRAMInstalled!-C public
}
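To see how a check_command entry such as SNMP-DiskUsagePercentageRoot!-C public turns into the command that Nagios actually runs, the following sketch imitates the macro substitution with sed. It is an illustration only, not how Nagios is implemented: the function name expand_command is hypothetical, and we assume that $USER1$ points to /usr/local/nagios/libexec, the plugin directory used throughout this guide.

```shell
#!/bin/sh
# Illustration only: mimic how Nagios fills in the macros of a command_line.
# $1 = command_line template, $2 = host address, $3 = text after the "!"
expand_command() {
    printf '%s\n' "$1" | sed \
        -e 's|\$USER1\$|/usr/local/nagios/libexec|g' \
        -e 's|\$HOSTADDRESS\$|'"$2"'|g' \
        -e 's|\$ARG1\$|'"$3"'|g'
}

# Template taken from the SNMP-DiskUsagePercentageRoot command definition
expand_command \
    '$USER1$/check_snmp -o 1.3.6.1.4.1.2021.9.1.9.1 -H $HOSTADDRESS$ $ARG1$ -w 60 -c 80' \
    192.168.0.29 \
    '-C public'
```

The printed line is the full plugin invocation for the centos7 host. Running that line by hand on the Nagios server is a quick way to test a check before restarting Nagios.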

Once Nagios is restarted, we can open the web user interface and check the status of the services that we just defined as we can see in Fig. 2:

 

Figure 2: Displaying system variables acquired through SNMP in Nagios

The WARNING status for the disk usage of the root partition is caused by the -w 60 option in the SNMP-DiskUsagePercentageRoot command definition, which raises a warning when disk usage is above 60%. Likewise, the -c 80 option marks the service as CRITICAL when usage is above 80%.
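The threshold logic can be sketched in a few lines of shell. This is a simplified model under our own assumptions: it handles only plain integer limits such as the 60 and 80 used in this guide, ignoring negative values and the range syntax that plugins also accept. The function name threshold_state is hypothetical. The return values follow the usual plugin convention of 0 for OK, 1 for WARNING, and 2 for CRITICAL.

```shell
#!/bin/sh
# Simplified model of "-w WARN -c CRIT" for plain integer limits.
# $1 = measured value, $2 = warning limit, $3 = critical limit
threshold_state() {
    if [ "$1" -gt "$3" ]; then
        echo CRITICAL
        return 2
    elif [ "$1" -gt "$2" ]; then
        echo WARNING
        return 1
    fi
    echo OK
    return 0
}
```

With the limits from our command definitions, threshold_state 65 60 80 reports WARNING, while a value of exactly 60 is still OK because only values above the limit trigger the alert.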

Summary

In this article we have reviewed some essential concepts about SNMP and explained how to configure Nagios to monitor system metrics on the managed device using that protocol. To check other types of network devices, consult the documentation of the specific device. The only difference is that you will typically not need to install an SNMP agent on a network printer or a router, since such devices usually ship with one (although it may need to be enabled). The rest of this guide should apply to such cases without major modifications.

Gabriel Canepa

Gabriel Canepa is a Linux Foundation Certified System Administrator (LFCS-1500-0576-0100) and web developer from Villa Mercedes, San Luis, Argentina. He works for a worldwide leading consumer product company and takes great pleasure in using FOSS tools to increase productivity in all areas of his daily work. When he's not typing commands or writing code or articles, he enjoys telling bedtime stories with his wife to his two little daughters and playing with them, the great pleasure of his life.