Friday, December 30, 2011

Network Monitoring Tool - Nagios

Lately I have been evaluating various open source network monitoring tools for monitoring large networks.

Yes, networks, which comprise not only servers/boxes but also other devices such as printers, routers and switches.

Moreover, the network I am talking about is not one of 5-10 machines; I am talking about monitoring a cluster of 100+ boxes.

Here is a brief and good article describing a few of the popular open source network monitoring tools.

Apart from the tools mentioned in the above link, I have been working extensively with Nagios and found it pretty good, efficient, extensible, and very easy to set up and use.

Written in C and backed by a very active community, Nagios comes with an extremely robust backend and a fast, intuitive UI, and gives almost everything a system or network administrator could ask for when monitoring a network.

Nagios comes with a wide range of plugins which can monitor almost every bit of hardware or software installed anywhere in your network or network of networks.

Starting from the basic ping plugin, it goes much further and provides both active (see the NRPE add-on) and passive (see the NSCA add-on) monitoring.
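To make the passive side a bit more concrete: with NSCA, the monitored box pushes results to the Nagios server instead of waiting to be polled. The send_nsca client reads service check results from standard input, one per line, with the host name, service description, return code and plugin output separated by tabs (the default delimiter). Below is a minimal sketch of building such a line. The function name and the host and service names are my own, purely for illustration.

```python
def format_passive_result(host, service, return_code, output):
    """Build one tab-separated service check result line for send_nsca."""
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return code must be 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN)")
    # Collapse whitespace so stray tabs or newlines in the plugin output
    # cannot break the one-result-per-line record format.
    clean_output = " ".join(output.split())
    return "\t".join([host, service, str(return_code), clean_output]) + "\n"


# Hypothetical host and service names, for illustration only.
line = format_passive_result("web01", "Disk Usage", 1, "DISK WARNING - 85.0% used\n\n")
```

The resulting line is what would be written to the standard input of send_nsca. Normalising the whitespace up front is a deliberate choice, since stray newlines in this kind of glue script are easy to introduce and hard to spot.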

Of course, we can also write our own plugins and use them like any other plugin.
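Writing your own plugin is mostly a matter of following the plugin convention: print one line of status text and exit with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). Here is a minimal sketch of a disk usage check in Python. The thresholds, the "DISK" prefix and the function names are my own assumptions, not something taken from an official plugin.

```python
import shutil

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(used_percent, warn=80.0, crit=90.0):
    """Map a usage percentage to (exit_code, one-line status message)."""
    if used_percent >= crit:
        code = CRITICAL
    elif used_percent >= warn:
        code = WARNING
    else:
        code = OK
    # The text after '|' is optional performance data: label=value;warn;crit
    message = "DISK %s - %.1f%% used | used_pct=%.1f%%;%.0f;%.0f" % (
        LABELS[code], used_percent, used_percent, warn, crit)
    return code, message


def main(path="/"):
    """Check the filesystem holding `path`, print the status line, return the exit code."""
    try:
        usage = shutil.disk_usage(path)
        used_percent = 100.0 * usage.used / usage.total
    except OSError as exc:
        # Anything we cannot measure is reported as UNKNOWN, not as a failure.
        print("DISK UNKNOWN - %s" % exc)
        return UNKNOWN
    code, message = evaluate(used_percent)
    print(message)
    return code
```

To use this as a real plugin, the script would end with `raise SystemExit(main())` so that the return value becomes the process exit code, and it would be placed in the plugin directory and referenced from a command definition like any other plugin.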

Notifications, as usual, remain a critical part of any monitoring tool, and Nagios provides a good and robust SMS and email notification framework.

The web UI is pretty intuitive and provides different ways to group boxes and networks logically, so that monitoring is much easier and the health of the network can be viewed in a logical manner.

After learning all of the above, there is no doubt that Nagios is rightly termed an enterprise-class monitoring tool, one that has been used to monitor the IT infrastructure of companies.

The only thing I found missing is that it does not support installation on Windows, but it does provide plugins to monitor Windows boxes (see here).

A few important links: -

1. Download Nagios
2. Installing Nagios on Ubuntu
3. Short Intro about Nagios Plugins
4. Nagios Directory
5. Nagios Plugins and Add-ons
6. Plugins for Monitoring Hadoop - HDFS, Datanodes and JMX Plugin
7. Monitoring Windows Boxes

Tips and Tricks: -

1. Where to find the Nagios configuration files - /usr/local/nagios/etc

2. Where is the Plugin Directory - /usr/local/nagios/libexec

3. How to stop/start/restart Nagios - sudo /etc/init.d/nagios stop|start|restart|reload

4. Debugging issues during Nagios startup -

In Nagios, what matters most is the /usr/local/nagios/etc/nagios.cfg file.

This is the file you need to validate before starting or restarting your Nagios process.

To validate the nagios.cfg file, run this command: "sudo /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg".

In case of any issues, the errors will be displayed on the console, which helps in debugging and fixing the problem.
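Since this check needs to run before every restart, it is easy to wrap in a small script. The sketch below runs the same verification command and reports whether it passed. It assumes that the nagios binary exits with a non-zero status when the check fails, and the function name is my own.

```python
import subprocess

# Paths as used in this post; adjust them if Nagios is installed elsewhere.
NAGIOS_BIN = "/usr/local/nagios/bin/nagios"
NAGIOS_CFG = "/usr/local/nagios/etc/nagios.cfg"


def config_is_valid(command=(NAGIOS_BIN, "-v", NAGIOS_CFG)):
    """Run the verification command and return (ok, combined console output)."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep errors and normal output together
        text=True,
    )
    # Assumption: a zero exit status means the configuration check passed.
    return result.returncode == 0, result.stdout
```

A restart script could call `config_is_valid()` first, print the captured output when it returns False, and only go on to restart Nagios when it returns True. It has to run with the same privileges you would use for the manual command.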

Error handling and reporting in Nagios are in a very raw state, and even the type of plugins used matters a lot.
(I had a hard time configuring NSCA because there was one extra blank line in one of the shell scripts.)

So it is always advisable to check the configuration frequently, so that errors can be caught at an early stage.

