Monitoring Bacula with Nagios (active checks)

1. Introduction

The goal is to check whether all Bacula backups completed successfully within a given period (e.g. the past 24 hours).

2. What you need

  • Bacula server
  • NRPE
  • NRPE plugin check_bacula.pl
  • Nagios - may be on a separate host

2.1 Assumptions

  • I use FreeBSD myself, so Nagios plugins are installed in /usr/local/libexec/nagios/
  • you are using NRPE >= 2.0
  • your backup server is named “mybackup-srv” in the Nagios configuration
  • the backup job you are monitoring is named “myserv-conf” in Bacula
  • you run the backup once a day
  • you are using MySQL as the Bacula database

3. Configure NRPE

3.1 Get the plugin

I assume you've got NRPE up and running. You just need to add a separate command for every backup job you want to check. Each command queries the Bacula database for information about that job's backups.

First, download the script check_bacula.pl into the Nagios plugins directory and restrict its permissions, since it will contain the database password:

chown root:nagios check_bacula.pl
chmod 750 check_bacula.pl

Now set the correct SQL parameters in the script:

my $sqlDB = "bacula";
my $sqlUsername = "bacula";
my $sqlPassword = "bacula-sql-password";

If you don't use MySQL as your database, you'll also have to change the DSN (data source name) used to access the database, and have the matching Perl DBD driver installed:

my $dsn = "DBI:mysql:database=$sqlDB;host=localhost";
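
Before wiring the plugin into NRPE, it can help to confirm by hand that these credentials work and that the catalog holds the data you expect. The sketch below prints a query that counts successfully terminated runs of one job in the last N hours. The helper name bacula_job_count_sql is hypothetical, and the SQL is an assumption based on the standard Bacula catalog (the Job table, with JobStatus 'T' for jobs that terminated normally); it is not a copy of the query inside check_bacula.pl.

```shell
#!/bin/sh
# Hypothetical helper (not part of check_bacula.pl): prints a MySQL query
# that counts successful runs of a Bacula job within the last N hours.
# usage: bacula_job_count_sql JOBNAME HOURS
bacula_job_count_sql() {
    cat <<EOF
SELECT COUNT(*) AS successful_jobs
FROM Job
WHERE Name = '$1'
  AND JobStatus = 'T'
  AND EndTime >= NOW() - INTERVAL $2 HOUR;
EOF
}

# Print the query for the example job used on this page.
bacula_job_count_sql myserv-conf 24
```

To run the query against the catalog, pipe the output into the mysql client, which prompts for the password: bacula_job_count_sql myserv-conf 24 | mysql -h localhost -u bacula -p bacula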

3.2 Define NRPE command

A sample command to put into nrpe.cfg:

command[check_bacula_myserv_conf]=/usr/local/libexec/nagios/check_bacula.pl -H 24 -w 2 -c 1 -j myserv-conf

The above command means:

  • the check covers the last 24 hours (-H 24)
  • if fewer than 1 successful backup is found (-c 1), CRITICAL is returned
  • if fewer than 2 successful backups are found (-w 2), WARNING is returned
  • the backup job's name in Bacula is “myserv-conf” (-j myserv-conf)

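If you run one job per day, set -w to 1 as well; otherwise a single successful daily backup still returns WARNING. Restart NRPE after editing nrpe.cfg so that it picks up the new command.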
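
The threshold rules can be sketched as a small shell function. This is only an illustration of the semantics described on this page, assuming strict "fewer than" comparisons; bacula_state is a hypothetical name, not something provided by check_bacula.pl.

```shell
#!/bin/sh
# Hypothetical helper: maps a count of successful jobs to a Nagios state,
# mirroring the -w / -c semantics described above.
# usage: bacula_state COUNT WARN CRIT
bacula_state() {
    count=$1
    warn=$2
    crit=$3
    if [ "$count" -lt "$crit" ]; then
        echo "CRITICAL"
        return 2
    elif [ "$count" -lt "$warn" ]; then
        echo "WARNING"
        return 1
    fi
    echo "OK"
    return 0
}

# Show the state for 0, 1 and 2 successful jobs with -w 2 -c 1.
for n in 0 1 2; do
    printf '%s successful job(s): ' "$n"
    bacula_state "$n" 2 1 || true
done
```

With -w 1 -c 1, a single successful job already counts as OK, which is why one backup per day calls for -w 1.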

4. Configure Nagios

I assume you've got Nagios up and running and there are already some services in place for the host we're going to monitor for successful backups.

Assuming you're doing one backup per day, it's enough to check the backup once per day. So let's define a service template for that, based on the already defined generic-service:

define service {
  name                    backup-service
  use                     generic-service
  normal_check_interval   1440 ; 24 hours
  max_check_attempts      2
  register                0
}

Now we need to add the services that actually check Bacula jobs. For each job, add a service definition (substituting the correct Bacula job name and NRPE command name, of course):

define service {
  use                     backup-service
  host_name               mybackup-srv
  service_description     Bacula-backup: myserv-conf
  check_command           check_nrpe2!check_bacula_myserv_conf
}
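
Before restarting Nagios, it may be worth running the check by hand from the Nagios host, since that exercises the same path the service above will use. The sketch below assumes the NRPE client plugin is installed as check_nrpe2 in the FreeBSD plugin directory (as the check_nrpe2 check command above suggests) and that mybackup-srv resolves from the Nagios host; adjust both if your setup differs. nagios_state_name is a hypothetical helper that translates the standard plugin exit codes. The check_nrpe2 check command itself must already be defined in your Nagios configuration.

```shell
#!/bin/sh
# Hypothetical helper: translate a Nagios plugin exit code into a state name.
nagios_state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Assumed location and name of the NRPE client plugin (FreeBSD layout).
CHECK_NRPE=/usr/local/libexec/nagios/check_nrpe2

if [ -x "$CHECK_NRPE" ]; then
    # Ask the NRPE daemon on the backup server to run the command from nrpe.cfg.
    rc=0
    "$CHECK_NRPE" -H mybackup-srv -c check_bacula_myserv_conf || rc=$?
    echo "state: $(nagios_state_name "$rc") (exit code $rc)"
else
    echo "check_nrpe2 not found at $CHECK_NRPE; adjust the path for your system"
fi
```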

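Run the Nagios pre-flight check against your main configuration file (the path below is the usual FreeBSD location; adjust it if yours differs) and fix any reported errors:

nagios -v /usr/local/etc/nagios/nagios.cfg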

If everything is OK, restart Nagios to apply the changes.

nagios_active_checks.txt · Last modified: 2010/01/12 22:33 by mator