Bacula is computer backup software; it permits a system or network administrator to manage backup, recovery, and verification of computer data across a network of computers of different kinds. Bacula can also run entirely upon a single computer (though it may not be the best choice for this sort of service) and can back up to various types of media, including tape and disk.
In technical terms, it is a Client/Server based network backup program. Bacula is relatively easy to use and efficient, while offering many advanced storage management features that make it easy to find and recover lost or damaged files. Due to its modular design, Bacula is scalable from small single computer systems to systems consisting of hundreds of computers located over a large network.
Bacula 5.0.0 was released in January 2010, replacing 3.0.3, with GPL licensing. An enterprise commercial version is also available; the version numbers for this release will now start with an even number, commencing at 4.0, to distinguish them from the GPL release. This manual will document the GPL release, of which the commercial version will generally be a subset (albeit perhaps an improper subset).
This documentation is in the process of being upgraded to match 5.0; if this message is still here, parts of it may not be completely up to date when you read it.
If you are currently using a program such as tar, dump, or bru to back up your computer data, and you would like a network solution, more flexibility, or catalog services, Bacula will most likely provide the additional features you want. However, with power comes complexity: if you are new to Unix systems or do not have offsetting experience with more sophisticated backup software, the Bacula project does not recommend using Bacula, as it is much more difficult to set up and use than tar or dump.
If you want Bacula to behave like the above mentioned simple programs and write over any tape that you put in the drive, then you will find working with Bacula difficult. Bacula is designed to protect your data following the rules you specify, and this means reusing a tape only as the last resort. It is possible to “force” Bacula to write over any tape in the drive, but it is easier and more efficient to use a simpler program for that kind of operation.
If you are running Amanda and would like a backup program that can write to multiple volumes (i.e. is not limited by your tape drive capacity), Bacula can most likely fill your needs. In addition, quite a number of Bacula users report that Bacula is simpler to set up and use than other equivalent programs.
If you are currently using a sophisticated commercial package such as Legato Networker, ARCserveIT, Arkeia, or PerfectBackup+, you may be interested in Bacula, which provides many of the same features and is free software available under the GNU General Public License, version 2.
Bacula is made up of the following five major components or services: Director, Console, File, Storage, and Monitor services. Each performs a different part of the work, and some install only on client machines, some only on command and control workstations. We'll provide an overview of each module here, and go into detail in later sections.
The Bacula Director service is the program that supervises all the backup, restore, verify and archive operations. The system administrator uses the Bacula Director to schedule backups and to recover files. The Director runs as a daemon (or service) in the background. The Director needs access to a DBMS engine, in which it stores catalog information about backups, and access to a running Storage service, to store the actual volumes of backed up data; it is common for these three items to run on the same server (“the backup server”) in small to mid-sized installations.
For more details see the Director Services Daemon Design Document in the Bacula Developer's Guide.
The Bacula Console is the program that allows the administrator or user to communicate with the Bacula Director. Currently, the Bacula Console is available in three versions: a text-based console interface, a GNOME-based interface, and a wxWidgets graphical interface.
The first and simplest is to run the Console program in a shell window (i.e. TTY interface). Most system administrators will find this completely adequate. The second version is a GNOME GUI interface that is far from complete, but quite functional, as it has most of the capabilities of the shell Console. The third version is a wxWidgets GUI with an interactive file restore. It also has most of the capabilities of the shell console, allows command completion with tabulation, and gives you instant help about the command you are typing.
The latter two interfaces can be run via X from another *nix machine, but will require the appropriate libraries on the Director server, if your backup server does not have an X window installation. (CHECKME)
For more details see the Bacula Console Design Document.
The Bacula File service (also known as the Client program) is the software program that is installed on the machine to be backed up. It is specific to the operating system on which it runs and is responsible for providing the file attributes and data when requested by the Director. The File services are also responsible for the file system dependent part of restoring the file attributes and data during a recovery operation. This program runs as a daemon on the machine to be backed up.
In addition to Unix/Linux File daemons, there is a Windows File daemon (normally distributed in binary format). The Windows File daemon runs on current Windows versions (NT, 2000, XP, 2003, and possibly Me and 98). Bacula versions as new as 5.0 can interact with File daemons as old as 2.x, so you can upgrade your backup server without necessarily having to reinstall all your workstation and server client programs.
For more details see the File Services Daemon Design Document in the Bacula Developer's Guide.
The Bacula Storage service consists of the software programs that perform the storage and recovery of the file attributes and data to the physical backup media or volumes. In other words, the Storage daemon is responsible for reading and writing your tapes (or other storage media, e.g. disk files). The Storage service runs as a daemon on the machine that has the backup device (usually a tape drive).
For more details see the Storage Services Daemon Design Document in the Bacula Developer's Guide.
The Bacula Catalog service is comprised of the software programs responsible for maintaining the file indexes and volume databases for all files backed up. The Catalog service permits the system administrator or user to quickly locate and restore any desired file. The Catalog service sets Bacula apart from simple backup programs like tar and bru, because the catalog maintains a record of all Volumes used, all Jobs run, and all Files saved, permitting efficient restoration and Volume management.
Bacula currently supports three different databases, PostgreSQL, MySQL, and SQLite, one of which must be chosen when building or installing Bacula. If you're installing from RPM or other packages, you'll generally find package names which include those names; these will be the entire backend setup, configured for that particular database, but will not include the DBMS engine itself (CHECKME).
The three SQL databases currently supported (PostgreSQL, MySQL, and SQLite) provide quite a number of features, including rapid indexing, arbitrary queries, and security. Although the Bacula project plans to support other major SQL databases, the current Bacula implementation interfaces only to PostgreSQL, MySQL, and SQLite.
The packages for MySQL and PostgreSQL are available for several operating systems. Alternatively, installing from the source is quite easy; see the Installing and Configuring MySQL and Installing and Configuring PostgreSQL chapters of this document for the details.
Configuring and building SQLite is even easier. For the details of configuring SQLite, please see the Installing and Configuring SQLite chapter of this document.
For more information on PostgreSQL, please see: www.postgresql.org.
For more information on MySQL, please see: www.mysql.com.
For technical and porting details see the Catalog Services Design Document in the Bacula Developer's Guide.
A Bacula Monitor service is the program that allows the administrator or user to watch current status of Bacula Directors, Bacula File Daemons and Bacula Storage Daemons. Currently, only a GTK+ version is available, which works with GNOME, KDE, or any window manager that supports the FreeDesktop.org system tray standard.
To perform a successful save or restore, the following four daemons must be configured and running: the Director daemon, the File daemon, the Storage daemon, and the Catalog service (MySQL, PostgreSQL or SQLite).
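As a quick sanity check before a first save or restore, you may want to confirm that the daemons are at least accepting connections. The following is a minimal sketch, not part of Bacula: it assumes the daemons use the usual default TCP ports (9101 for the Director, 9102 for the File daemon, 9103 for the Storage daemon), which your configuration may override, and the helper names `is_listening` and `check_daemons` are made up for illustration. It does not check the Catalog database, and an open port does not prove that a daemon is correctly configured.

```python
import socket

# Assumed default ports; each one can be changed in the daemon's configuration.
DEFAULT_PORTS = {
    "Director": 9101,
    "File daemon": 9102,
    "Storage daemon": 9103,
}


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_daemons(host="localhost", ports=DEFAULT_PORTS):
    """Map each daemon name to whether its port accepts connections."""
    return {name: is_listening(host, port) for name, port in ports.items()}
```

Calling `check_daemons("backup.example.org")` (a placeholder host name) returns a dictionary such as one with `True` for each daemon that answered and `False` for the rest.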
In order for Bacula to understand your system, what clients you want backed up and how, you must create a number of configuration files which identify to Bacula the resources, or “objects”, with which it will back up your files. The following presents an overall picture of this:
In upcoming sections, we'll explain what those definitions have to cover, and how to set them up to match your particular environment.
Bacula is in a state of evolution, and as a consequence, this manual will not always agree with the code. If an item in this manual is preceded by an asterisk (*), it indicates that the particular feature is not implemented. If it is preceded by a plus sign (+), it indicates that the feature may be partially implemented.
If you are reading this manual as supplied in a released version of the software, the above paragraph holds true. If you are reading the online version of the manual at www.bacula.org, please bear in mind that this version describes the current version in development (in the CVS) that may contain features not in the released version. Just the same, it generally lags behind the code a bit.
To get Bacula up and running quickly… well, you really can't necessarily get Bacula running quickly; it is large, and complex.
But you should follow the instructions in the manual on how to set Bacula up to back up the backup server itself first, and make sure you can make that work, before putting all the effort into designing a full network backup configuration.
The original author recommends that you first scan the Terminology section below, then quickly review the next chapter entitled The Current State of Bacula, then Getting Started with Bacula, which will give you a quick overview of getting Bacula running. It's a good idea to read through this – or at least skim it – before actually starting to install and configure the programs.
After that, you should proceed to the chapter on Installing Bacula, then How to Configure Bacula, and finally the chapter on Running Bacula.
The Administrator is the person or persons responsible for administering the Bacula system.
The term Backup refers to a Bacula Job that saves files.
The bootstrap file is an ASCII file containing a compact form of commands that allow Bacula or the stand-alone file extraction utility (bextract) to restore the contents of one or more Volumes, for example, the current state of a system just backed up. With a bootstrap file, Bacula can restore your system without a Catalog. You can create a bootstrap file from a Catalog to extract any file or files you wish.
The Catalog is used to store summary information about the Jobs, Clients, and Files that were backed up and on what Volume or Volumes. The information saved in the Catalog permits the administrator or user to determine what jobs were run, their status as well as the important characteristics of each file that was backed up, and most importantly, it permits you to choose what files to restore. The Catalog is an online resource, but does not contain the data for the files backed up. Most of the information stored in the catalog is also stored on the backup volumes (i.e. tapes). Of course, the tapes will also have a copy of the file data in addition to the File Attributes (see below).
The catalog feature is one part of Bacula that distinguishes it from simple backup and archive programs such as dump and tar.
In Bacula's terminology, the word Client refers to the machine being backed up, and it is synonymous with the File services or File daemon; quite often, it is referred to as the FD. A Client is defined in a configuration file resource.
The Console is the program that interfaces to the Director, allowing the user or system administrator to control Bacula.
Daemon is Unix terminology for a program that is always present in the background to carry out a designated task. On Windows systems, as well as some Unix systems, daemons are called Services.
The term directive is used to refer to a statement or a record within a Resource in a configuration file that defines one specific setting. For example, the Name directive defines the name of the Resource.
The Director is the main Bacula server daemon that schedules and directs all Bacula operations. Occasionally, the project refers to the Director as DIR.
A Differential is a backup that includes all files changed since the last Full save started. Note, other backup programs may define this differently. Differentials will be larger than Incrementals, but you only need one, plus the last Full backup, to do a complete restore; they're a compromise between space and restore speed.
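To make the difference between the backup levels concrete, here is a small sketch of how the set of files to save could be chosen. It follows the definitions in this glossary (a Differential looks back to the start of the last Full; an Incremental looks back to the start of the most recent backup of any level), but it is only an illustration: the function names, the `(level, start_time)` history format, and the fallback to saving everything when no Full exists yet are assumptions, not Bacula's actual code.

```python
from datetime import datetime

LEVELS = ("Full", "Differential", "Incremental")


def reference_time(level, history):
    """Return the start time that file changes are measured against.

    history is a list of (level, start_time) pairs for earlier backups.
    None means "save everything": a Full backup, or (an assumption of this
    sketch) any level requested before a Full backup exists.
    """
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level}")
    full_starts = [start for lvl, start in history if lvl == "Full"]
    if level == "Full" or not full_starts:
        return None
    if level == "Differential":
        return max(full_starts)
    # Incremental: the most recent backup of any level.
    return max(start for _, start in history)


def files_to_save(level, history, mtimes):
    """Return the sorted paths whose modification time is after the reference."""
    since = reference_time(level, history)
    return sorted(p for p, m in mtimes.items() if since is None or m > since)


history = [
    ("Full", datetime(2010, 1, 1, 1)),
    ("Incremental", datetime(2010, 1, 2, 1)),
    ("Incremental", datetime(2010, 1, 3, 1)),
]
mtimes = {
    "/etc/hosts": datetime(2009, 12, 20),
    "/home/a.txt": datetime(2010, 1, 1, 12),
    "/home/b.txt": datetime(2010, 1, 3, 12),
}
print(files_to_save("Differential", history, mtimes))  # ['/home/a.txt', '/home/b.txt']
print(files_to_save("Incremental", history, mtimes))   # ['/home/b.txt']
```

The Differential picks up both files changed since the Full started, while the Incremental picks up only the file changed since the latest backup, which is why Differentials tend to be larger.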
The File Attributes are all the information necessary about a file to identify it and all its properties such as size, creation date, modification date, permissions, etc. Normally, the attributes are handled entirely by Bacula so that the user never needs to be concerned about them. The attributes do not include the file's data.
Which attributes are available for a file depends on the filesystem upon which it was created, but generally, Bacula will be involved in restoring files only to the machine from which they were backed up, so this will not be an issue.
The File daemon is the daemon running on the client computer to be backed up. It is also referred to as the File services, and sometimes as the Client services or the FD.
A FileSet is a Resource contained in a configuration file that defines the files to be backed up. It consists of a list of included files or directories, a list of excluded files, and how the file is to be stored (compression, encryption, signatures). For more details, see the FileSet Resource definition in the Director chapter of this document.
An Incremental is a backup that includes all files changed since the last Full, Differential, or Incremental backup started. It is normally specified on the Level directive within the Job resource definition, or in a Schedule resource.
Incremental backups tend to be smaller than differentials, but you have to restore each one, in order, atop the most recent full backup, so they're less convenient.
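The restore order described above can be sketched as follows. This is an illustrative model only, assuming a backup history given oldest first as `(level, label)` pairs; the function name `restore_chain` and the data layout are hypothetical. It picks the most recent Full, then the latest Differential taken after it (if any), then every Incremental taken after that point.

```python
def restore_chain(history):
    """Return the backups to restore, in order, from an oldest-first history."""
    full_positions = [i for i, (level, _) in enumerate(history) if level == "Full"]
    if not full_positions:
        raise ValueError("no Full backup to restore from")
    last_full = full_positions[-1]
    chain = [history[last_full]]
    later = history[last_full + 1:]

    # The latest Differential already contains every change since the Full,
    # so it replaces all the backups between the Full and itself.
    diff_positions = [i for i, (level, _) in enumerate(later) if level == "Differential"]
    if diff_positions:
        last_diff = diff_positions[-1]
        chain.append(later[last_diff])
        later = later[last_diff + 1:]

    # Each remaining Incremental must be applied, in order.
    chain.extend(b for b in later if b[0] == "Incremental")
    return chain


history = [
    ("Full", "Sunday"),
    ("Incremental", "Monday"),
    ("Differential", "Tuesday"),
    ("Incremental", "Wednesday"),
    ("Incremental", "Thursday"),
]
for level, label in restore_chain(history):
    print(level, label)
```

With this history, Monday's Incremental is skipped because Tuesday's Differential covers it, and the restore uses Sunday, Tuesday, Wednesday, and Thursday in that order.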
A Bacula Job is a configuration resource that defines the work that Bacula must perform to back up or restore a particular Client. It consists of the Type (backup, restore, verify, etc.), the Level (full, incremental,…), the FileSet, and the Storage to which the files are to be backed up (Storage device, Media Pool). For more details, see the Job Resource definition in the Director chapter of this document.
The Monitor is the program that interfaces to all the daemons, allowing the user or system administrator to monitor Bacula status.
A resource is a part of a configuration file that defines a specific unit of information that is available to Bacula. It consists of several directives (individual configuration statements). For example, the Job resource defines all the properties of a specific Job: name, schedule, Volume pool, backup type, backup level, …
A restore is a configuration resource that describes the operation of recovering a file from backup media. It is the inverse of a save, except that in most cases, a restore will normally have a small set of files to restore, while normally a Save backs up all the files on the system. Of course, after a disk crash, Bacula can be called upon to do a full Restore of all files that were on the system.
A Schedule is a configuration resource that defines when the Bacula Job will be scheduled for execution. To use the Schedule, the Job resource will refer to the name of the Schedule. For more details, see the Schedule Resource definition in the Director chapter of this document.
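The way resources refer to one another by name can be pictured with a small model. The sketch below is not Bacula's configuration syntax and not its implementation; it is a hypothetical in-memory picture in which each resource is a dictionary of directives, and a Job's FileSet and Schedule directives hold the names of other resources. All resource names here (`BackupHome`, `HomeDirs`, `WeeklyCycle`) are invented for the example.

```python
# Hypothetical in-memory model: resource type -> resource name -> directives.
resources = {
    "Schedule": {"WeeklyCycle": {"Name": "WeeklyCycle"}},
    "FileSet": {"HomeDirs": {"Name": "HomeDirs"}},
    "Job": {
        "BackupHome": {
            "Name": "BackupHome",
            "Type": "Backup",
            "Level": "Incremental",
            "FileSet": "HomeDirs",      # refers to a FileSet by name
            "Schedule": "WeeklyCycle",  # refers to a Schedule by name
        }
    },
}


def resolve_job(job_name, resources):
    """Return a copy of the Job with its name references replaced by resources."""
    job = resources["Job"][job_name]
    resolved = dict(job)
    for kind in ("FileSet", "Schedule"):
        ref = job.get(kind)
        if ref is None:
            continue
        if ref not in resources.get(kind, {}):
            raise LookupError(f'Job "{job_name}" refers to unknown {kind} "{ref}"')
        resolved[kind] = resources[kind][ref]
    return resolved


job = resolve_job("BackupHome", resources)
print(job["Schedule"]["Name"])  # WeeklyCycle
```

A misspelled name in a directive shows up here as a `LookupError`, which is the kind of mistake worth checking for whenever one resource points at another.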
Service is Windows terminology for a daemon (see above). It is frequently used in Unix environments as well.
The Storage Coordinates are the information returned from the Storage Services that uniquely locates a file on a backup medium. They consist of two parts: one part pertains to each file saved, and the other part pertains to the whole Job. Normally, this information is saved in the Catalog so that the user doesn't need specific knowledge of the Storage Coordinates. The Storage Coordinates include the File Attributes (see above) plus the unique location of the information on the backup Volume.
The Storage daemon, sometimes referred to as the SD, is the code that writes the attributes and data to a storage Volume (usually a tape or disk).
A Session normally refers to the internal conversation between the File daemon and the Storage daemon. The File daemon opens a session with the Storage daemon to save a FileSet or to restore it. A session has a one-to-one correspondence to a Bacula Job (see above).
A verify is a job that compares the current file attributes to the attributes that have previously been stored in the Bacula Catalog. This feature can be used for detecting changes to critical system files similar to what a file integrity checker like Tripwire does. One of the major advantages of using Bacula to do this is that on the machine you want protected such as a server, you can run just the File daemon, and the Director, Storage daemon, and Catalog reside on a different machine. As a consequence, if your server is ever compromised, it is unlikely that your verification database will be tampered with.
Verify can also be used to check that the most recent Job data written to a Volume agrees with what is stored in the Catalog (i.e. it compares the file attributes), *or it can check the Volume contents against the original files on disk.
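The attribute comparison at the heart of a verify can be sketched in a few lines. This is a simplified illustration, assuming the attributes saved in the Catalog and the attributes currently found on the client are both available as dictionaries keyed by path; the function name `verify` and the attribute fields shown are hypothetical, not Bacula's own data structures.

```python
def verify(catalog_attrs, current_attrs):
    """Compare saved attributes with current ones.

    Both arguments map a file path to a dictionary of attributes.
    Returns sorted lists of changed, missing, and new paths.
    """
    report = {"changed": [], "missing": [], "new": []}
    for path, saved in catalog_attrs.items():
        current = current_attrs.get(path)
        if current is None:
            report["missing"].append(path)
        elif current != saved:
            report["changed"].append(path)
    report["new"] = [p for p in current_attrs if p not in catalog_attrs]
    for paths in report.values():
        paths.sort()
    return report


catalog = {
    "/bin/ls": {"size": 1000, "mode": 0o755},
    "/bin/ps": {"size": 2000, "mode": 0o755},
    "/etc/passwd": {"size": 300, "mode": 0o644},
}
current = {
    "/bin/ls": {"size": 1000, "mode": 0o755},
    "/bin/ps": {"size": 2048, "mode": 0o755},   # size differs from the Catalog
    "/bin/backdoor": {"size": 10, "mode": 0o755},
}
print(verify(catalog, current))
```

Here `/bin/ps` is reported as changed, `/etc/passwd` as missing, and `/bin/backdoor` as new. Because the saved attributes live on a different machine from the one being checked, an intruder on the client cannot easily make the two sides agree.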
An Archive operation is done after a Save, and it consists of removing the Volumes on which data is saved from active use. These Volumes are marked as Archived, and may no longer be used to save files. All the files contained on an Archived Volume are removed from the Catalog. NOT YET IMPLEMENTED.
An Update operation causes the files on the remote system to be updated to be the same as the host system. This is equivalent to an rdist capability. NOT YET IMPLEMENTED.
There are various kinds of retention periods that Bacula recognizes. The most important are the File Retention Period, Job Retention Period, and the Volume Retention Period. Each of these retention periods applies to the time that specific records will be kept in the Catalog database. This should not be confused with the time that the data saved to a Volume is valid.
The File Retention Period determines the time that File records are kept in the catalog database. This period is important because the File records use by far the most storage space in the database. As a consequence, you must ensure that regular “pruning” of the database File records is done. (See the Console retention command for more details on this subject.)
The Job Retention Period is the length of time that Job records will be kept in the database. Note, all the File records are tied to the Job that saved those files. The File records can be purged leaving the Job records. In this case, information will be available about the jobs that ran, but not the details of the files that were backed up. Normally, when a Job record is purged, all its File records will also be purged.
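The relationship between the File and Job retention periods can be illustrated with a short sketch. It is a model of the rules described above, not Bacula's pruning code: the record layout, the function name `prune_catalog`, and the choice to measure a record's age from the end time of its Job are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def prune_catalog(jobs, now, file_retention, job_retention):
    """Return the Job records that survive pruning.

    jobs is a list of dicts with 'name', 'end' (datetime), and 'has_files'
    (whether the Job's File records are still in the catalog).
    """
    kept = []
    for job in jobs:
        age = now - job["end"]
        if age > job_retention:
            # The Job record is removed, and its File records go with it.
            continue
        # The Job record stays; its File records stay only while young enough.
        still_has_files = job["has_files"] and age <= file_retention
        kept.append({**job, "has_files": still_has_files})
    return kept


now = datetime(2010, 6, 1)
jobs = [
    {"name": "recent", "end": datetime(2010, 5, 20), "has_files": True},
    {"name": "older", "end": datetime(2010, 3, 1), "has_files": True},
    {"name": "ancient", "end": datetime(2009, 10, 1), "has_files": True},
]
for job in prune_catalog(jobs, now, timedelta(days=30), timedelta(days=180)):
    print(job["name"], job["has_files"])
```

With a 30-day File retention and a 180-day Job retention, the recent Job keeps its File records, the older Job remains visible but without file details, and the ancient Job disappears from the catalog entirely.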
The Volume Retention Period is the minimum amount of time that a Volume will be kept before it is reused. Bacula will normally never overwrite a Volume that contains the only backup copy of a file. Under ideal conditions, the Catalog would retain entries for all files backed up for all current Volumes. Once a Volume is overwritten, the files that were backed up on that Volume are automatically removed from the Catalog. However, if there is a very large pool of Volumes or a Volume is never overwritten, the Catalog database may become enormous. To keep the Catalog to a manageable size, the backup information should be removed from the Catalog after the defined File Retention Period. Bacula provides the mechanisms for the catalog to be automatically pruned according to the retention periods defined.
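The idea that a Volume becomes a candidate for reuse only after its retention period has passed can be sketched like this. The sketch is purely illustrative: the record layout and the function name `pick_volume_to_reuse` are hypothetical, measuring the period from the time the Volume was last written is an assumption, and preferring the oldest eligible Volume is a policy chosen for the example rather than a statement of what Bacula does.

```python
from datetime import datetime, timedelta


def pick_volume_to_reuse(volumes, now):
    """Return the name of the oldest Volume past its retention period, or None.

    volumes is a list of dicts with 'name', 'last_written' (datetime), and
    'retention' (timedelta).
    """
    eligible = [v for v in volumes if now - v["last_written"] >= v["retention"]]
    if not eligible:
        return None
    return min(eligible, key=lambda v: v["last_written"])["name"]


now = datetime(2010, 6, 1)
volumes = [
    {"name": "Tape-001", "last_written": datetime(2010, 1, 10), "retention": timedelta(days=90)},
    {"name": "Tape-002", "last_written": datetime(2010, 2, 15), "retention": timedelta(days=90)},
    {"name": "Tape-003", "last_written": datetime(2010, 5, 25), "retention": timedelta(days=90)},
]
print(pick_volume_to_reuse(volumes, now))  # Tape-001
```

Tape-003 was written only a week earlier, so it is protected; of the two Volumes past their 90 days, the one written longest ago is offered first.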
A Scan operation causes the contents of a Volume or a series of Volumes to be scanned. These Volumes with the information on which files they contain are restored to the Bacula Catalog. Once the information is restored to the Catalog, the files contained on those Volumes may be easily restored. This function is particularly useful if certain Volumes or Jobs have exceeded their retention period and have been pruned or purged from the Catalog. Scanning data from Volumes into the Catalog is done by using the bscan program. See the bscan section of the Bacula Utilities Chapter of this manual for more details.
A Volume is an archive unit, normally a tape or a named disk file where Bacula stores the data from one or more backup jobs. All Bacula Volumes have a software label written to the Volume by Bacula so that it identifies what Volume it is really reading. (Normally there should be no confusion with disk files, but with tapes, it is easy to mount the wrong one.)
Bacula is a backup, restore and verification program and is not a complete disaster recovery system in itself, though it can be a key part of one if you plan carefully and follow the instructions included in the Disaster Recovery Chapter of this manual.
With proper planning, Bacula can be a central component of your disaster recovery system. For example, if you have created an emergency boot disk and a Bacula Rescue disk to save the current partitioning information of your hard disk, and maintain a complete Bacula backup, it is possible to completely recover your system from “bare metal” – that is, starting from an empty disk.
If you have used the WriteBootstrap record in your job or some other means to save a valid bootstrap file, you will be able to use it to extract the necessary files (without using the catalog or manually searching for the files to restore).
The following block diagram shows the typical interactions between the Bacula Services for a backup job. Each block represents in general a separate process (normally a daemon). In general, the Director oversees the flow of information. It also maintains the Catalog.
As we've suggested above, most of these processes can run on a single machine, in sufficiently small environments, except for the client, which you will obviously (well, we hope it's obvious, anyway) have to run on each client machine which you wish to back up.