PostgreSQL

Databases in separate files

Contributed by Marcel Gsteiger

After a few hours of troubleshooting I finally have my PostgreSQL database backup running. My backup job creates separate schema and data backups for each database, along with a separate backup of the global objects. This is much easier and safer to handle than an “all-in-one-file” backup. Moreover, my scripts back up the data through pipes, so no additional disk space is needed for large database backups. The script automatically determines which databases exist on the server. My server is a CentOS 4.2 box with SELinux enabled, running bacula-1.38.5-4 with the PostgreSQL catalog backend on x86_64 dual Xeon hardware with an Ultrium-3 tape drive attached. The data is always spooled through a separate RAID 0 array (the tape is too fast for my other disks). My PostgreSQL is 8.1.2, but the scripts should also work with any version from 7.3 or 7.4 onwards.

I hope this is useful for somebody else, too.

regards –Marcel

First, create the directories /var/lib/pgsql/data/dump and /var/lib/pgsql/data/dump/fifo, chown them postgres:bacula, and chmod them 750. Ensure that the database user postgres has “trust” access to all databases when connecting on the local host (no passwords needed). The script also works for backing up remote databases, but make sure that the access rights are set properly there as well.

If you prefer to use a password, you can uncomment the export PGPASSWORD=xxxxxxxx lines or, even better, add a ~/.pgpass file; both options are sketched below.
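
For reference, the preparation could look roughly like this (a sketch; adjust paths, groups and the placeholder password to your installation):

mkdir -p /var/lib/pgsql/data/dump/fifo
chown postgres:bacula /var/lib/pgsql/data/dump /var/lib/pgsql/data/dump/fifo
chmod 750 /var/lib/pgsql/data/dump /var/lib/pgsql/data/dump/fifo

# "trust" access for local connections, in pg_hba.conf:
#   local   all   postgres   trust
# Alternative with a password: a ~/.pgpass entry for the backup user,
# format host:port:database:user:password, file must be chmod 600:
#   localhost:5432:*:postgres:xxxxxxxx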

Create these files:

/etc/bacula/make_database_backup: owner root, group postgres, chmod g+x:

#!/bin/sh
exec > /dev/null                       # discard this script's stdout so the job output stays clean
DUMPDIR=/var/lib/pgsql/data/dump
FIFODIR=$DUMPDIR/fifo
export PGUSER=postgres
#export PGPASSWORD=xxxx                       # only when pg_hba.conf requires it
/usr/bin/pg_dumpall -g >$DUMPDIR/globalobjects.dump   # hopefully never a big file, so no need for a fifo
rm -f $FIFODIR/*.data.dump
for dbname in `psql -d template1 -q -t <<EOF
select datname from pg_database where not datname in ('bacula','template0') order by datname;
EOF
` 
do
 mkfifo $FIFODIR/$dbname.schema.dump
 /usr/bin/pg_dump --format=c -s $dbname --file=$FIFODIR/$dbname.schema.dump 2>&1 < /dev/null & 
 mkfifo $FIFODIR/$dbname.data.dump 
 /usr/bin/pg_dump --format=c -a $dbname --file=$FIFODIR/$dbname.data.dump 2>&1 < /dev/null &
done
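
To test the script by hand before wiring it into Bacula, you can run it as the postgres user and check that the FIFOs appear (a quick sanity check, not part of the original recipe):

su - postgres -c /etc/bacula/make_database_backup
ls -l /var/lib/pgsql/data/dump/fifo   # one .schema.dump and one .data.dump FIFO per database
su - postgres -c /etc/bacula/delete_database_backup   # clean up after the test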

/etc/bacula/delete_database_backup: owner root, group postgres, chmod g+x:

#!/bin/sh
DUMPDIR=/var/lib/pgsql/data/dump
FIFODIR=$DUMPDIR/fifo
for dbname in `psql -U postgres -d template1 -q -t <<EOF
select datname from pg_database where not datname in ('bacula','template0') order by datname;
EOF
` 
do
 kill `ps aux | fgrep "/usr/bin/pg_dump" | fgrep " --file=$FIFODIR/$dbname.schema.dump" | awk '{print $2}'`
 rm -f $FIFODIR/$dbname.schema.dump
 kill `ps aux | fgrep "/usr/bin/pg_dump" | fgrep " --file=$FIFODIR/$dbname.data.dump" | awk '{print $2}'`
 rm -f $FIFODIR/$dbname.data.dump
done
rm -f $DUMPDIR/globalobjects.dump

Use this helper script to tell Bacula which dump files to back up: /etc/bacula/listdbdump

#!/bin/sh
FIFODIR=/var/lib/pgsql/data/dump/fifo
for dbname in `psql -d template1 -q -U postgres -h $1 -p $2 -t  <<EOF
select datname from pg_database where not datname in ('bacula','template0') order by datname;
EOF
`
do
 echo "$FIFODIR/$dbname.schema.dump"
 echo "$FIFODIR/$dbname.data.dump"
done
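
Called with the database host and port, the helper prints one FIFO path per dump, which is exactly what the FileSet below feeds to Bacula. For a server with a single (made-up) database mydb:

$ /etc/bacula/listdbdump myhost-or-ip.mynet.lan 5432
/var/lib/pgsql/data/dump/fifo/mydb.schema.dump
/var/lib/pgsql/data/dump/fifo/mydb.data.dump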

Then create these entries in bacula-dir.conf:

Job {
  Name = "hymost-db"
  JobDefs = "DefaultJob"
  Level = Full
  FileSet="myhost-db"   
  Client = myhost-fd   
  Schedule = "WeeklyCycleAfterBackup"
  # This creates a backup of the databases with pg_dump to fifos
  Client Run Before Job = "su - postgres -c \"/etc/bacula/make_database_backup\"" 
  # This deletes the backup and fifo files
  Client Run After Job  = "su - postgres -c \"/etc/bacula/delete_database_backup\""
  Priority = 17                   # run after main backup
}

FileSet {
  Name = "myhost-db"   
  Include {
     Options {
        signature = MD5
        readfifo = yes   
     }
       File = "/var/lib/pgsql/data/dump/globalobjects.dump"
       File = "|/etc/bacula/listdbdump myhost-or-ip.mynet.lan 5432"
  }
}
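
After a Bacula restore the dumps come back as regular files, so they can be loaded with psql (the global objects are plain SQL) and pg_restore (the schema and data dumps use the custom format). A rough sketch, with a hypothetical database mydb:

psql -U postgres -f /var/lib/pgsql/data/dump/globalobjects.dump postgres
createdb -U postgres mydb
pg_restore -U postgres -d mydb /var/lib/pgsql/data/dump/fifo/mydb.schema.dump
pg_restore -U postgres -d mydb /var/lib/pgsql/data/dump/fifo/mydb.data.dump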

All databases in a single file (pg_dumpall)

When trying to back up my PostgreSQL databases with pg_dumpall through a pipe, I noticed that the little trick with “exec > /dev/null” does not work for me, probably because pg_dumpall sends its data to stdout and “exec > /dev/null” redirects stdout to /dev/null (correct me if I am wrong). So I use a slightly different approach with a terminal multiplexer (screen).

backupPostgresqlBefore.sh

#!/bin/bash

FIFO="/var/tmp/postgresql.dump"

if [ -r $FIFO ]; then
        rm -f $FIFO
        if [ "$?" -ne "0" ]; then
                exit 1
        fi
fi

mkfifo $FIFO
if [ "$?" -ne "0" ]; then
  exit 1
fi

# Run pg_dumpall detached in a screen session, so this pre-job script can
# return immediately while the dump keeps feeding the FIFO.
screen -d -m -S pgdump -t pgdump su -s /bin/bash -c "pg_dumpall -o -U postgres > ${FIFO}"

backupPostgresqlAfter.sh

#!/bin/bash

FIFO="/var/tmp/postgresql.dump"

rm $FIFO
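
The matching Bacula pieces are not shown above; a minimal sketch (names are made up) would hook the two scripts into the job and read the dump through the FIFO, along the lines of the first method:

Job {
  # ...
  ClientRunBeforeJob = "/etc/bacula/backupPostgresqlBefore.sh"
  ClientRunAfterJob  = "/etc/bacula/backupPostgresqlAfter.sh"
}

FileSet {
  Name = "myhost-pgdumpall"
  Include {
    Options {
      signature = MD5
      readfifo = yes        # read the data out of the FIFO
    }
    File = "/var/tmp/postgresql.dump"
  }
}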

Single pass backup with bpipe

You can also do the backup with bpipe. The downside of bpipe is that you can't restore a database side by side (not with this configuration, anyway), so the only way to restore is to drop the database and restore it, or to restore to a separate host. The benefit compared to FIFO- or file-based backups is that the data is read directly from the database and no pre- and post-job scripts are needed. You also don't need to prepare separately for the restore, as you do with FIFO backups.

Create the following script on the client: /etc/bacula/scripts/bpipe-pg-db

#!/bin/sh
# Define the command to get proper rights for postgresql access
SUCMD="su - postgres -c"

# Get databases to backup
for dbname in `$SUCMD 'psql -d template1 -q -U postgres -t'  <<EOF
select datname from pg_database where not datname in ('bacula','template0') order by datname;
EOF
`
do
 echo "bpipe:/PGSQL/$dbname.dump:$SUCMD '/usr/bin/pg_dump --format=c $dbname':$SUCMD 'pg_restore -d template1 -C'"
done

Make sure that your client is configured for plugins: the Plugin Directory directive must be set in the File Daemon configuration for bpipe to work.
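
In bacula-fd.conf this sits in the FileDaemon resource; the directory below is a common default, but it may differ on your distribution:

FileDaemon {
  Name = myhost-fd
  # ...
  Plugin Directory = /usr/lib/bacula
}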

Next create the backup fileset:

FileSet {
  Name = "PG Database Fileset"
  Include {
    Plugin = "\\|/etc/bacula/scripts/bpipe-pg-db"
  }
}

Run a backup and verify that you can see /PGSQL/* files.
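
For example, from bconsole, where 123 stands for the jobid of the finished backup and mydb for one of your databases:

*list files jobid=123
 /PGSQL/mydb.dump
 ...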

Incremental backups and point-in-time recovery (PITR) with WAL archiving

Contributed by Pablo Marques

You can back up the database logs and recover the database to any point in time using WAL archiving; see the PostgreSQL documentation on continuous archiving:

http://www.postgresql.org/docs/8.4/static/continuous-archiving.html

Create a directory to hold the logs:

mkdir /wals
chown postgres:postgres /wals
chmod 770 /wals

Change these lines in your postgresql.conf:

archive_mode = on               # allows archiving to be done
                                # (change requires restart)
archive_command = '/var/lib/pgsql/bin/copy_wal.sh "%f" "%p"'

Create the directory to hold the script:

mkdir /var/lib/pgsql/bin
chown postgres:postgres /var/lib/pgsql/bin

Create the file /var/lib/pgsql/bin/copy_wal.sh:

#!/bin/bash

# Modify this according to your setup
DEST=/wals
DATE=`date +"%b %d %T"`
if [ -e /var/lib/pgsql/backup_in_progress ]; then
 echo "$DATE - backup_in_progress" >> $DEST/wal-copy-log.txt 
 exit 1
fi
if [ -e $DEST/$1 ]; then 
 echo "$DATE - old file still there" >> $DEST/wal-copy-log.txt 
 exit 1
fi
echo "$DATE - /bin/cp $2 $DEST/$1" >> $DEST/wal-copy-log.txt
/bin/cp $2 $DEST/$1

Set the ownership and make it executable:

chown postgres /var/lib/pgsql/bin/copy_wal.sh
chmod 750 /var/lib/pgsql/bin/copy_wal.sh

Restart PostgreSQL.

Make sure that WAL archiving is working and that the log segments are being copied to /wals before you go any further.
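
One way to verify this is to force a log switch and watch the archive directory (pg_switch_xlog() is the function name in the 8.x series; recent releases call it pg_switch_wal()):

su - postgres -c "psql -c \"select pg_switch_xlog();\""
ls -l /wals                     # a new WAL segment should show up here
tail /wals/wal-copy-log.txt     # and copy_wal.sh should have logged the copy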

Next, install a working Bacula client on the database server and create the following files:

/etc/bacula/pre-base-backup.sh

#!/bin/bash

su - postgres -c "psql -c \"select pg_start_backup('Full_Backup',true);\""

/etc/bacula/post-base-backup.sh

#!/bin/bash

su - postgres -c "psql -c \"select pg_stop_backup();\""

/etc/bacula/pre-logs-backup.sh

#!/bin/bash

touch /var/lib/pgsql/backup_in_progress

/etc/bacula/post-logs-backup.sh

#!/bin/bash

# Modify this according to your setup
DEST=/wals

rm -f $DEST/*
rm -f /var/lib/pgsql/backup_in_progress

Secure the files:

chmod 750 /etc/bacula/pre-base-backup.sh /etc/bacula/post-base-backup.sh /etc/bacula/pre-logs-backup.sh /etc/bacula/post-logs-backup.sh

To be safe, we'll back up the log files just after the base backup finishes, so we need to create /etc/bacula/run_job.sh on the Bacula server:

#!/bin/sh

sleep 10
echo "run job=$1 yes" | bconsole

Now, on the Bacula server, create a FULL database backup job and an INCREMENTAL database backup job:

Job {
  Name = "db-server-postgresqlLogs-fd"
  JobDefs = "jobbaculadefs"
  Client = db-server-fd
  FileSet = "Linux-PostgresqlLogs-set"
  Accurate = no   # I want to see/restore all the files from all previous incremental backups
  #...
  #... 
  Schedule = "DatabaseLogsCycle-SQL"
  ClientRunBeforeJob = "/etc/bacula/pre-logs-backup.sh" 
  ClientRunAfterJob = "/etc/bacula/post-logs-backup.sh"
}

Job {
  Name = "db-server-postgresqlFull-fd"
  JobDefs = "jobbaculadefs"
  Client = db-server-fd
  FileSet = "Linux-PostgresqlFull-set"
  Level = Full
  Schedule = "DatabaseFullCycle-SQL"
  #...
  #...
  ClientRunBeforeJob = "/etc/bacula/pre-base-backup.sh" 
  RunScript {
    RunsWhen = After
    Command =  "/etc/bacula/post-base-backup.sh"
  }
  RunScript {
    RunsWhen = After
    RunsOnClient = no
    Command = "/etc/bacula/run_job.sh db-server-postgresqlLogs-fd"
  }
}

FileSet {
  Name = "Linux-PostgresqlFull-set"
  Include {
    Options {
      signature = MD5
      compression = LZO
      Sparse = yes
      aclsupport = yes
      xattrsupport = yes
     }
    File = "/var/lib/pgsql"
    }
    Exclude {
     File = "/var/lib/pgsql/data/pg_xlog"
    }
}
FileSet {
  Name = "Linux-PostgresqlLogs-set"
  Include {
    Options {
      signature = MD5
      compression = LZO
      Sparse = yes
      aclsupport = yes
      xattrsupport = yes
     }
    File = "/wals"
    }
}
# I run the FULLs on Sundays and the Incrementals every weekday.
Schedule {
  Name = "DatabaseLogsCycle-SQL"
  Run = Level = Incremental mon-sat at 01:10
}
Schedule {
  Name = "DatabaseFullCycle-SQL"
  Run = Level = Full sun at 01:00
}

You need to have a FULL backup before you can restore anything, so run a FULL ASAP.
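
For example, from bconsole on the director, using the job name defined above:

echo "run job=db-server-postgresqlFull-fd yes" | bconsole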

In case of a disaster, or if you just want to bring the database back to a previous point in time, follow the instructions in section 24.3.3 of the PostgreSQL manual:

http://www.postgresql.org/docs/8.4/static/continuous-archiving.html

  1. Stop the server, if it's running.
  2. Clean out all existing files and subdirectories under the cluster data directory and under the root directories of any tablespaces you are using.
  3. Restore the database files from your FULL backup. Be careful that they are restored with the right ownership (the database system user, not root!) and with the right permissions. If you are using tablespaces, you should verify that the symbolic links in pg_tblspc/ were correctly restored.
  4. Restore all files to /wals from the Incremental backup up to the date you want to recover.
  5. Create a recovery command file recovery.conf in the cluster data directory (see http://www.postgresql.org/docs/8.4/static/continuous-archiving.html#RECOVERY-CONFIG-SETTINGS); usually restore_command = 'cp /wals/%f "%p"' is all you need (see the sketch after this list). You might also want to temporarily modify pg_hba.conf to prevent ordinary users from connecting until you are sure the recovery has worked.
  6. Start the server.
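
A minimal recovery.conf for this setup might look like this; recovery_target_time is optional and only needed for point-in-time recovery (the timestamp is a placeholder):

restore_command = 'cp /wals/%f "%p"'
# Optional: stop replay at a given moment instead of the end of the logs
#recovery_target_time = '2017-04-01 12:00:00'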