Finally, we are able to make real backups! We acquired a Quantum SuperLoader 3, a 16-tape LTO-6 library. There were some unexpected hoops to jump through along the way, but everything appears to be working now.

Tape Library for Backups

We acquired a Quantum SuperLoader 3, a 16-tape LTO-6 library. This will hopefully allow us to make proper backups.

Packages

The following packages were installed:

  • scsitools
  • mt-st
  • mtx

Setup and simplifications

If you set the TAPE and CHANGER environment variables, you do not need to specify -f TAPEDEVICE with mt or -f CHANGERDEVICE with mtx, respectively.
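
For example, using the device names that appear later in this article:

export TAPE=/dev/nst0      # default device for mt
export CHANGER=/dev/sg29   # default device for mtx
mt status                  # same as: mt -f /dev/nst0 status
mtx status                 # same as: mtx -f /dev/sg29 status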

There are a number of options for setting up tape drives, such as the default block size, compression, etc. These are configured via /etc/stinit.def:

# This file contains example definitions for different kinds of tape
# devices. 
#
# You can find some examples in /usr/share/doc/mt-st/examples.
#
# Common definitions to be applied to all tape devices
# (This is the driver's default)
#{buffer-writes read-ahead async-writes}

manufacturer=IBM model="ULTRIUM-HH6" {
    buffer-writes
    read-ahead=1
    async-writes=1
    scsi2logical=1
    can-partitions=1
    can-bsr=1
    drive-buffering=1
    mode1 blocksize=0 compression=0
    mode2 blocksize=0 compression=1
    mode3 disabled=1
    mode4 disabled=1
}

The most important setting was scsi2logical; without it, reading from or writing to the drive is not possible. It makes the driver use logical block addresses instead of device-specific addresses. For more options see man stinit.

mode=1-4 are accessed via the device names /dev/st0, /dev/st0l, /dev/st0m and /dev/st0a. By prepending an n we access the non-rewinding devices, i.e. the tape is not rewound after every operation. These are what we use most of the time.
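
After editing /etc/stinit.def, the definitions can be applied and verified like this (a sketch; stinit reads /etc/stinit.def by default):

stinit -v                # apply the definitions, verbose
mt -f /dev/nst0 status   # verify the drive is ready and the options took effect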

Performance tests

Naive testing with tar and mbuffer resulted in a pretty bad write performance of around 55 MB/s:

tar cf - Wieth | mbuffer -m 4G -P80 -o /dev/nst0

even though the data was already cached in memory!

Piping the data through cat to the tape drive improved the performance to the rated 160 MB/s:

tar cf - Wieth | mbuffer -m 4G -P80 | cat - > /dev/nst0

Using hardware compression (/dev/nst0l) gives even faster write speeds, depending on the compressibility of the data:

tar cf - Wieth | mbuffer -m 4G -P80 | cat - > /dev/nst0l

This gives transfer speeds of about 200 MB/s.

Interestingly, this did not improve it:

tar cf - Wieth | mbuffer -m 4G -P80 > /dev/nst0

So what is the reason for these discrepancies?

The reason is that the default block size was wrong. Testing with different block sizes showed that a block size of 1 MByte gives optimal performance. The maximum block size seems to be 4 MBytes.

tar cf - Wieth | mbuffer -m 4G -P80 -b 1M > /dev/nst0
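
Such a benchmark is easy to reproduce by writing an incompressible test file from RAM to tape with different block sizes (a sketch; the file path, size and the set of block sizes are arbitrary):

# create ~2 GB of incompressible test data in RAM
dd if=/dev/urandom of=/dev/shm/testfile bs=1M count=2048

for bs in 64k 256k 512k 1M 4M; do
    mt -f /dev/nst0 rewind
    echo "block size $bs:"
    dd if=/dev/shm/testfile of=/dev/nst0 bs=$bs   # dd prints the throughput
done
mt -f /dev/nst0 rewind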

Oh, and it helps to defragment the directory to be backed up first! We need this because our shared directory is heavily fragmented due to the long period at 95% used space. This is a known problem for any file system that is nearly full (i.e. ~90% or more).

For ext4 use e4defrag and for XFS use xfs_fsr.
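
For example (the mount point is a placeholder for our shared directory):

e4defrag -c /srv/share   # ext4: report the fragmentation score first
e4defrag /srv/share      # ext4: defragment
xfs_fsr -v /srv/share    # XFS: reorganize the extents of all files below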

Autochanger Usage

Before unloading a tape we need to take the tape drive offline:

mt -f /dev/st0 offline

Unload the drive, putting the tape back into slot 1:

mtx -f /dev/sg29 unload 1

Load the drive from slot 2:

mtx -f /dev/sg29 load 2
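
To see which tape is in the drive and what is in each slot (including the barcodes), query the changer status:

mtx -f /dev/sg29 status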

Script for manual backup of directories

This is the current backup script:

#!/bin/bash

# nst0 compression=0
# nst0l compression=1 
TAPE_DEVICE=/dev/nst0l
MBUF_BLKSIZE=512k
MBUF_BUFSIZE=4G # mbuffer buffer size (-m)
MBUF_MINPERC=90 # mbuffer start writing if buffer is MBUF_BUFSIZE % full (-P)
READY=ONLINE    # string from mt status describing device is ready
confirm () {
    read -r -p "${1:-Are you sure? [y/N]} " response
    case $response in
        [yY][eE][sS]|[yY])
            true
            ;;
        *)
            false
            ;;
    esac
}
echo "# Date: $(date +"%F %T")" >> backup.log

echo "Will backup files from dirs: $@"
echo "to tape device: $TAPE_DEVICE"
du -s "$@" | awk '{ sum+=$1} END {print "Total backup size: " sum/1024**2 " GBytes"}'
! confirm && exit 0
echo "Rewind tape?"
confirm && mt -f $TAPE_DEVICE rewind && echo "# Rewound tape" >> backup.log

# write each dir as a single file. 
# Use fsf or asf with mt command to advance to files if you want to extract them
echo -e "# Parameters: \n# blksize $MBUF_BLKSIZE\n# tape $TAPE_DEVICE" >> backup.log
echo "# $(mtx -f /dev/sg1 status|grep Transfer |grep -o -e 'VolumeTag.*L6')" >> backup.log
echo "# directory startblock endblock" >> backup.log
for DIR in "$@"; do
    START_BLOCK=$(mt -f $TAPE_DEVICE tell | grep -o -e "[0-9]*")
    echo "Directory: $DIR (starting at block $START_BLOCK)"
    #find $DIR -depth -print0 \
    #| afio -o -1 -0 -b 10k -\
    sleep 30
    tar cf - "$DIR" \
    | mbuffer -m ${MBUF_BUFSIZE} -P${MBUF_MINPERC} -s ${MBUF_BLKSIZE} -o $TAPE_DEVICE
    RET=$?
    if [ $RET -ne 0 ]; then
            echo "ERROR: an error occurred while writing $DIR (RETCODE=$RET)"
            exit 1
    fi
    END_BLOCK=$(mt -f $TAPE_DEVICE tell | grep -o -e "[0-9]*")
    python -c "print 'Wrote %.3f GBytes' % (($END_BLOCK - $START_BLOCK)/1024./2)"
    echo "Last block $END_BLOCK"
    echo "$DIR $START_BLOCK $((END_BLOCK - 1))" >> backup.log
done
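
The script takes the directories to back up as arguments, e.g. (assuming it is saved as backup.sh):

./backup.sh svenja timothy robin dominic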

The lines

	find $DIR -depth -print0 \
	| afio -o -1 -0 -b 1m -\

are not working properly for some reason. When looping through the directories I encounter broken pipe errors from afio. Using tar I did not see this behavior at first. (Actually I did see the same behavior; the sleep 30 seems to fix it. This will be investigated further.)

  • I did not specify a bigger block size (-b, --blocking-factor) in the tar command because mbuffer handles the blocking when writing to the drive
  • a block size of 512k gives, despite the earlier tests, a sustained uncompressed write speed of 160 MBytes/s to tape
  • the smaller block size, together with the sleep 30 between directories, fixed the broken pipe errors

Reading the archive back with mbuffer

mbuffer -i /dev/nst0l -s 1M | tar tf -

Sometimes it cannot read the tape, despite the block size being correct. Doubling the block size in such a case makes the archive readable again … this only seems to happen for very small archives of about 1–8 MBytes. Bigger ones work as advertised.
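
Since every directory is written as a separate tape file, a specific archive can be listed by positioning the tape first, as hinted at in the script comment above (a sketch; tape file numbering starts at 0, and the block size must match what was written, 512k for the script above):

mt -f /dev/nst0l asf 2                      # rewind, then position at the third file
mbuffer -i /dev/nst0l -s 512k | tar tf -    # list its contents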

Here is some typical output of the backup script:

Directory: svenja (starting at block 553721)

in @ 0.0 kB/s, out @ 0.0 kB/s, 141 GB total, buffer 0% full

summary: 141 GByte in 22 min 28.9 sec - average of 107 MB/s, 5x empty

Wrote 141.387 GBytes

Last block 843281

Directory: timothy (starting at block 843281)

in @ 0.0 kB/s, out @ 0.0 kB/s, 161 GB total, buffer 0% full

summary: 161 GByte in 17 min 45.2 sec - average of 155 MB/s

Wrote 161.296 GBytes

Last block 1173615

Directory: robin (starting at block 1173615)

in @ 0.0 kB/s, out @ 0.0 kB/s, 168 GB total, buffer 0% full

summary: 168 GByte in 18 min 48.2 sec - average of 152 MB/s

Wrote 167.808 GBytes

Last block 1517285

Directory: dominic (starting at block 1517285)

in @ 0.0 kB/s, out @ 0.0 kB/s, 270 GB total, buffer 0% full

summary: 270 GByte in 30 min 14.6 sec - average of 153 MB/s

Wrote 270.467 GBytes

Last block 2071201

Directory: tatjana (starting at block 2071201)

in @ 151 MB/s, out @ 151 MB/s, 323 GB total, buffer 100% full

As you can see in the output, the buffer ran empty 5 times while backing up the svenja folder. The reason is that this folder has some subfolders with tens of thousands of very small files inside.

Compression can lead to truly fantastic transfer rates (if the data is compressible):

in @ 323 MB/s, out @ 445 MB/s, 127.1 GB total, buffer 95% full

Bacula integration

The tape library will also be used with Bacula, so a test with it is necessary. This is very nicely documented; one has to use the btape command.

First, we need to configure the storage daemon. This is part of a working configuration file of the Bacula storage daemon (/etc/bacula/bacula-sd.conf):

# Our Tape library
Autochanger {
  Name = SuperLoader3
  Device = LTO-6
  Changer Command = "/etc/bacula/scripts/mtx-changer %c %o %S %a %d"
  Changer Device = /dev/sg28
  #
  # Enable the Alert command only if you have the mtx package loaded
  # Note, apparently on some systems, tapeinfo resets the SCSI controller
  #  thus if you turn this on, make sure it does not reset your SCSI
  #  controller.  I have never had any problems, and smartctl does
  #  not seem to cause such problems.
  #
  # Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'"
  # If you have smartctl, enable this, it has more info than tapeinfo
  # Alert Command = "sh -c 'smartctl -H -l error %c'"
}


Device {
  Name = LTO-6
  Media Type = LTO-6
  RandomAccess = no;
  Archive Device = /dev/nst0l    # Normal archive device
  Autochanger = yes
  LabelMedia = no;
  AutomaticMount = yes;
  AlwaysOpen = yes;
  Minimum block size = 524288
  Maximum block size =  524288
  Maximum file size = 64g
  Maximum spool size = 256g
  Spool Directory = /tank/bacula/spool
}

Messages {
  Name = Standard
  director = nas2-dir = all
}

Note: if the Bacula storage daemon is running, one cannot access the tape drive! For manual backups, either stop the bacula-sd daemon or take the drive offline in bconsole (unmount). Don't forget to mount the drive again or Bacula will not be able to use the device!
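
In bconsole this looks like the following (assuming the storage resource is named LTO6, as referenced in the pool definition below):

unmount storage=LTO6
(do the manual tape work)
mount storage=LTO6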

In order to test the tape drive and library you have to stop the storage daemon first (service bacula-sd stop), then run btape -c /etc/bacula/bacula-sd.conf /dev/nst0. This gives a prompt similar to bconsole. The first command to run is test, which performs a series of checks on the tape drive and library; if this test fails, you cannot use Bacula with this tape drive.

Another test you can do with btape is speed, which measures the write speed of your drive for several block sizes.
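
A typical btape session therefore looks like this (commands after the * prompt):

service bacula-sd stop
btape -c /etc/bacula/bacula-sd.conf /dev/nst0
*test
*speed
*quit
service bacula-sd start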

If the NAS cannot keep up with the data rate of the tape drive, you will get terrible backup performance. To prevent shoe-shining, we stage the backup data on a fast disk first and then stream it to tape. Bacula calls this "spooling", and it happens automatically if configured properly. The drawback is that you only get about 50% of the rated tape performance overall because, unlike with tar and mbuffer, new data is only spooled after the previous batch has been completely written to tape. It keeps the tape drive healthy, though.
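
Spooling is enabled per job in the director configuration; together with the Spool Directory and Maximum spool size set in the Device resource above, a Job resource only needs one extra directive (a minimal sketch, the job name is hypothetical):

Job {
  Name = "BackupData"
  # ... the usual FileSet, Schedule, Storage, Pool entries ...
  Spool Data = yes
}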

Bacula reports 4,903,542 blocks of 512k on one tape, i.e. about 2.57 TB, which matches the native LTO-6 capacity of 2.5 TB.

Configuration of Bacula

# Incremental Pool definition
Pool {
  Name = DataInc
  Storage = LTO6
  Pool Type = Backup
  Recycle = yes                       # Bacula can automatically recycle Volumes
  AutoPrune = yes                     # Prune expired volumes
  Volume Retention = 3 Weeks          # 3 Weeks
  Maximum Volume Bytes = 0            # limit Volume size to something reasonable (one tape ≈ 2.5e12 bytes ≈ 2.27 TiB), or 0 to use the full tape
  Maximum Volumes = 2                 # Limit number of Volumes in Pool: 2 * 2.5 TB = 5 TB
  Cleaning Prefix = "CLN"
}

We use three pools for Full, Differential, and Incremental backup jobs for the main data backup. An extra pool is kept for additional servers (mail, share, database, etc.).
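
Since the SuperLoader 3 has a barcode reader, the tapes for a pool can be labeled in one go from bconsole (pool and storage names as defined above):

label barcodes pool=DataInc storage=LTO6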