Quantum SuperLoader3 LTO6 Tape Library with Debian Wheezy
Finally, we are able to make real backups! We acquired a Quantum SuperLoader 3, a 16-tape LTO-6 library. There were a few unexpected bumps along the way, but everything appears to be working now.
Tape Library for Backups
We acquired a Quantum SuperLoader3, a 16-tape LTO-6 library. This will hopefully allow us to make proper backups.
Packages
The following packages were installed:
- scsitools
- mt-st
- mtx
Simplifying setup
If you set the TAPE and CHANGER environment variables, you do not need to specify -f TAPEDEVICE with mt or -f CHANGERDEVICE with mtx, respectively.
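For example (the device nodes below are assumptions from this system; adjust them to yours):

```shell
# With TAPE and CHANGER exported, mt and mtx pick up their device
# automatically and the -f flag can be dropped.
export TAPE=/dev/nst0       # assumed tape drive node
export CHANGER=/dev/sg29    # assumed changer node
# mt status                 # now equivalent to: mt -f /dev/nst0 status
# mtx status                # now equivalent to: mtx -f /dev/sg29 status
echo "TAPE=$TAPE CHANGER=$CHANGER"
```

Putting the two exports into your shell profile makes all the commands below shorter.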
There are a number of options for setting up tape libraries, like the default block size, compression, etc. This is done via /etc/stinit.def:
# This file contains example definitions for different kinds of tape
# devices.
#
# You can find some examples in /usr/share/doc/mt-st/examples.
#
# Common definitions to be applied to all tape devices
# (This is the driver's default)
#{buffer-writes read-ahead async-writes}
manufacturer=IBM model="ULTRIUM-HH6" {
buffer-writes
read-ahead=1
async-writes=1
scsi2logical=1
can-partitions=1
can-bsr=1
drive-buffering=1
mode1 blocksize=0 compression=0
mode2 blocksize=0 compression=1
mode3 disabled=1
mode4 disabled=1
}
The most important setting was scsi2logical; without it, it is not possible to read from or write to the drive. It makes the driver use logical block addresses instead of device-specific addresses. For more options see man stinit.
The modes 1-4 are accessed through the device names /dev/st0, /dev/st0l, /dev/st0a, and /dev/st0m. By prepending n we access the non-rewinding devices, i.e. the tape is not rewound after every operation. This is what is used most of the time.
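For drive 0 the full set of nodes can be spelled out like this (a trivial sketch just to make the mapping explicit):

```shell
# Build the list of tape device nodes for drive 0: for each mode there is
# a rewinding node and, with the n prefix, a non-rewinding one.
devices=""
for suffix in "" l a m; do
  devices="$devices /dev/st0$suffix /dev/nst0$suffix"
done
echo "$devices"
```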
Performance tests
Naive testing with tar and mbuffer resulted in a pretty bad write performance of around 55 MB/s:
tar cf - Wieth | mbuffer -m 4G -P80 -o /dev/nst0
even though the data was already in memory!
Piping the data through cat to the tape drive improved the performance to the rated 160 MB/s:
tar cf - Wieth | mbuffer -m 4G -P80 | cat - > /dev/nst0
Hardware compression (/dev/nst0l) gives even faster write speeds, depending on the compressibility of the data:
tar cf - Wieth | mbuffer -m 4G -P80 | cat - > /dev/nst0l
This gives transfer speeds of about 200 MB/s.
Interestingly, this did not improve it:
tar cf - Wieth | mbuffer -m 4G -P80 > /dev/nst0
So what is the reason for these discrepancies? The default block size is wrong. Testing with different block sizes showed that a block size of 1 MByte gives optimal performance. The maximum block size seems to be 4 MBytes.
tar cf - Wieth | mbuffer -m 4G -P80 -b 1M > /dev/nst0
Oh, and it helps to defragment the directory to be backed up first! We need this because our shared directory is heavily fragmented after a long period at 95% used space. This is a known problem for any file system that is nearly full (i.e. ~90%).
For ext4 use e4defrag and for XFS use xfs_fsr.
Autochanger Usage
Before unloading a tape we need to take the tape drive offline:
mt -f /dev/st0 offline
Unload the drive and insert the tape in slot 1:
mtx -f /dev/sg29 unload 1
Load the drive from slot 2:
mtx -f /dev/sg29 load 2
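The three steps can be wrapped in a small helper. This is only a sketch using the device nodes from this system; the MT/MTX variables are overridable so the logic can be dry-run without touching the hardware:

```shell
# Swap tapes: eject the loaded tape into one slot, then load from another.
# MT/MTX default to the real tools; set MT=echo MTX=echo for a dry run.
MT=${MT:-mt}
MTX=${MTX:-mtx}
swap_tape() {
  local unload_slot=$1 load_slot=$2
  "$MT"  -f /dev/st0  offline               || return 1  # rewind and eject
  "$MTX" -f /dev/sg29 unload "$unload_slot" || return 1  # stow tape in slot
  "$MTX" -f /dev/sg29 load   "$load_slot"                # load the next tape
}
```

For example, `swap_tape 1 2` stows the current tape in slot 1 and loads the tape from slot 2.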
Script for manual backup of directories
This is the current backup script:
#!/bin/bash
# nst0 compression=0
# nst0l compression=1
TAPE_DEVICE=/dev/nst0l
MBUF_BLKSIZE=512k
MBUF_BUFSIZE=4G # mbuffer buffer size (-m)
MBUF_MINPERC=90 # mbuffer start writing if buffer is MBUF_BUFSIZE % full (-P)
READY=ONLINE # string from mt status describing device is ready
confirm () {
read -r -p "${1:-Are you sure? [y/N]} " response
case $response in
[yY][eE][sS]|[yY])
true
;;
*)
false
;;
esac
}
echo "# Date: $(date +"%F %T")" >> backup.log
echo "Will backup files from dirs: $@"
echo "to tape device: $TAPE_DEVICE"
du -s "$@" | awk '{ sum+=$1} END {print "Total backup size: " sum/1024**2 " GBytes"}'
! confirm && exit 0
echo "Rewind tape?"
confirm && mt -f $TAPE_DEVICE rewind && echo "# Rewound tape" >> backup.log
# write each dir as a single file.
# Use fsf or asf with mt command to advance to files if you want to extract them
echo -e "# Parameters: \n# blksize $MBUF_BLKSIZE\n# tape $TAPE_DEVICE" >> backup.log
echo "# $(mtx -f /dev/sg1 status|grep Transfer |grep -o -e 'VolumeTag.*L6')" >> backup.log
echo "# directory startblock endblock" >> backup.log
for DIR in "$@"; do
START_BLOCK=$(mt -f $TAPE_DEVICE tell | grep -o -e "[0-9]*")
echo "Directory: $DIR (starting at block $START_BLOCK)"
#find $DIR -depth -print0 \
#| afio -o -1 -0 -b 10k -\
sleep 30
tar cf - $DIR \
| mbuffer -m ${MBUF_BUFSIZE} -P${MBUF_MINPERC} -s ${MBUF_BLKSIZE} -o $TAPE_DEVICE
RET=$?
if [ $RET -ne 0 ]; then
echo "ERROR: an error occurred while writing $DIR (RETCODE=$RET)"
exit 1
fi
END_BLOCK=$(mt -f $TAPE_DEVICE tell | grep -o -e "[0-9]*")
python -c "print 'Wrote %.3f GBytes' % (($END_BLOCK - $START_BLOCK)/1024./2)"
echo "Last block $END_BLOCK"
echo "$DIR $START_BLOCK $((END_BLOCK - 1))" >> backup.log
done
The lines
find $DIR -depth -print0 \
| afio -o -1 -0 -b 1m -\
are not working properly for some reason. When looping through the directories I encounter broken pipe errors from afio. Using tar I do not see this behavior. (Actually I did see the same behavior; the sleep 30 seems to fix it. This will be investigated further.)
- I did not specify a bigger block size (-b, --blocking-factor) in the tar command because the drive is fed through mbuffer. A block size of 512k gives, despite the earlier tests, a sustained uncompressed write speed of 160 MBytes/s to tape.
- The smaller block size, together with the sleep 30 between each directory, fixed the broken pipe errors.
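The GByte figures the script prints follow directly from the block counter: mt tell counts blocks of the mbuffer block size, so with 512 KiB blocks the amount written between two marks is (end − start)/2048 GiB. A quick sanity check, using as an example the start and end blocks of the first archive in the log further down:

```shell
# mt tell counts 512 KiB blocks here, so: GBytes = (end - start) / 2048
start=553721   # example start block (taken from the backup log)
end=843281     # example block position after writing the archive
awk -v s="$start" -v e="$end" 'BEGIN { printf "Wrote %.3f GBytes\n", (e - s) / 2048 }'
```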
Reading the archive back with mbuffer
mbuffer -i /dev/nst0l -s 1M | tar tf -
Sometimes it cannot read the tape, despite the correct block size being used. Doubling the block size in such a case makes the archive readable again … this only seems to happen with very small archives of about 1-8 MByte. Bigger ones work as advertised.
Here is some typical output of the backup script:
Directory: svenja (starting at block 553721)
in @ 0.0 kB/s, out @ 0.0 kB/s, 141 GB total, buffer 0% full
summary: 141 GByte in 22 min 28.9 sec - average of 107 MB/s, 5x empty
Wrote 141.387 GBytes
Last block 843281
Directory: timothy (starting at block 843281)
in @ 0.0 kB/s, out @ 0.0 kB/s, 161 GB total, buffer 0% full
summary: 161 GByte in 17 min 45.2 sec - average of 155 MB/s
Wrote 161.296 GBytes
Last block 1173615
Directory: robin (starting at block 1173615)
in @ 0.0 kB/s, out @ 0.0 kB/s, 168 GB total, buffer 0% full
summary: 168 GByte in 18 min 48.2 sec - average of 152 MB/s
Wrote 167.808 GBytes
Last block 1517285
Directory: dominic (starting at block 1517285)
in @ 0.0 kB/s, out @ 0.0 kB/s, 270 GB total, buffer 0% full
summary: 270 GByte in 30 min 14.6 sec - average of 153 MB/s
Wrote 270.467 GBytes
Last block 2071201
Directory: tatjana (starting at block 2071201)
in @ 151 MB/s, out @ 151 MB/s, 323 GB total, buffer 100% full
As you can see in the output, the buffer ran empty 5 times while backing up the svenja folder. The reason is that this folder has some subfolders with tens of thousands of very small files inside.
Compression can lead to truly fantastic transfer rates (if the data is compressible):
in @ 323 MB/s, out @ 445 MB/s, 127.1 GB total, buffer 95% full
Bacula integration
The tape library will also be used with Bacula, so a test with it is necessary. This is very nicely documented; one has to use the btape command.
First, we need to configure the storage daemon. This is part of a working configuration file of the Bacula storage daemon (/etc/bacula/bacula-sd.conf):
# Our Tape library
Autochanger {
Name = SuperLoader3 #
Device = LTO-6
Changer Command = "/etc/bacula/scripts/mtx-changer %c %o %S %a %d"
Changer Device = /dev/sg28
#
# Enable the Alert command only if you have the mtx package loaded
# Note, apparently on some systems, tapeinfo resets the SCSI controller
# thus if you turn this on, make sure it does not reset your SCSI
# controller. I have never had any problems, and smartctl does
# not seem to cause such problems.
#
# Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'"
# If you have smartctl, enable this, it has more info than tapeinfo
# Alert Command = "sh -c 'smartctl -H -l error %c'"
}
Device {
Name = LTO-6
Media Type = LTO-6
RandomAccess = no;
Archive Device = /dev/nst0l # Normal archive device
Autochanger = yes
LabelMedia = no;
AutomaticMount = yes;
AlwaysOpen = yes;
Minimum block size = 524288
Maximum block size = 524288
Maximum file size = 64g
Maximum spool size = 256g #
Spool Directory = /tank/bacula/spool
}
Messages {
Name = Standard
director = nas2-dir = all
}
Note: if the Bacula storage daemon is running, one cannot access the tape drive! For manual backups, either stop the bacula-sd daemon or take the drive offline in bconsole (unmount). Don't forget to mount the drive again, or Bacula will not be able to use the device!
In order to test the tape drive and library, you have to stop the storage daemon first (service bacula-sd stop) and then run btape -c /etc/bacula/bacula-sd.conf /dev/nst0. This will give a prompt similar to bconsole.
The first command to run is test, which performs some checks of the tape library. If this test fails, you cannot use Bacula with this tape drive.
Another test you can do with btape is speed, which measures the write speed of your drive for several block sizes.
If the NAS cannot keep up with the tape drive's data rate, you will get terrible backup performance. To prevent shoe-shining, we stage the backup data on a fast disk first and then stream it to tape. Bacula calls this "spooling", and it is done automatically if configured properly. The drawback is that overall you only get about 50% of the rated tape performance because, unlike with tar and mbuffer, new data is only staged once the previous data has been completely written to tape. It keeps the tape drive healthy, though.
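The 50% figure follows from the assumption that staging and despooling never overlap: the total time is the sum of the staging time and the tape-write time, so with equal disk and tape rates the effective rate is exactly half:

```shell
# Effective throughput when spool-fill and tape-write run strictly one
# after the other: rate = size / (size/disk + size/tape)
awk 'BEGIN {
  disk = 160; tape = 160                 # MB/s, assumed equal rates
  printf "%.0f MB/s effective\n", 1 / (1/disk + 1/tape)
}'
```

With a faster spool disk the loss shrinks accordingly, which is why the spool directory lives on fast storage.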
Bacula says there are 4,903,542 512k blocks on one tape.
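That block count is consistent with the native LTO-6 capacity:

```shell
# 4,903,542 blocks x 512 KiB/block, expressed in decimal terabytes
awk 'BEGIN { printf "%.2f TB\n", 4903542 * 524288 / 1e12 }'
```

This prints about 2.57 TB, i.e. slightly above the nominal 2.5 TB native capacity of an LTO-6 cartridge.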
Configuration of Bacula
# Incremental Pool definition
Pool {
Name = DataInc
Storage = LTO6
Pool Type = Backup
Recycle = yes # Bacula can automatically recycle Volumes
AutoPrune = yes # Prune expired volumes
Volume Retention = 3 Weeks # 3 Weeks
Maximum Volume Bytes = 0 # Limit Volume size to something reasonable: 2.5e12/1024**4*1e3 = 2273.73 or 0 to use full tape
Maximum Volumes = 2 # Limit number of Volumes in Pool: 2 * 2.5 TB = 5 TB
Cleaning Prefix = "CLN"
}
We use three pools for Full, Differential, and Incremental backup jobs for the main data backup. An extra pool is kept for additional servers (mail, share, database, etc.).