Command Line Options
-c <filename>
path to the configuration file
-t
test mode; report what would be done, but don't actually do anything
-f
force backup even if it has already been done today (use with caution; this may overwrite the current diff and thereby lose data)
-d
print debug messages
Configuration Options
First, you'll need some general configuration statements:
DiffDir
a directory where tar can store its diff files (necessary for incremental backups).
Bucket
the destination bucket on S3.
GpgRecipient
the address of the public key to use for encryption. The gpg keyring of the user you're running the script as (i.e., root, for a systemwide cron job) must contain a matching key. If this option is not specified, no encryption is performed.
GpgKeyring
path to the keyring file containing the public key for the GpgRecipient. Defaults to ~/.gnupg/pubring.gpg.
S3KeyFile
the file containing your AWS authentication keys.
ChunkSize
the size of the chunks to be stored on S3, in bytes.
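To make the meaning of ChunkSize concrete, here is a minimal Python sketch of cutting a backup stream into pieces of at most ChunkSize bytes. It is an illustration only, not the script's own code: the function name `iter_chunks` and the 10-byte chunk size are hypothetical.

```python
import io
from typing import BinaryIO, Iterator


def iter_chunks(stream: BinaryIO, chunk_size: int) -> Iterator[bytes]:
    """Yield successive pieces of `stream`, each at most `chunk_size` bytes."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive number of bytes")
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty read means end of stream
            break
        yield chunk


if __name__ == "__main__":
    # 25 bytes of data cut into chunks of 10 bytes; the last chunk is shorter.
    data = io.BytesIO(b"x" * 25)
    print([len(c) for c in iter_chunks(data, 10)])
```

Each yielded piece would then be stored as a separate object on S3; how the pieces are named is not covered here.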
Then you can specify as many directories, databases, and repositories as you like to be backed up. These may be contained in <Cycle> blocks, for the sake of reusing timing configuration, or may be blocks themselves with individual timings.
<Cycle name>
<name>: a unique identifier for the cycle. This is not used except to establish the uniqueness of each block.
Frequency
how often a backup should be made at all, in days.
Phase
Allows adjusting the day on which the backup is made, with respect to the frequency. Can take values 0 <= Phase < Frequency; defaults to 0. This can be useful, for instance, if you want to alternate daily backups between two backup sets. This can be accomplished by creating two nearly identical backup specifications, both with Frequency 2, but where one has a Phase of 0 and the other has a Phase of 1.
Diffs
tells how many incremental backups to make between full backups. E.g., if you want daily diffs and weekly fulls, set this to 6.
Fulls
tells how many total cycles to keep.
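The interplay of Frequency, Phase, and Diffs can be sketched in Python. This is one plausible reading of the descriptions above, not the script's actual logic: the source does not say which day counter is used, so `day_index` (for example, days since some fixed date), `run_index`, and both function names are assumptions made for illustration.

```python
def is_backup_day(day_index: int, frequency: int, phase: int = 0) -> bool:
    """Return True if a backup is due on the given day (assumed day counter)."""
    if frequency < 1:
        raise ValueError("Frequency must be at least 1 day")
    if not 0 <= phase < frequency:
        raise ValueError("Phase must satisfy 0 <= Phase < Frequency")
    return day_index % frequency == phase


def backup_kind(run_index: int, diffs: int = 0) -> str:
    """Return 'full' or 'diff' for the n-th backup run of a cycle.

    Assumes each full backup is followed by `diffs` incremental ones.
    """
    if diffs < 0:
        raise ValueError("Diffs must not be negative")
    return "full" if run_index % (diffs + 1) == 0 else "diff"


if __name__ == "__main__":
    # Two backup sets with Frequency 2 and Phases 0 and 1 alternate days.
    for day in range(4):
        print(day, is_backup_day(day, 2, 0), is_backup_day(day, 2, 1))
    # Frequency 1 with Diffs 6: one full per week, diffs on the other six days.
    print([backup_kind(run, 6) for run in range(8)])
```

Under these assumptions, exactly one of the two alternating sets runs on any given day, and setting Diffs to 0 makes every run a full backup.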
Directory <name> or <Directory name>
<name>: a directory to be backed up. May appear as a property within a cycle block, or as a block in its own right, e.g. <Directory /some/path>. The latter case is just a shorthand for a cycle block containing a single Directory property.
MySQL <databasename> or <MySQL databasename>
For this to work, the user you're running the script as must be able to mysqldump the requested databases without entering a password. This can be accomplished with a .my.cnf file in that user's home directory. <databasename> names a single database to be backed up, or "all" to dump all databases. The Diffs property is ignored, since MySQL dumps are always "full".
Subversion <repository> or <Subversion repository>
For this to work, the user you're running the script as must have permission to svnadmin dump the requested repository. <repository> names a single svn repository to be backed up. Incremental backups are handled by storing the latest backed-up revision number in a file under DiffDir. As elsewhere, setting Diffs to 0 (or just leaving it out) results in a full dump every time. (Thanks to Kevin Ross for adding the incremental behavior here.)
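The revision bookkeeping described above could look roughly like the following Python sketch. It is not the script's implementation: the state-file handling, the function names, and the example repository path are hypothetical. Only the `svnadmin dump` options `-r LOWER:UPPER` and `--incremental` are real svnadmin features.

```python
from pathlib import Path
from typing import List, Optional


def read_last_revision(state_file: Path) -> Optional[int]:
    """Return the last backed-up revision, or None if no usable state exists."""
    try:
        return int(state_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        return None


def dump_command(repo: str, youngest: int, last: Optional[int]) -> Optional[List[str]]:
    """Build an svnadmin command line, or return None if nothing is new."""
    if last is None:
        # No previous state: dump everything (a "full" backup).
        return ["svnadmin", "dump", repo, "-r", f"0:{youngest}"]
    if last >= youngest:
        return None  # no revisions committed since the last backup
    # Only the revisions added since the last backup.
    return ["svnadmin", "dump", repo, "--incremental",
            "-r", f"{last + 1}:{youngest}"]


if __name__ == "__main__":
    # Hypothetical repository at revision 42, last backed up at revision 40.
    print(dump_command("/var/svn/example", 42, 40))
```

After a successful upload, the state file would be rewritten with the newest revision so that the next run starts from there.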
SubversionDir <repository-dir> or <SubversionDir repository-dir>
<repository-dir>: a directory containing multiple Subversion repositories, all of which should be backed up. (This feature was inspired by http://www.hlynes.com/2006/10/01/backups-part-2-subversion.)
UseTempFile
Causes the data being backed up to be dumped to a local file before being streamed to S3. This is most useful in a MySQL block, because the slow upload speed to S3 can cause mysqldump to time out when dumping large tables. Letting mysqldump write to a temp file before uploading avoids this problem. An alternative solution is to set long mysqld timeouts in my.cnf:
net_read_timeout=3600
net_write_timeout=3600
That may be the right solution for some circumstances, e.g. if the databases are larger than the available scratch disk. The UseTempFile configuration will work for regular filesystem backups and Subversion backups as well, at the cost of (temporary) disk space and more disk activity.
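The idea behind UseTempFile, decoupling the dump from the slow upload, can be sketched in Python as follows. This is a generic illustration under assumptions, not the script's code: `dump_via_temp_file` and the `upload` callback are hypothetical names, and a trivial Python child process stands in for mysqldump.

```python
import subprocess
import sys
import tempfile
from typing import BinaryIO, Callable, List


def dump_via_temp_file(dump_cmd: List[str],
                       upload: Callable[[BinaryIO], None]) -> None:
    """Run `dump_cmd` to completion into a temp file, then hand it to `upload`."""
    with tempfile.TemporaryFile() as spool:
        # The dump runs at local disk speed, so it cannot time out on the network.
        subprocess.run(dump_cmd, stdout=spool, check=True)
        spool.seek(0)  # rewind before reading the dump back
        upload(spool)  # the slow upload happens only after the dump has finished
    # The temp file is removed automatically when the block exits.


if __name__ == "__main__":
    received = []
    # Stand-in for a real dump command such as mysqldump.
    cmd = [sys.executable, "-c", "print('dump data')"]
    dump_via_temp_file(cmd, lambda f: received.append(f.read()))
    print(received)
```

The trade-off matches the one noted above: the whole dump must fit on local scratch disk for the duration of the upload.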
