If you run a self-hosted WordPress blog, it’s probably a good idea to back it up now and again. Here I present some code that I use to back up my blog to Amazon’s S3 service, which might help you to do the same.
Database
The first thing to do is to back up the MySQL database which underlies WordPress. Hopefully, you’ve got a dedicated MySQL DB for this purpose; if you do, backing it up is simply a matter of using mysqldump:
mysqldump -u ::USERNAME:: -p ::DATABASE_NAME:: > ::OUTPUT_FILENAME::
For instance, if your blog DB is named “blog_db”, and you’ve got a DB admin user named “admin”, the command line might look like this:
mysqldump -u admin -p blog_db > blog_db_bak.sql
(The mysqldump program will prompt you for your password.)
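One caveat: that password prompt is fine interactively, but not if you ever want the backup to run unattended (from cron, say). In that case, you can point mysqldump at a MySQL option file holding the credentials instead. A sketch, assuming a file named my_backup.cnf in your home directory:
# my_backup.cnf -- keep this readable only by you (chmod 600)
[mysqldump]
user=::USERNAME::
password=::PASSWORD::
Then drop the -p flag and pass the file instead (--defaults-extra-file has to be the first option on the command line):
mysqldump --defaults-extra-file=$HOME/my_backup.cnf ::DATABASE_NAME:: > ::OUTPUT_FILENAME::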
WordPress Data
In principle, I believe that the only parts of a WordPress installation that you need to back up are the wp-config.php file and the wp-content directory tree. I’m not certain, however, and since bandwidth, storage, and CPU are all so cheap, let’s just back up the entire installation:
tar -cf ::ARCHIVE_FILENAME:: -C ::BLOG_PARENT_DIRECTORY:: ::BLOG_DIRECTORY_NAME::
For instance, if your blog is installed in /var/www/html/blog, the command line might look like this:
tar -cf blog_20090327.tar -C /var/www/html/ blog
(Structuring the tar command in this way roots all pathnames at blog, instead of, say, /.)
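If you want to check that the paths came out rooted the way you expect, list the archive’s contents:
tar -tf blog_20090327.tar | head
Every entry should begin with blog/.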
Composing the Archive
Next, you’ll want to fold the MySQL backup you made above into the archive you just created, and then compress the resulting file. (The compression has to come last: tar can’t append to an archive once it’s been gzipped.)
tar -rf blog_20090327.tar blog_db_bak.sql
gzip blog_20090327.tar
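To double-check that both the blog tree and the database dump made it in, you can list the compressed archive directly:
tar -tzf blog_20090327.tar.gz
You should see the blog/ tree plus blog_db_bak.sql at the top level.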
Posting to S3
I like to use the Python S3 wrappers provided by Amazon. Assuming that you have their S3.py file on your import path, the following script will upload a specified file to a specified bucket. (The filename will be used as the key.)
import os
import sys

import S3

def make_connection(fn):
    # The credentials file contains the AWS access key ID and secret
    # access key, separated by a newline.
    return S3.AWSAuthConnection(*file(fn, 'rb').read().strip().split('\n'))

def guarantee_bucket(c, name):
    # Succeed if the bucket already exists, or if we can create it.
    return (c.check_bucket_exists(name).status == 200) or \
           (c.create_bucket(name).http_response.status == 200)

def has_item(c, bucket, key):
    return bool([e for e in c.list_bucket(bucket).entries if e.key == key])

def put_item(c, bucket, fn):
    # The bare filename (sans directory) serves as the key.
    return c.put(bucket, os.path.split(fn)[1], file(fn, 'rb').read()).http_response.status == 200

if __name__ == '__main__':
    if len(sys.argv) != 4:
        print 'Usage: python %s s3_credentials_filename bucket target_filename' % __file__
        sys.exit(4)
    bucket = sys.argv[2]
    fn = sys.argv[3]
    name = os.path.split(fn)[1]
    c = make_connection(sys.argv[1])
    if not guarantee_bucket(c, bucket):
        print 'Could not find or create bucket "%s"' % bucket
        sys.exit(1)
    if has_item(c, bucket, name):
        print 'Item "%s" already exists in bucket "%s"' % (name, bucket)
        sys.exit(2)
    if not put_item(c, bucket, fn):
        print 'Could not write item "%s" to bucket "%s"' % (name, bucket)
        sys.exit(3)
For example, if the preceding script is named “s3_post.py”, you’d invoke it like this (continuing the earlier example):
python s3_post.py aws.ak blog_backups blog_20090327.tar.gz
A few things to note:
- The script takes as its first argument the name of a file containing your AWS access key ID and secret access key, separated by a newline.
- If the specified bucket doesn’t exist, it will be created.
- If the specified bucket already contains the specified key, an error will be reported.
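For reference, the credentials file (aws.ak in the example above) would look something like this; these are the placeholder keys from Amazon’s documentation, not real credentials:
AKIAIOSFODNN7EXAMPLE
wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY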
Putting It Together
Here’s a script that wraps up all the steps we’ve just seen:
#!/bin/sh

# 'Validate' arguments
if [ $# -ne 6 ]
then
    echo "Usage: $(basename $0) mysql_user mysql_db s3_credentials s3_bucket archive_name blog_directory"
    exit 1
fi

# Pathnames
cwd=$(dirname $0)
wrk=${cwd}/working

# Delete any pre-existing working directory, and create a fresh one
rm -f -R $wrk
mkdir $wrk

# Back up the blog directory to a TAR file
tar -cf ${wrk}/${5} -C $(dirname $6) $(basename $6)

# Back up the DB (will prompt for password), and fold the dump into the archive
mysqldump -u $1 -p $2 > ${wrk}/${2}_bak.sql
tar -rf ${wrk}/${5} -C $wrk ${2}_bak.sql

# Zip up the archive
gzip ${wrk}/${5}

# Post to S3 (assumes s3_post.py lives alongside this script)
python ${cwd}/s3_post.py $3 $4 ${wrk}/${5}.gz

# Clean up
rm -f -R $wrk
This script might be invoked as follows:
./backup admin blog_db aws.ak blog_backups blog_20090327.tar /var/www/html/blog
Naturally, in production, you’d want to hard-code some of those arguments.
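For instance, if you wanted to run the backup nightly from cron, an entry along these lines would datestamp each archive. The paths here are hypothetical, you’d need the option-file trick above to avoid mysqldump’s password prompt, and note that % must be escaped in crontab entries:
15 3 * * * /home/me/backup admin blog_db /home/me/aws.ak blog_backups blog_$(date +\%Y\%m\%d).tar /var/www/html/blog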