Avoid Disaster: Script backups easily to Amazon S3

Avoid disaster

I released avoid_disaster today. It lets you do the following:
  • script backups easily via Python and upload them to Amazon S3
  • easily create daily, weekly or monthly backups
  • use Amazon S3 as a cheap backup option, with your backups stored in 3 different data centers

The script is super simple, but it should be quite useful for anyone who wants to create cheap backups.

You are also welcome to fork the code on GitHub and supply extensions and patches :-)

Example usage

First install boto and avoid_disaster:

$ sudo easy_install boto
$ sudo easy_install avoid_disaster

Here is some example code that can get you started:

import os
from avoid_disaster import S3Uploader, gunzip_dir, generate_file_key

#--- Globals ----------------------------------------------
AWS_KEY = 'YOUR AWS KEY'
AWS_SECRET = 'YOUR AWS SECRET'

s3_uploader = S3Uploader(AWS_KEY,
                         AWS_SECRET,
                         'backups.your_domain.com')

#--- Easy usage ----------------------------------------------
#Daily
s3_uploader.compress_and_upload('test_dir/',
                                'test_dir.%(weekday)s.tgz',
                                replace_old=True)

#Monthly
s3_uploader.compress_and_upload('test_dir/',
                                'test_dir.%(month_name)s.tgz',
                                replace_old=True)

#Weekly
s3_uploader.compress_and_upload('test_dir/',
                                'test_dir.%(week_number)s.tgz',
                                replace_old=True)


#--- Generic usage ----------------------------------------------
file_key = generate_file_key('test_dir.%(weekday)s.tgz')
gz_filename = gunzip_dir('test_dir/', file_key)
s3_uploader.upload(file_key, gz_filename, replace_old=True)
os.remove(gz_filename)
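
The generic steps above (build a key from a template, pack the directory, then upload and clean up) can be sketched with the standard library alone. This is a rough illustration, not the actual avoid_disaster source: expand_key and tar_gzip_dir are hypothetical stand-ins for generate_file_key and gunzip_dir, and the exact values substituted for weekday, month_name and week_number are assumptions.

```python
import datetime
import os
import tarfile
import tempfile

# Assumed naming: English names and ISO week numbers (the library may differ).
WEEKDAYS = ('Monday', 'Tuesday', 'Wednesday', 'Thursday',
            'Friday', 'Saturday', 'Sunday')
MONTHS = ('January', 'February', 'March', 'April', 'May', 'June', 'July',
          'August', 'September', 'October', 'November', 'December')


def expand_key(template, when=None):
    """Fill a '%(name)s' style template from a date (hypothetical helper)."""
    when = when or datetime.date.today()
    values = {
        'weekday': WEEKDAYS[when.weekday()],     # date.weekday(): Monday is 0
        'month_name': MONTHS[when.month - 1],
        'week_number': when.isocalendar()[1],    # ISO week of the year
    }
    return template % values


def tar_gzip_dir(directory, file_key):
    """Pack `directory` into a gzip-compressed tarball and return its path."""
    archive_path = os.path.join(tempfile.gettempdir(), file_key)
    # Store entries under the directory's own name, not its full path.
    top_level = os.path.basename(os.path.normpath(directory))
    with tarfile.open(archive_path, 'w:gz') as archive:
        archive.add(directory, arcname=top_level)
    return archive_path
```

Because the key depends only on the date, running the same script every day keeps producing the same small set of names, which is what makes the rotation work when the upload overwrites the old file.
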
Announcements · Python · Stuff · Todoist · Wedoist 29. Jun
3 comments so far

Hi,

Thank you for sharing your code. I have been needing a backup solution for my servers, and I wrote some Python backup tools of my own a few days ago. Today I merged your avoid_disaster with my code. I made it much more flexible and easy to extend: besides uploading backups to S3, you can write your own storage class (e.g. SFTP), and you can write your own file handler to compress the files. What's more, I use only standard Python modules, so it should run on most operating systems (I have only tested it on Windows so far, but I will deploy it to my Linux server later). It supports not only static folders; MySQL and SVN can also be dumped. You can merge the changes back from this repo:

http://github.com/victorlin/av...

It became a little complex, but it is very powerful. Here is a simple example:

import logging 

import avoid_disaster
from avoid_disaster.targets import static, mysql
from avoid_disaster.handlers import zip
from avoid_disaster.storages import s3
from avoid_disaster.managers import base

logging.basicConfig(level=logging.INFO)

handler = zip.Zip()
storage = s3.S3Storage('your_access_key', 
                       'your_secret_key', 
                       'bucket_name')

backup = base.BackupManager()

target = static.Static('/path/to/dir_or_file')
backup.add('backup_dir.%(week_number)s.zip', target, handler, storage)

target = mysql.Mysql('user', 'password', 'database')
backup.add('backup_mysql.%(week_number)s.zip', target, handler, storage)

backup.run()

This file can be found in /example.py in the git repo.

As you can see, there are different modules. A Target is the file, database, SVN repo or similar thing to back up. A FileHandler compresses the target's dump files; I have only written the zip handler so far. The Storage is where the backup files go; for now I have only implemented S3 storage. Finally, the manager is the object that maintains a list of backup tasks and runs them. In short:

Target --dump--> Handler --process--> Storage --store--> done
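
To make that flow concrete, here is a minimal sketch of how the four pieces could fit together. The class and method names (StaticTarget, ZipHandler, LocalDirStorage, Manager, dump, process, store) are made up for illustration and are not the API of the repo; a local folder stands in for S3 so the sketch runs without credentials, and the target is assumed to be a directory.

```python
import os
import shutil
import tempfile
import zipfile


class StaticTarget:
    """Target: the thing to back up; here a plain directory."""
    def __init__(self, path):
        self.path = path

    def dump(self):
        # A directory needs no export step; a database target would
        # write a dump file here and return its path instead.
        return self.path


class ZipHandler:
    """Handler: turns the dumped directory into a single zip archive."""
    def process(self, dumped_dir, name):
        archive_path = os.path.join(tempfile.mkdtemp(), name)
        with zipfile.ZipFile(archive_path, 'w', zipfile.ZIP_DEFLATED) as archive:
            for root, _dirs, files in os.walk(dumped_dir):
                for filename in files:
                    full_path = os.path.join(root, filename)
                    archive.write(full_path,
                                  os.path.relpath(full_path, dumped_dir))
        return archive_path


class LocalDirStorage:
    """Storage: where archives end up; a folder stands in for S3 or SFTP."""
    def __init__(self, directory):
        self.directory = directory

    def store(self, archive_path):
        os.makedirs(self.directory, exist_ok=True)
        return shutil.copy(archive_path, self.directory)


class Manager:
    """Manager: keeps a list of backup tasks and runs each one in order."""
    def __init__(self):
        self.tasks = []

    def add(self, name, target, handler, storage):
        self.tasks.append((name, target, handler, storage))

    def run(self):
        # Target --dump--> Handler --process--> Storage --store--> done
        return [storage.store(handler.process(target.dump(), name))
                for name, target, handler, storage in self.tasks]
```

Swapping the destination then only means writing another class with a store method, which is the kind of extension point described above.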

I hope this could be helpful.

Thanks.

Victor:
Wow, that's a great extension! Thanks :)

Yes! S3 backup is perfect for disaster avoidance. However, I would instead run an EC2 instance with an rsync daemon, and then use rsync to back up. Rsync sends far less data over the wire, which makes the approach much more economical.

The main advantage (for me) of weekly/monthly/etc. tarballs is that you get coarse versioning for free. The downside is that it takes more data transfer and more storage space.
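
A quick way to see how coarse that versioning is, and how much storage it implies, is to count the distinct archive names each template produces over a year of daily runs. This is a rough sketch: distinct_keys is a hypothetical helper, the ISO week number is an assumption about how week_number is computed, and it assumes each upload overwrites the previous file with the same name.

```python
import datetime


def distinct_keys(template, year):
    """Count the different archive names a template yields in one year."""
    day = datetime.date(year, 1, 1)
    keys = set()
    while day.year == year:
        keys.add(template % {
            'weekday': day.strftime('%A'),         # weekday name
            'month_name': day.strftime('%B'),      # month name
            'week_number': day.isocalendar()[1],   # ISO week (assumed)
        })
        day += datetime.timedelta(days=1)
    return len(keys)


daily = distinct_keys('test_dir.%(weekday)s.tgz', 2009)
monthly = distinct_keys('test_dir.%(month_name)s.tgz', 2009)
weekly = distinct_keys('test_dir.%(week_number)s.tgz', 2009)
```

Under those assumptions the daily template keeps seven full copies and the monthly one twelve, so storage is bounded, but every run still transfers a complete tarball, which is the cost rsync avoids.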

© 2000-2009 amix. Powered by Skeletonz.