Amazon S3 offers several features such as object lifecycle and versioning, but there isn’t an out-of-the-box solution for cases where you just want to save an exact copy of objects that will never change (e.g. user-generated content, files that
This is for you if…
- Your objects never change, and you want to back up exact copies of them.
- Waiting a few hours to recover a backup is not a problem for you.
- You are familiar with S3, EC2 and the AWS CLI and comfortable using them.
This *might* not be for you if …
- Your files change over time (i.e. you need to keep several versions of them).
- You need a restore from backup to be quick.
- You would rather stay away from Unix commands and/or prefer a GUI (drag & drop backup/restore).
I’ve got about 2 TB of data in an Amazon S3 bucket that I’ve migrated from Rackspace (more on this in a future post) and I need to back it up… in case s**t goes wrong.
PART ONE – Our Backup Bucket
- We need to create a separate S3 bucket, where we will copy every new object in our main bucket. To follow convention in this tutorial, we’ll name it mybucket-backup, so we now have two S3 buckets: mybucket and mybucket-backup.
- The next step is to enable versioning and configure the lifecycle policy to use Glacier on the backup bucket:
- Go into the bucket and show its properties
- Click on Versioning, and then Enable Versioning
- Now go to the Lifecycle tab and click Add Rule; a dialog will open.
We want the following settings:
- Apply the rule to: Whole Bucket
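- Actions on objects: archive to Glacier 1 day after the object's creation date (the setting that the backup strategy described below relies on).
Untick everything else.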
- Rule Name: Backup -> Create
If we did everything right, our new rule should now be listed on the Lifecycle tab.
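If you prefer the command line, the console steps in this part can also be scripted with the AWS CLI. This is only a minimal sketch, not the method used in this post: the region, the function names and the `--apply` switch are assumptions made for illustration, and the lifecycle rule encodes the 1-day Glacier transition that the backup strategy in this post relies on.

```shell
#!/usr/bin/env bash
# Sketch: create the backup bucket, enable versioning, add a Glacier rule.
# BACKUP_BUCKET comes from the tutorial; REGION is an assumption.
BACKUP_BUCKET="mybucket-backup"
REGION="${REGION:-us-east-1}"

# Print the lifecycle configuration: one rule named "Backup" that applies to
# the whole bucket and moves objects to Glacier 1 day after creation.
lifecycle_json() {
  cat <<'EOF'
{
  "Rules": [
    {
      "ID": "Backup",
      "Status": "Enabled",
      "Filter": {"Prefix": ""},
      "Transitions": [
        {"Days": 1, "StorageClass": "GLACIER"}
      ]
    }
  ]
}
EOF
}

# Run the three AWS calls in order, stopping at the first failure.
setup_backup_bucket() {
  aws s3 mb "s3://${BACKUP_BUCKET}" --region "${REGION}" &&
  aws s3api put-bucket-versioning \
    --bucket "${BACKUP_BUCKET}" \
    --versioning-configuration Status=Enabled &&
  aws s3api put-bucket-lifecycle-configuration \
    --bucket "${BACKUP_BUCKET}" \
    --lifecycle-configuration "$(lifecycle_json)"
}

# Only touch AWS when explicitly asked to.
if [ "${1:-}" = "--apply" ]; then
  setup_backup_bucket
fi
```

Saved under a hypothetical name such as setup-backup-bucket.sh, it would be run as `bash setup-backup-bucket.sh --apply` from a machine where the AWS CLI already has credentials. Keeping the JSON in its own function makes it easy to review the rule before anything is applied.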
PART TWO – Our EC2 instance
Now we are going to create a background task on a Free Tier EC2 machine that will keep both mybucket and mybucket-backup synchronised.
- Spin up an EC2 instance from your console; a t2.micro will be enough.
- Connect to your instance through SSH.
- Edit your crontab (for example with crontab -e).
- Add the following line and save the file:
0 */2 * * * aws s3 sync s3://mybucket s3://mybucket-backup >/dev/null 2>&1
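The crontab line above throws away all output, so a failing sync goes unnoticed, and on a large bucket one run could still be in progress when the next one starts. A possible alternative, sketched here under assumptions, is to point cron at a small wrapper script instead. The script name, the log and lock paths and the `--run` switch are all hypothetical, and the instance is assumed to already have AWS credentials (for example through `aws configure` or an instance role).

```shell
#!/usr/bin/env bash
# s3-backup-sync.sh (hypothetical name): logged, non-overlapping sync.
SRC="s3://mybucket"
DST="s3://mybucket-backup"
LOG_FILE="${LOG_FILE:-$HOME/s3-backup-sync.log}"   # assumed location
LOCK_DIR="${LOCK_DIR:-/tmp/s3-backup-sync.lock}"   # assumed location

# Append a UTC-timestamped message to the log file.
log() {
  printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" >> "$LOG_FILE"
}

run_sync() {
  # mkdir is atomic, so it doubles as a simple lock. If the script is
  # killed mid-run, the directory has to be removed by hand.
  if ! mkdir "$LOCK_DIR" 2>/dev/null; then
    log "previous sync still running, skipping this run"
    return 0
  fi
  log "sync started"
  sync_status=0
  aws s3 sync "$SRC" "$DST" --only-show-errors >> "$LOG_FILE" 2>&1 \
    || sync_status=$?
  rmdir "$LOCK_DIR"
  if [ "$sync_status" -eq 0 ]; then
    log "sync finished"
  else
    log "sync failed with exit code $sync_status"
  fi
  return "$sync_status"
}

# Only start a sync when explicitly asked to.
if [ "${1:-}" = "--run" ]; then
  run_sync
fi
```

With the script made executable, the crontab entry would become something like `0 */2 * * * /home/ec2-user/s3-backup-sync.sh --run`, where the path is an assumption based on the default user of Amazon Linux.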
How it works – Our backup strategy
Every object we add to mybucket will be copied to mybucket-backup every 2 hours. Because aws s3 sync is run without --delete, nothing is ever removed from the backup.
Meanwhile, every file on mybucket-backup that’s older than 1 day will be moved to Amazon Glacier, which lowers the cost of the backup storage.
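Since the backed-up objects end up in Glacier, they cannot be downloaded straight away: a restore has to be requested first, and that is where the wait of a few hours comes from. The sketch below shows one way this might look with the AWS CLI. The function names, the 7-day availability window and the Standard retrieval tier are assumptions chosen for illustration, not settings from this post.

```shell
#!/usr/bin/env bash
# Sketch: ask S3 to restore one archived object and check on its progress.
BACKUP_BUCKET="mybucket-backup"

# Build the restore request body.
# $1 = days the restored copy stays available, $2 = retrieval tier.
restore_request() {
  printf '{"Days": %s, "GlacierJobParameters": {"Tier": "%s"}}' "$1" "$2"
}

# $1 = object key, $2 = days (default 7), $3 = tier (default Standard).
request_restore() {
  aws s3api restore-object \
    --bucket "$BACKUP_BUCKET" \
    --key "$1" \
    --restore-request "$(restore_request "${2:-7}" "${3:-Standard}")"
}

# Print the Restore field of the object. It reads ongoing-request="true"
# while the restore is in progress and ongoing-request="false" once done.
restore_status() {
  aws s3api head-object \
    --bucket "$BACKUP_BUCKET" \
    --key "$1" \
    --query Restore \
    --output text
}
```

For example, `request_restore uploads/photo.jpg` (a made-up key) would start the restore and `restore_status uploads/photo.jpg` would report on it. Once the status shows `ongoing-request="false"`, the object can be read again for the requested number of days.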
Feel free to reach out to me if you have any questions or problems!