As is well known, there are two types of people:
1. People who do backups
2. People who will start doing backups
Let us guess: you still have some project holding your data, or even the data of your application's users, without a backup? Here is why you probably don't do it:
- You might think it will cost you a lot of extra money
- You don't want to spend time on it
This post provides a ready-to-use solution that minimizes both problems in one shot.
We are talking about S3 Glacier, a service within Amazon Web Services. According to Glacier pricing, it costs only $0.004 per GB per month in US regions. If your gzipped database takes 100 MB and you create a backup every week, after a year you will have collected ~5.2 GB of data, which will cost you only about $0.02 per month. Is that a good price for peace of mind?
At this price you will never want to delete backups: it is cheaper to just leave them there forever. Glacier provides 99.999999999% durability of data, which means you can stay calm: your data will be there. Amazon does not publish what hardware Glacier runs on; it is commonly assumed to be extra-cheap media such as magnetic tapes or optical disks, served by robots and kept in special storerooms, which would explain why data retrieval may take up to several hours.
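The storage estimate above is easy to redo for your own numbers. Below is a minimal sketch, assuming the $0.004 per GB per month price quoted in this post and 1 GB = 1000 MB; the function name `storage_cost_usd` is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: estimate the monthly Glacier storage bill in USD.
# Arguments: size of one backup in MB, number of stored backups,
# and optionally the price per GB per month (defaults to 0.004, as quoted in the post).
storage_cost_usd() {
  local size_mb="$1" count="$2" price_per_gb="${3:-0.004}"
  LC_ALL=C awk -v s="$size_mb" -v n="$count" -v p="$price_per_gb" \
    'BEGIN { printf "%.4f\n", s * n / 1000 * p }'
}

# One year of weekly 100 MB backups: 52 archives, 5.2 GB in total.
storage_cost_usd 100 52   # prints 0.0208
```

Pass a third argument if the price in your region differs from the one used here.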
So now you know about Glacier, but how do you start using it while spending minimal time?
The simplest way to use Glacier for backups of any data:
2. Download the glacieruploader binary. You can do it with
3. Create a Glacier vault in the AWS Console. I intentionally selected the us-east-1 region in AWS: even if you are far away from the US, backup speed is not important, and Glacier prices are lower in the US.
4. Create a bash script e.g. with
#!/bin/bash
cd "$(dirname "$0")"

# enter any filename you want
filename=docker_volume_$(date +"%d_%m_%Y").tar.gz
echo "Doing backup $filename" >> /tmp/backup.log

# you can select a folder for backup here
GZIP=-9 tar zcf "$filename" /var/lib/docker/volumes/stackname_db-data/_data

# copy ACCESS_KEY_ID and SECRET_ACCESS_KEY pair from your newly created user
export AWS_ACCESS_KEY_ID=XXXXXXXXXX
export AWS_SECRET_ACCESS_KEY=YYYYYYYYYYYYYY

# make sure you have the same region in endpoint URL, as in your vault
java -jar glacieruploader-impl-0.1.1-jar-with-dependencies.jar --endpoint https://glacier.us-east-1.amazonaws.com --vault backups --upload "$filename"
echo "Doing backup $filename done" >> /tmp/backup.log

Make the file executable:
chmod +x /home/user/doBackup.sh
5. Use a scheduler, e.g. cron, to run the backup script periodically. For example, to execute it at 05:00 every Monday, edit the crontab file with the crontab -e command and add:
0 5 * * 1 /home/user/doBackup.sh
Then save the file.
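The backup script in step 4 uploads whatever tar produced, even if the archive came out empty or truncated (for example because the disk was full). One possible safeguard is to test the archive before uploading it. This is only a sketch: `verify_archive` is a hypothetical helper name, and it relies on the standard `gzip -t` integrity test. The demo runs on a throwaway archive, so it does not touch real data.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the file exists, is non-empty,
# and passes gzip's built-in integrity test.
verify_archive() {
  local archive="$1"
  [ -s "$archive" ] && gzip -t "$archive" 2>/dev/null
}

# Demo on a temporary archive.
demo_dir=$(mktemp -d)
echo "sample state" > "$demo_dir/data.txt"
tar zcf "$demo_dir/demo.tar.gz" -C "$demo_dir" data.txt

if verify_archive "$demo_dir/demo.tar.gz"; then
  echo "archive ok, safe to upload"
else
  echo "archive broken, skip upload"
fi
rm -rf "$demo_dir"
```

In the real script the same check could wrap the upload command, writing a message to /tmp/backup.log when the archive fails the test.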
To recover backups, use a utility such as fastglacier.
The recovery price for Standard retrieval is ~$0.01 per GB, so restoring one 100 MB backup will cost you about $0.001. Amazon also charges for the requests that fastglacier makes to Glacier before downloading; requests cost $0.05 per 1,000, so this will also be a very small charge.
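The restore cost can be estimated the same way. This is a rough sketch that uses only the two prices quoted above ($0.01 per GB retrieved and $0.05 per 1,000 requests) and again assumes 1 GB = 1000 MB; the function name `retrieval_cost_usd` and the request count of 10 are illustrative assumptions, not measured values.

```shell
#!/bin/bash
# Hypothetical helper: estimate the cost in USD of restoring one archive.
# Arguments: archive size in MB, number of API requests made (defaults to 1).
retrieval_cost_usd() {
  local size_mb="$1" requests="${2:-1}"
  LC_ALL=C awk -v s="$size_mb" -v r="$requests" \
    'BEGIN { printf "%.5f\n", s / 1000 * 0.01 + r / 1000 * 0.05 }'
}

# One 100 MB backup plus an assumed 10 requests.
retrieval_cost_usd 100 10   # prints 0.00150
```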
Some recommendations to minimize backup expenses and time:
- When you develop your application, keep the application state separate and in one place (one or several folders). For example, in Docker use volumes and back up only the volumes. The whole codebase (the stateless part) should be stored in a repository, not backed up.
- Use the maximum GZIP compression level. In the script it is -9, which is the strongest compression.
- Calculate the size of your data and the costs, and select the backup period accordingly.
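To see how much the compression level matters for your own data, you can compress a sample at two levels and compare the sizes. The sketch below is illustrative only: `gzip_size` is a hypothetical helper, and the generated log-like text stands in for a real database dump, so the actual savings on your data may differ.

```shell
#!/bin/bash
# Hypothetical helper: print the size in bytes of a file compressed at the given gzip level.
gzip_size() {
  local level="$1" file="$2"
  gzip -c "-$level" "$file" | wc -c | tr -d ' '
}

# Generate repetitive sample text as a stand-in for real application data.
sample=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 20000; i++) printf "user %d logged in from 10.0.0.%d\n", i, i % 255 }' > "$sample"

echo "original: $(wc -c < "$sample" | tr -d ' ') bytes"
echo "level 1:  $(gzip_size 1 "$sample") bytes"
echo "level 9:  $(gzip_size 9 "$sample") bytes"
rm -f "$sample"
```

Replace the generated sample with a copy of your real data to get numbers you can plug into the cost estimate.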