Back Up Your EC2 Amazon Linux WordPress Blog to S3

So I finally decided to run my own Linux server and utilize the AWS free tier for a year.

It was a great learning experience, and I wanted to share the most difficult part of the process: backing up my new blog to S3, automatically of course.

I had just finished configuring my server how I wanted. I followed these great guides I found on the net to get me up and running.

After I wrote a few posts and configured some plugins on this here blog it was time to figure out how to automate Linux. Something I have never done before.

Step 1) Generate a script to take backups of my site.

This wasn’t easy and took a few hours of my time. Over an hour of that was eventually traced to starting my .sh file on a Windows system (using Notepad++). Windows ends each line with a carriage return plus a line feed (\r\n), while Linux uses only a line feed (\n), so every file name my script built ended with a stray carriage return, which showed up as ‘?’ in listings. When I tried to download the files created by the backup script in WinSCP, I was greeted with invalid file name syntax errors. It wasn’t until I ran the bash script with sudo that a prompt appeared on file deletion showing me that ‘\r’ was in the file name, not a question mark.
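If you hit the same problem, here is a minimal sketch of how to spot and strip the carriage returns from the shell. It builds a throwaway demo file so it is safe to run anywhere; the file is hypothetical, and `sed -i` assumes GNU sed, which is what Amazon Linux ships.

```shell
# Build a throwaway script with Windows (CRLF) line endings.
demo="$(mktemp)"
printf '#!/bin/bash\r\necho hello\r\n' > "$demo"

# Count the lines that contain a carriage return.
cr="$(printf '\r')"
grep -c "$cr" "$demo"

# Strip the trailing carriage return from every line, in place.
sed -i 's/\r$//' "$demo"

# Confirm that none are left.
if grep -q "$cr" "$demo"; then
  echo "carriage returns still present"
else
  echo "clean: Unix line endings only"
fi
```

On a real script you would point the `sed` line at your own file, for example backups.sh.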

Once I FINALLY tracked down the root cause of my file creation issues I was off to the races. Thankfully during all this I got the hang of Nano (after admitting temporary defeat learning VIM) and was able to easily create a new shell script file from the ssh window and get my script working. Below is the code I ended up with. Mostly based off this LifeHacker article.

Actually starting Step 1:

So here is what you need to do to configure automatic WordPress backups to S3. My approach is to back up weekly and keep 1 month of backups on the server and 90 days of backups in S3.

I started off by making a backups directory and a backups/files directory inside my ec2-user home directory (~/backups and ~/backups/files). The backups folder will hold my scripts and backup files going forward. The home directory is where you land by default after you SSH into an Amazon Linux instance as ec2-user.
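Creating both directories takes one command. This sketch assumes you are logged in as ec2-user, so `$HOME` is /home/ec2-user.

```shell
# -p creates the parent (backups) and the child (backups/files) in one go,
# and does nothing if they already exist.
mkdir -p "$HOME/backups/files"

# Show what was created.
ls -ld "$HOME/backups" "$HOME/backups/files"
```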

Open a new file called backups.sh in nano, then copy and paste the code below into it. Press Control+X, then Y and Enter, to save the file and exit.
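A minimal sketch of what such a backup script could look like. The document root (/var/www/html), the database name (wordpress), the archive names, and the 30 day cutoff are all assumptions, so change them to match your server. It also assumes the MySQL user and password live in ~/.my.cnf, so no credentials sit in the script itself.

```shell
#!/bin/bash
# Weekly WordPress backup: a sketch with assumed paths and names.
BACKUP_DIR="${BACKUP_DIR:-$HOME/backups/files}"   # where the archives are kept
SITE_DIR="${SITE_DIR:-/var/www/html}"             # assumed Apache document root
DB_NAME="${DB_NAME:-wordpress}"                   # assumed database name
KEEP_DAYS="${KEEP_DAYS:-30}"                      # about one month on the server
STAMP="$(date +%Y-%m-%d)"

backup_files() {
  # Compress the site directory into a dated tarball.
  tar -czf "$BACKUP_DIR/files-$STAMP.tar.gz" \
    -C "$(dirname "$SITE_DIR")" "$(basename "$SITE_DIR")"
}

backup_database() {
  # mysqldump picks up the user and password from ~/.my.cnf.
  mysqldump "$DB_NAME" | gzip > "$BACKUP_DIR/db-$STAMP.sql.gz"
}

prune_old_backups() {
  # Delete local archives last modified more than KEEP_DAYS days ago.
  find "$BACKUP_DIR" -maxdepth 1 -type f -name '*.gz' -mtime +"$KEEP_DAYS" -delete
}

mkdir -p "$BACKUP_DIR"

if [ -d "$SITE_DIR" ]; then
  backup_files
else
  echo "skipping files: $SITE_DIR not found" >&2
fi

if command -v mysqldump >/dev/null 2>&1; then
  backup_database
else
  echo "skipping database: mysqldump not found" >&2
fi

prune_old_backups
```

Each step is guarded, so a missing site directory or a missing mysqldump prints a warning instead of leaving a broken archive behind.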

Once the backups.sh file is created, we need to give it execute privileges.

Now we can run it to make sure it works with bash. Or move right onto scheduling it to occur automatically as a cron job.

Checking it with bash:
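A small sketch that covers both steps, marking the script executable and running it with bash. `try_script` is a hypothetical helper name, and the path in the usage comment assumes the script lives in ~/backups.

```shell
# try_script PATH: make a script executable, run it with bash,
# and report its exit status.
try_script() {
  script="$1"
  chmod u+x "$script" || return 1
  bash "$script"
  status=$?
  echo "exit status: $status"
  return "$status"
}

# Usage (path is an assumption):
#   try_script "$HOME/backups/backups.sh"
#   ls -lh "$HOME/backups/files"
```

An exit status of 0 plus fresh archives in the files directory means the script is ready to be scheduled.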

 

Step 2) Configuring the script to run automatically

Scheduling it with cron:

First things first: cron jobs are scheduled with crontab. Crontab’s default editor was Vim, which is very confusing to a Linux novice such as myself. Let’s change the default crontab editor to nano…

And now let’s configure our backup shell script to run Sunday mornings at 12:05 AM EST (0505 UTC). Cron reads the schedule in the server’s time zone, which is UTC by default on Amazon Linux.
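One way to do it, as a sketch: crontab looks at the VISUAL and EDITOR environment variables to decide which editor to open, so setting them for the session is enough.

```shell
# Use nano for this shell session.
export VISUAL=nano
export EDITOR=nano

# Confirm the setting.
echo "crontab will open: $EDITOR"

# To keep it across logins, the same lines could go into ~/.bash_profile
# (assuming the default bash login shell):
#   echo 'export EDITOR=nano' >> ~/.bash_profile

# Then edit the current user's crontab:
#   crontab -e
```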
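A sketch of the crontab entry for that schedule. The five fields are minute, hour, day of month, month, and day of week, so `5 5 * * 0` means 05:05 every Sunday. The script path and the log file name are assumptions.

```shell
# Crontab fields: minute hour day-of-month month day-of-week command
# "5 5 * * 0" = 05:05 on Sundays, in the server's time zone.
# Script and log paths are assumptions; adjust them to your setup.
CRON_LINE='5 5 * * 0 /bin/bash /home/ec2-user/backups/backups.sh >> /home/ec2-user/backups/backups.log 2>&1'

# Print the line so it can be pasted into the editor opened by `crontab -e`.
printf '%s\n' "$CRON_LINE"
```

Redirecting the output to a log file means any error from the unattended run is written somewhere you can read later.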

A great guide is found here.

Don’t forget to press Control+X, then Y and Enter, so nano saves the edited crontab file. I configured everything without sudo, and that works because crontab -e edits the crontab of the current user (ec2-user here), so the job runs as that user and no root access is needed.

Now our site is backing up automatically. So let’s offload these backups to S3.

Step 3) Syncing the automatic weekly backup files to S3

H/T to this helpful blog post for guidance.

Create an S3 bucket. Then create an IAM user, assign it to a group, and give the group the following policy to restrict it to only having access to the new bucket. Replace the bucketname as needed.
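A minimal sketch of a bucket-scoped policy, written to a local file so it can be pasted into the IAM console. The file name is hypothetical and the action list is my assumption of what a sync needs (list the bucket; read, write, and delete its objects); some s3cmd versions may also want `s3:GetBucketLocation`.

```shell
BUCKET="bucketname"   # replace with your bucket name

# Write the policy document; ${BUCKET} is expanded by the shell.
cat > s3-backup-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::${BUCKET}"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::${BUCKET}/*"
    }
  ]
}
EOF

cat s3-backup-policy.json
```

Note the two resources: bucket-level actions such as listing apply to the bucket ARN itself, while object-level actions apply to the `/*` form.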

Or you can just use your root account credentials, whatever floats your boat, but keep in mind those keys can reach everything in the account, not just the backup bucket.

Next up, install s3cmd onto your Amazon Linux instance. While s3cmd is very useful, it’s developed by a third party and is not an official Amazon command line tool, so we have to download it from another repository. We can install s3cmd onto an Amazon Linux instance with the following command.

You will have to accept some certificate prompts during the install.

Once s3cmd is installed we can configure it with our IAM credentials. Don’t worry if the configuration check at the end fails: with properly restricted IAM credentials that is expected, because the check tries to list all of your buckets and the restricted policy only allows access to one.

Create a shell script to sync our backup files to S3. Make sure we are still in the backups directory and use nano to create the script.
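A quick read-only check of where you stand, as a sketch. It assumes s3cmd comes from the EPEL repository, which Amazon Linux ships disabled (hence the `--enablerepo` flag), and that s3cmd keeps its settings in ~/.s3cfg, its default location. `check_s3cmd` is a hypothetical helper name.

```shell
# check_s3cmd: report whether s3cmd is installed and configured,
# and print the command for the next step if it is not.
check_s3cmd() {
  if ! command -v s3cmd >/dev/null 2>&1; then
    echo "s3cmd not installed; on Amazon Linux try: sudo yum --enablerepo=epel install s3cmd"
  elif [ ! -f "$HOME/.s3cfg" ]; then
    echo "s3cmd installed but not configured; run: s3cmd --configure"
  else
    echo "s3cmd installed and configured ($HOME/.s3cfg)"
  fi
}

check_s3cmd
```

`s3cmd --configure` is interactive: it asks for the access key and secret key of the IAM user and then offers to test them.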

Paste the following code into nano, then press Control+X followed by Y and Enter to save. Don’t forget to change the bucket name.

Make the script executable.
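A minimal sketch of such a sync script. The bucket name, the log file location, and the use of `--delete-removed` are assumptions; that flag makes s3cmd delete objects from S3 once the local file is gone, which is what lets the bucket's versioning and lifecycle rules take over.

```shell
#!/bin/bash
# Sync local backups to S3: a sketch with assumed names and paths.
BUCKET="${BUCKET:-bucketname}"                     # change to your bucket
SOURCE_DIR="${SOURCE_DIR:-$HOME/backups/files}"    # local backup archives
LOG_FILE="${LOG_FILE:-$HOME/backups/s3sync.log}"   # hypothetical log location
S3CMD="${S3CMD:-s3cmd}"                            # overridable, e.g. for a dry run with a stub

sync_backups() {
  mkdir -p "$(dirname "$LOG_FILE")"
  echo "$(date '+%Y-%m-%d %H:%M:%S') sync to s3://$BUCKET/ starting" >> "$LOG_FILE"

  # Upload new and changed files; remove objects whose local file was deleted.
  "$S3CMD" sync --delete-removed "$SOURCE_DIR/" "s3://$BUCKET/" >> "$LOG_FILE" 2>&1
  status=$?

  echo "$(date '+%Y-%m-%d %H:%M:%S') sync finished with status $status" >> "$LOG_FILE"
  return "$status"
}

sync_backups || echo "sync failed, see $LOG_FILE" >&2
```

If you save this as s3sync.sh (a hypothetical name), a crontab line such as `15 5 * * 0 /bin/bash /home/ec2-user/backups/s3sync.sh` would run it ten minutes after a 05:05 backup, and `tail -n 20 ~/backups/s3sync.log` shows the latest run.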

Now you can use some of the steps above to execute the script manually to make sure it works or schedule the script to run a few minutes after the backup script via cron.

You can check the logfile with tail for more information.

Wrapping it up

At this point you should have your compressed WordPress database backups and compressed Apache files being created weekly. Then they are being synchronized to S3 shortly after. What if we want to keep the files on S3 longer than the files on the server?

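All we need to do is enable versioning on the bucket, then apply a lifecycle policy that permanently deletes previous versions after 60 days. Provided the sync also removes a file from S3 once it is deleted from the server, each backup stays current in S3 for the month it lives on the server and then survives another 60 days as a previous version. Now we have roughly 90 day backup retention.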
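Both settings can be clicked together in the S3 console. If you would rather script them, here is a sketch using the AWS CLI, which is an assumption on my part (it needs credentials that are allowed to change bucket settings, unlike the restricted backup user). The rule ID and file name are hypothetical.

```shell
# Lifecycle rule: permanently delete previous (noncurrent) versions
# 60 days after they stop being the current version.
cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "expire-previous-backup-versions",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "NoncurrentVersionExpiration": { "NoncurrentDays": 60 }
    }
  ]
}
EOF

# Retention math: days on the server plus days as a previous version.
echo "total retention: $((30 + 60)) days"

# Apply with the AWS CLI (replace bucketname):
#   aws s3api put-bucket-versioning --bucket bucketname \
#     --versioning-configuration Status=Enabled
#   aws s3api put-bucket-lifecycle-configuration --bucket bucketname \
#     --lifecycle-configuration file://lifecycle.json
```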

Anyway, I hope this helps. I tried to link to all blogs that helped me get up and running.
