Backing up and restoring an Amazon AWS instance

On Sunday I got my virtual server running in Amazon EC2 and it has been happily running there since. I have done some homework and know not to rely on Amazon keeping the instance running forever. I should expect it to fail. At first this felt suspicious… I should expect the server to fail? What kind of a service is that? After giving it some thought I realized that this is, of course, how we should expect all IT systems to behave. It just happens that most IT people tend to rely on good luck instead of having a tested backup scheme, not to mention a tested disaster recovery plan.

Amazon forces you to think in a different way. The virtual machines run on not-so-high-end server hardware, which means that the server can fail at any given moment and the poor sysadmin has to find a way out of the situation. The good thing is that Amazon also provides some tools for the task, for example the Amazon Simple Storage Service (S3) and Amazon EBS, both of which provide persistent storage for files used by the instances. The characteristic difference between the two is that an instance can mount an EBS volume directly, whereas S3 is accessed through REST and SOAP interfaces. This makes EBS fast but expensive and S3 slow but cheap.

In my previous post, I set up my first instance and application running in the cloud. I was a bit lucky that the system stayed up until today, because I had not done the basic steps of bundling my instance and snapshotting the volume. To do this, a few things have to be done.

  1. Test if you have ec2-ami-tools installed on the instance you are bundling. Install them if they are missing.
  2. Move your private key (pk-something.pem) and certificate (cert-something.pem) files to the instance's /mnt directory (this is fine since /mnt will not be bundled).
  3. Use the ec2-bundle-vol command to build the bundle (for example: ec2-bundle-vol -d /mnt/image -k pk.pem -c cert.pem -u "AWS account id" -r i386). If you are using the standard AMI you might get a note like the one below, but this is no reason for concern, as the bundling will most likely still complete:

    NOTE: rsync with preservation of extended file attributes failed. Retrying rsync
    without attempting to preserve extended file attributes…
    NOTE: rsync seemed successful but exited with error code 23. This probably means
    that your version of rsync was built against a kernel with HAVE_LUTIMES defined,
    although the current kernel was not built with this option enabled. The bundling
    process will thus ignore the error and continue bundling. If bundling completes
    successfully, your image should be perfectly usable. We, however, recommend that
    you install a version of rsync that handles this situation more elegantly.

  4. Use the ec2-upload-bundle command to upload the bundle to S3 (ec2-upload-bundle -b myimages -m image/image.manifest.xml -a "Access Key ID" -s "Secret Access Key").
  5. Register your private AMI. I did this through the AWS management console.
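Steps 1 to 4 can be collected into a small script. The following is only a sketch under my own assumptions: the function name, the DRY_RUN switch, the key file names in /mnt and all the arguments are placeholders I made up for illustration. By default it just prints the commands it would run.

```shell
#!/bin/sh
# Sketch of steps 1-4. All values passed in are placeholders (assumptions),
# not real credentials.
DRY_RUN="${DRY_RUN:-1}"          # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

bundle_and_upload() {
    account_id="$1"; access_key="$2"; secret_key="$3"; bucket="$4"

    # Step 1: check that the AMI tools are present before doing anything else.
    if [ "$DRY_RUN" != "1" ] && ! command -v ec2-bundle-vol >/dev/null 2>&1; then
        echo "ec2-ami-tools not found; install them first" >&2
        return 1
    fi

    # Step 3: bundle the root volume into /mnt/image. The key and certificate
    # are assumed to have been copied to /mnt already (step 2).
    run mkdir -p /mnt/image
    run ec2-bundle-vol -d /mnt/image -k /mnt/pk.pem -c /mnt/cert.pem \
        -u "$account_id" -r i386 || return 1

    # Step 4: upload the bundle to the S3 bucket.
    run ec2-upload-bundle -b "$bucket" -m /mnt/image/image.manifest.xml \
        -a "$access_key" -s "$secret_key"
}
```

Calling bundle_and_upload with an account ID, access key, secret key and bucket name prints the two ec2-* commands for review. Setting DRY_RUN=0 beforehand would actually execute them. Note that the dry run echoes the secret key to the terminal, so it is best tried with dummy values.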

If all went well, you should now have your private AMI created and ready to be provisioned. I proceeded to test my setup by first taking a snapshot of the EBS volume. I froze the filesystem with xfs_freeze -f /mountpoint and then took the snapshot through the AWS console. Of course, I should have done a sync and a database lock too, but decided to live dangerously since this is just a test setup. After the snapshot was completed I unfroze the partition and terminated the running instance.

I started provisioning the replacement server, but to my surprise I no longer had the option of a small instance. The choices started from large. I was puzzled and did not really want to provision a large instance. It could have been something in the web GUI, so I changed my zone from EU to US and back, and suddenly the small one was in the list as well. Great! It's fantastic to realize how easy it is to scale vertically with Amazon AWS, although it is not too different from VMware ESX, which works along the same lines: shut down, change the VM properties and boot up with more RAM and/or CPU.
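The freeze, snapshot and unfreeze sequence from this test could be scripted roughly as follows. This is a sketch, not what I actually ran: I took the snapshot through the console, so the ec2-create-snapshot call from the EC2 API tools is my assumption of the command-line equivalent, and the function name, DRY_RUN switch, mount point and volume ID are placeholders. It includes the sync I skipped, but not a database lock.

```shell
#!/bin/sh
# Sketch: freeze an XFS filesystem, snapshot its EBS volume, then unfreeze.
# The mount point and volume ID passed in are placeholders (assumptions).
DRY_RUN="${DRY_RUN:-1}"          # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

snapshot_volume() {
    mountpoint="$1"; volume_id="$2"

    run sync                                    # flush pending writes first
    run xfs_freeze -f "$mountpoint" || return 1 # block new writes

    # Start the snapshot while writes are blocked. Remember the exit status
    # so the filesystem is unfrozen even when the snapshot call fails.
    run ec2-create-snapshot "$volume_id"
    status=$?

    run xfs_freeze -u "$mountpoint"             # always unfreeze again
    return "$status"
}
```

The point of keeping the exit status in a variable is that a failed snapshot should never leave the filesystem frozen. For a real database server, a lock on the database would still be needed around the freeze.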

The server started and I could log on. Apache and MySQL did not start though, because the EBS volume was not attached. I proceeded to attach it and rebooted, but the services still did not start. I then went to look for the mount point, which had disappeared. After recreating it and rebooting once more, the services were running happily! I also had to manually assign the Elastic IP to the new instance. I suppose the mount point information is not included in the bundle, but I have to investigate this further.
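The manual recovery steps above (attach the volume, recreate the mount point, mount, assign the Elastic IP) could look something like this when scripted. Again this is a sketch under assumptions: I am assuming the ec2-attach-volume and ec2-associate-address commands from the EC2 API tools are available on the same machine, and the function name, DRY_RUN switch, IDs, device name, mount point and address are all placeholders.

```shell
#!/bin/sh
# Sketch of the manual recovery steps. Every ID, device name, path and
# address used with this function is a placeholder (assumption).
DRY_RUN="${DRY_RUN:-1}"          # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

restore_instance() {
    instance_id="$1"; volume_id="$2"; device="$3"; mountpoint="$4"; elastic_ip="$5"

    # Attach the EBS volume to the replacement instance.
    run ec2-attach-volume "$volume_id" -i "$instance_id" -d "$device" || return 1

    # Attaching is asynchronous, so wait for the device node to appear.
    # (Skipped in dry-run mode; there is no timeout in this sketch.)
    while [ "$DRY_RUN" != "1" ] && [ ! -e "$device" ]; do sleep 1; done

    # Recreate the mount point that was missing from the new image, then mount.
    run mkdir -p "$mountpoint"
    run mount "$device" "$mountpoint" || return 1

    # Point the Elastic IP at the new instance.
    run ec2-associate-address -i "$instance_id" "$elastic_ip"
}
```

Apache and MySQL would still have to be started afterwards, or the instance rebooted as I did, once the volume is mounted.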

I now have my system running again, and I have a little more confidence in Amazon AWS as well. This effort required a good deal of manual work, though. The next step would be to automate some of this and to create a recovery plan in case the server fails.

Pauli Haikonen


One comment

  1. Update on the fstab issue. It is actually a documented feature of ec2-bundle-vol that you need to add the --fstab parameter to preserve the current fstab file. Otherwise a new one is written. More details can be found here.
