Amazon AWS


20
May 10

EBS-based instance problems

The instance I run this blog on was slightly impacted a few days ago. All of a sudden I could not ssh into the instance and Apache was running painfully slowly; in practice it did not work at all. While I was already fantasizing that my really super awesome new web-2.0-youtube-facebook-twitter crossbreed vKaiser.com had gotten some traction and was overloaded by the publicity, I ended up on the AWS site to check the service status. The service status was fine and my hopes were still high. Then the truth hit me: there were others in the forums with similar issues. An EBS-based instance becomes unresponsive and a reboot does not help. You can’t take a snapshot of the EBS volume either, but stopping and starting the instance might help. You just have to be prepared for the instance to go down very, very slowly.

So, as I could not take a snapshot and was not particularly interested in falling back to snapshots that were a few days old, I decided to just shut down the instance and give it the time it needed. Eventually the server went down and I could restart it just fine. Situation back to normal. This incident could of course have been avoided easily by having a backup system ready, or even a load-balanced setup if I had the money to run one.

No luck in getting traction.
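
For what it is worth, the "stop it, wait patiently, then start it again" routine described above could be scripted roughly as follows. This is only a sketch under assumptions: `stop`, `start` and `get_state` are hypothetical callables that you would wire to whatever AWS tooling you use, the state name "stopped" is assumed to be what that tooling reports, and the one-hour timeout is an arbitrary guess at "very, very slowly".

```python
import time
from typing import Callable


def wait_for_state(get_state: Callable[[], str], wanted: str,
                   timeout_s: float, poll_s: float = 15.0,
                   sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll get_state() until it returns `wanted` or timeout_s has been waited."""
    waited = 0.0
    while True:
        if get_state() == wanted:
            return True
        if waited >= timeout_s:
            return False
        sleep(poll_s)
        waited += poll_s


def stop_then_start(stop: Callable[[], None], start: Callable[[], None],
                    get_state: Callable[[], str],
                    stop_timeout_s: float = 3600.0,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Request a stop, wait (possibly a long time) for it, then start again.

    Returns False without calling start() if the instance never reports
    "stopped" within stop_timeout_s.
    """
    stop()
    if not wait_for_state(get_state, "stopped", stop_timeout_s, sleep=sleep):
        return False
    start()
    return True
```

The `sleep` parameter is injectable only so the loop can be exercised without actually waiting.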


11
Feb 10

Dear Amazon, please make DevPay available in Europe!

So you have your great new application utilizing all the awesomeness of Amazon AWS? How are you going to sell it?

There are vague definitions of what a cloud service is, and one of the prerequisites was that you could buy the service with your credit card and pay only for the resources you use. Amazon DevPay allows developers to sell their AMIs (with the application installed); all you need is a business in the US.

There are two ways to use DevPay: through AMIs you have built with your service installed, and by selling an application that uses S3 as its storage location. It’s a really good start, but it still lacks a few things. For example, if you would like to provide high availability for the client who bought your AMI, the client has to roll their own solution to achieve it. I bet not many clients are willing to do that; they would just like the application to be available.

When I think about this dilemma, a solution might be some kind of root AMI which a customer would buy (and pay for by the hour like crazy). It would then take care of the availability of the service by seeding new servers that join the application through some very wicked autoconfiguration. Actually, Elastra does almost this, as their product allows the user to define architectures and then deploy them to Amazon. In principle it would be possible to have an Elastra AMI with some configuration inside which the client could then deploy, but that sounds like a hack and not really something you could sell as a product. By the way, Elastra’s product looks great for defining and deploying architectures, at least internally within an organization.

Seeding an application from a root AMI might let clients buy a redundant application instead of pieces of one. So far, though, this is just a beautiful dream: DevPay is not available in Europe, so it isn’t possible to start with even the simplest model of offering an AMI of your superduperapp to the public. That is, unless you implement billing yourself. If I’m not totally wrong, the billing API is not public, which makes rolling your own solution impossible (for the same reason, you can’t put limits based on usage charges on your account either).

Until DevPay is available in Europe, I would be willing to adjust the definition of a cloud service where it comes to paying by the hour and ordering by credit card.


15
Dec 09

Unexpected Outage

The site went down today for a few hours, and the worst thing is that it was sort of my own fault. The last time I was playing with booting from Amazon EBS, I must have made a mistake and detached a volume from the wrong instance. So the incident was caused by the EBS volume not being attached. While I was trying out EBS booting, the one and only EBS volume attached to this EC2 instance showed its “Attachment Information” as “busy” rather than “attached”, which seems to be the standard status of a healthy volume. I probably detached the volume, and the status changed to “busy”.

I remember wondering what that “busy” meant at that time. Now I know.

It should go without saying that this “busy” status is really, really uninformative. How about “detaching” instead when a user wants to detach a volume? And why did it take about one and a half weeks to detach? Is there a log somewhere showing that I really did detach a volume? The lesson learned from this incident is to act if your EBS volume goes into the “busy” state. All might work fine for a while, but be warned: it will detach at some point. Also, it would be really nice if there were some abstraction layer between the real names of the volumes and instances and the ones shown to customers. With this layer a user could give more descriptive names to instances or whatever objects there are. Or then really start using different accounts for development and production stuff… Really.
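
Both wishes above, "act on busy" and "give things readable names", are easy enough to fake on your own side. Here is a minimal sketch, assuming you have already fetched the volume list from somewhere into plain dictionaries. The field names (`volume_id`, `instance_id`, `attachment_status`), the IDs and the alias table are all made up for illustration and are not an AWS API.

```python
from typing import Dict, List

# Hypothetical alias table: my own descriptive names for AWS resource IDs.
ALIASES: Dict[str, str] = {
    "vol-11111111": "blog-data",
    "i-aaaaaaaa": "blog-production",
}


def label(resource_id: str, aliases: Dict[str, str]) -> str:
    """Return 'alias (id)' when an alias is known, otherwise just the id."""
    name = aliases.get(resource_id)
    return f"{name} ({resource_id})" if name else resource_id


def volumes_needing_attention(volumes: List[Dict[str, str]],
                              aliases: Dict[str, str]) -> List[str]:
    """Return one warning line per volume whose status is not 'attached'."""
    warnings = []
    for vol in volumes:
        status = vol["attachment_status"]
        if status != "attached":
            warnings.append(
                f"{label(vol['volume_id'], aliases)} on "
                f"{label(vol['instance_id'], aliases)} is '{status}', "
                f"expected 'attached'"
            )
    return warnings
```

Run from cron, something like this would have nagged me about the “busy” volume a week and a half before the site went down.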


6
Dec 09

Booting from Amazon EBS

Amazon has announced a new feature: booting instances from an EBS volume. This feature radically changes how AWS instances can be preserved compared to the traditional volume bundling and uploading to S3.

Though this all sounds nice, it isn’t really that easy to convert existing instances to boot from EBS. All previous instances boot from the local instance disk. The AWS management console shows where an instance boots from in the Root Device Type column: previous instances have the root device type instance-store, while EBS images have the type ebs.

To get started with EBS images, there are a few images from Amazon which are useful as a base. It was really easy to just boot one of them and mount an EBS volume which contained a snapshot of the database and the www root. After installing the basic LAMP stuff and changing httpd.conf and my.cnf to point to the EBS volume, the AWS instance which boots from EBS was ready. I could now create snapshots of the system in minutes, and also shut down the system when I don’t need it and thus not get billed for the instance. Awesome! The snapshot also included the EBS volume which was mounted to the instance.

The EBS image feature is likely to open up a wide range of new applications and really change how an elastic service is constructed. Basically, a member of a pool of web servers can now be created in advance and just turned on when there is demand for it. Of course, it must first update itself to be on par with the other pool members.
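
The pre-built pool idea above could look something like the sketch below. Everything here is an assumption made for illustration: the capacity figure, the instance IDs, and the `start`, `update` and `add_to_pool` callables, which stand in for whatever actually powers on a stopped EBS-backed instance, syncs it and registers it with the pool.

```python
import math
from typing import Callable, List


def members_to_start(requests_per_s: float, capacity_per_member: float,
                     running: int, stopped_ids: List[str]) -> List[str]:
    """Pick which pre-built, stopped members to turn on for the current demand."""
    if capacity_per_member <= 0:
        raise ValueError("capacity_per_member must be positive")
    needed = math.ceil(requests_per_s / capacity_per_member)
    missing = max(0, needed - running)
    # Slicing never over-runs: if fewer are stopped than missing, take them all.
    return stopped_ids[:missing]


def bring_online(member_id: str,
                 start: Callable[[str], None],
                 update: Callable[[str], None],
                 add_to_pool: Callable[[str], None]) -> None:
    """Start a member, let it catch up, and only then send it traffic."""
    start(member_id)
    update(member_id)       # get on par with the other pool members first
    add_to_pool(member_id)
```

The ordering in `bring_online` is the point: a freshly started member is stale until it has updated itself, so it joins the pool last.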

I am not really sure whether it was the old laptop I used to work with the EBS images or something else, but the AWS management console was painfully slow to respond, especially in Firefox. And in IE, the pop-up window showed nothing but the button to create the snapshot:

[Screenshot: createImage]

Firefox, though really slow to respond, did give the option of typing the name in the required field. Also, if you create an EBS image and then decide to get rid of the EBS image, you have to delete the AMI first; otherwise the management console will complain that it’s in use.

I have yet to decide whether to go with instance-store or EBS for my instance. EBS will add something to the cost of running my site in AWS, but it shouldn’t be too much. I find a lot more benefits in EBS than in running on instance-store, but then again I fear getting lazy about responding to possible threats of instances going down and about disaster recovery.

Pauli Haikonen


8
Nov 09

Monitoring an Amazon AWS instance

I have put together a task list of what I would like to test with the Amazon AWS infrastructure, and so far I have gotten my web server running with EBS. Volume bundling and instance creation have also been tested a few times. The system has been running quite OK for the past two weeks. It has been interesting to read the Apache error log: people probing, for example, for a phpmyadmin page…

Anyway, the next thing I want to test is getting some kind of monitoring in place. I have some experience with Nagios, so I took that route and installed it on a basic m1.small instance using these instructions, which got me a clean installation of Nagios. I could then add a host definition for this site and the service to monitor (http).
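
To give an idea of what that host and service definition amounts to, here is a small sketch that prints the two Nagios object blocks. It assumes the `linux-server` and `generic-service` templates and the `check_http` command from the Nagios sample configuration are present. The host name, alias and address are placeholders (203.0.113.10 is a documentation address, not my elastic IP).

```python
from typing import Dict


def render_object(kind: str, directives: Dict[str, str]) -> str:
    """Render one Nagios object definition block as config-file text."""
    width = max(len(key) for key in directives)
    lines = [f"define {kind} {{"]
    for key, value in directives.items():
        # Pad the directive names so the values line up in a column.
        lines.append(f"    {key.ljust(width)}  {value}")
    lines.append("}")
    return "\n".join(lines)


# Placeholder values; replace with your own host name and elastic IP.
HOST = {
    "use": "linux-server",          # template assumed from the sample config
    "host_name": "blog",
    "alias": "Blog web server",
    "address": "203.0.113.10",
}

SERVICE = {
    "use": "generic-service",       # template assumed from the sample config
    "host_name": "blog",
    "service_description": "HTTP",
    "check_command": "check_http",  # command assumed from the sample config
}

if __name__ == "__main__":
    print(render_object("host", HOST))
    print()
    print(render_object("service", SERVICE))
```

The printed text can be pasted into an object configuration file that Nagios is set up to read. Where that file lives depends on the installation.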

I used the public interface (elastic IP) since it is the only static IP I own. This is, though, the first sign of the problems related to running a monitoring system in the cloud. With Amazon AWS you get five elastic IPs by default. That will not get you far, though 20 instances is the default maximum anyway, and I have understood that more can be purchased if needed. How do you deal with the instances that don’t have an elastic IP? You could get around this problem by creating all your instances with a VPN connection and then registering those IPs, but well… that does not sound too easy.

And then there is the actual alerting when something goes wrong. It’s kind of difficult to have the monitoring server send SMS messages, since it’s impossible to connect a physical device to a virtual machine. I will try installing Skype on the monitoring server and using it to send the SMS onwards, but the message will still travel over the Internet on its way to the Skype SMS gateway. If there is a connection problem somewhere, the message will not reach me. I should also consider the reliability of the VM running Nagios, which is best effort at most. The system should be clustered somehow, but I have to see how Nagios supports this. Oh, and by the way, there have been a few cases where the elastic IP address block (the whole /17) has been blacklisted for spam, which in effect stops you from receiving alerts by email.
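
Since every single alert path above can fail, one modest mitigation is to try several in turn. Below is a minimal sketch of that fallback idea. The channel names and sender functions are hypothetical stand-ins for a Skype SMS sender, an email sender and so on; none of them is a real API.

```python
from typing import Callable, List, Optional, Tuple

Channel = Tuple[str, Callable[[str], None]]


def send_alert(message: str, channels: List[Channel]) -> Optional[str]:
    """Try each (name, send) channel in order.

    Returns the name of the first channel whose send() does not raise,
    or None if every channel failed.
    """
    for name, send in channels:
        try:
            send(message)
        except Exception:
            # This path is down (gateway unreachable, mail blacklisted, ...);
            # fall through to the next one.
            continue
        return name
    return None
```

This does not fix the underlying problem, of course: if the monitoring VM itself loses its Internet connection, every channel in the list fails together.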

To summarize, if I had the option, I would not run monitoring in Amazon or any other cloud computing facility. I would have it the old way – physical – and enjoy the pleasure of firmware upgrades and power failures and all the good stuff.

Pauli Haikonen