Windows Server Performance on Amazon EC2

One of the trending conversations on the Web at the moment (and one that has been growing for quite some time) is the idea of cloud-based computing.  While distributed storage has come a long way since the conversation began, it’s only recently that we’ve had a real choice of cloud-based computing.

Amazon Web Services winning at Crunchies 2009

The idea behind computing in the cloud is genuinely exciting.  Amazon Web Services, including EC2, S3 and the others, are a stroke of architectural genius.  But the problem is that we’ve been given the false impression that cloud-based computing is going to change the web.  We’re spun stories about how it’s going to radically decrease our infrastructure costs, and we’re spoon-fed fairy tales that our scale issues are going to be as easy to fix as double-clicking an icon.

You see, the problem is this: at the end of the day you’re dealing with a virtualized environment, and it’s always slower than the real deal.

While working on a project recently we bought into the whole “Elastic Cloud” idea as well.  We quickly learned that even though it’s relatively painless to spawn new instances, you’re still ultimately bound by the same rules as you would be with a cluster: if your code isn’t built to scale across several machines, it’s not going to.

After about 2.5 weeks of playing with, tuning and perfecting the Amazon EC2 Windows instance we were running, the performance compromise was simply too great to justify its use in a production environment.  I suspect that the virtualization software used by Amazon actually blocks processes from running in parallel (as they normally would on a physical server), since the machine had extreme difficulty running more than one thing at a time.  We also found that Apache would busy-wait while PHP made RESTful API calls to our other systems.  As a result, just 2 concurrent users kept the CPU at 100% for the entirety of their sessions.
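For anyone who hasn’t run into a busy-wait before, here is a minimal Python sketch of the difference between spinning on a flag and blocking until it is set.  It is not what Apache or PHP were actually doing on our instance (I never got to the bottom of that); the function names and the 0.2 second delay are made up purely for illustration.

```python
import threading
import time


def busy_wait(done: threading.Event) -> int:
    """Spin in a tight loop until `done` is set; return how many times we polled."""
    polls = 0
    while not done.is_set():
        polls += 1  # burns CPU the whole time we are "waiting"
    return polls


def blocking_wait(done: threading.Event) -> int:
    """Let the OS park the thread until `done` is set; a single wait call."""
    done.wait()
    return 1


def measure(waiter, delay: float = 0.2):
    """Run `waiter` against an event that fires after `delay` seconds.

    Returns (number of checks, CPU seconds used by this process meanwhile).
    `delay` stands in for a slow remote API call; it is an arbitrary value.
    """
    done = threading.Event()
    timer = threading.Timer(delay, done.set)
    cpu_start = time.process_time()
    timer.start()
    checks = waiter(done)
    timer.join()
    return checks, time.process_time() - cpu_start


if __name__ == "__main__":
    for waiter in (busy_wait, blocking_wait):
        checks, cpu = measure(waiter)
        print(f"{waiter.__name__}: {checks} checks, {cpu:.3f}s CPU")
```

Both versions wait the same wall-clock time, but the spinning one keeps a core busy for all of it, which is the kind of behaviour that lets a couple of users pin a CPU.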

In the end, the Windows Amazon EC2 solution was completely untenable.  It wouldn’t even have been satisfactory for development, let alone production.  So, giving up on trying to find a magical “setting”, we thought we’d scale up to a more powerful Amazon instance.  But I didn’t get far before I was casually told that the AMI (the name for an Amazon VM image) I had lovingly crafted over 3 days for our own purposes was not compatible with the Medium and Large instance settings (since I’d used a 32-bit Windows Server as the base of the AMI).  At that pricing level, running the servers 24/7 for a whole month was going to cost the same as, if not more than, a similar(ish) physical machine hosted the ‘old-fashioned’ way.  EPIC FAIL!
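The 24/7 arithmetic is worth doing before you commit to anything.  A rough sketch of the comparison follows; the hourly rate and the dedicated hosting fee are placeholders I’ve invented for the example, not Amazon’s prices or the quote we received, so plug in your own numbers.

```python
HOURS_PER_DAY = 24


def monthly_instance_cost(hourly_rate: float, days: int = 30, instances: int = 1) -> float:
    """Cost of keeping `instances` cloud servers running around the clock for `days`."""
    return hourly_rate * HOURS_PER_DAY * days * instances


def cheaper_option(hourly_rate: float, dedicated_monthly_fee: float,
                   days: int = 30, instances: int = 1) -> str:
    """Compare an always-on cloud instance against a flat-fee physical machine."""
    cloud = monthly_instance_cost(hourly_rate, days, instances)
    if cloud < dedicated_monthly_fee:
        return "cloud"
    if cloud > dedicated_monthly_fee:
        return "dedicated"
    return "tie"


if __name__ == "__main__":
    # Hypothetical figures only: 0.50/hour for the instance, 300/month dedicated.
    print(monthly_instance_cost(0.50))
    print(cheaper_option(0.50, 300.0))
```

Per-hour pricing looks small until you multiply it by the 720 hours in a 30-day month, which is exactly what an always-on web server does.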

In the end we did get that physical machine, and despite having less physical memory than is available through EC2, the machine is using virtually 0% CPU and is serving stuff up faster than even we’d thought it would.

Perhaps virtualization technology will improve, and perhaps Microsoft’s Azure platform will be more beneficial, but in my book, using a Windows Server machine on Amazon’s EC2 is about as much fun as putting bamboo shoots under your fingernails.  It really does feel like a wolf in sheep’s clothing.