So Amazon’s S3 had a major outage this weekend. Their service dashboard showed the service as unavailable for over 6 hours. This was, of course, bad news for all the startups that rely on S3 for their applications, as most were crippled by the outage.
I am a big believer in cloud computing and feel strongly that all new software should be built on these new architectures. But when they have outages (and of course they will), it is important that your application is built to handle them, in a sense to fail gracefully. Whether you use cloud computing or host your own servers, some outages are inevitable. There will always be a chance of a natural disaster or a hardware failure, both of which are impossible to predict. Trusting a service run by experts in storage lets you build your business on that expertise in a cost-effective way. Chances are, too, that a company with a large operations team will be able to diagnose and respond to issues more quickly. In addition, Amazon has several data centers and a level of redundancy that is only available to businesses operating on a very large scale. Very few startups would want to spend their precious capital buying hardware in two geographical locations for increased redundancy (nor would this be a smart use of one’s capital). I would like to believe that, because of Amazon’s large infrastructure and expertise in storage, their outages will be fewer and farther between than if I were relying on my own storage solution, although only time will tell if this is the case.
One thing, though, is that this does present an opportunity. If someone could build storage on top of Amazon’s, or otherwise promise higher uptime, some companies would pay a premium for that stronger guarantee. This could be a good business, and it could use a number of different cloud computing offerings underneath its abstraction layer. Google has a storage service, Nirvanix is another competitor, and as this model becomes more popular I am sure others will emerge. There is still work to be done with cloud computing, but it is an economical and game-changing development in technology. I think that in a few years there will be computing services, storage services, etc., and only a few companies will control all the hardware.
So what do you do if you have built your application using S3 and it goes down? Make sure you address your customers’ experience. This is the most important thing you can do. If your application is inoperable, give your customers a way to reach your tech support. If part of your application can still work, make sure it does. Try to segment your application so that a storage failure doesn’t block your customers from everything else. And finally, if possible, build a backup method for storage. Maybe it uses another web service, or your own local servers, to keep things functioning. Then customers can still add new data; they just may not be able to access the old data. Or, if you cache some data, try to write a smart service that looks in the cache first. The main thing is: focus on the user experience and make sure your customers don’t encounter a dead application.
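To make the backup-storage and cache-first ideas a bit more concrete, here is a rough sketch in Python. Everything in it is hypothetical: `InMemoryStore` is just a stand-in for a real backend (S3, another web service, or your own servers), and `ResilientStore`, `StoreUnavailable`, and the `available` flag are names I made up for illustration. It assumes a simple unbounded in-process cache, which a real application would want to size and expire properly.

```python
class StoreUnavailable(Exception):
    """Raised when a storage backend cannot serve a request."""


class InMemoryStore:
    """Hypothetical stand-in for a real backend (S3, local servers, ...)."""

    def __init__(self, name):
        self.name = name
        self.available = True  # flip to False to simulate an outage
        self._blobs = {}

    def put(self, key, data):
        if not self.available:
            raise StoreUnavailable(self.name)
        self._blobs[key] = data

    def get(self, key):
        if not self.available:
            raise StoreUnavailable(self.name)
        return self._blobs[key]  # KeyError if the key was never stored here


class ResilientStore:
    """Cache-first reads, and writes that fall back to a backup store."""

    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup
        self.cache = {}  # assumption: unbounded cache, fine for a sketch

    def put(self, key, data):
        """Store data and return the name of the backend that accepted it."""
        self.cache[key] = data
        try:
            self.primary.put(key, data)
            return self.primary.name
        except StoreUnavailable:
            # Primary is down: keep accepting new data on the backup.
            self.backup.put(key, data)
            return self.backup.name

    def get(self, key):
        """Return the data, or None if no reachable backend has it."""
        if key in self.cache:
            return self.cache[key]
        for store in (self.primary, self.backup):
            try:
                data = store.get(key)
            except (StoreUnavailable, KeyError):
                continue  # try the next backend
            self.cache[key] = data
            return data
        return None
```

With this shape, an outage on the primary means new uploads quietly land on the backup, cached items keep loading, and uncached old items come back as `None`, which the application can turn into a friendly "temporarily unavailable" message instead of a dead page. A real version would also need to copy the backup's data back to the primary once it recovers, which this sketch leaves out.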