Updated: Amazon (s AMZN) Web Services is having trouble this evening and in the process is taking down some major sites and services. Among the sites affected are Quora and HipChat. In addition, the Amazon outage has hit Heroku, a division of Salesforce (s CRM).
Amazon is a key infrastructure provider for many of the biggest and best-known startups, such as Pinterest and Dropbox. The outages were related to Amazon’s EC2 and RDS services, and the problems appeared to be localized to Amazon’s Northern Virginia data center. Other services in that data center, such as ElastiCache and Elastic Beanstalk, were also affected. The problem appears to be rooted in a power outage.
On its status website, Amazon notes regarding EC2:
We continue to investigate this issue. We can confirm that there is both impact to volumes and instances in a single AZ in US-EAST-1 Region. We are also experiencing increased error rates and latencies on the EC2 APIs in the US-EAST-1 Region.
9:55 PM PDT We have identified the issue and are currently working to bring affected instances and volumes in the impacted Availability Zone back online. We continue to see increased API error rates and latencies in the US-East-1 Region.
On the issue of RDS problems, AWS notes:
9:33 PM PDT Some RDS DB Instances in a single AZ are currently unavailable. We are also experiencing increased error rates and latencies on the RDS APIs in the US-EAST-1 Region. We are investigating the issue.
10:05 PM PDT We have identified the issue and are currently working to bring the Availability Zone back online. At this time no Multi-AZ instances are unavailable.
00:11 AM PDT As a result of the power outage tonight in the US-EAST-1 region, some EBS volumes may have inconsistent data.
01:38 AM PDT Almost all affected EBS volumes have been brought back online. Customers should check the status of their volumes in the console. We are still seeing increased latencies and errors in registering instances with ELBs.
AWS has suffered outages in the past. A widespread problem affected major websites in April 2011. In July 2008, Amazon’s S3 service went offline and caused major problems for many of its customers. I have been in touch with folks from Amazon and Heroku to get a better idea of what is going on. In the interim, enjoy some of the tweets about the outage.
have people not learned anything from past aws outages? Get yourself some fancy DNS routing and go multi-region
— Bryan Brandau (@agent462) June 15, 2012
Amazing how much of the world is powered by EC2.
Food truck guy just told me they ran out of spicy tofu, thanks to AWS.
What a world.
— Sandeep Parikh (@crcsmnky) June 15, 2012
Guys, stop saying "our infrastructure provider is having trouble". When half the internet disappears, everyone knows you're on AWS EC2.
— ceejayoz (@ceejayoz) June 15, 2012
As if millions of startups suddenly cried out in terror and were suddenly silenced. #AWS
— Joe Zadeh (@joebot) June 15, 2012
Heroku and parts of AWS are down. That means that half the Valley startups are down. Fortunately, they don't have any customers.
— Rakesh Pai (@rakesh314) June 15, 2012
I think this is the third time I've been very happy that @votizen is not running on the Northern Virginia #AWS.
— Jason Putorti (@putorti) June 15, 2012
RT @vlkun I love that #AWS status page considers "RDS DB instances unavailable" as "performance issues" and not "service disruption."
— Ylastic (@ylastic) June 15, 2012
AWS is down (and took half the web with it). Guess we'll just go to sleep and hope it's better in the morning. #justkidding #upallnight #aws
— exfm (@exfm) June 15, 2012
Image courtesy of Shutterstock user michaket.
We were impacted too; the outage hit us in us-east-1b. Funny thing is that our paging service, PagerDuty, also went down temporarily.
The outage lasted 15 minutes for us, and now we are able to bring up boxes.
Which major sites suffered from this outage? Looking to see improvement in comparison to the Apr-2011 outage.
Heroku was down and that impacted several apps. Quora was down. Asana was down. exfm was down. Svbtle was offline as well. It was a long list of people who were set back.
There’s no comparison to the EBS outage. This was an annoying glitch; something went blooey in a single AZ in US-EAST, but it was very, very minor compared to what happened with EBS. That was a systemic infrastructure failure, but even then, applications that were properly designed for the AWS environment did not go down. The lesson is still to architect for failure; sites that haven’t grokked AWS’s strengths and weaknesses at this point, and don’t do this, don’t have much to gripe about. Not to mention, this is a vocal, engaged, but tiny part of the internet we’re talking about. It’s not like anything really bad happened.
Thumb app is down 🙁
Yeah, we’re still down due to this @Thumb!
The real issue is poorly designed and managed infrastructure. http://blog.controlgroup.com/2012/06/15/the-real-issues-with-the-aws-outage/
I agree. If you’re not designing for failure in your data center or cloud build outs shame on you.
People, http://Lunacloud.com is the solution… I recently moved there and I love it.
This is nuts, 3 days and the geniuses can’t fix it?
Soundtracker Radio was down as well. All back up and normal soon after that
Amazon provided more detail about the outage early Saturday morning. Turns out startups that were deployed across multiple availability zones were not affected. Anyone have any idea what this kind of redundancy costs?
Amazon has really started to annoy me these days. Apart from their immoral attempt to take over all things relating to buying and selling on the internet, they now cause havoc by taking half the internet down with them. They have also recently entered the B2B market and begun to compete with sites such as Thomasnet and Daily Sales Exchange, but as this news piece shows ( http://news.yahoo.com/amazon-getting-too-big-britches-160025093.html ), not all is well in the Amazon camp. Well, I’m happy about that then 🙂
Thank you Ruben for your suggestion. I have tested Lunacloud and it’s great (so far). I have run some tests and the performance is amazing 🙂
Thank you once again!!