AWS apologizes for February 28 outage, takes steps to prevent similar events

It appears a single incorrectly entered command set off a cascade of events that resulted in intermittent outages for Amazon clients ranging from government websites to music streaming services.

AWS took a lot of heat when its S3 storage service went down for several hours on Tuesday, and rightly so. Today the company published a post-mortem that explains what happened in technical detail and how it plans to prevent a similar event in the future.

Amazon Web Services, whose remote data centers power some of the world's most popular websites, experienced a major disruption lasting several hours on Tuesday that left numerous apps and websites, including Business Insider, hard to access for many users. With two of S3's subsystems knocked offline, Amazon said it was unable to handle any customer requests for S3 itself, or requests from services such as EC2 and Lambda functions that depend on S3.

Amazon.com's cloud computing unit said that the outage that shook up a sizable part of the internet Tuesday was caused by human error.

Other AWS services also rely on S3, so they were affected as well.

An enormous number of sites, including Airbnb, Business Insider, Expedia, Medium, Netflix, Quora, Slack, Trello, and the Securities and Exchange Commission, experienced issues related to the outage, VentureBeat reported at the time. Both of the affected S3 subsystems required a full restart.

"As of 1:49 PM PST, we are fully recovered for operations for adding new objects in S3, which was our last operation showing a high error rate".

AWS said it has modified its tooling so that it no longer allows "too much capacity to be removed too quickly."

That, according to AWS, should prevent an incorrect input from triggering another outage. To be fair, AWS outages like this one are extremely rare.

The team also reprioritized work to partition one of the affected subsystems into smaller "cells", which was planned for later this year but will now begin right away.

"We want to apologize for the impact this event caused for our customers".

"While we are proud of our long track record of availability with Amazon S3, we know how critical this service is to our customers, their applications and end users, and their businesses", the company wrote in an online message.
