AWS sneezed, and the internet caught a cold: inside the October 20th outage

  • DNS issues with DynamoDB endpoints in US‑East‑1 caused widespread global disruption.
  • Consumer apps, enterprise tools, finance, and media services went offline for several hours.
  • AWS powers about one‑third of global cloud infrastructure, including critical systems.
  • Outage highlights fragility of relying on a few hyperscalers.
  • Multi‑cloud and regional providers can offer resilience and sovereignty.

Yesterday, the internet had one of those “where were you when…” moments. Amazon Web Services (AWS)—the backbone of much of the modern web—suffered a major outage that rippled across industries, continents, and millions of users.

For CSPs, MSPs, and IT leaders, this wasn’t just another headline about downtime. It was a stark reminder of how deeply global digital infrastructure and operations depend upon a handful of hyperscalers—and what that means for resilience, sovereignty, and diversification.

What happened?

The disruption began in the early hours of October 20th, when AWS reported increased error rates and latency across multiple services in US-East-1, its northern Virginia region (incidentally its oldest and biggest). The culprit? DNS resolution issues tied to DynamoDB API endpoints, a critical piece of AWS’s infrastructure that underpins countless applications.

While Amazon engineers worked to mitigate the problem, the effects spread well beyond Virginia. By mid-morning US time, millions of users around the world were reporting problems, and the internet felt like it was running on half power.

Who was affected?

The short answer: almost everyone.
The long answer:

  • Consumer apps: Snapchat, Fortnite, Venmo, Roblox, Peloton, and even Amazon’s own Alexa assistant went dark or spluttered.
  • Enterprise tools: Zoom, Slack, and Canva—staples of the modern workplace—were hit hard.
  • Financial services: Banking platforms and payment apps like Venmo saw widespread disruption. UK banks Lloyds, Halifax, and Bank of Scotland were also affected, and even HMRC, the UK government’s tax, payments, and customs authority, experienced downtime.
  • Streaming and media: Disney+, Prime Video, and other entertainment platforms were knocked offline.

When AWS falters, it’s not just a few websites that start behaving like it’s the late Nineties—it sends shockwaves across the globe’s entire digital infrastructure and affects daily life.

How much of the internet runs on AWS?

Put simply, a lot… or too much, depending on who you ask. By most estimates, AWS accounts for roughly one-third of the global Cloud infrastructure market. That translates into thousands of businesses, from startups to Fortune 500 giants, building their operations on AWS.

Critical infrastructure, from healthcare platforms to government services, also leans heavily on AWS. That’s why even a few hours of downtime can feel like a global-scale event.

What is a DynamoDB API endpoint?

For those in the trenches of Cloud architecture, DynamoDB needs no introduction. It’s AWS’s fully managed NoSQL database service, prized for its scalability and low-latency performance.

The DynamoDB API endpoint is essentially the gateway through which applications communicate with the database. Every read, write, or query request flows through these endpoints. If DNS resolution for those endpoints fails—as it did on October 20th—applications can’t talk to their databases. The result? Apps stall, transactions fail, and services grind to a halt.
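To make that dependency concrete, here is a minimal Python sketch of the DNS lookup that has to succeed before any request can reach the database. It assumes the public regional hostname `dynamodb.us-east-1.amazonaws.com`; the function names `resolve_endpoint` and `endpoint_is_resolvable` and the `resolver` parameter are illustrative only, and this is not how the AWS SDKs or AWS's own systems are implemented.

```python
import socket
from typing import Callable, List

# Regional DynamoDB endpoint hostname for US-East-1 (assumed for illustration).
DYNAMODB_ENDPOINT = "dynamodb.us-east-1.amazonaws.com"


def resolve_endpoint(
    hostname: str,
    port: int = 443,
    resolver: Callable = socket.getaddrinfo,
) -> List[str]:
    """Return the IP addresses DNS gives for hostname, or [] if the lookup fails."""
    try:
        records = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # DNS resolution failed: the client has no address to connect to.
        return []
    # Each record is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({record[4][0] for record in records})


def endpoint_is_resolvable(
    hostname: str,
    resolver: Callable = socket.getaddrinfo,
) -> bool:
    """True when DNS returns at least one address for the endpoint."""
    return len(resolve_endpoint(hostname, resolver=resolver)) > 0
```

Calling `endpoint_is_resolvable(DYNAMODB_ENDPOINT)` performs a live lookup. When that lookup returns nothing, every read, write, or query fails before it leaves the client, even if the database behind the endpoint is healthy.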

For CSPs, this is a reminder that even the most robust managed services are only as resilient as their underlying control planes and DNS infrastructure.

The million-dollar question: are we too reliant on a few providers?

For CSPs, the October 20th outage underscores a structural vulnerability in the global internet: concentration risk. AWS, Microsoft Azure, and Google Cloud collectively account for the lion’s share of Cloud workloads worldwide. According to recent estimates by Synergy Research Group, Amazon’s Cloud infrastructure market share for Q2 2025 was 30%, followed by Microsoft Azure at 20% and Google Cloud at 13%, a combined 63% for the three. That dominance has obvious benefits (economies of scale, global reach, and constant innovation), but it also turns each provider into a single point of failure on a massive scale.

When AWS falters, CSPs and MSPs downstream are left in a difficult position. Even if your own infrastructure is healthy, your customers don’t see the distinction between your service and the hyperscaler you’re built on. To them, downtime is downtime. That means:

  • Accountability without control: You’re in the hot seat to explain outages you didn’t cause and can’t fix.
  • Erosion of trust: Clients may question why their critical workloads are tied so tightly to one provider.
  • Operational fragility: Outages in core services like DynamoDB, S3, or EC2 can cascade into failures across SaaS platforms, enterprise tools, and even IoT systems.

This raises a strategic question for the industry: is it sustainable for so much of the world’s digital infrastructure to hinge on three US-based hyperscalers? For CSPs, the answer isn’t just philosophical—it’s about risk management, customer retention, and long-term competitiveness.

The silver lining: opportunities for diversification

For CSPs and MSPs, every outage is also an opportunity to differentiate and add value. Clients are increasingly aware of the risks of hyperscaler concentration, and they’re looking to their providers for solutions. This is where diversification strategies come into play:

  • Multi-Cloud architectures: Position yourself as the partner who can design, implement, and manage workloads across multiple providers. This isn’t just about redundancy—it’s about optimizing for cost, compliance, and performance.
  • Regional and sovereign Clouds: Non-US providers, particularly in Europe and Asia, can leverage sovereignty and compliance as a competitive edge. For industries like healthcare, finance, and government, data residency and regulatory alignment are as important as uptime.
  • Edge and specialized services: Smaller providers can carve out niches by offering ultra-low-latency compute at the Edge, industry-specific compliance frameworks, or tailored managed services that hyperscalers can’t match at scale.
  • Advisory and trust: CSPs who can guide clients through the complexity of diversification—from workload portability to cost modeling—will become indispensable partners rather than commodity resellers.
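
As a rough illustration of the multi-Cloud point above, the following Python sketch reads from an ordered list of provider backends and falls back to the next one when a provider is unavailable. Everything here is hypothetical: `ProviderUnavailable`, `read_with_failover`, and the backend callables stand in for whatever client code a real deployment would use, and the sketch ignores the hard parts of multi-Cloud design, such as keeping data in sync across providers.

```python
from typing import Callable, Dict, Sequence, Tuple


class ProviderUnavailable(Exception):
    """Raised by a backend when its provider cannot serve the request."""


# A backend is a (provider name, fetch function) pair; both are placeholders.
Backend = Tuple[str, Callable[[str], str]]


def read_with_failover(key: str, backends: Sequence[Backend]) -> Tuple[str, str]:
    """Try each backend in order and return (provider name, value) from the first that answers."""
    errors: Dict[str, str] = {}
    for name, fetch in backends:
        try:
            return name, fetch(key)
        except ProviderUnavailable as exc:
            # Record the failure and move on to the next provider.
            errors[name] = str(exc)
    raise ProviderUnavailable(f"all providers failed: {errors}")
```

The value of the pattern lies in the ordering: a failure at the first provider costs one failed call rather than an outage, provided the second provider holds a usable copy of the data.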

In short, AWS’s stumble is a reminder that resilience is a powerful and sometimes-underrated differentiator. Providers who can help clients spread risk, maintain sovereignty, and build flexible architectures will not only weather these storms but also grow stronger because of them.

The October 20th AWS outage will be remembered as one of those moments when the internet collectively held its breath. For Cloud professionals, it’s a wake-up call: resilience isn’t just about uptime SLAs; it’s about strategy, diversification, and trust.

While Amazon is yet to release a full statement on what went wrong, the outage is certainly food for thought. Those who turn this disruption into an opportunity and help clients navigate a multi-Cloud world will do so by championing sovereignty—and by building the next generation of Cloud services that don’t grind to a halt when one hyperscaler sneezes.

Francesca Cotton
