When ChatGPT suffered a global outage on June 10, 2025, Americans ran more than 500,000 Google searches about the problem, briefly making it the second most-searched topic in the country. That volume works out to roughly one in every 680 Americans actively seeking information about what went wrong.
What seemed like a few hours of downtime revealed a stark fact: many software developers now rely completely on AI to do their coding work.
That’s not inherently bad. But it’s worth understanding where AI actually helps and where it just creates expensive new problems.
Automating the tedious
AI shines at the tedious stuff nobody wants to do. Not the interesting architectural decisions. The boring, repetitive tasks that make experienced developers want to quit and become baristas. This includes:
- Boilerplate generation. Need test cases? JSON serialization code? AI cranks it out faster than you can copy-paste from Stack Overflow. These are perfect automation candidates because they’re well-defined and low-risk (see the first sketch after this list).
- Syntax wrestling. Debugging SQL queries or building complex regex patterns? AI saves hours of documentation diving. It’s particularly good at the “I know what I want but can’t remember the exact syntax” problems (second sketch below).
- Project scaffolding. AI excels at brainstorming ideas and drafting mostly complete Terraform configurations to get projects started. It provides solid foundations you can build on.
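To make the boilerplate case concrete, here’s the sort of round-trip serialization code an assistant handles well. The `User` model is invented for illustration; the point is that the shape of the code is entirely predictable:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class User:
    id: int
    name: str
    email: str

def user_to_json(user: User) -> str:
    """Serialize a User to a JSON string."""
    return json.dumps(asdict(user))

def user_from_json(payload: str) -> User:
    """Deserialize a JSON string back into a User."""
    return User(**json.loads(payload))

# Round-trip check -- exactly the low-risk, well-defined test AI generates well.
original = User(id=1, name="Ada", email="ada@example.com")
assert user_from_json(user_to_json(original)) == original
```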
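And here’s the syntax-wrestling case. The pattern and test string are illustrative, but they’re typical of the “just give me the regex” asks an assistant nails in seconds:

```python
import re

# A typical "I know what I want but can't remember the syntax" request:
# match ISO-8601 dates (YYYY-MM-DD) in free text.
ISO_DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")

text = "Deployed 2025-06-10, rolled back 2025-06-11 after the outage."
print(ISO_DATE.findall(text))  # [('2025', '06', '10'), ('2025', '06', '11')]
```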
Code review and quality assurance: good with human judgment in the loop
AI tools are increasingly valuable for catching common bugs, enforcing coding standards, and suggesting improvements during code reviews.
GitHub Copilot and similar tools can flag potential security vulnerabilities, suggest performance optimizations, and ensure consistent formatting across teams. However, they excel at surface-level issues; deeper architectural problems that hinge on business logic and system design still require human judgment.
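A quick illustration of that line (both functions are invented for the example). The first issue is the kind of surface-level bug AI reviewers flag reliably; the second is syntactically spotless but hides a business-rule question no model can answer for you:

```python
# Surface-level issue an AI reviewer reliably flags: a mutable default
# argument, silently shared across every call that omits `tags`.
def create_order(item_id, tags=[]):   # flagged: use tags=None instead
    tags.append("new")
    return {"item": item_id, "tags": tags}

# Deeper issue that needs human judgment: is a 10% discount for orders
# over $100 actually the business rule, or should the threshold be $1,000?
# The code reviews clean either way.
def apply_discount(total):
    return total * 0.9 if total > 100 else total
```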
Infrastructure as code: context is everything
LLMs can generate Terraform and CloudFormation scripts efficiently. But there’s a catch that bites most teams.
As one senior engineer at Microsoft put it: AI works great for open-source projects but struggles with internal services because it “hasn’t been trained on the internal codebase or any of these SDKs.” You get “context-blind” errors that look plausible but don’t actually work.
Claude 3 often performs better out-of-the-box for infrastructure code. When developers tested IaC generation, Claude’s drafts “needed the least tweaking.” Most teams use GPT-4 for first drafts, then iterate manually. Smart teams build RAG pipelines to inject context about their actual environments.
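Here’s a deliberately minimal sketch of that “inject context” idea, with a toy keyword retriever standing in for real embedding search and made-up internal docs. A production pipeline would use a vector store and your actual module repos:

```python
# Before asking an LLM to write Terraform, retrieve snippets from your
# internal module docs and prepend them to the prompt. Everything here
# is a simplified stand-in for an embedding-based RAG pipeline.

INTERNAL_DOCS = {  # hypothetical internal knowledge base
    "modules/vpc.md": "Our VPC module requires team_tag and cost_center on every resource.",
    "modules/rds.md": "Internal RDS module pins engine_version and enforces encryption.",
}

def retrieve(query: str, k: int = 2) -> list[str]:
    """Naive keyword-overlap scoring; a real pipeline would use embeddings."""
    words = set(query.lower().split())
    scored = sorted(
        INTERNAL_DOCS.items(),
        key=lambda kv: len(words & set(kv[1].lower().split())),
        reverse=True,
    )
    return [f"# {path}\n{text}" for path, text in scored[:k]]

def build_prompt(request: str) -> str:
    """Assemble an LLM prompt grounded in retrieved internal conventions."""
    context = "\n\n".join(retrieve(request))
    return (
        "Use ONLY the internal module conventions below.\n\n"
        f"{context}\n\nTask: {request}"
    )

print(build_prompt("Terraform for a new VPC with our tagging standards"))
```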
Log analysis and incident response
This is where things get interesting. Models like Claude 3.7 can process massive amounts of logs and spot patterns you might miss. The large context window is a game-changer for analyzing complex systems.
But here’s the problem: when an AI hallucinates during a critical outage, it can send your on-call engineer down a rabbit hole in the middle of the night. The smart approach is to use AI as a conversation starter to get a quick summary and some hypotheses, but always verify against the raw data.
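In practice, that verification step can be as mundane as grepping the raw logs for whatever signature the model claims to have found. Everything in this sketch — the log lines and the claimed signature — is invented for illustration:

```python
import re

raw_logs = """\
2025-06-10T02:14:03Z api-7f9 ERROR upstream timeout after 30s
2025-06-10T02:14:05Z api-7f9 ERROR upstream timeout after 30s
2025-06-10T02:14:09Z worker-2 WARN retry queue depth 1042
"""

# Suppose the model's summary claims: "repeated upstream timeouts on api-7f9".
# Before acting on it, confirm the claim against the raw data.
claimed_signature = r"api-7f9 ERROR upstream timeout"
hits = re.findall(claimed_signature, raw_logs)

if hits:
    print(f"Verified: {len(hits)} matching lines -- the hypothesis is grounded.")
else:
    print("No matches -- likely a hallucination; don't chase it at 3 a.m.")
```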
Documentation: useful but requires curation
All major LLMs can generate documentation from code, summarize technical discussions, and create runbooks from incident response chats. GPT-4o’s speed makes it particularly good for real-time summarization.
The challenge is avoiding volumes of low-quality “fluff” documentation. AI generates text that sounds professional but often lacks the practical insights that make documentation actually useful.
Always have human experts review and edit for clarity, accuracy, and conciseness. AI can handle the grunt work of initial drafts, but the value comes from human curation.
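One workable shape for that workflow is sketched below, assuming the OpenAI Python SDK (v1.x) with an API key in your environment. The model name, system prompt, and draft marker are illustrative choices, not a prescribed setup:

```python
from openai import OpenAI  # assumes the OpenAI Python SDK v1.x; key read from env

client = OpenAI()

def draft_runbook(incident_chat: str) -> str:
    """Turn an incident chat transcript into a runbook DRAFT for human review."""
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[
            {
                "role": "system",
                "content": "Summarize this incident chat into a step-by-step "
                           "runbook draft. Mark anything uncertain as TODO.",
            },
            {"role": "user", "content": incident_chat},
        ],
    )
    # The marker keeps AI drafts from quietly shipping as finished docs.
    return "<!-- DRAFT: requires human review -->\n" + response.choices[0].message.content
```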
Cloud cost optimization: the wild west
Using LLMs for cost savings is still an emerging and experimental area.
In theory, an LLM could analyze billing logs and usage metrics to identify inefficiencies. Some AI-driven tools are beginning to incorporate these capabilities. Google’s Gemini Cloud Assist, for example, includes features that suggest improvements based on a user’s cost versus performance priorities.
In practice, there are no established benchmarks for FinOps tasks. While you can use GPT-4 to parse cost data, it will likely provide only generic advice without task-specific fine-tuning.
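Until that changes, plain deterministic analysis often beats asking a model. A toy sketch, with invented billing rows and thresholds: flag resources with high spend and near-zero usage before you ever involve an LLM, so any conversation starts from concrete findings rather than generic advice:

```python
import csv
import io

# Hypothetical billing export: service, resource_id, usage_hours, cost_usd
billing_csv = """service,resource_id,usage_hours,cost_usd
ec2,i-0abc,720,412.30
ec2,i-0def,3,389.90
rds,db-prod,720,998.10
"""

rows = list(csv.DictReader(io.StringIO(billing_csv)))

# Deterministic heuristic: high spend with low usage suggests an idle or
# oversized resource -- a concrete finding, not generic optimization advice.
for r in rows:
    usage = float(r["usage_hours"])
    cost = float(r["cost_usd"])
    if usage < 24 and cost > 100:
        print(f"Flag {r['resource_id']}: ${cost:.2f} for {usage:.0f}h "
              f"(~${cost / max(usage, 1):.2f}/h)")
```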
LLM-driven cost optimization is a promising but still developing field. For now, it’s best to rely on the native cost management tools provided by cloud platforms, which are increasingly incorporating their own AI-driven recommendations.
Final thoughts
This brings us to the most important point. AI doesn’t fix a skills gap; it widens it. As seasoned developers have pointed out, “AI can magnify the level of competence, it doesn’t necessarily improve it.”
The blunt reality is that AI helps crappy developers output more crappy code, and excellent developers output more excellent code.
An LLM is a tool. It’s the engineer who provides the critical thinking, the architectural vision, and the domain knowledge. If you think good developers are expensive, wait until you see the damage a bad developer can do with an AI that lets them write broken code at ten times the speed.
Your job isn’t going away. It’s becoming even more critical. You are the expert who must set the vision, guide the implementation, and, most importantly, have the wisdom to know when the AI is wrong. The future isn’t about letting AI take the wheel; it’s about being the expert driver who knows how to use a powerful, and sometimes dangerous, new engine.
Learn more at CloudFest
CloudFest, the world’s premier cloud infrastructure festival, offered top-tier sessions this year on how to leverage AI effectively in cloud-native workflows. From mastering cost optimization to architecting AI-driven pipelines, it’s the go-to event for real-world insights (and yes, a great vibe too).
Check out the CloudFest 2025 AI Sessions for more.