There’s no denying AI’s warp-speed integration into global digital infrastructure. From search to cybersecurity and automation to analytics, its disruptive impact on our digital workflows and beyond has been nothing short of seismic.
None of this happens without the Cloud, of course. Cloud services are booming—but with this explosion in the volume and complexity of AI computing comes pressure. Cloud service providers (CSPs) need to accommodate demand while reining in the environmental impact of their expansion.
Even as the Cloud enables AI’s evolution, AI is pushing Cloud infrastructure to its limits.
Powering intelligence, consuming resources
AI is hungry… so very hungry. From training LLMs to processing real-time requests, it requires enormous computational horsepower, and with it, huge volumes of electricity. In 2023, AI workloads alone drew an estimated 4.5 gigawatts of power, according to the International Energy Agency (IEA). That's about 8% of total data center energy use, and the figure is only rising: the IEA expects it to quadruple by 2030.
AI's also thirsty. A study by the University of California, Riverside, projects that global AI could account for as much as 6.6 billion cubic meters of water withdrawal a year by 2027, mainly through cooling systems. That's roughly half the UK's annual water withdrawal.
Meanwhile, even the simplest ChatGPT input—like those thank-you messages we tack on to keep AI polite—uses around 10 times more energy than a Google search. Factor in a billion prompts a day and you're looking at 2.9 million kilowatt-hours of energy daily—enough to power 100,000 US households.
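That claim is easy to sanity-check. A quick back-of-envelope calculation, assuming roughly 2.9 Wh per prompt and around 29 kWh of daily consumption for an average US household (both ballpark assumptions, not measured values):

```python
# Back-of-envelope energy math for a day of ChatGPT-style prompts.
# All figures are rough assumptions for illustration.
WH_PER_PROMPT = 2.9              # ~10x a Google search (~0.3 Wh)
PROMPTS_PER_DAY = 1_000_000_000  # one billion prompts
HOUSEHOLD_KWH_PER_DAY = 29       # avg US household (~10,500 kWh/year)

daily_kwh = WH_PER_PROMPT * PROMPTS_PER_DAY / 1_000  # Wh -> kWh
households = daily_kwh / HOUSEHOLD_KWH_PER_DAY

print(f"{daily_kwh:,.0f} kWh per day")      # 2,900,000 kWh per day
print(f"~{households:,.0f} US households")  # ~100,000 households
```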
Even the chips powering our everyday AI queries come at a steep cost. GPUs and other AI accelerators depend on rare-earth elements and critical minerals, like cobalt, nickel, lithium and neodymium, to function. Their extraction and production are highly resource-intensive.
Exact stats are hard to come by, but it takes something like 120 kWh of energy in mineral extraction alone to produce a single NVIDIA H100 AI accelerator. Put that equivalent charge in your electric car and it'll get you from San Francisco to Los Angeles.
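The same napkin math applies here, assuming a typical EV consumes around 0.3 kWh per mile and San Francisco to Los Angeles is roughly 380 miles (both assumed figures):

```python
# Embodied extraction energy of one H100, expressed as EV range.
H100_EXTRACTION_KWH = 120  # rough estimate cited above
EV_KWH_PER_MILE = 0.3      # assumed typical EV efficiency
SF_TO_LA_MILES = 380       # approximate driving distance

range_miles = H100_EXTRACTION_KWH / EV_KWH_PER_MILE
print(f"~{range_miles:.0f} miles of range")  # ~400 miles, > SF-LA
```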
At hyperscale, the resource toll is staggering.
Chips and chill: what Cloud providers can do
Clearly, CSPs need to be mindful of these figures. Performance and environmental cost are now fundamental design parameters for Cloud infrastructure. Here's how to boost capacity while minimizing environmental impact.
1. Smarter silicon: efficiency gains from AI-optimized chips
Not all compute workloads are created equal—and neither are the chips that power them. As demand for AI workloads explodes, CSPs need to deliver massive performance gains without sending power bills through the roof. That’s where specialized accelerators like NVIDIA’s H100 GPU, AMD’s MI300, and Google’s TPU v5p are your friends.
Unlike general-purpose CPUs, they’re purpose-built to crunch through large-scale training and inference tasks. While an H100, for example, can draw up to 700 watts under load, its ability to slash training times makes it an energy and cost winner over time. Training a 7B GPT model on H100s using FP8 precision was three times faster than on NVIDIA A100s using BF16 precision.
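To see why a hotter-running chip can still come out ahead, run the rough energy-per-job numbers. The 700 W draw and 3x speedup come from above; the ~400 W A100 figure and the 100-hour run length are illustrative assumptions:

```python
# Energy per training run: higher power draw, much shorter runtime.
A100_WATTS = 400             # assumed typical A100 draw under load
H100_WATTS = 700             # H100 draw under load (per text)
A100_HOURS = 100             # illustrative training run length
H100_HOURS = A100_HOURS / 3  # 3x FP8 speedup (per text)

a100_kwh = A100_WATTS * A100_HOURS / 1_000
h100_kwh = H100_WATTS * H100_HOURS / 1_000
print(f"A100: {a100_kwh:.0f} kWh, H100: {h100_kwh:.0f} kWh")
# A100: 40 kWh, H100: 23 kWh -> ~40% less energy despite 75% more power
```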
In hyperscale environments, those time and energy savings translate directly to bottom-line benefits. Microsoft claims Azure AI Foundry, running Llama 70B models with TensorRT-LLM optimizations, achieved 45% higher throughput and a significant reduction in cost per token, particularly for inference workloads. So that’s a clear win on both performance and operational cost fronts.
For CSPs running serious AI workloads, the question is no longer whether to use accelerators—it’s how fast you can amortize the cost through throughput gains. Being faster and leaner will pay off quicker than you think.
2. CRACs of doom: air cooling has had its day
Traditional air-cooled CRAC (computer room air conditioner) systems simply can't cope with AI-scale workloads. Alternatives such as direct-to-chip (DTC) liquid cooling and immersion cooling offer significant performance and sustainability benefits:
● DTC systems are 50–1000 times more efficient than air cooling at heat transfer (a rough sketch below shows why)
● Immersion cooling can cut overall cooling energy use by more than 90%
● Liquid systems reduce airflow demand, so less energy goes toward running fans
These technologies are not speculative. IBM’s Aquasar supercomputer uses hot-water cooling to slash emissions by 85% while recycling 80% of heat; and Meta, Microsoft, and Baidu are already incorporating these improvements into their new builds.
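That 50–1000x heat-transfer figure in the first bullet follows from basic convection physics: the heat removed scales with the convective heat-transfer coefficient. A minimal sketch using Newton's law of cooling, q = h × A × ΔT, with assumed textbook-range coefficients:

```python
# Why liquid beats air at heat transfer (Newton's law of cooling:
# q = h * A * dT). Coefficients are assumed textbook ranges.
H_AIR_FORCED = 50       # W/(m^2 K), forced-air convection
H_WATER_FORCED = 5_000  # W/(m^2 K), forced liquid (e.g., cold plate)

AREA_M2 = 0.01          # assumed cold-plate contact area per chip
DELTA_T_K = 40          # assumed chip-to-coolant temperature delta

q_air = H_AIR_FORCED * AREA_M2 * DELTA_T_K
q_liquid = H_WATER_FORCED * AREA_M2 * DELTA_T_K
print(f"air: {q_air:.0f} W, liquid: {q_liquid:.0f} W "
      f"({q_liquid / q_air:.0f}x)")  # air: 20 W, liquid: 2000 W (100x)
```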
3. Modular and disaggregated data center design
There's no denying the demand, but quantifying what that means in terms of actual data center new builds isn't easy. Build too much, too soon and you'll have a bunch of dormant, overcooled hardware. Modular data centers enable CSPs to add capacity incrementally, minimizing idle overheads and underused cooling. The power usage effectiveness (PUE) of older, non-modular data centers can be as high as 2.0—but modular setups like Hewlett Packard's EcoPOD can hit a PUE as low as 1.05.
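PUE is simply total facility energy divided by the energy that actually reaches the IT equipment, so the gap between 2.0 and 1.05 is dramatic. A quick illustration for a hypothetical 1 MW IT load:

```python
# PUE = total facility power / IT equipment power.
# Overhead (cooling, power conversion, lighting) = IT load * (PUE - 1).
IT_LOAD_KW = 1_000  # hypothetical 1 MW of IT equipment

for pue in (2.0, 1.05):
    overhead_kw = IT_LOAD_KW * (pue - 1)
    print(f"PUE {pue}: {overhead_kw:.0f} kW of non-IT overhead")
# PUE 2.0: 1000 kW of overhead; PUE 1.05: 50 kW -> a 95% reduction
```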
Spurred on by the Open Compute Project (OCP), which catalyzed the adoption of disaggregated, energy-efficient hardware for data center infrastructure, the likes of Meta, Intel, and Google now use OCP-conformant servers to optimize airflow and thermal performance at the rack level.
Architecting efficient AI workloads
1. Model compression and pruning
Savvy CSPs aren't just focusing on running AI at larger scale—they're making it leaner as well. CSPs can help clients reduce the energy cost of AI by enabling lighter models: prune and distill a model and you can reduce inference compute by up to 60% with little to no loss in accuracy.
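Here's a minimal sketch of magnitude pruning using PyTorch's built-in utilities. The 60% sparsity mirrors the figure above but is an assumption; real savings depend on the model, the task, and whether your runtime can actually exploit sparsity (distillation is a separate step, not shown):

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# A toy network standing in for a real inference model.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

# Zero out the 60% of weights with the smallest L1 magnitude, per layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.6)
        prune.remove(module, "weight")  # bake the zeros into the weights

zeros = sum((p == 0).sum().item() for p in model.parameters())
total = sum(p.numel() for p in model.parameters())
print(f"overall sparsity: {zeros / total:.0%}")  # ~60% of parameters
```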
2. Hybrid compute scheduling
Mixing low-power cores with GPUs and scheduling dynamically, based on real-time load, saves energy. Google and Microsoft are already ditching static allocation models in favor of predictive workload allocation. Microsoft claims this tactic has reduced underutilization by around 16%.
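Neither company publishes its scheduler internals, so what follows is only a hypothetical sketch of the idea: route each job to low-power cores or the GPU pool based on a predicted load signal rather than a static assignment. Every name and threshold here is invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    est_flops: float       # estimated compute demand
    latency_sensitive: bool

def predict_gpu_load() -> float:
    """Stand-in for a real forecast built from recent telemetry."""
    return 0.72  # hypothetical: GPU pool predicted 72% busy

def place(job: Job, gpu_load: float) -> str:
    # Heavy or latency-critical work belongs on GPUs; small batch
    # jobs fall back to efficiency cores when GPUs run hot, rather
    # than padding out underutilized accelerators.
    if job.est_flops > 1e12 or job.latency_sensitive:
        return "gpu-pool"
    return "efficiency-cores" if gpu_load > 0.6 else "gpu-pool"

job = Job("nightly-embeddings", est_flops=5e9, latency_sensitive=False)
print(place(job, predict_gpu_load()))  # -> efficiency-cores
```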
3. Efficient data pipelines
Data curation and pipeline design matter. Intelligent tiering, deduplication, and high-efficiency formats like Parquet can dramatically cut I/O and storage overhead. They also reduce the need to move data between hot and cold layers, lowering both energy and cost.
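As a small illustration of the format point, here's how a pipeline might land raw CSV as deduplicated, compressed Parquet using pyarrow (paths are placeholders; actual savings vary with the data):

```python
import pyarrow.csv as pa_csv
import pyarrow.parquet as pq

# Read a raw CSV drop (placeholder path).
table = pa_csv.read_csv("raw/events.csv")

# Drop exact-duplicate rows by grouping on every column.
table = table.group_by(table.column_names).aggregate([])

# Land it as columnar, zstd-compressed Parquet. Column pruning and
# predicate pushdown on later reads are where the I/O savings come from.
pq.write_table(table, "curated/events.parquet", compression="zstd")
```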
What legislation is coming?
The days of sustainability being a “nice to have” are behind us. Compliance requirements are growing, and CSPs are increasingly on the hook.
Europe: The Corporate Sustainability Reporting Directive (CSRD) and the associated European Sustainability Reporting Standards (ESRS) require companies operating in the EU to disclose energy, emissions and water use—and that includes their Cloud-based operations.
US: In states such as California, stricter energy efficiency standards for data centers and technology infrastructure are being implemented. Companies like Microsoft, Google and Amazon are all pursuing net-zero and water-positive targets.
Global: UNEP-backed initiatives such as the Coalition for Sustainable AI are developing voluntary global standards on carbon disclosure, water metrics, and green compute practices.
Those CSPs that can stay one step ahead of regulations stand to gain a reputational and operational advantage.
Sustainability as a strategic differentiator
Cloud service providers are the infrastructure layer supporting the world’s AI evolution. But with that role comes great responsibility. Customers are demanding transparency, resilience, and environmental accountability alongside top-notch services.
By building green by design—and by continually optimizing cooling, compute, and data architectures—CSPs can lead the way in balancing intelligence with impact. The market is ready. The technology exists. The only question is who moves first.