There’s no denying AI’s warp-speed integration into global digital infrastructure. From search to cybersecurity and automation to analytics, its disruptive impact on our digital workflows and beyond has been nothing short of seismic.
None of this happens without the Cloud, of course. Cloud services are booming—but with this explosion in the volume and complexity of AI computing comes pressure. Cloud service providers (CSPs) need to accommodate demand while reining in the environmental impact of their expansion.
Even as the Cloud enables AI’s evolution, AI is pushing Cloud infrastructure to its limits.
Powering intelligence, consuming resources
AI is hungry… so very hungry. From training LLMs to processing real-time requests, it requires enormous computational horsepower, and with it, huge volumes of electricity. In 2023, AI workloads alone drew an estimated 4.5 gigawatts of power, according to the International Energy Agency (IEA). That's about 8% of total data center energy use, and the figure is only rising: the IEA expects it to quadruple by 2030.
AI's also thirsty. A study by the University of California, Riverside, projects that global AI could account for as much as 6.6 billion cubic meters of water withdrawal a year by 2027, mainly through cooling systems. That's roughly half the UK's annual water withdrawal.
Meanwhile, even the simplest ChatGPT input—like those thank-you messages we tack on to keep AI polite—uses around 10 times more energy than a Google search. Factor in a billion prompts a day and you're looking at 2.9 million kilowatt-hours of energy daily—enough to power 100,000 US households.
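That claim is easy to sanity-check. A quick back-of-envelope calculation, assuming roughly 2.9 Wh per prompt and around 29 kWh of daily consumption for an average US household (both ballpark assumptions, not measured values):

```python
# Back-of-envelope energy math for a day of ChatGPT-style prompts.
# All figures are rough assumptions for illustration.
WH_PER_PROMPT = 2.9              # ~10x a Google search (~0.3 Wh)
PROMPTS_PER_DAY = 1_000_000_000  # one billion prompts
HOUSEHOLD_KWH_PER_DAY = 29       # avg US household (~10,500 kWh/year)

daily_kwh = WH_PER_PROMPT * PROMPTS_PER_DAY / 1_000  # Wh -> kWh
households = daily_kwh / HOUSEHOLD_KWH_PER_DAY

print(f"{daily_kwh:,.0f} kWh per day")      # 2,900,000 kWh per day
print(f"~{households:,.0f} US households")  # ~100,000 households
```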
Even the chips powering our everyday AI queries come at a steep cost. GPUs and other AI accelerators depend on rare-earth elements and critical minerals, like cobalt, nickel, lithium and neodymium, to function. Their extraction and production are highly resource-intensive.
Exact stats are hard to come by, but it takes something like 120 kWh of energy in mineral extraction alone to produce a single NVIDIA H100 AI accelerator. Put that equivalent charge in your electric car and it'll get you from San Francisco to Los Angeles.
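The same napkin math applies here, assuming a typical EV consumes around 0.3 kWh per mile and San Francisco to Los Angeles is roughly 380 miles (both assumed figures):

```python
# Embodied extraction energy of one H100, expressed as EV range.
H100_EXTRACTION_KWH = 120  # rough estimate cited above
EV_KWH_PER_MILE = 0.3      # assumed typical EV efficiency
SF_TO_LA_MILES = 380       # approximate driving distance

range_miles = H100_EXTRACTION_KWH / EV_KWH_PER_MILE
print(f"~{range_miles:.0f} miles of range")  # ~400 miles, > SF-LA
```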
At hyperscale, the resource toll is staggering.
Chips and chill: what Cloud providers can do
Clearly, CSPs need to be mindful of these figures. Performance and environmental cost are now fundamental design parameters for Cloud infrastructure. Here's how to boost capacity while minimizing environmental impact.
1. Smarter silicon: efficiency gains from AI-optimized chips
Not all compute workloads are created equal—and neither are the chips that power them. As demand for AI workloads explodes, CSPs need to deliver massive performance gains without sending power bills through the roof. That’s where specialized accelerators like NVIDIA’s H100 GPU, AMD’s MI300, and Google’s TPU v5p are your friends.
Unlike general-purpose CPUs, they’re purpose-built to crunch through large-scale training and inference tasks. While an H100, for example, can draw up to 700 watts under load, its ability to slash training times makes it an energy and cost winner over time. Training a 7B GPT model on H100s using FP8 precision was three times faster than on NVIDIA A100s using BF16 precision.
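To see why a hotter-running chip can still come out ahead, run the rough energy-per-job numbers. The 700 W draw and 3x speedup come from above; the ~400 W A100 figure and the 100-hour run length are illustrative assumptions:

```python
# Energy per training run: higher power draw, much shorter runtime.
A100_WATTS = 400             # assumed typical A100 draw under load
H100_WATTS = 700             # H100 draw under load (per text)
A100_HOURS = 100             # illustrative training run length
H100_HOURS = A100_HOURS / 3  # 3x FP8 speedup (per text)

a100_kwh = A100_WATTS * A100_HOURS / 1_000
h100_kwh = H100_WATTS * H100_HOURS / 1_000
print(f"A100: {a100_kwh:.0f} kWh, H100: {h100_kwh:.0f} kWh")
# A100: 40 kWh, H100: 23 kWh -> ~40% less energy despite 75% more power
```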
In hyperscale environments, those time and energy savings translate directly to bottom-line benefits. Microsoft claims Azure AI Foundry, running Llama 70B models with TensorRT-LLM optimizations, achieved 45% higher throughput and a significant reduction in cost per token, particularly for inference workloads. So that’s a clear win on both performance and operational cost fronts.
For CSPs running serious AI workloads, the question is no longer whether to use accelerators—it’s how fast you can amortize the cost through throughput gains. Being faster and leaner will pay off quicker than you think.
2. CRACs of doom: air cooling has had its day
Traditional air-cooled CRAC (computer room air conditioner) systems simply can't cope with AI-scale workloads. Alternatives such as direct-to-chip (DTC) liquid cooling and immersion cooling offer significant performance and sustainability benefits:
● DTC systems are 50–1000 times more efficient than air cooling at heat transfer (a rough sketch below shows why)
● Immersion cooling can cut overall cooling energy use by more than 90%
● Liquid systems reduce airflow demand, so less energy goes toward running fans
These technologies are not speculative. IBM’s Aquasar supercomputer uses hot-water cooling to slash emissions by 85% while recycling 80% of heat; and Meta, Microsoft, and Baidu are already incorporating these improvements into their new builds.
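That 50–1000x heat-transfer figure in the first bullet follows from basic convection physics: the heat removed scales with the convective heat-transfer coefficient. A minimal sketch using Newton's law of cooling, q = h × A × ΔT, with assumed textbook-range coefficients:

```python
# Why liquid beats air at heat transfer (Newton's law of cooling:
# q = h * A * dT). Coefficients are assumed textbook ranges.
H_AIR_FORCED = 50       # W/(m^2 K), forced-air convection
H_WATER_FORCED = 5_000  # W/(m^2 K), forced liquid (e.g., cold plate)

AREA_M2 = 0.01          # assumed cold-plate contact area per chip
DELTA_T_K = 40          # assumed chip-to-coolant temperature delta

q_air = H_AIR_FORCED * AREA_M2 * DELTA_T_K
q_liquid = H_WATER_FORCED * AREA_M2 * DELTA_T_K
print(f"air: {q_air:.0f} W, liquid: {q_liquid:.0f} W "
      f"({q_liquid / q_air:.0f}x)")  # air: 20 W, liquid: 2000 W (100x)
```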
3. Modular and disaggregated data center design
There's no denying the demand, but quantifying what that means in terms of actual data center new builds isn't easy. Build too much, too soon and you'll have a bunch of dormant, overcooled hardware. Modular data centers enable CSPs to add capacity incrementally, minimizing idle overheads and underused cooling. The power usage effectiveness (PUE) of older, non-modular data centers can be as high as 2.0—but modular setups like Hewlett Packard's EcoPOD can hit a PUE as low as 1.05.
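PUE is simply total facility energy divided by the energy that actually reaches the IT equipment, so the gap between 2.0 and 1.05 is dramatic. A quick illustration for a hypothetical 1 MW IT load:

```python
# PUE = total facility power / IT equipment power.
# Overhead (cooling, power conversion, lighting) = IT load * (PUE - 1).
IT_LOAD_KW = 1_000  # hypothetical 1 MW of IT equipment

for pue in (2.0, 1.05):
    overhead_kw = IT_LOAD_KW * (pue - 1)
    print(f"PUE {pue}: {overhead_kw:.0f} kW of non-IT overhead")
# PUE 2.0: 1000 kW of overhead; PUE 1.05: 50 kW -> a 95% reduction
```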
Spurred on by the Open Compute Project (OCP), which catalyzed the adoption of disaggregated, energy-efficient hardware for data center infrastructure, the likes of Meta, Intel, and Google now use OCP-conformant servers to optimize airflow and thermal performance at the rack level.
Architecting efficient AI workloads
1. Model compression and pruning
Savvy CSPs aren't just focusing on running AI at larger scale—they're making it leaner as well. CSPs can help clients reduce the energy cost of AI by enabling lighter models: prune and distill a model and you can reduce inference compute by up to 60% with little to no loss in accuracy.
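Here's a minimal sketch of magnitude pruning using PyTorch's built-in utilities. The 60% sparsity mirrors the figure above but is an assumption; real savings depend on the model, the task, and whether your runtime can actually exploit sparsity (distillation is a separate step, not shown):

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# A toy network standing in for a real inference model.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

# Zero out the 60% of weights with the smallest L1 magnitude, per layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.6)
        prune.remove(module, "weight")  # bake the zeros into the weights

zeros = sum((p == 0).sum().item() for p in model.parameters())
total = sum(p.numel() for p in model.parameters())
print(f"overall sparsity: {zeros / total:.0%}")  # ~60% of parameters
```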
2. Hybrid compute scheduling
Mixing low-power cores with GPUs and scheduling dynamically, based on real-time load, saves energy. Google and Microsoft are already ditching static allocation models in favor of predictive workload allocation. Microsoft claims this tactic has reduced underutilization by around 16%.
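Neither company publishes its scheduler internals, so what follows is only a hypothetical sketch of the idea: route each job to low-power cores or the GPU pool based on a predicted load signal rather than a static assignment. Every name and threshold here is invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    est_flops: float       # estimated compute demand
    latency_sensitive: bool

def predict_gpu_load() -> float:
    """Stand-in for a real forecast built from recent telemetry."""
    return 0.72  # hypothetical: GPU pool predicted 72% busy

def place(job: Job, gpu_load: float) -> str:
    # Heavy or latency-critical work belongs on GPUs; small batch
    # jobs fall back to efficiency cores when GPUs run hot, rather
    # than padding out underutilized accelerators.
    if job.est_flops > 1e12 or job.latency_sensitive:
        return "gpu-pool"
    return "efficiency-cores" if gpu_load > 0.6 else "gpu-pool"

job = Job("nightly-embeddings", est_flops=5e9, latency_sensitive=False)
print(place(job, predict_gpu_load()))  # -> efficiency-cores
```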
3. Efficient data pipelines
Data curation and pipeline design matter. Intelligent tiering, deduplication, and high-efficiency formats like Parquet can dramatically cut I/O and storage overhead. They also reduce the need to move data between hot and cold layers, lowering both energy and cost.
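As a small illustration of the format point, here's how a pipeline might land raw CSV as deduplicated, compressed Parquet using pyarrow (paths are placeholders; actual savings vary with the data):

```python
import pyarrow.csv as pa_csv
import pyarrow.parquet as pq

# Read a raw CSV drop (placeholder path).
table = pa_csv.read_csv("raw/events.csv")

# Drop exact-duplicate rows by grouping on every column.
table = table.group_by(table.column_names).aggregate([])

# Land it as columnar, zstd-compressed Parquet. Column pruning and
# predicate pushdown on later reads are where the I/O savings come from.
pq.write_table(table, "curated/events.parquet", compression="zstd")
```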
What legislation is coming?
The days of sustainability being a “nice to have” are behind us. Compliance requirements are growing, and CSPs are increasingly on the hook.
Europe: The Corporate Sustainability Reporting Directive (CSRD) and the associated European Sustainability Reporting Standards (ESRS) require companies operating in the EU to disclose energy, emissions and water use—and that includes their Cloud-based operations.
US: In states such as California, stricter energy efficiency standards for data centers and technology infrastructure are being implemented. Companies like Microsoft, Google and Amazon are all pursuing net-zero and water-positive targets.
Global: UNEP-backed initiatives such as the Coalition for Sustainable AI are developing voluntary global standards on carbon disclosure, water metrics, and green compute practices.
Those CSPs that can stay one step ahead of regulations stand to gain a reputational and operational advantage.
Sustainability as a strategic differentiator
Cloud service providers are the infrastructure layer supporting the world’s AI evolution. But with that role comes great responsibility. Customers are demanding transparency, resilience, and environmental accountability alongside top-notch services.
By building green by design—and by continually optimizing cooling, compute, and data architectures—CSPs can lead the way in balancing intelligence with impact. The market is ready. The technology exists. The only question is who moves first.