AWS and OpenAI strike $38B partnership to power next-gen AI workloads despite reliability concerns

Amazon Web Services and OpenAI have entered into a massive multi-year partnership that could reshape the cloud landscape. The $38 billion deal gives OpenAI access to AWS’s world-class compute infrastructure to run its most advanced artificial intelligence workloads. The agreement spans seven years and begins immediately, allowing OpenAI to ramp up its use of Amazon EC2 UltraServers powered by NVIDIA GB200 and GB300 GPUs.

These UltraServers can cluster hundreds of thousands of NVIDIA GPUs, offering OpenAI an enormous pool of compute power for training and serving its next-generation models. The system can also scale to tens of millions of CPUs, enabling the company to handle the massive AI workloads that power tools like ChatGPT and future agentic systems.

Sam Altman, OpenAI’s co-founder and CEO, said that scaling frontier AI requires “massive, reliable compute.” He emphasized that the partnership with AWS strengthens the broader compute ecosystem needed to advance AI accessibility. But reliability may be the key word here, and not necessarily for the right reasons.

AWS has recently faced global reliability issues that rattled businesses, governments, and end users alike. An October outage rooted in its us-east-1 region disrupted websites, apps, and connected devices across the world, leaving many questioning whether AWS can deliver the kind of uptime that OpenAI’s always-on services require. If a future incident were to impact ChatGPT or related systems, it could highlight the risk of depending too heavily on a single cloud provider.

Despite that, AWS remains the largest cloud infrastructure provider and has extensive experience running AI workloads securely and at scale. CEO Matt Garman said AWS’s infrastructure “will serve as a backbone” for OpenAI’s AI ambitions, pointing to the company’s track record in managing large, complex systems. The new architecture being built for OpenAI is optimized for AI efficiency and performance, clustering GPUs for low-latency communication between interconnected systems.
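
To make the low-latency point concrete: at training scale, GPUs spend much of their time in collective operations such as all-reduce, which sum gradients across every device in the cluster and run only as fast as the interconnect beneath them. Here is a minimal, purely illustrative sketch using PyTorch with the NCCL backend; it is not OpenAI’s actual stack, just the standard primitive that such fabrics are tuned to accelerate.

```python
import os

import torch
import torch.distributed as dist


def main():
    # NCCL is the backend that rides the GPU interconnect, so the
    # latency of the collective below reflects the cluster fabric.
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))

    # Each GPU holds one shard of gradients; all_reduce sums them
    # across the whole job. At scale this collective dominates
    # inter-node traffic, which is why low-latency GPU-to-GPU links
    # matter so much for training throughput.
    grads = torch.ones(1024, device="cuda") * dist.get_rank()
    dist.all_reduce(grads, op=dist.ReduceOp.SUM)

    if dist.get_rank() == 0:
        print(f"world={dist.get_world_size()} grads[0]={grads[0].item()}")

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Launched with torchrun --nproc_per_node=<gpus>, the same pattern stretches from a single node to the kind of multi-rack clusters described above; hardware like NVLink and AWS’s Elastic Fabric Adapter exists largely to keep that one operation fast.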

This partnership also builds on existing cooperation between the two companies. Earlier this year, OpenAI’s open-weight gpt-oss foundation models became available through Amazon Bedrock, giving AWS customers direct access to OpenAI technology. That move quickly made OpenAI one of the most popular model providers in the Bedrock ecosystem, attracting clients such as Comscore, Peloton, Thomson Reuters, and Verana Health.
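
For developers, that access looks like any other Bedrock invocation. Below is a minimal sketch using boto3’s Converse API; the model ID (openai.gpt-oss-120b-1:0) and region are assumptions about how the open-weight models are listed, so verify both against the Bedrock model catalog in your account.

```python
import boto3

# Assumed model ID and region for an OpenAI open-weight model on
# Bedrock; confirm both in your account's model catalog first.
client = boto3.client("bedrock-runtime", region_name="us-west-2")

response = client.converse(
    modelId="openai.gpt-oss-120b-1:0",
    messages=[
        {"role": "user", "content": [{"text": "Explain EC2 UltraServers in two sentences."}]},
    ],
    inferenceConfig={"maxTokens": 300, "temperature": 0.7},
)

print(response["output"]["message"]["content"][0]["text"])
```

Because the Converse API presents one request and response shape across every model provider on Bedrock, third-party models like these slot into existing applications with little more than a model-ID change.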

By the end of 2026, AWS expects to have all the compute capacity in place to meet OpenAI’s needs, with room to expand further in 2027 and beyond. The scale of this deployment will likely make AWS one of OpenAI’s most critical infrastructure partners, setting up an intriguing dynamic, as Microsoft remains OpenAI’s largest investor and hosts it on Azure.

As OpenAI continues to push the limits of generative AI, its new reliance on AWS underscores both the surging demand for computing power and the growing complexity of the AI ecosystem itself. Whether AWS can consistently deliver the reliability this monumental partnership requires remains to be seen, but for now, the deal marks another milestone in the increasingly crowded race to power the future of AI.

Written by Brian Fagioli

Brian Fagioli is a technology journalist and founder of NERDS.xyz. A former BetaNews writer, he has spent over a decade covering Linux, hardware, software, cybersecurity, and AI with a no-nonsense approach for real nerds.
