Amazon Web Services (AWS) will deploy approximately 1 million Nvidia GPUs through the end of 2027 as part of a broader infrastructure buildout focused on AI inference, according to an expanded strategic collaboration announced this week. Observers say the deal signals a shift in demand toward running AI models at scale, with inference now accounting for roughly two-thirds of AI compute.
AWS plans to deploy about 1 million Nvidia GPUs through the end of 2027 as part of a significant AI infrastructure expansion. The rollout, which begins this year across AWS’s global cloud regions, includes expanded work with Nvidia on networking and systems designed for running AI at scale, particularly agentic AI systems capable of reasoning, planning, and acting autonomously. An Nvidia executive confirmed the timeline.
AWS continues to develop its own AI chips for both training and inference. The collaboration suggests demand may be shifting across the AI stack, with a growing share of activity tied to running models in live services. Dermot McGrath, co-founder of ZenGen Labs, said, “Nvidia is becoming the infrastructure layer underneath the cloud providers, not just a chip vendor to them.”
McGrath noted that the chips in the deal are geared toward lowering the cost of running AI models at scale, with inference now accounting for roughly two-thirds of AI compute, up from about a third in 2023. The market for inference-focused chips is expected to exceed $50 billion by 2026, according to Deloitte estimates he cited. Pichapen Prateepavanich, founder of Gather Beyond, said demand for inference is “driving long-term commitments” for more computing power and creating closer ties between cloud providers and chipmakers.
Berna Misa, deal partner at Boardy Ventures, described the dynamic as an “infrastructure flip,” with Nvidia embedding its full stack across compute, networking, and inference inside AWS data centers. She said AWS’s development of its own AI chips doesn’t change the underlying equation, because inference relies on multiple components across the stack and Nvidia supplies most of them. The deal comes amid broader scrutiny of chip supply chains, as U.S. prosecutors pursue a case alleging Nvidia chips were smuggled to China.
