4 min read

What Is Behind AWS's Push for a Custom AI Chip?

For years, Nvidia has been the undisputed king of the artificial intelligence boom, supplying the powerful graphics processing units (GPUs) that serve as the engine for services like ChatGPT. But a new front has opened in the AI arms race, and it’s being led by Nvidia’s own biggest customers.

This week, Amazon Web Services detailed plans for an upgraded version of its Graviton4 central processing unit (CPU) and highlighted the success of its custom-built Trainium AI chips, which are now powering models from major AI lab Anthropic. The moves are part of a broader, industry-wide rebellion against the high cost and supply constraints of Nvidia’s hardware, as tech giants from Google to Microsoft and Meta spend billions to design their own specialized chips. This shift raises a fundamental question: Is the era of Nvidia’s total dominance beginning to crack?

What Makes Nvidia the King of AI Chips?

Nvidia’s supremacy isn’t just about having the fastest hardware; it’s about a deeply entrenched ecosystem built over nearly two decades. The company’s CUDA software platform is the de facto standard for AI development, a proprietary programming model that makes it easy for millions of developers to harness the power of Nvidia’s GPUs.

Beyond software, Nvidia has established a lead in interconnect technology. Its proprietary NVLink system connects thousands of GPUs together, allowing them to function as a single, colossal AI brain. This is crucial for training the massive foundation models that define the current AI era. This combination of superior hardware, a locked-in software ecosystem, and system-level innovation has given Nvidia an estimated 90% share of the AI data center GPU market.

This dominance allows Nvidia to command incredible prices and margins. Its flagship H100 GPUs can sell for over $30,000 each, and the company has consistently reported gross margins of around 75%—a figure dragged down by its less profitable gaming division. On its high-end data center products, analysts estimate margins are closer to 90%.

Why Are Customers Building Their Own Chips?

That high margin is precisely why Big Tech is in revolt. Hyperscalers like Amazon, Google, and Microsoft are spending tens of billions of dollars annually on Nvidia GPUs, effectively paying a steep “Nvidia tax.” By designing their own custom silicon, they can bypass that markup and tailor chips to their specific workloads, aiming for better price-performance.

Amazon’s strategy is a prime example. AWS has been developing its Trainium chips as a direct alternative to Nvidia GPUs for AI training. The company is now deploying them at scale, with one cluster for its partner Anthropic featuring over 400,000 custom chips. While AWS executives concede that Nvidia’s latest Blackwell chips are higher-performing, they argue their Trainium chips offer better price-performance, a crucial metric as AI workloads scale exponentially. An AWS engineering director noted that the upcoming Trainium3 will double the performance of its predecessor while using 50% less energy.
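
To see why price-performance, rather than raw speed, is the metric that matters at scale, consider a back-of-the-envelope sketch. The throughput and hourly-cost figures below are purely hypothetical, not AWS or Nvidia numbers; they only illustrate how a slower accelerator can still deliver cheaper compute if its price falls faster than its performance.

```python
# Purely hypothetical price-performance comparison.
# None of these numbers are real AWS or Nvidia figures; they illustrate why a
# slower-but-cheaper accelerator can win on cost per unit of training work.

def cost_per_eflop_hour(throughput_pflops: float, hourly_cost_usd: float) -> float:
    """Dollars spent per exaFLOP-hour of training compute."""
    exaflops = throughput_pflops / 1000.0  # petaFLOPs -> exaFLOPs
    return hourly_cost_usd / exaflops

# Hypothetical accelerator A: faster, but pricier per hour.
fast_chip = cost_per_eflop_hour(throughput_pflops=2.0, hourly_cost_usd=60.0)

# Hypothetical accelerator B: half the throughput at a third of the price.
cheap_chip = cost_per_eflop_hour(throughput_pflops=1.0, hourly_cost_usd=20.0)

print(f"fast chip:  ${fast_chip:,.0f} per EFLOP-hour")   # $30,000
print(f"cheap chip: ${cheap_chip:,.0f} per EFLOP-hour")  # $20,000 -> better price-performance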

Google has been on this path for nearly a decade with its Tensor Processing Units (TPUs). Now on their sixth generation, these custom chips power Google’s Gemini models and its cloud platform. By designing its own silicon, Google may be running its AI workloads at roughly 20% of the cost incurred by companies buying Nvidia hardware, a massive cost-efficiency advantage of four to six times. Microsoft and Meta have followed suit, developing their own in-house AI accelerators to reduce their dependency on Nvidia.

Who Else Is Challenging Nvidia?

Beyond its own customers, Nvidia faces threats from traditional rivals and innovative startups. AMD has been a persistent challenger, with its MI300 series chips boasting superior on-paper memory specifications. However, the company has struggled to match Nvidia’s real-world performance, largely due to a less mature software stack.

Meanwhile, a new class of startups is attacking the problem with radical new architectures. Companies like Cerebras Systems, with its massive “wafer-scale” chips, and Groq, with its specialized chips for ultra-fast inference, are attempting to sidestep Nvidia’s interconnect advantage by rethinking the architecture from the ground up. Qualcomm has also signaled a re-entry into the data center market with new CPUs specifically designed to work with AI accelerators.

Can Software and Algorithms Break the Lock-In?

The hardware threats are compounded by significant shifts in software and AI research. While CUDA has been a powerful moat, new high-level programming frameworks like OpenAI’s Triton and Google’s JAX are abstracting away the underlying hardware. These frameworks allow developers to write AI code once and then compile it to run efficiently on a variety of chips, whether they come from Nvidia, AMD, or Google. This trend mirrors the historical shift from hand-tuned assembly language to more flexible high-level languages like C++, which ultimately made the underlying processor less important.
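
To make the idea concrete, here is a minimal, hypothetical sketch in JAX (the layer and shapes are illustrative, not taken from any vendor's code). The same jitted function is compiled by XLA for whichever backend is available, whether that is a CPU, an Nvidia GPU, or a Google TPU, with no device-specific code in the model itself.

```python
# A minimal, hardware-agnostic sketch in JAX (illustrative only).
# XLA compiles the same jitted function for whichever backend is
# installed -- CPU, Nvidia GPU, or Google TPU.
import jax
import jax.numpy as jnp

@jax.jit  # compiled for the detected backend on first call
def dense_layer(params, x):
    """One dense layer with a GELU activation."""
    w, b = params
    return jax.nn.gelu(x @ w + b)

key = jax.random.PRNGKey(0)
k_w, k_x = jax.random.split(key)
params = (0.02 * jax.random.normal(k_w, (512, 512)), jnp.zeros(512))
x = jax.random.normal(k_x, (32, 512))

y = dense_layer(params, x)
print(y.shape)        # (32, 512)
print(jax.devices())  # lists cpu, gpu, or tpu devices, whichever backend is present
```

The heavy lifting moves into the compiler rather than hand-written, vendor-specific kernels, which is precisely the dynamic that erodes a hardware-tied software moat.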

Perhaps the most disruptive development came from Chinese startup DeepSeek. In early 2025, it released a series of AI models that it claimed achieved world-class performance using a fraction of the compute power—and therefore cost—of its Western rivals. DeepSeek’s technical papers suggested efficiency gains of up to 45 times, sending a shockwave through the market and briefly wiping hundreds of billions off Nvidia’s market value. The news raised a critical question: what if the race isn’t just about accumulating more and more GPUs, but about writing smarter software?

Some of these efficiency gains may be offset by what economists call the Jevons paradox, in which greater efficiency leads to even greater consumption, but the development still fundamentally changes the calculus. Microsoft CEO Satya Nadella has embraced this view, noting that as AI becomes cheaper, its use will become more widespread, ultimately driving more demand for all types of compute infrastructure.

What’s the Next Frontier in the AI Chip War?

Even as its dominance at the chip level is challenged from all sides, Nvidia isn’t standing still. The company is increasingly shifting the battleground from individual chips to entire AI systems, or “AI Factories.” Its latest Blackwell platform is sold not just as a chip, but as a fully integrated, rack-scale system like the GB200 NVL72, which combines GPUs, CPUs, and high-speed networking into a single, optimized unit. This full-stack approach makes it harder for competitors to displace the company.

In a sign of this shifting landscape, Nvidia recently announced it would open its NVLink interconnect technology to partners, allowing companies like Qualcomm and Marvell to build their custom CPUs directly into Nvidia’s server designs. The move shows that while the days of Nvidia’s near-total monopoly on AI chips may be numbered, the company is already preparing for the next battle: providing the foundational architecture for the entire AI data center.


Reference Shelf

AWS' custom chip strategy is cutting into Nvidia's AI dominance (CNBC)

The new AI calculus: Google’s 80% cost edge vs. OpenAI’s ecosystem (VentureBeat)

Connectivity Is the New Battleground in the AI Chip War (ARPU)

Why Building an Nvidia-Killer Chip Is Harder Than It Looks (ARPU)