Google’s New Ironwood AI Chips Deliver 4X Performance, Lock Down Multibillion-Dollar Deal with Anthropic

Google AI chips

Photo by Google DeepMind on Unsplash

When Google says it’s going big on AI, it’s not just talk. Last week, Google Cloud unveiled its latest hardware for artificial intelligence—something they’re calling their most powerful AI infrastructure yet. At the center of it all is Ironwood, a new generation of Tensor Processing Units (TPUs) that aim to power not just training, but the fast, reliable serving of AI models at a massive scale.

And get this: AI company Anthropic, the folks behind the Claude models, signed a deal to access up to a million of these new chips. That’s a multi-year commitment worth tens of billions of dollars. Yes, you read that right—billions.

Here’s what you need to know about what Google just launched, why it matters, and how this could shape the future of AI infrastructure.


Why Google Built Ironwood

We’ve officially entered what Google calls “the age of inference.” It’s a shift from just training giant AI models to actually running them day-to-day, serving billions of real-time user requests.

Think about that for a second—training a model can take days or weeks and tolerate slower batch processing. But once it’s out in the wild, like powering a chatbot or coding assistant, it needs to be fast, reliable, and responsive. No one’s waiting 30 seconds for an AI to respond.

That’s exactly where Ironwood comes in.


Inside Ironwood: A Supercomputer in a Pod

AI supercomputer

Photo by Growtika on Unsplash

Ironwood isn’t your average chip refresh. Each “Pod,” essentially a self-contained AI supercomputer, links together up to 9,216 individual chips. They’re connected using Google’s own Inter-Chip Interconnect network that moves data at 9.6 terabits per second. That’s fast enough to download the entire Library of Congress in under two seconds.

And these aren’t just pumping data; they’re also sharing access to 1.77 petabytes of high-bandwidth memory. That’s room for about 40,000 Blu-ray movies, accessed simultaneously by thousands of processors.
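
Those comparisons are easy to sanity-check yourself. A quick back-of-the-envelope in Python (the ~44 GB-per-movie figure is my own assumption, roughly a dual-layer disc):

```python
# Back-of-the-envelope checks on the Ironwood pod figures above.
# Assumption (mine, not Google's): one "Blu-ray movie" is ~44 GB,
# roughly a dual-layer disc. Decimal units (1 PB = 1e15 bytes).

ICI_TERABITS_PER_SEC = 9.6   # Inter-Chip Interconnect, per the announcement
HBM_PETABYTES = 1.77         # shared high-bandwidth memory per pod
CHIPS_PER_POD = 9_216

# 9.6 terabits/s -> terabytes/s (8 bits per byte)
print(f"ICI throughput: {ICI_TERABITS_PER_SEC / 8:.1f} TB/s")        # 1.2 TB/s

# Sanity-check the "about 40,000 Blu-ray movies" comparison
BLU_RAY_GB = 44
movies = HBM_PETABYTES * 1e15 / (BLU_RAY_GB * 1e9)
print(f"Pod HBM holds ~{movies:,.0f} Blu-ray movies")                # ~40,200

# Each chip's share of the pod's memory
per_chip_gb = HBM_PETABYTES * 1e15 / CHIPS_PER_POD / 1e9
print(f"HBM per chip: ~{per_chip_gb:.0f} GB")                        # ~192 GB
```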

Even better, Ironwood uses Optical Circuit Switching, a clever technology that reroutes traffic almost instantly when something breaks. The whole system is built to keep running without a hitch, even when components inevitably fail. Google has had six previous TPU generations to perfect this (Ironwood is the seventh), and it claims uptime of 99.999% since 2020.
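
To make that concrete, here's a toy model of the trick; purely illustrative, not Google's actual OCS design. The switch keeps a mapping from logical links to physical fiber paths, so a failure just swaps in a spare path and the running job never sees a changed topology:

```python
# Toy model of what optical circuit switching buys you: the switch
# keeps a mapping from logical links to physical fiber paths, so a
# failed path is swapped for a spare without the training job ever
# seeing a changed topology. Purely illustrative, not Google's design.

logical_to_physical = {f"link-{i}": f"fiber-{i}" for i in range(4)}
spare_fibers = ["fiber-spare-0", "fiber-spare-1"]

def on_fiber_failure(failed_fiber: str) -> None:
    """Remap whichever logical link rode on the failed fiber."""
    for link, fiber in logical_to_physical.items():
        if fiber == failed_fiber and spare_fibers:
            logical_to_physical[link] = spare_fibers.pop(0)
            print(f"{link}: {failed_fiber} -> {logical_to_physical[link]}")
            return

on_fiber_failure("fiber-2")   # the job keeps its logical topology
print(logical_to_physical)
```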


The Anthropic Megadeal: A Massive Vote of Confidence

Anthropic AI

Photo by Marija Zaric on Unsplash

Then there’s Anthropic. Not only did they expand their existing partnership with Google, they signed on to access up to one million Ironwood chips. That’s huge.

“To put this in context,” said Krishna Rao, Anthropic’s CFO, “our customers depend on Claude for their most important work. This expanded capacity ensures we can meet our exponentially growing demand.”

They’re not just leasing server time. Anthropic is getting access to over a gigawatt of compute capacity starting in 2026, enough electricity to power hundreds of thousands of homes. They cited Ironwood’s “price-performance and efficiency” as key to the deal.

Analysts estimate the contract could be worth tens of billions. That’s among the largest known cloud infrastructure deals ever.


It’s Not Just TPUs: Meet Axion, Google’s Arm-Based CPU Family

Alongside the splashy Ironwood launch, Google also introduced updates to its Axion processors—custom Arm-based CPUs for all the behind-the-scenes work that powers AI apps. Think API calls, database queries, and containerized apps.

Their new N4A instances promise up to 2X better price-performance than traditional x86 virtual machines. Vimeo saw a 30% performance bump for transcoding tasks, and ZoomInfo noted a 60% improvement in price-performance for Java workloads. Not bad at all.

They’re also previewing C4A metal, their first bare-metal Arm instance, ideal for things like Android development or automotive computing where close-to-the-hardware access is vital.
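
Provisioning one of these is the same Compute Engine workflow as any other machine family. Here's a minimal sketch using the google-cloud-compute Python client; note that the n4a-standard-8 machine type name and the zone are my assumptions, since N4A is still in preview, so check the current docs before copying this:

```python
# Minimal sketch: provisioning an Axion-based VM with the
# google-cloud-compute client. The "n4a-standard-8" machine type and
# the zone are assumptions (N4A is in preview); check current docs.

from google.cloud import compute_v1

def create_axion_vm(project: str, zone: str = "us-central1-a") -> None:
    boot_disk = compute_v1.AttachedDisk(
        boot=True,
        auto_delete=True,
        initialize_params=compute_v1.AttachedDiskInitializeParams(
            # Arm-based VMs need an arm64 image.
            source_image="projects/debian-cloud/global/images/family/debian-12-arm64",
            disk_size_gb=10,
        ),
    )
    instance = compute_v1.Instance(
        name="axion-demo",
        machine_type=f"zones/{zone}/machineTypes/n4a-standard-8",  # assumed name
        disks=[boot_disk],
        network_interfaces=[
            compute_v1.NetworkInterface(network="global/networks/default")
        ],
    )
    operation = compute_v1.InstancesClient().insert(
        project=project, zone=zone, instance_resource=instance
    )
    operation.result()  # block until the VM is created

create_axion_vm("my-project-id")  # hypothetical project ID
```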

In short—TPUs do the heavy AI lifting. Axion CPUs take care of the rest.


Making Silicon Actually Useful

Of course, raw chip power is just one piece. Developers need software tooling to make the most of it. Google’s addressing that too.

Here are a few noteworthy pieces:

  • Inference Gateway: Dynamically load-balances AI requests to reduce latency by up to 96% and cut serving costs by up to 30% (a toy sketch of the idea follows this list).
  • Google Kubernetes Engine: Now optimized for TPU clusters with better maintenance and deployment smarts.
  • MaxText framework: Open-source support for advanced training techniques like supervised fine-tuning and reinforcement learning.
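
The gateway's internals aren't public, but the core scheduling idea is easy to illustrate. Here's a toy Python sketch (my own simplification, not Google's code) of why routing on live queue depth beats plain round-robin when request costs vary as wildly as LLM prompts do:

```python
# Toy illustration of load-aware routing, the core idea behind an
# inference gateway. My simplification, not Google's implementation:
# LLM requests vary hugely in cost, so routing on live queue depth
# avoids the hot spots that plain round-robin creates.

import random
from dataclasses import dataclass

@dataclass
class Replica:
    name: str
    queued_tokens: int = 0   # crude proxy for outstanding work

def pick_replica(replicas: list[Replica]) -> Replica:
    # Least-loaded routing: choose the replica with the smallest
    # token backlog instead of simply the next one in line.
    return min(replicas, key=lambda r: r.queued_tokens)

replicas = [Replica(f"tpu-slice-{i}") for i in range(4)]

for step in range(12):
    cost = random.randint(50, 4_000)      # tokens this request will need
    pick_replica(replicas).queued_tokens += cost
    for r in replicas:                    # pretend each step drains work
        r.queued_tokens = max(0, r.queued_tokens - 500)

print([(r.name, r.queued_tokens) for r in replicas])
```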

Google wraps all of this under its AI Hypercomputer initiative—a fully integrated supercomputing environment designed to wring every ounce of performance out of its chips.


Power and Cooling: The Silent Battles Behind the Scenes

AI infrastructure

Photo by Suppanuch Wongpasklang on Unsplash

All this hardware eats power—and dumps heat. A lot of it. Google is dealing with infrastructure demands most of us never think about.

At a recent industry summit, Google shared that it’s moving to deliver ±400V direct-current power to racks, enough to push one megawatt through a single server rack. That’s roughly ten times today’s typical rack power budget.
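
The jump in voltage is plain Ohm's-law arithmetic: for a fixed power draw, current falls as voltage rises, and resistive losses fall with the square of the current. A quick illustration (the 48 V baseline is my example of a legacy DC rack standard):

```python
# Why push racks to +/-400 VDC: at fixed power, current falls linearly
# with voltage, and conductor losses fall with the square of current
# (P_loss = I^2 * R). Illustrative numbers only.

RACK_POWER_W = 1_000_000            # 1 MW per rack, per the article

for volts in (48, 400, 800):        # 800 V = the full +/-400 V span
    amps = RACK_POWER_W / volts
    print(f"{volts:>4} V -> {amps:>8,.0f} A per rack")

# 48 V (a common legacy DC rack standard) would need ~20,800 A;
# the +/-400 V scheme cuts that to ~1,250 A across the full span,
# meaning far thinner busbars and much lower resistive losses.
```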

They’re also teaming up with Meta and Microsoft to standardize these high-voltage systems. And for the cooling? Google’s been running liquid cooling at gigawatt scale, claiming 99.999% availability across thousands of TPU pods.

Why water cooling? Water can carry roughly 4,000 times more heat than air for the same volume of coolant. As individual AI chips climb past 1,000 watts apiece, air simply won’t cut it.
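
That figure comes from volumetric heat capacity: how much heat a given volume of coolant can carry per degree of temperature rise. Plugging in textbook values lands in the same ballpark:

```python
# Rough physics behind the "~4,000x" figure: compare volumetric heat
# capacity, i.e. how much heat a cubic meter of coolant carries per
# degree of temperature rise. Textbook values at ~25 C.

WATER_DENSITY = 997.0      # kg/m^3
WATER_CP = 4_186.0         # J/(kg*K), specific heat of water
AIR_DENSITY = 1.18         # kg/m^3
AIR_CP = 1_005.0           # J/(kg*K), specific heat of air

water_vhc = WATER_DENSITY * WATER_CP   # ~4.17e6 J/(m^3*K)
air_vhc = AIR_DENSITY * AIR_CP         # ~1.19e3 J/(m^3*K)

print(f"Water: {water_vhc:.3g} J/(m^3*K)")
print(f"Air:   {air_vhc:.3g} J/(m^3*K)")
print(f"Ratio: ~{water_vhc / air_vhc:,.0f}x")   # ~3,500x, same ballpark
```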


So… What Does It All Mean?

With Nvidia still dominating the AI chip market, custom silicon is a bold move. It’s risky, expensive, and the software ecosystem isn’t as mature as Nvidia’s CUDA stack.

But Google’s betting that owning the full stack—from hardware to inference-serving software—yields better performance at lower cost. It’s the same approach that produced the first TPU a decade ago, the hardware Google credits with enabling the Transformer architecture that most modern models are built on.

If Ironwood lives up to expectations—and deals like Anthropic’s are any indication—it could shift the balance in what infrastructure powers tomorrow’s AI.

That’s a big deal if you’re a cloud provider. Or a startup trying to serve AI reliably. Or just someone waiting for a chatbot to respond faster than a spinning wheel.

In the age of inference, response time matters. And Google’s betting Ironwood is fast enough to change the game. Quietly, but massively.

Keywords: Google AI, Ironwood AI Chips, Anthropic AI, AI Supercomputer, Arm-based CPU, AI Infrastructure


