A year in, Google wants its Axion processors to feel like a scheduling decision

At KubeCon Europe in Amsterdam, The New Stack sat down with Jago Macleod and Abdel Sghiouar from Google Cloud to talk about what a year of its Arm-based Axion processors in production has meant for Kubernetes users. To sum it up, their argument is that the question of whether to try Arm for a given workload has become easy to answer: for the majority of containerized workloads, everything is now in place to move them to Arm.

“It’s essentially going to boil down to tokens per watt. And I think we will end up selling watts, not CPUs.” — Macleod

In this episode, we look at where Google’s custom Arm CPU fits on GKE, why compute classes matter more than the chip itself, and why the real ceiling on all of this is energy, not the instruction set.

Where Axion sits

Axion is Google’s first custom Arm CPU. The company built it on Arm’s Neoverse platform and announced it in April 2024. The C4A series, the first Axion virtual machine instance family, went GA in October 2024, followed by N4A, which is optimized to balance price and performance, in January 2026.

Google claims 50 percent better performance and 60 percent better energy efficiency than comparable x86 instances, with N4A pushing 2x price-performance on general-purpose workloads.

“You just compile to a different deployment target”

Macleod’s core argument is that the cost of trying Axion should no longer look like a full-scale migration project.

“One thing I hear a lot is customers perceive it to be a big migration from x86 to Arm,” he says. “That’s not the experience that I hear and see. It’s more about you just compile to a different deployment target.”

Macleod added that it’s easy to add an Axion node pool to an existing Google Kubernetes Engine (GKE) cluster. All you have to do is rebuild the container image as multi-arch (a container image that works on both x86 and Arm chips) and tag the pods with a node selector.

“You could do that gradually,” Sghiouar says. “You don’t have to do that all or nothing. You could do a canary deployment — 5%, 10% — and monitor your baseline for errors, for performance.”

The team admitted that there are edge cases, but they are rare. Floating-point math may not behave identically across architectures, for example, and low-level databases or caches that squeeze the last few percent out of the hardware can run into issues. “The ones that work all work in the same way,” Macleod says of these migrations. “The ones that don’t, all don’t work in different ways.”

Compute classes are the on-ramp to Axion

The more interesting piece, maybe, is what Kubernetes itself has become underneath the Axion pitch. GKE’s compute classes feature lets a workload declare a priority list of VM shapes. Depending on the use case, that may be Axion first, with a fallback to an x86 generation, and another fallback to spot capacity. GKE’s scheduler then resolves the list automatically.

That’s the mechanism that turns Axion from a procurement decision into a scheduling preference, the team argues. A workload simply declares what it wants, and the control plane figures out how to get there.

“We have actually seen customers doing a compute class with eight, nine, ten priorities in the list,” Sghiouar says. “And during spikes, they can spike all the way up to the lowest-priority virtual machine they want.”

The same pattern applies to GPUs. Accelerator obtainability is its own problem, and compute classes help. So does dynamic resource allocation, a newer Kubernetes API that treats accelerators the way storage classes treat disks. Together, they let workloads declare what they need without hard-coding a specific SKU.

Macleod’s own caveat is worth keeping in mind, though. Plenty of enterprises still run Kubernetes like legacy VM fleets, with firewall rules and pet nodes. The canary-rollout pitch assumes a more mature architecture.
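The rebuild-and-retag step the team describes can be sketched concretely. After publishing a multi-arch image (for example with `docker buildx build --platform linux/amd64,linux/arm64 --push`), a small Arm-only Deployment can run alongside the existing x86 one as the canary slice. This is a minimal sketch, not Google's prescribed setup; the names, labels, and image path are hypothetical, while `kubernetes.io/arch` is the standard Kubernetes node label.

```yaml
# Hypothetical canary Deployment that pins pods to Arm (e.g. Axion) nodes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-arm64
spec:
  replicas: 1                  # small slice next to the x86 deployment (5-10%)
  selector:
    matchLabels:
      app: myapp
      arch: arm64
  template:
    metadata:
      labels:
        app: myapp
        arch: arm64
    spec:
      nodeSelector:
        kubernetes.io/arch: arm64   # well-known label set on every node
      containers:
      - name: myapp
        image: us-docker.pkg.dev/my-project/my-repo/myapp:v1  # multi-arch image
```

Because both Deployments share the `app: myapp` label, an existing Service keeps routing to both, which is what makes the gradual 5%, 10% rollout that Sghiouar describes possible while you watch error rates and latency against the x86 baseline.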
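GKE exposes the priority list described above as a custom compute class object that workloads opt into by name. The following is a hedged sketch of what an Axion-first class might look like, assuming the `c4a` (Axion) and `n4` (x86) machine families; field names follow GKE's custom compute class API, but the class name and the exact fallback order are illustrative.

```yaml
# Illustrative GKE custom compute class: Axion first, x86 next, Spot last.
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: axion-first
spec:
  priorities:
  - machineFamily: c4a     # Axion (Arm) on-demand capacity
  - machineFamily: n4      # x86 fallback
  - machineFamily: n4      # last resort: Spot capacity
    spot: true
```

A pod would then request the class with a node selector such as `cloud.google.com/compute-class: axion-first`, and the scheduler walks down the list when the preferred shape is unobtainable, which is the "eight, nine, ten priorities" pattern Sghiouar mentions.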
That gap is exactly what Google is trying to close with compute classes and better scheduling primitives.

Tokens per watt

One other throughline in our discussion is energy. AI workloads have made that ceiling visible in a way it wasn’t even a few years ago.

“It’s essentially going to boil down to tokens per watt,” Macleod says. “And I think we will end up selling watts, not CPUs. We will be constrained by energy for the foreseeable future.”

For the Axion team, the advantage here is that the money a workload saves on Axion becomes the company’s budget for more tokens.