The next stages of AI conformance in the cloud-native, open-source world


Until recently, running an AI model on Kubernetes was a guessing game. What worked on one cloud provider could fail on another in the face of different GPU drivers, network setups, or autoscaling behaviors. That becomes a big problem as organizations move AI from innovation labs into production. Standardizing AI workloads on Kubernetes has become an urgent industry priority. The goal of the Cloud Native Computing Foundation’s (CNCF) Kubernetes AI conformance program is to standardize how AI and machine learning workloads run on Kubernetes, which 80% of enterprises already use to handle dramatic traffic fluctuations.

In this episode of The New Stack Makers, I sat down with Jonathan Bryce, executive director of the CNCF, from the floor of the biggest KubeCon + CloudNativeCon yet, held this March in Amsterdam. We spoke about how the AI conformance program breaks down silos and vendor lock-in by increasing portability, predictability, and production readiness.

“By the end of 2026, of the amount of compute that’s dedicated to AI workloads, two-thirds of it is going to be for inference, and a third of it is going to be for training. Three years ago, that was completely flipped,” Bryce says. “This is shifting really rapidly, and we’re going to have 93 gigawatts of compute power dedicated to inference by the end of the decade,” which is more than all other compute combined.

This is a sign that AI is reaching its next stage of maturity: models have been trained, and now it comes down to real-world use cases. But that shift is not without its challenges. Training usually happens overnight in batches, while inference is real-time and always-on.
But if you ask Jimmy Song, VP of the open source ecosystem at Dynamia.AI, Kubernetes is the ideal runtime for AI inference because it delivers “elastic, cost-efficient, low-latency model serving with GPU-aware autoscaling, versioning, and observability.” In fact, he goes so far as to say that “AI Inference is retracing the path of cloud-native microservices, only the underlying compute has shifted from CPU to GPU.”

The conformance program ensures that each Kubernetes cluster can actually handle the high demands of GPUs, tensor processing units (TPUs), and complex AI scheduling, without having to customize for each cloud provider. With this in mind, it’s not surprising that the big three cloud providers, Red Hat and Nvidia were the first to earn the Kubernetes AI conformance stamp of approval since the program launched in November 2025. Major European cloud provider OVHcloud is another early adopter, reflecting another buzz at KubeCon Europe 2026: cloud sovereignty.

“It’s just growing so rapidly that there’s plenty of demand,” Bryce remarks, “so anything you can do to accelerate adoption in that market helps everybody who is a major player.”

Also this March, llm-d was accepted into the CNCF incubator program, as it provides a pre-integrated, Kubernetes-native distributed reference framework and orchestration manager that bridges the gap between high-level control planes and low-level inference engines. “It integrates vLLM, which is an open source inference serving engine, into a Kubernetes cluster, where that makes a lot more specific decisions and opinionated deployment options that the conformance program requires right now,” Bryce says.
The llm-d project will then collaborate with the CNCF AI conformance program to further ensure interoperability across the cloud-native, open-source ecosystem. Nothing is set in stone, as things are simply moving too fast.

“We start out with a fairly small set of requirements, with the things that you know are going to be present in all environments,” Bryce explains. Once a company joins the hundreds that have already passed the standard Kubernetes conformance program, the first set of AI-specific standards is around exposing accelerators into a Kubernetes cluster in a standard way, so that a workload can say: “I need X type of accelerator. I need to be able to have this many of them for this long.” That’s enabled by a Kubernetes feature called dynamic resource allocation (DRA), which launched in late 2025.

As the program expands and AI-driven development settles, he continues, new requirements will come up around networking and storage, so these companies will have to re-certify. As the program matures, the cadence for recertification will change too.

The AI conformance program is also moving toward a set of testing automation to make it easier to validate conformance, but Bryce asks that members of the cloud-native community join the working group to help, especially those coming from different verticals. “It’s really defined by the people who participate, to stay very close to real world needs,” Bryce says, while also sticking “to the common denominator of what every environment needs, and then if there is an additional security or regulatory requirement that sits outside of that.”
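As an illustration of the accelerator-request pattern Bryce describes, here is a minimal sketch of a DRA ResourceClaimTemplate and a Pod that consumes it. This is a hedged example, not a conformance artifact: the API version varies with the Kubernetes release, and the DeviceClass name (`gpu.example.com`) and container image are hypothetical placeholders that would come from your cluster and the vendor’s DRA driver.

```yaml
# Sketch only: API version and names depend on your cluster and GPU driver.
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaimTemplate
metadata:
  name: single-gpu
spec:
  spec:
    devices:
      requests:
      - name: gpu
        deviceClassName: gpu.example.com   # hypothetical DeviceClass published by the driver
---
apiVersion: v1
kind: Pod
metadata:
  name: inference-pod
spec:
  resourceClaims:
  - name: gpu
    resourceClaimTemplateName: single-gpu  # each Pod gets its own claim from the template
  containers:
  - name: model-server
    image: registry.example.com/model-server:latest   # hypothetical image
    resources:
      claims:
      - name: gpu                          # container consumes the allocated accelerator
```

The point of the pattern is that the workload declares *what* it needs (a device of a given class), and the scheduler and driver decide *which* physical accelerator satisfies it, which is what makes the request portable across conformant clusters.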