Kubecon: VCluster’s K8s Platform to Manage GPUs as a Service

ATLANTA — VCluster Labs (formerly Loft Labs) has released an augmented version of its namesake Kubernetes distribution, one customized for running NVIDIA GPUs, the preferred platform for running large, compute-intensive AI workloads.

The company will be demonstrating the software at KubeCon+CloudNativeCon North America 2025, being held this week in Atlanta, at booth #421.

The platform is officially called the Infrastructure Tenancy Platform for AI to Maximize GPU Efficiency on NVIDIA Kubernetes Environments. It combines advanced isolation, dynamic scaling, and hybrid networking so that organizations can run GPU services in a cloud-like fashion for their internal users.

Flexible Tenancy for GPUs

“Our story is about flexible tenancy,” explained vCluster CEO Lukas Gentele, in an interview with TNS. “Sometimes you need separate clusters for individual tenants. A tenant can be one of your customers or one of your developer teams. It can be for an individual developer, or an application.”

Two groups of users would find this technology potentially valuable, Gentele said. One is large organizations that have many potential users vying for a limited set of GPUs. The other is public cloud services that want to offer GPU-based services to their own clientele.

Flexibility is extremely important in both cases, given the dynamic nature of AI work, Gentele said.
The ability to dynamically allocate and deallocate GPUs quickly would be a premium feature for such an environment.

Using vCluster’s ability to carve multiple individually secured “virtual clusters” from one large cluster, companies can provision clusters more quickly, use more of their GPUs and manage Day 2 operations more effectively, according to the company.

The Tenancy Platform enables “dynamic, multitenant GPU orchestration with the same elasticity and control enterprises expect from the public cloud,” but for private NVIDIA-powered AI systems, further explained Paul Nashawaty, practice lead and principal analyst at theCUBE Research, in a statement. He noted that theCUBE Research found that 71% of organizations have reported GPU utilization inefficiency as a major challenge.

VCluster has also published a reference architecture for running the Infrastructure Tenancy Platform on NVIDIA’s DGX line of turnkey GPU servers.

The Infrastructure Tenancy Platform

The distribution is built on a number of Kubernetes technologies, some recently introduced by vCluster, including:

KubeVirt, for creating virtual machines, including those for GPUs.
VCluster Private Nodes and Karpenter-based vCluster Auto Nodes, to enable virtual clusters to dynamically autoscale GPU and CPU capacity across clouds, data centers, and bare metal environments.
VCluster VPN, a Tailscale-based overlay virtual private network.
Netris network isolation controller, which gives each tenant its own dedicated network path.
VNode Runtime, to provide a container sandbox that helps prevent container breakouts.

The platform is directly integrated with the NVIDIA Base Command Manager (BCM) cluster management software, which NVIDIA provides to launch bare-metal GPU servers and hook them into the network.

VCluster provides all the supporting software and an ease-of-use experience, Gentele said.
The virtual GPUs can be provisioned through the Kubernetes Cluster API, or by Terraform, Helm charts, or kubectl.

The new vCluster Reference Architecture for NVIDIA DGX systems provides a set of best practices for deploying virtual clusters on GPU-centric systems, enabling enterprises to deliver a cloud-like Kubernetes experience in-house.

The post Kubecon: VCluster’s K8s Platform to Manage GPUs as a Service appeared first on The New Stack.