Accelerated Computing, Networking Drive Supercomputing in Age of AI

At SC25, NVIDIA unveiled advances across NVIDIA BlueField DPUs, next-generation networking, quantum computing, national research, AI physics and more — as accelerated systems drive the next chapter in AI supercomputing.

NVIDIA also highlighted storage innovations powered by the NVIDIA BlueField-4 data processing unit, part of the full-stack BlueField platform that accelerates gigascale AI infrastructure.

More details also came on NVIDIA Quantum-X Photonics InfiniBand co-packaged optics (CPO) networking switches — which enable AI factories to drastically reduce energy consumption and operational costs — including that TACC, Lambda and CoreWeave plan to integrate them.

Last month, NVIDIA began shipping DGX Spark, the world's smallest AI supercomputer. DGX Spark packs a petaflop of AI performance and 128GB of unified memory into a desktop form factor, enabling developers to run inference on models of up to 200 billion parameters and to fine-tune models locally. Built on the Grace Blackwell architecture, it integrates NVIDIA GPUs, CPUs, networking, CUDA libraries and the full NVIDIA AI software stack.

DGX Spark's unified memory and NVIDIA NVLink-C2C interconnect deliver 5x the bandwidth of PCIe Gen5, enabling faster GPU-CPU data exchange. This boosts training efficiency for large models, reduces latency and supports seamless fine-tuning workflows — all on the desktop.

NVIDIA Apollo Unveiled as Latest Open Model Family for AI Physics

NVIDIA Apollo, a family of open models for AI physics, was also introduced at SC25.
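The fit between model size and DGX Spark's 128GB of unified memory comes down to simple arithmetic: at reduced precision, a 200-billion-parameter model's weights fit comfortably. A rough back-of-the-envelope sketch (weights only; activations and KV cache are ignored):

```python
# Illustrative back-of-the-envelope check (not an official sizing tool):
# approximate weight memory for a model at different precisions.

def weight_memory_gb(params_billions: float, bits_per_param: float) -> float:
    """Return approximate weight storage in GB (weights only, no KV cache
    or activations)."""
    bytes_total = params_billions * 1e9 * bits_per_param / 8
    return bytes_total / 1e9

for name, bits in [("FP16", 16), ("FP8", 8), ("FP4", 4)]:
    gb = weight_memory_gb(200, bits)
    fits = "fits" if gb <= 128 else "exceeds"
    print(f"200B params @ {name}: ~{gb:.0f} GB -> {fits} 128 GB unified memory")
```

At 4 bits per weight, 200 billion parameters occupy roughly 100GB, which is why low-precision formats are what make desktop-scale inference on models of this size possible.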
Applied Materials, Cadence, Lam Research, Luminary Cloud, KLA, PhysicsX, Rescale, Siemens and Synopsys are among the industry leaders adopting these open models to simulate and accelerate their design processes across a broad range of fields — electronic design automation and semiconductors, computational fluid dynamics, structural mechanics, electromagnetics, weather and more.

The family of open models harnesses the latest developments in AI physics, combining best-in-class machine learning architectures — such as neural operators, transformers and diffusion methods — with domain-specific knowledge. Apollo will provide pretrained checkpoints and reference workflows for training, inference and benchmarking, allowing developers to integrate and customize the models for their specific needs.

NVIDIA Warp Supercharges Physics Simulations

NVIDIA Warp is a purpose-built, open-source Python framework that delivers GPU acceleration of up to 245x for computational physics and AI. It provides a structured approach for simulation, robotics and machine learning workloads, combining the accessibility of Python with performance comparable to native CUDA code.

Warp supports the creation of GPU-accelerated 3D simulation workflows that integrate with ML pipelines in PyTorch, JAX, NVIDIA PhysicsNeMo and NVIDIA Omniverse. This allows developers to run complex simulation tasks and generate data at scale without leaving the Python programming environment.

By offering CUDA-level performance with Python-level productivity, Warp simplifies the development of high-performance simulation workflows.
It is designed to accelerate AI research and engineering by reducing barriers to GPU programming, making advanced simulation and data generation more efficient and widely accessible. Siemens, Neural Concept and Luminary Cloud are among the companies adopting NVIDIA Warp.

NVIDIA BlueField-4 DPU: The Processor Powering the Operating System of AI Factories

Unveiled at GTC Washington, D.C., NVIDIA BlueField-4 DPUs power the operating system of AI factories. By offloading, accelerating and isolating critical data center functions — networking, storage and security — they free up CPUs and GPUs to focus entirely on compute-intensive workloads.

Combining a 64-core NVIDIA Grace CPU with NVIDIA ConnectX-9 networking, BlueField-4 unlocks unprecedented performance, efficiency and zero-trust security at scale. It supports multi-tenant environments, rapid data access and real-time protection, with native integration of NVIDIA DOCA microservices for scalable, containerized AI operations.
Together, they are transforming data centers into intelligent, software-defined engines for trillion-token AI and beyond.

As AI factories and supercomputing centers continue to scale in size and capability, they require faster, more intelligent storage infrastructure to manage structured, unstructured and AI-native data for large-scale training and inference. Leading storage innovators — DDN, VAST Data and WEKA — are adopting BlueField-4 to redefine performance and efficiency for AI and scientific workloads:

- DDN is building next-generation AI factories, accelerating data pipelines to maximize GPU utilization for AI and HPC workloads.
- VAST Data is advancing the AI pipeline with intelligent data movement and real-time efficiency across large-scale AI clusters.
- WEKA is launching its NeuralMesh architecture on BlueField-4, running storage services directly on the DPU to simplify and accelerate AI infrastructure.

Together, these HPC storage leaders are demonstrating how NVIDIA BlueField-4 transforms data movement and management — turning storage into a performance multiplier for the next era of supercomputing and AI infrastructure.

Adopting NVIDIA Co-Packaged Optics for Speed and Reliability

TACC, Lambda and CoreWeave announced that they will integrate NVIDIA Quantum-X Photonics CPO switches into next-generation systems as early as next year. These networking switches enable AI factories and supercomputing centers to drastically reduce energy consumption and operational costs — NVIDIA has achieved this fusion of electronic circuits and optical communications at massive scale.

As AI factories grow to unprecedented sizes, networks must evolve to keep pace.
By eliminating traditional pluggable transceivers — a common cause of job runtime failures — NVIDIA Photonics switch systems not only deliver 3.5x better power efficiency, but also provide 10x higher resiliency, enabling applications to run 5x longer without interruption.

At GTC 2024 in Silicon Valley, NVIDIA unveiled NVIDIA Quantum-X800 InfiniBand switches, purpose-built to power trillion-parameter-scale generative AI models. These platforms deliver a staggering 800Gb/s of end-to-end throughput — 2x the bandwidth and 9x the in-network compute of their predecessors — owing to innovations such as SHARPv4 and FP8 support.

As NVIDIA Quantum-X800 continues to be widely adopted to meet the demands of massive-scale AI, NVIDIA Quantum-X Photonics, announced at GTC earlier this year, addresses the critical power, resiliency and signal-integrity challenges of even larger deployments. By integrating optics directly on the switch, it eliminates failures caused by pluggable transceivers and link flaps, enabling workloads to run uninterrupted at scale — up to 5x longer than with pluggable transceivers — and ensuring the infrastructure can support the next generation of compute-intensive applications.

"NVIDIA Quantum-X Photonics represents the next step in building high-performance, resilient AI networks," said Maxx Garrison, product manager for cloud infrastructure at Lambda. "These advances in power efficiency, signal integrity and reliability will be key to supporting efficient, large-scale workloads for our customers."

SHARPv4 enables in-network aggregation and reduction, minimizing GPU-to-GPU communication overhead.
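The benefit of in-network reduction can be shown with a toy traffic model (an illustration of the general idea, not SHARP's actual protocol): a switch that aggregates partial results lets each GPU send its gradients once and receive the reduced result once, instead of exchanging buffers with every peer.

```python
# Toy traffic model (illustration only, not SHARP's actual protocol):
# compare bytes on the wire per GPU for reducing a gradient buffer.

def naive_allgather_bytes(num_gpus: int, buffer_bytes: int) -> int:
    """Every GPU sends its full buffer to every other GPU."""
    return (num_gpus - 1) * buffer_bytes

def in_network_reduction_bytes(buffer_bytes: int) -> int:
    """With switch-side aggregation: send the buffer up once,
    receive the reduced result once."""
    return 2 * buffer_bytes

gpus, buf = 512, 1 << 20  # 512 GPUs, 1 MiB gradient buffer
print(naive_allgather_bytes(gpus, buf))  # per-GPU traffic without aggregation
print(in_network_reduction_bytes(buf))   # per-GPU traffic with aggregation
```

In this simplified model, per-GPU traffic for the aggregated case is constant in cluster size, while the naive exchange grows linearly with the number of GPUs — the intuition behind offloading reductions into the network fabric.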
Combined with FP8 precision, SHARPv4 accelerates training of trillion-parameter models by reducing bandwidth and compute demands — delivering faster convergence and higher throughput — and comes standard with NVIDIA Quantum-X800 and Quantum-X Photonics switches.

"CoreWeave is building the Essential Cloud for AI," said Peter Salanki, co-founder and chief technology officer at CoreWeave. "With NVIDIA Quantum-X Photonics, we're advancing power efficiency and further improving the reliability CoreWeave is known for in supporting massive AI workloads at scale, helping our customers unlock the full potential of next-generation AI."

The NVIDIA Quantum-X Photonics platform, anchored by the NVIDIA Quantum Q3450 CPO-based InfiniBand switch and the ConnectX-8 SuperNIC, is engineered for the highest-performance environments that also require significantly lower power, higher resiliency and lower latency.

Supercomputing Centers Worldwide Adopting NVQLink

More than a dozen of the world's top scientific computing centers are adopting NVQLink, a universal interconnect linking accelerated computing to quantum processors.

NVQLink connects quantum processors with NVIDIA GPUs, enabling large-scale workflows powered by the CUDA-Q software platform. Its open architecture provides the critical link supercomputing centers need to integrate diverse quantum processors while delivering 40 petaflops of AI performance at FP4 precision. In the future, every supercomputer will draw on quantum processors to expand the problems it can solve, and every quantum processor will depend on GPU supercomputers to run correctly.

Quantum computing company Quantinuum's new Helios QPU was integrated with NVIDIA GPUs through NVQLink, achieving the world's first real-time decoding of scalable qLDPC quantum error-correction codes.
The system maintained 99% fidelity, compared with 95% without correction, thanks to NVQLink's microsecond-scale latencies. With NVQLink, scientists and developers gain a universal bridge between quantum and classical hardware — making scalable error correction, hybrid applications and real-time quantum-GPU workflows practical.

In the Asia-Pacific region, early adopters include Japan's Global Research and Development Center for Business by Quantum-AI Technology (G-QuAT) at the National Institute of Advanced Industrial Science and Technology (AIST) and the RIKEN Center for Computational Science, the Korea Institute of Science and Technology Information (KISTI), Taiwan's National Center for High-Performance Computing (NCHC), Singapore's National Quantum Computing Hub (a joint initiative of Singapore's Centre for Quantum Technologies, the A*STAR Institute of High Performance Computing and the National Supercomputing Centre Singapore) and Australia's Pawsey Supercomputing Research Centre.

Across Europe and the Middle East, NVQLink is being embraced by CINECA, Denmark's DCAI (operator of Denmark's AI supercomputer), France's Grand Équipement National de Calcul Intensif (GENCI), the Czech Republic's IT4Innovations National Supercomputing Center (IT4I), Germany's Jülich Supercomputing Centre (JSC), Poland's Poznań Supercomputing and Networking Center (PCSS), the UAE's Technology Innovation Institute (TII) and Saudi Arabia's King Abdullah University of Science and Technology (KAUST).

In the United States, leading national laboratories — including Brookhaven National Laboratory, Fermi National Accelerator Laboratory, Lawrence Berkeley National Laboratory, Los Alamos National Laboratory, MIT Lincoln Laboratory, the National Energy Research Scientific Computing Center, Oak Ridge National Laboratory, Pacific Northwest National Laboratory and Sandia National Laboratories — are also adopting NVQLink to advance hybrid quantum-classical research.

Developing Real-World Hybrid Applications

Quantinuum's
Helios QPU with NVQLink delivered:

- The first real-time decoding of qLDPC error-correction codes
- ~99% fidelity with NVQLink correction vs. ~95% without
- A reaction time of 60 microseconds, beating Helios' 1-millisecond requirement by more than 16x

NVQLink unites quantum processors with GPU supercomputing for scalable error correction and hybrid applications. Scientists gain a single programming environment through CUDA-Q APIs, and developers can build and test quantum-GPU workflows in real time. With NVQLink, the world's supercomputing centers are laying the foundation for practical quantum-classical systems, connecting diverse quantum processors to NVIDIA accelerated computing at unprecedented speed and scale.

NVIDIA and RIKEN Advance Japan's Scientific Frontiers

NVIDIA and RIKEN are building two new GPU-accelerated supercomputers to expand Japan's leadership in AI for science and quantum computing. Together, the systems will feature 2,140 NVIDIA Blackwell GPUs connected through the GB200 NVL4 platform and NVIDIA Quantum-X800 InfiniBand networking, strengthening Japan's sovereign AI strategy and secure domestic infrastructure.

- AI for Science System: 1,600 Blackwell GPUs will power research in life sciences, materials science, climate and weather forecasting, manufacturing and laboratory automation.
- Quantum Computing System: 540 Blackwell GPUs will accelerate quantum algorithms, hybrid simulation and quantum-classical methods.

The partnership builds on RIKEN's collaboration with Fujitsu and NVIDIA to codesign FugakuNEXT, successor to the Fugaku supercomputer, which is expected to deliver 100x greater application performance and integrate production-level quantum computers by 2030. The two new RIKEN systems are scheduled to be operational in spring 2026.

Arm Adopting NVIDIA NVLink Fusion

AI is reshaping data centers in a once-in-a-generation architectural shift, where efficiency per watt defines success.
At the center is Arm Neoverse, deployed in over a billion cores and projected to reach 50% hyperscaler market share by 2025. Every major provider — AWS, Google, Microsoft, Oracle and Meta — is building on Neoverse, underscoring its role in powering AI at scale.

To meet surging demand, Arm is extending Neoverse with NVIDIA NVLink Fusion, the high-bandwidth, coherent interconnect first pioneered with Grace Blackwell. NVLink Fusion links CPUs, GPUs and accelerators into one unified rack-scale architecture, removing the memory and bandwidth bottlenecks that limit AI performance. Connected with Arm's AMBA CHI C2C protocol, it ensures seamless data movement between Arm-based CPUs and partners' preferred accelerators.

Together, Arm and NVIDIA are setting a new standard for AI infrastructure, enabling ecosystem partners to build differentiated, energy-efficient systems that accelerate innovation across the AI era.

Smarter Power for Accelerated Computing

As AI factories scale, energy is becoming the new bottleneck. The NVIDIA Domain Power Service (DPS) flips that constraint into an opportunity — turning power into a dynamic, orchestrated resource. Running as a Kubernetes service, DPS models and manages energy use across the data center, from rack to room to facility. It enables operators to extract more performance per megawatt by constraining power intelligently, improving throughput without expanding infrastructure.

DPS integrates tightly with the NVIDIA Omniverse DSX Blueprint, a platform for designing and operating next-generation data centers. It works alongside technologies like Power Reservation Steering, which balances workloads across the facility, and the Workload Power Profile Solution, which tunes GPU power to the needs of specific jobs. Together, they form DSX Boost — an energy-aware control layer that maximizes efficiency while meeting performance targets.

DPS also extends beyond the data center.
With grid-facing APIs, it supports automated load shedding and demand response, helping utilities stabilize the grid during peak events. The result is a resilient, grid-interactive AI factory that turns every watt into measurable progress.
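The kind of energy-aware budgeting described above can be sketched in a few lines. The function below is hypothetical (its names and policy are invented for illustration and are not NVIDIA's API): it splits a rack power budget across jobs by priority weight and clamps each share to a GPU's supported power-limit range.

```python
# Hypothetical sketch of energy-aware power budgeting in the spirit of DPS
# (illustration only; names and policy are invented, not NVIDIA's API).

def allocate_power(budget_w: float, jobs: dict[str, float],
                   min_w: float = 300.0, max_w: float = 700.0) -> dict[str, float]:
    """Split a power budget across jobs proportionally to priority weight,
    clamped to each GPU's supported power-limit range."""
    total_weight = sum(jobs.values())
    caps = {}
    for name, weight in jobs.items():
        share = budget_w * weight / total_weight
        caps[name] = max(min_w, min(max_w, share))
    return caps

# A 2 kW rack budget shared by three jobs of different priority.
caps = allocate_power(2000.0, {"training": 2.0, "inference": 1.5, "batch": 0.5})
for job, cap in caps.items():
    print(f"{job}: {cap:.0f} W")
```

A real orchestrator would layer telemetry, workload power profiles and grid signals on top of a policy like this; the sketch only shows the core idea of turning a fixed power envelope into per-job caps.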