Microsoft deploys world's first 'supercomputer-scale' GB300 NVL72 Azure cluster — 4,608 GB300 GPUs linked together to form a single, unified accelerator capable of 1.44 PFLOPS of inference

Wait 5 sec.

Microsoft has just added a GB300 NVL72 supercluster to Azure, deploying a total of 4,608 GB300 GPUs across 64 racks. Each rack consists of 72x GPUs interconnected via NVLink, allowing for 1.44 PFLOPS of FP4 Tensor Core performance. Each rack is then intra-connected via Quantum-X800 InfiniBand at 800 Gb/s per GPU.