In collaboration with NVIDIA, researchers from SGLang have published early benchmarks of the GB200 (Grace Blackwell) NVL72 system, showing up to a 2.7× increase in LLM inference throughput over the H100 on the 671B-parameter DeepSeek-V3 model.

By Matt Foster