NVIDIA, Ineffable Intelligence Team Up to Build the Future of Reinforcement Learning Infrastructure

Reinforcement learning agents, AI systems that learn by trial and error, can convert computation into new knowledge.

That’s the focus of a new engineering-level collaboration between NVIDIA and Ineffable Intelligence, the London-based AI lab founded by AlphaGo architect David Silver. The collaboration follows Ineffable’s emergence from stealth last week.

“The next frontier of AI is superlearners: systems that learn continuously from experience,” said Jensen Huang, founder and CEO of NVIDIA. “We are thrilled to partner with Ineffable Intelligence to codesign the infrastructure for large-scale reinforcement learning as they push the frontier of AI and pioneer a new generation of intelligent systems.”

Silver is one of the pioneers of reinforcement learning, an approach that has transformed AI research. He’s focused on further developing this approach into a new paradigm.

“Researchers have largely solved the easier problem of AI: how to build systems that know all the things humans already know,” Silver said. “But now we need to solve the harder problem of AI: how to build systems that discover new knowledge for themselves. That requires a very different approach: systems that learn from experience.”

That kind of learning needs a powerful and highly optimized pipeline to support it. Unlike pretraining, where a fixed dataset of human data flows through the system, reinforcement learning workloads generate their data on the fly. The system has to act, observe, score and update continuously in tight loops, which puts pressure on interconnect, memory bandwidth and serving in ways that pretraining doesn’t. Furthermore, the system will train on rich forms of experience that are quite distinct from human language and other human data, and it may require novel model architectures and training algorithms. That’s where NVIDIA and Ineffable are focusing their technical work: building a pipeline that can feed reinforcement learning systems at scale.

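As a rough illustration of the act, observe, score and update cycle described above, here is a minimal Python sketch. It is not NVIDIA's or Ineffable's pipeline. The toy environment, the names `ToyEnvironment` and `run_loop`, the payoff probabilities and the epsilon-greedy rule are all assumptions chosen for illustration only.

```python
import random


class ToyEnvironment:
    """Hypothetical two-action environment; action 1 pays off more often."""

    def __init__(self, rng):
        self.rng = rng
        self.payoff_prob = [0.2, 0.8]  # assumed values, for illustration only

    def step(self, action):
        # Score the action: reward 1.0 with that action's probability, else 0.0.
        return 1.0 if self.rng.random() < self.payoff_prob[action] else 0.0


def run_loop(num_steps=5000, epsilon=0.1, seed=0):
    """Run a tiny act/observe/score/update loop and return the learned estimates."""
    rng = random.Random(seed)
    env = ToyEnvironment(rng)
    value = [0.0, 0.0]  # running estimate of each action's average reward
    count = [0, 0]      # how many times each action has been tried

    for _ in range(num_steps):
        # Act: explore with probability epsilon, otherwise pick the best estimate.
        if rng.random() < epsilon:
            action = rng.randrange(2)
        else:
            action = 0 if value[0] >= value[1] else 1

        # Observe and score: the environment produces the training signal on the fly.
        reward = env.step(action)

        # Update: fold the new reward into the running mean for that action.
        count[action] += 1
        value[action] += (reward - value[action]) / count[action]

    return value, count


if __name__ == "__main__":
    value, count = run_loop()
    print("estimated rewards:", value)
    print("times each action was chosen:", count)
```

Calling `run_loop()` returns the agent's reward estimates and how often it chose each action. Every training example in this sketch is produced by the agent's own choices, which is the contrast with pretraining on a fixed dataset drawn above.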
Engineers from both companies have teamed up to explore the best way to create this training pipeline. The work is starting on NVIDIA Grace Blackwell and will be among the first to explore the upcoming NVIDIA Vera Rubin platform. The goal is to understand the next generation of hardware and software that will be required as the AI world shifts beyond human data toward models that learn through simulation and experience. Getting this infrastructure right will unlock an unprecedented scale of reinforcement learning in highly complex and rich environments, allowing agents to discover breakthroughs across all fields of knowledge.