Positron AI says its Atlas accelerator beats Nvidia H200 on inference at just 33% of the power, delivering 280 tokens per second per user with Llama 3.1 8B in a 2000W envelope

Cloudflare is testing Positron AI's Atlas machine, an inference-only system built on Archer accelerators that the company claims outperforms Nvidia's DGX H200 while using one-third of the power.