Positron AI says its Atlas accelerator beats Nvidia H200 on inference at just 33% of the power: delivers 280 tokens per second per user with Llama 3.1 8B in a 2000W envelope
Read post on tomshardware.com
Cloudflare is testing Positron AI's Atlas system, an inference-only machine built on the company's Archer accelerators. Positron claims it outperforms Nvidia's DGX H200 while using one-third of the power.