Autonomous generation of decision-grade clinical evidence

Medical practice is bottlenecked by the slow production of high-quality clinical evidence. Despite progress in automating selected stages, autonomous conduct of the entire research life cycle remains beyond reach. Here we present OpenEBM, the first autonomous system to generate decision-grade clinical evidence by conducting evidence-synthesis research end to end. To enable and evaluate this system, we develop OpenEBM-Corpus, a foundation resource of expert-annotated research trajectories for training a specialist model, and OpenEBM-Bench, a multidisciplinary benchmark that evaluates the entire research life cycle. Our compact specialist model generates valid clinical evidence in 90.7% of end-to-end evaluations and matches expert performance across the research trajectory, whereas GPT-5 falls to 3.8% as failures propagate through dependent stages. In blinded evaluations across clinical domains, independent evaluators prefer OpenEBM at multiple stages and cannot distinguish its reasoning traces from expert-conducted work above chance. Applied to a question left unresolved by current guidelines, OpenEBM produces de novo evidence addressing the efficacy and safety of neoadjuvant chemotherapy for locally advanced rectal cancer. OpenEBM brings within reach the founding aspiration of evidence-based medicine and establishes a paradigm for scalable evidence generation.