NEWS
27 May 2026
Correction 27 May 2026

The new open-source atlas, generated by an AI tool called ESMFold2, vastly increases the known protein universe.

By Ewen Callaway & Miryam Naddaf

Miryam Naddaf is a reporter for Nature in London.

The AI tool designed binders against cytotoxic T-lymphocyte-associated protein 4 (CTLA-4). Credit: Molekuul/Science Photo Library

The known protein universe just got a lot bigger. A newly released artificial-intelligence tool has generated an atlas of more than one billion predicted protein structures and billions more protein sequences.

The database, known as the ESM Atlas, was unveiled today by researchers at the Chan Zuckerberg Initiative’s Biohub, a biomedical institute created in San Francisco, California, by Facebook founder Mark Zuckerberg and his wife, physician and educator Priscilla Chan.

The atlas eclipses the AlphaFold Database of predicted protein structures by more than 800 million entries, and a previous ESM Atlas by some 300 million.

The predictions were made using ESMFold2, an AI model that Biohub says surpasses the performance of AlphaFold3, the latest version of Google DeepMind’s system, and of other protein-structure prediction AIs. The atlas is described in a preprint released today [1].

“What this atlas does is it shows the totality of protein biology and especially the parts that are most unknown,” says Biohub science head Alex Rives, who led the effort. “We think it’s going to be a really powerful substrate for the discovery of new biology.”

Other scientists are impressed with the results, and especially with the fact that ESMFold2 is fully open source.
But the Biohub model enters an increasingly crowded field, in which competing open-source and proprietary protein models are making gains at breakneck speed.

Antibody predictions

ESMFold2 is based on a ‘protein language’ model that Rives’s team unveiled in 2024, which was trained on billions of proteins from across the tree of life. It includes ‘metagenomic’ sequences from soil, ocean and other environments, which are absent from the AlphaFold Database of predicted protein structures.

Rives’s team says that ESMFold2 outperforms existing methods, including AlphaFold3, at determining the correct structure of complexes of interacting proteins, including antibody molecules binding to their target antigens.

In the preprint, the researchers describe how they used ESMFold2 to design new antibodies and other proteins that can strongly attach to proteins implicated in cancers and immunological conditions. When created and tested in the lab, a high proportion of the designs worked as predicted.

Rives’s team used the tool to create an atlas containing 1.1 billion predicted protein structures as well as information on the sequences of 6.8 billion proteins. Most of these come from metagenomic sequences that had been only poorly characterized. Rives hopes that the atlas, which will be freely accessible, will help scientists to make connections between the known and unknown parts of the protein universe. Using the atlas, the researchers found structural similarities between CRISPR microbial defence proteins and a gene-editing protein that was identified in a soil fungus in 2023 and is also found in other eukaryotic species.

Supplementary database

doi: https://doi.org/10.1038/d41586-026-01686-3

Additional reporting by Miryam Naddaf.

Updates & Corrections

Correction 27 May 2026: Chan Zuckerberg Initiative's Biohub is simply referred to as Biohub, not CZI-Biohub.

References

1. Candido, S. et al. Preprint (2026).
2. Yeo, J. et al. Preprint at bioRxiv https://doi.org/10.1101/2025.04.23.650224 (2025).