CRISPR-GPT for agentic automation of gene-editing experiments


Main

Large language models (LLMs) have demonstrated exceptional capabilities in language skills and encapsulate a substantial amount of world knowledge1,2,3,4,5. Recent research has also enhanced LLMs with external tools, improving their problem-solving abilities and efficiencies6,7,8. Moreover, LLMs have demonstrated potential as tool makers9 and black-box optimizers10. To this end, researchers have explored LLM-based specialized models for various scientific domains11,12, particularly for mathematics and chemistry tasks. ChemCrow13 uses a tool-augmented LLM to solve a range of chemistry-related tasks such as paracetamol synthesis, whereas Co-scientist14 integrates automated experimentation, achieving successful optimization of palladium-catalysed cross-coupling reactions. LLMs have also shown initial promise in generating biological protocols, as demonstrated by studies like BioPlanner15. While recent advancements, such as OpenAI’s o1 preview, have improved reasoning abilities in areas such as mathematics and coding, progress in biological tasks remains comparatively limited. This limitation stems from general-purpose LLMs’ lack of in-depth understanding of biology, compounded by the unique challenges of biological experiments, including the variability of living systems, the noisy nature of biological data and the highly specialized, less transferable nature of biological skills and tools.

Gene editing has transformed biological research and medicine, allowing for precise DNA modifications for both therapeutic and experimental applications. CRISPR-Cas, the most well-known gene-editing technology, originated from bacterial immune systems16,17,18,19,20,21,22,23,24. Its development has led to advanced techniques like CRISPR activation and interference (CRISPRa/i)25,26,27,28,29, base editing30,31 and prime editing32,33, creating a powerful toolkit for genetic modification and epigenetic modulation.
In basic biomedical research, CRISPR gene-editing has become one of the most frequently used laboratory techniques: at the largest non-profit plasmid DNA repository, Addgene, 8 of the 15 top requested plasmids worldwide were for CRISPR gene-editing34. On the application side, CRISPR has produced the first permanent cure for sickle cell disease (SCD)35 and β-thalassaemia36, as well as facilitating plant engineering for sustainable agriculture20. Because CRISPR is one of the most powerful biotechnologies, numerous software tools and protocols exist for specific gene-editing tasks. Despite these resources, an end-to-end solution (from CRISPR-Cas system selection, guide (g)RNA design and off-target evaluation to delivery and data analysis) remains complex, particularly for newcomers. AI-assisted tools can simplify gene-editing experiment design and data analysis, making the technology more accessible and accelerating scientific and therapeutic discoveries.

We introduce CRISPR-GPT, a solution that combines the strengths of LLMs with domain-specific knowledge, chain-of-thought reasoning, instruction fine-tuning, retrieval techniques and tools. CRISPR-GPT is centred around LLM-powered planning and execution agents (Fig. 1). This system leverages the reasoning abilities of general-purpose LLMs and multi-agent collaboration for task decomposition, constructing state machines and automated decision-making (Fig. 2a). It draws upon expert knowledge from leading practitioners and peer-reviewed published literature in gene editing for retrieval-augmented generation (RAG)13.

Fig. 1: Overview of CRISPR-GPT.
CRISPR-GPT is an LLM-powered multi-agent system designed to provide AI copiloting for human researchers in gene editing. It supports four primary gene-editing modalities: knockout, base editing, prime editing and epigenetic editing (CRISPRa/i).
The system offers three user interaction modes: Meta mode (step-by-step guidance on predefined tasks), Auto mode (customized guidance based on user requests) and Q&A mode (real-time answers to ad hoc questions), to streamline experiment design and planning. CRISPR-GPT consists of four core components: the User proxy, LLM planner, Task executor and Tool provider. Together, these components are equipped with a comprehensive suite of tools and decision-support capabilities to facilitate the design, planning and analysis of gene-editing workflows. To evaluate CRISPR-GPT’s performance, we developed the Gene-editing bench, a framework of 288 test cases covering tasks such as experimental planning, sgRNA design, delivery method selection and more. Figure was originally created with BioRender.com/tb8sq6f.

Fig. 2: CRISPR-GPT adopts a compositional, multi-agent architecture to enable human–AI collaboration and automated experimental designs.
a, The backbone of CRISPR-GPT involves multi-agent collaboration between four core components: (1) The LLM Planner agent is responsible for configuring tasks on the basis of the user’s needs. It automatically performs task decomposition on the basis of the user’s request, the descriptions of the currently supported tasks and internal knowledge. The state machines of the selected tasks are chained together to fulfill the user’s request. (2) The Task executor agent implements the chain of state machines from the Planner agent and is responsible for providing instructions and feedback, receiving input from the User-proxy agent and calling external tools. State machines are central to the Task executor, where each state is responsible for one round of interaction with the user. The instruction is provided to the user first with sufficient information for the current decision-making step and the required inputs.
After receiving the response from the user, it provides output and feedback, where Tool providers are potentially called during the execution of the state. Afterwards, the state machine transitions to the next state. (3) The LLM User-proxy agent is responsible for interacting with the Task executor on behalf of the user, where the user can monitor the process and provide corrections to the User-proxy agent if the generated content needs modification or improvement. It generates responses to every step of the state machine on behalf of the user. (4) Tool providers support diverse external tools and connect to search engines or databases via API calls. Part of the panel was created with BioRender.com/svkmgjk.
b, Breakdown of individual tasks in a typical CRISPR-GPT workflow for gene-editing experiments.

Results

Building AI co-pilot harnessing LLM’s reasoning ability

CRISPR-GPT supports four major gene-editing modalities and 22 gene-editing experiment tasks (Fig. 1 and Supplementary Table 1). It offers tunable levels of automation via three modes: Meta, Auto and Q&A. They are designed to accommodate users ranging from novice PhD-level scientists new to gene editing, to domain experts looking for more efficient, automated solutions for selected tasks (Fig. 1). The ‘Meta mode’ is designed for beginner researchers, guiding them through a sequence of essential tasks from selection of CRISPR systems, delivery methods, to designing gRNA, assessing off-target efficiency, generating experiment protocols and data analysis. Throughout this decision-making process, CRISPR-GPT interacts with users at every step, provides instructions and seeks clarifications when needed. The ‘Auto mode’ caters to advanced researchers and does not adhere to a predefined task order. Users submit a freestyle request, and the LLM Planner decomposes this into tasks, manages their interdependence, builds a customized workflow and executes them automatically.
It fills in missing information on the basis of the initial inputs and explains its decisions and thought process, allowing users to monitor and adjust the process. The ‘Q&A mode’ supports users with on-demand scientific inquiries about gene editing.

To assess the AI agent’s capabilities to perform gene-editing research, we compiled an evaluation test set, Gene-editing bench, from both public sources and human experts (details in Supplementary Note C). This test set covers a variety of gene-editing tasks (Fig. 1). By using the test set, we performed extensive evaluation of CRISPR-GPT’s capabilities in major gene-editing research tasks, such as experiment planning, delivery selection, single guide (sg)RNA design and experiment troubleshooting. In addition, we invited human experts to perform a thorough user experience evaluation of CRISPR-GPT and collected valuable human feedback.

Further, we implement CRISPR-GPT in real-world wet labs. Using CRISPR-GPT as an AI co-pilot, we demonstrate a fully AI-guided knockout (KO) of four genes: TGFβR1, SNAI1, BAX and BCL2L1, using CRISPR-Cas12a in a human lung adenocarcinoma cell line, as well as AI-guided CRISPR-dCas9 epigenetic activation of two genes: NCR3LG1 and CEACAM1, in a human melanoma model. All these wet-lab experiments were carried out by junior researchers not familiar with gene editing. Both experiments succeeded on the first attempt, confirmed by not only editing efficiencies, but also biologically relevant phenotypes and protein-level validation, highlighting the potential of LLM-guided biological research.

CRISPR-GPT is a multi-agent, compositional system involving a team of LLM-based agents, including an LLM Planner agent, a User-proxy agent, Task executor agents and Tool provider agents (Fig. 2a). These components are powered by LLMs to interact with one another as well as the human user.
We also refer to the full system as an ‘agent’ to encapsulate the overall functionalities.

To automate biological experiment design and analysis, we view the overall problem as sequential decision-making. This perspective frames the interaction between the user and the automated system as a series of decision-making steps, each essential for progressing towards the ultimate goal. Take the Auto mode for example. A user can initiate the process with a meta-request, for example, “I want to knock out the human TGFβR1 gene in A549 lung cancer cells”. In response, the agent’s LLM Planner will analyse the user’s request, drawing on its extensive internal knowledge base via retrieval techniques. Leveraging the reasoning abilities of the base LLM, the Planner generates a chain-of-thought37,38 reasoning path and chooses an optimal action from a set of plausible ones while following expert-written guidelines. Consequently, the Planner breaks down the user’s request into a sequence of discrete tasks, for example, ‘CRISPR-Cas system selection’ and ‘gRNA design for knockout’, while managing interdependencies among these tasks. Each individual task is solved by an LLM-powered state machine, via the Task executor, entailing a sequence of states to progress towards the specific goal. After the meta-task decomposition, the Task executor will chain the state machines of the corresponding tasks together into a larger state machine and begin the execution process, systematically addressing each task in sequence to ensure that the experiment’s objectives are met efficiently and effectively (Fig. 2a).

The User-proxy agent is responsible for guiding the user throughout the decision-making process via multiple rounds of textual interactions (typical user interactions required by each task detailed in Supplementary Table 2).
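
The chaining of per-task state machines described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class and function names (State, TaskStateMachine, chain, execute), the instruction strings and the scripted responder standing in for the User-proxy agent are all hypothetical, and no LLM or external tool is called. Each state performs one round of interaction, and a shared memory lets a later task depend on the result of an earlier one.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical names throughout; the paper does not publish this interface.


@dataclass
class State:
    """One round of interaction: an instruction goes out, a response comes in."""
    name: str
    instruction: str
    # Turns the response into an output; a real state might call a tool here.
    handle: Callable[[str, Dict[str, str]], str]


@dataclass
class TaskStateMachine:
    """A task modelled as an ordered sequence of states."""
    task: str
    states: List[State]


def chain(machines: List[TaskStateMachine]) -> List[State]:
    """Concatenate per-task state machines into one larger machine."""
    return [state for machine in machines for state in machine.states]


def execute(states: List[State], respond: Callable[[str], str]) -> Dict[str, str]:
    """Run the states in order; `respond` stands in for the User-proxy agent."""
    memory: Dict[str, str] = {}
    for state in states:
        response = respond(state.instruction)
        memory[state.name] = state.handle(response, memory)
    return memory


select_system = TaskStateMachine(
    task="CRISPR-Cas system selection",
    states=[State("cas_system",
                  "Which CRISPR-Cas system do you want to use?",
                  lambda response, memory: response.strip())],
)
design_grna = TaskStateMachine(
    task="gRNA design for knockout",
    # This state reads the earlier result, showing an inter-task dependency.
    states=[State("grna_target",
                  "Which gene should be knocked out?",
                  lambda response, memory:
                      f"design {memory['cas_system']} gRNAs for {response.strip()}")],
)

# Scripted answers replace a real user or LLM proxy in this sketch.
scripted = {
    "Which CRISPR-Cas system do you want to use?": "Cas12a",
    "Which gene should be knocked out?": "TGFBR1",
}
result = execute(chain([select_system, design_grna]),
                 lambda instruction: scripted[instruction])
print(result)
```

Keeping each task as its own small machine and chaining them only at execution time is one way to obtain the modularity the text describes: tasks can be reordered or dropped without changing the executor.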
At each decision point, the internal state machine presents a ‘state variable’ to the User-proxy agent, which includes the current task instructions, and specifies any necessary input from the user to proceed. The User-proxy agent then interprets this state given the user interactions and makes informed decisions as input to the Task executor on behalf of the user. Subsequently, the User-proxy agent receives feedback from the Task executor, including the task results and the reasoning process that led to those outcomes. Concurrently, the User-proxy agent continues to interact with the user and provides them with instructions, continuously integrating their feedback to ensure alignment with the user’s objectives (detailed in Methods; Fig. 2a and Supplementary Fig. 1).

To enhance the LLM with domain knowledge, we enable the CRISPR agent to retrieve and synthesize information from published protocols, peer-reviewed research papers and expert-written guidelines, and to utilize external tools and conduct web searches via Tool provider agents (Fig. 2a).

For an end-to-end gene-editing workflow, CRISPR-GPT typically constructs a chain of tasks that includes selecting the appropriate CRISPR system, recommending delivery methods, designing gRNAs, predicting off-target effects, selecting experimental protocols, planning validation assays and performing data analysis (Fig. 2b). The system’s modular architecture facilitates easy integration of additional functionalities and tools. CRISPR-GPT serves as a prototype LLM-powered AI co-pilot for scientific research, with potential applications extending beyond gene editing.

CRISPR-GPT agents automate gene-editing research tasks

CRISPR-GPT is able to automate gene-editing research via several key functionalities.
For each functionality we discuss the agentic implementation and evaluation results.

Experiment planning

The Task planner agent is charged with directing the entire workflow and breaking down the user’s meta-request into a task chain (Fig. 2b). While the Planner selects and follows a predefined workflow in the Meta mode, it is able to take in freestyle user requests and auto-build a customized workflow in the Auto mode. For example, a user may only need part of the pre-designed workflow including CRISPR-Cas system selection, delivery method selection, gRNA design and experimental protocol selection before the experiment. Then the Task planner agent extracts the right information from the user request and assembles a customized workflow to suit user needs (Fig. 3a). To evaluate CRISPR-GPT’s ability to correctly lay out gene-editing tasks and manage intertask dependence, we compiled a planning test set, as part of the Gene-editing bench, with user requests and golden answers curated by human experts. Using this test set, we evaluated CRISPR-GPT in comparison with prompted general LLMs, showing that CRISPR-GPT outperforms general LLMs in planning gene-editing tasks (Fig. 3b). The CRISPR-GPT agent driven by GPT-4o scored over 0.99 in accuracy, precision, recall and F1 score, and had