Exapted CRISPR–Cas12f homologues drive RNA-guided transcription

Abstract

Bacterial transcription initiation is a tightly regulated process that canonically relies on sequence-specific promoter recognition by dedicated sigma (σ) factors, leading to functional DNA engagement by RNA polymerase (RNAP)¹. Although the seven σ factors in Escherichia coli have been extensively characterized², Bacteroidetes species encode dozens of specialized, extracytoplasmic function σ factors (σE) whose precise roles are unknown, pointing to additional layers of regulatory potential³. Here we uncover a mechanism of RNA-guided gene activation involving the coordinated action of σE factor in complex with nuclease-dead Cas12f (dCas12f). We screened a large set of genetically linked dCas12f and σE homologues in E. coli using RNA and chromatin immunoprecipitation experiments, revealing systems that exhibit robust guide RNA enrichment and DNA target binding with a minimal 5′-G target-adjacent motif. Recruitment of σE was dependent on dCas12f and guide RNA, suggesting direct protein–protein interactions, and co-expression experiments demonstrated that the dCas12f–gRNA–σE ternary complex was competent for programmable recruitment of the RNAP holoenzyme. Remarkably, dCas12f–RNA–σE complexes drove potent gene expression in the absence of any requisite promoter motifs, with de novo transcription start sites defined exclusively by the relative distance from the dCas12f-mediated R-loop.
Our findings highlight a new paradigm of RNA-guided transcription that embodies natural features reminiscent of CRISPR activation (CRISPRa) technology⁴,⁵.

Fig. 1: Nuclease-dead Cas12f homologues are genetically associated with atypical σ factor genes.
Fig. 2: Experimental discovery of guide RNA and target DNA substrates of dCas12f.
Fig. 3: dCas12f-associated gRNAs target conserved ncRNA loci and regulate diverse gene expression programmes.
Fig. 4: RNA-guided dCas12f recruits σE, but not HTH, to genomic target sites.
Fig. 5: dCas12f and σE direct programmable, RNA-guided transcription with single base-pair resolution.

Data availability

Next-generation sequencing data are available in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (BioProject accession: PRJNA1247282) and the Gene Expression Omnibus (GSE293889). The published genomes used for bioinformatics analyses were obtained from NCBI (Supplementary Table 4). Datasets generated and analysed in the current study are available from the corresponding authors on reasonable request.

Code availability

Custom scripts used for bioinformatics are available at https://github.com/sternberglab/Hoffmann_et_al_2026.

References

1. Feklistov, A., Sharon, B. D., Darst, S. A. & Gross, C. A. Bacterial sigma factors: a historical, structural, and genomic perspective. Annu. Rev. Microbiol. 68, 357–376 (2014).
2. Sharma, U. K. & Chatterji, D.
Transcriptional switching in Escherichia coli during stress and starvation by modulation of σ70 activity. FEMS Microbiol. Rev. 34, 646–657 (2010).
3. Casas-Pastor, D. et al. Expansion and re-classification of the extracytoplasmic function (ECF) σ factor family. Nucleic Acids Res. 49, 986–1005 (2021).
4. Bikard, D. et al. Programmable repression and activation of bacterial gene expression using an engineered CRISPR–Cas system. Nucleic Acids Res. 41, 7429–7437 (2013).
5. Gilbert, L. A. et al. Genome-scale CRISPR-mediated control of gene repression and activation. Cell 159, 647–661 (2014).
6. Makarova, K. S. et al. Evolutionary classification of CRISPR–Cas systems: a burst of class 2 and derived variants. Nat. Rev. Microbiol. 18, 67–83 (2020).
7. Jinek, M. et al. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816–821 (2012).
8. Zetsche, B. et al. Cpf1 is a single RNA-guided endonuclease of a class 2 CRISPR–Cas system. Cell 163, 759–771 (2015).
9. Shmakov, S. et al. Diversity and evolution of class 2 CRISPR–Cas systems. Nat. Rev. Microbiol. 15, 169–182 (2017).
10. Altae-Tran, H. et al. The widespread IS200/IS605 transposon family encodes diverse programmable RNA-guided endonucleases. Science 374, 57–65 (2021).
11. Sasnauskas, G. et al. TnpB structure reveals minimal functional core of Cas12 nuclease family. Nature 616, 384–389 (2023).
12. Meers, C. et al. Transposon-encoded nucleases use guide RNAs to promote their selfish spread. Nature 622, 863–871 (2023).
13. Durrant, M. G. et al. Bridge RNAs direct programmable recombination of target and donor DNA. Nature 630, 984–993 (2024).
14. Siddiquee, R., Pong, C. H., Hall, R. M. & Ataide, S. F. A programmable seekRNA guides target selection by IS1111 and IS110 type insertion sequences. Nat. Commun. 15, 5235 (2024).
15. Vaysset, H., Meers, C., Cury, J., Bernheim, A. & Sternberg, S. H. Evolutionary origins of archaeal and eukaryotic RNA-guided RNA modification in bacterial IS110 transposons. Nat. Microbiol. 10, 20–27 (2025).
16. Saito, M. et al. Fanzor is a eukaryotic programmable RNA-guided endonuclease. Nature 620, 660–668 (2023).
17. Altae-Tran, H. et al. Diversity, evolution, and classification of the RNA-guided nucleases TnpB and Cas12. Proc. Natl Acad. Sci. USA 120, e2308224120 (2023).
18. Wiegand, T. et al. TnpB homologues exapted from transposons are RNA-guided transcription factors. Nature 631, 439–448 (2024).
19. Workman, R. E. et al. A natural single-guide RNA repurposes Cas9 to autoregulate CRISPR–Cas expression. Cell 184, 675–688.e19 (2021).
20. Sampson, T. R., Saroj, S. D., Llewellyn, A. C., Tzeng, Y.-L. & Weiss, D. S. A CRISPR/Cas system mediates bacterial innate immune evasion and virulence. Nature 497, 254–257 (2013).
21. Ratner, H. K. et al. Catalytically active Cas9 mediates transcriptional interference to facilitate bacterial virulence. Mol. Cell 75, 498–510.e5 (2019).
22. Wu, W. Y. et al. The miniature CRISPR–Cas12m effector binds DNA to block transcription. Mol. Cell 82, 4487–4502.e7 (2022).
23. Huang, C. J., Adler, B. A. & Doudna, J. A. A naturally DNase-free CRISPR–Cas12c enzyme silences gene expression. Mol. Cell 82, 2148–2160.e4 (2022).
24. Li, M. et al. Toxin-antitoxin RNA pairs safeguard CRISPR–Cas systems. Science 372, eabe5601 (2021).
25. Qi, L. S. et al. Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell 152, 1173–1183 (2013).
26. Gilbert, L. A. et al. CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes. Cell 154, 442–451 (2013).
27. Zhang, X. et al. Multiplex gene regulation by CRISPR–ddCpf1. Cell Discov. 3, 17018 (2017).
28. Zalatan, J. G. et al. Engineering complex synthetic transcriptional programs with CRISPR RNA scaffolds. Cell 160, 339–350 (2015).
29. Fontana, J. et al. Effective CRISPRa-mediated control of gene expression in bacteria must overcome strict target site requirements. Nat. Commun. 11, 1618 (2020).
30. Burgess, R. R., Travers, A. A., Dunn, J. J. & Bautz, E. K. F. Factor stimulating transcription by RNA polymerase. Nature 221, 43–46 (1969).
31. Ross, W. et al. A third recognition element in bacterial promoters: DNA binding by the α subunit of RNA polymerase. Science 262, 1407–1413 (1993).
32. Ishihama, A. Functional modulation of Escherichia coli RNA polymerase. Annu. Rev. Microbiol. 54, 499–518 (2000).
33. Paget, M. S. Bacterial sigma factors and anti-sigma factors: structure, function and distribution. Biomolecules 5, 1245–1265 (2015).
34. Erickson, J. W. & Gross, C. A. Identification of the sigma E subunit of Escherichia coli RNA polymerase: a second alternate sigma factor involved in high-temperature gene expression. Genes Dev. 3, 1462–1471 (1989).
35. Mecsas, J., Rouviere, P. E., Erickson, J. W., Donohue, T. J. & Gross, C. A. The activity of sigma E, an Escherichia coli heat-inducible sigma-factor, is modulated by expression of outer membrane proteins. Genes Dev. 7, 2618–2628 (1993).
36. Hove, B. V., Staudenmaier, H. & Braun, V. Novel two-component transmembrane transcription control: regulation of iron dicitrate transport in Escherichia coli K-12. J. Bacteriol. 172, 6749–6758 (1990).
37. Pinto, D. & da Fonseca, R. R. Evolution of the extracytoplasmic function σ factor protein family. NAR Genom. Bioinformatics 2, lqz026 (2020).
38. Martens, E. C., Roth, R., Heuser, J. E. & Gordon, J. I. Coordinate regulation of glycan degradation and polysaccharide capsule biosynthesis by a prominent human gut symbiont. J. Biol. Chem. 284, 18445–18457 (2009).
39. Ades, S. E. Regulation by destruction: design of the σE envelope stress response. Curr. Opin. Microbiol. 11, 535–540 (2008).
40. Xiao, R., Li, Z., Wang, S., Han, R. & Chang, L. Structural basis for substrate recognition and cleavage by the dimerization-dependent CRISPR–Cas12f nuclease. Nucleic Acids Res. 49, 4120–4128 (2021).
41. Aravind, L., Anantharaman, V., Balaji, S., Babu, M. & Iyer, L. The many faces of the helix-turn-helix domain: transcription regulation and beyond. FEMS Microbiol. Rev. 29, 231–262 (2005).
42. Yokoyama, T. et al. The Escherichia coli S2P intramembrane protease RseP regulates ferric citrate uptake by cleaving the sigma factor regulator FecR. J. Biol. Chem. 296, 100673 (2021).
43. Hoffmann, F. T. et al. Selective TnsC recruitment enhances the fidelity of RNA-guided transposition. Nature 609, 384–393 (2022).
44. Harrington, L. B. et al. Programmed DNA destruction by miniature CRISPR–Cas14 enzymes. Science 362, 839–842 (2018).
45. Karvelis, T. et al. PAM recognition by miniature CRISPR–Cas12f nucleases triggers programmable double-stranded DNA target cleavage. Nucleic Acids Res. 48, 5016–5023 (2020).
46. Karvelis, T. et al. Transposon-associated TnpB is a programmable RNA-guided DNA endonuclease. Nature 599, 692–696 (2021).
47. Brooks, B. E. & Buchanan, S. K. Signaling mechanisms for activation of extracytoplasmic function (ECF) sigma factors. Biochim. Biophys. Acta 1778, 1930–1945 (2008).
48. Noinaj, N., Guillier, M., Barnard, T. J. & Buchanan, S. K. TonB-dependent transporters: regulation, structure, and function. Microbiology 64, 43–60 (2010).
49. Birkholz, N. et al. Phage anti-CRISPR control by an RNA- and DNA-binding helix–turn–helix protein. Nature 631, 670–677 (2024).
50. Bayley, D. P., Rocha, E. R. & Smith, C. J. Analysis of cepA and other Bacteroides fragilis genes reveals a unique promoter structure. FEMS Microbiol. Lett. 193, 149–154 (2000).
51. Chen, S., Bagdasarian, M., Kaufman, M. G. & Walker, E. D. Characterization of strong promoters from an environmental Flavobacterium hibernum strain by using a green fluorescent protein-based reporter system. Appl. Environ. Microbiol. 73, 1089–1100 (2007).
52. Xiao, R. et al. Structural basis of RNA-guided transcription by a dCas12f–σE–RNAP complex. Nature https://doi.org/10.1038/s41586-026-10178-3 (2026).
53. Harden, T. T. et al. Bacterial RNA polymerase can retain σ70 throughout transcription. Proc. Natl Acad. Sci. USA 113, 602–607 (2016).
54. Jacob, F. & Monod, J. Genetic regulatory mechanisms in the synthesis of proteins. J. Mol. Biol. 3, 318–356 (1961).
55. Balleza, E. et al. Regulation by transcription factors in bacteria: beyond description. FEMS Microbiol. Rev. 33, 133–151 (2009).
56. Bak, G., Han, K., Kim, D. & Lee, Y. Roles of rpoS-activating small RNAs in pathways leading to acid resistance of Escherichia coli. MicrobiologyOpen 3, 15–28 (2014).
57. Massé, E. & Gottesman, S. A small RNA regulates the expression of genes involved in iron metabolism in Escherichia coli. Proc. Natl Acad. Sci. USA 99, 4620–4625 (2002).
58. Madej, M. et al. Structural and functional insights into oligopeptide acquisition by the RagAB transporter from Porphyromonas gingivalis. Nat. Microbiol. 5, 1016–1025 (2020).
59. Grondin, J. M., Tamura, K., Déjean, G., Abbott, D. W. & Brumer, H. Polysaccharide utilization loci: fueling microbial communities. J. Bacteriol. 199, e00860-16 (2017).
60. Lapébie, P., Lombard, V., Drula, E., Terrapon, N. & Henrissat, B. Bacteroidetes use thousands of enzyme combinations to break down glycans. Nat. Commun. 10, 2043 (2019).
61. Tong, M. et al. A highly conserved SusCD transporter determines the import and species-specific antagonism of Bacteroides ubiquitin homologues. Nat. Commun. 15, 8794 (2024).
62. Martens, E. C., Chiang, H. C. & Gordon, J. I. Mucosal glycan foraging enhances fitness and transmission of a saccharolytic human gut bacterial symbiont. Cell Host Microbe 4, 447–457 (2008).
63. Feng, J. et al. Polysaccharide utilization loci in Bacteroides determine population fitness and community-level interactions. Cell Host Microbe 30, 200–215.e12 (2022).
64. Todor, H. et al. Rewiring the specificity of extracytoplasmic function sigma factors. Proc. Natl Acad. Sci. USA 117, 33496–33506 (2020).
65. Gray, D. A. et al. Insights into SusCD-mediated glycan import by a prominent gut symbiont. Nat. Commun. 12, 44 (2021).
66. Takeda, S. N. et al. Structure of the miniature type V-F CRISPR–Cas effector enzyme. Mol. Cell 81, 558–570.e3 (2021).
67. Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinformatics 10, 421 (2009).
68. Katoh, K., Kuma, K., Toh, H. & Miyata, T. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 33, 511–518 (2005).
69. Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree 2 – approximately maximum-likelihood trees for large alignments. PLoS ONE 5, e9490 (2010).
70. Steinegger, M. & Söding, J. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat. Biotechnol. 35, 1026–1028 (2017).
71. Capella-Gutiérrez, S., Silla-Martínez, J. M. & Gabaldón, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).
72. Winter, D. J. rentrez: An R package for the NCBI eUtils API. The R Journal 9, 520–526 (2017).
73. Cantalapiedra, C. P., Hernández-Plaza, A., Letunic, I., Bork, P. & Huerta-Cepas, J. eggNOG-mapper v2: Functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol. Biol. Evol. 38, 5825–5829 (2021).
74. Tang, S. et al. De novo gene synthesis by an antiviral reverse transcriptase. Science 386, eadq0876 (2024).
75. Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 17, 10–12 (2011).
76. Vasimuddin, M., Misra, S., Li, H. & Aluru, S. in 2019 IEEE International Parallel and Distributed Processing Symposium 314–324 (IEEE Computer Society, 2019).
77. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
78. Ramírez, F. et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 44, W160–W165 (2016).
79. Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).
80. Liao, Y., Smyth, G. K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).
81. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
82. Chen, S., Zhou, Y., Chen, Y. & Gu, J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890 (2018).
83. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
84. Cooper, L. A., Stringer, A. M. & Wade, J. T. Determining the specificity of cascade binding, interference, and primed adaptation in vivo in the Escherichia coli Type I-E CRISPR–Cas system. mBio 9, e02100-17 (2018).
85. Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
86. Zhang, Y. et al. Model-based analysis of ChIP–seq (MACS). Genome Biol. 9, R137 (2008).
87. Bailey, T. L. et al. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. 37, W202–W208 (2009).
88. Will, S., Joshi, T., Hofacker, I. L., Stadler, P. F. & Backofen, R. LocARNA-P: accurate boundary prediction and improved detection of structural RNAs. RNA 18, 900–914 (2012).
89. Nawrocki, E. P. & Eddy, S. R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 29, 2933–2935 (2013).
90. Eddy, S. R. Accelerated profile HMM searches. PLoS Comput. Biol. 7, e1002195 (2011).
91. Schwengers, O. et al. Bakta: rapid and standardized annotation of bacterial genomes via alignment-free sequence identification. Microb. Genom. 7, 000685 (2021).
92. Rice, P., Longden, I. & Bleasby, A. EMBOSS: The European Molecular Biology Open Software Suite. Trends Genet. 16, 276–277 (2000).
93. Wickham, H.
Ggplot2: Elegant Graphics for Data Analysis (Springer, 2016).

Acknowledgements

We thank A. J. Robinson, S. Kang, J. L. Ramirez and T. M. Smith for laboratory support; E. A. Campbell and S. A. Darst for helpful discussion; Z. Hua for custom scripts; Z. Quan for providing Fta strains; the JP Sulzberger Columbia Genome Center for NGS support; and L. F. Landweber for qPCR and gel imager instrument access. S.T. was supported by a Ruth L. Kirschstein Individual Predoctoral Fellowship (F30AI183830) from the NIH. L.C. was supported by NIH grant R01GM138675 and by the National Science Foundation (NSF) Faculty Early Career Development Program (CAREER) Award 2339799. S.H.S. was supported by NIH grant R01EB031935, NSF Faculty Early Career Development Program (CAREER) Award 2239685, a Pew Biomedical Scholarship, an Irma T. Hirschl Career Scientist Award, the Howard Hughes Medical Institute Investigator Program, and a generous startup package from the Columbia University Irving Medical Center Dean’s Office and the Vagelos Precision Medicine Fund.

Author information

Author notes

Florian T. Hoffmann
Present address: Division of Hematology, Department of Medicine, Stanford University, Stanford, CA, USA

Chance Meers
Present address: Department of Biochemistry, Vanderbilt University, Nashville, TN, USA

Authors and Affiliations

Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, USA
Florian T. Hoffmann, Tanner Wiegand, Adriana I. Palmieri, Stephen Tang, Hoang C. Le, Chance Meers, George D. Lampe & Samuel H. Sternberg

Howard Hughes Medical Institute, Columbia University, New York, NY, USA
Tanner Wiegand, Chance Meers, George D. Lampe & Samuel H. Sternberg

Department of Cellular Physiology and Biophysics, Columbia University, New York, NY, USA
Juniper Glass-Klaiber

Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
Renjian Xiao & Leifu Chang

Purdue Institute for Cancer Research, Purdue University, West Lafayette, IN, USA
Leifu Chang

Contributions

F.T.H. and S.H.S. conceived the project. F.T.H. designed and performed most experiments. T.W. performed most bioinformatics analyses. F.T.H., A.I.P. and J.G.-K. performed cloning and RFP activation assays. R.X. cloned RNAP expression plasmids and contributed to data interpretation. F.T.H. and S.T. analysed RNA-seq and RIP-seq data. H.C.L. and C.M. performed initial phylogenetics and bioinformatics analyses. G.D.L. and F.T.H. performed flow cytometry assays. L.C. contributed to data interpretation. S.H.S. oversaw the project. F.T.H., T.W. and S.H.S. discussed the data and wrote the manuscript, with input from all authors.

Corresponding author

Correspondence to Samuel H. Sternberg.

Ethics declarations

Competing interests

S.H.S.
is a co-founder and scientific advisor to Dahlia Biosciences, a scientific advisor to CrisprBits and Prime Medicine, and an equity holder in Dahlia Biosciences and CrisprBits. S.H.S., F.T.H. and T.W. are inventors on patents related to CRISPR–Cas-like systems and uses thereof. The other authors declare no competing interests.

Peer review

Peer review information

Nature thanks Wen Wu, who co-reviewed with Prarthana Mohanraju, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1: Evolutionary analysis and genomic context of diverse dcas12f and rpoE genes.
a, Representative genomic neighbourhoods of predicted nuclease-dead Cas12f homologues that are not associated with rpoE (σE) genes. CRISPR arrays and putative gRNAs are annotated, as are nearby genes. Putative gRNAs were identified by detecting covariance in the intergenic regions upstream of CRISPR loci, and the predicted secondary structure of a representative example is shown in the inset. b, Map of a dcas12f locus and its putative gRNA target upstream of an RND efflux system (middle), in a metagenome-assembled genome of a Chitinophagaceae bacterium (top). The putative guide-target duplex and predicted gRNA structure are highlighted (bottom). c, Correlation between the percent identity of select dCas12f homologues (y-axis) and either σE (rpoE) or HTH homologues (x-axis) to the F. taeanensis homologues. The stronger correlation with σE (R² = 0.78) and weaker correlation with HTH (R² = 0.19) suggest a tighter genetic linkage of dcas12f with rpoE than with hth. d, Magnified and simplified view of a partial σE phylogenetic tree from Fig.
1f (left), showing homologues containing an additional C-terminal domain (CTD) that are associated with either dCas12f homologues (pink) or restriction enzyme (RE)-like homologues (lavender). Representative genomic neighbourhoods (right) highlight the tight operonic arrangement of rpoE and RE-like genes, supporting a potential model in which nuclease-dead RE proteins similarly recruit atypical σE proteins to sites of transcription.

Extended Data Fig. 2: Pairwise sequence identity matrices for dCas12f, σE, HTH, and gRNAs.
a, Heatmap of pairwise amino acid sequence identity percentages among dCas12f homologues tested in this study. The matrix is colour-coded by sequence identity (see legend inset), and percentages are listed. b, Pairwise amino acid sequence identity between tested σE homologues, shown as in a. The Bba locus lacks an rpoE gene. c, Pairwise amino acid sequence identity between tested HTH homologues, shown as in a. Note that Sda, Lby, and Pdi loci contain two hth genes, while this gene is absent in the Bba locus. d, Pairwise nucleotide sequence identity between tested gRNAs, shown as in a. e, Annotated genomic loci of all homologous dCas12f-σE systems tested in this study, with hth, rpoE, and dcas12f genes labelled and coloured in yellow, blue, and red, respectively. Predicted functions for other neighbouring genes in grey are indicated by COG (Clusters of Orthologous Groups) letters, as indicated in the legend below.

Extended Data Fig. 3: Additional analysis of dCas12f-associated gRNAs from RIP–seq experiments and comparative genomics analyses.
a, RIP–seq coverage plots for the five indicated dCas12f homologues, revealing a well-defined gRNA with scaffold (orange) and guide (purple) regions. Nucleotides outside the boundary of the presumed full-length gRNA are coloured in grey. Total gRNA lengths are noted to the right of each plot. The guide region of Zunongwangia profunda is plotted on a separate y-axis scale, for visual clarity.
Coverage is shown as counts per million (CPM). b, RIP–seq coverage plots for an additional dCas12f orthologue from Leeuwenhoekiella palythoae (Lpa), showing both a zoomed-out view (left), depicting the same data as shown in Fig. 2c, and a magnified view as in a (right). Region 1 annotated in the operon schematic (top left) encodes multiple tandem gRNAs with similar scaffold and guide sequences. Coverage is shown as in a. c, RIP–seq coverage plots for an additional dCas12f orthologue from Paenimyroides ummariense (Pum), shown as in b, with the magnified view at right comparing the aligned scaffold and guide sequences. The first two gRNA sequences are identical, and bases that differ in the third gRNA are highlighted in red. d, Annotated genomic neighbourhood of an rpoE-dcas12f operon that encodes both predicted full-length gRNAs (orange/purple) and multiple discontinuous CRISPR arrays (repeats in tan). The sequence similarity between the CRISPR repeats and gRNA scaffold (grey circles, bottom) suggests a potential evolutionary emergence of chimeric, dCas12f-associated single guide RNAs from CRISPR arrays.

Extended Data Fig. 4: Culturing, whole genome sequencing, and RNA–seq of Flagellimonas taeanensis strains that encode dCas12f-σE systems.
a, Summary table of strain information and culturing conditions for five F. taeanensis (Fta) strains and one Mucilaginibacter rigui strain, which were acquired because of the likely presence of rpoE-dcas12f loci. The number of loci identified after whole-genome sequencing (WGS) is highlighted. b, BioCircos plots of the six strains in a after WGS analysis, with the positions of rpoE-dcas12f loci highlighted in red. The internal strain ID, total genome size and GC content are indicated; contigs are denoted in cases where genome assembly was incomplete. c, Genomic neighbourhoods of rpoE-dcas12f loci in the indicated strains from a.
Genes encoding HTH (hth), σE (rpoE) and dCas12f (dcas12f) are shown in yellow, blue, and red, respectively; gRNAs and hth-associated ncRNAs are annotated in orange and magenta, respectively. d, Magnified RNA–seq coverage plots from Fta strain sSL4759 for two distinct loci, highlighting the abundance of reads corresponding to hth-associated ncRNAs at both rpoE-dcas12f locus 1 (left) and the presumed susC target site of dCas12f-associated gRNAs (right). The top panels show a 2-kbp window; the bottom panels zoom in on the ncRNA sequence. Coverage is shown as counts per million (CPM).

Extended Data Fig. 5: ChIP–seq experiments reveal TAM and gRNA guide sequences for additional dCas12f homologues.
a, Genome-wide representation of ChIP–seq data for the indicated dCas12f homologues (purple), compared to the input control scaled the same as Ebr (top). Coverage is shown as counts per million (CPM), normalized to the highest peak within each sample or to a value of 200, as shown. b, Binding events were analysed by MEME-ChIP, which revealed strongly conserved consensus motifs for eight dCas12f homologues that correspond to the putative target-adjacent motif (TAM) and gRNA-matching target DNA sequence within the seed, for each homologue. E, E-value significance; n, number of peaks contributing to the motif. The percentage of total peaks contributing to each motif is shown in parentheses. Motifs could not be confidently determined for the remaining five dCas12f homologues due to a paucity of enriched peaks in heterologous expression experiments.

Extended Data Fig. 6: Investigative strategy to uncover putative regulatory functions of dCas12f-σE systems.
a, Bioinformatics strategy to determine high-confidence covariance models (CM) for both the dCas12f-associated gRNA (top) and hth-associated ncRNA (bottom). b, Schematic of bioinformatics workflow to globally identify RNA-guided DNA targets of dCas12f-σE systems in sequenced bacterial genomes.
After identifying dCas12f-associated gRNAs and extracting guide sequences, putative targets were identified that exhibit perfect complementarity within a 6-nt seed sequence, reside within intergenic regions, and exist upstream of protein-coding genes. Target loci were also analysed for the presence of predicted hth-associated ncRNAs. c, Histogram quantifying distances between the TAM of predicted dCas12f target sites and the start codons of associated target genes. 156 bioinformatically predicted dCas12f targets were included in this analysis; distances for Fta dCas12f.1 and dCas12f.2 are highlighted for reference. d, Table summarizing the bioinformatics results from b, listing the number of predicted DNA targets, gRNA guide sequences, and genomes for each class. Putative RNA-guided DNA targets fall into four functional categories that include regulation of transmembrane transport, regulation of two-component system (TCS)-like systems, auto-regulation of rpoE-dcas12f loci, and regulation of rpoE-dcas12f loci in trans; other predicted targets await further categorization and analysis. Note that some guides have multiple targets within a genome, so a single guide can be represented in multiple functional categories and totals do not match totals in b. Each guide within a genome is also capable of targeting multiple loci. Thus, some genomes have gRNAs with targets spanning multiple functional classes and totals similarly do not match totals in b. e, Exemplary dCas12f-σE system from Chryseobacterium gleum (Cgl), in which the dCas12f-associated gRNA putatively targets and transcriptionally regulates seven distinct genetic loci (1–7). 
The genome schematic (top) visualizes the approximate genomic location of each locus, and the magnified insets (below) report the position of dCas12f-gRNA targets (purple triangles) relative to nearby genes; note that each of the targets flanks a nearby hth gene and overlaps precisely with the predicted position of an hth-associated ncRNA (magenta rectangle). The schematics at right depict patterns of gRNA-DNA complementarity at each target site, relative to the TAM. f, Exemplary dCas12f-σE system from Sphingobacterium sp. DR205, in which both a single chimeric gRNA guide and two spacers from a vestigial CRISPR array target genomic sites proximal to susCD operons. The genome schematic (top) visualizes the approximate location of the rpoE-dcas12f locus and putative target sites, and the magnified insets (below) visualize the rpoE-dcas12f locus and RNA-guided DNA target sites, alongside corresponding published RNA–seq data for each locus. Coverage is shown as CPM. Three guides/spacers are indicated and labelled (circles), as well as their complementary targets (purple triangles), the predicted guide-target complementarity (purple shading), and the putative TAM (yellow shading).

Extended Data Fig. 7 ChIP–seq experiments and analyses for dCas12f and HTH homologues.
a, Schematic of ChIP–seq assay to study genome-wide binding of dCas12f homologues programmed with gRNAs targeting the indicated site upstream of yidX (left), and genome-wide representation of ChIP–seq data alongside the input control (right). Coverage is shown as counts per million (CPM), normalized to the highest peak in the targeting samples. b, Bioinformatics strategy to globally discover putative binding sites of HTH proteins (left). 
Based on ChIP–seq data for Fta HTH that revealed a highly enriched binding site with inverted repeats (IRs) upstream of its own open reading frame (ORF), 369 additional potential HTH motifs were identified that similarly exhibit IRs but share little conservation (WebLogo, bottom right), consistent with the broad diversity of hth genes encoded in rpoE-dcas12f loci. The histogram of IR mismatch (top right) plots the number of mismatches between left and right copies of putative IR substrates of HTH, and their relative distance from the start of the hth ORF.

Extended Data Fig. 8 Optimization and additional analyses relating to RFP fluorescence reporter and ChIP–seq data.
a, Scatter plot of RNA–seq data from mRFP transcriptional reporter assays in Fig. 5a, comparing three replicates of targeting (T) or non-targeting (NT) gRNA experiments. TPM, transcripts per million. b, Panel of modified mRFP reporter plasmids (left), in which perturbations were made within the native susC target region including deletions or multi-terminator cassette insertions, as indicated. The resulting OD-normalized RFP fluorescence measurements for transcriptional reporter assays using these constructs are shown (right), with the starting (WT) construct tested in both targeting (T) and non-targeting (NT) conditions. c, OD-normalized RFP fluorescence measurements from control assays to determine signal detection limits. Cell culture samples from targeting (T) and non-targeting (NT) experiments (left) were mixed together to simulate variable ratios of RFP signal (middle), and the resulting regression analysis (right) revealed the expected linear relationship, with excellent sensitivity down to low single-digit percentages. d, OD-normalized RFP fluorescence for the indicated reporter DNA constructs, in which the TAM was mutated to each of the alternative nucleotides while maintaining an invariant target and guide sequence. 
e, Flow cytometry analysis for T and NT samples from b, alongside a negative control (N.C.) encoding no mRFP1. f, Consensus WebLogo of dCas12f gRNA-matching target sites from 156 aligned genomic loci, demonstrating conservation within the TAM, the target region, and a short T-rich stretch immediately adjacent to the target, but an absence of substantial conservation in other promoter motifs or the TSS itself. This observation suggests that RNA-guided transcription proceeds largely without fixed DNA sequence requirements. Coordinates are numbered relative to the TAM (top, black text) or to the TSS (bottom, red text). g, Panel of mutations in the region between the RNA-matching target site and empirically determined TSS, which were tested in the transcription reporter assay shown in Fig. 5a. Nucleotides highlighted in dark grey were mutated; coordinates are numbered relative to the TAM (top, black text) or to the TSS (bottom, red text). h, OD-normalized RFP fluorescence for the indicated reporter DNA constructs shown in g. T, targeting gRNA with WT (unmutated) intergenic region; NT, non-targeting gRNA control. i, Magnified view of ChIP–seq data for Flag-tagged σE in the absence (-FtaRNAP) and presence (+FtaRNAP) of the native FtaRNAP, alongside an input control (top). Coverage is shown as counts per million (CPM), normalized to the highest peak in the +FtaRNAP sample. j, Panel of guide sequence truncations in 2-bp increments, which were tested in the transcription reporter assay shown in Fig. 5a. The nucleotides highlighted in brown were truncated from the targeting gRNA. k, OD-normalized RFP fluorescence for the indicated gRNA constructs shown in j, using the transcriptional reporter assay shown in Fig. 5a. NT, non-targeting gRNA control. Data in b, h, and k are shown as mean ± s.d. for n = 3 biologically independent samples. Data in c, d, and e are shown as mean ± s.d. for n = 3, n = 3, and n = 5 technical replicates, respectively.

Extended Data Fig. 9 Additional analyses relating to RNA–seq data.
a, RNA enrichment measured by reverse transcription (RT)–qPCR of target-E RNA–seq sample (Fig. 5b) for two distinct primer pairs (left). The targeting condition (T) reveals around 140-fold enrichment of transcripts versus the non-targeting (NT) condition. Standard curves for simulated gene activation to determine the limit of detection for primer pair 1 (centre) and primer pair 2 (right) suggest a detection limit of around 5%. b, RNA–seq coverage plots for the target-E off-target site near tyrS, shown in Fig. 5f. The approximately 46 bp distance between TAM and TSS upstream of tyrS is indicated. The zoomed-out view (right) shows that pdxY is upregulated because it is encoded directly downstream of tyrS. The TAM and extensive 11-bp complementarity between the off-target DNA site and guide sequence are shown below the coverage track. Coverage is shown for the reverse strand as counts per million (CPM). c, TSS plots derived from RNA–seq data, as shown in Fig. 5c, for all the individual gRNAs in Fig. 5g. Coverage is shown as CPM. d, RNA–seq tracks for a NT control and three distinct gRNAs designed for target-3 showing weak or no transcription initiation, likely due to binding site occlusion by other DNA binding factors involved in yidX regulation. Coverage is shown for the forward strand as CPM. NT, non-targeting. e, RNA–seq coverage plots for a gRNA designed for Target 6 (as shown in Fig. 5g) and three additional gRNAs that incrementally shift the TAM by 1 bp, changing it to A, T, and C. Targets flanked by a non-G TAM fail to initiate transcription. Coverage is shown for the forward strand as CPM.

Extended Data Fig. 10 Mechanisms of bacterial gene activation and repression, including newly discovered, RNA-guided pathways driven by dCas12f and TldR.
RNA-based mechanisms (top row) can activate gene expression, such as the RprA small RNA (sRNA) activating rpoS by relieving secondary structure inhibition, or repress gene expression, such as the RyhB sRNA destabilizing sodB through base pairing and RNase recruitment. Protein-based mechanisms (second row from top) can activate gene expression, such as Crp (cAMP receptor protein) enhancing RNAP recruitment at the envZ promoter, or repress gene expression, such as HTH transcription factors binding DNA to block RNAP-promoter recognition. σ factor-based mechanisms (second row from bottom) can activate gene expression, such as extracytoplasmic function (ECF) σE factors recruiting RNAP to specific promoters to drive transcription, or repress gene expression, such as when their activities are inhibited by FecR anti-σ factors. Finally, we report novel RNA-guided pathways of gene regulation (bottom row). In previous work, we uncovered TnpB-like nuclease-dead repressors (TldR) that exploit gRNAs to bind complementary DNA target sites, thereby preventing promoter recognition by RNAP (bottom right). In this study, we uncover nuclease-dead Cas12f proteins that exploit gRNAs to bind complementary DNA target sites and directly recruit σE factors and RNAP, thereby driving promoter-independent transcription of diverse genetic operons such as susCD polysaccharide utilization loci (bottom left). Collectively, our work highlights a new axis of gene regulation control via exapted, RNA-guided transcription factors akin to CRISPRi and CRISPRa.

Supplementary information
Supplementary Figure 1: Representative gating schemes for flow cytometry analysis of RFP fluorescence reporter.
Reporting Summary
Supplementary Tables: Supplementary Tables 1–8.
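
The target search outlined in Extended Data Fig. 6b (a 5′-G TAM, perfect complementarity within a 6-nt seed, intergenic location, and position upstream of a protein-coding gene) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which is available in their repository. The function names, the gene-interval format, the assumption that the guide is written as DNA in the sense of the non-target strand with the seed at its TAM-proximal end, and the assumption that the regulated gene lies downstream of the TAM on the same strand are all hypothetical choices made for illustration only.

```python
from typing import List, NamedTuple, Optional, Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")
Gene = Tuple[int, int, str]  # hypothetical format: (start, end, strand), 0-based, half-open


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


class Hit(NamedTuple):
    tam_pos: int  # forward-strand coordinate of the TAM base
    strand: str   # "+" or "-"
    matches: int  # guide positions identical to the genomic target


def find_seed_hits(genome: str, guide: str, tam: str = "G",
                   seed_len: int = 6) -> List[Hit]:
    """Scan both strands for TAM + perfect seed matches.

    Assumption (not from the source): the seed is the first `seed_len`
    bases of the guide, directly 3' of a single-base TAM.
    """
    hits = []
    n = len(genome)
    query = tam + guide[:seed_len]
    for strand, seq in (("+", genome), ("-", revcomp(genome))):
        start = seq.find(query)
        while start != -1:
            target = seq[start + len(tam): start + len(tam) + len(guide)]
            matches = sum(a == b for a, b in zip(guide, target))
            # Convert reverse-strand indices back to forward coordinates.
            tam_pos = start if strand == "+" else n - 1 - start
            hits.append(Hit(tam_pos, strand, matches))
            start = seq.find(query, start + 1)
    return hits


def distance_to_start_codon(hit: Hit, genes: List[Gene]) -> Optional[int]:
    """Distance in bp from the TAM to the nearest downstream start codon.

    Returns None if the TAM lies inside an annotated gene (not intergenic)
    or if no gene follows on the hit's strand.
    """
    if any(s <= hit.tam_pos < e for s, e, _ in genes):
        return None
    if hit.strand == "+":
        dists = [s - hit.tam_pos for s, _, strand in genes
                 if strand == "+" and s > hit.tam_pos]
    else:
        # On the minus strand the start codon begins at coordinate end - 1.
        dists = [hit.tam_pos - (e - 1) for _, e, strand in genes
                 if strand == "-" and e - 1 < hit.tam_pos]
    return min(dists) if dists else None
```

Calling `find_seed_hits(genome, guide)` and passing each hit to `distance_to_start_codon` with an annotation list yields TAM-to-start-codon distances of the kind summarized in the histogram of Extended Data Fig. 6c. The `matches` field lets hits with seed-only complementarity be distinguished from full-length matches.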