What does it take to learn the rules of RNA base pairing? A lot less than you may think

Article | Open access | Published: 26 March 2026

Jayanth S. Pratap¹ (ORCID: orcid.org/0009-0001-4920-2709), Ryan K. Krueger² (ORCID: orcid.org/0000-0001-6856-0248) & Elena Rivas¹ (ORCID: orcid.org/0000-0002-2084-269X)

Communications Biology, Article number: (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects: Computational models, Machine learning, Programming language, RNA, Structural biology

Abstract

Amidst the fast-developing trend of RNA large language models with millions of parameters, we asked what would be minimally required to rediscover the rules of RNA canonical base pairing that define secondary structure, namely the Watson-Crick-Franklin A:U, G:C and the wobble G:U base pairs. Here, we conclude that it does not require much at all. It does not require knowing secondary structures, it does not require aligning the sequences, and it does not require many parameters. We selected a probabilistic model (a stochastic context-free grammar or SCFG) with a total of just 21 parameters, that can describe arbitrary pairwise interactions including but not restricted to those of RNA base pairing. Using standard deep learning techniques, we estimate its parameters by implementing the generative process in an automatic differentiation (autodiff) framework and applying stochastic gradient descent (SGD). We define and minimize a loss function that does not use any structural or alignment information. Trained on as few as fifty RNA sequences, the specific rules of RNA base pairing emerge after only a few iterations of SGD. Crucially, the sole inputs are RNA sequences. When optimizing for sequences corresponding to structured RNAs, SGD also yields the rules of RNA base-pair aggregation into helices. In sharp contrast, when trained on shuffled sequences, the system optimizes by avoiding base pairing altogether. Trained on messenger RNAs, it reveals interactions that are different from those of structural RNAs, and specific to each mRNA. We demonstrate that our approach generalizes across diverse RNA families by testing on 1094 sequences from 22 structurally distinct RNA families. Our results show that the emergence of canonical RNA base-pairing can be attributed to sequence-level signals that are robust and detectable even without labeled structures or alignments, and with very few parameters. Autodiff algorithms for probabilistic models, such as, but not restricted to SCFGs, have significant potential as they allow these models to be incorporated into end-to-end RNA deep learning methods for discerning transcripts of different functionalities.
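The training idea summarized in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a G5-style rule set (S → aS | aSbS | ε); this page names the G5 and G6 grammars but does not list their rules. It further assumes that the loss is the mean negative log-likelihood of the sequences, computed with the inside algorithm. Two substitutions are made so the sketch runs with the standard library alone: central finite differences stand in for autodiff, and full-batch gradient descent stands in for SGD. The 23 softmax logits are an over-parameterized convenience and are not meant to match the 21-parameter count quoted above. The names `inside_prob`, `loss`, `grad_fd` and `descend` are hypothetical, and the toy sequences are invented.

```python
import math

NUC = "ACGU"

def softmax(logits):
    """Map unconstrained logits to a probability vector."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    z = sum(exps)
    return [v / z for v in exps]

def unpack(theta):
    """Split 23 logits into 3 rule, 4 single and 16 pair probabilities."""
    t = softmax(theta[0:3])      # S -> a S | a S b S | end (assumed rules)
    e1 = softmax(theta[3:7])     # emission of an unpaired residue a
    e2 = softmax(theta[7:23])    # emission of a pair (a, b), index 4*a + b
    return t, e1, e2

def inside_prob(seq, theta):
    """P(seq | theta), summed over all nested structures (inside algorithm)."""
    (t_left, t_pair, t_end), e1, e2 = unpack(theta)
    x = [NUC.index(c) for c in seq]
    n = len(x)
    # inside[i][j] = probability that S derives seq[i:j]
    inside = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        inside[i][i] = t_end     # empty span: S -> end
    for span in range(1, n + 1):
        for i in range(n - span + 1):
            j = i + span
            # S -> a S: residue i is unpaired
            total = t_left * e1[x[i]] * inside[i + 1][j]
            # S -> a S b S: residue i pairs with residue k
            for k in range(i + 1, j):
                total += (t_pair * e2[4 * x[i] + x[k]]
                          * inside[i + 1][k] * inside[k + 1][j])
            inside[i][j] = total
    return inside[0][n]

def loss(seqs, theta):
    """Mean negative log-likelihood; uses sequences only, no structures."""
    return -sum(math.log(inside_prob(s, theta)) for s in seqs) / len(seqs)

def grad_fd(seqs, theta, h=1e-5):
    """Central finite differences, a stand-in for autodiff in this sketch."""
    g = []
    for p in range(len(theta)):
        up = theta[:p] + [theta[p] + h] + theta[p + 1:]
        dn = theta[:p] + [theta[p] - h] + theta[p + 1:]
        g.append((loss(seqs, up) - loss(seqs, dn)) / (2 * h))
    return g

def descend(seqs, theta, lr=0.1, steps=5):
    """Full-batch gradient descent (the abstract describes SGD)."""
    for _ in range(steps):
        g = grad_fd(seqs, theta)
        theta = [v - lr * gv for v, gv in zip(theta, g)]
    return theta

seqs = ["GGGAAACCC", "GCGCUUCGGCGC", "ACGUACGU"]  # invented toy inputs
theta0 = [0.0] * 23                               # uniform starting point
theta1 = descend(seqs, theta0)
print(loss(seqs, theta0), loss(seqs, theta1))
```

In this sketch, which pairs are favored after training can be read from `unpack(theta)[2]`, where entry `4*a + b` is the probability of emitting the pair (a, b) in `ACGU` order. Three toy sequences and five steps are far too few to expect the canonical pairs to emerge; the abstract reports using as few as fifty sequences.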
Acknowledgements

This work was supported by NIH grant R01-GM144423 to E.R. This material is based in part upon work supported by the National Science Foundation under Grant no. UWSC13223 (R.K.K.). We thank Marcell Szikszai for help running the software MXFOLD2 and RiNALMo, and Max Ward for insights into automatic differentiation of RNA folding models. We thank William Gao for providing the fungal mRNA sequences. We thank Sean R. Eddy and William Gao for a critical reading of the manuscript. E.R. acknowledges the hospitality of the Centro de Ciencias de Benasque Pedro Pascual, Benasque, Spain, during the completion of this manuscript. We also thank the reviewers for their insightful comments.

Author information

Authors and Affiliations

Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA: Jayanth S. Pratap & Elena Rivas

School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA: Ryan K. Krueger

Contributions

E.R. conceived the research. J.S.P. and R.K.K. implemented the algorithms for the G5 grammar. E.R. implemented the algorithms for the G6 grammar. E.R. performed the experiments and wrote the manuscript. All authors edited the manuscript.

Corresponding author

Correspondence to Elena Rivas.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Communications Biology thanks Kengo Sato and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editors: Michal Kolář and Mengtan Xing. A peer review file is available.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Availability: rivaslab.org, https://github.com/EddyRivasLab/R-scape/tree/master/python/d-SCFG

Supplementary information

Supplemental material; Reporting summary; Transparent Peer Review file

Rights and permissions

Open Access. This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.