RibonanzaNet

RNA foundation model trained on chemical-mapping data from millions of sequences, predicting reactivity, secondary structure, and degradation.

Released: February 2024

RibonanzaNet (also called RNet) is an RNA foundation model from the Das Lab at Stanford University. It is built on the Ribonanza dataset: chemical-mapping measurements on roughly two million diverse RNA sequences, totaling on the order of hundreds of millions of nucleotide-level measurements. The project relied on "dual crowdsourcing": citizen-scientist sequence design via the Eterna platform, and a public Kaggle competition (the Stanford Ribonanza RNA Folding challenge) that solicited and prospectively benchmarked deep learning models. The work was posted to bioRxiv in February 2024 and published in Cell in 2024.

Chemical mapping probes how accessible each nucleotide is in solution, providing an experimental readout that correlates with base pairing and structure. RibonanzaNet distills the winning ideas from the Kaggle challenge, together with the earlier RNAdegformer architecture, into a single self-contained network. That network predicts these reactivity profiles directly from sequence, without relying on pre-computed base-pairing matrices or external secondary-structure algorithms.
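To make the link between reactivity and base pairing concrete, here is a minimal sketch that parses a dot-bracket secondary structure and marks which nucleotides are unpaired. Unpaired positions are where a chemical probe would typically be expected to react more strongly. The toy sequence, the structure, and the function names `pair_table` and `unpaired_mask` are illustrative assumptions and are not part of RibonanzaNet or the Ribonanza dataset; real reactivity values are continuous and noisy rather than binary.

```python
def pair_table(dot_bracket: str) -> list[int]:
    """Return the partner index of each position, or -1 if it is unpaired."""
    partners = [-1] * len(dot_bracket)
    stack = []  # indices of opening brackets still waiting for a partner
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            j = stack.pop()
            partners[i], partners[j] = j, i
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return partners


def unpaired_mask(dot_bracket: str) -> list[bool]:
    """True where a nucleotide is unpaired (expected to be more reactive)."""
    return [p == -1 for p in pair_table(dot_bracket)]


# Hypothetical toy hairpin: a 3-base-pair stem closing a 3-nucleotide loop.
sequence = "GGGAAACCC"
structure = "(((...)))"
partners = pair_table(structure)
mask = unpaired_mask(structure)

for base, partner, is_unpaired in zip(sequence, partners, mask):
    print(base, partner, "unpaired" if is_unpaired else "paired")
```

A model such as RibonanzaNet goes the other way: it predicts the per-nucleotide signal from sequence alone, and structure can then be inferred from that signal.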

Because it is a pretrained model that captures general RNA structural signal, the fixed checkpoint can be applied to new problems without retraining, or fine-tuned for specialized tasks. A recent application paper (Townley, Kladwang, Baker, Das and colleagues) uses RibonanzaNet to guide de novo design of RNA pseudoknots, illustrating its role as a reusable structural oracle for RNA engineering.

Key Features

Trained on experimental chemical mapping at scale: Learns from ~2 million RNA sequences and hundreds of millions of reactivity measurements (2A3 and DMS profiles), rather than from limited 3D structure data.
Dual-crowdsourced data: Combines Eterna citizen-scientist sequence design with a public Kaggle challenge that prospectively benchmarked model submissions.
Self-contained sequence-to-structure: Predicts reactivity and structure directly from sequence, with no dependence on external base-pairing or folding algorithms.
Reusable, fine-tunable checkpoint: A fixed pretrained model that can be applied zero-shot to guide RNA design, or fine-tuned for secondary structure, hydrolytic degradation, and experimental dropout.
Open inference: The rnet-inference repository provides automatable, self-contained inference under an MIT license, with weights distributed via a Git submodule and on Kaggle.
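The 2A3 and DMS profiles listed in the features above are per-nucleotide values, and some positions have no usable measurement. One plausible way to score predictions against such data is a masked mean absolute error, sketched below. The array layout (sequence length by two probes), the clipping of values to [0, 1], and the function name `masked_mae` are assumptions made for illustration; they are not taken from the RibonanzaNet code.

```python
import numpy as np


def masked_mae(pred, target, clip=(0.0, 1.0)) -> float:
    """Mean absolute error over positions where the target is not NaN.

    pred, target: arrays of shape (length, n_probes), e.g. columns for 2A3 and DMS.
    Both arrays are clipped to `clip` before comparison (an assumption).
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    valid = ~np.isnan(target)  # True where a measurement exists
    if not valid.any():
        raise ValueError("target contains no measured positions")
    lo, hi = clip
    err = np.abs(np.clip(pred, lo, hi) - np.clip(target, lo, hi))
    return float(err[valid].mean())


# Hypothetical 3-nucleotide example; NaN marks a missing measurement.
target = np.array([[0.1, np.nan],
                   [0.9, 0.5],
                   [np.nan, 1.4]])
pred = np.array([[0.2, 0.3],
                 [0.7, 0.5],
                 [0.0, 0.8]])
score = masked_mae(pred, target)
print(round(score, 3))
```

Masking keeps unmeasured positions from contributing to the error, which matters when coverage differs between the two probes.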

Technical Details

RibonanzaNet is a deep network combining Transformer encoder layers with 1D convolutions, adapted from the RNAdegformer design. It maintains both a sequence-level representation and a pairwise representation, and exchanges information between them through outer-product-mean and triangular multiplicative updates, mechanisms reminiscent of structure-prediction networks. This lets structural signal flow in both directions rather than depending on pre-computed inputs.

The exact parameter count of the base model is not prominently reported. The publicly cited "~100M-parameter" figure refers to the later RibonanzaNet2 successor, so the base size should be treated as unspecified.

When fine-tuned, RibonanzaNet reaches state-of-the-art results on auxiliary benchmarks: secondary-structure prediction (mean F1 around 0.89 on a PDB-derived test set and ~0.94 on CASP15 targets), improved pseudoknot recovery, and RNA degradation modeling that surpasses prior OpenVaccine competition winners.

There is currently no HuggingFace model card; the canonical distribution is the GitHub rnet-inference repository, with its RibonanzaNet-Weights submodule, and Kaggle.
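The exchange between sequence-level and pairwise representations described above can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the RibonanzaNet implementation: it uses random weights, a single sequence (so the "mean" over sequences is trivial), and omits the layer normalization and gating that such blocks usually include. The dimensions and the function names `outer_product_update` and `triangle_multiply_outgoing` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def outer_product_update(seq_repr, w_left, w_right, w_out):
    """Build a pairwise representation (L, L, d_pair) from a sequence one (L, d_seq)."""
    left = seq_repr @ w_left    # (L, c)
    right = seq_repr @ w_right  # (L, c)
    # Outer product of the two projections for every pair of positions (i, j).
    outer = np.einsum("ic,jd->ijcd", left, right)  # (L, L, c, c)
    length = seq_repr.shape[0]
    return outer.reshape(length, length, -1) @ w_out  # (L, L, d_pair)


def triangle_multiply_outgoing(pair_repr, w_a, w_b, w_out):
    """Update edge (i, j) from edges (i, k) and (j, k), summed over all k."""
    a = pair_repr @ w_a  # (L, L, c)
    b = pair_repr @ w_b  # (L, L, c)
    tri = np.einsum("ikc,jkc->ijc", a, b)  # combine the two edges sharing node k
    return pair_repr + tri @ w_out  # residual update, shape (L, L, d_pair)


# Hypothetical sizes chosen only to keep the example small.
L, d_seq, d_pair, c = 6, 16, 8, 4
seq_repr = rng.normal(size=(L, d_seq))

w_left, w_right = rng.normal(size=(d_seq, c)), rng.normal(size=(d_seq, c))
w_pair = rng.normal(size=(c * c, d_pair))
pair0 = outer_product_update(seq_repr, w_left, w_right, w_pair)

w_a, w_b = rng.normal(size=(d_pair, c)), rng.normal(size=(d_pair, c))
w_tri = rng.normal(size=(c, d_pair))
pair1 = triangle_multiply_outgoing(pair0, w_a, w_b, w_tri)

print(pair0.shape, pair1.shape)
```

The triangular update lets evidence about pairs (i, k) and (j, k) inform the pair (i, j), which is one reason such blocks suit base-pairing patterns that must be mutually consistent.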

Applications

RibonanzaNet serves RNA biologists and designers who need fast, experimentally grounded structural predictions. It supports secondary-structure and chemical-reactivity prediction, modeling of RNA hydrolytic degradation (relevant to mRNA-vaccine stability), and prediction of experimental sequence dropout. As a pretrained oracle, it can guide de novo design tasks such as constructing RNA pseudoknots, and its fine-tunable backbone provides a starting point for new RNA property-prediction models without retraining from scratch.
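Secondary-structure predictions like those mentioned above are commonly scored with an F1 measure over base pairs. The sketch below computes it for two dot-bracket strings using exact pair matching. It is an assumption-laden illustration: it handles only one bracket type (so no pseudoknots), applies no positional tolerance, and is not the evaluation code behind any figure reported for RibonanzaNet. The function names `base_pairs` and `pair_f1` are hypothetical.

```python
def base_pairs(dot_bracket: str) -> set[tuple[int, int]]:
    """Return the set of (i, j) base pairs, i < j, from a dot-bracket string."""
    pairs = set()
    stack = []
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.add((stack.pop(), i))
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return pairs


def pair_f1(predicted: str, reference: str) -> float:
    """Harmonic mean of precision and recall over exactly matching base pairs."""
    pred, ref = base_pairs(predicted), base_pairs(reference)
    if not pred and not ref:
        return 1.0  # both structures are fully unpaired, so they agree
    true_pos = len(pred & ref)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred)
    recall = true_pos / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: the prediction misses the innermost reference pair.
reference = "((((....))))"
predicted = "(((......)))"
f1 = pair_f1(predicted, reference)
print(round(f1, 3))
```

Here precision is 3/3 and recall is 3/4, so F1 is 6/7. A mean F1 over a benchmark would average this quantity across test structures.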

Impact

RibonanzaNet demonstrated that large-scale crowdsourced chemical-mapping data, paired with a competitive benchmarking process, can yield a general-purpose RNA foundation model that outperforms specialized prior methods across several tasks. By releasing both the Ribonanza dataset and self-contained inference code, the Das Lab established a reusable resource for the RNA structure community, and the model has since been extended (RibonanzaNet2, ~100M parameters) and applied to downstream RNA design problems. Its main limitations are that it is grounded in chemical reactivity data rather than direct 3D structure, and that documentation is spread across GitHub and Kaggle rather than collected in a single model card.

Citations

Ribonanza: deep learning of RNA structure through dual crowdsourcing

Preprint

He, S., et al. (2024) Ribonanza: deep learning of RNA structure through dual crowdsourcing. bioRxiv.

DOI: 10.1101/2024.02.24.581671

De novo and salvage purine synthesis pathways across tissues and tumors

Tran, D. H., et al. (2024) De novo and salvage purine synthesis pathways across tissues and tumors. Cell.

DOI: 10.1016/j.cell.2024.05.011

Recent citations

Papers that recently cited this model.

Crowdsourced riboregulators reveal design principles for programmable RNA switching
James M. Robson, Gabrielle Moussas, Dayna Francis, et al.
bioRxiv · Jul 2026
Citations: 0

RNAbpFlow: base pair-augmented SE(3) flow matching for conditional RNA 3D structure generation
Sumit Tarafder, Debswapna Bhattacharya
Nature Methods · Jun 2026
Citations: 0 · Influential

MultiMolecule: a modular ecosystem for biomolecular sequence-model workflows
Zhiyuan Chen
Jun 2026
Citations: 0

Top citations

The most-cited papers that cite this model.

mRNA vaccine sequence and structure design and optimization: Advances and challenges
Lei Jin, Yuanzhe Zhou, Sicheng Zhang, et al.
Journal of Biological Chemistry · Nov 2024
Citations: 50

gRNAde: Geometric Deep Learning for 3D RNA inverse design
Chaitanya K. Joshi, A. Jamasb, Ramón Viñas, et al.
bioRxiv · May 2023
Citations: 43

Assessment of nucleic acid structure prediction in CASP16
R. Kretsch, Alissa M. Hummer, Shujun He, et al.
bioRxiv · May 2025
Citations: 39

Artificial intelligence for medicine 2025: Navigating the endless frontier
Ji Dai, Huiyu Xu, Tao Chen, et al.
The Innovation Medicine · 2025
Citations: 25

From computational models of the splicing code to regulatory mechanisms and therapeutic implications
Charlotte Capitanchik, O. Wilkins, Nils Wagner, et al.
Nature Reviews Genetics · Oct 2024
Citations: 23

Citations

Total citations: 35
Influential: 1
References: 54

GitHub

Stars: 1
Forks: 0
Open issues: 0
Contributors: 1
Last push: 1y ago
Language: Python
License: MIT

Fields of citing research

Share of papers citing this model, by field:

Biology: 92%
Computer Science: 84%
Medicine: 70%
Chemistry: 14%
Materials Science: 3%
Physics: 3%
Engineering: 3%

Openness

bio.rodeo openness: Fully open · usable and reproducible

Score: 74 (Open)

Usability (can I run it?): 91

Reproducibility (can I retrain it?): 52

Model Openness Framework: Unclassified (missing required components)
