bio.rodeo

RNAJog

Shanghai Jiao Tong University

An autoregressive generative model trained with reinforcement learning to jointly optimize mRNA codon sequences for MFE, CAI, and GC content, running two orders of magnitude faster than LinearDesign on long sequences.

Released: June 2026

Choosing the coding sequence of an mRNA is a high-stakes optimization problem in therapeutic design. Because the genetic code is degenerate, any protein can be encoded by an astronomical number of synonymous codon sequences, and these alternatives differ sharply in the properties that govern how stable the resulting mRNA is and how well it is translated, most prominently its minimum free energy (MFE), codon adaptation index (CAI), and GC content. These objectives often conflict, and classical tools such as LinearDesign, which search the synonymous-sequence space exactly, can become computationally expensive for long sequences such as full-length antibody or vaccine antigen messages.
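To make the size of the synonymous-sequence space concrete, the Python sketch below multiplies the per-residue codon degeneracies of the standard genetic code. The function name `synonymous_count` and the example peptides are illustrative assumptions and are not part of RNAJog.

```python
from math import log10, prod

# Number of sense codons per amino acid in the standard genetic code
# (the 61 sense codons are split across the 20 amino acids).
CODON_DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def synonymous_count(protein: str) -> int:
    """Count the distinct codon sequences that encode `protein`."""
    return prod(CODON_DEGENERACY[aa] for aa in protein.upper())


if __name__ == "__main__":
    print(synonymous_count("MLS"))  # 1 * 6 * 6 = 36
    # The count grows exponentially with length, so report it on a log scale.
    poly_leucine = "L" * 100
    print(f"10^{log10(synonymous_count(poly_leucine)):.1f} encodings")
```

Even a three-residue peptide already has dozens of encodings, and the count multiplies with every added residue, which is why exhaustive enumeration is not an option for full-length antigens.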

RNAJog (RNA Joint Optimization with autoregressive Generative model), introduced in a 2025 bioRxiv preprint (v2 posted June 2026) by Jiaqi Huang, Xiaoyong Pan, and colleagues at Shanghai Jiao Tong University, reframes codon optimization as a learned generative task. Rather than performing a fresh combinatorial search for each input, RNAJog trains an autoregressive model with reinforcement learning to emit optimized codon sequences directly, using a reward that jointly balances MFE, CAI, and GC content. This lets the model amortize the optimization into a fast forward pass while still navigating the trade-offs among competing design criteria.

The framework sits alongside a growing family of learning-based codon-design tools (contemporaries in the bio.rodeo catalog include RNARL, mRNA-GPT, and EVA-RNA) but distinguishes itself through its speed on long sequences, a training-data-free "zero" variant, and direct wet-lab validation of a designed mRNA vaccine in mice.

Key Features

  • Joint multi-objective optimization: A single reinforcement-learning reward simultaneously optimizes minimum free energy, codon adaptation index, and GC content, rather than tuning one metric at a time.
  • Two orders of magnitude faster than LinearDesign: For long coding sequences, RNAJog generates optimized candidates roughly 100x faster than the dynamic-programming-based LinearDesign, making genome-scale and long-antigen design practical.
  • Annotation-free zero-shot variant: A companion model, RNAJog_zero, operates without annotated training data, broadening applicability to settings where curated codon-usage references are unavailable.
  • Modification-aware constraints: Optional flags eliminate m6A motifs and repress GC content, letting designers enforce biological constraints that affect translation and stability alongside the core objectives.
  • Pareto-front sampling: Configurable sampling produces multiple candidate sequences and Pareto-optimal sets, giving experimentalists a spread of trade-off solutions rather than a single output.
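The Pareto-optimal sets mentioned in the last feature can be illustrated with a generic non-dominated filter over two objectives. This is a minimal sketch, not RNAJog's own sampling code: the `Candidate` type, the choice of MFE and CAI as the two axes, and the example values are all assumptions made for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    mfe: float  # kcal/mol; lower (more negative) is treated as better here
    cai: float  # 0..1; higher is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is at least as good as `b` on both objectives and strictly better on one."""
    no_worse = a.mfe <= b.mfe and a.cai >= b.cai
    strictly_better = a.mfe < b.mfe or a.cai > b.cai
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates (O(n^2) scan)."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


if __name__ == "__main__":
    # Made-up values, purely for illustration.
    pool = [
        Candidate("a", mfe=-300.0, cai=0.80),
        Candidate("b", mfe=-280.0, cai=0.90),
        Candidate("c", mfe=-270.0, cai=0.85),  # dominated by "b"
        Candidate("d", mfe=-310.0, cai=0.70),
    ]
    print([c.name for c in pareto_front(pool)])  # ['a', 'b', 'd']
```

Candidates on the front each represent a different trade-off, so an experimentalist can pick several for synthesis rather than committing to one scalarized optimum.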

Technical Details

RNAJog couples an autoregressive sequence generator with reinforcement learning: the model decodes a codon sequence position by position and is trained against a composite reward combining MFE (predicted secondary-structure stability), CAI (host codon adaptation), and GC content. It accepts either an RNA or a protein sequence as input and outputs candidate sequences annotated with length, MFE, CAI, GC content, and m6A-motif rate. The authors report that RNAJog runs about two orders of magnitude faster than LinearDesign on long sequences while achieving competitive sequence quality. Beyond in silico benchmarks, an RNAJog-designed influenza virus hemagglutinin (HA) mRNA vaccine produced roughly a 10-fold increase in antibody titer over the wild-type sequence in mice, and the model's m6A-minimization capability was validated in cell experiments. Pretrained weights for both RNAJog and RNAJog_zero are distributed via the Shanghai Jiao Tong University CSBio server; the preprint does not state an explicit parameter count.
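One way to picture a composite reward over MFE, CAI, and GC content is a weighted sum of the three terms. The sketch below is an assumption-laden illustration rather than the reward used in the preprint: the weights, the GC target, the length normalization of MFE, and the toy codon-weight table are hypothetical, and the MFE value is passed in as a number because computing it requires an external folding tool. CAI is computed in the usual way, as the geometric mean of per-codon relative adaptiveness values.

```python
import math
from typing import Dict


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def cai(seq: str, weights: Dict[str, float]) -> float:
    """Codon adaptation index: geometric mean of relative adaptiveness values."""
    seq = seq.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a positive multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    mean_log = sum(math.log(weights[c]) for c in codons) / len(codons)
    return math.exp(mean_log)


def composite_reward(
    seq: str,
    mfe: float,                 # kcal/mol, supplied by a folding tool
    weights: Dict[str, float],  # host-specific relative adaptiveness table
    w_mfe: float = 1.0,         # hypothetical trade-off weights
    w_cai: float = 1.0,
    w_gc: float = 1.0,
    gc_target: float = 0.5,     # hypothetical GC target
) -> float:
    """Scalar reward: reward stability and CAI, penalize GC deviation."""
    stability = -mfe / len(seq)  # more negative MFE gives a larger term
    gc_penalty = abs(gc_content(seq) - gc_target)
    return w_mfe * stability + w_cai * cai(seq, weights) - w_gc * gc_penalty


if __name__ == "__main__":
    toy_weights = {"AUG": 1.0, "GCC": 1.0, "GCU": 0.25}  # illustrative only
    print(composite_reward("AUGGCC", mfe=-3.0, weights=toy_weights))
```

In a reinforcement-learning setup, a scalar of this kind would be evaluated on each fully decoded sequence and used as the training signal for the generator; how RNAJog actually weights or normalizes the terms is not stated on this page.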

Applications

RNAJog targets mRNA therapeutic and vaccine development, where the coding sequence must be tuned for efficient, stable expression in a host. Its speed advantage on long sequences makes it well suited to full-length antibody messages and large vaccine antigens that strain exact combinatorial optimizers, while the m6A-elimination and GC-repression controls let teams enforce manufacturability and stability constraints. The freely hosted web application at the SJTU CSBio server lets experimental biologists generate optimized candidates without local installation, and the released code supports batch optimization and Pareto-front exploration for higher-throughput screening pipelines.
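As an illustration of what an m6A-motif check could look like, the sketch below scans a sequence for the DRACH consensus commonly associated with m6A sites (D = A/G/U, R = A/G, H = A/C/U in IUPAC notation). Whether RNAJog uses exactly this motif definition is an assumption, and the `m6a_motif_rate` normalization (matches per 5-nt window) is a hypothetical choice that may differ from the rate the tool reports.

```python
import re
from typing import List

# DRACH consensus written as an RNA regex; the lookahead allows overlapping matches.
DRACH = re.compile(r"(?=([AGU][AG]AC[ACU]))")


def m6a_motif_sites(rna: str) -> List[int]:
    """Return 0-based start positions of every DRACH match."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return [m.start() for m in DRACH.finditer(rna)]


def m6a_motif_rate(rna: str) -> float:
    """Hypothetical rate: DRACH matches divided by the number of 5-nt windows."""
    windows = len(rna) - 4
    if windows <= 0:
        return 0.0
    return len(m6a_motif_sites(rna)) / windows


if __name__ == "__main__":
    print(m6a_motif_sites("GGACUGGACU"))  # [0, 5]
    print(m6a_motif_sites("AAAAAAAAAA"))  # []
```

A designer could use a count like this as a filter on generated candidates, or as a penalty term during optimization, choosing synonymous codons that break the motif without changing the encoded protein.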

Impact

By recasting codon optimization as a fast generative process and validating a designed sequence in a mouse vaccination experiment, RNAJog contributes to the shift from exact-search codon tools toward learning-based designers that amortize optimization across inputs. Its reported speedup of roughly 100x over LinearDesign and roughly 10-fold antibody-titer gain are directly relevant to mRNA programs constrained by long sequences and tight design cycles, and the annotation-free RNAJog_zero variant lowers the barrier for organisms or constructs lacking curated references. As a recent preprint, its benchmark standing and generalization remain to be confirmed through peer review and independent evaluation. The code is released under a custom non-commercial license (free for academic and research use) rather than an OSI-approved one, and no formal data card accompanies the release.

Citation

RNAJog: Fast Multi-objective RNA Optimization with Autoregressive Reinforcement Learning

Preprint

Huang, J., et al. (2026) RNAJog: Fast Multi-objective RNA Optimization with Autoregressive Reinforcement Learning. bioRxiv.

DOI: 10.1101/2025.08.26.672486

Citations

Total Citations: 0
Influential: 0
References: 24

GitHub

Stars: 2
Forks: 0
Open Issues: 0
Contributors: 1
Last Push: 3mo ago
Language: Python

Openness

bio.rodeo openness: Closed · low usability and reproducibility
9 (Closed)
Usability (can I run it?): 14
Reproducibility (can I retrain it?): 4
Model Openness Framework: Unclassified
Restrictive license on core components

Tags

autoregressive, codon, codon_optimization, generative, mrna, mrna_design, multi_objective, reinforcement_learning, rna_therapeutics, sequence_generation, transformer
