Choosing the coding sequence of an mRNA is a high-stakes optimization problem in therapeutic design. Because the genetic code is degenerate, any protein can be encoded by an astronomical number of synonymous codon sequences, and these alternatives differ sharply in properties that govern how well the resulting mRNA is translated and how stable it is — most prominently its minimum free energy (MFE), codon adaptation index (CAI), and GC content. These objectives often conflict, and classical tools such as LinearDesign that search the synonymous-sequence space optimally can become computationally expensive for long sequences like full-length antibody or vaccine antigen messages.
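
To make the size of the synonymous-sequence space concrete, the following minimal sketch counts the codon sequences that encode a given protein under the standard genetic code and computes GC content for one candidate. The function names and the example peptide are illustrative assumptions and are not taken from RNAJog.

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code
# (the 61 sense codons; stop codons are ignored in this sketch).
CODON_DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_synonymous_sequences(protein: str) -> int:
    """Number of distinct codon sequences encoding `protein` (hypothetical helper)."""
    return prod(CODON_DEGENERACY[aa] for aa in protein.upper())


def gc_content(rna: str) -> float:
    """Fraction of G and C nucleotides in an RNA or DNA string."""
    rna = rna.upper()
    return sum(base in "GC" for base in rna) / len(rna)


if __name__ == "__main__":
    peptide = "MSKGEELFTG"  # arbitrary 10-residue example, not from the preprint
    print(count_synonymous_sequences(peptide))  # 36864 encodings for 10 residues
    print(round(gc_content("AUGGCC"), 3))
```

Because the count is a product over residues, it grows exponentially with protein length, which is why exhaustive enumeration is not an option for full-length antigens.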

RNAJog (RNA Joint Optimization with autoregressive Generative model), introduced in a 2025 bioRxiv preprint (v2 posted June 2026) by Jiaqi Huang, Xiaoyong Pan, and colleagues at Shanghai Jiao Tong University, reframes codon optimization as a learned generative task. Rather than performing a fresh combinatorial search for each input, RNAJog trains an autoregressive model with reinforcement learning to emit optimized codon sequences directly, using a reward that jointly balances MFE, CAI, and GC content. This lets the model amortize the optimization into a fast forward pass while still navigating the trade-offs among competing design criteria.

The framework sits alongside a growing family of learning-based codon-design tools (contemporaries in the bio.rodeo catalog include RNARL, mRNA-GPT, and EVA-RNA) but distinguishes itself through its speed on long sequences, a "zero" variant that needs no annotated training data, and direct wet-lab validation in a mouse mRNA vaccine.

Key Features

Joint multi-objective optimization: A single reinforcement-learning reward simultaneously optimizes minimum free energy, codon adaptation index, and GC content, rather than tuning one metric at a time.
Two orders of magnitude faster than LinearDesign: For long coding sequences, RNAJog generates optimized candidates roughly 100x faster than the dynamic-programming-based LinearDesign, making genome-scale and long-antigen design practical.
Annotation-free zero-shot variant: A companion model, RNAJog_zero, operates without annotated training data, broadening applicability to settings where curated codon-usage references are unavailable.
Modification-aware constraints: Optional flags eliminate m6A motifs and repress GC content, letting designers enforce biological constraints that affect translation and stability alongside the core objectives.
Pareto-front sampling: Configurable sampling produces multiple candidate sequences and Pareto-optimal sets, giving experimentalists a spread of trade-off solutions rather than a single output.
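
As an illustration of what a modification-aware constraint has to detect, the sketch below scans a sequence for the DRACH consensus commonly associated with m6A sites (D = A/G/U, R = A/G, H = A/C/U). Whether RNAJog defines m6A motifs exactly this way is an assumption here, and the function name is hypothetical.

```python
import re

# DRACH consensus: D = A/G/U, R = A/G, then A, C, and H = A/C/U.
# The lookahead makes the scan report overlapping motif starts as well.
DRACH = re.compile(r"(?=[AGU][AG]AC[ACU])")


def m6a_motif_positions(seq: str) -> list[int]:
    """Return 0-based start positions of DRACH motifs (hypothetical helper)."""
    rna = seq.upper().replace("T", "U")  # accept DNA-style input too
    return [match.start() for match in DRACH.finditer(rna)]


if __name__ == "__main__":
    with_motif = "AUGGGACUAA"     # contains GGACU starting at position 3
    without_motif = "AUGGGCCUAA"  # one-base change that removes the motif
    print(m6a_motif_positions(with_motif))
    print(m6a_motif_positions(without_motif))
```

A designer would swap synonymous codons until such a scan comes back empty, while checking that the encoded protein is unchanged.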

Technical Details

RNAJog couples an autoregressive sequence generator with reinforcement learning: the model decodes a codon sequence position by position and is trained against a composite reward combining MFE (predicted secondary-structure stability), CAI (host codon adaptation), and GC content. It accepts either an RNA or a protein sequence as input and outputs candidate sequences annotated with length, MFE, CAI, GC content, and m6A-motif rate. The authors report that RNAJog runs about two orders of magnitude faster than LinearDesign on long sequences while achieving competitive sequence quality. Beyond in silico benchmarks, an RNAJog-designed influenza virus hemagglutinin (HA) mRNA vaccine produced roughly a 10-fold increase in antibody titer over the wild-type sequence in mice, and the model's m6A-minimization capability was validated in cell experiments. Pretrained weights for both RNAJog and RNAJog_zero are distributed via the Shanghai Jiao Tong University CSBio server; the preprint does not state an explicit parameter count.
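
The composite reward described above can be pictured with a small sketch. The preprint's actual reward formula, weights, and normalization are not given here, so the weighted sum, the per-nucleotide MFE scaling, the GC target, and the toy codon-weight table below are all assumptions. CAI is computed in its usual form as the geometric mean of per-codon relative adaptiveness values, and MFE is taken as an input that would normally come from a folding tool.

```python
from math import exp, log


def codon_adaptation_index(cds: str, weights: dict[str, float]) -> float:
    """Geometric mean of per-codon relative adaptiveness (simplified CAI)."""
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    return exp(sum(log(weights[codon]) for codon in codons) / len(codons))


def gc_content(seq: str) -> float:
    """Fraction of G and C nucleotides."""
    return sum(base in "GC" for base in seq.upper()) / len(seq)


def composite_reward(mfe: float, cai: float, gc: float, length: int,
                     w_mfe: float = 1.0, w_cai: float = 1.0,
                     w_gc: float = 1.0, gc_target: float = 0.55) -> float:
    """Hypothetical scalar reward: all weights and the GC target are assumptions.

    A lower (more negative) MFE means a more stable fold, so -MFE per
    nucleotide is rewarded; GC content is penalized by its distance from
    an assumed target value.
    """
    stability = -mfe / length
    return w_mfe * stability + w_cai * cai - w_gc * abs(gc - gc_target)


if __name__ == "__main__":
    toy_weights = {"AUG": 1.0, "GCC": 1.0, "GCU": 0.5}  # made-up values
    cds = "AUGGCCGCU"
    cai = codon_adaptation_index(cds, toy_weights)
    reward = composite_reward(mfe=-3.0, cai=cai, gc=gc_content(cds),
                              length=len(cds))
    print(round(cai, 4), round(reward, 4))
```

In a reinforcement-learning setup of this kind, a scalar such as this would be computed once per generated sequence and used to update the generator's policy.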

Applications

RNAJog targets mRNA therapeutic and vaccine development, where the coding sequence must be tuned for efficient, stable expression in a host. Its speed advantage on long sequences makes it well suited to full-length antibody messages and large vaccine antigens that strain exact combinatorial optimizers, while the m6A-elimination and GC-repression controls let teams enforce manufacturability and stability constraints. The freely hosted web application at the SJTU CSBio server lets experimental biologists generate optimized candidates without local installation, and the released code supports batch optimization and Pareto-front exploration for higher-throughput screening pipelines.
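
For screening pipelines, Pareto-front exploration amounts to discarding every candidate that another candidate beats on all objectives at once. The sketch below filters candidates on two objectives (lower MFE, higher CAI). The `Candidate` type, the field names, and the numbers are invented for illustration and do not reflect RNAJog's output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    mfe: float  # kcal/mol, lower is more stable
    cai: float  # 0..1, higher is better adapted


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is at least as good as `b` on both objectives and better on one."""
    no_worse = a.mfe <= b.mfe and a.cai >= b.cai
    strictly_better = a.mfe < b.mfe or a.cai > b.cai
    return no_worse and strictly_better


def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


if __name__ == "__main__":
    pool = [
        Candidate("a", mfe=-300.0, cai=0.80),
        Candidate("b", mfe=-280.0, cai=0.90),
        Candidate("c", mfe=-270.0, cai=0.85),  # dominated by "b"
        Candidate("d", mfe=-310.0, cai=0.70),
    ]
    print([c.name for c in pareto_front(pool)])  # ['a', 'b', 'd']
```

The quadratic comparison is adequate for a few hundred sampled candidates; adding GC content or m6A-motif count as a further objective only requires extending `dominates`.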

Impact

By recasting codon optimization as a fast generative process and validating the designs with an mRNA vaccine in mice, RNAJog contributes to the shift from exact-search codon tools toward learning-based designers that amortize optimization across inputs. Its reported speedup of roughly 100x over LinearDesign on long sequences and its roughly 10-fold antibody-titer gain over the wild-type sequence matter most to mRNA programs constrained by long sequences and tight design cycles, and the annotation-free RNAJog_zero variant lowers the barrier for organisms or constructs lacking curated references. As a recent preprint, its benchmark standing and generalization remain to be confirmed through peer review and independent evaluation. The code is released under a custom non-commercial license (free for academic and research use) rather than an OSI-approved one, and no formal data card accompanies the release.

Citation

RNAJog: Fast Multi-objective RNA Optimization with Autoregressive Reinforcement Learning
