bio.rodeo

The authoritative source for evaluating biological foundation models. No hype, just honest analysis.

© 2026 Pulsatance. All rights reserved.

mRNA-GPT

Chinese Academy of Sciences

Autoregressive generative model pretrained on 30 million full-length natural mRNA sequences that jointly optimizes 5' UTR, CDS, and 3' UTR for therapeutic mRNA stability and translation efficiency.

Released: April 2026

mRNA-GPT is an autoregressive generative model for designing therapeutic messenger RNA, posted to bioRxiv in early April 2026. Unlike earlier mRNA-design tools that optimize 5' UTR, coding sequence (CDS), and 3' UTR independently, mRNA-GPT is pretrained on 30 million full-length natural mRNA sequences and learns the joint distribution across all three regions. After pretraining, the model is fine-tuned with reinforcement learning to optimize designed sequences for stability and translation-efficiency reward signals.

Joint modeling addresses a key limitation of existing mRNA optimization workflows: the best CDS choices depend on UTR context and vice versa, so tools that optimize each region in isolation can miss strong interactions between them.
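
One way to write down the joint modeling described above, as a sketch rather than the preprint's exact formulation: treat the full-length mRNA as a single token sequence built by concatenating the three regions, and factorize it autoregressively. The symbols x, T, θ, u and c are notation assumed here for illustration; the tokenization and the 5'-to-3' generation order are also assumptions.

```latex
x = \bigl(u^{5'},\, c,\, u^{3'}\bigr), \qquad
p_\theta(x) = \prod_{t=1}^{T} p_\theta\bigl(x_t \mid x_{<t}\bigr)
            = p_\theta\bigl(u^{5'}\bigr)\;
              p_\theta\bigl(c \mid u^{5'}\bigr)\;
              p_\theta\bigl(u^{3'} \mid u^{5'}, c\bigr)
```

By contrast, optimizing each region in isolation corresponds to the fully factorized form p(u^{5'}) p(c) p(u^{3'}), which has no term through which a CDS choice can depend on its UTR context.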

#Key Features

  • Joint UTR-CDS-UTR generation: Generates full-length mRNA sequences with coordinated 5' UTR, CDS, and 3' UTR rather than optimizing regions independently.
  • 30M-sequence pretraining corpus: Pretrained on full-length natural mRNAs, capturing biological constraints beyond what synthetic codon-optimization tables encode.
  • RL-tuned for therapeutic objectives: Fine-tuned with reinforcement learning against translation-efficiency and stability rewards to bias generation toward therapeutically useful designs.
  • Coordinated codon optimization: Codon choices are conditioned on flanking UTR context, capturing context-dependent translation effects that table-based codon-optimization methods miss.
  • Direct applicability to mRNA therapeutics: Targets the practical workflow of designing mRNAs for vaccines and protein-replacement therapies.
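
To make the UTR-CDS-UTR structure in the list above concrete, here is a minimal Python sketch of a container for a three-region design, plus a check that a candidate CDS still encodes the intended protein. The class and function names (`FullLengthMRNA`, `translate`, `encodes`) are hypothetical and are not taken from the mRNA-GPT code base; only the standard genetic code is assumed.

```python
from dataclasses import dataclass

# Standard genetic code in RNA alphabet, enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


@dataclass(frozen=True)
class FullLengthMRNA:
    """Hypothetical container for one design: 5' UTR, CDS, and 3' UTR."""
    utr5: str
    cds: str
    utr3: str

    def sequence(self) -> str:
        # The full-length molecule is the three regions concatenated 5' to 3'.
        return self.utr5 + self.cds + self.utr3


def translate(cds: str) -> str:
    """Translate a CDS codon by codon; stop codons are rendered as '*'."""
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def encodes(design: FullLengthMRNA, protein: str) -> bool:
    """True if the CDS starts with AUG, ends with one stop codon,
    and translates to exactly `protein` (which must contain no '*')."""
    aa = translate(design.cds)
    return design.cds.startswith("AUG") and aa.endswith("*") and aa[:-1] == protein


original = FullLengthMRNA(utr5="GGGAAA", cds="AUGAAAGUGUAA", utr3="CCCUUU")
synonymous = FullLengthMRNA(utr5="GGGAAA", cds="AUGAAGGUCUAG", utr3="CCCUUU")
print(encodes(original, "MKV"), encodes(synonymous, "MKV"))  # True True
```

A check of this kind is useful for any generative designer, because synonymous codon changes are only acceptable if the encoded protein is unchanged.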

#Technical Details

mRNA-GPT uses a decoder-only transformer pretrained autoregressively on 30M full-length natural mRNA sequences. After pretraining, the model is fine-tuned via reinforcement learning with reward signals derived from experimental measurements of mRNA stability and translation efficiency. The bioRxiv preprint reports architecture, training corpus details, and ablations on the impact of the RL stage.
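
The reinforcement-learning stage can be illustrated with a toy policy-gradient loop. This is a sketch under strong assumptions, not the preprint's method: the source does not say which RL algorithm is used, so plain REINFORCE with a moving-average baseline stands in for it; the transformer is replaced by independent per-position logits over synonymous codons; and the reward is a placeholder (GC fraction) rather than a measured stability or translation-efficiency signal. All names are hypothetical.

```python
import math
import random

# Synonymous codons for a toy protein alphabet (subset of the standard genetic code).
SYNONYMS = {
    "K": ["AAA", "AAG"],
    "V": ["GUU", "GUC", "GUA", "GUG"],
    "A": ["GCU", "GCC", "GCA", "GCG"],
}


def toy_reward(cds: str) -> float:
    """Placeholder reward: GC fraction of the CDS. NOT the preprint's reward."""
    return sum(base in "GC" for base in cds) / len(cds)


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce(protein, steps=2000, lr=0.5, seed=0):
    """REINFORCE over per-position codon choices; returns the learned logits."""
    rng = random.Random(seed)
    logits = [[0.0] * len(SYNONYMS[aa]) for aa in protein]
    baseline = None
    for _ in range(steps):
        probs = [softmax(row) for row in logits]
        # Sample one codon index per residue from the current policy.
        choice = [rng.choices(range(len(p)), weights=p)[0] for p in probs]
        cds = "".join(SYNONYMS[aa][c] for aa, c in zip(protein, choice))
        reward = toy_reward(cds)
        if baseline is None:
            baseline = reward  # initialize the baseline from the first sample
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)  # moving-average baseline
        # Gradient of log softmax w.r.t. logit k is (1 if k == chosen else 0) - p[k].
        for row, p, c in zip(logits, probs, choice):
            for k in range(len(row)):
                row[k] += lr * advantage * ((1.0 if k == c else 0.0) - p[k])
    return logits


def greedy_decode(protein, logits):
    """Pick the highest-logit codon at each position."""
    return "".join(
        SYNONYMS[aa][max(range(len(row)), key=row.__getitem__)]
        for aa, row in zip(protein, logits)
    )


learned = reinforce("KVA")
best = greedy_decode("KVA", learned)
print(best, round(toy_reward(best), 3))
```

In the setup the text describes, the sampled sequence would also include both UTRs and the reward would come from stability and translation-efficiency signals; the update rule above only shows the general shape of reward-weighted fine-tuning.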

Benchmarks include comparisons against codon-table optimization tools (CodonW, EMBOSS) and prior ML-based UTR-optimization tools, evaluating both translation-efficiency proxies and direct in vitro measurements.
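
For context on what a codon-table baseline does, the sketch below implements the simplest variant: back-translate a protein by picking the most frequent codon for each residue. The usage frequencies are illustrative numbers, not a real organism's table, and the function name is hypothetical; this is not the implementation of any tool named above.

```python
# Illustrative relative codon frequencies per amino acid (made-up values).
USAGE = {
    "M": {"AUG": 1.00},
    "K": {"AAA": 0.43, "AAG": 0.57},
    "V": {"GUU": 0.18, "GUC": 0.24, "GUA": 0.12, "GUG": 0.46},
    "*": {"UAA": 0.30, "UAG": 0.24, "UGA": 0.46},
}


def most_frequent_codon_cds(protein: str, usage=USAGE) -> str:
    """Back-translate `protein` using the top-frequency codon per residue,
    then append the top-frequency stop codon."""
    codons = [max(usage[aa], key=usage[aa].get) for aa in protein]
    stop = max(usage["*"], key=usage["*"].get)
    return "".join(codons) + stop


print(most_frequent_codon_cds("MKV"))  # AUGAAGGUGUGA
```

The function takes no UTR argument, so it returns the same CDS regardless of the flanking 5' and 3' UTRs. That context-independence is the limitation joint UTR-CDS-UTR models aim to remove.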

#Applications

mRNA-GPT is directly applicable to therapeutic mRNA design (vaccines, protein-replacement therapies, and mRNA-based gene therapies), where stability and translation efficiency are critical product attributes. The unified UTR-CDS-UTR generation removes manual handoffs between separate codon-optimization, UTR-design, and folding-check stages.

#Impact

mRNA-GPT is, to our knowledge, the first generative foundation model to address the mRNA design problem holistically by jointly modeling UTRs and CDS. Together with the experimental validation reported in the preprint, it is a step toward foundation-model approaches to mRNA-therapeutic engineering, complementing related efforts on UTR-specific models (5-UTR-LM) and coding-sequence optimization tools (CaLM, CodonFM).

Citation

mRNA-GPT: A Generative Model for Full-Length mRNA Design and Optimization

Li, S., et al. (2026) mRNA-GPT: A Generative Model for Full-Length mRNA Design and Optimization. bioRxiv.

DOI: 10.64898/2026.03.31.715707

Recent citations

Papers that recently cited this model.

  • mRNAutilus: Multi-Objective-Guided Discrete Generation of mRNA with Optimized Therapeutic Properties
    Sawan Patel, Sophia Tang, Yesol Kim, et al.
    May 2026 · Citations: 1
  • Protein-Conditioned Multi-Objective Reinforcement Learning for Full-Length mRNA Design
    Zixi Shao, Tao Wang, Yibei Xiao, et al.
    May 2026 · Citations: 0

Citations

Total Citations: 2
Influential: 0
References: 47

GitHub

Stars: 3
Forks: 0
Open Issues: 0
Contributors: 1
Last Push: 7mo ago
Language: Python

HuggingFace

Downloads: 0
Likes: 0
Last Modified: 1y ago

Fields of citing research

  • Biology: 100%
  • Computer Science: 100%
  • Medicine: 50%

Share of papers citing this model. A paper can be assigned to more than one field, so the percentages can sum to more than 100%.

Openness

bio.rodeo openness: Closed · low usability and reproducibility
10 · Closed
Usability (can I run it?): 11
Reproducibility (can I retrain it?): 6
Model Openness Framework: Unclassified
Restrictive license on core components

Tags

codon, codon_optimization, foundation_model, mrna, mrna_design, reinforcement_learning, rna_stability_prediction, self_supervised, therapeutic_mrna_design, transformer, utr

Resources

  • GitHub Repository
  • Research Paper
  • HuggingFace Model