EiRA

Protein binder design model post-trained from a multimodal protein language model to bind proteins, peptides, small molecules, and nucleic acids.

Released: September 2025

EiRA is a generative model for universal biomolecule-binding protein design, described in a bioRxiv preprint from Hunan University (first posted September 2025, updated through February 2026). Designing proteins that bind specified targets is central to protein engineering and gene therapy, but binding partners span many biomolecule classes — other proteins, peptides, small molecules, nucleic acids, and more. EiRA aims to handle this diversity within a single framework rather than building a separate model per target type.

Rather than training from scratch, EiRA is produced by post-training a general multimodal protein language model in two stages: domain-adaptive masking training, which adapts the base model toward binding-relevant sequence distributions, and binding-site-informed preference optimization, which steers generation toward designs that respect binding-site constraints. This post-training recipe positions EiRA alongside other protein language model–based design approaches while emphasizing transfer from a broad pretrained backbone to the specialized task of binder generation.
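As a rough illustration of the input side of masking-based training, the sketch below hides a random subset of residues in a protein sequence and records the originals as prediction targets. It is not EiRA's actual pipeline: the function name `mask_sequence`, the `<mask>` token, and the 15% mask rate are all assumptions chosen for illustration, and the source does not describe how masked positions are selected.

```python
import random

MASK = "<mask>"  # hypothetical mask token; the real vocabulary is not given in the source


def mask_sequence(seq, mask_rate=0.15, seed=None):
    """Mask a random subset of residues.

    Returns (tokens, targets), where targets maps each masked
    position to the residue that the model would have to recover.
    """
    rng = random.Random(seed)
    # Mask at least one residue for non-empty input, never more than len(seq).
    n_mask = min(len(seq), max(1, round(len(seq) * mask_rate)))
    positions = rng.sample(range(len(seq)), n_mask)
    tokens = list(seq)
    targets = {}
    for pos in positions:
        targets[pos] = tokens[pos]
        tokens[pos] = MASK
    return tokens, targets


tokens, targets = mask_sequence("MKTAYIAKQRQISFVKSHFSRQ", seed=0)
print(" ".join(tokens))
print(targets)
```

In a domain-adaptive setting, sequences like the one above would be drawn from binding proteins rather than from a general protein corpus, which is what shifts the model toward binding-relevant distributions.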

The authors report state-of-the-art results across structural confidence, diversity, novelty, and designability, evaluated on eight test sets spanning six biomolecule types, and extend the model to DNA-conditioned binder design — broadening the binding-design paradigm to nucleic-acid targets.

Key Features

Two-stage post-training: Combines domain-adaptive masking training with binding-site-informed preference optimization on top of a general multimodal protein language model.
Universal binder design: Handles multiple biomolecule classes within one model, evaluated across eight test sets and six biomolecule types.
DNA-conditioned design: Incorporates DNA information to support DNA-conditioned binder generation, extending design beyond protein and peptide targets.
Repetition mitigation: Optimizes training strategy and loss to reduce the severe repetitive generation seen in the underlying language model.
Experimental validation: Purification experiments and molecular dynamics confirm manufacturability and DNA binding, including a one-shot Glucagon peptide binder with SPR-confirmed micromolar affinity.
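The repetition problem mentioned in the list above can be made concrete with a simple diagnostic. The sketch below scores a generated sequence by the fraction of its k-mers that occur more than once. This metric and the name `repeated_kmer_fraction` are illustrative assumptions; the source does not say how the authors quantify repetition.

```python
from collections import Counter


def repeated_kmer_fraction(seq, k=3):
    """Fraction of k-mer occurrences belonging to a k-mer seen more than once."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return 0.0  # sequence shorter than k: nothing to compare
    counts = Counter(kmers)
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(kmers)


print(repeated_kmer_fraction("AAAAAAAA"))   # fully repetitive homopolymer
print(repeated_kmer_fraction("ACDEFGHIK"))  # no repeated 3-mers
```

A score near 1 flags degenerate output such as long homopolymer runs, while a score near 0 indicates that few local motifs recur.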

Technical Details

EiRA is built on a general multimodal protein language model and specialized through two post-training stages: domain-adaptive masking training and binding-site-informed preference optimization. The authors report state-of-the-art performance on structural confidence, diversity, novelty, and designability across eight test sets covering six biomolecule types. They also show that EiRA yields better representations of biomolecule-binding proteins than a generic model, which improves several downstream predictive tasks. Adjustments to the training strategy and loss reduce the repetitive-generation pathology seen in the underlying language model. Experimental validation included protein purification and molecular dynamics simulations, which confirmed manufacturability and DNA-binding ability, and a one-shot-designed Glucagon peptide binder with surface plasmon resonance (SPR)-confirmed micromolar affinity. The manuscript, a recent preprint, does not reference public code or model weights.
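To give a sense of what a preference-optimization objective looks like, the sketch below implements the standard Direct Preference Optimization (DPO) loss for a single pair of designs. This is a stand-in, not the authors' objective: the source does not state which preference loss EiRA uses or how binding-site information defines the preferred design. Here the "chosen" design is simply assumed to be the one that better respects the binding-site constraints, and all argument names are hypothetical.

```python
import math


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair.

    Each argument is a sequence log-probability: `policy_*` from the model
    being trained, `ref_*` from a frozen reference model. `beta` scales how
    strongly the policy is pushed away from the reference.
    """
    margin = beta * ((policy_chosen - ref_chosen) - (policy_rejected - ref_rejected))
    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The loss falls below log(2) once the policy favors the chosen design
# more strongly than the reference model does.
print(dpo_loss(-8.0, -12.0, -10.0, -10.0))
print(dpo_loss(-10.0, -10.0, -10.0, -10.0))
```

When the policy and reference agree, the margin is zero and the loss equals log 2; training lowers it by raising the relative likelihood of the preferred design.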

Applications

EiRA targets protein engineering and gene therapy applications that require de novo binders against diverse molecular targets. Its support for protein, peptide, small-molecule, and DNA targets makes it relevant for designing therapeutic binders, research-grade affinity reagents, and DNA-binding proteins, while its improved representations can aid downstream prediction tasks involving biomolecule-binding proteins.

Impact

EiRA shows that careful post-training (domain adaptation plus binding-site-informed preference optimization) can convert a general multimodal protein language model into a versatile, multi-target binder designer, with wet-lab and SPR validation lending credibility to its top designs. Extending the approach to DNA-conditioned design broadens the scope of language-model-based protein design. The work is an unreviewed preprint that references no code or weight release, so independent benchmarking and reproduction are still needed to establish how broadly its state-of-the-art claims generalize.

Citation

Improved multimodal protein language model-driven universal biomolecules-binding protein design with EiRA

Preprint

Zeng, W., et al. (2026) Improved multimodal protein language model-driven universal biomolecules-binding protein design with EiRA. bioRxiv.

DOI: 10.1101/2025.09.02.673615

Recent citations

Papers that recently cited this model.

AI-Driven Biomolecular Design: Modalities, Models, and Translation
Mehmoona Azmat, Wenjin Li
Biomaterials · Jul 2026
Citations: 0

Symmetric Self-play Online Preference Optimization for Protein Inverse Folding
Wenwu Zeng, Xiaoyu Li, Haitao Zou, et al.
bioRxiv · Mar 2026
Citations: 0

Top citations

The most-cited papers that cite this model.

Symmetric Self-play Online Preference Optimization for Protein Inverse Folding
Wenwu Zeng, Xiaoyu Li, Haitao Zou, et al.
bioRxiv · Mar 2026
Citations: 0

AI-Driven Biomolecular Design: Modalities, Models, and Translation
Mehmoona Azmat, Wenjin Li
Biomaterials · Jul 2026
Citations: 0

Citations

Total Citations: 2

Influential: 0

References: 52

Fields of citing research

Biology: 100%
Computer Science: 50%
Engineering: 50%
Materials Science: 50%

Share of papers citing this model.

Openness

bio.rodeo openness: Closed · low usability and reproducibility

13 · Closed

Usability (can I run it?): 13

Reproducibility (can I retrain it?): 0

not reproducible

Model Openness Framework

Unclassified

Missing required components

Resources

Research Paper

