bio.rodeo
© 2026 Pulsatance. All rights reserved.

EiRA

Hunan University

A generative protein design model, post-trained from a multimodal protein language model, for universal biomolecule-binding protein design. One designed peptide binder was validated by surface plasmon resonance (SPR).

Released: September 2025

EiRA is a generative model for universal biomolecule-binding protein design, described in a bioRxiv preprint from Hunan University (first posted September 2025, updated through February 2026). Designing proteins that bind specified targets is central to protein engineering and gene therapy, but binding partners span many biomolecule classes — other proteins, peptides, small molecules, nucleic acids, and more. EiRA aims to handle this diversity within a single framework rather than building a separate model per target type.

Rather than training from scratch, EiRA is produced by post-training a general multimodal protein language model in two stages: domain-adaptive masking training, which adapts the base model toward binding-relevant sequence distributions, and binding-site-informed preference optimization, which steers generation toward designs that respect binding-site constraints. This post-training recipe positions EiRA alongside other protein language model–based design approaches while emphasizing transfer from a broad pretrained backbone to the specialized task of binder generation.
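As a rough illustration of what masking-based training data looks like, here is a minimal sketch. The preprint's actual masking scheme is not described on this page, so uniform random masking at a fixed rate (in the style of generic masked language modeling) is an assumption, and the names `mask_sequence`, `MASK`, and `mask_rate` are hypothetical.

```python
import random

MASK = "<mask>"  # hypothetical mask token, not taken from the preprint


def mask_sequence(seq, mask_rate=0.15, rng=None):
    """Build one masked-token training example from a protein sequence.

    Returns (inputs, labels):
      inputs - residues with a random subset replaced by MASK
      labels - the original residue at masked positions, None elsewhere
    Assumption: positions are chosen uniformly at random.
    """
    if not seq:
        return [], []
    rng = rng or random.Random()
    # Mask at least one position so every example carries a training signal.
    n_mask = max(1, round(len(seq) * mask_rate))
    positions = set(rng.sample(range(len(seq)), n_mask))
    inputs = [MASK if i in positions else aa for i, aa in enumerate(seq)]
    labels = [aa if i in positions else None for i, aa in enumerate(seq)]
    return inputs, labels


example_seq = "MKTAYIAKQRQISFVKSHFSRQ"
inputs, labels = mask_sequence(example_seq, mask_rate=0.15, rng=random.Random(0))
```

A model trained on such pairs learns to predict the hidden residues from context. In a domain-adaptive setting the examples would be drawn from binding proteins rather than a general sequence corpus, which is the only part of this sketch that follows directly from the description above.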

The authors report state-of-the-art results across structural confidence, diversity, novelty, and designability, evaluated on eight test sets spanning six biomolecule types, and extend the model to DNA-conditioned binder design — broadening the binding-design paradigm to nucleic-acid targets.

#Key Features

  • Two-stage post-training: Combines domain-adaptive masking training with binding-site-informed preference optimization on top of a general multimodal protein language model.
  • Universal binder design: Handles multiple biomolecule classes within one model, evaluated across eight test sets and six biomolecule types.
  • DNA-conditioned design: Incorporates DNA information to support DNA-conditioned binder generation, extending design beyond protein and peptide targets.
  • Repetition mitigation: Optimizes training strategy and loss to reduce the severe repetitive generation seen in the underlying language model.
  • Experimental validation: According to the authors, protein purification experiments and molecular dynamics simulations confirm manufacturability and DNA-binding ability. Separately, a one-shot-designed Glucagon peptide binder showed micromolar affinity by surface plasmon resonance (SPR).
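The repetition problem mentioned in the list above can be made concrete with a simple diagnostic. The sketch below is an assumption, not the metric or the mitigation used in the preprint: it measures what fraction of the n-grams in a generated sequence are repeats, with `repeated_ngram_fraction` and the default `n=3` chosen purely for illustration.

```python
from collections import Counter


def repeated_ngram_fraction(seq, n=3):
    """Fraction of n-grams in seq that duplicate another n-gram in seq.

    0.0 means every n-gram is unique; values near 1.0 indicate
    heavily repetitive output such as long homopolymer runs.
    """
    total = len(seq) - n + 1
    if total <= 0:
        return 0.0
    counts = Counter(seq[i:i + n] for i in range(total))
    # Each n-gram seen c times contributes c - 1 repeats.
    repeats = sum(c - 1 for c in counts.values())
    return repeats / total


varied = repeated_ngram_fraction("ACDEFGHIK")    # all 3-grams unique
degenerate = repeated_ngram_fraction("AAAAAAAA")  # one 3-gram, repeated
```

A score like this could be tracked before and after post-training to check whether repetitive generation actually decreases.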

#Technical Details

EiRA is built on a general multimodal protein language model and specialized through two post-training stages: domain-adaptive masking training and binding-site-informed preference optimization. The authors report state-of-the-art performance on structural confidence, diversity, novelty, and designability across eight test sets covering six biomolecule types. They also show that EiRA yields better representations of biomolecule-binding proteins than a generic model, improving several downstream predictive tasks. Adjustments to the training strategy and loss reduce the severe repetitive generation seen in the underlying language model. Experimental validation included protein purification and molecular dynamics simulations, which the authors report as confirming manufacturability and DNA-binding ability, and a one-shot-designed Glucagon peptide binder with micromolar affinity confirmed by surface plasmon resonance (SPR). The manuscript, a recent preprint, does not reference public code or model weights.
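To give a sense of what a preference-optimization objective looks like, the sketch below implements the standard Direct Preference Optimization (DPO) loss for a single pair of sequences. This is an assumption: the page does not say which preference objective EiRA uses or how preference pairs are built. A pairing of a design that respects binding-site constraints ("chosen") against one that does not ("rejected") is used here only as a hypothetical example, and `dpo_loss` and `beta` are illustrative names.

```python
import math


def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one (chosen, rejected) pair.

    logp_*     - sequence log-likelihoods under the model being trained
    ref_logp_* - the same log-likelihoods under a frozen reference model
    beta       - strength of the implicit penalty for drifting from the reference
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)), computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy already prefers the chosen design more than the reference does.
good = dpo_loss(-10.0, -20.0, -15.0, -15.0)
# Policy prefers the rejected design instead.
bad = dpo_loss(-20.0, -10.0, -15.0, -15.0)
```

The loss falls below log 2 when the trained model favors the chosen sequence more strongly than the reference model does, and rises above it in the opposite case, so minimizing it pushes generation toward the preferred designs.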

#Applications

EiRA targets protein engineering and gene therapy applications that require de novo binders against diverse molecular targets. Its support for protein, peptide, small-molecule, and DNA targets makes it relevant for designing therapeutic binders, research-grade affinity reagents, and DNA-binding proteins, while its improved representations can aid downstream prediction tasks involving biomolecule-binding proteins.

#Impact

EiRA shows that careful post-training (domain adaptation plus binding-site-informed preference optimization) can convert a general multimodal protein language model into a versatile, multi-target binder designer, with wet-lab and SPR validation lending credibility to its top designs. Extending the approach to DNA-conditioned design broadens the scope of language-model-based protein design. Because the work is an unreviewed preprint with no referenced code or weight release, independent benchmarking and reproduction are still needed to establish how broadly its state-of-the-art claims generalize.

Tags

protein_design, binder_design, representation_learning, transformer, language_model, multimodal, generative, protein_ligand_interactions, dna