bio.rodeo

PepEDiff

University of Cincinnati

Zero-shot peptide binder generator that runs diffusion in the latent space of a pretrained protein-embedding model, designing binders without intermediate structure prediction.

Released: January 2026

PepEDiff is a generative model for designing peptide binders against a chosen target receptor, developed by the Bai lab (LabJunBMI) at the University of Cincinnati and posted to arXiv in January 2026. Its central idea is to perform diffusion directly in the continuous latent space of a pretrained protein-embedding model, rather than over explicit 3D coordinates. This lets PepEDiff propose binder sequences without an intermediate structure-prediction step, sidestepping a common bottleneck and source of compounding error in structure-first design pipelines.

PepEDiff is a zero-shot designer: a single pretrained checkpoint generates candidate binders for an arbitrary receptor sequence and user-specified pocket residues, with no per-target retraining. Generation is conditioned on these inputs in a learned embedding space rather than run against a fixed target structure. The authors report that exploring this latent space lets the model reach novel peptides beyond the distribution of known binders, which is valuable when established binder motifs are scarce or biased.

The work is most clearly differentiated on hard, out-of-distribution targets. TIGIT is an immune checkpoint with a flat, feature-poor protein-protein interaction interface that is challenging for conventional methods. On this target, the authors report that PepEDiff outperforms state-of-the-art baselines, which supports the claim that latent-space exploration helps where pocket-based assumptions break down.

#Key Features

  • Latent-space diffusion: PepEDiff runs the diffusion process in the continuous embedding space of a pretrained protein language/embedding model, so generation operates on learned representations rather than raw atomic coordinates.
  • No structure prediction required: The model designs binder sequences directly, avoiding an intermediate folding step and the errors it can introduce.
  • Zero-shot, any-target inference: A single checkpoint generates binders for any receptor sequence with specified pocket residues; the provided sample_by_seq.py script takes a receptor sequence and pocket indices and produces candidates without retraining.
  • Reaches novel binder space: Latent-space exploration yields peptides beyond the distribution of known binders, useful for difficult or poorly characterized targets.
  • Strong on flat PPI interfaces: On TIGIT, a challenging out-of-distribution target with a flat interface, PepEDiff outperforms state-of-the-art methods.
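
To make the "latent-space diffusion" idea above concrete, here is a minimal sketch of the standard DDPM-style forward noising step applied to a plain vector that stands in for a protein embedding. It is a generic illustration, not PepEDiff's implementation: the linear noise schedule, the function names, and the four-dimensional toy vector are all assumptions chosen for readability, and the source does not state which schedule or parameterization the authors use.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Return linearly spaced noise variances beta_1..beta_T (assumed schedule)."""
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Return alpha_bar_t, the running product of (1 - beta_s) for s <= t."""
    out = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(running)
    return out


def add_noise(latent, alpha_bar_t, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps."""
    noise = [rng.gauss(0.0, 1.0) for _ in latent]
    signal_scale = math.sqrt(alpha_bar_t)
    noise_scale = math.sqrt(1.0 - alpha_bar_t)
    noisy = [signal_scale * x + noise_scale * e for x, e in zip(latent, noise)]
    return noisy, noise


def recover_clean(noisy, noise, alpha_bar_t):
    """Invert the forward step given the exact noise.

    A trained denoiser would supply an estimate of `noise` instead.
    """
    signal_scale = math.sqrt(alpha_bar_t)
    noise_scale = math.sqrt(1.0 - alpha_bar_t)
    return [(x - noise_scale * e) / signal_scale for x, e in zip(noisy, noise)]


if __name__ == "__main__":
    rng = random.Random(0)
    latent = [0.5, -1.2, 0.3, 2.0]  # hypothetical stand-in for one embedding vector
    alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
    noisy, noise = add_noise(latent, alpha_bars[499], rng)
    restored = recover_clean(noisy, noise, alpha_bars[499])
    print(max(abs(a - b) for a, b in zip(latent, restored)))
```

The point of the sketch is only that the noising and denoising arithmetic acts on embedding values, with no atomic coordinates involved. In a real model, a learned network predicts the noise (conditioned on the target) and a decoder maps the denoised embedding back to a peptide sequence.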

#Technical Details

PepEDiff is a diffusion model that operates in the latent space of a pretrained protein-embedding model. Instead of denoising 3D structure, it denoises within the embedding manifold and decodes the result to peptide sequences. Generation is conditioned on the receptor sequence and a set of zero-indexed pocket residue indices supplied at inference.

The released "Seq-Only" implementation exposes this through sample_by_seq.py, which is parameterized by generation count, peptide length, pocket indices, and receptor sequence. The authors recommend folding generated sequences with external tools such as AlphaFold Server or Boltz for downstream evaluation.

Code is openly available on GitHub (LabJunBMI/PepEDiff-Seq-Only) under the Apache-2.0 license. Pretrained weights and preprocessed training/testing data are distributed via an institutional SharePoint link that points to a personal university OneDrive directory. This is a notable archival concern: such links are prone to breakage or access changes over time, so long-term reproducibility would benefit from a persistent host (e.g., a model hub or DOI-backed archive).

Evaluation emphasizes the TIGIT target, where PepEDiff is reported to surpass state-of-the-art baselines.
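
Because the pocket indices are zero-indexed while residue positions are usually quoted starting from 1, an off-by-one error is easy to make when preparing inputs. The helper below is a small sketch of one way to convert and validate positions before passing them to an inference script. The function names and the toy receptor string are hypothetical, it does not call or reproduce sample_by_seq.py, and it assumes that position 1 is the first character of the sequence you supply (numbering taken from a structure file may carry an offset).

```python
# The 20 standard amino acids in one-letter code.
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def to_zero_indexed_pocket(receptor_seq, residue_numbers):
    """Convert 1-based residue positions to sorted, de-duplicated 0-based indices."""
    seq = receptor_seq.strip().upper()
    if not seq:
        raise ValueError("receptor sequence is empty")
    unknown = sorted(set(seq) - STANDARD_RESIDUES)
    if unknown:
        raise ValueError(f"non-standard residues in receptor: {unknown}")
    indices = sorted({n - 1 for n in residue_numbers})
    if not indices:
        raise ValueError("at least one pocket residue is required")
    if indices[0] < 0 or indices[-1] >= len(seq):
        raise ValueError(
            f"pocket positions must lie between 1 and {len(seq)} (1-based)"
        )
    return indices


def describe_pocket(receptor_seq, indices):
    """Pair each 0-based index with its residue letter for a quick sanity check."""
    seq = receptor_seq.strip().upper()
    return [(i, seq[i]) for i in indices]


if __name__ == "__main__":
    toy_receptor = "MKTAYIAKQR"  # made-up sequence, not a real target
    pocket = to_zero_indexed_pocket(toy_receptor, [3, 1, 3, 10])
    print(pocket)
    print(describe_pocket(toy_receptor, pocket))
```

Printing the residue letters next to the indices is a cheap check that the selected positions are the ones intended, which matters more on flat interfaces where there is no obvious pocket to eyeball.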

#Applications

PepEDiff is intended for researchers designing peptide binders, particularly against protein-protein-interaction targets that lack deep, well-defined pockets and are therefore difficult for structure-based methods. Given a receptor sequence and the residues that define a desired interaction site, the model proposes diverse candidate sequences that can be folded and triaged with external structure-prediction tools before experimental follow-up. Its zero-shot design means a single trained model can be applied across many targets, lowering the barrier for groups that do not have the resources to train target-specific generators. The released inference script makes it straightforward to integrate into a screening workflow.
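
The fold-and-triage step described above usually starts with some housekeeping on the generated sequences. The sketch below shows one possible pre-folding pass: normalize case, drop duplicates and sequences with non-standard letters, and emit a plain FASTA string. The function names, the record prefix, and the example peptides are hypothetical. FASTA is used here only as a neutral intermediate; each folding tool defines its own input format, so a conversion step may still be needed.

```python
# The 20 standard amino acids in one-letter code.
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def clean_candidates(peptides, min_len=1, max_len=None):
    """Return unique, upper-cased peptides that use only standard residues.

    Order of first appearance is preserved so records stay traceable.
    """
    seen = set()
    kept = []
    for raw in peptides:
        seq = raw.strip().upper()
        if not seq or set(seq) - STANDARD_RESIDUES:
            continue  # empty or contains a non-standard character
        if len(seq) < min_len:
            continue
        if max_len is not None and len(seq) > max_len:
            continue
        if seq in seen:
            continue  # duplicate of an earlier candidate
        seen.add(seq)
        kept.append(seq)
    return kept


def candidates_to_fasta(peptides, prefix="pep"):
    """Format peptides as FASTA records named <prefix>_0001, <prefix>_0002, ..."""
    lines = []
    for number, seq in enumerate(peptides, start=1):
        lines.append(f">{prefix}_{number:04d}")
        lines.append(seq)
    return "\n".join(lines) + ("\n" if lines else "")


if __name__ == "__main__":
    generated = ["ACDKW", "acdkw", "AC1KW", " GGHWY ", "GG"]  # made-up examples
    unique = clean_candidates(generated, min_len=4)
    print(candidates_to_fasta(unique), end="")
```

Filtering before folding keeps the number of structure-prediction jobs down, which is typically the expensive part of a screening workflow.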

#Impact

PepEDiff contributes to a growing line of work that moves peptide and protein design out of explicit coordinate space and into learned embedding spaces, where diffusion can explore beyond the support of known binders. Its result on TIGIT is a useful demonstration that this strategy can help precisely where conventional, pocket-centric approaches struggle. The open Apache-2.0 codebase and a usable inference entry point lower the barrier to reuse. Two caveats temper its near-term impact: the reported validation is computational rather than experimental, and the pretrained weights are hosted on a fragile institutional SharePoint link rather than a durable archive, which poses a real risk to long-term reproducibility.

GitHub

Stars: 2
Forks: 1
Open Issues: 1
Contributors: 1
Last Push: 6mo ago
Language: Python
License: Apache-2.0

Openness

bio.rodeo openness
Fully open · usable and reproducible
62 Partial
Usability (can I run it?): 72
Reproducibility (can I retrain it?): 50
Model Openness Framework: Unclassified
Restrictive license on core components

Tags

peptide_design, de_novo_design, protein_design, diffusion, generative, zero_shot, representation_learning, peptides, proteomics

Resources

GitHub Repository
Research Paper