bio.rodeo

The authoritative source for evaluating biological foundation models. No hype, just honest analysis.

© 2026 Pulsatance. All rights reserved. Built by Pulsatance.

Peptide2Mol

Tsinghua University

An E(3)-equivariant graph neural network diffusion model that generates drug-like small molecules as peptidomimetics, conditioned on a reference peptide binder and its protein pocket.

Released: November 2025

Peptide2Mol is a generative deep learning model that designs drug-like small molecules to mimic the binding behavior of a known peptide binder against a target protein. Peptide therapeutics can bind their targets with high affinity and specificity, but they often suffer from poor oral bioavailability, metabolic instability, and short half-lives. Converting a validated peptide binder into a small molecule (a peptidomimetic) aims to preserve the key interactions while yielding a more drug-like, orally amenable compound. Peptide2Mol automates this historically laborious medicinal-chemistry task by learning to generate molecules directly within the protein pocket that the reference peptide occupies.

Introduced in November 2025 by Xinheng He, Yijia Zhang, and colleagues at Tsinghua University (with the senior author Jianzhu Ma of the Institute for AI Industry Research), the model is described in a preprint submitted to RECOMB 2026. It builds on the lab's broader program in AI-driven peptide and small-molecule drug design, complementing peptide-generation tools such as PepMimic with a method that crosses the peptide-to-small-molecule boundary.

By conditioning jointly on the reference peptide and the surrounding pocket environment, Peptide2Mol differs from purely structure-based generators that see only the receptor. The authors report state-of-the-art performance on peptidomimetic generation, and the model supports an iterative optimization mode, positioning it as a candidate tool for early-stage hit generation and lead refinement.

Key Features

  • Dual conditioning: Generation is guided by both a reference peptide binder and the protein pocket, so produced molecules recapitulate the peptide's binding mode rather than merely fitting the receptor surface.
  • E(3)-equivariant diffusion: A graph neural network with E(3) equivariance generates 3D molecular structures, so rotating, translating, or reflecting the pocket and peptide inputs transforms the generated geometry in the same way.
  • Partial diffusion for optimization: By noising and denoising only part of a molecule, the model refines existing candidates, enabling lead optimization and scaffold exploration rather than only de novo generation.
  • State-of-the-art peptidomimetic generation: The authors report leading performance on metrics for generating valid, pocket-compatible molecules that mimic the reference peptide.
  • Open implementation and weights: Code and a pretrained checkpoint are released under a permissive MIT license, with optional integration of Pocket2Mol for downstream structure refinement.
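The equivariance property named in the feature list can be illustrated with a toy example. The sketch below is not Peptide2Mol's architecture: it is a generic EGNN-style coordinate update in pure Python, where the function names, the `step` size, and the hand-written `tanh` weight (standing in for a learned network) are all assumptions made for illustration. It checks numerically that updating and then moving the atoms gives the same result as moving and then updating them, for a rotation plus translation and for a reflection.

```python
import math

def egnn_coord_update(coords, feats, step=0.1):
    """One EGNN-style coordinate update on a fully connected graph.

    Each atom moves along the difference vectors to all other atoms,
    scaled by a weight that depends only on invariant quantities
    (scalar features and squared distance). The weight function here is
    a hypothetical stand-in for a learned MLP.
    """
    new_coords = []
    for i, (xi, hi) in enumerate(zip(coords, feats)):
        shift = [0.0, 0.0, 0.0]
        for j, (xj, hj) in enumerate(zip(coords, feats)):
            if i == j:
                continue
            diff = [a - b for a, b in zip(xi, xj)]
            d2 = sum(c * c for c in diff)
            weight = math.tanh(hi * hj) / (1.0 + d2)  # invariant scalar
            shift = [s + weight * c for s, c in zip(shift, diff)]
        new_coords.append([a + step * s for a, s in zip(xi, shift)])
    return new_coords

def transform(coords, angle, offset, mirror=False):
    """Optionally mirror x, rotate about the z axis, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    out = []
    for x, y, z in coords:
        if mirror:
            x = -x
        out.append([c * x - s * y + offset[0],
                    s * x + c * y + offset[1],
                    z + offset[2]])
    return out

def max_abs_diff(a, b):
    """Largest coordinate-wise difference between two point sets."""
    return max(abs(p - q) for u, v in zip(a, b) for p, q in zip(u, v))

coords = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 1.2, 0.3], [-0.7, 0.4, 1.1]]
feats = [0.5, -1.0, 2.0, 0.3]  # toy scalar feature per atom
angle, offset = 0.9, [3.0, -2.0, 1.0]

equivariance_errors = []
for mirror in (False, True):
    update_then_move = transform(egnn_coord_update(coords, feats), angle, offset, mirror)
    move_then_update = egnn_coord_update(transform(coords, angle, offset, mirror), feats)
    equivariance_errors.append(max_abs_diff(update_then_move, move_then_update))

print(equivariance_errors)  # both entries should be at floating-point noise level
```

The update is equivariant because the weights depend only on distances and scalar features, while the direction of each move comes from difference vectors that rotate with the input.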

Technical Details

Peptide2Mol is a denoising diffusion probabilistic model built on an E(3)-equivariant graph neural network. The model treats atoms as nodes in a geometric graph and learns to denoise atom types and 3D coordinates conditioned on the reference peptide and pocket residues, ensuring that generated geometries transform consistently under Euclidean symmetries. The released v1.0 checkpoint (PMT_major.ckpt) was trained on roughly 370,000 samples derived from protein-ligand and peptide-bound structures; inference runs from this fixed checkpoint. A partial-diffusion procedure lets users specify a sub-structure to regenerate, supporting molecule optimization workflows. The reference implementation targets Python 3.9 and CUDA 12.1, includes a preprocessing pipeline that converts SDF inputs to PyTorch tensors, and optionally chains to Pocket2Mol for post-hoc refinement.
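The partial-diffusion idea can be sketched with the standard DDPM forward step, x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε, applied only to the atoms selected for regeneration. The following is a minimal illustration under stated assumptions, not the released implementation: the linear beta schedule, the step counts, the function names, and the restriction to coordinates (atom types are ignored) are all hypothetical choices. A trained denoiser would then run the reverse process from step `t` back to 0 while the frozen atoms are held in place; that part is omitted here.

```python
import math
import random

def alpha_bar(t, num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta) over the first t steps of a
    linear beta schedule (an assumed schedule, chosen for illustration)."""
    prod = 1.0
    for s in range(t):
        beta = beta_start + (beta_end - beta_start) * s / (num_steps - 1)
        prod *= 1.0 - beta
    return prod

def partial_noise(coords, editable, t, num_steps, rng):
    """Apply the DDPM forward step q(x_t | x_0) only to editable atoms.

    Atoms flagged False in `editable` are copied through unchanged, so
    the retained scaffold stays fixed while the rest is perturbed.
    """
    ab = alpha_bar(t, num_steps)
    signal, sigma = math.sqrt(ab), math.sqrt(1.0 - ab)
    noised = []
    for xyz, edit in zip(coords, editable):
        if edit:
            noised.append([signal * c + sigma * rng.gauss(0.0, 1.0) for c in xyz])
        else:
            noised.append(list(xyz))
    return noised

# Toy four-atom molecule: keep the first two atoms, regenerate the last two.
coords = [[0.0, 0.0, 0.0], [1.4, 0.0, 0.0], [2.1, 1.2, 0.0], [3.5, 1.2, 0.1]]
editable = [False, False, True, True]
rng = random.Random(0)

noised = partial_noise(coords, editable, t=300, num_steps=1000, rng=rng)
for before, after, edit in zip(coords, noised, editable):
    print("regenerate" if edit else "keep", before, "->", [round(c, 3) for c in after])
```

Choosing a smaller `t` keeps the regenerated atoms closer to the starting structure, while a larger `t` allows more exploration. Under this sketch, that is the knob that trades off refinement against scaffold change.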

Applications

Peptide2Mol is aimed at computational chemists and drug-discovery teams seeking to convert peptide hits — for example, those derived from receptor or antibody interfaces — into small-molecule leads with better pharmacokinetic properties. Typical workflows include de novo generation of candidate peptidomimetics against a target pocket, and iterative optimization of an existing molecule through partial diffusion. Because the model is pocket- and peptide-aware, it is well suited to targets where a validated peptide binder already exists but a tractable small-molecule starting point does not, a common situation for protein-protein interaction targets.

Impact

Peptide2Mol addresses a long-standing gap in structure-based drug design: systematically translating peptide binders into small molecules. As a recent (late-2025) preprint under conference review, its long-term adoption is still emerging, and reported results have not yet been validated through peer review or large-scale wet-lab campaigns. Nonetheless, by releasing open code and weights under an MIT license, the authors lower the barrier for the community to test peptide-to-small-molecule generation and to extend the dual-conditioning diffusion paradigm. The work reflects a growing trend of equivariant generative models that reason jointly over multiple molecular modalities within a shared binding environment.

Citation

Peptide2Mol: A Diffusion Model for Generating Small Molecules as Peptide Mimics for Targeted Protein Binding

Preprint

He, X., et al. (2025) Peptide2Mol: A Diffusion Model for Generating Small Molecules as Peptide Mimics for Targeted Protein Binding. arXiv.org.

DOI: 10.48550/arXiv.2511.04984


Citations

Total Citations: 0
Influential: 0
References: 41

GitHub

Stars: 15
Forks: 3
Open Issues: 0
Contributors: 3
Last Push: 7mo ago
Language: Python
License: MIT


Openness

bio.rodeo openness: Fully open · usable and reproducible
Score: 75 (Open)
Usability (can I run it?): 91
Reproducibility (can I retrain it?): 52
Model Openness Framework: Unclassified (missing required components)

Tags

de_novo_design, diffusion, drug_discovery, generative, graph_neural_network, molecule_optimization, peptidomimetics

Resources

GitHub Repository, Research Paper