Keshav Memorial Engineering College
Diffusion-based generative model for structure-based peptide inverse folding, pairing a geometric GNN encoder with a Transformer denoiser to design sequences for a target backbone.
Inverse folding — designing an amino-acid sequence that will fold into a desired three-dimensional backbone — is a core task in computational protein and peptide design. While inverse folding for full proteins has been advanced by tools such as ProteinMPNN and ESM-IF, peptides pose distinct challenges: they are short, conformationally flexible, and often lack the stabilizing tertiary context that larger proteins provide. InversePep targets this peptide-specific regime with a generative diffusion model conditioned on backbone structure.
InversePep was introduced in a 2026 preprint by researchers associated with Keshav Memorial Engineering College. The method frames sequence design as learning the conditional distribution of sequences that can adopt a given peptide backbone, then sampling from that distribution with a denoising diffusion process. Because the model samples rather than predicts one deterministic answer, it can propose diverse candidate sequences for the same target conformation.
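The conditional framing described above can be written in the standard form used for denoising diffusion models. This is a generic sketch, not the preprint's own notation: the symbols $X$ (backbone structure), $s_0$ (clean sequence representation), $s_T$ (fully noised state), $T$ (number of steps), and $\theta$ (model parameters) are assumptions introduced here for illustration. Whether the sequence is noised in a discrete or continuous space is not stated in this summary.

```latex
% Reverse (denoising) process conditioned on the backbone X:
% start from noise s_T and apply T learned denoising steps.
p_\theta(s_{0:T} \mid X) \;=\; p(s_T)\,\prod_{t=1}^{T} p_\theta\!\left(s_{t-1} \mid s_t,\, X\right)
```

Drawing several independent samples of $s_T$ and running the same reverse chain is what yields multiple candidate sequences for one backbone.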
The architecture combines a geometric graph neural network, which encodes the 3D backbone while respecting its spatial geometry, with a Transformer-based denoiser that iteratively refines sequence predictions. This design situates InversePep among recent structure-conditioned generative models for biomolecular design, applied specifically to functional peptides.
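To make the idea of encoding a backbone as a spatial graph concrete, here is a minimal sketch that builds a k-nearest-neighbour graph over residue coordinates. It is an illustration under stated assumptions, not the InversePep encoder: the function name `knn_backbone_graph`, the use of one Cα coordinate per residue, and the choice of k-nearest-neighbour edges are all hypothetical, since the source says only that the backbone is represented geometrically.

```python
import numpy as np

def knn_backbone_graph(ca_coords, k=3):
    """Build a directed k-nearest-neighbour graph over residues.

    Hypothetical helper for illustration: one 3D point (e.g. C-alpha)
    per residue. Returns edge_index of shape (2, n*k) and the matching
    edge lengths.
    """
    coords = np.asarray(ca_coords, dtype=float)
    n = coords.shape[0]
    k = min(k, n - 1)  # a residue cannot have more than n-1 neighbours
    # Pairwise Euclidean distances between residues, shape (n, n).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)  # exclude self-edges
    # For each residue, the indices of its k closest residues.
    nbrs = np.argsort(dist, axis=1)[:, :k]
    src = np.repeat(np.arange(n), k)
    dst = nbrs.reshape(-1)
    edge_index = np.stack([src, dst])
    edge_dist = dist[src, dst]
    return edge_index, edge_dist

# Usage: a toy 5-residue chain with 3.8 Å spacing along the x-axis.
toy_coords = np.array([[3.8 * i, 0.0, 0.0] for i in range(5)])
edge_index, edge_dist = knn_backbone_graph(toy_coords, k=2)
print(edge_index.shape)  # (2, 10)
```

Edge lengths depend only on inter-residue distances, so they are unchanged when the whole backbone is rotated or translated. That invariance is one simple sense in which a graph encoding can respect spatial geometry.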
InversePep is a denoising diffusion model for structure-based peptide inverse folding. Its encoder is a geometric graph neural network that represents the target peptide backbone as a spatial graph; its decoder is a Transformer that acts as the denoiser, progressively recovering a sequence consistent with the conditioning structure. The model was trained on peptide structural data drawn from the Propedia and SATPdb databases, which catalog peptide–protein complexes and therapeutic or bioactive peptides, respectively. On held-out structures, the authors report a TM-score of 0.38, where TM-score measures how well structures predicted from the designed sequences recapitulate the intended backbone fold, and they report improved performance relative to comparable baselines. The preprint is distributed under a CC BY-NC-ND license, and no public code or model weights accompany the release.
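For readers unfamiliar with the metric, the sketch below computes a TM-score-style value for two coordinate sets that are already superimposed and matched residue by residue. It is a simplified illustration, not the evaluation pipeline used in the preprint: the function name `tm_score_aligned` is hypothetical, the search for the optimal superposition that full TM-score tools perform is omitted, and the 0.5 Å value used for the distance scale on short chains is an assumption modeled on common implementations. The source does not say which structure predictor or scoring tool produced the reported figure.

```python
import numpy as np

def tm_score_aligned(pred, target):
    """TM-score-style value for pre-superimposed, residue-matched coordinates.

    Hypothetical helper: skips the superposition search of real TM-score
    tools, so it gives a lower bound on the true TM-score for that pairing.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")
    length = target.shape[0]
    # Length-dependent distance scale d0 = 1.24 * (L - 15)^(1/3) - 1.8.
    # Assumption: fall back to 0.5 Å for short chains, where the formula
    # would give a very small or undefined value.
    if length > 21:
        d0 = 1.24 * (length - 15) ** (1.0 / 3.0) - 1.8
    else:
        d0 = 0.5
    # Per-residue deviation between predicted and target positions.
    d = np.linalg.norm(pred - target, axis=1)
    return float(np.mean(1.0 / (1.0 + (d / d0) ** 2)))

# Usage: identical structures score 1.0.
target = np.zeros((10, 3))
print(tm_score_aligned(target, target))  # 1.0
```

Because peptides are short, the distance scale is small and modest coordinate deviations lower the score quickly, which is worth keeping in mind when reading scores for flexible peptide backbones.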
InversePep is intended for researchers designing functional peptides — including antimicrobial and therapeutic peptides — who need sequences predicted to adopt a specified backbone conformation. By generating multiple candidates per target structure, it can support exploratory design campaigns where sequence diversity is valuable for downstream screening. The structure-conditioned generative approach is well suited to peptide engineering workflows that begin from a desired fold or scaffold and seek viable sequences to realize it.
InversePep contributes a peptide-specialized entry to the growing family of structure-conditioned generative models for biomolecular design, applying diffusion modeling to a regime where short, flexible chains complicate standard inverse folding. Its reported TM-score of 0.38 reflects the genuine difficulty of designing sequences for flexible peptide backbones and marks early-stage rather than production-grade performance. The absence of released code or weights, together with its non-commercial license, limits immediate reproducibility and adoption. As a recent preprint, its broader validation — particularly experimental confirmation of designed peptides — remains to be established.