bio.rodeo


Protein

TD3B

Duke University

Sequence-based discrete-diffusion framework that designs peptide binders with specified agonist or antagonist behavior against GPCR targets.

Released: May 2026

TD3B (Transition-Directed Discrete Diffusion for Allosteric Binder Generation) is a sequence-based generative framework for designing peptide binders that exert a specified functional effect — agonism or antagonism — on a target protein. Most generative binder design methods optimize for binding affinity alone, treating any tight binder as a success. TD3B instead conditions generation on the direction of the allosteric response, so that the resulting peptides are not merely binders but functional modulators that push the target toward an active or inactive conformational state. The work was developed by the Chatterjee Lab (Programmable Biology Group) at Duke University, spanning Biomedical Engineering and Computer Science, and was presented as an ICML 2026 Spotlight.

The model targets G protein-coupled receptors (GPCRs), a therapeutically central class of membrane proteins whose signaling is governed by ligand-induced conformational shifts. Designing peptides that selectively bias a GPCR toward activation or inhibition is a long-standing challenge because affinity and functional direction are decoupled: a high-affinity binder can be silent, agonistic, or antagonistic. TD3B addresses this by coupling a discrete-diffusion sequence generator with a learned signal that scores the intended directional effect.

Architecturally, TD3B fine-tunes a pretrained masked diffusion language model (MDLM) backbone using an amortized training scheme that combines a Direction Oracle with a binding-affinity gate. The result is a single fixed checkpoint that runs inference against new protein targets without re-training, while optional per-target fine-tuning remains available for users who want to specialize the model further.

#Key Features

  • Direction-conditioned generation: Peptides are generated to produce a specified agonist or antagonist effect, not just to bind, addressing the decoupling between binding affinity and functional outcome.
  • Direction Oracle plus affinity gate: An amortized training objective pairs a learned Direction Oracle with a binding-affinity gate, steering diffusion sampling toward sequences that are both high-affinity and directionally correct.
  • Sequence-based, structure-free design: The model operates directly on peptide sequence using a discrete-diffusion backbone, avoiding the cost and brittleness of explicit structure prediction in the generation loop.
  • Zero-shot transfer to new targets: A single fixed checkpoint (td3b.ckpt) runs inference on previously unseen protein targets without re-training; multi-target fine-tuning is optional.
  • Released checkpoints and demo: Three checkpoints — the fine-tuned td3b.ckpt, the pretrained.ckpt MDLM backbone, and the direction_oracle.pt scorer — are distributed alongside an inference.py script and a Colab demo.
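
To make the decoupling of affinity and direction concrete, the following minimal Python sketch credits a candidate for its predicted direction only when an affinity gate is open. It is an illustration under stated assumptions, not the released training objective: the `Candidate` fields, the signed direction score in [-1, 1], the hard threshold of 0.5, and the example sequences are all hypothetical, and the released materials summarized here do not specify how the actual gate or oracle output is parameterized.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    sequence: str     # placeholder peptide, not a real design
    affinity: float   # hypothetical predicted binding score in [0, 1]
    direction: float  # hypothetical oracle output: +1 agonist-like, -1 antagonist-like


def gated_score(c, want="agonist", affinity_threshold=0.5):
    """Credit the desired direction only for candidates that pass the affinity gate."""
    if c.affinity < affinity_threshold:
        return 0.0  # gate closed: direction is ignored for predicted non-binders
    sign = 1.0 if want == "agonist" else -1.0
    return max(0.0, sign * c.direction)  # wrong-direction binders also score 0


candidates = [
    Candidate("GSWKLPA", affinity=0.90, direction=0.8),   # tight, agonist-like
    Candidate("RRFDEYG", affinity=0.95, direction=0.0),   # tight but silent
    Candidate("AHLLQTV", affinity=0.20, direction=0.9),   # right direction, weak binder
    Candidate("MKPEWNC", affinity=0.80, direction=-0.7),  # tight, antagonist-like
]

best_agonist = max(candidates, key=gated_score)
best_antagonist = max(candidates, key=lambda c: gated_score(c, want="antagonist"))
print(best_agonist.sequence, best_antagonist.sequence)
```

In this toy ranking the candidate with the highest affinity is selected for neither direction because it is functionally silent, which is the failure mode that affinity-only optimization cannot detect.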

#Technical Details

TD3B builds on a masked diffusion language model (MDLM) backbone, a discrete-diffusion architecture that generates sequences by iteratively unmasking tokens under a learned reverse process. The pretrained backbone is provided as pretrained.ckpt; the specific training corpus used to pretrain this MDLM is not stated in the released materials. Fine-tuning is performed via amortized training that introduces transition-directed guidance: a Direction Oracle (direction_oracle.pt) predicts whether a candidate peptide drives the target toward the desired active or inactive state, and a binding-affinity gate filters for sequences that also bind, jointly shaping the diffusion trajectory. This produces the fixed td3b.ckpt checkpoint used for inference. The released package includes an inference.py script and a Colab notebook for generating binders against user-supplied targets. Model weights are distributed on HuggingFace under a CC BY-NC-ND 4.0 license (non-commercial, no-derivatives).
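
The iterative unmasking performed by MDLM-style samplers can be illustrated with a toy loop. This is a minimal sketch, not the TD3B or MDLM implementation: `toy_denoiser` is a hypothetical stand-in that proposes residues uniformly at random, the `_` mask symbol and the even reveal schedule are assumptions made for the example, and no target conditioning or directional guidance is applied.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MASK = "_"  # placeholder mask token, chosen for this sketch only


def toy_denoiser(tokens, rng):
    """Hypothetical stand-in for a learned denoiser.

    Returns a proposed residue for every masked position. A trained model
    would predict these from the target and the already-unmasked tokens;
    here the proposal is uniform over the 20 standard amino acids.
    """
    return {i: rng.choice(AMINO_ACIDS) for i, t in enumerate(tokens) if t == MASK}


def sample_peptide(length=12, steps=4, seed=0):
    """Start fully masked and reveal a share of the positions at each step."""
    rng = random.Random(seed)
    tokens = [MASK] * length
    for step in range(steps):
        proposals = toy_denoiser(tokens, rng)
        if not proposals:
            break  # everything is already unmasked
        remaining_steps = steps - step
        # Ceiling division, so no position is still masked after the last step.
        n_reveal = -(-len(proposals) // remaining_steps)
        for i in rng.sample(sorted(proposals), n_reveal):
            tokens[i] = proposals[i]
    return "".join(tokens)


peptide = sample_peptide()
print(peptide)
```

With the default arguments the loop reveals three positions per step over four steps. In a guided sampler, the uniform proposal would be replaced by the fine-tuned model's predictions, which is where a directional signal could shape the trajectory.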

#Applications

TD3B is aimed at researchers designing functional peptide modulators of GPCRs, where the goal is not only to bind a receptor but to elicit a defined pharmacological direction — activation or inhibition. Because the fixed checkpoint generalizes to new targets without re-training, it can be applied to a range of receptors as a first-pass in silico design tool, with optional fine-tuning for targets that warrant specialization. The bundled inference script and Colab demo lower the barrier for computational biologists and protein engineers to generate candidate sequences for downstream synthesis and experimental validation.

#Impact

By explicitly conditioning generation on the direction of an allosteric response, TD3B reframes peptide binder design around functional outcome rather than affinity alone, a distinction that matters for therapeutic GPCR modulation. Its selection as an ICML 2026 Spotlight signals interest from the machine learning community in directional, function-aware generative design. Adoption is early and the broader functional generalization of designs remains to be established experimentally. Two limitations should be weighed: the non-commercial, no-derivatives license (CC BY-NC-ND 4.0) restricts commercial use and modification of the weights, and the training corpus of the underlying MDLM backbone is not disclosed, which limits full reproducibility and provenance assessment.

Citation

Preprint

DOI: 10.48550/arXiv.2605.09810


Openness

Unclassified
Restrictive license on core components

Tags

de_novo_design, diffusion, fine_tuning, generative, gpcr, peptide, peptide_design, protein_design

Resources

Research Paper, HuggingFace Model