University of Chicago / Broad Institute / Harvard Medical School
Transformer framework that models protein-protein interactions at residue resolution, generalizing zero-shot to unseen MHC alleles and sequence-neutral PTMs from one fixed checkpoint.
ReCLIP ("Residue-level Context for modeling protein-protein Interactions Predictions") is a transformer-based framework for predicting protein-protein interactions (PPIs) at the resolution of individual residues. Most protein language model (PLM) approaches to PPIs collapse per-residue features into a single whole-protein representation before scoring an interaction, which sacrifices both spatial resolution and interpretability. ReCLIP instead keeps the analysis residue-centered: it combines intra-protein residue neighborhoods with residue-conditioned representations of the interaction partner, learning interaction-specific features that remain anchored to specific positions in the sequence.
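The contrast between pooled and residue-centered representations can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the preprint: the random stand-in features, the sequence-window definition of a neighborhood, the scaled dot-product attention used for partner conditioning, and the helper names `mean_pool`, `neighborhood`, `partner_context`, and `residue_features`.

```python
import math
import random

def mean_pool(features):
    """Collapse per-residue features (L x d) into a single d-dimensional vector."""
    d = len(features[0])
    return [sum(row[j] for row in features) / len(features) for j in range(d)]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def neighborhood(features, i, window=2):
    """Average features in a sequence window around residue i.
    Hypothetical neighborhood definition chosen for this sketch."""
    lo, hi = max(0, i - window), min(len(features), i + window + 1)
    return mean_pool(features[lo:hi])

def partner_context(query, partner):
    """Attention-weighted summary of partner residues, conditioned on one query vector."""
    d = len(query)
    weights = softmax([dot(query, p) / math.sqrt(d) for p in partner])
    context = [sum(w * p[j] for w, p in zip(weights, partner)) for j in range(d)]
    return context, weights

def residue_features(protein, partner, window=2):
    """One feature vector per residue: local neighborhood concatenated with
    a partner summary conditioned on that neighborhood."""
    out = []
    for i in range(len(protein)):
        local = neighborhood(protein, i, window)
        context, _ = partner_context(local, partner)
        out.append(local + context)  # list concatenation: d + d values
    return out

# Random stand-ins for per-residue PLM features (6 and 5 residues, d = 4).
rng = random.Random(0)
protein_a = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(6)]
protein_b = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(5)]

pooled = mean_pool(protein_a)                      # position information is lost
resolved = residue_features(protein_a, protein_b)  # one vector per residue

print(len(pooled))                      # 4
print(len(resolved), len(resolved[0]))  # 6 8
```

The pooled vector has no positional axis, whereas the residue-level output keeps one row per position, which is what allows a downstream score to be attributed to individual residues.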
The model was developed by researchers at the University of Chicago, the Broad Institute, and Harvard Medical School, and posted to bioRxiv in June 2026. Its central claim is generality: rather than serving as a narrow single-task PPI classifier, ReCLIP uses a single fixed pretrained checkpoint to support a multi-task profile spanning mutation-perturbation analysis, post-translational modification (PTM) effects, peptide-MHC binding, and disease-variant interpretation. This distinguishes it from contemporaneous PPI tools such as CLIPepPI, PPIFlow, MOPPIT, and FlashPPI, which target narrower task families.
Because ReCLIP reasons about which residues mediate an interaction, the same architecture that scores a wild-type complex can quantify how a point mutation, a chemical modification, or an unseen binding partner shifts that interaction, without retraining for each new setting.
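The idea of reusing one fixed scorer for perturbation analysis can be sketched as a difference between the mutant and wild-type scores. The sketch below assumes a hypothetical scoring interface; `toy_score` is a deliberately crude stand-in (it counts opposite-charge residue pairs) and has nothing to do with ReCLIP's actual model, and `apply_point_mutation` and `perturbation` are illustrative helper names.

```python
POSITIVE = set("KR")  # positively charged residues
NEGATIVE = set("DE")  # negatively charged residues

def toy_score(seq_a, seq_b):
    """Stand-in scorer (NOT ReCLIP): fraction of cross-protein residue
    pairs that carry opposite charges."""
    pairs = sum(
        1
        for a in seq_a
        for b in seq_b
        if (a in POSITIVE and b in NEGATIVE) or (a in NEGATIVE and b in POSITIVE)
    )
    return pairs / (len(seq_a) * len(seq_b))

def apply_point_mutation(sequence, position, wild_type, mutant):
    """Return the sequence with the residue at a 1-based position replaced.
    Raises ValueError if the position or the stated wild-type residue is wrong."""
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, "
            f"found {sequence[position - 1]}"
        )
    return sequence[: position - 1] + mutant + sequence[position:]

def perturbation(score_fn, seq_a, seq_b, position, wild_type, mutant):
    """Score change caused by a point mutation in seq_a, using the same
    fixed scorer for both the wild-type and the mutant complex."""
    mutated = apply_point_mutation(seq_a, position, wild_type, mutant)
    return score_fn(mutated, seq_b) - score_fn(seq_a, seq_b)

# K2E removes the only positive charge facing an all-negative partner.
delta = perturbation(toy_score, "AKDE", "DEDE", 2, "K", "E")
print(delta)  # -0.25
```

Because `perturbation` takes the scorer as an argument, any fixed model exposing the same two-sequence interface could be substituted without changing the surrounding logic.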
ReCLIP is built on a transformer architecture that operates over per-residue PLM features rather than pooled whole-protein embeddings. For a given residue, it constructs an intra-protein neighborhood representation and conditions it on representations of the interaction partner, yielding interaction-specific, position-resolved features. The authors report AUROC = 0.973 for predicting mutation-induced interaction perturbations, AUROC = 0.822 for generalization to PTMs that leave the sequence unchanged, and AUROC up to 0.972 for zero-shot peptide-MHC binding prediction across unseen alleles. The preprint is released under a CC BY 4.0 license; at the time of writing no public code or model weights had been confirmed, and no model card or data card was located.
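For readers less familiar with the AUROC figures quoted above, the metric is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. A minimal sketch of that definition follows; the labels and scores are made-up toy values, not data from the preprint, and the function name `auroc` is illustrative.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative scores.
    labels: iterable of 0/1; scores: iterable of floats of the same length."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))

# Toy example: 3 positives, 2 negatives; 4 of the 6 pairs are ranked correctly.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.7, 0.3, 0.6]
value = auroc(labels, scores)
print(round(value, 3))  # 0.667
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation. The quadratic pairwise loop is used here for clarity only; rank-based implementations are preferable for large evaluation sets.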
ReCLIP is aimed at researchers studying interaction specificity and its disruption. Immunologists can apply it to peptide-MHC binding for alleles lacking dedicated training data, supporting epitope and immunogenicity analysis. Structural and functional biologists can use its residue-resolved scores to map which interface positions drive binding and how mutations or PTMs perturb them. Applied to clinically annotated genetic variants, ReCLIP links pathogenic variants to specific molecular interaction contexts, offering a route to mechanistic interpretation of disease-associated mutations within PPI networks.
ReCLIP reframes PPI modeling as a residue-centered problem rather than a whole-protein classification problem, arguing that residue-level context is a general substrate for diverse interaction tasks. By reporting that a single fixed checkpoint transfers zero-shot to unseen MHC alleles and to sequence-neutral PTMs, the authors position residue-conditioned representations as a unifying alternative to the proliferation of narrow, task-specific PPI predictors. As a 2026 preprint without confirmed public code or weights, its downstream adoption remains to be established, but its interpretability framing and multi-task generalization point toward more mechanistic, variant-aware models of protein interaction networks.
Zhang, Z., et al. (2026). Learning residue-level context for modeling protein-protein interactions. bioRxiv.
DOI: 10.64898/2026.06.01.729118