Text-guided localization model that grounds natural-language functional descriptions to specific residue regions of a protein sequence.
ProLoc addresses a gap between protein function prediction and mechanistic interpretation. Most protein-text and protein function models capture global, protein-level associations: they can tell you that a protein has a given function, but not which residues are responsible for it. For researchers trying to understand a mechanism or to prioritize residues for experimental validation, that whole-protein answer is too coarse. ProLoc reframes the problem as a span-level grounding task: given a protein sequence and a free-text functional description, it identifies the specific residue regions—domains, motifs, or functional sites—that correspond to that description.
Developed by Peishuo Liu, Jiaxin Fan, Mianzhi Pan, and Jianbing Zhang at Nanjing University and released as a preprint in June 2026, ProLoc introduces both the task formulation, which the authors call text-guided protein functional region localization, and a model built to solve it. The work pairs a curated benchmark derived from InterPro annotations with a text-conditioned localization model that combines a protein language model and a biomedical text encoder.
The framing borrows the notion of visual grounding from vision-language research and applies it to proteins, treating the residue sequence as the medium to be localized within and the functional description as the query. This makes ProLoc useful as a residue-level annotation and hypothesis-generation tool rather than a global classifier.
ProLoc is a text-conditioned localization model built on a frozen-vocabulary pairing of ESM2-650M, a 650-million-parameter protein language model, and PubMedBERT, a biomedical-domain text encoder. It performs direct residue-level localization and includes an anchor-free span proposal mechanism for recovering multiple functional regions. Training and evaluation use a benchmark constructed from InterPro annotations covering both domain-level and functional-site descriptions, with sequence-similarity-aware splits designed to test generalization to dissimilar sequences. On the held-out test set, the direct output gives the strongest single-region localization, 0.7730 IoU@1, while the anchor-free proposal output improves visible multi-site (VM) recovery, reaching 0.9671 VM R@10 IoU50 and 0.9489 VM All-Hit@50. The authors report that ProLoc substantially outperforms window-based adaptations of representative protein and protein-text models on the same benchmark.
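The metric names above are not defined on this page, so the following is only a minimal sketch of how span-level scores of this kind are commonly computed, not the authors' evaluation code. It assumes that spans are inclusive residue ranges, that IoU@1 scores the top-ranked prediction against its best-matching annotated region, that R@10 IoU50 is the fraction of annotated regions matched by any of the top 10 proposals at IoU of at least 0.5, and that All-Hit means every annotated region is matched. The function names (`span_iou`, `iou_at_1`, `recall_at_k`, `all_hit`) and the toy spans are hypothetical.

```python
from typing import List, Tuple

# A span is an inclusive (start, end) pair of residue positions.
Span = Tuple[int, int]


def span_iou(a: Span, b: Span) -> float:
    """Intersection over union of two inclusive residue spans."""
    inter = min(a[1], b[1]) - max(a[0], b[0]) + 1
    if inter <= 0:
        return 0.0
    union = (a[1] - a[0] + 1) + (b[1] - b[0] + 1) - inter
    return inter / union


def iou_at_1(ranked: List[Span], truths: List[Span]) -> float:
    """IoU of the top-ranked prediction with its best-matching true span
    (assumed reading of IoU@1)."""
    if not ranked or not truths:
        return 0.0
    return max(span_iou(ranked[0], t) for t in truths)


def recall_at_k(ranked: List[Span], truths: List[Span],
                k: int = 10, thresh: float = 0.5) -> float:
    """Fraction of true spans matched by any top-k prediction with
    IoU >= thresh (assumed reading of R@10 IoU50)."""
    if not truths:
        return 0.0
    top = ranked[:k]
    hits = sum(any(span_iou(p, t) >= thresh for p in top) for t in truths)
    return hits / len(truths)


def all_hit(ranked: List[Span], truths: List[Span],
            k: int = 10, thresh: float = 0.5) -> bool:
    """True when every true span is matched within the top-k predictions
    (assumed reading of All-Hit)."""
    top = ranked[:k]
    return bool(truths) and all(
        any(span_iou(p, t) >= thresh for p in top) for t in truths
    )


# Toy example: two annotated regions and three ranked proposals.
truths = [(10, 50), (120, 160)]
ranked = [(12, 48), (200, 240), (118, 170)]

top1 = iou_at_1(ranked, truths)              # 37 / 41
recall10 = recall_at_k(ranked, truths, k=10)
recall2 = recall_at_k(ranked, truths, k=2)
```

In this toy case the top proposal overlaps the first region closely, but the second region is only recovered once the third proposal is included. This is the kind of gap between single-region and multi-site scores that the two output modes are reported to trade off.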
ProLoc supports residue-level functional annotation of proteins, particularly for newly sequenced or under-characterized proteins where a functional description is available but the responsible regions are unknown. By localizing text descriptions to specific spans, it helps researchers prioritize residues for experimental validation, interpret the structural or mechanistic basis of a function, and pinpoint domains, motifs, and functional sites. The open-vocabulary text query makes it adaptable across the breadth of InterPro annotations without retraining for each function of interest.
ProLoc defines text-guided protein functional region localization as a distinct span-level grounding task and supplies both a benchmark and a baseline model for it, establishing an evaluation framework that future protein-text models can be measured against. Its emphasis on residue-level grounding rather than global classification moves protein-text modeling toward mechanistic interpretability and experimental prioritization. As of mid-2026, the work is a preprint awaiting peer review. No source code, pretrained weights, or hosted API has been released, and the work is distributed under a restrictive (non-commercial) license; together these currently limit independent reproduction and downstream reuse.