ConGLUDe

Contrastive geometric model unifying structure- and ligand-based drug design for zero-shot virtual screening, target fishing, and pocket selection.

Released: January 2026

ConGLUDe (Contrastive Geometric Learning Unlocks Unified Structure- and Ligand-Based Drug Design) is a single model that bridges two historically separate paradigms in computational drug discovery. Structure-based methods reason about a protein's three-dimensional pocket to find or design ligands that fit it, while ligand-based methods reason from known active molecules without an explicit structure. ConGLUDe learns one shared representation in which proteins, binding pockets, and ligands are embedded together, so the same model can be queried from either direction.

Developed at the Institute for Machine Learning at Johannes Kepler University Linz (the group of Günter Klambauer, with co-authors including industry researcher Daniel Kuhn) and posted to arXiv in January 2026, ConGLUDe is trained with a contrastive objective on both protein-ligand complexes and large-scale bioactivity data. This combination lets it align the geometry of a binding site with the chemistry of its ligands while also learning from the much larger body of measured activity values that lack structural information.

The authors position ConGLUDe as a step toward general-purpose foundation models for drug discovery: rather than building a bespoke model per task, a single trained checkpoint performs several distinct tasks zero-shot, without task-specific fine-tuning.

Key Features

Unified structure- and ligand-based design: A single contrastive model couples a protein/pocket encoder with a fast ligand encoder in a shared embedding space, so screening can start from either a structure or a set of known actives.
Zero-shot multi-task capability: From one checkpoint, ConGLUDe performs virtual screening, target identification ("target fishing"), and ligand-conditioned pocket selection without retraining for each task.
Competitive zero-shot virtual screening: The model's zero-shot screening is competitive with task-specific approaches, while it substantially outperforms baselines on target fishing.
Pocket selection without predefined pockets: ConGLUDe reaches state-of-the-art ligand-conditioned pocket selection and does not require binding pockets to be defined in advance, removing a common preprocessing dependency.

Technical Details

ConGLUDe is a contrastive geometric learning model. It comprises a protein encoder that captures the geometry of a target and its candidate binding sites, and a deliberately fast ligand encoder that captures molecular structure. Both are trained to project into a common embedding space.

The contrastive objective pulls together embeddings of cognate protein-ligand pairs (and protein/pocket-ligand pairs) while pushing apart non-binders. Training jointly leverages two complementary data sources: protein-ligand complex structures, which supply geometric grounding, and large-scale bioactivity datasets, which supply broad coverage of measured activity without requiring structures.

Because retrieval and ranking reduce to nearest-neighbor operations in the shared space, the same trained model addresses screening (rank ligands for a target), target fishing (rank targets for a ligand), and pocket selection (rank candidate pockets for a ligand) with no task-specific fine-tuning. Reported results include competitive zero-shot virtual screening, substantial gains on target fishing, and state-of-the-art ligand-conditioned pocket selection. As of the January 2026 preprint, no public code or model weights were available.
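
To make the idea of a contrastive objective over protein and ligand embeddings concrete, here is a minimal sketch of a symmetric InfoNCE-style loss. This is an illustration only: the page does not state the exact loss ConGLUDe uses, and the function names, the temperature value, and the use of in-batch negatives are all assumptions rather than details taken from the preprint.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(protein_embs, ligand_embs, temperature=0.1):
    """Symmetric InfoNCE loss (hypothetical stand-in for the paper's objective).

    Pair i is (protein_embs[i], ligand_embs[i]); every other ligand in the
    batch acts as a negative for protein i, and vice versa.
    """
    n = len(protein_embs)
    # Temperature-scaled similarity matrix: rows are proteins, columns ligands.
    sims = [[cosine(p, l) / temperature for l in ligand_embs] for p in protein_embs]

    def mean_cross_entropy(rows):
        # Cross-entropy with the matching pair (the diagonal) as the target.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)  # subtract the max for a stable log-sum-exp
            log_denominator = m + math.log(sum(math.exp(s - m) for s in row))
            total += log_denominator - row[i]
        return total / n

    columns = [list(col) for col in zip(*sims)]  # ligand-to-protein direction
    return 0.5 * (mean_cross_entropy(sims) + mean_cross_entropy(columns))
```

The loss is small when each protein is most similar to its own ligand and large when matching pairs are far apart, which is the "pull together, push apart" behaviour described above.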

Applications

ConGLUDe is aimed at early-stage drug discovery teams who must repeatedly ask related questions about proteins and ligands. Medicinal chemists can run virtual screens against a target of interest; teams investigating an active compound's mechanism or off-target liabilities can use target fishing to rank plausible protein targets; and structural and computational chemists can use ligand-conditioned pocket selection to identify the most relevant binding site without manually defining pockets. Because all three capabilities come from a single checkpoint operating zero-shot, the model is well suited to exploratory settings where building and maintaining separate task-specific pipelines would be costly.
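
The three use cases differ only in what is used as the query and what is ranked. The sketch below shows this with a single similarity-ranking helper, assuming embeddings have already been computed. The toy three-dimensional vectors, the entity names, and the `rank` helper are hypothetical and do not reflect any released ConGLUDe interface.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def rank(query, candidates):
    """Return candidate names sorted by descending similarity to the query."""
    scored = [(cosine(query, emb), name) for name, emb in candidates.items()]
    return [name for _, name in sorted(scored, key=lambda item: item[0], reverse=True)]


# Hypothetical precomputed embeddings in one shared space.
ligands = {"ligand_A": [0.9, 0.1, 0.0], "ligand_B": [0.1, 0.9, 0.1]}
targets = {"target_X": [1.0, 0.0, 0.0], "target_Y": [0.0, 1.0, 0.0]}
pockets_of_X = {"pocket_1": [0.8, 0.2, 0.0], "pocket_2": [0.0, 0.1, 0.9]}

# Virtual screening: rank ligands for a target.
screen = rank(targets["target_X"], ligands)
# Target fishing: rank targets for a ligand.
fished = rank(ligands["ligand_B"], targets)
# Pocket selection: rank candidate pockets of one protein for a ligand.
pocket = rank(ligands["ligand_A"], pockets_of_X)
```

In practice the candidate sets would be large libraries and the ranking would typically use an approximate nearest-neighbor index rather than an exhaustive sort.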

Impact

ConGLUDe's central contribution is methodological: it shows that a single contrastive geometric model can unify structure-based and ligand-based design and serve multiple discovery tasks zero-shot, lending weight to the broader push toward general-purpose foundation models in drug discovery. By learning from both structural complexes and abundant bioactivity data, it offers a pragmatic way to combine the precision of geometry with the scale of activity measurements. The key limitations are typical of a new preprint: the reported results are retrospective and benchmark-based rather than prospectively validated in the lab, and at the time of release no public code or weights were available, which constrains immediate reproduction and adoption.

Citation

Schneckenreiter, L., et al. (2026). Contrastive Geometric Learning Unlocks Unified Structure- and Ligand-Based Drug Design. arXiv preprint.

DOI: 10.48550/arXiv.2601.09693

Citations

Total citations: 0
Influential: 0
References: 98

Openness

bio.rodeo openness: Closed (low usability and reproducibility)
Score: 8 (Closed)
Usability (can I run it?): 7
Reproducibility (can I retrain it?): 10

Model Openness Framework: Unclassified (restrictive license on core components)

Resources

Research Paper
