A graph-transformer foundation model integrating 3 million protein pockets and 5 million molecules as E(3)-equivariant graphs for joint protein-ligand geometric representation learning.
MolX is a graph-transformer foundation model that learns joint geometric representations of protein binding pockets and small molecules. Posted to bioRxiv in early March 2026 by researchers at Monash University and collaborating institutions, MolX is trained on a corpus of 3 million protein pockets and 5 million molecules represented as E(3)-equivariant graphs, enabling rotation- and translation-invariant geometric reasoning.
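The preprint does not release code, but the invariance claim is easy to make concrete: features built from interatomic distances are unchanged by any rotation or translation of the input coordinates. A minimal NumPy sketch (the function names and toy coordinates are illustrative, not MolX's implementation):

```python
import numpy as np

def pairwise_distances(coords):
    # Pairwise Euclidean distances: an E(3)-invariant geometric feature.
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def random_rotation(rng):
    # QR decomposition of a random Gaussian matrix yields an orthogonal matrix.
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    q *= np.sign(np.diag(r))        # fix column signs for uniqueness
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1               # ensure det = +1 (a proper rotation)
    return q

rng = np.random.default_rng(0)
coords = rng.standard_normal((8, 3))            # toy "atom" coordinates
R, t = random_rotation(rng), rng.standard_normal(3)
transformed = coords @ R.T + t                  # rotate then translate
assert np.allclose(pairwise_distances(coords),
                   pairwise_distances(transformed))
```

Angle features between bonded triples are invariant for the same reason, which is why distance-and-angle graph attributes let a model reason about geometry without caring how the pocket is oriented in space.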
MolX achieves state-of-the-art results across eight downstream drug-discovery benchmarks spanning conventional structure-activity tasks (binding affinity, virtual screening), modality-specific tasks (PROTAC degradation activity, molecular glue design, antibody-drug conjugate optimization), and cross-domain transfer.
MolX uses a graph transformer with E(3)-equivariant attention layers. Inputs are heterogeneous graphs jointly representing protein pocket atoms and molecular atoms with distance-and-angle features. Pretraining objectives combine masked atom prediction with pocket-ligand contrastive matching. The bioRxiv preprint provides architectural specifications, training schedule, and full benchmark tables.
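The preprint's exact loss formulation is not reproduced here, but pocket-ligand contrastive matching objectives of this kind are typically InfoNCE-style: matched pocket/ligand embedding pairs are pulled together while in-batch mismatches serve as negatives. A minimal NumPy sketch under that assumption (the function name `pocket_ligand_nce` and the temperature `tau` are illustrative, not MolX's implementation):

```python
import numpy as np

def log_softmax(x, axis):
    # Numerically stable log-softmax.
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def pocket_ligand_nce(pocket_emb, ligand_emb, tau=0.1):
    """Symmetric InfoNCE over a batch of pocket/ligand embeddings.

    Row i of each array is treated as a matched pocket-ligand pair;
    every other row in the batch acts as an in-batch negative.
    """
    p = pocket_emb / np.linalg.norm(pocket_emb, axis=1, keepdims=True)
    l = ligand_emb / np.linalg.norm(ligand_emb, axis=1, keepdims=True)
    logits = p @ l.T / tau          # cosine similarities over temperature
    idx = np.arange(len(logits))
    loss_p2l = -log_softmax(logits, axis=1)[idx, idx].mean()  # pocket -> ligand
    loss_l2p = -log_softmax(logits, axis=0)[idx, idx].mean()  # ligand -> pocket
    return 0.5 * (loss_p2l + loss_l2p)

rng = np.random.default_rng(1)
emb = rng.standard_normal((16, 32))
# Nearly identical pairs should score a far lower loss than mismatched ones.
matched_loss = pocket_ligand_nce(emb, emb + 0.01 * rng.standard_normal((16, 32)))
shuffled_loss = pocket_ligand_nce(emb, np.roll(emb, 1, axis=0))
```

Combining such a matching term with masked atom prediction gives the model both a local (per-atom chemistry) and a global (pocket-ligand compatibility) training signal.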
The eight downstream benchmarks comprise PDBbind, LIT-PCBA, PROTAC-DB, the Molecular Glue Atlas, ADC-Bench, and three internal modality-specific tasks. Across all eight, MolX outperforms prior structure-aware foundation models, including Uni-Mol, GearNet, and ESM-GearNet.
MolX is positioned as a general-purpose representation backbone for teams working in early-stage drug discovery across multiple therapeutic modalities. The PROTAC and molecular-glue capabilities are particularly valuable given the relative scarcity of foundation models for these emerging modalities, and the cross-domain generalization reduces the need for per-target retraining when the model is applied to new programs.
MolX advances the state of the art in geometric foundation models for drug discovery by demonstrating that joint protein-pocket and small-molecule representation learning can deliver state-of-the-art results across a broad cross-modality benchmark sweep. The integration of multiple emerging modalities (PROTAC, molecular glue, ADC) into a single foundation model is unusual and provides a useful counterpoint to highly specialized per-modality tools.
Liu, J., et al. (2026) MolX: A Geometric Foundation Model for Protein–Ligand Modelling. bioRxiv.
DOI: 10.64898/2026.02.26.708362