bio.rodeo

The authoritative source for evaluating biological foundation models. No hype, just honest analysis.

© 2026 Pulsatance. All rights reserved.

MolX

Monash University

Graph Transformer foundation model integrating 3 million protein pockets and 5 million molecules as E(3)-equivariant graphs for joint protein-ligand geometric representation learning.

Released: March 2026

MolX is a graph-transformer foundation model that learns joint geometric representations of protein binding pockets and small molecules. The preprint was posted to bioRxiv in early March 2026 by researchers at Monash University and collaborating institutions. MolX is trained on a corpus of 3 million protein pockets and 5 million molecules represented as E(3)-equivariant graphs, so its representations behave consistently when the input coordinates are rotated, translated, or reflected.

The authors report state-of-the-art results on eight downstream drug-discovery benchmarks. These span conventional structure-activity tasks (binding affinity, virtual screening), modality-specific tasks (PROTAC degradation activity, molecular glue design, antibody-drug conjugate optimization), and cross-domain transfer.

#Key Features

  • E(3)-equivariant geometry: Joint protein-ligand representation respects rotational and translational symmetry, removing the need for arbitrary frame choices.
  • Massive pretraining corpus: 3M protein pockets and 5M molecules drawn from PDB, CSD, and curated commercial sources.
  • State-of-the-art on 8 benchmarks: Reported top results on conventional binding-affinity prediction, virtual-screening enrichment, PROTAC degradation prediction, molecular-glue activity, ADC linker selection, and three additional drug-discovery tasks.
  • Cross-modality applicability: Single set of weights addresses small-molecule, peptide, PROTAC, molecular-glue, and ADC tasks, supporting modality-agnostic drug discovery.
  • Strong cross-domain generalization: Transfers well to held-out target classes and chemical scaffolds without per-task fine-tuning.
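The symmetry property in the first bullet can be illustrated with a minimal sketch. It is not MolX code: the coordinates and helper names below are hypothetical, and the example only checks the simplest case, that interatomic distances (a common geometric input feature) are unchanged by any rigid motion or reflection of the whole complex.

```python
import math

def rotate_z(p, theta):
    """Rotate a 3D point about the z axis by theta radians."""
    x, y, z = p
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y, z)

def reflect_x(p):
    """Mirror a 3D point through the yz-plane (E(3) includes reflections)."""
    x, y, z = p
    return (-x, y, z)

def translate(p, t):
    """Shift a 3D point by the vector t."""
    return tuple(a + b for a, b in zip(p, t))

def pairwise_distances(points):
    """Full matrix of Euclidean distances between all points."""
    return [[math.dist(a, b) for b in points] for a in points]

# Hypothetical toy coordinates standing in for pocket and ligand atoms.
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 2.0, 1.0), (3.0, 1.0, -1.0)]

# Apply an arbitrary rotation, reflection, and translation to every atom.
moved = [translate(reflect_x(rotate_z(p, 0.7)), (5.0, -2.0, 3.0)) for p in atoms]

before = pairwise_distances(atoms)
after = pairwise_distances(moved)

# Largest change in any pairwise distance; only floating-point noise remains.
max_diff = max(abs(a - b) for ra, rb in zip(before, after) for a, b in zip(ra, rb))
print(max_diff)
```

A model built only on such invariant features needs no canonical reference frame, which is the practical benefit the bullet describes.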

#Technical Details

MolX uses a graph transformer with E(3)-equivariant attention layers. Inputs are heterogeneous graphs jointly representing protein pocket atoms and molecular atoms with distance-and-angle features. Pretraining objectives combine masked atom prediction with pocket-ligand contrastive matching. The bioRxiv preprint provides architectural specifications, training schedule, and full benchmark tables.
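To make the pocket-ligand contrastive matching objective more concrete, here is a minimal sketch of an InfoNCE-style loss, a common choice for this kind of objective. The source does not state which contrastive loss MolX uses, so the function name `info_nce`, the temperature value, and the toy embeddings are all assumptions for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def info_nce(pocket_embs, ligand_embs, temperature=0.1):
    """One-directional InfoNCE: pocket i should match ligand i.

    All other ligands in the batch act as negatives. The temperature
    is an assumed hyperparameter, not a value from the preprint.
    """
    pockets = [normalize(p) for p in pocket_embs]
    ligands = [normalize(l) for l in ligand_embs]
    total = 0.0
    for i, p in enumerate(pockets):
        logits = [dot(p, l) / temperature for l in ligands]
        # Numerically stable log-sum-exp over all candidate ligands.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        # Cross-entropy with the matched ligand as the correct class.
        total += log_z - logits[i]
    return total / len(pockets)

# Hypothetical 2D embeddings: each pocket is paired with the ligand at the same index.
pocket_embs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aligned_ligands = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
shuffled_ligands = [[0.0, 1.0], [-1.0, 0.0], [1.0, 0.0]]

aligned_loss = info_nce(pocket_embs, aligned_ligands)
shuffled_loss = info_nce(pocket_embs, shuffled_ligands)
print(aligned_loss, shuffled_loss)
```

The loss is low when each pocket embedding sits closest to its own ligand and high when the pairing is scrambled, which is the signal that pulls matched pocket-ligand pairs together during pretraining.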

The eight downstream benchmarks are PDBbind, LIT-PCBA, PROTAC-DB, Molecular Glue Atlas, ADC-Bench, and three internal modality-specific tasks. According to the preprint, MolX outperforms prior structure-aware foundation models, including Uni-Mol, GearNet, and ESM-GearNet, across these benchmarks.
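Virtual-screening benchmarks are often summarized with an enrichment factor: how much more concentrated the true actives are in the top-ranked fraction than in the library as a whole. The sketch below shows that calculation under assumed inputs. The function name, scores, and labels are hypothetical, and the source does not say which screening metric or cutoff the MolX evaluation uses.

```python
import math

def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor at a given top fraction of a ranked library.

    scores: predicted scores, higher means more likely active.
    labels: 1 for a known active, 0 for an inactive or decoy.
    """
    n = len(scores)
    total_actives = sum(labels)
    if n == 0 or total_actives == 0:
        raise ValueError("need at least one compound and one active")
    # Number of compounds in the top fraction, at least one.
    n_top = max(1, math.ceil(n * fraction))
    # Rank compounds from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_top = sum(label for _, label in ranked[:n_top])
    # Hit rate in the top slice divided by the hit rate of the whole library.
    return (hits_top / n_top) / (total_actives / n)

# Hypothetical library of 10 compounds with 2 actives, both ranked at the top.
scores = [0.95, 0.90, 0.40, 0.35, 0.30, 0.25, 0.20, 0.15, 0.10, 0.05]
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

ef_top20 = enrichment_factor(scores, labels, fraction=0.2)
print(ef_top20)
```

An enrichment factor of 1 corresponds to random ranking, so values well above 1 indicate that a model is pulling actives toward the top of the list.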

#Applications

MolX is positioned as a general-purpose representation backbone for early-stage drug-discovery teams working across multiple therapeutic modalities. The PROTAC and molecular-glue capabilities are particularly valuable because few foundation models cover these emerging modalities. The reported cross-domain generalization reduces the need for per-target retraining when the model is applied to new programs.

#Impact

MolX advances geometric foundation models for drug discovery by showing that joint representation learning over protein pockets and small molecules can reach state-of-the-art results across a broad, cross-modality set of benchmarks. Covering several emerging modalities (PROTAC, molecular glue, ADC) in a single foundation model is unusual and provides a useful counterpoint to highly specialized per-modality tools.

Citation

MolX: A Geometric Foundation Model for Protein–Ligand Modelling

Liu, J., et al. (2026) MolX: A Geometric Foundation Model for Protein–Ligand Modelling. bioRxiv.

DOI: 10.64898/2026.02.26.708362

Citations

Total citations: 0
Influential: 0
References: 68


Openness

bio.rodeo openness: Closed (low usability and reproducibility)
Score: 11 (Closed)
Usability (can I run it?): 7
Reproducibility (can I retrain it?): 14
Model Openness Framework: Unclassified
Restrictive license on core components

Tags

antibody_drug_conjugate_design, binding_affinity_prediction, drug_discovery, equivariant, foundation_model, graph_neural_network, molecular_glue_design, protac_design, protein_ligand_complex, protein_pocket, self_supervised, small_molecule, transformer

Resources

Research Paper