bio.rodeo
Protein

GearNet

Mila / IBM Research

A geometric relational graph neural network that learns protein structure representations via geometry-aware message passing and self-supervised pretraining.

Released: 2023

Overview

GearNet (GeomEtry-Aware Relational Graph Neural Network) is a structure-based protein encoder developed jointly by Mila Quebec AI Institute and IBM Research that learns robust representations directly from 3D protein structures. Published at ICLR 2023, it addresses a fundamental challenge in protein machine learning: how to encode the rich geometric information present in experimental or predicted structures in a way that is invariant to 3D transformations — translations, rotations, and reflections — while still capturing both spatial and chemical context.

Most protein language models operate on amino acid sequences alone, treating a protein as a one-dimensional string. GearNet takes a complementary approach by constructing a multi-relational graph from the protein's 3D coordinates. Each residue is a node, and edges are drawn according to multiple criteria: sequential connectivity (peptide bonds), spatial proximity (radius graphs), and K-nearest-neighbor relationships. Relational graph convolution then performs message passing across all edge types simultaneously, allowing the model to aggregate information from a residue's geometric neighborhood rather than just its sequence neighbors.
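The multi-relational construction described above can be sketched in a few lines. The function below is an illustrative NumPy sketch; the function name, radius cutoff, and k value are placeholders, not the paper's exact hyperparameters:

```python
import numpy as np

def build_multirelational_edges(coords, radius=10.0, k=10, seq_window=2):
    """Sketch of GearNet-style edge construction from C-alpha coordinates.

    Returns a dict mapping edge type -> list of (src, dst) residue pairs,
    following the three criteria in the text: sequential connectivity,
    spatial radius graph, and k-nearest neighbors.
    """
    n = len(coords)
    # Full pairwise Euclidean distance matrix between residues.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)

    edges = {"sequential": [], "radius": [], "knn": []}
    for i in range(n):
        # Sequential edges: residues within seq_window positions on the chain.
        for j in range(max(0, i - seq_window), min(n, i + seq_window + 1)):
            if j != i:
                edges["sequential"].append((i, j))
        # Radius edges: residues within a Euclidean cutoff.
        for j in np.nonzero(dist[i] < radius)[0]:
            if j != i:
                edges["radius"].append((i, int(j)))
        # kNN edges: the k spatially closest residues (index 0 is i itself).
        for j in np.argsort(dist[i])[1:k + 1]:
            edges["knn"].append((i, int(j)))
    return edges
```

In the real model each relation gets its own learned transformation during message passing, which is why the edge types are kept separate rather than merged into one adjacency structure.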

The model can be extended to GearNet-Edge, which adds an edge message passing layer that updates edge representations alongside node representations. This is analogous to the pair representation updates in AlphaFold 2's Evoformer, adapted as a sparse graph operation. Crucially, GearNet introduced five geometric self-supervised pretraining tasks — multiview contrast, residue type prediction, distance prediction, angle prediction, and dihedral prediction — demonstrating that structure-based pretraining can match or exceed sequence-based methods while using orders of magnitude less pretraining data.
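Of the five pretraining tasks, multiview contrast is built on a standard InfoNCE objective over paired views of the same protein. A minimal sketch, assuming precomputed embeddings for the two views (the paper's actual cropping and augmentation pipeline is richer than this):

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.07):
    """InfoNCE loss over a batch of paired view embeddings.

    z1, z2: (batch, dim) arrays; z1[i] and z2[i] embed two views of
    protein i, so the diagonal of the similarity matrix holds positives.
    """
    # L2-normalize so dot products are cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature               # (batch, batch) similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Cross-entropy with the matching view as the positive class.
    return -np.mean(np.diag(log_probs))
```

The loss is low when each protein's two views are closer to each other than to other proteins in the batch, which is exactly the property that makes the learned encoder transferable.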

Key Features

  • Geometric Invariance: Derives node and edge features from backbone geometry (distances, angles, dihedral angles) that are invariant to rigid-body transformations, ensuring learned representations reflect true structural relationships.
  • Multi-Relational Graph Construction: Builds protein graphs with multiple edge types — sequential, radius graph, and K-nearest-neighbor — enabling the model to capture diverse spatial and chemical interactions within a single unified framework.
  • Edge Message Passing: The GearNet-Edge variant updates edge representations alongside node representations, providing richer structural encoding analogous to pair-representation updates in larger structure prediction models.
  • Five Geometric Pretraining Tasks: Self-supervised objectives operating directly on structure (contrastive multiview, residue type masking, pairwise distance, angle, and dihedral prediction) provide strong structural priors that transfer to diverse downstream tasks.
  • Data Efficiency: Pretrains on AlphaFold Database structures — a far smaller dataset than the billions of sequences used by protein language models — while achieving competitive downstream performance.
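The invariance property in the first bullet can be checked directly: distances and dihedral angles computed from coordinates do not change under rigid-body motion. Below is a small sketch of two such features using standard formulas (illustrative, not GearNet's exact feature set):

```python
import numpy as np

def pairwise_distance(a, b):
    """Euclidean distance between two atom positions."""
    return np.linalg.norm(a - b)

def dihedral(p0, p1, p2, p3):
    """Dihedral (torsion) angle in radians defined by four points,
    via the standard atan2 formulation."""
    b1, b2, b3 = p1 - p0, p2 - p1, p3 - p2
    n1, n2 = np.cross(b1, b2), np.cross(b2, b3)   # plane normals
    m = np.cross(n1, b2 / np.linalg.norm(b2))
    return np.arctan2(m @ n2, n1 @ n2)
```

Because these quantities are unchanged when every coordinate is rotated and translated by the same rigid motion, a model built on them needs no rotational data augmentation.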

Technical Details

GearNet constructs protein graphs with multiple edge types and applies relational graph convolutional layers that perform edge-type-specific message passing and aggregation. Geometric features on nodes and edges are derived from backbone atom coordinates: pairwise distances, sequential distances along the chain, bond angles, and dihedral angles. These features are invariant to 3D rigid-body transformations by construction, so the model does not require data augmentation for rotational symmetry. The GearNet-Edge variant adds a message passing step that updates edge embeddings before node aggregation, introducing an additional layer of structural context. Both variants are trained using AlphaFold Database structures as the pretraining corpus, which is substantially smaller than the sequence databases used to pretrain ESM-2 or other large protein language models, yet yields competitive performance due to the information density of structural data.
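The edge-type-specific message passing can be sketched as an R-GCN-style layer with one weight matrix per relation. This toy NumPy version uses sum aggregation and a ReLU, omitting batch normalization and the other details of the actual model:

```python
import numpy as np

def relational_conv(h, edges_by_type, weights, w_self):
    """One relational graph convolution layer (sketch of edge-type-specific
    message passing, not the exact GearNet layer).

    h:             (n_nodes, d_in) node features
    edges_by_type: dict relation -> list of (src, dst) pairs
    weights:       dict relation -> (d_in, d_out) weight matrix
    w_self:        (d_in, d_out) self-connection weight matrix
    """
    out = h @ w_self  # self-connection term
    for rel, edges in edges_by_type.items():
        w = weights[rel]
        for src, dst in edges:
            # Each relation transforms messages with its own weights,
            # then contributions are summed at the destination node.
            out[dst] += h[src] @ w
    return np.maximum(out, 0.0)  # ReLU nonlinearity
```

Stacking several such layers lets a residue aggregate information from its full geometric neighborhood, across all edge types at once.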

Benchmark results reported at ICLR 2023 show GearNet-Edge with multiview contrastive pretraining achieving strong performance on standard benchmarks for enzyme commission (EC) number prediction, gene ontology (GO) term annotation, and protein fold classification. The model demonstrates consistent gains from pretraining over training from scratch, confirming that geometric self-supervised objectives capture transferable structural features.

Applications

GearNet is designed for downstream protein property prediction tasks that benefit from structural context. It performs well on enzyme commission number classification, gene ontology annotation, and protein fold classification — tasks where 3D architecture encodes functional information that sequence alone may not capture. The model is especially useful in low-data regimes, where the geometric pretraining provides strong structural priors that reduce the need for large labeled training sets. GearNet can also be combined with sequence-based protein language models to leverage both sequential and structural information jointly, a strategy that tends to improve performance on structure-sensitive prediction tasks relative to either modality alone.
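As one illustration of combining modalities, a simple late-fusion baseline concatenates per-protein sequence and structure embeddings before a linear prediction head. This is a hypothetical minimal strategy, not the specific fusion scheme of any published hybrid:

```python
import numpy as np

def late_fusion_predict(seq_emb, struct_emb, w, b):
    """Concatenate sequence and structure embeddings, then apply a
    linear head. seq_emb: (n, d_s); struct_emb: (n, d_g);
    w: (d_s + d_g, n_classes); b: (n_classes,)."""
    z = np.concatenate([seq_emb, struct_emb], axis=-1)
    return z @ w + b
```

More elaborate schemes (cross-attention, serial encoders) exist, but even this baseline lets a downstream classifier draw on both sequential and geometric signal.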

Impact

GearNet contributed to establishing structure-based pretraining as a viable complement to sequence-based protein language models. Its demonstration that five lightweight geometric self-supervised objectives — trained on far less data than sequence pretraining regimes — could match the downstream transfer performance of ESM-scale models was an important result for the field, highlighting the information efficiency of structural data. The multi-relational graph framework and the edge message passing mechanism have influenced subsequent structure-aware protein GNN designs. A key limitation is that GearNet depends on the availability of 3D coordinates, which historically required experimental determination; however, the widespread availability of AlphaFold Database predictions has substantially reduced this barrier, making structure-based encoders like GearNet broadly applicable across proteome-scale analyses.

Citation

Protein Representation Learning by Geometric Structure Pretraining

Zhang, Z., et al. (2023). Protein Representation Learning by Geometric Structure Pretraining. International Conference on Learning Representations (ICLR).

DOI: 10.48550/arXiv.2203.06125

Metrics

Citations

Total Citations: 297
Influential: 47
References: 117

Tags

geometric deep learning, graph neural network

Resources

Research Paper