SAGE-FM

Spatial transcriptomics foundation model built on a lightweight graph convolutional network and trained by masked central-spot prediction.

Released: January 2026

SAGE-FM is a spatial transcriptomics (ST) foundation model designed around two goals that often trade off against one another in the field: computational efficiency and mechanistic interpretability. Where many recent ST and single-cell foundation models rely on large transformer backbones with hundreds of millions of parameters, SAGE-FM uses a comparatively lightweight graph convolutional network (GCN) that explicitly encodes the tissue's spatial neighborhood graph, learning representations that respect the physical arrangement of cells and spots within a section.

The model was developed by Xianghao Zhan, Jingyu Xu, Yuanning Zheng, Zinaida Good, and Olivier Gevaert at Stanford University (Gevaert lab), and released as a preprint in January 2026. It targets the Visium platform, the most widely used sequencing-based ST technology. Pretraining is self-supervised and needs no task-specific labels, so the resulting embeddings can be reused for downstream tasks such as spatial domain clustering, spot annotation, and disease-subtype prediction.

By framing the learning problem as masked central-spot prediction over a tissue graph, SAGE-FM sits alongside other ST and single-cell foundation models while emphasizing a smaller footprint and interpretability through in silico perturbation analyses.

Key Features

Graph-based spatial modeling: A graph convolutional network operates directly on the spot neighborhood graph, so learned embeddings are spatially coherent and reflect local tissue context rather than treating spots as independent samples.
Masked central-spot pretraining: The self-supervised objective masks a central spot and reconstructs its expression from its neighbors; predicted and observed expression are significantly correlated for roughly 91% of masked genes (p < 0.05).
Lightweight and efficient: The architecture is deliberately compact relative to large transformer-based foundation models, lowering the compute barrier for ST analysis.
Interpretable perturbation analysis: In silico ligand-receptor perturbation experiments probe regulatory effects, offering a mechanistic window into the learned representations.
Strong downstream performance: Achieves about 81% accuracy on pathologist-defined spot annotation in oropharyngeal squamous cell carcinoma and outperforms MOFA on glioblastoma subtype prediction.
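
The masked-gene recovery figure above is a share of genes whose predicted and observed expression correlate significantly. The sketch below shows one way such a share could be computed. It is an illustration only: the source does not say which correlation or test the authors used, so the Pearson correlation, the Fisher z normal approximation for the p-value, the function name fraction_significant_genes, and the synthetic data are all assumptions.

```python
import math
import numpy as np

def fraction_significant_genes(observed, predicted, alpha=0.05):
    """Share of genes whose predicted and observed values are significantly correlated.

    observed, predicted: arrays of shape (n_spots, n_genes).
    Assumption: Pearson r per gene, two-sided p-value from the Fisher z approximation.
    """
    n_spots, n_genes = observed.shape
    obs_c = observed - observed.mean(axis=0)
    pred_c = predicted - predicted.mean(axis=0)
    denom = np.sqrt((obs_c ** 2).sum(axis=0) * (pred_c ** 2).sum(axis=0))
    # Genes with zero variance get r = 0, so they can never count as significant.
    r = np.divide((obs_c * pred_c).sum(axis=0), denom,
                  out=np.zeros(n_genes), where=denom > 0)
    r = np.clip(r, -0.999999, 0.999999)          # keep arctanh finite
    z = np.arctanh(r) * math.sqrt(n_spots - 3)   # Fisher z-transform
    p = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])
    return float(np.mean(p < alpha)), r, p

# Synthetic check: three genes are predicted well, one gets a constant prediction.
rng = np.random.default_rng(0)
observed = rng.normal(size=(200, 4))
predicted = observed + rng.normal(scale=0.1, size=(200, 4))
predicted[:, 3] = 0.0                            # uninformative prediction for gene 3
fraction, r, p = fraction_significant_genes(observed, predicted)
print(fraction)
```

In a real evaluation the p-values would normally also be corrected for multiple testing across genes; whether the reported figure includes such a correction is not stated in the source.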

Technical Details

SAGE-FM is built on graph convolutional networks and trained with a masked central-spot prediction objective on 416 human Visium samples spanning 15 organs. During pretraining, the model learns to predict a held-out central spot's gene expression from its spatial neighbors, encouraging embeddings that capture both expression patterns and tissue architecture. Reported benchmarks include recovery of significant correlations for 91% of masked genes, roughly 81% spot-annotation accuracy in oropharyngeal cancer, and improved glioblastoma subtype prediction over the multi-omics factor analysis (MOFA) baseline, along with competitive unsupervised clustering. The preprint (arXiv:2601.15504, 26 pages, 5 figures) frames these results as evidence that a lightweight, interpretable GCN can match or exceed heavier approaches on key ST tasks.
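
As a rough illustration of the objective described above, the following sketch hides one spot's expression profile, passes the remaining data through two graph convolution layers, and scores the reconstruction of the hidden spot. Everything specific here is an assumption rather than the authors' implementation: the k-nearest-neighbor graph with k = 6 (chosen because Visium spots sit on a hexagonal grid), the two-layer depth, the zero-masking, the mean squared error loss, and the random untrained weights w1 and w2.

```python
import numpy as np

def knn_adjacency(coords, k):
    """Symmetric k-nearest-neighbor adjacency (no self-loops) from 2D spot coordinates."""
    n = coords.shape[0]
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                  # a spot is not its own neighbor
    nearest = np.argsort(d, axis=1)[:, :k]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(adj, adj.T)                # make the graph undirected

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 used by standard GCN layers."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def masked_central_spot_loss(expr, a_norm, center, w1, w2):
    """Mask the central spot, run two GCN layers, return (MSE, predicted profile)."""
    x = expr.copy()
    x[center] = 0.0                              # the model never sees the target profile
    h = np.maximum(a_norm @ x @ w1, 0.0)         # GCN layer 1 with ReLU
    pred = a_norm @ h @ w2                       # GCN layer 2 maps back to gene space
    mse = float(np.mean((pred[center] - expr[center]) ** 2))
    return mse, pred[center]

# Toy data: 30 spots, 8 genes, random (untrained) weights.
rng = np.random.default_rng(0)
coords = rng.uniform(size=(30, 2))
expr = rng.poisson(2.0, size=(30, 8)).astype(float)
a_norm = normalize_adjacency(knn_adjacency(coords, k=6))
w1 = rng.normal(scale=0.1, size=(8, 16))
w2 = rng.normal(scale=0.1, size=(16, 8))
loss, pred = masked_central_spot_loss(expr, a_norm, center=0, w1=w1, w2=w2)
print(loss)
```

Because the central spot's input is zeroed before propagation, its prediction depends only on neighboring spots, which is the property that makes the task self-supervised. Training would minimize this loss over many sampled central spots.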

Applications

SAGE-FM is aimed at researchers analyzing Visium spatial transcriptomics data who need spatially aware embeddings for clustering tissue into domains, annotating spots, and predicting clinically relevant subtypes. Its in silico perturbation capability supports hypothesis generation about ligand-receptor signaling and regulatory effects within tissue. Because the model is compact and interpretable, it is well suited to translational and pathology-adjacent settings such as the oropharyngeal cancer and glioblastoma analyses reported in the paper, where both predictive accuracy and explainability matter.
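
The in silico perturbation idea can be sketched with a toy stand-in for the trained model: set a ligand's expression to zero in the neighboring spots and compare the predicted central-spot expression before and after. Since no public code or weights are available, the linear predictor, the hand-set weights, the gene names, and both function names below are hypothetical and only illustrate the procedure.

```python
import numpy as np

def predict_center(neighbor_expr, weights):
    """Toy stand-in for a trained model: linear map of the mean neighbor profile."""
    return neighbor_expr.mean(axis=0) @ weights

def ligand_knockout_effect(neighbor_expr, weights, ligand_idx):
    """Change in predicted central-spot expression after zeroing one ligand in all neighbors."""
    baseline = predict_center(neighbor_expr, weights)
    perturbed = neighbor_expr.copy()
    perturbed[:, ligand_idx] = 0.0               # in silico knockout of the ligand
    return predict_center(perturbed, weights) - baseline

genes = ["LIGAND_A", "RECEPTOR_B", "HOUSEKEEPING_C"]   # placeholder gene names
neighbor_expr = np.array([[4.0, 1.0, 2.0],
                          [2.0, 3.0, 2.0],
                          [3.0, 2.0, 2.0]])
# weights[i, j]: influence of neighbor gene i on central-spot gene j (hand-set).
weights = np.array([[0.0, 0.5, 0.0],
                    [0.0, 0.2, 0.0],
                    [0.0, 0.0, 1.0]])
delta = ligand_knockout_effect(neighbor_expr, weights, ligand_idx=0)
for name, change in zip(genes, delta):
    print(name, round(float(change), 3))
```

A negative change for the receptor gene and no change for the unrelated gene is the kind of directional signal such an experiment looks for; with a real model the predictor would be the pretrained network rather than a fixed matrix.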

Impact

SAGE-FM contributes to a growing effort to bring foundation-model methodology to spatial transcriptomics while pushing back on the assumption that larger always means better. By demonstrating that a lightweight GCN with a simple masked self-supervised objective can deliver competitive performance and interpretable spatial embeddings, it offers a more accessible option for labs without large compute budgets. As a recent preprint, its long-term influence and adoption remain to be established, and at the time of writing no public code or model weights have been located, which currently limits independent reproduction and reuse.

Citation

SAGE-FM: A lightweight and interpretable spatial transcriptomics foundation model

Preprint

Zhan, X., et al. (2026) SAGE-FM: A lightweight and interpretable spatial transcriptomics foundation model. arXiv.org.

DOI: 10.48550/arXiv.2601.15504

Citations

Total citations: 0

Influential: 0

References: 9

Openness

bio.rodeo openness: Closed · low usability and reproducibility

10 · Closed

Usability (can I run it?): 7

Reproducibility (can I retrain it?): 14

Model Openness Framework: Unclassified

Restrictive license on core components

Resources

Research Paper
