A lightweight graph-convolutional foundation model for spatial transcriptomics that learns spatially coherent, interpretable spot embeddings via masked central-spot prediction.
SAGE-FM is a spatial transcriptomics (ST) foundation model designed around two goals that often trade off against one another in the field: computational efficiency and mechanistic interpretability. Where many recent ST and single-cell foundation models rely on large transformer backbones with hundreds of millions of parameters, SAGE-FM uses a comparatively lightweight graph convolutional network (GCN) that explicitly encodes the tissue's spatial neighborhood graph, learning representations that respect the physical arrangement of cells and spots within a section.
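To make the idea of a GCN over a spatial neighborhood graph concrete, the following is a minimal NumPy sketch, not SAGE-FM's actual implementation. It assumes a k-nearest-neighbor graph built from spot coordinates (k = 6, chosen because Visium spots sit on a hexagonal grid) and the standard symmetric-normalized graph convolution. The function names, layer size, and random toy data are all hypothetical.

```python
import numpy as np

def knn_adjacency(coords: np.ndarray, k: int) -> np.ndarray:
    """Symmetric binary adjacency linking each spot to its k nearest spots."""
    n = coords.shape[0]
    # Pairwise Euclidean distances between spot coordinates.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a spot is never its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]
    adj = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    adj[rows, nearest.ravel()] = 1.0
    return np.maximum(adj, adj.T)  # make the graph undirected

def gcn_layer(adj: np.ndarray, features: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weight, 0.0)

# Toy data (assumed, not from the paper): 50 spots, 20 genes, 8-dim embeddings.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(50, 2))
expr = rng.poisson(2.0, size=(50, 20)).astype(float)
weight = rng.normal(scale=0.1, size=(20, 8))

adj = knn_adjacency(coords, k=6)
embeddings = gcn_layer(adj, np.log1p(expr), weight)
```

Each row of `embeddings` mixes a spot's own expression with that of its spatial neighbors, which is the sense in which such a layer respects the physical arrangement of spots. A trained model would learn `weight` rather than draw it at random.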
The model was developed by Xianghao Zhan, Jingyu Xu, Yuanning Zheng, Zinaida Good, and Olivier Gevaert at Stanford University (Gevaert lab), and released as a preprint in January 2026. It targets the Visium platform, the most widely used sequencing-based ST technology, and is pretrained in a self-supervised manner so that it can produce useful embeddings for downstream tasks such as spatial domain clustering, spot annotation, and disease-subtype prediction without task-specific labels.
By framing the learning problem as masked central-spot prediction over a tissue graph, SAGE-FM positions itself alongside other ST and single-cell foundation models while emphasizing a smaller footprint and built-in interpretability through attention and perturbation-style analyses.
SAGE-FM is built on graph convolutional networks and trained with a masked central-spot prediction objective on 416 human Visium samples spanning 15 organs. During pretraining, the model learns to predict a held-out central spot's gene expression from its spatial neighbors, which encourages embeddings that capture both expression patterns and tissue architecture. Reported results include statistically significant correlations between predicted and measured expression for 91% of masked genes, roughly 81% spot-annotation accuracy in oropharyngeal cancer, improved glioblastoma subtype prediction over a multi-omics factor analysis (MOFA) baseline, and competitive unsupervised clustering. The preprint (arXiv:2601.15504, 26 pages, 5 figures) frames these results as evidence that a lightweight, interpretable GCN can match or exceed heavier approaches on key ST tasks.
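The masked central-spot objective can be illustrated with a small NumPy sketch. This is an assumption-laden toy, not the paper's training code: the GCN is replaced by a plain linear map applied to the mean of the neighbors' expression, the tissue is a synthetic ring of spots with smooth expression waves, and all names and hyperparameters are hypothetical. What it preserves is the key constraint that the central spot's own expression is never an input to its prediction.

```python
import numpy as np

rng = np.random.default_rng(0)
n_spots, n_genes = 200, 10

# Toy tissue: spots on a ring, each gene a smooth spatial wave plus noise.
position = np.arange(n_spots)
freqs = np.arange(1, n_genes + 1)
expr = np.sin(2 * np.pi * np.outer(position, freqs) / n_spots)
expr += rng.normal(scale=0.1, size=expr.shape)

# Six neighbors per spot (three on each side); the center itself is excluded,
# which plays the role of masking the central spot.
offsets = np.array([-3, -2, -1, 1, 2, 3])
neighbors = (position[:, None] + offsets) % n_spots
neighbor_mean = expr[neighbors].mean(axis=1)  # shape: (n_spots, n_genes)

def loss_and_grad(weight, neighbor_mean, target):
    """Mean squared error of the neighbor-based prediction, and its gradient."""
    residual = neighbor_mean @ weight - target
    loss = float(np.mean(residual ** 2))
    grad = 2.0 * neighbor_mean.T @ residual / residual.size
    return loss, grad

# Plain gradient descent on the linear readout (a stand-in for GCN training).
weight = np.zeros((n_genes, n_genes))
initial_loss, _ = loss_and_grad(weight, neighbor_mean, expr)
for _ in range(200):
    _, grad = loss_and_grad(weight, neighbor_mean, expr)
    weight -= 0.5 * grad
final_loss, _ = loss_and_grad(weight, neighbor_mean, expr)

print(f"loss before: {initial_loss:.3f}, after: {final_loss:.3f}")
```

Because expression in this toy varies smoothly across space, the neighbors carry enough information to reconstruct the masked center, and the loss falls during training. The same intuition motivates the objective on real tissue, where neighboring spots tend to share cell types and microenvironment.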
SAGE-FM is aimed at researchers analyzing Visium spatial transcriptomics data who need spatially aware embeddings for clustering tissue into domains, annotating spots, and predicting clinically relevant subtypes. Its in silico perturbation capability supports hypothesis generation about ligand-receptor signaling and regulatory effects within tissue. Because the model is compact and interpretable, it is well suited to translational and pathology-adjacent settings such as the oropharyngeal cancer and glioblastoma analyses reported in the paper, where both predictive accuracy and explainability matter.
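As a rough illustration of what an in silico perturbation looks like in this setting, the sketch below alters one gene in a spot's neighbors and measures how a predictor's output for the central spot changes. It is a hypothetical toy, not SAGE-FM's procedure: the predictor is a hand-written linear map in which gene 0 (standing in for a ligand) is assumed to drive gene 1 (standing in for a downstream target), and the function and variable names are invented for this example.

```python
import numpy as np

def perturbation_effect(predict, neighbor_expr, gene, scale=0.0):
    """Change in predicted central-spot expression after rescaling one gene
    in every neighbor (scale=0.0 is a knockout, scale>1.0 an overexpression)."""
    baseline = predict(neighbor_expr)
    perturbed = neighbor_expr.copy()
    perturbed[:, gene] *= scale
    return predict(perturbed) - baseline

# Hypothetical predictor: mean of the neighbors passed through a fixed linear map.
# weight[0, 1] = 0.5 encodes the assumed "gene 0 raises gene 1" dependency.
weight = np.array([
    [1.0, 0.5, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

def predict(neighbor_expr):
    return neighbor_expr.mean(axis=0) @ weight

neighbor_expr = np.ones((6, 3))  # six neighbors, three genes, unit expression
delta = perturbation_effect(predict, neighbor_expr, gene=0)
print(delta)  # -> [-1.  -0.5  0. ]
```

Knocking out gene 0 in the neighbors lowers the predicted gene 1 in the center while leaving gene 2 untouched. With a trained model in place of the toy `predict`, a pattern of this kind is what would suggest a candidate signaling relationship for follow-up.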
SAGE-FM contributes to a growing effort to bring foundation-model methodology to spatial transcriptomics while pushing back on the assumption that larger always means better. By demonstrating that a lightweight GCN with a simple masked self-supervised objective can deliver competitive performance and interpretable spatial embeddings, it offers a more accessible option for labs without large compute budgets. Because the work is a recent preprint, its long-term influence and adoption remain to be established. At the time of writing, no public code or model weights have been located, which limits independent reproduction and reuse.