SpaFoundation

Histology vision transformer with 80M parameters that predicts spatial gene expression from H&E tissue images and transfers to tumor detection.

Released: August 2025

Parameters: 80 Million

Spatial transcriptomics (ST) jointly profiles gene expression and spatial context alongside histological images, but the technology remains costly and time-consuming, limiting routine clinical use. A practical workaround is to infer gene expression directly from inexpensive hematoxylin-and-eosin (H&E) tissue images, yet prior computational approaches have been constrained by limited accuracy and spatial resolution, often a consequence of small training sets and modest model capacity.

SpaFoundation, introduced in August 2025 by researchers at Central South University (Changsha, China), addresses this gap with a large-scale histology foundation model purpose-built to predict spatial gene expression from tissue images. Rather than training a task-specific predictor from scratch, it learns generalizable histological representations through domain-specific self-supervised pretraining, then applies them to spatial gene expression inference and related downstream tasks with minimal or no fine-tuning.

Within the landscape of histology foundation models, SpaFoundation is distinguished by its explicit focus on spatial omics: it couples a general-purpose image encoder with the goal of high-resolution, transferable spatial gene expression prediction, positioning it alongside contemporaries such as BRIDGE that bridge histology and spatial transcriptomics.

Key Features

Spatial gene expression from H&E alone: Predicts spatial gene expression directly from standard tissue images, sidestepping the cost and turnaround of running spatial transcriptomics assays.
Self-distillation plus masked image modeling: Combines self-distillation with masked image modeling (MIM) so the encoder captures both high-level semantic representations and fine-grained structural features that enrich per-spot representations.
Strong transferability: Pretrained representations transfer to downstream tasks, including tumor detection and spatial domain clustering, with minimal fine-tuning or in a zero-shot setting.
Resolution flexibility: Validation on 117 samples shows that the model works at different spatial resolutions, including high-resolution inference.
Open weights and code: Implementation and pretrained weights are publicly released under an MIT license on GitHub and Hugging Face.
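
To make the masked image modeling idea concrete, the sketch below shows one way a tile's patch grid might be masked before it is shown to the student network. The 14 x 14 grid (a 224-pixel tile cut into 16-pixel patches), the 30% mask ratio, and the function name random_patch_mask are illustrative assumptions, not values taken from the SpaFoundation paper or repository. iBOT itself uses blockwise masking; uniform random masking is used here only to keep the example short.

```python
import random


def random_patch_mask(grid_h, grid_w, mask_ratio, seed=None):
    """Return a grid_h x grid_w grid of booleans.

    True marks a patch that is hidden from the student network.
    Hypothetical helper for illustration; not part of any released API.
    """
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must lie in [0, 1]")
    rng = random.Random(seed)
    n_patches = grid_h * grid_w
    n_masked = round(n_patches * mask_ratio)
    # Sample patch indices without replacement, then map them back to (row, col).
    masked = set(rng.sample(range(n_patches), n_masked))
    return [
        [(row * grid_w + col) in masked for col in range(grid_w)]
        for row in range(grid_h)
    ]


# Assumed setup: a 224 x 224 tile split into 16 x 16 patches gives a 14 x 14 grid.
mask = random_patch_mask(14, 14, mask_ratio=0.3, seed=0)
hidden = sum(cell for row in mask for cell in row)
print(f"{hidden} of {14 * 14} patches are hidden from the student")
```

The student would then be asked to predict the teacher's representation of each hidden patch, which is what pushes the encoder toward fine-grained structural features.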

Technical Details

SpaFoundation employs a teacher-student Vision Transformer (ViT) architecture that models dependencies among image patches, using an iBOT-style objective that jointly applies self-distillation and masked image modeling. The model has 80 million parameters and is pretrained on 1.79 million histology patches (the GitHub README cites approximately 1.84 million) spanning 26 tissue types, drawn from the HEST-1K spatial transcriptomics resource, which aggregates data from multiple platforms (including Spatial Transcriptomics, Visium, and Xenium) across human and mouse tissue. The authors validate the model on 117 samples and report that it consistently outperforms state-of-the-art baselines across four downstream tasks: spatial gene expression prediction, high-resolution gene expression inference, tumor detection, and spatial domain clustering. Downstream tumor-detection evaluation uses a cutaneous squamous cell carcinoma (cSCC) dataset (GEO accession GSE144240).
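
The iBOT-style objective described above can be sketched as the sum of two cross-entropy terms: a self-distillation term on the [CLS] token and a masked image modeling term on the hidden patch tokens. This is a simplified illustration under stated assumptions, not the authors' implementation: the temperatures (0.1 for the student, 0.04 for the teacher) and all function names are assumed, and teacher centering, multi-crop augmentation, and the moving-average teacher update used in iBOT-family methods are omitted.

```python
import math


def log_softmax(logits, temperature):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_z = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_z for s in scaled]


def cross_entropy(teacher_logits, student_logits, teacher_temp, student_temp):
    """H(teacher, student): the teacher distribution is the soft target."""
    teacher_probs = [math.exp(v) for v in log_softmax(teacher_logits, teacher_temp)]
    student_log_probs = log_softmax(student_logits, student_temp)
    return -sum(p * lp for p, lp in zip(teacher_probs, student_log_probs))


def ibot_style_loss(student_cls, teacher_cls, student_patches, teacher_patches,
                    mask, student_temp=0.1, teacher_temp=0.04):
    """Self-distillation on [CLS] plus masked image modeling on hidden patches.

    In iBOT the [CLS] term pairs the teacher's output on one augmented view
    with the student's output on another; the patch term compares the student
    (which sees the masked view) with the teacher (which sees the full view).
    """
    cls_loss = cross_entropy(teacher_cls, student_cls, teacher_temp, student_temp)
    masked = [i for i, hidden in enumerate(mask) if hidden]
    if masked:
        # Only patches hidden from the student contribute to the MIM term.
        mim_loss = sum(
            cross_entropy(teacher_patches[i], student_patches[i],
                          teacher_temp, student_temp)
            for i in masked
        ) / len(masked)
    else:
        mim_loss = 0.0
    return cls_loss + mim_loss
```

The lower teacher temperature sharpens the target distribution, so the student is rewarded for committing to the teacher's dominant prototype rather than spreading probability mass evenly.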

Applications

SpaFoundation is aimed at researchers and pathologists who want spatial molecular insight without the expense of full spatial transcriptomics experiments. By inferring gene expression from routine H&E slides, it can extend molecular characterization to large image archives, support virtual ST for cohorts where sequencing is impractical, and provide transferable features for tumor detection and tissue-region clustering. Its open weights make it a candidate encoder for computational pathology and spatial omics pipelines that need a histology backbone tuned for expression-related tasks.
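
As a rough illustration of how frozen encoder features could support virtual ST without any fine-tuning, the sketch below predicts expression for a new spot by averaging the measured expression of its nearest reference spots in embedding space. The embeddings here are made-up numbers standing in for encoder outputs, and the function names are hypothetical; the sketch does not call the SpaFoundation code, and nearest-neighbor averaging is a generic probe rather than the prediction head the authors use.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_predict_expression(query_embedding, ref_embeddings, ref_expression, k=2):
    """Average the expression vectors of the k most similar reference spots."""
    if not 1 <= k <= len(ref_embeddings):
        raise ValueError("k must be between 1 and the number of reference spots")
    ranked = sorted(
        range(len(ref_embeddings)),
        key=lambda i: cosine_similarity(query_embedding, ref_embeddings[i]),
        reverse=True,
    )
    nearest = ranked[:k]
    n_genes = len(ref_expression[0])
    return [sum(ref_expression[i][g] for i in nearest) / k for g in range(n_genes)]


# Toy data: three reference spots with 2-D embeddings and two genes each.
ref_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
ref_expression = [[10.0, 0.0], [8.0, 0.0], [0.0, 5.0]]
predicted = knn_predict_expression([1.0, 0.05], ref_embeddings, ref_expression, k=2)
print(predicted)
```

A trained regression head would normally replace the neighbor average once paired image and expression data are available for the tissue of interest.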

Impact

By demonstrating that domain-specific pretraining on roughly 1.79 million histology patches yields representations that beat task-specific baselines across several spatial omics tasks, SpaFoundation reinforces a broader trend toward foundation-model-driven inference of spatial gene expression from cheap imaging. Released openly with code and weights, it lowers the barrier for groups exploring image-to-expression prediction. As a recent preprint, its real-world adoption and independent benchmarking are still emerging, and reported gains should be read in the context of the authors' own evaluation; the model's reliance on H&E appearance also means inferred expression remains a prediction rather than a measurement.
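
Because inferred expression is a prediction rather than a measurement, independent benchmarking comes down to comparing predicted values against measured ones on held-out samples. The sketch below computes a per-gene Pearson correlation across spots, a common choice for this comparison; the text above does not state which metrics the authors report, so the metric, the function names, and the gene labels are all assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def per_gene_pearson(predicted, measured, gene_names):
    """Correlate predicted and measured expression gene by gene.

    predicted and measured are lists of per-spot rows, one value per gene.
    """
    return {
        gene: pearson([row[j] for row in predicted], [row[j] for row in measured])
        for j, gene in enumerate(gene_names)
    }


# Toy example with three spots and two placeholder genes.
predicted = [[1.0, 3.0], [2.0, 2.0], [3.0, 1.0]]
measured = [[2.0, 1.0], [4.0, 2.0], [6.0, 3.0]]
scores = per_gene_pearson(predicted, measured, ["GENE_A", "GENE_B"])
print(scores)
```

Reporting the score per gene, rather than a single pooled number, makes it easier to see which genes are actually recoverable from H&E appearance.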

Citation

Inferring spatial gene expression from tissue images using large-scale histology foundation model with SpaFoundation

Preprint

Zhang, N., et al. (2025) Inferring spatial gene expression from tissue images using large-scale histology foundation model with SpaFoundation. bioRxiv.

DOI: 10.1101/2025.08.07.669202

Recent citations

Papers that recently cited this model.

A comprehensive survey of computer vision methods for spatial transcriptomics
Junchao Zhu, Ruining Deng, Junlin Guo, et al.
Briefings in Bioinformatics · May 2026
Citations: 0 · Influential

Encoding functional edges in graphs to model spatially varying relationships in the tumor microenvironment
Ashley P. Tsang, S. Krishnan, Reva Kulkarni, et al.
npj Artificial Intelligence · Mar 2026
Citations: 0


Citations

Total Citations: 3
Influential: 1
References: 35

GitHub

Stars: 8
Forks: 0
Open Issues: 0
Contributors: 1
Last Push: 10mo ago
Language: Jupyter Notebook
License: MIT

HuggingFace

Downloads: 0
Likes: 0
Last Modified: 11mo ago

Fields of citing research

Computer Science: 100%
Medicine: 100%
Biology: 50%

Share of papers citing this model.

Openness

bio.rodeo openness: Open weights (open weights, closed recipe)
Score: 59 (Partial)
Usability (can I run it?): 83
Reproducibility (can I retrain it?): 42
Model Openness Framework: Unclassified (restrictive license on core components)

