SQUALL

Multimodal foundation model pretrained on 1.76B histology and spatial transcriptomics spots, inferring molecular state from whole-slide images.

Released: June 2026

SQUALL is a multimodal foundation model that bridges whole-slide histopathology images with spatial transcriptomics, learning a joint representation that connects tissue morphology to the underlying spatial molecular programs. Where conventional computational pathology models read only the visual content of a hematoxylin-and-eosin slide, SQUALL is pretrained to associate each region of tissue with its measured gene expression, allowing it to infer molecular state directly from histology and to reason about morphology and transcriptomics together.

The model was developed by Zongxu Zhang, Zexian Zeng, and collaborators at Peking University's Center for Quantitative Biology, with co-authors from the Peking-Tsinghua Center for Life Sciences, the National Cancer Center / Cancer Hospital of the Chinese Academy of Medical Sciences, and Tsinghua University, and released as a bioRxiv preprint in June 2026. It sits within a fast-moving wave of histology-plus-transcriptomics foundation models, alongside efforts such as STORM (Stanford) and SpatialFusion (MIT), but is distinguished by the scale and breadth of its paired pretraining corpus and by its emphasis on zero-shot generalization across tissues and platforms without per-dataset retraining.

By pretraining on paired data rather than images alone, SQUALL is positioned as a general-purpose backbone for both discovery (mapping where molecular programs are active in tissue) and clinical prediction (relating tissue appearance to patient outcomes) from routinely available slides.

Key Features

Paired multimodal pretraining: Jointly learns from co-registered histology and spatial transcriptomics, so morphological features are grounded in measured gene expression rather than inferred from images alone.
Massive, diverse training corpus: Pretrained on "histMol", a corpus of 1.76 billion paired histology-ST spots and bins spanning 33 tissues, 12 spatial platforms, and 3,446 sections.
Cross-platform generalization: Generalizes to new datasets without per-dataset retraining, supporting transcriptome-wide virtual biomarker profiling directly from a slide.
Spatial niche discovery: Identifies and clusters spatially coherent tissue niches, capturing organization that pure image models miss.
Trajectory and outcome modeling: Demonstrated on breast-cancer invasion trajectories and on whole-slide-level patient outcome prediction.

Technical Details

SQUALL is a transformer-based multimodal foundation model pretrained with a self-supervised objective that aligns whole-slide image regions with their paired spatial transcriptomic measurements. Its pretraining corpus, histMol, aggregates roughly 1.76 billion paired histology-ST spots and bins drawn from 3,446 tissue sections, covering 33 tissue types and 12 distinct spatial transcriptomics platforms. This scale and platform diversity are intended to make the learned representation robust across assay chemistries and tissue contexts. After pretraining, the model supports transcriptome-wide virtual biomarker profiling, spatial niche discovery, and whole-slide outcome prediction. For clinical evaluation, the authors report outcome prediction on a cohort of 898 patients. They also benchmark SQUALL against existing computational pathology foundation models and report improved performance on these spatial and clinical tasks.
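The description above says only that the pretraining objective aligns image regions with their paired expression measurements; the exact loss is not stated here. As a rough illustration of what such an alignment objective can look like, the sketch below uses a CLIP-style symmetric contrastive loss. This is an assumption chosen for illustration, not SQUALL's confirmed objective, and every name in it (`symmetric_contrastive_loss`, `temperature`, the toy embeddings) is hypothetical.

```python
import math
import random

def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def log_softmax_at(scores, target):
    """Log-probability of `target` under a numerically stable softmax of `scores`."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return scores[target] - log_z

def symmetric_contrastive_loss(img_emb, expr_emb, temperature=0.07):
    """Assumed CLIP-style loss: row i of img_emb is paired with row i of expr_emb."""
    img = [normalize(v) for v in img_emb]
    expr = [normalize(v) for v in expr_emb]
    n = len(img)
    # Similarity of every image region to every expression profile in the batch.
    logits = [[sum(a * b for a, b in zip(img[i], expr[j])) / temperature
               for j in range(n)] for i in range(n)]
    # Image -> expression: each row should pick out its own column.
    i2e = -sum(log_softmax_at(logits[i], i) for i in range(n)) / n
    # Expression -> image: each column should pick out its own row.
    e2i = -sum(log_softmax_at([logits[i][j] for i in range(n)], j)
               for j in range(n)) / n
    return 0.5 * (i2e + e2i)

# Toy data: 8 hypothetical spots with 16-dimensional embeddings.
random.seed(0)
expr = [[random.gauss(0, 1) for _ in range(16)] for _ in range(8)]
aligned = [[x + 0.01 * random.gauss(0, 1) for x in row] for row in expr]
mismatched = aligned[::-1]  # reversing 8 rows leaves no spot correctly paired

loss_aligned = symmetric_contrastive_loss(aligned, expr)
loss_mismatched = symmetric_contrastive_loss(mismatched, expr)
print(loss_aligned < loss_mismatched)
```

In this toy setting the loss is low when each image embedding sits next to its own expression embedding and high when the pairing is scrambled, which is the behavior an alignment objective is meant to reward.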

Applications

SQUALL is aimed at researchers and translational scientists working with digital pathology and spatial omics. From a standard whole-slide image it can predict spatially resolved gene expression, enabling "virtual" molecular biomarker profiling without running an expensive spatial assay on every sample; it can delineate spatial niches to study tissue architecture; and it can model disease progression, such as breast-cancer invasion trajectories. At the whole-slide level it supports patient outcome prediction, making it relevant to biomarker discovery, tumor microenvironment characterization, and prognostic modeling in oncology research.
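To make the niche-delineation use case concrete, here is a minimal sketch of one generic way to group spots into spatially coherent niches: average each spot's embedding with its spatial neighbors, then cluster the result with k-means. This is a stand-in technique under stated assumptions, not the method reported for SQUALL; the embeddings are toy two-dimensional vectors rather than model outputs, and `smooth_by_neighbors`, `kmeans`, and `radius` are hypothetical names.

```python
def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def smooth_by_neighbors(coords, embeddings, radius):
    """Average each spot's embedding over spots within `radius` (itself included)."""
    smoothed = []
    for c in coords:
        near = [e for c2, e in zip(coords, embeddings)
                if dist2(c, c2) <= radius ** 2]
        smoothed.append(mean(near))
    return smoothed

def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(dist2(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(p, centers[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = mean(members)
    return labels

# Toy 4x4 grid: left half resembles [1, 0], right half resembles [0, 1].
coords = [(x, y) for x in range(4) for y in range(4)]
embeddings = [[1.0, 0.0] if x < 2 else [0.0, 1.0] for x, y in coords]
embeddings[0] = [0.4, 0.6]  # one noisy spot in the left half

raw_labels = kmeans(embeddings, k=2)
niche_labels = kmeans(smooth_by_neighbors(coords, embeddings, radius=1), k=2)
print(raw_labels)
print(niche_labels)
```

Clustering the raw embeddings assigns the noisy spot to the wrong side of the tissue, while the neighbor-smoothed version keeps it with its spatial surroundings. That is the sense in which a niche is a spatially coherent grouping rather than a purely feature-based one.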

Impact

By coupling 1.76 billion paired histology-transcriptomics observations into a single pretrained backbone, SQUALL pushes computational pathology beyond image-only representations toward models that natively reason about spatial molecular programs. Its reported gains over existing pathology foundation models on virtual biomarker profiling, niche discovery, and outcome prediction suggest that paired pretraining at scale is a productive direction for the field. As a June 2026 preprint, its long-term influence remains to be established, and adoption is currently constrained: at the time of writing, no public code, model weights, or Hugging Face release had been located, so independent reproduction and benchmarking are not yet possible. The work is released under a CC BY-NC license.

Citation

Integrating Histology with Spatial Molecular Programs Using a Multimodal Foundation Model

Zhang, Z., et al. (2026) Integrating Histology with Spatial Molecular Programs Using a Multimodal Foundation Model. bioRxiv.

DOI: 10.64898/2026.06.01.729028

Citations

Total citations: 0

Influential: 0

References: 0

Openness

bio.rodeo openness: Closed (low usability and reproducibility)

6 (Closed)

Usability (can I run it?): 7

Reproducibility (can I retrain it?): 3

Model Openness Framework: Unclassified (restrictive license on core components)

Resources

Research Paper
