Generative foundation model that enhances spatial transcriptomics by conditioning on H&E histology, scRNA-seq references, and spatial co-expression priors.
FOCUS is a generative foundation model for enhancing spatial transcriptomics (ST) data, developed by researchers in the Department of Clinical Neurosciences at the University of Cambridge and released as a bioRxiv preprint in December 2025. Spatial transcriptomics measures gene expression while preserving the physical location of cells within a tissue, but real ST data are constrained by platform-dependent trade-offs among spatial resolution, gene-panel breadth, and capture sensitivity, and by dropout (transcripts that are present in the tissue but go undetected). FOCUS addresses these limitations by learning to reconstruct and enhance ST measurements rather than treating each platform's output as a fixed observation.
What distinguishes FOCUS is that it conditions enhancement on three complementary sources of biological signal at once: the paired hematoxylin-and-eosin (H&E) histology image, a single-cell RNA-seq (scRNA-seq) reference of the relevant tissue, and spatial co-expression priors that capture how genes covary across neighboring locations. By fusing morphology, a single-cell expression reference, and spatial structure, the model can impute missing genes, denoise sparse counts, and increase effective resolution in a way that is anchored to both tissue appearance and known cell-state biology.
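To make the idea of a spatial co-expression prior concrete, the following is a minimal sketch of one way to estimate how genes covary across neighboring locations: correlate each gene at a spot with the mean expression of every gene in that spot's k nearest neighbors. This is an illustration only, not the method used in FOCUS, whose implementation has not been released. The function name `spatial_coexpression`, the parameter `k`, and the neighbor-averaging scheme are all assumptions introduced here.

```python
import numpy as np


def spatial_coexpression(expr, coords, k=4, eps=1e-12):
    """Illustrative (hypothetical) spatial co-expression estimate.

    expr   : (n_spots, n_genes) expression matrix
    coords : (n_spots, 2) spot coordinates
    k      : number of nearest neighbors per spot (assumed hyperparameter)

    Returns a symmetric (n_genes, n_genes) matrix whose entry (i, j) is the
    averaged correlation between gene i at a spot and gene j in its neighbors.
    """
    expr = np.asarray(expr, dtype=float)
    coords = np.asarray(coords, dtype=float)
    n_spots = expr.shape[0]
    if not 0 < k < n_spots:
        raise ValueError("k must satisfy 0 < k < n_spots")

    # Pairwise squared Euclidean distances; exclude each spot from its own neighbors.
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)
    nbrs = np.argsort(d2, axis=1)[:, :k]            # (n_spots, k)

    # Mean expression over each spot's k nearest neighbors.
    neigh_mean = expr[nbrs].mean(axis=1)            # (n_spots, n_genes)

    # Correlate centered spot expression with centered neighborhood expression.
    x = expr - expr.mean(axis=0)
    y = neigh_mean - neigh_mean.mean(axis=0)
    cross = (x.T @ y) / n_spots
    scale = x.std(axis=0)[:, None] * y.std(axis=0)[None, :]
    corr = cross / (scale + eps)

    # Symmetrize so that entry (i, j) equals entry (j, i).
    return 0.5 * (corr + corr.T)


if __name__ == "__main__":
    # Toy 10x10 grid: genes 0 and 1 follow the same spatial gradient, gene 2 is noise.
    xs, ys = np.meshgrid(np.arange(10), np.arange(10))
    coords = np.stack([xs.ravel(), ys.ravel()], axis=1)
    rng = np.random.default_rng(0)
    expr = np.stack(
        [coords[:, 0], 2 * coords[:, 0] + 1, rng.normal(size=100)], axis=1
    )
    print(np.round(spatial_coexpression(expr, coords), 2))
```

In the toy example, the two gradient genes should score high against each other and low against the noise gene. The brute-force distance matrix is quadratic in the number of spots, so a real dataset would call for a spatial index or a sparse neighbor graph instead.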
FOCUS is positioned as a unified, platform-agnostic model. It is trained across many ST technologies and is reported to generalize to unseen platforms and rare disease tissues in a zero-shot setting, making it a general-purpose enhancement layer that sits between raw spatial assays and downstream analyses such as cell typing and spatial domain discovery.
FOCUS is trained on a large multimodal corpus of paired histology and spatial transcriptomics data: more than 1.7 million H&E–ST pairs together with over 5.8 million single-cell expression profiles. This scale of paired supervision lets the model learn cross-modal relationships between tissue appearance and gene expression, as well as the spatial co-expression structure that links neighboring measurements. Because of the conditioning design, enhancement tasks such as gene imputation, denoising, and resolution improvement can draw simultaneously on morphology (from H&E), reference cell-state distributions (from scRNA-seq), and spatial priors. The authors report state-of-the-art performance across 10 spatial transcriptomics platforms and demonstrate zero-shot application to the Open-ST platform and to rare craniopharyngioma and head and neck squamous cell carcinoma (HNSCC) datasets that were not part of training. As of the preprint, no public code or trained weights have been released, which currently limits independent benchmarking and deployment.
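For readers who want to benchmark gene imputation independently once a model is available, one common approach is to hold out a set of measured genes, impute them, and compute the per-gene Pearson correlation between the held-out and imputed values across spots. The sketch below shows that calculation. It is a generic scoring recipe, not necessarily the metric reported by the FOCUS authors, and the function name `per_gene_pearson` is hypothetical.

```python
import numpy as np


def per_gene_pearson(truth, pred, eps=1e-12):
    """Per-gene Pearson correlation across spots.

    truth, pred : (n_spots, n_genes) arrays of held-out and imputed expression.
    Returns an (n_genes,) array. Genes with zero variance in either input
    score 0 rather than NaN, because eps keeps the denominator nonzero.
    """
    truth = np.asarray(truth, dtype=float)
    pred = np.asarray(pred, dtype=float)
    if truth.shape != pred.shape:
        raise ValueError("truth and pred must have the same shape")

    # Center each gene (column) across spots.
    t = truth - truth.mean(axis=0)
    p = pred - pred.mean(axis=0)

    num = (t * p).sum(axis=0)
    den = np.sqrt((t ** 2).sum(axis=0) * (p ** 2).sum(axis=0))
    return num / (den + eps)
```

Summarizing the returned vector by its median or mean gives a single score per method, and keeping the per-gene values makes it possible to see which genes are recovered poorly.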
FOCUS targets researchers and pathologists working with spatial transcriptomics across basic, translational, and clinical settings. By imputing unmeasured genes and denoising sparse signal, it can extend limited gene panels, recover expression in low-capture regions, and raise effective resolution, which in turn can improve downstream cell-type annotation, spatial domain delineation, and ligand-receptor analysis. Its histology grounding is especially valuable in oncology and neuropathology, where paired H&E slides are routine and where rare tumor entities (such as craniopharyngioma) provide too little data to train bespoke models. Because it is reported to generalize zero-shot to new platforms, FOCUS can also serve as a common enhancement step in pipelines that combine data from heterogeneous spatial assays.
FOCUS reflects a growing trend toward large, multimodal foundation models that bridge digital pathology and spatial omics by learning jointly from histology images and molecular measurements. Its emphasis on enhancement rather than de novo prediction, and its reported ability to generalize across platforms and to rare disease tissues, address two persistent pain points in the ST field: data sparsity and platform fragmentation. The main limitations are that the work is a preprint, that the strongest claims (state-of-the-art across 10 platforms, zero-shot transfer) await independent confirmation, and that the absence of released code or weights currently makes the model difficult to reproduce or use in practice.