bio.rodeo
Spatial omics foundation models
Spatial omics, Single-cell

STPAINTER

University of Science and Technology of China / Peking University / Princeton University

A pan-cancer pretrained latent diffusion model that enhances spatial transcriptomics, imputing genome-wide expression from sparse panels with zero-shot generalization.

Released: February 2026

STPAINTER is a conditional generative model that enhances subcellular spatial transcriptomics by imputing genome-wide gene expression from the sparse gene panels that imaging-based platforms typically measure. Subcellular spatial technologies reveal tissue architecture at high resolution but are constrained by limited gene panels and detection sensitivity. Existing enhancement methods usually depend on a tissue-matched single-cell RNA-seq reference and require computationally intensive retraining for every new dataset, which limits their scalability and clinical use.

STPAINTER takes a foundation-model approach instead. It was developed by Yuhang Yang, Enhong Chen, and colleagues at the University of Science and Technology of China, together with collaborators at Peking University Cancer Hospital and Institute and Princeton University, and posted to bioRxiv in February 2026. The model is pretrained on a large pan-cancer scRNA-seq atlas, from which it learns a universal manifold of cellular states. Built on a latent diffusion architecture with stochastic differential equation (SDE)-guided generation, it reconstructs full transcriptomes from sparse spatial measurements without a per-dataset matched reference.

This pretraining enables zero-shot generalization across tumor types. The authors apply STPAINTER to six spatial transcriptomics datasets spanning different cancers and cross-validate the imputed cellular landscapes against spatially resolved proteomics (CODEX). The model is distributed under a CC BY-NC-ND 4.0 license.

#Key Features

  • Pan-cancer pretraining: Learns a universal manifold of cellular states from a massive pan-cancer scRNA-seq atlas, removing the need for a tissue-matched reference per dataset.
  • Latent diffusion with SDE guidance: Uses a latent diffusion architecture with stochastic-differential-equation-guided generation to reconstruct genome-wide expression.
  • Zero-shot generalization: Generalizes across cancer types without retraining, demonstrated on six spatial transcriptomics datasets.
  • Downstream enhancement: Provides imputed transcriptomes and informative latent variables that improve resolution at both gene and cluster levels.
  • Orthogonal validation: Imputed cellular landscapes are cross-validated against spatially resolved proteomics (CODEX), supporting their biological plausibility.
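
To make the orthogonal-validation idea concrete, here is a minimal sketch of one way imputed RNA could be compared with a matched protein marker: a rank correlation across cells, checked against a shuffled baseline. This is not the preprint's stated protocol. The metric choice, the function names `rank` and `spearman`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def rank(values):
    """Return ranks 0..n-1 (assumes no ties, which holds for continuous toy data)."""
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks

def spearman(a, b):
    """Spearman correlation computed as the Pearson correlation of the ranks."""
    return float(np.corrcoef(rank(a), rank(b))[0, 1])

rng = np.random.default_rng(0)

# Hypothetical per-cell intensity of one protein marker (synthetic, not real CODEX data).
protein = rng.gamma(shape=2.0, scale=1.0, size=200)

# Hypothetical imputed expression of the matching gene: monotone in protein plus noise.
imputed_rna = np.log1p(protein) + 0.1 * rng.standard_normal(200)

# Baseline: break the cell-to-cell pairing by shuffling the imputed values.
shuffled = rng.permutation(imputed_rna)

rho = spearman(imputed_rna, protein)
rho_null = spearman(shuffled, protein)
print(f"paired rho={rho:.2f}, shuffled rho={rho_null:.2f}")
```

A rank-based statistic is used here because RNA and protein abundances live on different scales; agreement in ordering across cells is the property of interest, and the shuffled baseline shows what no agreement looks like.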

#Technical Details

STPAINTER is a conditional latent diffusion model whose generation is guided by a stochastic differential equation formulation. It is pretrained on a large pan-cancer scRNA-seq atlas to learn a shared latent manifold of cellular states, then conditioned on sparse spatial measurements to reconstruct genome-wide expression profiles. Beyond imputed transcriptomes, the latent variables it produces serve as informative features for downstream analyses such as fine-grained subpopulation clustering and pathway enrichment. The authors evaluate the model on six spatial transcriptomics datasets of different cancer types and report zero-shot generalization without per-dataset retraining, with CODEX proteomics used as orthogonal validation. The preprint does not report a parameter count, and the availability of code and weights was not specified at the time of writing.
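
The general idea of conditioning a latent generative prior on a sparse gene panel can be illustrated with a toy example. The sketch below is not STPAINTER's architecture, which is not detailed here. It assumes a standard normal latent prior in place of the learned manifold, a fixed linear decoder `W` in place of a pretrained decoder, and a plain Langevin SDE in place of the paper's SDE formulation. All names (`posterior_score`, `sample_latents`, `make_toy_problem`) are hypothetical.

```python
import numpy as np

def posterior_score(z, A, y, sigma):
    """Score of p(z | y): prior term for N(0, I) plus guidance from the observed panel."""
    prior = -z
    guidance = (y - z @ A.T) @ A / sigma**2  # gradient of the Gaussian panel likelihood
    return prior + guidance

def sample_latents(A, y, sigma, n_chains=512, n_steps=2000, step=0.01, seed=0):
    """Euler-Maruyama discretization of the Langevin SDE dz = score dt + sqrt(2) dW."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_chains, A.shape[1]))
    for _ in range(n_steps):
        noise = rng.standard_normal(z.shape)
        z = z + step * posterior_score(z, A, y, sigma) + np.sqrt(2 * step) * noise
    return z

def make_toy_problem(n_genes=20, n_latent=4, n_panel=5, sigma=0.5, seed=1):
    """Synthetic setup: a linear decoder and a noisy measurement of a few panel genes."""
    rng = np.random.default_rng(seed)
    W = 0.5 * rng.standard_normal((n_genes, n_latent))        # assumed decoder: genes = W @ z
    panel = rng.choice(n_genes, size=n_panel, replace=False)  # indices of measured genes
    z_true = rng.standard_normal(n_latent)
    y = W[panel] @ z_true + sigma * rng.standard_normal(n_panel)
    return W, panel, y

W, panel, y = make_toy_problem()
A = W[panel]                              # decoder rows for the measured genes only
z = sample_latents(A, y, sigma=0.5)
imputed = z.mean(axis=0) @ W.T            # posterior-mean expression for all 20 genes
print(imputed.shape)
```

In this linear-Gaussian toy the posterior is known in closed form, so the sampler can be checked against it. A real model would replace the analytic prior score with a learned network and the linear decoder with a nonlinear one, at which point no closed-form check is available.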

#Applications

STPAINTER is designed for spatial cancer biology, where researchers need full transcriptomic context but imaging-based platforms only measure a limited gene panel. By imputing genome-wide expression and supplying informative latent representations, it supports tumor microenvironment analysis, subpopulation discovery, and pathway-level interpretation directly from sparse spatial data, without commissioning additional sequencing or assembling a matched scRNA-seq reference for each tissue.

#Impact

STPAINTER moves spatial transcriptomics enhancement from per-dataset, reference-dependent methods toward a pretrained, reference-free foundation-model paradigm with zero-shot transfer across cancer types. Its use of a pan-cancer atlas and CODEX cross-validation strengthens confidence in the biological plausibility of its imputations. The work is so far only a February 2026 bioRxiv preprint, however: the release of code and weights is not yet confirmed, and broader adoption will depend on independent benchmarking against established imputation methods. The CC BY-NC-ND license also limits derivative use.

Tags

imputation, gene_expression, diffusion, foundation_model, zero_shot, spatial_transcriptomics, cancer