bio.rodeo

The authoritative source for evaluating biological foundation models. No hype, just honest analysis.

© 2026 Pulsatance. All rights reserved.

SHEST

Samsung Advanced Institute for Health Sciences and Technology / Samsung Medical Center / Sungkyunkwan University

A multi-task framework that predicts single-cell type composition and reconstructs spatial gene expression directly from H&E histology using a frozen pathology foundation model backbone.

Released: November 2025

SHEST (Single-cell-level H&E Spatial Transcriptomics) is a multi-task deep learning framework that infers cellular biology directly from routine hematoxylin and eosin (H&E) histology. From a stained tissue image alone, it both predicts the cell-type composition of the tissue and reconstructs spatially resolved gene expression at single-cell resolution — bridging conventional histopathology with spatial transcriptomics without requiring a molecular assay.

The model addresses a practical bottleneck in tumor microenvironment research: spatial transcriptomics platforms are costly and not yet routine in clinical pathology, whereas H&E slides are produced for nearly every tissue specimen. By learning the relationship between tissue morphology and underlying molecular and cellular state, SHEST extracts cell-type and expression information from the inexpensive, ubiquitous H&E modality, making single-cell-level spatial analysis more accessible.

SHEST was developed by Hoyeon Jeong, Junghan Oh, Donggeon Lee, Jae Hwan Kang, and Yoon-La Choi at the Samsung Advanced Institute for Health Sciences and Technology (SAIHST), Samsung Medical Center, and Sungkyunkwan University in Seoul, South Korea. It was posted as a bioRxiv preprint in November 2025 and published in Briefings in Bioinformatics in 2026.

Key Features

  • H&E-only single-cell inference: Predicts cell-type composition and reconstructs gene expression at single-cell resolution from a standard H&E slide, with no paired molecular measurement required at inference time.
  • Frozen foundation model backbone: Builds on the H-optimus-0 pathology vision transformer as a frozen feature extractor, attaching two lightweight task-specific heads rather than retraining the encoder.
  • Multi-task design: Jointly handles cell-type classification and spatial expression reconstruction, integrating cell classification, gene mapping, and spatial analysis into a single interpretable system.
  • Zero-shot whole-slide application: Runs on new whole-slide images out of the box (e.g., python he.py --wsi <file>), pairing Cellpose nuclear segmentation with the SHEST heads to output cell-level h5ad expression and GeoJSON annotations.
  • Validated on tumor tissue: Achieves F1 scores of 0.97 for tumor cells and 0.91 for lymphocytes in lung adenocarcinoma, with external validation supporting generalizability.
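For readers less familiar with the metric, the per-class F1 scores quoted above are the harmonic mean of precision and recall, computed with one cell type treated as the positive class. The sketch below is only an illustration: the labels are made up and are not SHEST outputs, and `f1_for_class` is a hypothetical helper name rather than anything from the SHEST codebase.

```python
def f1_for_class(y_true, y_pred, positive):
    """Per-class F1: harmonic mean of precision and recall for `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up per-cell labels, for illustration only.
y_true = ["tumor", "tumor", "tumor", "lymphocyte", "lymphocyte", "fibroblast"]
y_pred = ["tumor", "tumor", "lymphocyte", "lymphocyte", "lymphocyte", "fibroblast"]

tumor_f1 = f1_for_class(y_true, y_pred, "tumor")
lymphocyte_f1 = f1_for_class(y_true, y_pred, "lymphocyte")
print(f"tumor F1={tumor_f1:.2f}, lymphocyte F1={lymphocyte_f1:.2f}")
```

In this toy case one tumor cell is mislabeled as a lymphocyte, which lowers recall for the tumor class and precision for the lymphocyte class, so both F1 values fall below 1.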

Technical Details

SHEST uses a frozen H-optimus-0 vision transformer backbone, a large pathology foundation model trained on histology images, and adds two task-specific heads: one for cell-type prediction and one for gene-expression reconstruction. Inputs use a quadruple-tile strategy that aggregates morphological context around each segmented nucleus, with specialized clustering used to organize predictions.

The model resolves six cell types in its lung adenocarcinoma setting: tumor (LUAD) cells, alveolar cells, macrophages, endothelial cells, fibroblasts, and lymphocytes. On held-out evaluation it reports F1 scores of 0.97 (tumor cells) and 0.91 (lymphocytes). Reconstructed expression reproduces known cell-type-specific marker patterns while preserving spatial relationships and multicellular niche structure.

The released checkpoint combines the trained task heads with the frozen backbone. The implementation targets Python 3.10 and PyTorch 2.6.0 and uses Cellpose for nuclear segmentation.
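The frozen-backbone, two-head layout described above can be pictured with a toy sketch in plain Python. This is not the authors' implementation: `frozen_backbone` is a stand-in that merely pools pixel values (the real encoder is a vision transformer), the heads are assumed to be single linear layers because the source only calls them lightweight, the weights are random, and the embedding size and gene count are arbitrary illustrative values.

```python
import math
import random

CELL_TYPES = ["tumor", "alveolar", "macrophage",
              "endothelial", "fibroblast", "lymphocyte"]
EMBED_DIM = 8  # toy embedding size (assumption; real embeddings are far larger)
N_GENES = 4    # toy gene panel size (assumption)

rng = random.Random(0)


def frozen_backbone(tile):
    """Stand-in for a frozen encoder: a fixed map with no trainable state."""
    return [sum(tile[i::EMBED_DIM]) / len(tile[i::EMBED_DIM])
            for i in range(EMBED_DIM)]


def make_linear(n_out, n_in):
    """Random weights for a hypothetical single-layer head."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


def linear(weights, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Only these two heads would be trained; the backbone stays fixed.
cell_type_head = make_linear(len(CELL_TYPES), EMBED_DIM)
expression_head = make_linear(N_GENES, EMBED_DIM)


def predict(tile):
    features = frozen_backbone(tile)                   # shared features
    probs = softmax(linear(cell_type_head, features))  # task 1: cell type
    expression = linear(expression_head, features)     # task 2: expression
    return probs, expression


tile = [rng.random() for _ in range(64)]  # fake pixel intensities
probs, expression = predict(tile)
predicted_type = CELL_TYPES[probs.index(max(probs))]
print(predicted_type, [round(v, 3) for v in expression])
```

The point of the layout is that both tasks read the same embedding, so adding or retraining a head does not require touching the much larger encoder.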

Applications

SHEST is aimed at researchers and pathologists studying the tumor microenvironment who want spatially resolved cellular and molecular readouts without running an expensive spatial transcriptomics experiment. Because it operates on standard H&E whole-slide images, it can be applied retrospectively to archival slides to map cell-type composition and infer gene expression across a section, supporting tasks such as immune-infiltration assessment, niche characterization, and hypothesis generation for downstream molecular validation.
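As a rough illustration of the kind of downstream summary this enables, the sketch below turns a list of per-cell predicted labels into cell-type fractions and reads off a lymphocyte fraction as a crude immune-infiltration proxy. The labels are invented, `composition` is a hypothetical helper, and how predictions are actually loaded from SHEST's output files is deliberately left out.

```python
from collections import Counter


def composition(cell_labels):
    """Fraction of cells assigned to each predicted cell type."""
    counts = Counter(cell_labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


# Invented per-cell predictions for one tissue region (illustration only).
predicted = ["tumor"] * 6 + ["lymphocyte"] * 3 + ["fibroblast"] * 1

fractions = composition(predicted)
lymphocyte_fraction = fractions.get("lymphocyte", 0.0)
print(fractions, lymphocyte_fraction)
```

Computing the same fractions per region rather than per slide would give a coarse spatial map of where immune cells concentrate.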

Impact

By demonstrating that single-cell-level cell typing and spatial expression can be recovered from H&E morphology, SHEST advances a growing line of work that repurposes ubiquitous histology images as a proxy for costly molecular assays. Its strong reported accuracy on tumor and immune cells, external validation, and released code and weights make it a practical reference point for histology-to-transcriptomics modeling. Access is not fully open, however: the journal article is released under CC BY-NC 4.0, and the Hugging Face weights are gated behind a contact-sharing agreement. Key limitations follow from its training scope: the task heads were trained on lung adenocarcinoma with a fixed six-cell-type taxonomy, so performance on other tissues, cancer types, or cell populations will require further validation.

Citations

Preprint

Jeong, H., et al. (2025) SHEST: single-cell-level artificial intelligence from haematoxylin and eosin morphology for cell-type prediction and spatial transcriptomics reconstruction. bioRxiv. DOI: 10.1101/2025.11.19.689364

Journal article

Jeong, H., et al. (2026) SHEST: single-cell-level artificial intelligence from haematoxylin and eosin morphology for cell-type prediction and spatial transcriptomics reconstruction. Briefings in Bioinformatics. DOI: 10.1093/bib/bbag037

Citations

Total Citations: 0
Influential: 0
References: 41

GitHub

Stars: 1
Forks: 0
Open Issues: 2
Contributors: 2
Last Push: 7d ago
Language: Python

HuggingFace

Downloads: 0
Likes: 1
Last Modified: 2mo ago

Openness

bio.rodeo openness: Closed (low usability and reproducibility)
Score: 16 (Closed)
Usability (can I run it?): 19
Reproducibility (can I retrain it?): 11
Model Openness Framework: Unclassified (restrictive license on core components)

Tags

cell_type_annotation, gene_expression, histology, multi_task, spatial_transcriptomics, spatial_transcriptomics_reconstruction, transfer_learning, vision_transformer, zero_shot

Resources

GitHub Repository, Research Paper, Research Paper, HuggingFace Model