BEPH

Histopathology foundation model pretrained with BEiT masked image modeling on 11M+ tissue image tiles for cancer diagnosis and survival prediction.

Released: March 2025

Parameters: 86 Million

BEPH (BEiT-based Pre-training on Histopathological images) is a self-supervised foundation model for computational pathology that learns transferable visual representations from hematoxylin and eosin (H&E) stained tissue images. Developed by the Yu Lab at Shanghai Jiao Tong University and published in Nature Communications in March 2025, BEPH addresses a central bottleneck in digital pathology: most clinically relevant tasks have only modest amounts of labeled data, making it difficult to train accurate models from scratch. By pretraining on millions of unlabeled image tiles, BEPH provides a general-purpose encoder that can be efficiently fine-tuned for a wide range of cancer-related tasks.

The model is notable for adapting the BEiT v2 masked image modeling paradigm, originally developed for natural images, to gigapixel histopathology. Rather than predicting raw pixels, BEiT-style pretraining reconstructs discrete visual tokens for masked patches, which encourages the encoder to learn high-level semantic structure relevant to tissue morphology. BEPH demonstrates that this approach yields representations that transfer well across cancer types and across the patch, whole-slide, and prognostic levels of analysis.
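The masked-token objective described above can be illustrated with a small NumPy sketch. It is not BEPH's training code: the function name `masked_token_loss`, the 14×14 patch grid (which assumes 16-pixel patches on a 224×224 tile), the vocabulary size, and the mask ratio are all illustrative assumptions, and random numbers stand in for real encoder outputs and tokenizer targets.

```python
import numpy as np

def masked_token_loss(logits, target_tokens, mask):
    """Cross-entropy over masked patches only.

    logits:        (num_patches, vocab_size) predictions for each patch
    target_tokens: (num_patches,) integer token ids from a visual tokenizer
    mask:          (num_patches,) boolean, True where the patch was masked
    """
    # Numerically stable log-softmax over the vocabulary axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Negative log-likelihood of the target token at every patch.
    nll = -log_probs[np.arange(len(target_tokens)), target_tokens]
    # Average over masked positions only; visible patches carry no loss.
    return nll[mask].mean()

rng = np.random.default_rng(0)
num_patches, vocab_size = 196, 8192      # assumed values, for illustration only
logits = rng.normal(size=(num_patches, vocab_size))
target_tokens = rng.integers(0, vocab_size, size=num_patches)
mask = rng.random(num_patches) < 0.4     # assumed mask ratio
loss = masked_token_loss(logits, target_tokens, mask)
print(f"masked-token loss on random inputs: {loss:.3f}")
```

Because the targets are discrete token ids rather than pixel values, the loss is a classification loss over a codebook, which is the property the paragraph above credits with encouraging higher-level features.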

BEPH fits alongside other pathology foundation models such as UNI, CONCH, and Virchow, but distinguishes itself with a deliberately lightweight design (an ~86M-parameter ViT-Base backbone) that lowers the barrier to local deployment and fine-tuning on commodity hardware.

Key Features

BEiT v2 masked image modeling: BEPH pretrains a Vision Transformer by reconstructing discrete visual tokens for masked image patches, learning semantic tissue features without any manual annotation.
Large-scale pretraining: The encoder is trained on roughly 11.77 million image tiles extracted from 11,760 TCGA whole-slide images spanning 32 cancer types, giving broad morphological coverage.
Multi-level task support: Learned features are adapted to patch-level cancer detection, weakly supervised whole-slide-image (WSI) classification, and patient-level survival prediction.
Lightweight and deployable: With an ~86M-parameter ViT-Base backbone, BEPH is small enough to fine-tune and run on standard workstation GPUs.
Strong transfer performance: BEPH improves on ResNet and DINO baselines by up to 8.8% and 7.2%, respectively, on WSI classification, and by 6.44% and 3.28%, respectively, on average for survival prediction.
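The tile selection behind the pretraining set can be sketched as a simple tissue-fraction filter. The 75% minimum is the figure reported for BEPH's pretraining tiles; everything else here is an assumption for illustration, including the helper names `tissue_fraction` and `keep_tile` and the brightness cutoff used to separate tissue from the white slide background. The authors' actual tissue-detection method may differ.

```python
import numpy as np

def tissue_fraction(tile_rgb, background_threshold=220):
    """Fraction of pixels darker than the near-white slide background.

    tile_rgb: (H, W, 3) uint8 array. A pixel counts as tissue when its mean
    intensity is below `background_threshold` (a hypothetical cutoff).
    """
    gray = tile_rgb.astype(np.float32).mean(axis=2)
    return float((gray < background_threshold).mean())

def keep_tile(tile_rgb, min_tissue=0.75):
    """Keep tiles whose tissue fraction meets the minimum."""
    return tissue_fraction(tile_rgb) >= min_tissue

# Synthetic 224x224 tiles: white background with a pinkish "tissue" band.
blank = np.full((224, 224, 3), 255, dtype=np.uint8)
mostly_tissue = blank.copy()
mostly_tissue[:180] = (200, 120, 180)   # 180 of 224 rows covered
half_tissue = blank.copy()
half_tissue[:112] = (200, 120, 180)     # 112 of 224 rows covered

print(keep_tile(mostly_tissue), keep_tile(half_tissue))
```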

Technical Details

BEPH uses a ViT-Base backbone (~86M parameters) pretrained with the BEiT v2 self-supervised objective, in which a VQ-KD visual tokenizer supplies discrete target tokens for masked patches. Pretraining data consist of 224×224 tiles sampled at 40× magnification with at least 75% tissue content, drawn from TCGA diagnostic slides. For downstream WSI tasks, tile features are aggregated using the CLAM attention-based multiple-instance-learning framework. Across reported benchmarks, BEPH reaches AUCs of 0.994 for renal cell carcinoma subtyping, 0.970 for non-small cell lung cancer subtyping, and 0.946 for breast cancer subtyping, and achieves concordance indices in the 0.59–0.71 range for survival prediction across six TCGA cohorts (BRCA, CRC, CCRCC, PRCC, LUAD, STAD). Code and pretrained weights are released under the GPL-3.0 license.
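The aggregation step for WSI tasks can be illustrated with a generic gated-attention pooling sketch in NumPy. This is not CLAM's implementation: the function name `gated_attention_pool`, the hidden size, and the random weights are placeholders, and the 768-dimensional features assume a standard ViT-Base embedding width.

```python
import numpy as np

def gated_attention_pool(features, V, U, w):
    """Aggregate tile features into a single slide-level embedding.

    features: (num_tiles, dim) tile embeddings from the encoder
    V, U:     (dim, hidden) projection matrices
    w:        (hidden,) scoring vector
    Returns (slide_embedding, attention_weights).
    """
    # Gated attention: a tanh branch modulated by a sigmoid gate.
    gate = 1.0 / (1.0 + np.exp(-(features @ U)))
    scores = (np.tanh(features @ V) * gate) @ w      # one score per tile
    # Softmax over tiles so the attention weights sum to 1.
    scores = scores - scores.max()
    weights = np.exp(scores) / np.exp(scores).sum()
    # Slide embedding is the attention-weighted average of tile features.
    return weights @ features, weights

rng = np.random.default_rng(0)
num_tiles, dim, hidden = 500, 768, 128               # assumed sizes
features = rng.normal(size=(num_tiles, dim))
V = rng.normal(size=(dim, hidden)) * 0.01
U = rng.normal(size=(dim, hidden)) * 0.01
w = rng.normal(size=hidden)

slide_embedding, attention = gated_attention_pool(features, V, U, w)
print(slide_embedding.shape, attention.sum())
```

In a trained model the attention weights also indicate which tiles contributed most to the slide-level prediction, which is why attention-based pooling is a common choice when only slide-level labels are available.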

Applications

BEPH is designed for computational pathology researchers and clinical AI developers who need a strong starting point for cancer-image analysis. Typical use cases include detecting malignancy in tissue patches, classifying cancer subtypes from whole-slide images, and stratifying patients by predicted survival risk to support prognosis. Because the backbone is lightweight and the weights are openly available, smaller labs can fine-tune BEPH on their own annotated datasets without the compute demands of larger pathology foundation models, making it suitable for both methods research and translational pipeline development.
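Survival-risk predictions of this kind are usually scored with the concordance index cited in Technical Details. The sketch below is a plain-Python version of Harrell's C-index on toy data. The function name and the example values are illustrative and are not drawn from the BEPH evaluation.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times:  follow-up time per patient
    events: 1 if the event was observed, 0 if censored
    risks:  predicted risk score (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5   # ties in risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy cohort: risk decreases as survival time increases.
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.7, 0.4, 0.1]
c_index = concordance_index(times, events, risks)
print(c_index)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which puts the reported 0.59–0.71 range in context.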

Impact

By showing that a BEiT-based masked-image-modeling backbone can rival or exceed contrastive and supervised baselines across diagnosis, subtyping, and survival tasks, BEPH reinforced masked image modeling as a viable pretraining strategy for histopathology. Its publication in Nature Communications, paired with an openly released GPL-3.0 codebase and pretrained weights, lowered the practical barrier to adopting foundation models in pathology research. The model's emphasis on a compact, deployable architecture offers a useful counterpoint to the trend toward ever-larger pathology encoders. Its main limitation is that the pretraining data come from TCGA, so generalization to other scanners, stains, and populations warrants further external validation.

Citation

A foundation model for generalizable cancer diagnosis and survival prediction from histopathological images

Yang, Z., et al. (2025) A foundation model for generalizable cancer diagnosis and survival prediction from histopathological images. Nature Communications.

DOI: 10.1038/s41467-025-57587-y

Recent citations

Papers that recently cited this model.

Hypergraph-graph collaborative modeling for the prediction of benefit from immunotherapy in non-small cell lung cancer
Hanchen Wang, Baisen Cong, W. C. Cho, et al.
Engineering applications of artificial intelligence · Oct 2026 · 0 citations
Foundation Models in Urological Pathology: Possible Impact and Potential Pitfalls
Xinhao Zhu, Qi Liu, Yujia Xia, et al.
Asian Journal of Urology · Jul 2026 · 0 citations
Structurally and attention-adaptive explainable deep learning for multimodal medical diagnosis using the unified fuzzy membership function
Sajid Hussain, M. Waqas, Songhua Xu, et al.
Information Sciences · Jul 2026 · 0 citations

Top citations

The most-cited papers that cite this model.

Artificial intelligence in digital pathology — time for a reality check
Arpit Aggarwal, Satvika Bharadwaj, Germán Corredor, et al.
Nature Reviews Clinical Oncology · Feb 2025 · 46 citations
A Comprehensive Review of Deep Learning Applications with Multi-Omics Data in Cancer Research
Flavio Sartori, Francesco Codicé, Isabella Caranzano, et al.
Genes · May 2025 · 41 citations
PathFinder: A Multi-Modal Multi-Agent System for Medical Diagnostic Decision-Making Applied to Histopathology
Fatemeh Ghezloo, M. S. Seyfioglu, Rustin Soraki, et al.
IEEE International Conference on Computer Vision · Feb 2025 · 35 citations
A Survey on Computational Pathology Foundation Models: Datasets, Adaptation Strategies, and Evaluation Tasks
Dong Li, Guihong Wan, Xintao Wu, et al.
arXiv.org · Jan 2025 · 24 citations
From Classical Machine Learning to Emerging Foundation Models: Review on Multimodal Data Integration for Cancer Research
A. Muneer, M. Waqas, Maliazurina B. Saad, et al.
Artificial Intelligence Review · Jul 2025 · 19 citations

Citations

Total Citations: 96
Influential: 3
References: 65

GitHub

Stars: 77
Forks: 6
Open Issues: 7
Contributors: 1
Last Push: 1y ago
Language: Python
License: GPL-3.0

Fields of citing research

Medicine: 97%
Computer Science: 92%
Biology: 16%
Engineering: 13%
Environmental Science: 2%
Chemistry: 1%
Mathematics: 1%

Share of papers citing this model.

Openness

bio.rodeo openness: Fully open · usable and reproducible
Overall score: 71 (Open)
Usability (can I run it?): 82
Reproducibility (can I retrain it?): 56
Model Openness Framework: Class III (Open Model)

Resources

GitHub Repository
Research Paper
