FM-CT

Self-supervised 3D vision foundation model for non-contrast head CT, pretrained on 361,663 scans to detect a broad range of intracranial disease.

Released: February 2025

FM-CT (also referred to as FM-HCT, "Foundation Model for Head CT" in its code release) is a 3D vision foundation model for non-contrast head computed tomography. Rather than training a separate narrow classifier for each condition, FM-CT learns general-purpose volumetric representations from large collections of unlabeled head CT scans, providing a reusable encoder that can be adapted — often with limited labeled data — to detect a broad range of intracranial diseases.

The model was developed by researchers at NYU Grossman School of Medicine (NYU Langone Health), led by the Razavian Lab, and first released as an arXiv preprint in February 2025 before publication in Nature Biomedical Engineering in 2026. FM-CT addresses a persistent gap in neuroimaging AI: most head CT models are trained on a single task (commonly intracranial hemorrhage) using relatively small annotated datasets, which limits their generalization to new diseases and to scans from unseen sites and scanners.

By demonstrating that self-supervised pretraining on hundreds of thousands of unlabeled 3D scans transfers effectively to clinically meaningful endpoints, FM-CT extends the foundation-model paradigm — already established in protein and genomics modeling — into volumetric clinical neuroimaging.

Key Features

3D volumetric modeling: Operates directly on full 3D head CT volumes rather than treating scans as independent 2D slices, preserving cross-slice anatomical context that is critical for detecting many intracranial findings.
Self-supervised pretraining: Combines self-distillation (DINO-style discrimination) with masked image modeling (MAE), learning representations from unlabeled scans without manual annotation.
Large, diverse pretraining corpus: Pretrained on 361,663 non-contrast 3D head CT scans, providing exposure to wide anatomical and acquisition variation.
Generalizable disease detection: Evaluated across multiple diagnostic tasks and on internal plus three external datasets, including out-of-distribution cases, to test robustness beyond the training site.
Competitive with published and commercial models: Reported to match or outperform existing CT foundation models, including Google's CT Foundation, Merlin, and CT-FM, across the disease-detection tasks assessed.
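To make the "3D volumetric modeling" point concrete, the following is a minimal sketch of how a CT volume can be cut into non-overlapping 3D patches, the usual input tokens for a 3D Vision Transformer. The function name `patchify_3d`, the patch size of 16 voxels per side, and the volume shape are illustrative assumptions, not values taken from the FM-CT paper or code release.

```python
import numpy as np

def patchify_3d(volume, patch=(16, 16, 16)):
    """Split a (D, H, W) volume into flattened, non-overlapping 3D patches.

    Hypothetical helper: the patch size is an assumption, not FM-CT's setting.
    Returns an array of shape (num_patches, pd * ph * pw).
    """
    d, h, w = volume.shape
    pd, ph, pw = patch
    if d % pd or h % ph or w % pw:
        raise ValueError("volume dimensions must be divisible by the patch size")
    # Expose the patch grid and the within-patch offsets as separate axes.
    x = volume.reshape(d // pd, pd, h // ph, ph, w // pw, pw)
    # Bring the three grid axes to the front, the three offset axes to the back.
    x = x.transpose(0, 2, 4, 1, 3, 5)
    # One row per patch; each row holds voxels from several adjacent slices.
    return x.reshape(-1, pd * ph * pw)
```

Each token here spans 16 consecutive slices, so cross-slice context is present in the input itself, whereas a 2D slice-wise model would only see it after some later aggregation step.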

Technical Details

FM-CT uses a 3D Vision Transformer (ViT) backbone pretrained with two complementary self-supervised objectives: self-distillation between augmented views (a DINO-style discrimination loss) and masked image modeling (an MAE-style reconstruction loss). Pretraining drew on a corpus of 361,663 non-contrast 3D head CT scans collected without manual labels. The pretrained encoder was then fine-tuned and evaluated on downstream classification tasks spanning multiple intracranial conditions, with assessment on an internal held-out set and three external datasets to measure both in-distribution and out-of-distribution generalization. Across the evaluated tasks, the model outperformed several published and commercial CT foundation-model baselines, with the largest relative gains in lower-data and out-of-distribution settings.
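The two pretraining objectives described above can be sketched in a few lines of NumPy. This is an illustration of the general DINO-style and MAE-style losses, not the FM-CT implementation: the function names, the temperatures (0.1 and 0.04, borrowed from common DINO defaults), the centering term, and the `mae_weight` used to combine the two losses are all assumptions.

```python
import numpy as np

def log_softmax(x, temp):
    z = x / temp
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def dino_loss(student_logits, teacher_logits, center,
              t_student=0.1, t_teacher=0.04):
    """Cross-entropy between teacher and student distributions over two views.

    The teacher output is centered and sharpened (lower temperature) and is
    treated as a fixed target; temperatures are assumed values.
    """
    p_teacher = np.exp(log_softmax(teacher_logits - center, t_teacher))
    log_p_student = log_softmax(student_logits, t_student)
    return float(-(p_teacher * log_p_student).sum(axis=-1).mean())

def mae_loss(pred_patches, target_patches, mask):
    """Mean squared reconstruction error, averaged over masked patches only.

    pred_patches, target_patches: (batch, num_patches, patch_dim)
    mask: (batch, num_patches), 1 where the patch was hidden from the encoder.
    """
    per_patch = ((pred_patches - target_patches) ** 2).mean(axis=-1)
    return float((per_patch * mask).sum() / max(mask.sum(), 1))

def combined_loss(student_logits, teacher_logits, center,
                  pred_patches, target_patches, mask, mae_weight=1.0):
    # Weighted sum of the two objectives; the weighting is a hypothetical choice.
    return (dino_loss(student_logits, teacher_logits, center)
            + mae_weight * mae_loss(pred_patches, target_patches, mask))
```

The discrimination term pushes two augmented views of the same scan toward the same global representation, while the reconstruction term forces the encoder to model local anatomy well enough to fill in hidden patches.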

Applications

FM-CT is intended as a shared backbone for teams building head CT analysis tools across radiology, neurology, and emergency medicine. Researchers and clinicians can adapt the pretrained encoder to detect conditions such as hemorrhage, infarct, mass effect, and other intracranial abnormalities — typically with far less labeled data than a from-scratch model would require — and to extend detection beyond the single-disease scope of most existing head CT classifiers. Because it generalizes across sites and scanners, it is particularly suited to multi-institution deployment and triage settings.
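One common way to adapt a pretrained encoder with limited labels is a linear probe: the encoder stays frozen and only a small logistic-regression head is trained on its embeddings. The sketch below assumes the embeddings have already been extracted into a NumPy array; `fit_linear_probe`, `predict`, the learning rate, and the step count are hypothetical, and nothing here calls the actual FM-CT code or weights.

```python
import numpy as np

def sigmoid(z):
    # Clip logits to avoid overflow in exp for extreme values.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def fit_linear_probe(features, labels, lr=0.5, steps=500):
    """Train a logistic-regression head on frozen encoder embeddings.

    features: (n_scans, embed_dim) array of precomputed embeddings (assumed).
    labels:   (n_scans,) array of 0/1 labels for one finding.
    Uses plain full-batch gradient descent; hyperparameters are illustrative.
    """
    n, d = features.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        p = sigmoid(features @ w + b)
        grad = p - labels                # gradient of the log loss w.r.t. logits
        w -= lr * (features.T @ grad) / n
        b -= lr * grad.mean()
    return w, b

def predict(features, w, b):
    # Positive logit corresponds to predicted probability above 0.5.
    return (features @ w + b > 0).astype(int)
```

Because only `embed_dim + 1` parameters are trained per finding, a probe like this can be fit on a few hundred labeled scans, which is the low-data regime the paragraph above refers to. Full fine-tuning of the encoder is the heavier alternative.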

Impact

FM-CT demonstrates that large-scale self-supervised pretraining on 3D head CT can produce a single encoder that generalizes across many intracranial diseases and across institutions, outperforming the task-specific and commercial baselines it was compared against while reducing the labeled data needed for new tasks. Its publication in Nature Biomedical Engineering and accompanying code release establish it as a notable reference point for volumetric medical-imaging foundation models. Key limitations are practical. Pretrained weights are not openly distributed: because facial features can be reconstructed from head CT volumes, access requires an institutional data-sharing agreement via the NYU Langone Data Sharing Strategy Board. The released code carries a non-commercial, no-derivatives (CC BY-NC-ND 4.0) license, and site-specific validation is still required before any clinical use.

Citations

Zhu, W., et al. (2026). 3D foundation model for generalizable disease detection in head computed tomography. Nature Biomedical Engineering.

DOI: 10.1038/s41551-026-01668-w

Preprint

Zhu, W., et al. (2025). 3D foundation model for generalizable disease detection in head computed tomography. arXiv preprint arXiv:2502.02779.

DOI: 10.48550/arXiv.2502.02779

Recent citations

Papers that recently cited this model.

OFMFS: Optimizing feature diversity via multi-perspective self-supervised tasks for few-shot image classification
Dongqing Li, Linhua Zou, Wencheng Lin, et al.
Information Sciences · Jul 2026 · 0 citations

Large-Scale AI and Foundation Models for Neuroscience: A Comprehensive Review
Shihao Yang, Xiying Huang, Danilo Bernardo, et al.
Meta-Radiology · Jun 2026 · 0 citations

Advanced Multimodal AI for Predicting Long-Term Functional Outcomes After Ischemic Stroke Using Only Admission Data
Fiona McBride, Haoxu Huang, Anjali Kiran Kapoor, et al.
medRxiv · May 2026 · 0 citations

Top citations

The most-cited papers that cite this model.

Multimodal generative AI for interpreting 3D medical images and videos
Jung-Oh Lee, Hong-Yu Zhou, T. Berzin, et al.
npj Digital Medicine · May 2025 · 18 citations

FetalCLIP: A Visual-Language Foundation Model for Fetal Ultrasound Image Analysis
F. Maani, Numan Saeed, T. Saleem, et al.
arXiv.org · Feb 2025 · 15 citations

Towards Scalable Language-Image Pre-training for 3D Medical Imaging
Chenhui Zhao, Yiwei Lyu, Asadur Chowdury, et al.
Trans. Mach. Learn. Res. · May 2025 · 10 citations · Influential

Revisiting 2D Foundation Models for Scalable 3D Medical Image Classification
Han Liu, Bogdan Georgescu, Yanbo Zhang, et al.
arXiv.org · Dec 2025 · 5 citations

AI Ecosystem and Value Chain: A Multi-Layered Framework for Analyzing Supply, Value Creation, and Delivery Mechanisms
R. K. Billones, Dan Arris S. Lauresta, Jeffrey T. Dellosa, et al.
Technologies · Sep 2025 · 4 citations · Influential

Citations

Total Citations: 13

Influential: 4

References: 70

GitHub

Stars: 62

Forks: 11

Open Issues: 1

Contributors: 2

Last Push: 3 months ago

Language: Jupyter Notebook

Fields of citing research

Share of papers citing this model.

Computer Science: 92%
Medicine: 69%
Engineering: 31%
Business: 8%

Openness

bio.rodeo openness: Closed · low usability and reproducibility

Score: 26 (Closed)

Usability (can I run it?): 24

Reproducibility (can I retrain it?): 10

Model Openness Framework: Unclassified

Restrictive license on core components

Resources

GitHub Repository Research Paper Research Paper
