NYU Grossman School of Medicine
A 3D self-supervised foundation model for non-contrast head CT, pretrained on 361,663 scans for generalizable detection of intracranial disease.
FM-CT (also referred to as FM-HCT, "Foundation Model for Head CT" in its code release) is a 3D vision foundation model for non-contrast head computed tomography. Rather than training a separate narrow classifier for each condition, FM-CT learns general-purpose volumetric representations from large collections of unlabeled head CT scans, providing a reusable encoder that can be adapted — often with limited labeled data — to detect a broad range of intracranial diseases.
The model was developed by researchers at NYU Grossman School of Medicine (NYU Langone Health), led by the Razavian Lab, and first released as an arXiv preprint in February 2025 before publication in Nature Biomedical Engineering in 2026. FM-CT addresses a persistent gap in neuroimaging AI: most head CT models are trained on a single task (commonly intracranial hemorrhage) using relatively small annotated datasets, which limits their generalization to new diseases and to scans from unseen sites and scanners.
By demonstrating that self-supervised pretraining on hundreds of thousands of unlabeled 3D scans transfers effectively to clinically meaningful endpoints, FM-CT extends the foundation-model paradigm — already established in protein and genomics modeling — into volumetric clinical neuroimaging.
FM-CT uses a 3D Vision Transformer (ViT) backbone pretrained with two complementary self-supervised objectives: self-distillation between augmented views (a DINO-style discrimination loss) and masked image modeling (an MAE-style reconstruction loss). Pretraining drew on a corpus of 361,663 non-contrast 3D head CT scans collected without manual labels. The pretrained encoder was then fine-tuned and evaluated on downstream classification tasks spanning multiple intracranial conditions, with assessment on an internal held-out set and three external datasets to measure both in-distribution and out-of-distribution generalization. Across the evaluated tasks, the model outperformed several published and commercial CT foundation-model baselines, with the largest relative gains in lower-data and out-of-distribution settings.
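The two pretraining objectives described above can be illustrated with a minimal, dependency-free sketch. It is not the released FM-CT code: the function names, temperatures, EMA momentum, patch size, and mask ratio below are illustrative assumptions, and plain Python lists stand in for the tensors a real implementation would use. The self-distillation part follows the general DINO recipe (a centered, sharpened teacher distribution supervising the student, with the teacher updated as an exponential moving average of the student), and the masked-modeling part scores reconstruction only on hidden 3D patches.

```python
import math
import random


def log_softmax(logits, temperature):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_total = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_total for x in scaled]


def self_distillation_loss(student_logits, teacher_logits, center,
                           student_temp=0.1, teacher_temp=0.04):
    """DINO-style loss: cross-entropy from teacher to student distribution.

    The teacher output is centered and sharpened with a lower temperature.
    Both temperatures are assumed values, not taken from the FM-CT paper.
    """
    centered = [t - c for t, c in zip(teacher_logits, center)]
    teacher_probs = [math.exp(lp) for lp in log_softmax(centered, teacher_temp)]
    student_log_probs = log_softmax(student_logits, student_temp)
    return -sum(p * lp for p, lp in zip(teacher_probs, student_log_probs))


def ema_update(teacher_params, student_params, momentum=0.996):
    """Move teacher parameters toward the student (momentum is an assumption)."""
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher_params, student_params)]


def random_patch_mask(volume_shape, patch_size, mask_ratio, rng):
    """Choose which 3D patches to hide for masked image modeling.

    Returns the total patch count and a sorted list of masked patch indices.
    """
    grid = [dim // p for dim, p in zip(volume_shape, patch_size)]
    num_patches = grid[0] * grid[1] * grid[2]
    num_masked = int(num_patches * mask_ratio)
    return num_patches, sorted(rng.sample(range(num_patches), num_masked))


def masked_reconstruction_loss(predicted, target, masked_indices):
    """Mean squared error computed over masked patches only.

    Each patch is a flat list of voxel values; visible patches are ignored.
    """
    total, count = 0.0, 0
    for i in masked_indices:
        for p, t in zip(predicted[i], target[i]):
            total += (p - t) ** 2
            count += 1
    return total / count
```

In a training loop of this shape, the two losses would typically be summed (possibly with a weighting factor) for each batch, and `ema_update` would be applied to the teacher after every student optimizer step. How FM-CT weights or schedules the two terms is not stated here, so any combination rule is an assumption.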
FM-CT is intended as a shared backbone for teams building head CT analysis tools across radiology, neurology, and emergency medicine. Researchers and clinicians can adapt the pretrained encoder to detect conditions such as hemorrhage, infarct, mass effect, and other intracranial abnormalities — typically with far less labeled data than a from-scratch model would require — and to extend detection beyond the single-disease scope of most existing head CT classifiers. Because it generalizes across sites and scanners, it is particularly suited to multi-institution deployment and triage settings.
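One common way to adapt a pretrained encoder with few labels is a linear probe: the encoder is kept frozen and only a small classifier is trained on its embeddings. The sketch below shows that idea with a plain-Python logistic regression. It is a swapped-in illustration rather than the authors' fine-tuning procedure, and the embeddings are made-up toy values standing in for encoder outputs; `train_linear_probe` and `predict_proba` are hypothetical helper names.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, embedding):
    """Probability of the positive class for one frozen embedding."""
    return sigmoid(sum(w * x for w, x in zip(weights, embedding)) + bias)


def train_linear_probe(embeddings, labels, epochs=200, lr=0.5):
    """Fit logistic regression on frozen embeddings with batch gradient descent."""
    dim = len(embeddings[0])
    n = len(embeddings)
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(embeddings, labels):
            err = predict_proba(weights, bias, x) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Toy stand-ins for frozen encoder embeddings (hypothetical values, not model outputs).
# Label 1 could mean "finding present", label 0 "finding absent".
embeddings = [[1.0, 0.2], [0.9, 0.1], [0.8, 0.3],
              [-1.0, 0.1], [-0.8, -0.2], [-0.9, 0.0]]
labels = [1, 1, 1, 0, 0, 0]
weights, bias = train_linear_probe(embeddings, labels)
```

Because only the probe's weights are learned, this setup needs far fewer labeled scans than training a full 3D network, which is the property the paragraph above relies on. Full fine-tuning of the encoder is the alternative when more labels are available.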
FM-CT demonstrates that large-scale self-supervised pretraining on 3D head CT produces a single encoder that generalizes across many intracranial diseases and across institutions, outperforming task-specific and commercial baselines while reducing the labeled data needed for new tasks. Its publication in Nature Biomedical Engineering and accompanying code release establish it as a notable reference point for volumetric medical-imaging foundation models. Its key limitations are practical. Pretrained weights are not openly distributed: because facial features can be reconstructed from head CT volumes, access requires an institutional data-sharing agreement via the NYU Langone Data Sharing Strategy Board. The released code carries a non-commercial, no-derivatives (CC BY-NC-ND 4.0) license, and site-specific validation is still required before any clinical use.
Zhu, W., et al. (2026) 3D foundation model for generalizable disease detection in head computed tomography. Nature Biomedical Engineering.
DOI: 10.1038/s41551-026-01668-w
Zhu, W., et al. (2025) 3D foundation model for generalizable disease detection in head computed tomography. arXiv preprint.
DOI: 10.48550/arXiv.2502.02779