A SimCLR self-supervised foundation model for 3D brain MRI, pretrained on 18,759 patients across 11 neurological-disease datasets for diverse diagnostic tasks.
3D-Neuro-SimCLR is a self-supervised foundation model for volumetric brain MRI analysis, developed by researchers at McGill University and Mila - Quebec Artificial Intelligence Institute and introduced in a September 2025 preprint. Most deep-learning models for brain MRI are trained from scratch on a single task and a single disease cohort, which limits their ability to generalize across neurological conditions, scanners, and patient populations. 3D-Neuro-SimCLR addresses this by learning general-purpose representations directly from large, heterogeneous collections of unlabeled scans.
The model adapts SimCLR-style contrastive pretraining to 3D T1-weighted structural MRI. Rather than relying on disease labels, it learns to pull together augmented views of the same scan and push apart views of different scans, building an embedding space that captures broadly transferable anatomical features. This pretrained encoder can then be fine-tuned for a wide range of downstream clinical prediction tasks with relatively little labeled data.
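The pull-together, push-apart objective described above is commonly written as the NT-Xent loss used in SimCLR. The following is a minimal, dependency-free sketch of that loss, not the authors' implementation. The function names, the toy 2-dimensional embeddings, and the temperature of 0.1 are assumptions chosen for illustration. A real pipeline would compute this on batched PyTorch tensors produced by the encoder and projection head.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def nt_xent_loss(views_a, views_b, temperature=0.1):
    """NT-Xent loss for paired embeddings (temperature is an assumed value).

    views_a[i] and views_b[i] are embeddings of two augmented views of scan i.
    """
    n = len(views_a)
    z = [l2_normalize(v) for v in views_a + views_b]  # 2N unit vectors
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # the other view of the same scan
        # Similarities to every other embedding; self-similarity is excluded.
        logits = [
            sum(p * q for p, q in zip(z[i], z[k])) / temperature
            for k in range(2 * n)
            if k != i
        ]
        pos_logit = sum(p * q for p, q in zip(z[i], z[positive])) / temperature
        # Cross-entropy with the positive as target, using a stable log-sum-exp.
        max_logit = max(logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(value - max_logit) for value in logits)
        )
        total += log_denominator - pos_logit
    return total / (2 * n)


# Toy check: matched views should give a lower loss than mismatched views.
scan_views_a = [[1.0, 0.0], [0.0, 1.0]]
aligned = nt_xent_loss(scan_views_a, [[0.9, 0.1], [0.1, 0.9]])
shuffled = nt_xent_loss(scan_views_a, [[0.1, 0.9], [0.9, 0.1]])
print(aligned < shuffled)
```

The loss falls as the two views of each scan move closer together relative to views of other scans, which is the signal that shapes the embedding space without any disease labels.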
By aggregating 11 publicly available datasets spanning Alzheimer's disease, Parkinson's disease, frontotemporal dementia, stroke, and healthy aging, the work contributes an accessible, broadly applicable foundation model for clinical brain MRI, with pretrained weights released openly for the neuroimaging community.
The encoder is a 3D ResNet-18 convolutional network paired with a 64-dimensional projection head for contrastive learning, implemented in PyTorch with components adapted from SimCLR and MONAI. Pretraining used 44,958 scans from 18,759 patients drawn from ADNI, AOMIC, CoRR, DLBS, GSP, HABS-HD, MCSA, NIFD, PPMI, SALD, and SOOP, with scans preprocessed via TurboPrep and registered to the MNI152 ICBM template. The authors evaluated four downstream tasks: Alzheimer's disease classification (AIBL, AUC 0.929), sex classification (IXI, AUC 0.991), brain age regression (IXI, MAE 4.35 years), and stroke-scale regression (SOOP, MAE 5.37). Across all four tasks, the fine-tuned SimCLR model outperformed both a Masked Autoencoder (ViT-Tiny) self-supervised baseline and fully supervised models trained from scratch.
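For readers less familiar with the two metrics reported above, the sketch below shows how AUC (for the classification tasks) and MAE (for the regression tasks) are defined. It is a plain-Python illustration, not the authors' evaluation code. The function names and all input values are hypothetical toy data and do not come from the paper.

```python
def roc_auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def mean_absolute_error(targets, predictions):
    """Average absolute gap between targets and predictions, in target units."""
    return sum(abs(t - p) for t, p in zip(targets, predictions)) / len(targets)


# Hypothetical classifier outputs (1 = patient, 0 = control).
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# Hypothetical brain-age predictions, in years.
toy_mae = mean_absolute_error([30, 45, 60], [33, 41, 62])

print(toy_auc, toy_mae)
```

An AUC of 1.0 means every positive case is ranked above every negative case and 0.5 corresponds to chance, while MAE is read in the units of the target, for example years for brain age.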
The model is intended as a reusable backbone for clinical and research neuroimaging pipelines. Researchers studying neurodegenerative and cerebrovascular disease can fine-tune it for classification, regression, or biomarker prediction tasks without assembling large labeled datasets, which is especially valuable for rare conditions or small single-site studies. Its label efficiency makes it well suited to settings where expert annotation is scarce, and its broad pretraining distribution supports deployment across cohorts with differing scanners and demographics.
3D-Neuro-SimCLR contributes to a growing effort to bring the foundation-model paradigm to 3D medical imaging, where data scale and annotation cost have historically constrained deep learning. By demonstrating that contrastive pretraining across diverse neurological diseases produces a single transferable brain MRI encoder, and by releasing the weights openly, the work provides a practical starting point for downstream neuroimaging research. Because the work is recent, its long-term adoption remains to be established. Like other structural-MRI models, its evaluation is also bounded by the modalities and populations represented in its public training datasets.
Kaczmarek, E., et al. (2025) Building a General SimCLR Self-Supervised Foundation Model Across Neurological Diseases to Advance 3D Brain MRI Diagnoses. 2025 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW).
DOI: 10.1109/ICCVW69036.2025.00141
arXiv preprint DOI: 10.48550/arXiv.2509.10620