Chinese University of Hong Kong / Massachusetts General Hospital / Yonsei University / University of Sydney / Peking University / University of Georgia / Lehigh University / Emory University
A spatiotemporal foundation model that learns generalizable representations directly from 4D functional MRI volumes for diverse brain-imaging tasks.
Functional MRI (fMRI) measures brain activity through blood-oxygen-level-dependent (BOLD) signals, producing 4D data (three spatial dimensions over time) that is central to cognitive neuroscience and clinical neuroimaging. Yet most fMRI analysis pipelines remain bespoke: studies typically reduce the raw 4D volumes to handcrafted features such as region-of-interest time series or functional connectivity matrices, then train narrow, single-task models that transfer poorly across cohorts, scanners, and sites. This fragmentation contributes to well-documented reproducibility and generalization problems in brain imaging.
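To make the conventional pipeline concrete, the following is a minimal sketch of how a 4D volume is typically reduced to region-of-interest time series and then to a functional connectivity matrix. It assumes a synthetic volume and a toy integer-labelled atlas. The function names `roi_time_series` and `functional_connectivity` are illustrative and do not come from NeuroSTORM or any specific toolkit.

```python
import numpy as np

def roi_time_series(volume, atlas, n_rois):
    """Average the BOLD signal inside each ROI at every time point.

    volume: (X, Y, Z, T) array of BOLD intensities.
    atlas:  (X, Y, Z) integer labels in 1..n_rois (0 would mean background).
    Returns an (n_rois, T) array of mean time series.
    """
    n_timepoints = volume.shape[-1]
    flat = volume.reshape(-1, n_timepoints)   # (voxels, T)
    labels = atlas.reshape(-1)                # (voxels,)
    series = np.zeros((n_rois, n_timepoints))
    for roi in range(1, n_rois + 1):
        series[roi - 1] = flat[labels == roi].mean(axis=0)
    return series

def functional_connectivity(series):
    """Pearson correlation between every pair of ROI time series."""
    return np.corrcoef(series)

# Synthetic example: an 8x8x8 volume with 50 time points and a 4-region toy atlas.
rng = np.random.default_rng(0)
volume = rng.standard_normal((8, 8, 8, 50))
atlas = (np.arange(8 * 8 * 8) % 4 + 1).reshape(8, 8, 8)  # hypothetical labels 1..4

ts = roi_time_series(volume, atlas, n_rois=4)
fc = functional_connectivity(ts)
```

Here the 25,600 values of the volume (512 voxels by 50 time points) collapse to a 4 x 4 matrix. This illustrates how much spatial and temporal detail such handcrafted summaries discard before any model is trained.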
NeuroSTORM (Neuroimaging Foundation Model with Spatial-Temporal Optimized Representation Modeling) addresses this by learning generalizable representations directly from full 4D fMRI volumes, rather than from pre-extracted summaries. The model was developed by a multi-institution collaboration led by the Chinese University of Hong Kong, together with Massachusetts General Hospital, Yonsei University, the University of Sydney, Peking University, the University of Georgia, Lehigh University, and Emory University. It was first released as a preprint in June 2025 and published in Nature Biomedical Engineering in 2026.
Its central contribution is a scalable pretraining recipe paired with lightweight task adaptation, allowing a single pretrained backbone to be transferred efficiently to a wide range of downstream brain-imaging problems instead of training a new network for each study.
NeuroSTORM is a self-supervised foundation model built on a Shifted-Window Mamba backbone and released in several configurations, including Base and Large variants as well as low-resolution and long-sequence variants tuned for different compute and input regimes. Pretraining combines two self-supervised objectives, masked autoencoding and contrastive learning, applied directly to 4D volumes; the pretraining corpus comprises 28.65 million frames from over 50,000 subjects across multiple sites and a wide age range. Downstream evaluation spans large open cohorts such as UK Biobank, the Adolescent Brain Cognitive Development (ABCD) study, the Human Connectome Project, ABIDE, and ADHD-200, alongside multi-hospital clinical datasets covering 17 diagnoses. Across these benchmarks the authors report consistent improvements over prior fMRI methods, with strong results on disease diagnosis and cognitive phenotype prediction, and demonstrate clinical utility on data from hospitals in the United States, South Korea, and Australia.
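As a rough illustration of what masked autoencoding on 4D volumes involves, the sketch below splits a synthetic volume into spatiotemporal patches, hides a random subset, and scores a reconstruction only on the hidden patches. It is a generic masked-autoencoding setup under stated assumptions, not NeuroSTORM's actual masking strategy or loss. The patch size, the 50% mask ratio, and the helper names `patchify_4d`, `random_mask`, and `masked_mse` are all hypothetical.

```python
import numpy as np

def patchify_4d(volume, patch):
    """Split an (X, Y, Z, T) volume into non-overlapping 4D patches.

    Returns an (n_patches, patch_voxels) array, one flattened patch per row.
    """
    X, Y, Z, T = volume.shape
    px, py, pz, pt = patch
    assert X % px == 0 and Y % py == 0 and Z % pz == 0 and T % pt == 0
    v = volume.reshape(X // px, px, Y // py, py, Z // pz, pz, T // pt, pt)
    v = v.transpose(0, 2, 4, 6, 1, 3, 5, 7)   # grid axes first, patch axes last
    return v.reshape(-1, px * py * pz * pt)

def random_mask(n_patches, mask_ratio, rng):
    """Boolean mask with True marking the patches hidden from the encoder."""
    n_masked = int(round(n_patches * mask_ratio))
    mask = np.zeros(n_patches, dtype=bool)
    mask[rng.choice(n_patches, size=n_masked, replace=False)] = True
    return mask

def masked_mse(pred, target, mask):
    """Mean squared reconstruction error over masked patches only."""
    return float(((pred[mask] - target[mask]) ** 2).mean())

rng = np.random.default_rng(0)
volume = rng.standard_normal((8, 8, 8, 4))      # synthetic stand-in for fMRI
patches = patchify_4d(volume, (4, 4, 4, 2))     # 2*2*2*2 = 16 patches
mask = random_mask(len(patches), mask_ratio=0.5, rng=rng)

# The encoder would see only the visible patches; masked ones are zeroed here.
visible = np.where(mask[:, None], 0.0, patches)

# Placeholder "prediction": a real model would reconstruct the masked patches.
loss = masked_mse(visible, patches, mask)
```

In an actual pretraining loop, `visible` would pass through the backbone and a decoder, and the loss would be minimized by gradient descent. The contrastive objective mentioned above would be an additional term and is not shown here.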
NeuroSTORM is intended as a reusable backbone for both neuroscience research and clinical neuroimaging. Researchers can fine-tune or prompt-adapt the pretrained model for demographic and cognitive phenotype prediction, brain-state decoding, and subject re-identification, reducing the per-study engineering and labeling burden. Its demonstrated performance on multi-hospital cohorts across three countries points toward clinical decision-support applications such as psychiatric and neurological disease diagnosis from fMRI, where consistent, transferable representations across sites and scanners are particularly valuable.
NeuroSTORM extends the foundation-model paradigm to 4D functional neuroimaging, a modality that has lagged behind protein, genomic, and natural-image domains in part because of the cost of modeling full spatiotemporal volumes. By showing that a single backbone pretrained on tens of millions of frames can transfer across demographic, phenotypic, diagnostic, retrieval, and state-classification tasks, it offers a concrete template for addressing the reproducibility and transferability challenges that have limited fMRI machine learning. The code is released under the Apache-2.0 license at the CUHK-AIM-Group repository, supporting both pretraining and fine-tuning workflows; broad reuse will depend on the continued availability and documentation of the pretrained checkpoints.
Wang, C., et al. (2026) Towards a general-purpose foundation model for functional MRI analysis. Nature Biomedical Engineering.
DOI: 10.1038/s41551-026-01666-y