CortexMAE is a foundation model for functional magnetic resonance imaging (fMRI) that learns general-purpose representations of human brain activity through self-supervised pretraining. Introduced in the paper "Scaling Vision Transformers for Functional MRI with Flat Maps" (arXiv:2510.13768, October 2025; accepted at ICML 2026), it was developed by Connor Lane, Paul S. Scotti, Tanishq M. Abraham, and collaborators at Sophont and the Medical AI Research Center (MedARC). The work addresses a central question in computational neuroimaging: whether the scaling laws that drive progress in vision and language models also hold for brain activity data.
The core insight is representational. Rather than processing 4D fMRI volumes directly or collapsing them into coarse parcellations, CortexMAE projects cortical activity onto 2D flat maps and treats the resulting time series as short videos. This recasts an awkward neuroimaging problem as a familiar spatiotemporal computer-vision task, allowing the authors to apply a standard masked autoencoder (MAE) framework with Vision Transformer backbones. The model is trained to reconstruct masked spatiotemporal patches of cortical activity.
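The patch-and-mask step described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the clip size, the patch size, the 75% mask ratio, and the uniform random masking scheme are all hypothetical choices, and the helper names `patchify` and `random_mask` are invented here rather than taken from the released code.

```python
import numpy as np

def patchify(video, pt, ph, pw):
    """Split a (T, H, W) array into non-overlapping (pt, ph, pw) patches.

    Returns an array of shape (num_patches, pt * ph * pw).
    """
    T, H, W = video.shape
    assert T % pt == 0 and H % ph == 0 and W % pw == 0
    x = video.reshape(T // pt, pt, H // ph, ph, W // pw, pw)
    x = x.transpose(0, 2, 4, 1, 3, 5)  # bring the patch-grid axes to the front
    return x.reshape(-1, pt * ph * pw)

def random_mask(num_patches, mask_ratio, rng):
    """Return a boolean mask; True marks patches hidden from the encoder."""
    num_masked = int(round(num_patches * mask_ratio))
    mask = np.zeros(num_patches, dtype=bool)
    mask[rng.permutation(num_patches)[:num_masked]] = True
    return mask

rng = np.random.default_rng(0)
# Toy stand-in for a flat-map clip: 16 frames of 64 x 128 pixels (hypothetical sizes).
clip = rng.standard_normal((16, 64, 128)).astype(np.float32)

patches = patchify(clip, pt=4, ph=16, pw=16)               # (128, 1024)
mask = random_mask(len(patches), mask_ratio=0.75, rng=rng)

visible = patches[~mask]  # tokens an MAE-style encoder would receive
targets = patches[mask]   # tokens the decoder would be trained to reconstruct
```

In a full masked autoencoder, only `visible` would be embedded and passed through the encoder, and the reconstruction loss would be computed on `targets`.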
CortexMAE demonstrates that masked fMRI modeling improves with dataset size according to a strict power law, providing some of the clearest evidence to date that brain-activity foundation models benefit from scale. The release positions it among a growing class of self-supervised neuroimaging models aimed at transferable brain representations.
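As a rough illustration of what a power-law relationship between dataset size and loss means in practice, the sketch below fits a law of the form loss = a * N^(-b) by linear regression in log-log space. The data points are synthetic, generated from a known law chosen for this example, and are not results from the paper. The functional form (a pure power law with no irreducible-loss offset) is also an assumption, and `fit_power_law` is a hypothetical helper.

```python
import numpy as np

def fit_power_law(n, loss):
    """Fit loss ~ a * n**(-b) by least squares on log-transformed values."""
    slope, intercept = np.polyfit(np.log(n), np.log(loss), deg=1)
    return np.exp(intercept), -slope

# Synthetic points generated from a known law, NOT measurements from the paper.
n_frames = np.array([4e5, 8e5, 1.6e6, 3.2e6, 6.4e6])
true_a, true_b = 5.0, 0.1
loss = true_a * n_frames ** (-true_b)

a, b = fit_power_law(n_frames, loss)  # recovers approximately (5.0, 0.1)
```

On a log-log plot such a law appears as a straight line with slope -b, which is why the fit reduces to ordinary linear regression.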
CortexMAE pretrains Vision Transformers using a spatiotemporal masked autoencoder on roughly 2.1K hours of fMRI from the Human Connectome Project Young Adult (HCP-YA) study. The primary released checkpoints use a ViT-B encoder. Released variants cover flat-map (cortex_mae_flat), Schaefer-400 parcellation (cortex_mae_parcel), and MNI cortex volume (cortex_mae_volume) representations for direct comparison. Scaling experiments span data subsets from 400K to 6.6M frames and show power-law improvements in masked reconstruction loss, with diminishing returns beyond about 37M encoder parameters. Downstream evaluation through the Brainmarks suite covers state-prediction tasks (HCP-YA 21-class cognitive task decoding, NSD COCO 24-category visual decoding) and trait-prediction tasks (ABIDE, ADHD-200, ADNI, PPMI, age/sex). The authors report strong, robust performance on state decoding. They also note that on subject-trait prediction the learned features struggle to clearly beat a simple functional-connectivity baseline, and that generalization to out-of-distribution data (NSD) scales more weakly than in-distribution reconstruction. Code is released under Apache 2.0; pretrained weights are available on HuggingFace under CC-BY-NC 4.0.
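For readers unfamiliar with the functional-connectivity baseline mentioned above, a common construction correlates every pair of parcel time series and uses the upper triangle of the resulting matrix as a feature vector. The sketch below shows that construction on random toy data. Whether the paper's baseline uses exactly this recipe is an assumption, and `fc_features` is a hypothetical helper name.

```python
import numpy as np

def fc_features(timeseries):
    """Vectorize the upper triangle of a parcel-by-parcel correlation matrix.

    timeseries: array of shape (num_timepoints, num_parcels).
    """
    corr = np.corrcoef(timeseries, rowvar=False)  # (num_parcels, num_parcels)
    iu = np.triu_indices(corr.shape[0], k=1)      # k=1 skips the diagonal
    return corr[iu]

rng = np.random.default_rng(0)
# Toy data: 200 timepoints for 400 parcels (matching a Schaefer-400 layout).
ts = rng.standard_normal((200, 400))
features = fc_features(ts)  # 400 * 399 / 2 = 79800 pairwise correlations
```

Feature vectors of this kind are typically fed to a simple linear model, which is part of why such a baseline is cheap to run and hard to beat on small clinical datasets.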
CortexMAE provides reusable cortical-activity embeddings for neuroimaging researchers who would otherwise train task-specific models from scratch on limited data. Its representations support brain-state decoding (identifying cognitive tasks or perceived stimuli from activity) and exploratory analysis of subject-level traits, and the released variants let practitioners compare flat-map, parcellation, and volumetric pipelines on their own datasets. The accompanying Brainmarks benchmark gives the field a standardized way to evaluate fMRI foundation models across clinical and cognitive-neuroscience datasets.
CortexMAE is among the first fMRI models to establish empirical scaling laws for masked brain-activity modeling, lending quantitative support to the foundation-model paradigm in neuroimaging. By framing cortical activity as flat-map video and open-sourcing code, weights, and a benchmark suite, the work lowers the barrier for transfer learning on notoriously data-scarce fMRI. Its candid reporting of where representations help (state decoding) and where they do not yet beat simple baselines (trait prediction, out-of-distribution generalization) offers a realistic roadmap for the next generation of brain foundation models.
Lane, C., et al. (2025) Scaling Vision Transformers for Functional MRI with Flat Maps. arXiv.org.
DOI: 10.48550/arXiv.2510.13768