CortexMAE

fMRI foundation model trained on cortical flat-map videos with masked autoencoding, showing power-law scaling on brain activity reconstruction.

Released: October 2025

CortexMAE is a foundation model for functional magnetic resonance imaging (fMRI) that learns general-purpose representations of human brain activity through self-supervised pretraining. Introduced in the paper "Scaling Vision Transformers for Functional MRI with Flat Maps" (arXiv:2510.13768, October 2025; accepted at ICML 2026), it was developed by Connor Lane, Paul S. Scotti, Tanishq M. Abraham, and collaborators at Sophont and the Medical AI Research Center (MedARC). The work addresses a central question in computational neuroimaging: whether the scaling laws that drive progress in vision and language models also hold for brain activity data.

The core insight is representational. Rather than processing 4D fMRI volumes directly or collapsing them into coarse parcellations, CortexMAE projects cortical activity onto 2D flat maps and treats the resulting time series as short videos. This recasts an awkward neuroimaging problem as a familiar spatiotemporal computer-vision task, allowing the authors to apply a standard masked autoencoder (MAE) framework with Vision Transformer backbones. The model is trained to reconstruct masked spatiotemporal patches of cortical activity.
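The masking step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the clip size (16 frames of 32x64 pixels), the patch size, the 75% mask ratio, and the function names `patchify`, `random_mask`, and `masked_mse` are all assumptions chosen for the example.

```python
import numpy as np

def patchify(clip, pt, ph, pw):
    """Split a (T, H, W) clip into non-overlapping (pt, ph, pw) patches, one per row."""
    T, H, W = clip.shape
    assert T % pt == 0 and H % ph == 0 and W % pw == 0
    x = clip.reshape(T // pt, pt, H // ph, ph, W // pw, pw)
    x = x.transpose(0, 2, 4, 1, 3, 5)  # (nT, nH, nW, pt, ph, pw)
    return x.reshape(-1, pt * ph * pw)

def random_mask(num_patches, mask_ratio, rng):
    """Boolean array that is True where a patch is hidden from the encoder."""
    num_masked = int(round(num_patches * mask_ratio))
    mask = np.zeros(num_patches, dtype=bool)
    mask[rng.choice(num_patches, size=num_masked, replace=False)] = True
    return mask

def masked_mse(pred, target):
    """Mean squared error computed on the masked patches only."""
    return float(np.mean((pred - target) ** 2))

rng = np.random.default_rng(0)
# Hypothetical flat-map clip: 16 frames of 32x64 pixels (sizes are assumptions).
clip = rng.standard_normal((16, 32, 64)).astype(np.float32)

patches = patchify(clip, pt=4, ph=8, pw=8)        # 128 patches of 256 values each
mask = random_mask(len(patches), mask_ratio=0.75, rng=rng)

visible = patches[~mask]   # what an encoder would see
targets = patches[mask]    # what a decoder would be asked to reconstruct

# A trivial "predict zero" decoder gives a reference loss for the objective.
baseline_loss = masked_mse(np.zeros_like(targets), targets)
```

In a real MAE the zero prediction would be replaced by the decoder output; the point here is only that the loss is evaluated on the hidden patches, so the model cannot succeed by copying its input.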

CortexMAE demonstrates that masked fMRI modeling improves with dataset size according to a strict power law, providing some of the clearest evidence to date that brain-activity foundation models benefit from scale. The release positions it among a growing class of self-supervised neuroimaging models aimed at transferable brain representations.
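A power law of the form loss = a * N^(-b) appears as a straight line in log-log space, so its exponent can be estimated with an ordinary linear fit. The sketch below assumes that simple form (no irreducible-loss offset) and uses synthetic, noise-free losses generated from made-up coefficients; none of the numbers are results from the paper, and `fit_power_law` is a hypothetical helper name.

```python
import numpy as np

def fit_power_law(n, loss):
    """Fit loss ~ a * n**(-b) by least squares on log(loss) vs. log(n)."""
    slope, intercept = np.polyfit(np.log(n), np.log(loss), deg=1)
    return float(np.exp(intercept)), float(-slope)

# Dataset sizes in frames; the range is illustrative.
n_frames = np.array([4e5, 8e5, 1.6e6, 3.2e6, 6.6e6])

# Synthetic losses from assumed coefficients (NOT values reported by the authors).
true_a, true_b = 5.0, 0.1
loss = true_a * n_frames ** (-true_b)

a, b = fit_power_law(n_frames, loss)
print(f"a = {a:.3f}, b = {b:.3f}")
```

With real measurements the fit would be noisy, and checking the residuals in log-log space is a quick way to see whether a single power law actually describes the curve.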

Key Features

Flat-map video representation: Cortical activity is projected onto 2D flat maps and stacked into video clips, converting volumetric fMRI into a tractable spatiotemporal vision problem.
Masked autoencoder pretraining: A spatiotemporal MAE objective reconstructs masked patches of cortical activity, requiring no task labels during pretraining.
Power-law scaling: Reconstruction performance follows a strict power law in dataset size, while gains from a larger encoder saturate around 37M parameters (ViT-B, depth 9).
Multiple input variants: Three released models cover flat-map (cortex_mae_flat), Schaefer-400 parcellation (cortex_mae_parcel), and MNI cortex volume (cortex_mae_volume) representations for direct comparison.
Brainmarks evaluation: A companion benchmark suite measures both subject-level trait decoding and fine-grained brain-state decoding across multiple public datasets.
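To make the contrast between the input variants concrete, the sketch below shows the kind of reduction a parcellation applies: every vertex time series inside a parcel is averaged into one signal. It is a generic illustration on toy data, not the released preprocessing; `parcel_average`, the label convention (0 for unlabeled vertices), and the array sizes are assumptions.

```python
import numpy as np

def parcel_average(data, labels, num_parcels):
    """Average vertex time series within each parcel.

    data:   (T, V) array, T time points by V vertices
    labels: (V,) integer parcel ids in 1..num_parcels, 0 meaning unlabeled
    returns (T, num_parcels) array of parcel-mean time series
    """
    out = np.zeros((data.shape[0], num_parcels), dtype=np.float64)
    for p in range(1, num_parcels + 1):
        members = labels == p
        if members.any():  # empty parcels stay at zero
            out[:, p - 1] = data[:, members].mean(axis=1)
    return out

# Toy example: 2 time points, 6 vertices, 2 parcels.
data = np.arange(12, dtype=np.float64).reshape(2, 6)
labels = np.array([1, 1, 2, 2, 0, 2])
parcels = parcel_average(data, labels, num_parcels=2)
```

A 400-parcel atlas would turn tens of thousands of vertices into 400 signals per time point, which is compact but discards the within-parcel spatial detail that a flat-map or volumetric input keeps.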

Technical Details

CortexMAE pretrains Vision Transformers using a spatiotemporal masked autoencoder on roughly 2.1K hours of fMRI from the Human Connectome Project Young Adult study. The primary released checkpoints use a ViT-B encoder; scaling experiments span data subsets from 400K to 6.6M frames and show power-law improvements in masked reconstruction loss, with diminishing returns beyond about 37M encoder parameters. Downstream evaluation through the Brainmarks suite covers state-prediction tasks (HCP-YA 21-class cognitive task decoding, NSD COCO 24-category visual decoding) and trait-prediction tasks (ABIDE, ADHD-200, ADNI, PPMI, age/sex). The authors report strong, robust performance on state decoding, while noting that on subject-trait prediction the learned features struggle to clearly beat a simple functional-connectivity baseline, and that generalization to out-of-distribution data (NSD) scales more weakly than in-distribution reconstruction. Code is released under Apache 2.0; pretrained weights are available on HuggingFace under CC-BY-NC 4.0.
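For readers unfamiliar with the functional-connectivity baseline mentioned above, a common form of it is the matrix of Pearson correlations between parcel time series, flattened to its upper triangle and fed to a simple classifier. The sketch below shows that generic construction on random data; it is not necessarily the exact baseline the authors used, and `connectivity_features` and the array sizes are assumptions.

```python
import numpy as np

def connectivity_features(ts):
    """Turn (T, P) parcel time series into a vector of pairwise Pearson correlations."""
    corr = np.corrcoef(ts, rowvar=False)          # (P, P) correlation matrix
    iu = np.triu_indices(corr.shape[0], k=1)      # strictly upper triangle
    return corr[iu]

rng = np.random.default_rng(0)
# Random stand-in for one scan: 200 time points, 400 parcels (sizes are assumptions).
ts = rng.standard_normal((200, 400))
feats = connectivity_features(ts)                 # 400 * 399 / 2 = 79800 features
```

Because this feature vector needs no training at all, it is a demanding reference point: a pretrained model has to add information beyond second-order statistics to outperform it on trait prediction.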

Applications

CortexMAE provides reusable cortical-activity embeddings for neuroimaging researchers who would otherwise train task-specific models from scratch on limited data. Its representations support brain-state decoding (identifying cognitive tasks or perceived stimuli from activity) and exploratory analysis of subject-level traits, and the released variants let practitioners compare flat-map, parcellation, and volumetric pipelines on their own datasets. The accompanying Brainmarks benchmark gives the field a standardized way to evaluate fMRI foundation models across clinical and cognitive-neuroscience datasets.
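One common way to use frozen embeddings for state decoding is a linear probe. The sketch below fits a ridge-regularized one-vs-rest linear classifier in closed form on synthetic vectors standing in for embeddings; it does not call the released model, and the embedding dimension, class count, and helper names `fit_ridge_probe` and `predict` are assumptions.

```python
import numpy as np

def fit_ridge_probe(X, y, num_classes, alpha=1.0):
    """Closed-form ridge regression onto one-hot labels (bias is regularized too, for simplicity)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])     # append a bias column
    Y = np.eye(num_classes)[y]                    # one-hot targets
    A = Xb.T @ Xb + alpha * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ Y)

def predict(W, X):
    """Assign each row of X to the class with the largest linear score."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return (Xb @ W).argmax(axis=1)

rng = np.random.default_rng(0)
# Synthetic stand-in for frozen embeddings: 3 well-separated classes in 16 dimensions.
centers = 5.0 * rng.standard_normal((3, 16))
y = rng.integers(0, 3, size=300)
X = centers[y] + 0.1 * rng.standard_normal((300, 16))

W = fit_ridge_probe(X[:200], y[:200], num_classes=3)
accuracy = float(np.mean(predict(W, X[200:]) == y[200:]))
```

With real data, `X` would hold one embedding per clip or per scan, and the train/test split should be made by subject so that no individual appears on both sides.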

Impact

CortexMAE is among the first fMRI models to establish empirical scaling laws for masked brain-activity modeling, lending quantitative support to the foundation-model paradigm in neuroimaging. By framing cortical activity as flat-map video and open-sourcing code, weights, and a benchmark suite, the work lowers the barrier for transfer learning on notoriously data-scarce fMRI. Its candid reporting of where representations help (state decoding) and where they do not yet beat simple baselines (trait prediction, out-of-distribution generalization) offers a realistic roadmap for the next generation of brain foundation models.

Citation

Scaling Vision Transformers for Functional MRI with Flat Maps

Preprint

Lane, C., et al. (2025) Scaling Vision Transformers for Functional MRI with Flat Maps. arXiv.org.

DOI: 10.48550/arXiv.2510.13768

Recent citations

Papers that recently cited this model.

Flow Matching with In-Context Priors for Out-of-Distribution Brain Dynamics
Sam Gijsen, M. Lukomski, Marc-Andre Schulz, et al.
Jun 2026 · Citations: 0

Meta-learning In-Context Enables Training-Free Cross Subject Brain Decoding
Mu Nan, Muquan Yu, Weijian Mai, et al.
Apr 2026 · Citations: 1

Top citations

The most-cited papers that cite this model.

Meta-learning In-Context Enables Training-Free Cross Subject Brain Decoding
Mu Nan, Muquan Yu, Weijian Mai, et al.
Apr 2026 · Citations: 1

Flow Matching with In-Context Priors for Out-of-Distribution Brain Dynamics
Sam Gijsen, M. Lukomski, Marc-Andre Schulz, et al.
Jun 2026 · Citations: 0

Citations

Total Citations: 2

Influential: 0

References: 135

GitHub

Stars: 53

Forks: 17

Open Issues: 0

Contributors: 2

Last Push: 27d ago

Language: Jupyter Notebook

License: Apache-2.0

HuggingFace

Downloads: 107

Likes: 1

Last Modified: 29d ago

Fields of citing research

Biology: 100%
Computer Science: 100%

Share of papers citing this model.

Openness

bio.rodeo openness (Fully open · usable and reproducible): 68, Partial

Usability (can I run it?): 72

Reproducibility (can I retrain it?): 62

Model Openness Framework: Unclassified (restrictive license on core components)

Resources

GitHub Repository · Research Paper · HuggingFace Model
