BrainMAE

Self-supervised masked autoencoder for functional MRI that learns representations from BOLD time-series with per-ROI embeddings and graph attention.

Released: June 2024

BrainMAE (Brain Masked Auto-Encoder) is a self-supervised learning framework for functional MRI that learns representations directly from blood-oxygen-level-dependent (BOLD) time-series rather than from precomputed connectivity matrices. It was introduced in a June 2024 preprint by Yifan Yang, Yutong Mao, Xufu Liu, and Xiao Liu in the Department of Biomedical Engineering and the Institute for Computational and Data Sciences at the Pennsylvania State University.

The model targets a long-standing tension in fMRI analysis. Static functional-connectivity methods (Fixed-FC) summarize an entire scan into one correlation matrix and discard temporal dynamics, while dynamic-connectivity methods (Dynamic-FC) capture those dynamics but are highly sensitive to the substantial noise in fMRI. BrainMAE sidesteps this trade-off by treating the recording as a sequence of "transient brain states" and learning to model their temporal structure with a masked-reconstruction objective, retaining dynamics while staying robust to noise.
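The trade-off between the two connectivity families can be illustrated with a small sketch. It is not taken from the preprint: the function names `fixed_fc` and `dynamic_fc`, the window length, and the stride are illustrative assumptions, and the input is synthetic uncorrelated noise rather than real BOLD data. Because the true correlation between the synthetic regions is zero, any off-diagonal magnitude is estimation noise, and the short windows used for the dynamic estimate show noticeably more of it.

```python
import numpy as np

def fixed_fc(bold):
    """Static FC: one ROI-by-ROI Pearson correlation matrix for the whole scan.

    bold has shape (n_timepoints, n_rois).
    """
    return np.corrcoef(bold, rowvar=False)

def dynamic_fc(bold, window, stride):
    """Sliding-window FC: one correlation matrix per window of time points."""
    starts = range(0, bold.shape[0] - window + 1, stride)
    return np.stack([np.corrcoef(bold[s:s + window], rowvar=False) for s in starts])

# Synthetic stand-in for a parcellated scan (assumption: 200 time points, 10 ROIs).
rng = np.random.default_rng(0)
bold = rng.standard_normal((200, 10))

static = fixed_fc(bold)                            # one matrix for the whole scan
dynamic = dynamic_fc(bold, window=30, stride=10)   # one matrix per window

# Mean absolute off-diagonal correlation; the true value here is zero,
# so this measures how much spurious structure each estimator produces.
off = ~np.eye(bold.shape[1], dtype=bool)
spread_static = np.abs(static[off]).mean()
spread_dynamic = np.abs(dynamic[:, off]).mean()
print(static.shape, dynamic.shape, spread_static, spread_dynamic)
```

The static estimate averages over the full scan and is therefore quieter, but it yields a single matrix with no time axis, which is the loss of dynamics described above.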

Its central design choice is to give every brain region of interest (ROI) its own learnable embedding, analogous to word embeddings in natural language processing. These region embeddings inject neuroscientific prior structure into the network and, after pretraining, recover relationships between regions that align with known functional brain networks, yielding interpretable representations alongside strong predictive performance.
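One common way to inspect whether region embeddings reflect functional networks is to compare cosine similarity within and between networks. The sketch below is only an illustration of that check under stated assumptions: the embeddings are synthetic (clustered by construction), the network labels are hypothetical, and the preprint's own analysis may differ.

```python
import numpy as np

def cosine_similarity(emb):
    """Pairwise cosine similarity between embedding rows."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return unit @ unit.T

# Synthetic embeddings: 4 hypothetical networks with 5 ROIs each, 8 dimensions.
# Real embeddings would come from a pretrained model instead.
rng = np.random.default_rng(0)
n_networks, rois_per_network, dim = 4, 5, 8
labels = np.repeat(np.arange(n_networks), rois_per_network)
centers = rng.standard_normal((n_networks, dim))
emb = centers[labels] + 0.1 * rng.standard_normal((labels.size, dim))

sim = cosine_similarity(emb)
same_network = labels[:, None] == labels[None, :]
off_diag = ~np.eye(labels.size, dtype=bool)

within = sim[same_network & off_diag].mean()   # ROI pairs in the same network
between = sim[~same_network].mean()            # ROI pairs in different networks
print(within, between)
```

A clearly higher within-network similarity would suggest that the embedding space groups regions by network; with real embeddings, the labels would come from an atlas's network assignment.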

Key Features

Region-aware graph attention: A graph attention mechanism operates over ROI embeddings to capture relationships between brain regions, replacing hand-engineered connectivity features with learned ones.
Learnable ROI embeddings: Each cortical region is assigned its own embedding vector, injecting region identity as prior knowledge and producing representations that map onto established functional networks.
Masked-autoencoding pretraining: The framework masks segments of the transient-state sequence and reconstructs them, enabling label-free pretraining on large unlabeled fMRI collections.
Dynamics with noise robustness: By modeling sequences of transient states, BrainMAE preserves temporal information that Fixed-FC discards while avoiding the noise sensitivity of Dynamic-FC.
Interpretable structure: Learned ROI embeddings recover meaningful inter-region relationships, supporting neuroscientific interpretation rather than black-box prediction.
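The interaction between ROI embeddings and graph attention can be sketched as follows. This card does not give the preprint's equations, so the code is a generic scaled dot-product attention in which queries and keys are computed from the ROI embeddings alone; the function name `region_aware_attention`, the projection matrices `w_q` and `w_k`, and all sizes other than the 100 ROIs are assumptions, and the weights are random rather than learned.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def region_aware_attention(signals, roi_emb, w_q, w_k):
    """Mix ROI signals using attention weights derived from ROI embeddings only.

    signals: (n_rois, n_features) activity per region for one time window
    roi_emb: (n_rois, d) one embedding row per region
    """
    q = roi_emb @ w_q
    k = roi_emb @ w_k
    scores = q @ k.T / np.sqrt(q.shape[1])
    attn = softmax(scores, axis=-1)      # each row sums to 1
    return attn @ signals, attn

rng = np.random.default_rng(0)
n_rois, d, window = 100, 16, 15          # d and window are illustrative choices
signals = rng.standard_normal((n_rois, window))
roi_emb = 0.1 * rng.standard_normal((n_rois, d))   # would be learned in practice
w_q = rng.standard_normal((d, d))
w_k = rng.standard_normal((d, d))

mixed, attn = region_aware_attention(signals, roi_emb, w_q, w_k)
print(mixed.shape, attn.shape)
```

Because the attention matrix depends only on region identity and not on the signal in a given window, it acts as a learned, scan-independent graph over regions, which is one plausible reading of "region-aware".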

Technical Details

BrainMAE parcellates the cortex with the Schaefer2018 100-ROI atlas and encodes BOLD activity through learnable per-region embeddings. Transient State Encoders (TSEs) convert short windows of activity into state embeddings using region-aware graph attention. The authors report two variants: SG-TSE (three blocks of pure graph attention) and AG-TSE (two self-attention blocks over one graph-attention block). A transformer encoder and decoder, each built from two standard transformer blocks, are then pretrained with the masked-autoencoding objective.

Pretraining and evaluation draw on Human Connectome Project data, namely HCP-3T (897 subjects, 3,422 sessions), HCP-7T (184 subjects), and HCP-Aging (725 subjects), plus the Natural Scenes Dataset (NSD).

Across four downstream tasks the authors report gains over prior methods: 97.49% gender-classification accuracy on HCP-3T (vs. 94.11% for BrainNetTF-OCR), 92.67% accuracy on HCP-Aging age prediction (vs. 88.83% for BrainNetCNN), and a 95.59% macro F1 for transient mental-state decoding (vs. 92.0% for the CSM baseline), alongside cognitive and task-performance prediction.
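The masked-autoencoding objective can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the window length of 15 time points and the 75% mask ratio are placeholders that this card does not specify, the helper names are hypothetical, and a trivial predictor (the mean of the visible segments) stands in for the transformer encoder and decoder.

```python
import numpy as np

def segment(bold, window):
    """Split (n_timepoints, n_rois) into non-overlapping windows.

    Returns an array of shape (n_segments, window, n_rois); leftover
    time points at the end of the scan are dropped.
    """
    n_segments = bold.shape[0] // window
    return bold[: n_segments * window].reshape(n_segments, window, bold.shape[1])

def random_mask(n_segments, mask_ratio, rng):
    """Boolean vector with True at the segments hidden from the model."""
    n_masked = int(round(n_segments * mask_ratio))
    mask = np.zeros(n_segments, dtype=bool)
    mask[rng.choice(n_segments, size=n_masked, replace=False)] = True
    return mask

def masked_mse(target, prediction, mask):
    """Reconstruction error computed on the masked segments only."""
    return float(np.mean((target[mask] - prediction[mask]) ** 2))

rng = np.random.default_rng(0)
bold = rng.standard_normal((300, 100))        # synthetic scan, 100 ROIs
segments = segment(bold, window=15)           # 20 segments of 15 time points
mask = random_mask(len(segments), mask_ratio=0.75, rng=rng)

# Stand-in predictor: fill every masked segment with the mean visible segment.
visible_mean = segments[~mask].mean(axis=0)
prediction = np.where(mask[:, None, None], visible_mean, segments)

loss = masked_mse(segments, prediction, mask)
print(segments.shape, int(mask.sum()), loss)
```

Restricting the loss to masked positions is what makes the task non-trivial: the model is scored only on segments it never saw, so it has to infer them from temporal context.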

Applications

BrainMAE is aimed at neuroscientists and computational researchers working with resting-state and task fMRI who need representations that transfer across analyses without per-task feature engineering. Demonstrated use cases include predicting demographic and cognitive variables (gender, age, and behavioral measures), estimating task performance such as memory scores and response times, and decoding transient mental states during cognitive tasks. Because pretraining requires no labels, the framework is well suited to leveraging large unlabeled fMRI archives and then fine-tuning on smaller labeled cohorts—useful for biomarker discovery and individual-difference studies where labeled scans are scarce.

Impact

BrainMAE contributes to a growing line of work extending the foundation-model and masked-autoencoder paradigms from images and language to brain signals, and is cited in recent surveys of brain foundation models. Its emphasis on learnable region embeddings and a time-series-native masked objective offers an alternative to connectivity-matrix pipelines that dominate fMRI machine learning, with the added benefit of interpretable, neuroscience-aligned representations. As a notable limitation, no public code repository or pretrained weights were released with the preprint, which constrains immediate reproducibility and downstream adoption despite the reported gains across four benchmark tasks.

Citation

BrainMAE: A Region-aware Self-supervised Learning Framework for Brain Signals

Preprint

Yang, Y., et al. (2024) BrainMAE: A Region-aware Self-supervised Learning Framework for Brain Signals. arXiv.org.

DOI: 10.48550/arXiv.2406.17086

Recent citations

Papers that recently cited this model.

Visualizing the Invisible: Generative Visual Grounding Empowers Universal EEG Understanding in MLLMs
Junyu Pan, Yansen Wang, Enze Zhang, et al.
May 2026
0
Bridging Brain and Semantics: A Hierarchical Framework for Semantically Enhanced fMRI-to-Video Reconstruction
Yujie Wei, Chenglong Ma, Jianxiong Gao, et al.
May 2026
0
SCMA: A Subnetwork-Aware Contextual Masked Autoencoder for Self-Supervised Learning of rs-fMRI in Autism Spectrum Disorder
Yuchu Chen, Hairui Chen, Ying Li, et al.
IEEE International Conference on Acoustics, Speech, and Signal Processing · May 2026
0

Top citations

The most-cited papers that cite this model.

Brain Foundation Models: A survey on advancements in neural signal processing and brain discovery
Xin-qiu Zhou, Chenyu Liu, Zhisheng Chen, et al.
IEEE Signal Processing Magazine · Mar 2025
34
Advances in functional magnetic resonance imaging-based brain function mapping: a deep learning perspective
Ling Zhao
Psychoradiology · Apr 2025
12 · Influential
Self-Supervised Learning to Unveil Brain Dysfunctional Signatures in Brain Disorders: Methods and Applications
Ying Li, Yanwu Yang, Yuchu Chen, et al.
Health Data Science · Apr 2025
4
BrainNetMLP: An Efficient and Effective Baseline for Functional Brain Network Classification
Jiacheng Hou, Zhenjie Song, E. Kuruoglu
EMA4MICCAI · May 2025
3
Bridging Brain with Foundation Models through Self-Supervised Learning
Hamdi Altaheri, Fakhri Karray, Md. Milon Islam, et al.
arXiv.org · Jun 2025
1

Citations

Total Citations: 10

Influential: 1

References: 58

Fields of citing research

Computer Science: 100%
Medicine: 40%
Biology: 10%
Engineering: 10%

Share of papers citing this model.

Openness

bio.rodeo openness: Closed · low usability and reproducibility

Overall: 17 (Closed)

Usability (can I run it?): 14

Reproducibility (can I retrain it?): 6

Model Openness Framework: Unclassified (missing required components)

Resources

Research Paper

