A region-aware masked-autoencoder framework that learns self-supervised representations directly from fMRI time-series via per-ROI embeddings and graph attention.
BrainMAE (Brain Masked Auto-Encoder) is a self-supervised learning framework for functional MRI that learns representations directly from blood-oxygen-level-dependent (BOLD) time-series rather than from precomputed connectivity matrices. It was introduced in a June 2024 preprint by Yifan Yang, Yutong Mao, Xufu Liu, and Xiao Liu in the Department of Biomedical Engineering and the Institute for Computational and Data Sciences at the Pennsylvania State University.
The model targets a long-standing tension in fMRI analysis. Static functional-connectivity methods (Fixed-FC) summarize an entire scan into one correlation matrix and discard temporal dynamics, while dynamic-connectivity methods (Dynamic-FC) capture those dynamics but are highly sensitive to the substantial noise in fMRI. BrainMAE sidesteps this trade-off by treating the recording as a sequence of "transient brain states" and learning to model their temporal structure with a masked-reconstruction objective, retaining dynamics while staying robust to noise.
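The trade-off between the two connectivity families can be made concrete with a small sketch. The example below is illustrative only and is not taken from the preprint: it uses synthetic Gaussian noise in place of BOLD data, and the function names, window length (30), and stride (10) are assumptions chosen for the demonstration. With independent noise the true correlation between regions is zero, so any nonzero off-diagonal value is estimation error, and short windows should show more of it than the whole-scan estimate.

```python
import numpy as np

def static_fc(bold):
    """One correlation matrix for the whole scan. bold has shape (time, rois)."""
    return np.corrcoef(bold, rowvar=False)

def sliding_window_fc(bold, window, stride):
    """One correlation matrix per window, stacked along the first axis."""
    n_timepoints = bold.shape[0]
    starts = range(0, n_timepoints - window + 1, stride)
    return np.stack([np.corrcoef(bold[s:s + window], rowvar=False) for s in starts])

def mean_abs_offdiag(mat):
    """Mean absolute off-diagonal value over one matrix or a stack of matrices."""
    n = mat.shape[-1]
    off = ~np.eye(n, dtype=bool)
    return float(np.abs(mat[..., off]).mean())

rng = np.random.default_rng(0)
bold = rng.standard_normal((200, 100))  # synthetic stand-in: 200 time points, 100 ROIs

fc = static_fc(bold)                                   # shape (100, 100)
dfc = sliding_window_fc(bold, window=30, stride=10)    # shape (18, 100, 100)

# Spurious correlation is larger in the short windows than in the full scan.
static_noise = mean_abs_offdiag(fc)
dynamic_noise = mean_abs_offdiag(dfc)
```

Here `dynamic_noise` exceeds `static_noise` because each window estimates a correlation from 30 samples instead of 200, which is the noise sensitivity the paragraph above describes.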
Its central design choice is to give every brain region of interest (ROI) its own learnable embedding, analogous to word embeddings in natural language processing. These region embeddings inject neuroscientific prior structure into the network and, after pretraining, recover relationships between regions that align with known functional brain networks, yielding interpretable representations alongside strong predictive performance.
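One way per-region embeddings could condition attention between regions is sketched below. This is a minimal illustration under stated assumptions, not the authors' implementation (no reference code was released): it assumes that queries and keys are computed from the ROI embeddings while the BOLD segment supplies the values, and the embedding size, window length, and projection matrices `w_q` and `w_k` are hypothetical. In a real model the embedding table and projections would be learned rather than randomly drawn.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def region_aware_attention(segment, roi_embed, w_q, w_k):
    """Mix ROI signals using attention weights derived from ROI embeddings.

    segment:   (n_rois, window) BOLD values for one short window
    roi_embed: (n_rois, embed_dim) one embedding vector per region
    """
    q = roi_embed @ w_q
    k = roi_embed @ w_k
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)  # (n_rois, n_rois)
    return attn @ segment, attn

n_rois, embed_dim, window = 100, 32, 15  # assumed sizes for illustration
rng = np.random.default_rng(0)
roi_embed = rng.standard_normal((n_rois, embed_dim))  # learnable in a real model
w_q = rng.standard_normal((embed_dim, embed_dim)) / np.sqrt(embed_dim)
w_k = rng.standard_normal((embed_dim, embed_dim)) / np.sqrt(embed_dim)
segment = rng.standard_normal((n_rois, window))

mixed, attn = region_aware_attention(segment, roi_embed, w_q, w_k)
```

Because the attention weights in this sketch depend only on the embeddings, the same region-to-region structure is applied to every window, which is one reading of how a fixed embedding table can carry prior structure about the brain.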
BrainMAE parcellates the cortex with the Schaefer2018 100-ROI atlas and encodes BOLD activity through learnable per-region embeddings. Transient State Encoders convert short windows of activity into state embeddings using region-aware graph attention; the authors report two variants, SG-TSE (three blocks of pure graph attention) and AG-TSE (two self-attention blocks over one graph-attention block). A transformer encoder and decoder, each with two standard transformer blocks, are then pretrained with the masked-autoencoding objective. Pretraining and evaluation draw on Human Connectome Project data, namely HCP-3T (897 subjects, 3,422 sessions), HCP-7T (184 subjects), and HCP-Aging (725 subjects), plus the Natural Scenes Dataset (NSD). Across four downstream tasks the authors report gains over prior methods: 97.49% gender-classification accuracy on HCP-3T (vs. 94.11% for BrainNetTF-OCR), 92.67% accuracy on HCP-Aging age prediction (vs. 88.83% for BrainNetCNN), and a 95.59% macro F1 for transient mental-state decoding (vs. 92.0% for the CSM baseline), alongside cognitive and task-performance prediction.
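The masked-autoencoding objective can be sketched generically as follows. This is a simplified illustration, not the preprint's training code: the mask ratio of 0.75, the number of windows, and the window length are assumptions, and a trivial placeholder (the mean of the visible windows) stands in for the transformer encoder and decoder. The point is only the shape of the objective: hide a random subset of windows, predict them from the rest, and score the prediction on the hidden positions alone.

```python
import numpy as np

def random_mask(n_tokens, mask_ratio, rng):
    """Boolean mask with True marking the windows hidden from the encoder."""
    n_masked = int(round(n_tokens * mask_ratio))
    perm = rng.permutation(n_tokens)
    mask = np.zeros(n_tokens, dtype=bool)
    mask[perm[:n_masked]] = True
    return mask

def masked_mse(target, prediction, mask):
    """Mean squared error computed only over the masked windows."""
    diff = (prediction - target)[mask]
    return float(np.mean(diff ** 2))

rng = np.random.default_rng(0)
n_states, n_rois, window = 20, 100, 15  # assumed sizes for illustration
segments = rng.standard_normal((n_states, n_rois, window))  # synthetic BOLD windows

mask = random_mask(n_states, mask_ratio=0.75, rng=rng)  # hypothetical ratio
visible = segments[~mask]  # the only windows an encoder would see

# Placeholder for the encoder-decoder: predict the mean visible window everywhere.
prediction = np.broadcast_to(visible.mean(axis=0), segments.shape)
loss = masked_mse(segments, prediction, mask)
```

In an actual model the placeholder would be replaced by the state encoder, the transformer encoder over visible states, and a decoder that reconstructs the masked windows; the masking and loss bookkeeping would stay essentially the same.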
BrainMAE is aimed at neuroscientists and computational researchers working with resting-state and task fMRI who need representations that transfer across analyses without per-task feature engineering. Demonstrated use cases include predicting demographic and cognitive variables (gender, age, and behavioral measures), estimating task performance such as memory scores and response times, and decoding transient mental states during cognitive tasks. Because pretraining requires no labels, the framework is well suited to pretraining on large unlabeled fMRI archives and then fine-tuning on smaller labeled cohorts, which is useful for biomarker discovery and individual-difference studies where labeled scans are scarce.
BrainMAE contributes to a growing line of work extending the foundation-model and masked-autoencoder paradigms from images and language to brain signals, and is cited in recent surveys of brain foundation models. Its emphasis on learnable region embeddings and a time-series-native masked objective offers an alternative to the connectivity-matrix pipelines that dominate fMRI machine learning, with the added benefit of interpretable, neuroscience-aligned representations. As a notable limitation, no public code repository or pretrained weights were released with the preprint, which constrains immediate reproducibility and downstream adoption despite the reported gains across four benchmark tasks.
Yang, Y., et al. (2024) BrainMAE: A Region-aware Self-supervised Learning Framework for Brain Signals. arXiv.org.
DOI: 10.48550/arXiv.2406.17086