Harbin Institute of Technology (Shenzhen) / Peng Cheng Laboratory
Self-supervised foundation model for functional brain network analysis from fMRI, pretrained on ~70K samples across 30 datasets for disorder diagnosis.
BrainMass is a self-supervised foundation model for analyzing functional brain networks derived from resting-state functional MRI (fMRI). Brain network analysis, which models the brain as a graph of regions connected by functional correlations, is a powerful tool for studying neurological and psychiatric disorders, but individual studies are typically small, heterogeneous, and expensive to label. This data scarcity has historically limited the generalizability of deep learning models in the field. BrainMass addresses this by pretraining a single Transformer encoder on a large, pooled corpus and transferring it to many downstream diagnostic tasks with little or no task-specific supervision.
Developed by researchers at the Harbin Institute of Technology (Shenzhen) and the Peng Cheng Laboratory, the model was introduced in a March 2024 preprint and published in IEEE Transactions on Medical Imaging later that year. The authors position BrainMass as the first foundation model dedicated to brain network analysis, aggregating 70,781 samples from 46,686 participants across 30 datasets to learn broadly transferable representations of functional connectivity.
By learning general-purpose embeddings of brain networks, BrainMass aims to streamline a wide range of clinical and neuroscience workflows that would otherwise each require their own annotated training set, while also surfacing interpretable biomarkers tied to specific disorders.
BrainMass operates on functional connectivity matrices computed from parcellated resting-state fMRI, where each node is a region of interest (ROI) and each edge encodes the correlation between the BOLD signals of two ROIs. The model is a Transformer encoder pretrained with two complementary self-supervised objectives: Mask-ROI Modeling, which masks and reconstructs ROI-level features to capture intra-network structure, and Latent Representation Alignment, which aligns embeddings of augmented network views. A pseudo-functional connectivity augmentation generates millions of synthetic networks by randomly dropping timepoints from the underlying BOLD time series, supplying the diversity needed for large-scale pretraining. Pretraining draws on 70,781 samples from 46,686 participants pooled across 30 public and in-house datasets. The authors evaluate on eight internal tasks and seven external brain-disorder diagnosis tasks, reporting better performance than task-specific baselines along with strong few-shot and zero-shot transfer.
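The input representation and the timepoint-dropping augmentation described above can be illustrated with a minimal NumPy sketch. This is not the authors' released code: the function names, the `keep_fraction` and `mask_fraction` defaults, the use of Pearson correlation, and zero-filling of masked ROI rows are all assumptions chosen for illustration, and the synthetic random data stands in for real parcellated BOLD signals.

```python
import numpy as np


def functional_connectivity(bold):
    """Pearson correlation between ROI time series.

    bold: array of shape (n_timepoints, n_rois), one column per ROI.
    Returns a symmetric (n_rois, n_rois) matrix with ones on the diagonal.
    """
    return np.corrcoef(bold, rowvar=False)


def pseudo_fc(bold, keep_fraction=0.8, rng=None):
    """Build one augmented network by dropping random timepoints.

    keep_fraction is a hypothetical parameter, not a value from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_timepoints = bold.shape[0]
    n_keep = max(2, int(round(keep_fraction * n_timepoints)))
    # Sample timepoints without replacement and keep temporal order.
    kept = np.sort(rng.choice(n_timepoints, size=n_keep, replace=False))
    return functional_connectivity(bold[kept])


def mask_rois(fc, mask_fraction=0.3, rng=None):
    """Zero out the connectivity rows of randomly chosen ROIs.

    Treats each row of the matrix as one ROI's feature vector (an
    assumption); a real model would likely use a learned mask token.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_rois = fc.shape[0]
    n_mask = max(1, int(round(mask_fraction * n_rois)))
    masked_idx = rng.choice(n_rois, size=n_mask, replace=False)
    masked = fc.copy()
    masked[masked_idx, :] = 0.0
    return masked, masked_idx


# Synthetic stand-in for one scan: 200 timepoints, 10 ROIs.
rng = np.random.default_rng(0)
bold = rng.standard_normal((200, 10))

fc = functional_connectivity(bold)
view_a = pseudo_fc(bold, rng=rng)  # two augmented views of the same scan,
view_b = pseudo_fc(bold, rng=rng)  # e.g. as inputs to an alignment objective
masked, masked_idx = mask_rois(view_a, rng=rng)
```

Here `view_a` and `view_b` are two different connectivity matrices derived from the same scan, which is the kind of input pair an alignment objective would consume, while `masked` and `masked_idx` give a reconstruction objective its corrupted input and its targets.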
BrainMass is aimed at computational neuroscientists and clinical researchers working with resting-state fMRI to study and diagnose brain disorders such as autism, Parkinson's disease, and other neurological and psychiatric conditions. Because the pretrained encoder transfers across sites and cohorts, it is especially useful for small studies that lack the data to train a model from scratch, and its few-shot and zero-shot behavior allows rapid prototyping of new diagnostic tasks. Beyond classification, the model's attention over ROIs can highlight candidate biomarkers, supporting interpretability-focused analyses of disorder-specific connectivity patterns.
BrainMass demonstrates that the foundation-model paradigm (large-scale self-supervised pretraining followed by lightweight transfer) extends naturally to functional brain networks, a domain long hampered by fragmented, small datasets. By pooling tens of thousands of scans and releasing code and pretrained weights, it provides a reusable starting point that lowers the barrier to building robust fMRI-based diagnostic models and contributes to a growing wave of neuroimaging foundation models (alongside efforts such as BrainLM). Its emphasis on interpretable biomarkers and zero-shot generalization is particularly relevant for translational applications where labeled clinical data are scarce.
Yang, Y., et al. (2024) BrainMass: Advancing Brain Network Analysis for Diagnosis With Large-Scale Self-Supervised Learning. IEEE Transactions on Medical Imaging.
DOI: 10.1109/TMI.2024.3414476