BrainSymphony

5.6M-parameter multimodal foundation model fusing fMRI time series with diffusion-MRI structural connectivity in a shared ROI embedding space.

Released: June 2025

Parameters: 5.6 Million

BrainSymphony is a multimodal foundation model for human brain dynamics that jointly represents functional MRI (fMRI) time series and diffusion-MRI-derived structural connectivity in a unified region-of-interest (ROI) embedding space. It was developed by Moein Khajehnejad, Forough Habibollahi, Devon Stoliker, and Adeel Razi at Monash University's Turner Institute for Brain and Mental Health, with the preprint first posted to arXiv in June 2025.

The central problem the model addresses is the data and compute cost of existing brain foundation models. Whereas systems such as BrainLM (111M parameters) and Brain-JEPA (85M parameters) require large pretraining corpora, BrainSymphony is deliberately lightweight, totalling roughly 5.6 million parameters, and is designed to perform competitively when labelled neuroimaging data are scarce. It offers plug-and-play integration of functional and structural modalities: it can be trained and deployed in unimodal or multimodal form without architectural changes, making it adaptable to datasets that contain only one imaging type.
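The size gap can be checked with simple arithmetic. The sketch below uses only the parameter counts quoted above; the dictionary and function names are illustrative and do not come from the BrainSymphony codebase.

```python
# Parameter counts quoted in the text above, in millions.
PARAMS_M = {"BrainLM": 111.0, "Brain-JEPA": 85.0, "BrainSymphony": 5.6}


def size_ratio(other: str, reference: str = "BrainSymphony") -> float:
    """Return how many times larger `other` is than `reference`."""
    return PARAMS_M[other] / PARAMS_M[reference]


# Ratio of each larger model to BrainSymphony, rounded to one decimal place.
ratios = {name: round(size_ratio(name), 1) for name in ("BrainLM", "Brain-JEPA")}
print(ratios)  # {'BrainLM': 19.8, 'Brain-JEPA': 15.2}
```

Both ratios fall inside the "roughly 15-20x" range the model card quotes.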

By combining temporal brain signals with anatomical wiring, BrainSymphony sits at the intersection of medical imaging and biosignal modelling, contributing to a growing class of neuro foundation models aimed at decoding individual differences, cognitive states, and clinical phenotypes from non-invasive brain recordings.

Key Features

Parameter efficiency: At ~5.6M parameters, the full multimodal model is roughly 15-20x smaller than BrainLM (111M) and Brain-JEPA (85M), while reportedly matching or exceeding their benchmark performance.
Dual-stream functional encoder: fMRI time series are processed through parallel spatial and temporal transformer streams, then distilled into compact embeddings by a Perceiver module.
Signed graph transformer: A graph transformer encodes anatomical connectivity from diffusion MRI, explicitly handling positively and negatively weighted edges in the structural connectome.
Adaptive fusion: A learned fusion mechanism combines the complementary functional and structural representations, supporting unimodal or multimodal use without re-architecting.
Released checkpoints: Pretrained functional and structural branch checkpoints ship with the Apache-2.0 codebase, with scripts for loading and pretraining on new datasets.
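To make the "unimodal or multimodal without re-architecting" idea concrete, here is a minimal sketch of a gated fusion step that falls back to whichever modality is present. It is not the authors' fusion mechanism: the scalar gate, the function name `fuse`, and the `gate_logit` parameter are assumptions chosen for illustration, and in a trained model the gate would be learned rather than fixed.

```python
import math
from typing import List, Optional, Sequence


def sigmoid(x: float) -> float:
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fuse(
    functional: Optional[Sequence[float]],
    structural: Optional[Sequence[float]],
    gate_logit: float = 0.0,  # hypothetical gate parameter (learned in practice)
) -> List[float]:
    """Blend two ROI embeddings, or pass one through if the other is missing."""
    if functional is None and structural is None:
        raise ValueError("at least one modality is required")
    if structural is None:  # fMRI-only dataset
        return list(functional)
    if functional is None:  # diffusion-MRI-only dataset
        return list(structural)
    if len(functional) != len(structural):
        raise ValueError("embeddings must share the same dimension")
    g = sigmoid(gate_logit)  # weight on the functional stream
    return [g * f + (1.0 - g) * s for f, s in zip(functional, structural)]


print(fuse([1.0, 3.0], [3.0, 1.0]))  # gate_logit=0 gives equal weights: [2.0, 2.0]
print(fuse([1.0, 2.0], None))        # unimodal fallback: [1.0, 2.0]
```

The point of the design is that the calling code has the same signature in both settings, so a dataset lacking one imaging type needs no change to the surrounding pipeline.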

Technical Details

The model operates over a 450-ROI parcellation comprising 400 Schaefer cortical regions plus 50 Tian Scale-III subcortical regions. Pretraining used the Human Connectome Project Young Adult cohort (967 participants, aged 22-35) and HCP-Aging (262 participants for pretraining, 394 held out for testing), with external validation on the PsiConnect psilocybin dataset (54 participants after quality control). Reported results on HCP-Aging include 94.04% accuracy (F1 = 0.933) on sex classification and an age-prediction correlation of rho = 0.841 (MSE = 0.363). In unsupervised functional-network recovery the embeddings reach 84.44% classification accuracy, outperforming raw time series, VAE, and GCN baselines, and BOLD reconstruction achieves a mean R-squared of 0.438 across ROIs.
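The ROI count and the "mean R-squared across ROIs" metric can be illustrated as follows. This is a sketch under assumptions: it takes R-squared to be the usual coefficient of determination computed per ROI over time and then averaged, which the text does not spell out, and the function names and toy signals are invented for the example.

```python
from typing import Sequence

N_CORTICAL = 400     # Schaefer cortical regions (from the text)
N_SUBCORTICAL = 50   # Tian Scale-III subcortical regions (from the text)
N_ROIS = N_CORTICAL + N_SUBCORTICAL  # 450-ROI parcellation


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination for one ROI's time series."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined for a constant signal")
    return 1.0 - ss_res / ss_tot


def mean_r_squared(
    observed_by_roi: Sequence[Sequence[float]],
    predicted_by_roi: Sequence[Sequence[float]],
) -> float:
    """Average the per-ROI R-squared values (assumed aggregation)."""
    scores = [r_squared(o, p) for o, p in zip(observed_by_roi, predicted_by_roi)]
    return sum(scores) / len(scores)


# Toy data: ROI 0 is reconstructed perfectly, ROI 1 is predicted by its mean.
observed = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
predicted = [[1.0, 2.0, 3.0], [2.0, 2.0, 2.0]]
print(N_ROIS)                               # 450
print(mean_r_squared(observed, predicted))  # 0.5
```

Under this definition, a mean of 0.438 would mean that the reconstruction explains a little under half of the BOLD variance in a typical ROI.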

Applications

BrainSymphony targets computational neuroscience and clinical neuroimaging researchers who need expressive brain representations from limited data. Its embeddings support phenotype prediction (age, sex), disease and state classification, and unsupervised discovery of functional networks, and the model's pharmacological validation on psilocybin data suggests utility for studying drug-induced brain-state changes. Because it runs in unimodal or multimodal configurations, it can be applied to legacy datasets containing only fMRI or only diffusion MRI, lowering the barrier for smaller labs to adopt foundation-model workflows.

Impact

BrainSymphony contributes to the argument that brain foundation models need not be large to be effective, demonstrating that careful architectural design (Perceiver distillation, signed graph attention, adaptive fusion) can rival far larger models on neuroimaging benchmarks. By releasing code and pretrained checkpoints under an Apache-2.0 license, the authors make a reproducible, extensible baseline available to the neuro-AI community. Because the preprint dates from June 2025, its long-term adoption is still emerging, and because its evaluation centers on HCP-derived cohorts, broader generalization to clinical populations and other scanners remains to be established.

Citation

BrainSymphony: A parameter-efficient multimodal foundation model for brain dynamics with limited data

Preprint

Khajehnejad, M., et al. (2025) BrainSymphony: A parameter-efficient multimodal foundation model for brain dynamics with limited data.

DOI: 10.48550/arXiv.2506.18314

Citations

Total Citations: 0

Influential: 0

References: 44

GitHub

Stars: 2

Forks: 1

Open Issues: 0

Contributors: 1

Last Push: 5mo ago

Language: Python

License: Apache-2.0

Openness

bio.rodeo openness: Open weights (open weights, closed recipe)

Overall score: 71 (Open)

Usability (can I run it?): 95

Reproducibility (can I retrain it?): 43

Model Openness Framework: Unclassified (missing required components)

Resources

GitHub Repository Research Paper
