BrainOmni

Tsinghua University / Shanghai AI Laboratory / University of Cambridge / University College London

Brain foundation model unifying EEG and MEG in a single encoder via a shared discrete tokenizer that transfers across sensor layouts and montages.

Released: May 2025

Parameters: 33 Million

Electroencephalography (EEG) and magnetoencephalography (MEG) both measure the electromagnetic activity of populations of neurons, but they are typically modeled in isolation. The two modalities use different sensors, channel layouts, and recording montages, so a decoder trained on one configuration rarely transfers to another. This fragmentation has limited brain-signal modeling to small, dataset-specific networks, even as foundation models have transformed genomics, protein modeling, and natural language processing. BrainOmni addresses this gap by learning a single representation that spans both modalities and arbitrary sensor arrangements.

Introduced in May 2025 and accepted to NeurIPS 2025, BrainOmni is presented by its authors as the first brain foundation model to unify EEG and MEG. It was developed by a collaboration led by Chao Zhang's group at Tsinghua University (the OpenTSLab team), together with Shanghai AI Laboratory, the MRC Cognition and Brain Sciences Unit at the University of Cambridge, and University College London. The model is released openly with code and pretrained weights under an MIT license.

The core idea is a two-stage design: a tokenizer first converts heterogeneous raw recordings into a shared vocabulary of discrete brain tokens, and a transformer is then pretrained on these tokens with a self-supervised objective. This decouples the model from any specific electrode or sensor layout and lets EEG and MEG contribute to the same representation space.

Key Features

Unified EEG and MEG modeling: A single pretrained model ingests both EEG and MEG recordings, rather than requiring separate modality-specific decoders.
BrainTokenizer with sensor encoding: A residual vector-quantization tokenizer (4 codebook levels, 16 latent source variables) discretizes spatiotemporal brain activity, while a dedicated Sensor Encoder handles varying channel counts and montages.
Criss-cross transformer: BrainOmni applies separate spatial and temporal attention pathways so that both the sensor geometry and the time dynamics of brain signals are captured.
Self-supervised pretraining at scale: The model is pretrained on 1,997 hours of EEG and 656 hours of MEG curated from public datasets, with no task labels required.
Open release in two sizes: Tiny (8.4M parameters) and base (33M parameters) checkpoints, plus the tokenizer, are available on Hugging Face under an MIT license.
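The residual vector quantization named in the tokenizer feature can be illustrated with a minimal NumPy sketch. It is not the BrainTokenizer implementation: the codebooks here are random rather than learned, and the feature dimension, codebook size, and the choice to treat each of the 16 latent source variables as one vector are assumptions made for illustration. Only the four codebook levels and the count of 16 come from the description above.

```python
import numpy as np

def rvq_encode(x, codebooks):
    """Residual vector quantization.

    x: array of shape (n, d), one row per vector to quantize.
    codebooks: list of arrays of shape (k, d), one per level.
    Returns (indices of shape (n, levels), quantized of shape (n, d)).
    """
    residual = x.copy()
    quantized = np.zeros_like(x)
    indices = []
    for cb in codebooks:
        # Squared distance from every residual to every code: shape (n, k).
        d2 = ((residual[:, None, :] - cb[None, :, :]) ** 2).sum(axis=-1)
        idx = d2.argmin(axis=1)
        chosen = cb[idx]
        quantized += chosen   # running reconstruction
        residual -= chosen    # what the next level still has to explain
        indices.append(idx)
    return np.stack(indices, axis=1), quantized

rng = np.random.default_rng(0)
dim, codebook_size, levels = 8, 32, 4   # dim and codebook_size are hypothetical
n_sources = 16                          # 16 latent source variables (from the text)

# Random stand-ins for learned codebooks; later levels use a finer scale.
codebooks = [rng.normal(size=(codebook_size, dim)) * 0.5 ** level
             for level in range(levels)]
x = rng.normal(size=(n_sources, dim))

tokens, x_hat = rvq_encode(x, codebooks)
print(tokens.shape)  # one token per source per codebook level
```

Each row of tokens holds four integer codes, one per level, and the reconstruction is the sum of the selected code vectors across levels. In a trained tokenizer the codebooks would be learned so that each level reduces the remaining residual.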

Technical Details

BrainOmni separates signal quantization from semantic learning. BrainTokenizer encodes raw multi-channel recordings into discrete tokens using residual vector quantization with four codebook levels and 16 latent source variables, while a Sensor Encoder maps differing electrode or sensor configurations into a common space so that heterogeneous datasets can be tokenized consistently.

The BrainOmni backbone is a criss-cross transformer with distinct spatial and temporal attention. It is pretrained with self-supervision on 2,653 hours of recordings in total (1,997 EEG, 656 MEG), drawn from sources such as TUAB and numerous OpenNeuro datasets. Two sizes are released: an 8.4M-parameter tiny model and a 33M-parameter base model.

On downstream benchmarks, the base model reports balanced accuracies of 0.828 on AD65 (vs. 0.711 for LaBraM) and 0.819 on TUAB for EEG, 0.651 on the ASD74 MEG task, and 0.832 on a multimodal SomatoMotor benchmark, an improvement of roughly 20% over specialized baselines.
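The idea of distinct spatial and temporal attention can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the BrainOmni architecture: the attention here has a single head and no learned projections, and splitting the feature dimension into a spatial half and a temporal half is an assumed design used only to show the two pathways side by side. All shapes are hypothetical.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    """Single-head self-attention without learned projections.

    x: (batch, seq, dim); attention runs over the seq axis.
    """
    scores = x @ x.transpose(0, 2, 1) / np.sqrt(x.shape[-1])
    return softmax(scores, axis=-1) @ x

def criss_cross_block(x):
    """x: (sources, time, dim). Returns an array of the same shape."""
    half = x.shape[-1] // 2
    xs, xt = x[..., :half], x[..., half:]
    # Spatial pathway: at each time step, attend across sources.
    spatial = self_attention(xs.transpose(1, 0, 2)).transpose(1, 0, 2)
    # Temporal pathway: for each source, attend across time steps.
    temporal = self_attention(xt)
    return np.concatenate([spatial, temporal], axis=-1)

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 10, 8))  # 16 sources, 10 time steps, 8 features
out = criss_cross_block(x)
print(out.shape)
```

In this sketch the temporal half of a source's output depends only on that source's own history, while the spatial half mixes information across sources at a single time step. A full transformer block would add learned query, key, and value projections, multiple heads, residual connections, and feed-forward layers.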

Applications

BrainOmni is intended as a general-purpose backbone for neural decoding across both EEG and MEG. Researchers can fine-tune it for clinical and cognitive tasks such as detecting Alzheimer's disease or autism spectrum signatures, abnormal-EEG screening, motor-imagery classification for brain-computer interfaces, and somatomotor decoding. Because the tokenizer normalizes across montages and sensors, the model is especially useful when combining datasets recorded on different hardware, or when a target task has too little labeled data to train a decoder from scratch. The open MIT-licensed weights make it practical to adopt in academic neuroscience and BCI pipelines.
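Clinical tasks such as these are often class-imbalanced, which is why the results quoted for BrainOmni are balanced accuracies rather than plain accuracies. The sketch below shows how that metric is computed as the mean of per-class recall. The labels are made-up toy values, not data from any BrainOmni benchmark.

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))

# Toy example: 8 recordings of class 0 and 2 of class 1 (hypothetical labels).
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [1, 0]   # one of the two class-1 recordings is missed

plain = float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))
print(plain)                              # 0.9
print(balanced_accuracy(y_true, y_pred))  # 0.75
```

Plain accuracy rewards the classifier for the majority class, while balanced accuracy weights the two classes equally, so missing half of the minority class pulls the score down to 0.75.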

Impact

By demonstrating that EEG and MEG can share a single tokenized representation, BrainOmni extends the foundation-model paradigm to multimodal neurophysiology and provides a route to pooling the field's fragmented public datasets. Its reported gains over modality-specific foundation models such as LaBraM and CBraMod, together with strong multimodal transfer, suggest that cross-modality pretraining is a productive direction for brain-signal modeling. As a recent NeurIPS 2025 contribution, its broader influence is still emerging, and the released checkpoints are modest in size (8.4M-33M parameters) compared with foundation models in other domains, leaving headroom for scaling in future work.

Citation

BrainOmni: A Brain Foundation Model for Unified EEG and MEG Signals

Preprint

Xiao, Q., et al. (2025) BrainOmni: A Brain Foundation Model for Unified EEG and MEG Signals. arXiv.org.

DOI: 10.48550/arXiv.2505.18185

Recent citations

Papers that recently cited this model.

BrainJanus: A Unified Model for Understanding and Generation across Brain, Vision, and Language
Haitao Wu, Qirui Zhang, Zhouheng Yao, et al.
Jun 2026
0
Stimulus identity rather than emotion drives EEG classification on the FACED dataset
M. Gerster, E. Sirotina, A. Orlovskii, et al.
bioRxiv · Jun 2026
0
Next-Token Prediction Learns Generalisable Representations of Sleep Physiology
Jonathan Carter, Lionel Tarassenko
Jun 2026
0 · Influential

Top citations

The most-cited papers that cite this model.

DeeperBrain: A Neuro-Grounded EEG Foundation Model Towards Universal BCI
Jiquan Wang, Sha Zhao, Yangxuan Zhou, et al.
arXiv.org · Jan 2026
7
Neural Codecs as Biosignal Tokenizers
Kleanthis Avramidis, Tiantian Feng, Woojae Jeong, et al.
arXiv.org · Oct 2025
7
NeuroRVQ: Multi-Scale Biosignal Tokenization for Generative Foundation Models
Konstantinos Barmpas, Na Lee, Dimitrios Chalatsis, et al.
Oct 2025
6
Brain4FMs: A Benchmark of Foundation Models for Electrical Brain Signal
Fanqi Shen, En-Hui Yang, Jiahe Li, et al.
arXiv.org · Feb 2026
4
Wearable Foundation Models Should Go Beyond Static Encoders
Y. Wu, Yuwei Zhang, Hyungjun Yoon, et al.
Mar 2026
2

Citations

Total Citations: 28

Influential: 3

References: 79

GitHub

Stars: 71

Forks: 10

Open Issues: 1

Contributors: 1

Last Push: 7mo ago

Language: Jupyter Notebook

HuggingFace

Downloads: 0

Likes: 3

Last Modified: 9mo ago

Fields of citing research

Computer Science: 96%
Medicine: 44%
Engineering: 30%
Biology: 26%
Physics: 4%
Linguistics: 4%
Education: 4%

Share of papers citing this model.

Openness

bio.rodeo openness: Fully open · usable and reproducible

80 (Open)

Usability (can I run it?): 95

Reproducibility (can I retrain it?): 66

Model Openness Framework

Unclassified

Missing required components

Resources

GitHub Repository · Research Paper · HuggingFace Model
