MMM

Microsoft / South China University of Technology

EEG pretraining framework mapping any electrode montage to a unified topology for topology-agnostic representations that transfer across datasets.

Released: December 2023

Scalp electroencephalography (EEG) is a rich, abundant, and largely unlabeled signal, which makes it a natural candidate for the kind of large-scale self-supervised pretraining that has transformed vision and language. A persistent obstacle, however, is heterogeneity of acquisition: different EEG datasets use different numbers of electrodes, placed at different positions according to different montages. Models trained on one channel configuration typically cannot ingest data recorded with another, which fragments the available data and prevents the assembly of a single large pretraining corpus.

MMM (named for its Multi-dimensional position encoding, Multi-level channel hierarchy, and Multi-stage pretraining) addresses this problem by mapping every channel selection onto a single unified electrode topology, so that recordings from incompatible montages can be pretrained together. The result is a topology-agnostic representation that transfers across datasets regardless of the original electrode layout. MMM was introduced by Ke Yi, Yansen Wang, Kan Ren, and Dongsheng Li at Microsoft Research Asia (the first author worked on it as an intern affiliated with South China University of Technology) and presented at NeurIPS 2023.
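The idea of projecting any montage onto one shared layout can be illustrated with a minimal sketch. The ten-electrode `UNIFIED_CHANNELS` list and the `to_unified` helper below are hypothetical names chosen for illustration, not MMM's actual layout or API; the sketch only assumes that channels can be matched by name and that unrecorded slots are zero-filled and flagged.

```python
import numpy as np

# Hypothetical unified layout: a small subset of 10-20 electrode names,
# used here for illustration only (not the layout MMM itself defines).
UNIFIED_CHANNELS = ["FP1", "FP2", "F3", "F4", "C3", "C4", "P3", "P4", "O1", "O2"]
UNIFIED_INDEX = {name: i for i, name in enumerate(UNIFIED_CHANNELS)}


def to_unified(features, channel_names):
    """Scatter (n_channels, n_features) data into the unified layout.

    Returns the zero-padded array and a boolean vector marking which
    unified slots were actually recorded. Channels whose names are not
    in the unified layout are dropped.
    """
    features = np.asarray(features, dtype=float)
    unified = np.zeros((len(UNIFIED_CHANNELS), features.shape[1]))
    present = np.zeros(len(UNIFIED_CHANNELS), dtype=bool)
    for row, name in zip(features, channel_names):
        idx = UNIFIED_INDEX.get(name.upper())
        if idx is not None:
            unified[idx] = row
            present[idx] = True
    return unified, present
```

Two recordings with different electrode sets then share one array shape, and the `present` vector tells a downstream model which slots carry real data.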

The framework is built as a masked autoencoder, learning to reconstruct deliberately hidden portions of the EEG signal and thereby acquiring representations that capture the spatial and structural regularities of brain activity. By unifying topology rather than restricting itself to a fixed sensor set, MMM offers a route toward genuinely reusable EEG foundation models.
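As a rough illustration of the masked-autoencoder objective, the sketch below scores a reconstruction only on the tokens that were hidden from the encoder. Using mean squared error is an assumption borrowed from standard masked autoencoders, and `masked_mse` is a hypothetical helper, not a function from the MMM code.

```python
import numpy as np


def masked_mse(target, prediction, mask):
    """Mean squared error over masked tokens only.

    target, prediction: arrays of shape (n_tokens, n_features).
    mask: boolean array of shape (n_tokens,), True where the token was
    hidden from the encoder and must be reconstructed.
    """
    target = np.asarray(target, dtype=float)
    prediction = np.asarray(prediction, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        return 0.0  # nothing was masked, so there is nothing to score
    diff = prediction[mask] - target[mask]
    return float(np.mean(diff ** 2))
```

Errors on visible tokens do not contribute, so the model is rewarded only for inferring what it could not see.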

Key Features

Unified topology: All channel selections, regardless of montage or electrode count, are projected onto one common electrode topology, allowing datasets with different configurations to be pretrained jointly.
Multi-dimensional position encoding: Geometric and spatial information about electrode locations is injected directly into channel tokens, giving the model an explicit sense of where each signal originates on the scalp.
Multi-level channel hierarchy: Aggregated regional tokens are modeled alongside individual channel tokens, enabling the network to reason about local channels and broader brain regions simultaneously.
Multi-stage pretraining: Training alternates between global random masking and regional masking, which together encourage robust reconstruction even at high mask ratios where standard masked autoencoders degrade.
Cross-dataset transfer: Because representations are topology-agnostic, a model pretrained on one corpus can be fine-tuned on downstream datasets with entirely different electrode setups.
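Two of the features above, position encoding from electrode coordinates and region-level tokens, can be sketched in a few lines. This is an illustration under stated assumptions rather than MMM's implementation: it assumes 2-D scalp coordinates, uses the standard sinusoidal encoding per axis, and builds region tokens by mean pooling, whereas the actual model may use a different encoding or learned region tokens. All function names are hypothetical.

```python
import numpy as np


def sincos_1d(values, dim):
    """Standard sinusoidal encoding of scalar positions into `dim` features."""
    assert dim % 2 == 0
    values = np.asarray(values, dtype=float)
    half = dim // 2
    freqs = 1.0 / (10000.0 ** (np.arange(half) / half))
    angles = values[:, None] * freqs[None, :]
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)


def position_encoding_2d(xy, dim):
    """Encode x and y separately and concatenate: (n_channels, dim)."""
    assert dim % 4 == 0
    xy = np.asarray(xy, dtype=float)
    enc_x = sincos_1d(xy[:, 0], dim // 2)
    enc_y = sincos_1d(xy[:, 1], dim // 2)
    return np.concatenate([enc_x, enc_y], axis=1)


def region_tokens(channel_tokens, region_ids, n_regions):
    """Mean-pool channel tokens into one token per region (an assumption).

    Regions with no member channels keep an all-zero token.
    """
    channel_tokens = np.asarray(channel_tokens, dtype=float)
    region_ids = np.asarray(region_ids)
    out = np.zeros((n_regions, channel_tokens.shape[1]))
    for r in range(n_regions):
        members = channel_tokens[region_ids == r]
        if len(members):
            out[r] = members.mean(axis=0)
    return out
```

In a setup like this, the position encoding would be added to each channel token before the encoder, and the region tokens would be appended to the token sequence so attention can operate at both levels.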

Technical Details

MMM uses a masked-autoencoder architecture with a transformer encoder-decoder bottleneck. Input EEG is represented as differential entropy (DE) features per channel; a subset of channel-time tokens is masked, the encoder produces a unified representation from the visible tokens, and a lightweight decoder reconstructs the masked entries. The multi-stage schedule applies global random masking and regional masking in sequence so that the encoder learns both fine-grained and region-level structure, sustaining high reconstruction quality at aggressive masking ratios.

The released base encoder (tuh_pretrained_encoder_base.pt) is pretrained on the large Temple University Hospital (TUH) EEG corpus and distributed through the project page. On the SEED and SEED-IV emotion-recognition benchmarks, MMM reports improvements over prior state-of-the-art EEG representation methods. Reference code is provided in Microsoft's PhysioPro framework under an MIT license; the authors note ongoing investigation of the use of DE features for SEED and work toward training directly on raw EEG signals.
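The input features and the two masking styles described in this section can be sketched as follows. The DE formula is the usual closed form for a Gaussian signal, DE = 0.5 * ln(2 * pi * e * variance), applied to a segment that is assumed to be band-filtered already. The mask generators are hypothetical helpers that work per channel for simplicity; the real pretraining schedule, mask ratios, and region definitions are not specified here and are not taken from the MMM code.

```python
import numpy as np


def differential_entropy(segment):
    """DE per channel for a (n_channels, n_samples) segment.

    Assumes each (already band-filtered) channel is roughly Gaussian, so
    DE = 0.5 * ln(2 * pi * e * variance). A constant channel has zero
    variance and would yield -inf; that case is not handled here.
    """
    var = np.var(np.asarray(segment, dtype=float), axis=1)
    return 0.5 * np.log(2.0 * np.pi * np.e * var)


def global_random_mask(n_channels, ratio, rng):
    """Hide a random `ratio` of channels anywhere on the scalp."""
    n_masked = int(round(ratio * n_channels))
    mask = np.zeros(n_channels, dtype=bool)
    mask[rng.choice(n_channels, size=n_masked, replace=False)] = True
    return mask


def regional_mask(region_ids, n_masked_regions, rng):
    """Hide every channel that belongs to randomly chosen regions."""
    region_ids = np.asarray(region_ids)
    regions = np.unique(region_ids)
    chosen = rng.choice(regions, size=n_masked_regions, replace=False)
    return np.isin(region_ids, chosen)
```

Global masking forces the model to interpolate from scattered neighbors, while regional masking removes a whole area at once, so reconstruction has to rely on longer-range structure.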

Applications

MMM targets researchers and engineers building EEG decoding systems who must combine or transfer across datasets with mismatched electrode configurations. Its most directly demonstrated application is affective computing, specifically emotion recognition on the SEED and SEED-IV datasets. The topology-agnostic design is intended to carry over to other downstream EEG tasks, such as brain-computer interfaces, clinical monitoring, and neuroscience analysis, although the reported experiments do not evaluate these. By providing a pretrained base encoder, MMM lowers the labeled-data burden for groups that cannot collect large annotated EEG corpora of their own.

Impact

MMM was one of the early demonstrations that EEG pretraining can be made montage-independent, directly tackling the channel-heterogeneity problem that previously prevented EEG datasets from being pooled. Its framing of diverse electrode layouts as projections onto a shared topology has been cited by later EEG foundation-model work pursuing cross-dataset generality, including the Large Brain Model, NeuroLM, and EEGFormer papers. Its open availability through the PhysioPro framework, together with a downloadable TUH-pretrained checkpoint, makes it a practical starting point for transfer learning. Limitations include reliance on differential-entropy features in the reported experiments and evaluation centered on emotion-recognition benchmarks, leaving broader clinical validation and raw-signal pretraining as acknowledged future work.

Citation

Learning Topology-Agnostic EEG Representations with Geometry-Aware Modeling

Yi, K., et al. (2023) Learning Topology-Agnostic EEG Representations with Geometry-Aware Modeling. Neural Information Processing Systems.

DOI: 10.52202/075280-2344

Recent citations

Papers that recently cited this model.

Learning unified brain region representation for cross-dataset EEG-based emotion recognition
Wei Li, Shaojie Wu, Huafu Xu, et al.
Biomedical Signal Processing and Control · 2026
Citations: 0

A dual-branch network with brain region-constrained attention for EEG emotion recognition
Chengyun Hua, H. Cao, Zhaolong Li, et al.
Frontiers in Neuroscience · Jul 2026
Citations: 0

RECTOR: Masked Region-Channel-Temporal Modeling for Affective and Cognitive Representation Learning
Jinhan Liu, Mahsa Shoaran
Jun 2026
Citations: 0 · Influential

Top citations

The most-cited papers that cite this model.

Large Brain Model for Learning Generic Representations with Tremendous EEG Data in BCI
Wei-Bang Jiang, Li-Ming Zhao, Bao-Liang Lu
International Conference on Learning Representations · May 2024
Citations: 344

NeuroLM: A Universal Multi-task Foundation Model for Bridging the Gap between Language and EEG Signals
Wei-Bang Jiang, Yansen Wang, Bao-Liang Lu, et al.
International Conference on Learning Representations · Aug 2024
Citations: 93

EEGFormer: Towards Transferable and Interpretable Large-Scale EEG Foundation Model
Yuqi Chen, Kan Ren, Kaitao Song, et al.
arXiv.org · Jan 2024
Citations: 60

ITFormer: Bridging Time Series and Natural Language for Multi-Modal QA with Large-Scale Multitask Dataset
Yilin Wang, Peixuan Lei, Jie Song, et al.
International Conference on Machine Learning · Jun 2025
Citations: 39

Enhancing EEG-to-Text Decoding through Transferable Representations from Pre-trained Contrastive EEG-Text Masked Autoencoder
Jiaqi Wang, Zhenxi Song, Zhengyu Ma, et al.
Annual Meeting of the Association for Computational Linguistics · Feb 2024
Citations: 27

Citations

Total Citations: 91
Influential: 3
References: 29

GitHub

Stars: 111
Forks: 15
Open Issues: 5
Contributors: 5
Last Push: 8d ago
Language: Python
License: MIT

Fields of citing research

Computer Science: 99%
Medicine: 49%
Engineering: 48%
Biology: 14%
Psychology: 3%
Environmental Science: 2%
Linguistics: 1%
Physics: 1%

Share of papers citing this model.

Openness

bio.rodeo openness: Fully open · usable and reproducible
Score: 60 (Partial)
Usability (can I run it?): 71
Reproducibility (can I retrain it?): 57

Model Openness Framework: Unclassified
Restrictive license on core components

