LaBraM

EEG foundation model that learns transferable brain-signal representations with a vector-quantized tokenizer and masked transformer pretraining.

Released: May 2024

LaBraM (Large Brain Model) is a foundation model for electroencephalography (EEG) that learns generic, transferable representations of brain activity rather than being tuned to a single dataset or task. EEG research has long been fragmented: recordings differ in the number and placement of electrodes, sampling rates, and recording duration, which forces most deep-learning approaches to train narrow, dataset-specific models that fail to exploit the growing body of available data. LaBraM addresses this by treating raw EEG channels as patchable signals that can be tokenized and modeled with a transformer, enabling pretraining across many heterogeneous datasets and montages.

Developed by Wei-Bang Jiang, Li-Ming Zhao, and Bao-Liang Lu at Shanghai Jiao Tong University, LaBraM was presented as a Spotlight at ICLR 2024, and the paper was submitted to arXiv on 29 May 2024. It is one of the first EEG foundation models to combine a learned discrete tokenizer with large-scale masked pretraining, and it is the predecessor of NeuroLM, the authors' later multimodal EEG-language model.

The core idea is to decouple representation learning from any single downstream objective. By segmenting EEG into channel patches and learning a neural codebook that captures spectral structure, LaBraM can be pretrained self-supervised on unlabeled data and then fine-tuned on diverse brain-computer interface (BCI) tasks with minimal architectural changes.

Key Features

Channel-patch tokenization: Each EEG channel is divided into non-overlapping windows (200 time points = 1 second at 200 Hz), so recordings with arbitrary channel counts and lengths can be encoded into a uniform sequence of patches.
Vector-quantized neural tokenizer: A VQ tokenizer trained by neural spectrum prediction maps continuous EEG patches to discrete codes drawn from an 8192-entry codebook, giving the model a semantically rich, spectrum-aware vocabulary.
Masked-transformer pretraining: Neural transformers are pretrained to predict the original codes of masked EEG patches, a BERT-style objective that learns representations without task labels.
Cross-dataset generalization: Pretraining on roughly 20 datasets lets a single model transfer across abnormal detection, event classification, emotion recognition, and gait prediction.
Scalable model family: Three sizes (Base ~5.8M, Large ~46M, Huge ~369M parameters) allow trading compute for accuracy; Base is the largest publicly released checkpoint.
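
The channel-patch tokenization described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the released implementation: the function name to_channel_patches, the random test recordings, and the choice to drop trailing samples that do not fill a whole window are all assumptions made for the example.

```python
import numpy as np

PATCH_LEN = 200  # time points per patch (1 second at 200 Hz, as stated above)


def to_channel_patches(eeg: np.ndarray, patch_len: int = PATCH_LEN) -> np.ndarray:
    """Split a (channels, time) recording into a flat sequence of patches.

    Returns an array of shape (channels * n_windows, patch_len).
    Trailing samples that do not fill a whole window are dropped
    (an assumption of this sketch).
    """
    n_channels, n_samples = eeg.shape
    n_windows = n_samples // patch_len
    trimmed = eeg[:, : n_windows * patch_len]
    # Non-overlapping windows per channel: (channels, windows, patch_len)
    patches = trimmed.reshape(n_channels, n_windows, patch_len)
    # Flatten channels and windows into one patch sequence
    return patches.reshape(n_channels * n_windows, patch_len)


rng = np.random.default_rng(0)
rec_a = rng.standard_normal((19, 2000))  # 19 channels, 10 s at 200 Hz
rec_b = rng.standard_normal((62, 850))   # 62 channels, 4.25 s at 200 Hz

patches_a = to_channel_patches(rec_a)
patches_b = to_channel_patches(rec_b)
print(patches_a.shape)  # (190, 200)
print(patches_b.shape)  # (248, 200)
```

Both recordings end up as sequences of 200-sample patches even though their channel counts and durations differ, which is what lets one model consume heterogeneous datasets.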

Technical Details

LaBraM uses a two-stage recipe. First, a vector-quantized neural spectrum prediction (VQ-NSP) tokenizer is trained to reconstruct the Fourier amplitude and phase of EEG patches, yielding a codebook of 8192 discrete embeddings (each 64-dimensional). Second, a neural transformer is pretrained with a masked-modeling objective: a subset of channel patches is masked and the model predicts their codebook indices. Pretraining used about 2,500 hours of EEG aggregated from around 20 public datasets. The released checkpoints include the VQ-NSP tokenizer (vqnsp.pth) and the pretrained labram-base transformer. On the TUAB abnormal-detection benchmark, LaBraM-Base reaches 0.814 balanced accuracy and 0.902 AUROC; on the six-class TUEV event-type task it reports 0.641 balanced accuracy, 0.664 Cohen's kappa, and 0.831 weighted F1, outperforming prior task-specific baselines.
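
The two-stage recipe can be illustrated with a small NumPy sketch. It only shows the shape of the computation under stated assumptions: the encoder is a random projection standing in for the learned network, the codebook is random rather than trained, nearest-neighbor lookup uses squared Euclidean distance, and the 0.5 mask ratio is chosen for illustration. None of these details are taken from the text above, and the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
CODEBOOK_SIZE, CODE_DIM, PATCH_LEN = 8192, 64, 200  # sizes quoted above


def spectrum_targets(patches):
    """Fourier amplitude and phase of each patch (the tokenizer's reconstruction targets)."""
    spectrum = np.fft.rfft(patches, axis=-1)
    return np.abs(spectrum), np.angle(spectrum)


def quantize(latents, codebook):
    """Index of the nearest codebook entry for each latent vector."""
    # Squared distance via ||z - e||^2 = ||z||^2 - 2 z.e + ||e||^2
    dist = (
        (latents**2).sum(axis=1, keepdims=True)
        - 2.0 * latents @ codebook.T
        + (codebook**2).sum(axis=1)
    )
    return dist.argmin(axis=1)


def random_mask(n_patches, ratio, rng):
    """Boolean mask with round(ratio * n_patches) randomly chosen True entries."""
    mask = np.zeros(n_patches, dtype=bool)
    chosen = rng.choice(n_patches, size=int(round(ratio * n_patches)), replace=False)
    mask[chosen] = True
    return mask


patches = rng.standard_normal((190, PATCH_LEN))  # e.g. 19 channels x 10 windows

# Stage 1: spectrum targets and discrete codes
amplitude, phase = spectrum_targets(patches)
encoder = rng.standard_normal((PATCH_LEN, CODE_DIM)) / np.sqrt(PATCH_LEN)  # stand-in encoder
codebook = rng.standard_normal((CODEBOOK_SIZE, CODE_DIM))                  # stand-in codebook
latents = patches @ encoder
codes = quantize(latents, codebook)

# Stage 2: mask patches; the transformer would be trained to predict codes[mask]
mask = random_mask(len(patches), ratio=0.5, rng=rng)
targets = codes[mask]

print(amplitude.shape, phase.shape)  # (190, 101) (190, 101)
print(codes.shape, targets.shape)    # (190,) (95,)
```

In the real model, the transformer sees the sequence with the masked patches hidden and is trained with a classification loss over the 8192 codebook indices at the masked positions.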

Applications

LaBraM targets BCI and clinical EEG workflows where labeled data is scarce but unlabeled recordings are plentiful. After self-supervised pretraining, it can be fine-tuned for pathology screening (e.g., abnormal-EEG detection), seizure and event-type classification, affective computing (emotion recognition), and motor or gait decoding. Because the tokenizer accepts arbitrary channel configurations, researchers can apply a single pretrained backbone to recordings from different hardware without redesigning the model, lowering the barrier to building EEG classifiers in neurology, sleep research, and human-machine interaction.
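
One common way to let a single backbone accept different montages, assumed here for illustration and not described in the text above, is to map each recording's channel names onto a shared electrode vocabulary and use the resulting indices to look up per-electrode embeddings. The vocabulary, the function name channel_indices, and the example montages below are all hypothetical.

```python
# Hypothetical shared electrode vocabulary; a real one would cover many more sites.
ELECTRODE_VOCAB = ["FP1", "FP2", "F3", "F4", "C3", "C4", "P3", "P4", "O1", "O2", "CZ", "PZ"]
ELECTRODE_INDEX = {name: i for i, name in enumerate(ELECTRODE_VOCAB)}


def channel_indices(ch_names):
    """Map a recording's channel names to indices in the shared vocabulary.

    Names are stripped and upper-cased before lookup; an unknown name
    raises ValueError so that unsupported electrodes are not silently dropped.
    """
    indices = []
    for name in ch_names:
        key = name.strip().upper()
        if key not in ELECTRODE_INDEX:
            raise ValueError(f"unknown electrode: {name!r}")
        indices.append(ELECTRODE_INDEX[key])
    return indices


montage_a = ["Fp1", "Fp2", "C3", "C4", "O1", "O2"]  # six-channel headset
montage_b = ["Cz", "Pz", "C3"]                      # three-channel headset

print(channel_indices(montage_a))  # [0, 1, 4, 5, 8, 9]
print(channel_indices(montage_b))  # [10, 11, 4]
```

With such a mapping, the same electrode (C3 here) receives the same index regardless of which headset recorded it, so the model input changes only in length and not in structure.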

Impact

LaBraM helped establish the foundation-model paradigm for EEG, showing that a single self-supervised backbone can match or beat bespoke models across several benchmarks. Its ICLR 2024 Spotlight and openly released code and weights (MIT license, 600+ GitHub stars) made it a widely used reference point for subsequent EEG and biosignal foundation models, and it directly seeded the authors' follow-up NeuroLM. Its main limitations mirror the field's: the largest released checkpoint is Base, evaluation centers on a handful of public benchmarks, and cross-subject and cross-montage robustness in deployment remain active research questions.

Citation

Large Brain Model for Learning Generic Representations with Tremendous EEG Data in BCI

Preprint

Jiang, W.-B., Zhao, L.-M., and Lu, B.-L. (2024) Large Brain Model for Learning Generic Representations with Tremendous EEG Data in BCI. International Conference on Learning Representations.

DOI: 10.48550/arXiv.2405.18765

Recent citations

Papers that recently cited this model.

Self-Supervised Pre-Training for EEG denoising
Yilin Han, Aiping Liu, Heng Cui, et al.
Advanced Engineering Informatics · 2026
Citations: 0

PGAP: Purity-Guided Active Prompting for EEG Decoding With LLMs [Research Frontier]
Jingwei Luo, Ziwei Wang, Dingkun Liu, et al.
IEEE Computational Intelligence Magazine · Aug 2026
Citations: 0

Leveraging unlabelled data for generalizable neural population decoding
Ximeng Mao, Nanda H Krishna, Avery Hee-Woon Ryoo, et al.
Jul 2026
Citations: 0

Top citations

The most-cited papers that cite this model.

EEGPT: Pretrained Transformer for Universal and Reliable Representation of EEG Signals
Guangyu Wang, Wenchao Liu, Yuhong He, et al.
Neural Information Processing Systems · 2024
Citations: 171

CBraMod: A Criss-Cross Brain Foundation Model for EEG Decoding
Jiquan Wang, Sha Zhao, Zhiling Luo, et al.
International Conference on Learning Representations · Dec 2024
Citations: 164 · Influential

NeuroLM: A Universal Multi-task Foundation Model for Bridging the Gap between Language and EEG Signals
Wei-Bang Jiang, Yansen Wang, Bao-Liang Lu, et al.
International Conference on Learning Representations · Aug 2024
Citations: 93 · Influential

REVE: A Foundation Model for EEG - Adapting to Any Setup with Large-Scale Pretraining on 25,000 Subjects
Yassine El Ouahidi, Jonathan Lys, Philipp Thölke, et al.
arXiv.org · Oct 2025
Citations: 44 · Influential

Reading Your Heart: Learning ECG Words and Sentences via Pre-training ECG Language Model
Jiarui Jin, Haoyu Wang, Hongyan Li, et al.
International Conference on Learning Representations · Feb 2025
Citations: 42

Citations

Total Citations: 398
Influential: 109
References: 71

GitHub

Stars: 646
Forks: 120
Open Issues: 54
Contributors: 2
Last Push: 10mo ago
Language: Python
License: MIT

Fields of citing research

Share of papers citing this model.

Computer Science: 98%
Medicine: 53%
Engineering: 44%
Biology: 17%
Psychology: 3%
Physics: 3%
Linguistics: 2%
Mathematics: 1%

Openness

bio.rodeo openness: Fully open · usable and reproducible
Openness score: 72 (Open)
Usability (can I run it?): 94
Reproducibility (can I retrain it?): 64

Model Openness Framework: Unclassified (restrictive license on core components)

Resources

GitHub Repository Research Paper Official Website

