BELT-2

EEG-to-language foundation model that pairs a Q-Conformer encoder with a frozen LLM to decode coherent sentences from non-invasive brain recordings.

Released: August 2024

BELT-2 (Bootstrapping EEG-to-Language representation alignment for multi-task brain decoding) is a multi-task foundation model that translates non-invasive electroencephalography (EEG) recordings into natural language. Reading out language directly from scalp EEG is far harder than from invasive electrocorticography because surface signals are noisy, low in spatial resolution, and only weakly coupled to the words a person reads or imagines. BELT-2 tackles this by reframing brain decoding as a representation-alignment problem: it learns to map EEG features into the same embedding space as the subword tokens of a large language model, then lets a frozen LLM generate fluent text from those aligned representations.

The model was introduced in August 2024 by Jinzhao Zhou, Yiqun Duan, Thomas Do, Yu-Kai Wang, Chin-Teng Lin, and colleagues at the University of Technology Sydney. Its headline result, which the authors describe as the first decoding of coherent, readable sentences from non-invasive brain signals, is a BLEU-1 score of 52.2% on ZuCo, a word-reading EEG benchmark. Reported translation gains over prior EEG-to-text systems range from roughly 31% to 162%, depending on the metric.
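To make the BLEU-1 figure concrete, the following is a minimal sketch of sentence-level BLEU-1: clipped unigram precision multiplied by a brevity penalty. The function name `bleu1`, the whitespace tokenization, and the example sentences are assumptions for illustration only. Published scores are normally computed at corpus level with a standard toolkit, so this sketch is not the evaluation code behind the 52.2% result.

```python
import math
from collections import Counter


def bleu1(reference: str, hypothesis: str) -> float:
    """Sentence-level BLEU-1 with a single reference (illustrative only)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not hyp:
        return 0.0
    ref_counts = Counter(ref)
    hyp_counts = Counter(hyp)
    # Clip each hypothesis word count by its count in the reference,
    # so repeating a correct word does not inflate the score.
    clipped = sum(min(count, ref_counts[word]) for word, count in hyp_counts.items())
    precision = clipped / len(hyp)
    # Brevity penalty: hypotheses shorter than the reference are penalized.
    if len(hyp) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(hyp))
    return brevity_penalty * precision


# Hypothetical sentences, not taken from ZuCo.
score = bleu1("the movie was a great success", "the film was a great hit")
print(round(score, 3))  # 0.667 (4 of 6 hypothesis words appear in the reference)
```

Because BLEU-1 counts only unigram overlap, a high score shows that the right words were recovered but says little about word order or fluency.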

BELT-2 sits at the intersection of biosignal modeling and language modeling. Rather than training a bespoke sequence-to-sequence network end to end, it bootstraps from the linguistic priors already captured by pretrained LLMs and concentrates its learning budget on bridging the gap between brain activity and language.

Key Features

BPE-level EEG-language alignment: BELT-2 aligns EEG representations to byte-pair-encoding (BPE) subword tokens using contrastive learning, giving a finer-grained and more transferable mapping than earlier word- or sentence-level alignment approaches.
Q-Conformer encoder: A querying Conformer (Q-Conformer) EEG encoder combines convolutional and self-attention blocks with a set of learnable query tokens, producing a fixed-size, information-dense summary of variable-length EEG sequences.
Prefix-tuning bridge to frozen LLMs: The encoder output is converted into soft prefix embeddings that condition a frozen large language model, so the heavy linguistic model stays fixed while only the lightweight bridge is trained.
Multi-task brain decoding: A single model handles several tasks, including EEG-to-text translation, EEG-based sentiment classification, and EEG sentiment-to-text, sharing representations across objectives.
Coherent sentence decoding: BELT-2 is the first reported system to produce coherent, readable sentences from non-invasive EEG rather than disjointed token fragments.
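The learnable-query idea in the Q-Conformer feature above can be illustrated with a small cross-attention sketch: a fixed set of query vectors attends over however many EEG time steps are available and always returns the same number of output vectors. This is a simplified, single-head version with no learned projections, written in plain Python. The names `query_pool` and `fake_eeg`, the dimensions, and the random inputs are all assumptions for illustration, not the published architecture.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def query_pool(queries, features):
    """Each query attends over all time steps and returns one pooled vector."""
    dim = len(queries[0])
    pooled = []
    for query in queries:
        # Scaled dot-product score between this query and every time step.
        scores = [
            sum(q * f for q, f in zip(query, step)) / math.sqrt(dim)
            for step in features
        ]
        weights = softmax(scores)
        # Attention-weighted average of the time-step features.
        pooled.append(
            [sum(w * step[d] for w, step in zip(weights, features)) for d in range(dim)]
        )
    return pooled


rng = random.Random(0)
DIM, NUM_QUERIES = 8, 4  # hypothetical sizes

# In a trained model the queries are learned parameters; here they are random.
queries = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_QUERIES)]


def fake_eeg(steps):
    """Stand-in for encoded EEG features of shape (steps, DIM)."""
    return [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(steps)]


short = query_pool(queries, fake_eeg(50))
long_pool = query_pool(queries, fake_eeg(300))
print(len(short), len(short[0]))          # 4 8
print(len(long_pool), len(long_pool[0]))  # 4 8
```

Both inputs yield four vectors of width eight despite their different lengths, which is the property that lets a fixed-length prefix be handed to a language model.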

Technical Details

BELT-2 is built around the Q-Conformer encoder, which stacks Conformer blocks (interleaving multi-head self-attention with convolutional modules) and uses learnable query embeddings to distill EEG sequences into a compact set of vectors. Training proceeds in stages. A contrastive objective first aligns EEG features with the BPE token embeddings of the target text; prefix-tuning then maps the encoder output into the input space of a pretrained language model whose weights stay frozen. Multi-task supervision spans translation, sentiment classification, and conditioned generation. Evaluated on the ZuCo eye-tracking-and-EEG reading corpus, BELT-2 attains a BLEU-1 of 52.2% and reports translation gains of roughly 31% to 162% over previous EEG-to-text methods, depending on the metric.
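The contrastive alignment stage can be sketched with a symmetric InfoNCE loss, a common choice for pulling paired embeddings together and pushing unpaired ones apart. The text above does not specify the exact loss, so the symmetric form, the temperature of 0.1, the function name `info_nce`, and the toy three-dimensional embeddings are all assumptions for illustration.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (a zero vector is left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def info_nce(eeg, tokens, temperature=0.1):
    """Symmetric InfoNCE: row i of `eeg` is the positive pair of row i of `tokens`."""
    eeg = [normalize(v) for v in eeg]
    tokens = [normalize(v) for v in tokens]
    # Cosine similarity between every EEG vector and every token vector.
    logits = [
        [sum(a * b for a, b in zip(e, t)) / temperature for t in tokens]
        for e in eeg
    ]

    def cross_entropy(rows):
        # The target class for row i is column i (the matching pair).
        total = 0.0
        for i, row in enumerate(rows):
            peak = max(row)
            log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in row))
            total += log_sum_exp - row[i]
        return total / len(rows)

    columns = [list(col) for col in zip(*logits)]
    # Average the EEG-to-token and token-to-EEG directions.
    return 0.5 * (cross_entropy(logits) + cross_entropy(columns))


# Toy stand-ins for BPE token embeddings (hypothetical values).
token_emb = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

aligned = info_nce(token_emb, token_emb)  # EEG features already match their tokens
mismatched = info_nce(token_emb, token_emb[1:] + token_emb[:1])  # pairs shifted by one
print(aligned < mismatched)  # True
```

Minimizing a loss of this kind drives each EEG feature toward its own token embedding and away from the other tokens in the batch, which is what lets a frozen language model treat the aligned features as if they were token inputs.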

Applications

BELT-2 targets brain-computer interface (BCI) research and assistive communication, where decoding intended language from non-invasive recordings could one day help people who cannot speak or type. Because it relies on scalp EEG rather than implanted electrodes, it is relevant to lower-risk, more scalable BCI settings than invasive speech neuroprostheses. Its multi-task design also makes it a useful research platform for studying how linguistic structure is represented in EEG and for benchmarking EEG-language alignment methods.

Impact

BELT-2 advanced the EEG-to-language field by showing that aligning brain signals to subword tokens and offloading generation to frozen LLMs can yield far more fluent output than end-to-end decoders, establishing a new performance bar on ZuCo. Its bootstrapping recipe has influenced subsequent work on EEG-language representation alignment and multi-task neural decoding. A practical limitation is reproducibility and openness: at the time of writing, the code was released only through an anonymous review link, and no de-anonymized public repository or pretrained weights could be located. As with most EEG reading-decoding results, performance is also tied to a specific benchmark and reading paradigm, so generalization to imagined speech or new subjects remains an open question.

Citation

Zhou, J., et al. (2024). BELT-2: Bootstrapping EEG-to-Language representation alignment for multi-task brain decoding. Preprint, arXiv.org.

DOI: 10.48550/arXiv.2409.00121

Recent citations

Papers that recently cited this model.

Wearable AI in the Era of Large Sensor Models
Yize Cai, Baoshen Guo, Guobin Shen, et al.
Apr 2026
0
DAMind: Zero-Shot Visual Cross-Domain Alignment and Representation for EEG Decoding
Haodong Jing, Yongqiang Ma, Panqi Yang, et al.
IEEE Transactions on Image Processing · Feb 2026
2
From Neurons to Networks: A Holistic Review of Electroencephalography (EEG) from Neurophysiological Foundations to AI Techniques
Christos Kalogeropoulos, Konstantinos A. Theofilatos, S. Mavroudi
Signals · Feb 2026
0

Top citations

The most-cited papers that cite this model.

Large Language Models for EEG: A Comprehensive Survey and Taxonomy
Naseem Babu, Jimson Mathew, A. Vinod
arXiv.org · Jun 2025
12 · Influential
Pretraining Large Brain Language Model for Active BCI: Silent Speech
Jinzhao Zhou, Zehong Cao, Yiqun Duan, et al.
ACM Multimedia · Apr 2025
9
WaveMind: Towards a Conversational EEG Foundation Model Aligned to Textual and Visual Modalities
Ziyi Zeng, Zhenyang Cai, Yixi Cai, et al.
arXiv.org · Sep 2025
6
Foundation Models for Cross-Domain EEG Analysis Application: A Survey
Hongqi Li, Yitong Chen, Yujuan Wang, et al.
arXiv.org · Aug 2025
5 · Influential
Towards Linguistic Neural Representation Learning and Sentence Retrieval from Electroencephalogram Recordings
Jinzhao Zhou, Yiqun Duan, Ziyi Zhao, et al.
Balkan Conference in Informatics · Aug 2024
5

Citations

Total Citations: 12
Influential: 2
References: 42

Fields of citing research

Computer Science: 100%
Engineering: 42%
Medicine: 33%
Biology: 17%
Linguistics: 8%

Share of papers citing this model.

Openness

bio.rodeo openness: Closed · low usability and reproducibility

Score: 30 · Closed
Usability (can I run it?): 30
Reproducibility (can I retrain it?): 15

Model Openness Framework: Unclassified (missing required components)

