Neuro-GPT

University of Southern California / Université de Montréal

EEG foundation model that pairs a convolutional encoder with a GPT backbone, pretrained by masked-segment reconstruction for low-data BCI decoding.

Released: November 2023

Electroencephalography (EEG) decoding for brain-computer interfaces (BCIs) is chronically limited by data scarcity and heterogeneity: signals vary across subjects, sessions, montages, and devices, while labeled datasets for any given task are typically tiny. Most decoders are therefore trained from scratch on a handful of subjects, which caps accuracy and generalization. Neuro-GPT applies the foundation-model recipe that reshaped language and protein modeling to this problem, pretraining on a large unlabeled clinical EEG corpus and then fine-tuning on a small downstream task.

Introduced by Wenhui Cui, Richard Leahy, and colleagues at the University of Southern California in collaboration with Karim Jerbi's group at the Université de Montréal, Neuro-GPT was first posted to arXiv in November 2023 and presented at the 2024 IEEE International Symposium on Biomedical Imaging (ISBI). The model couples an EEG encoder with a GPT-style autoregressive transformer and is trained with a self-supervised objective that reconstructs masked EEG segments, learning the temporal and spatial structure of brain activity without task labels.

The central claim is practical rather than purely architectural: pretraining on abundant unlabeled EEG and transferring to a data-scarce target task yields a substantial improvement over training the same network from scratch, making it a useful reference design for foundation-model approaches in neurotechnology.

Key Features

Encoder-plus-GPT design: An EEG encoder produces embeddings for fixed-length signal chunks, and a GPT autoregressive transformer models the sequence of chunk embeddings, separating local feature extraction from longer-range temporal modeling.
Masked-segment reconstruction pretraining: The self-supervised objective masks EEG chunks and trains the model to reconstruct them, requiring no labels and exploiting large unlabeled corpora.
Pretrained on the TUH EEG Corpus: Training uses the Temple University Hospital EEG Corpus, one of the largest publicly available clinical EEG datasets, providing broad coverage of recording conditions.
Data-scarcity robustness: Fine-tuned on a motor-imagery dataset of only nine subjects, the pretrained model outperforms an identical architecture trained from scratch, the paper's headline result.
Open code and weights: The implementation is released on GitHub under GPL-3.0 (more than 220 stars) with pretrained weights distributed on Hugging Face, enabling reuse and reproduction.
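
The fixed-length chunking named in the first feature can be illustrated with a minimal NumPy sketch. This is not the released implementation: the function name `split_into_chunks`, the channel count, the sampling rate, the chunk length, and the overlap are all assumptions chosen for illustration.

```python
import numpy as np


def split_into_chunks(eeg, chunk_len, overlap=0):
    """Split a (channels, samples) recording into fixed-length chunks.

    Returns an array of shape (num_chunks, channels, chunk_len). Trailing
    samples that do not fill a whole chunk are dropped.
    """
    if not 0 <= overlap < chunk_len:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_len")
    step = chunk_len - overlap
    n_channels, n_samples = eeg.shape
    starts = range(0, n_samples - chunk_len + 1, step)
    chunks = [eeg[:, s:s + chunk_len] for s in starts]
    if not chunks:
        return np.empty((0, n_channels, chunk_len), dtype=eeg.dtype)
    return np.stack(chunks)


# Hypothetical recording: 22 channels, 10 s at 250 Hz (values are random noise).
rng = np.random.default_rng(0)
recording = rng.standard_normal((22, 2500))

# Assumed settings: 2 s chunks (500 samples) with 0.2 s overlap (50 samples).
chunks = split_into_chunks(recording, chunk_len=500, overlap=50)
print(chunks.shape)  # (num_chunks, channels, chunk_len)
```

Each chunk would then be passed through the encoder to yield one embedding per chunk, giving the transformer a short sequence rather than thousands of raw samples.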

Technical Details

Neuro-GPT processes EEG by splitting each multi-channel recording into fixed-length chunks, embedding them with a convolutional EEG encoder, and feeding the resulting sequence into a GPT-2-style autoregressive transformer. Pretraining masks a subset of chunks and trains the network to reconstruct the masked segments, a self-supervised task analogous to masked modeling in language. Pretraining data comes from the TUH EEG Corpus, a large clinical archive spanning tens of thousands of recordings, which exposes the model to diverse montages and acquisition settings. For the downstream evaluation, the authors fine-tune on the BCI Competition IV 2a four-class motor-imagery dataset, which contains only nine subjects, and compare a pretrained-then-fine-tuned model against the same architecture trained from scratch. The pretrained model classifies motor imagery more accurately than the from-scratch baseline in this low-data regime. The authors also examine how encoder and GPT design choices affect transfer.
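
The masking objective described above can be sketched in a few lines of NumPy. This is an illustrative simplification, not the authors' training code: the helper names `mask_chunks` and `masked_mse`, the mask ratio, the zero mask token, and the choice to compute the loss in embedding space are all assumptions, and an identity function stands in for the transformer.

```python
import numpy as np


def mask_chunks(embeddings, mask_ratio, mask_token, rng):
    """Replace a random subset of chunk embeddings with a mask token.

    embeddings has shape (num_chunks, dim). Returns the masked copy and a
    boolean array marking which chunks were masked.
    """
    num_chunks = embeddings.shape[0]
    num_masked = max(1, int(round(mask_ratio * num_chunks)))
    masked_idx = rng.choice(num_chunks, size=num_masked, replace=False)
    is_masked = np.zeros(num_chunks, dtype=bool)
    is_masked[masked_idx] = True
    masked = embeddings.copy()
    masked[is_masked] = mask_token
    return masked, is_masked


def masked_mse(predicted, target, is_masked):
    """Mean squared error computed over masked positions only."""
    diff = predicted[is_masked] - target[is_masked]
    return float(np.mean(diff ** 2))


rng = np.random.default_rng(0)
embeddings = rng.standard_normal((8, 16))  # 8 hypothetical chunk embeddings
mask_token = np.zeros(16)                  # assumed mask token (all zeros)

masked_input, is_masked = mask_chunks(embeddings, 0.25, mask_token, rng)

# Placeholder "model": returns its input unchanged. A real model would be
# trained to drive this loss toward zero.
predicted = masked_input
loss = masked_mse(predicted, embeddings, is_masked)
print(is_masked.sum(), loss)
```

Restricting the loss to masked positions is the design point: the model earns no credit for copying chunks it can already see, so it has to infer the hidden segment from context.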

Applications

Neuro-GPT targets EEG-based BCI development in settings where labeled data per user is limited, including motor-imagery decoding for assistive control, neurorehabilitation, and human-computer interaction. Because the model is pretrained once on abundant unlabeled clinical EEG and then fine-tuned per task or dataset, it suits researchers who cannot collect large labeled corpora for every new application. The released code and Hugging Face weights make it a practical starting point for transfer-learning experiments and for benchmarking newer EEG foundation models.
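
For benchmarking in the limited-subject setting described above, one common protocol is to hold out each subject in turn so that test data never comes from a subject seen during fine-tuning. The sketch below assumes that protocol for illustration. The function name `leave_one_subject_out` and the subject IDs are hypothetical.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out, train_subjects) pairs, holding out each subject once."""
    subjects = sorted(set(subject_ids))
    for held_out in subjects:
        train = [s for s in subjects if s != held_out]
        yield held_out, train


# Nine hypothetical subject IDs, matching the size of a nine-subject dataset.
subjects = [f"S{i:02d}" for i in range(1, 10)]
folds = list(leave_one_subject_out(subjects))

for held_out, train in folds:
    print(held_out, "held out;", len(train), "subjects for fine-tuning")
```

Running the pretrained and from-scratch models through the same folds keeps the comparison between them like-for-like.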

Impact

As one of the early demonstrations that GPT-style self-supervised pretraining transfers usefully to EEG, Neuro-GPT helped establish the foundation-model paradigm for biosignals and is frequently cited as a baseline in subsequent EEG representation-learning work. Its open code and weights have supported reproduction and extension. The reported gains are demonstrated offline on a single four-class motor-imagery benchmark with nine subjects, so the evidence is narrower than large multi-task evaluations, and absolute accuracy remains modest, as is typical for non-invasive motor-imagery decoding. It is distinct from other EEG foundation models developed by separate groups, and its lasting contribution is the concrete, reproducible encoder-plus-GPT template it provides.

Citations

Neuro-GPT: Towards A Foundation Model For EEG

Preprint

Cui, W., et al. (2023) Neuro-GPT: Towards A Foundation Model For EEG. arXiv preprint arXiv:2311.03764.

DOI: 10.48550/arXiv.2311.03764

Neuro-GPT: Towards A Foundation Model For EEG

Cui, W., et al. (2024) Neuro-GPT: Towards A Foundation Model For EEG. IEEE International Symposium on Biomedical Imaging.

DOI: 10.1109/ISBI56570.2024.10635453

Recent citations

Papers that recently cited this model.

STST-JEPA: Shallow-Target Spatio-Temporal Joint Embedding Prediction Architecture For EEG Self-Supervised Learning
Roy Segal, Yoni Svechinsky, T. Fekete
Jul 2026 · 0 citations
Channel-Oriented Design for EEG-to-Music Reconstruction
Jiaxin Qing, Junwei Lu, Lexin Li
Jun 2026 · 0 citations
EEG-based decoding of swallowing intention using a transformer-enhanced deep learning approach
Sevgi Gökçe Aslan, Bülent Yılmaz
Biomedical Signal Processing and Control · Jun 2026 · 0 citations

Top citations

The most-cited papers that cite this model.

REVE: A Foundation Model for EEG - Adapting to Any Setup with Large-Scale Pretraining on 25,000 Subjects
Yassine El Ouahidi, Jonathan Lys, Philipp Thölke, et al.
arXiv.org · Oct 2025 · 44 citations
EEG-GPT: Exploring Capabilities of Large Language Models for EEG Classification and Interpretation
Jonathan W. Kim, A. Alaa, Danilo Bernardo
arXiv.org · Jan 2024 · 41 citations
CSBrain: A Cross-scale Spatiotemporal Brain Foundation Model for EEG Decoding
Yuchen Zhou, Jiamin Wu, Zichen Ren, et al.
arXiv.org · Jun 2025 · 35 citations
Brain Foundation Models: A survey on advancements in neural signal processing and brain discovery
Xin-qiu Zhou, Chenyu Liu, Zhisheng Chen, et al.
IEEE Signal Processing Magazine · Mar 2025 · 34 citations
A multimodal sleep foundation model for disease prediction
R. Thapa, M. R. Kjaer, Bryan He, et al.
Nature Medicine · Jan 2026 · 24 citations

Citations

Total Citations: 98

Influential: 9

References: 25

GitHub

Stars: 228

Forks: 44

Open Issues: 7

Contributors: 1

Last Push: 2y ago

Language: Python

License: GPL-3.0

HuggingFace

Downloads: 0

Likes: 13

Last Modified: 2y ago

Fields of citing research

Computer Science: 97%
Medicine: 58%
Engineering: 47%
Biology: 21%
Physics: 3%
Psychology: 3%
Education: 1%
Linguistics: 1%

Share of papers citing this model.

Openness

bio.rodeo openness: Open weights, closed recipe

Overall: 46 (Partial)

Usability (can I run it?): 62

Reproducibility (can I retrain it?): 36

Model Openness Framework: Unclassified (restrictive license on core components)

Resources

GitHub Repository Research Paper HuggingFace Model
