bio.rodeo

The authoritative source for evaluating biological foundation models. No hype, just honest analysis.


Neuro-GPT

University of Southern California / Université de Montréal

An EEG foundation model that pairs a convolutional encoder with a GPT-style transformer and is pretrained by masked-segment reconstruction on the TUH EEG Corpus to improve low-data BCI decoding.

Released: November 2023

Electroencephalography (EEG) decoding for brain-computer interfaces (BCIs) is chronically limited by data scarcity and heterogeneity: signals vary across subjects, sessions, montages, and devices, while labeled datasets for any given task are typically tiny. Most decoders are therefore trained from scratch on a handful of subjects, which caps accuracy and generalization. Neuro-GPT applies the foundation-model recipe that reshaped language and protein modeling to this problem, pretraining on a large unlabeled clinical EEG corpus and then fine-tuning on a small downstream task.

Introduced by Wenhui Cui, Richard Leahy, and colleagues at the University of Southern California in collaboration with Karim Jerbi's group at the Université de Montréal, Neuro-GPT was first posted to arXiv in November 2023 and presented at the 2024 IEEE International Symposium on Biomedical Imaging (ISBI). The model couples an EEG encoder with a GPT-style autoregressive transformer and is trained with a self-supervised objective that reconstructs masked EEG segments, learning the temporal and spatial structure of brain activity without task labels.

The central claim is practical rather than purely architectural: pretraining on abundant unlabeled EEG and then transferring to a data-scarce target task substantially improves on training the same network from scratch. That makes Neuro-GPT a useful reference design for foundation-model approaches in neurotechnology.

#Key Features

  • Encoder-plus-GPT design: An EEG encoder produces embeddings for fixed-length signal chunks, and a GPT autoregressive transformer models the sequence of chunk embeddings, separating local feature extraction from longer-range temporal modeling.
  • Masked-segment reconstruction pretraining: The self-supervised objective masks EEG chunks and trains the model to reconstruct them, requiring no labels and exploiting large unlabeled corpora.
  • Pretrained on the TUH EEG Corpus: Training uses the Temple University Hospital EEG Corpus, one of the largest publicly available clinical EEG datasets, providing broad coverage of recording conditions.
  • Data-scarcity robustness: Fine-tuned on a motor-imagery dataset of only nine subjects, the pretrained model outperforms an identical architecture trained from scratch. This is the paper's headline result.
  • Open code and weights: The implementation is released on GitHub under GPL-3.0 (224+ stars) with pretrained weights distributed on Hugging Face, enabling reuse and reproduction.
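
The chunking and masking steps named in the features above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names `split_into_chunks` and `mask_chunks`, the parameters `chunk_len`, `overlap`, and `mask_ratio`, and the choice to zero out masked chunks are all assumptions made for the example.

```python
import random


def split_into_chunks(recording, chunk_len, overlap=0):
    """Split a channels x samples recording into fixed-length chunks.

    recording: list of channels, each a list of floats of equal length.
    Returns a list of chunks, each shaped channels x chunk_len.
    Trailing samples that do not fill a whole chunk are dropped.
    """
    if chunk_len <= 0 or not 0 <= overlap < chunk_len:
        raise ValueError("need chunk_len > 0 and 0 <= overlap < chunk_len")
    n_samples = len(recording[0])
    step = chunk_len - overlap
    chunks = []
    for start in range(0, n_samples - chunk_len + 1, step):
        chunks.append([channel[start:start + chunk_len] for channel in recording])
    return chunks


def mask_chunks(chunks, mask_ratio, rng):
    """Replace a random subset of chunks with zeros (assumed masking scheme).

    Returns the masked sequence and the sorted indices of the masked chunks,
    which a reconstruction objective would use as its targets.
    """
    if not chunks:
        raise ValueError("no chunks to mask")
    n_masked = min(len(chunks), max(1, round(mask_ratio * len(chunks))))
    masked_idx = sorted(rng.sample(range(len(chunks)), n_masked))
    hidden = set(masked_idx)
    masked = [
        [[0.0] * len(channel) for channel in chunk] if i in hidden else chunk
        for i, chunk in enumerate(chunks)
    ]
    return masked, masked_idx


if __name__ == "__main__":
    # Toy two-channel recording with ten samples per channel.
    toy = [[float(i) for i in range(10)], [float(-i) for i in range(10)]]
    toy_chunks = split_into_chunks(toy, chunk_len=4)
    toy_masked, toy_idx = mask_chunks(toy_chunks, mask_ratio=0.5, rng=random.Random(0))
    print(len(toy_chunks), toy_idx)
```

In a real pipeline the encoder would embed each chunk before the transformer sees it; the sketch stops at the data-preparation step because the encoder and GPT internals are not specified here.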

#Technical Details

Neuro-GPT processes EEG by splitting each multi-channel recording into fixed-length chunks, embedding them with a convolutional EEG encoder, and feeding the resulting sequence into a GPT-2-style autoregressive transformer. Pretraining masks a subset of chunks and trains the network to reconstruct the masked segments, a self-supervised task analogous to masked modeling in language. Pretraining data comes from the TUH EEG Corpus, a large clinical archive spanning tens of thousands of recordings, which exposes the model to diverse montages and acquisition settings. For the downstream evaluation, the authors fine-tune on the BCI Competition IV 2a four-class motor-imagery dataset, which contains only nine subjects, and compare a pretrained-then-fine-tuned model against the same architecture trained from scratch. The pretrained foundation model significantly improves motor-imagery classification performance in this low-data regime. The authors also examine how encoder and GPT design choices affect transfer.
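
The reconstruction objective described above can be illustrated with a loss that is evaluated only at the masked chunk positions. The sketch below uses a generic mean squared error in plain Python. The page does not state the exact loss, nor whether reconstruction targets are raw signal or chunk embeddings, so both choices are assumptions, and `masked_reconstruction_loss` is a hypothetical name.

```python
def masked_reconstruction_loss(targets, predictions, masked_idx):
    """Mean squared error computed only over the masked chunk positions.

    targets, predictions: one vector (list of floats) per chunk.
    masked_idx: indices of the chunks that were hidden from the model.
    """
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    if not masked_idx:
        raise ValueError("at least one masked chunk is required")
    squared_error, n_values = 0.0, 0
    for i in masked_idx:
        if len(targets[i]) != len(predictions[i]):
            raise ValueError("vector sizes differ at chunk %d" % i)
        for t, p in zip(targets[i], predictions[i]):
            squared_error += (t - p) ** 2
            n_values += 1
    if n_values == 0:
        raise ValueError("masked chunks contain no values")
    return squared_error / n_values
```

Because unmasked chunks contribute nothing to the loss, the model gains nothing by copying visible input and is pushed to predict the hidden segments from their context.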

#Applications

Neuro-GPT targets EEG-based BCI development in settings where labeled data per user is limited, including motor-imagery decoding for assistive control, neurorehabilitation, and human-computer interaction. Because the model is pretrained once on abundant unlabeled clinical EEG and then fine-tuned per task or dataset, it suits researchers who cannot collect large labeled corpora for every new application. The released code and Hugging Face weights make it a practical starting point for transfer-learning experiments and for benchmarking newer EEG foundation models.

#Impact

As one of the early demonstrations that GPT-style self-supervised pretraining transfers usefully to EEG, Neuro-GPT helped establish the foundation-model paradigm for biosignals and is frequently cited as a baseline in subsequent EEG representation-learning work. Its open code and weights have supported reproduction and extension. The reported gains are demonstrated offline on a single four-class motor-imagery benchmark with nine subjects, so the evidence is narrower than large multi-task evaluations, and absolute accuracy remains modest, as is typical for non-invasive motor-imagery decoding. It is distinct from other EEG foundation models developed by separate groups, and its lasting contribution is the concrete, reproducible encoder-plus-GPT template it provides.

Citations

Neuro-GPT: Towards A Foundation Model For EEG

Preprint

Cui, W., et al. (2023) Neuro-GPT: Towards A Foundation Model For EEG. arXiv preprint arXiv:2311.03764.

DOI: 10.48550/arXiv.2311.03764

Neuro-GPT: Towards A Foundation Model For EEG

Cui, W., et al. (2024) Neuro-GPT: Towards A Foundation Model For EEG. 2024 IEEE International Symposium on Biomedical Imaging (ISBI).

DOI: 10.1109/ISBI56570.2024.10635453

Citations

Total Citations: 93
Influential: 7
References: 25

GitHub

Stars: 224
Forks: 43
Open Issues: 7
Contributors: 1
Last Push: 2y ago
Language: Python
License: GPL-3.0

HuggingFace

Downloads: 0
Likes: 13
Last Modified: 2y ago

Openness

bio.rodeo openness: Open weights (open weights, closed recipe)
46 (Partial)
Usability (can I run it?): 62
Reproducibility (can I retrain it?): 36
Model Openness Framework: Unclassified
Restrictive license on core components

Tags

brain_computer_interface, eeg, foundation_model, gpt, motor_imagery_decoding, self_supervised, transfer_learning, transformer

Resources

GitHub Repository, Research Paper, HuggingFace Model