ELM (EEG-Language Model)

Multimodal contrastive model aligning clinical EEG with free-text reports, enabling zero-shot EEG classification from natural-language prompts.

Released: September 2024

ELM (EEG-Language Model) is a multimodal framework that learns joint representations of electroencephalography (EEG) recordings and their accompanying free-text clinical reports. Developed by Sam Gijsen and Kerstin Ritter at Charité – Universitätsmedizin Berlin and presented at ICML 2025, ELM addresses a persistent bottleneck in clinical neurophysiology: labeled EEG data is scarce and expensive to annotate, yet hospitals accumulate vast archives of recordings paired with the reports neurologists write during routine reading.

Rather than treating those reports as disposable, ELM uses them as a rich, naturally occurring supervisory signal. Borrowing the contrastive vision-language paradigm popularized by CLIP, ELM aligns EEG signals and clinical text in a shared embedding space, so that a recording and its matching report are pulled together while mismatched pairs are pushed apart. This is, to the authors' knowledge, the first work to enable zero-shot EEG classification through natural-language prompts and bidirectional retrieval between neural signals and reports.
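The pull-together, push-apart objective described above can be illustrated with a minimal NumPy sketch of a symmetric, CLIP-style contrastive loss. This is an illustration under assumptions, not the authors' training code: the function names, the temperature of 0.07, and the random embeddings are all hypothetical stand-ins for real encoder outputs.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)

def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def clip_style_loss(eeg_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of matched (EEG, report) pairs.

    Row i of eeg_emb and row i of text_emb are assumed to be a matching pair;
    every other row in the batch acts as a negative. The temperature value is
    an assumption chosen for illustration.
    """
    eeg = l2_normalize(eeg_emb)
    text = l2_normalize(text_emb)
    logits = eeg @ text.T / temperature           # (batch, batch) similarity matrix
    idx = np.arange(logits.shape[0])
    loss_eeg_to_text = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_text_to_eeg = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_eeg_to_text + loss_text_to_eeg)

# Random embeddings standing in for encoder outputs (hypothetical data).
rng = np.random.default_rng(0)
text_emb = rng.normal(size=(8, 16))
aligned = clip_style_loss(text_emb + 0.01 * rng.normal(size=(8, 16)), text_emb)
shuffled = clip_style_loss(np.roll(text_emb, 1, axis=0), text_emb)
print(f"aligned pairs: {aligned:.3f}  mismatched pairs: {shuffled:.3f}")
```

The loss is low when each EEG embedding sits closest to its own report and high when the pairing is shuffled, which is the behaviour the training objective rewards.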

The result is a model that is highly label-efficient: it transfers to downstream clinical phenotyping tasks with far fewer labeled examples than EEG-only baselines, and it can classify recordings for conditions it was never explicitly trained to label, simply by comparing them against textual descriptions.
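Classification by comparison against textual descriptions reduces to a nearest-prompt lookup in the shared space. The sketch below assumes the EEG and prompt embeddings have already been produced by the two encoders; the function name, the labels, and the toy 4-dimensional vectors are hypothetical and chosen only to make the mechanics visible.

```python
import numpy as np

def zero_shot_classify(eeg_emb, prompt_embs, labels):
    """Pick the label whose prompt embedding has the highest cosine similarity."""
    eeg = eeg_emb / np.linalg.norm(eeg_emb)
    prompts = prompt_embs / np.linalg.norm(prompt_embs, axis=1, keepdims=True)
    scores = prompts @ eeg                       # one cosine score per prompt
    return labels[int(np.argmax(scores))], scores

# Toy stand-ins for encoder outputs (hypothetical embeddings).
labels = ["normal", "abnormal"]
prompt_embs = np.array([[1.0, 0.0, 0.0, 0.0],    # embedding of a "normal" prompt
                        [0.0, 1.0, 0.0, 0.0]])   # embedding of an "abnormal" prompt
eeg_emb = np.array([0.2, 0.9, 0.1, 0.0])         # embedding of one recording

predicted, scores = zero_shot_classify(eeg_emb, prompt_embs, labels)
print(predicted, np.round(scores, 3))
```

No classifier weights are trained here: adding a new condition only requires embedding a new prompt.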

Key Features

EEG–text contrastive alignment: A CLIP-style objective aligns EEG and clinical-report embeddings in a shared space, turning routinely written reports into supervision without manual labeling.
Zero-shot classification via text prompts: Recordings can be classified by comparing their embeddings to natural-language descriptions of clinical conditions, with no task-specific labeled training.
Cross-modal retrieval: The shared space supports retrieving the most relevant report for a given EEG and vice versa, useful for archive search and decision support.
Multiple instance learning for misalignment: Attention-based multiple instance learning, combined with timeseries cropping and text segmentation, handles the loose correspondence between EEG segments and report sentences.
Label-efficient transfer: The pretrained encoders deliver strong downstream performance with minimal labeled data, outperforming EEG-only baselines across four clinical evaluation tasks.
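The attention-based multiple instance learning mentioned in the list above can be sketched as a weighted pooling over per-crop embeddings. This follows the general attention-MIL pattern (a small scoring network followed by a softmax over instances); the exact layer used in ELM is not specified in this text, and the parameter shapes, names, and random values below are assumptions.

```python
import numpy as np

def attention_mil_pool(instances, V, w):
    """Pool a bag of instance embeddings into one vector with attention weights.

    instances: (n_instances, dim) embeddings, e.g. one per EEG crop
    V:         (hidden, dim) projection matrix (learned in practice)
    w:         (hidden,) scoring vector (learned in practice)
    """
    scores = np.tanh(instances @ V.T) @ w         # one scalar score per instance
    scores = scores - scores.max()                # stabilise the softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ instances, weights

rng = np.random.default_rng(0)
crops = rng.normal(size=(12, 32))                 # 12 crops, 32-dim embeddings
V = rng.normal(size=(16, 32)) * 0.1               # random stand-in parameters
w = rng.normal(size=16)

bag_embedding, weights = attention_mil_pool(crops, V, w)
print(bag_embedding.shape, weights.round(3))
```

Because the weights are learned end to end, informative crops can receive more weight without anyone labeling individual segments.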

Technical Details

ELM pairs a convolutional EEG encoder (an EEG_ResNet) with a clinical BERT text encoder, trained jointly with a contrastive loss. Pretraining uses roughly 15,000 EEG recordings paired with clinical reports from the Temple University Hospital (TUH) EEG Corpus. Signals are processed as 20-channel longitudinal bipolar (TCP) montages, bandpass-filtered to 0.1–49 Hz and resampled to 100 Hz. To cope with the misalignment between long recordings and multi-sentence reports, ELM combines timeseries cropping, text segmentation, and attention-based multiple instance learning so that clinically informative segments are emphasized without segment-level annotation. The authors release two pretrained encoder checkpoints operating on different epoch lengths (5-second and 60-second windows) as PyTorch .pt files. Evaluated across four clinical phenotyping tasks, ELM substantially outperforms EEG-only baselines, with the largest gains in the low-label regime that characterizes real clinical practice.
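To make the window sizes concrete: at 100 Hz, a 5-second epoch is 500 samples per channel and a 60-second epoch is 6,000. The sketch below splits a 20-channel recording into fixed-length crops. Non-overlapping windows and dropping the incomplete tail are assumptions made for simplicity, and the function name and synthetic data are hypothetical.

```python
import numpy as np

SAMPLING_RATE_HZ = 100   # stated in the text
N_CHANNELS = 20          # stated in the text

def crop_recording(signal, window_seconds, sampling_rate=SAMPLING_RATE_HZ):
    """Split a (channels, samples) array into non-overlapping fixed-length crops.

    Trailing samples that do not fill a whole window are dropped.
    Returns an array of shape (n_crops, channels, window_samples).
    """
    window = int(window_seconds * sampling_rate)
    n_crops = signal.shape[1] // window
    trimmed = signal[:, : n_crops * window]
    return trimmed.reshape(signal.shape[0], n_crops, window).transpose(1, 0, 2)

# 607 seconds of synthetic data standing in for a preprocessed recording.
recording = np.random.default_rng(0).normal(size=(N_CHANNELS, 607 * SAMPLING_RATE_HZ))

short_crops = crop_recording(recording, window_seconds=5)
long_crops = crop_recording(recording, window_seconds=60)
print(short_crops.shape, long_crops.shape)
```

Each crop would then be passed through the EEG encoder, producing the bag of instance embeddings that the multiple instance learning step pools.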

Applications

ELM is aimed at clinical neurophysiology workflows where annotated EEG is limited but report-paired recordings are abundant. Its label-efficient transfer suits building classifiers for abnormality detection, pathology screening, and related phenotyping tasks from small labeled sets, while zero-shot prompting lets clinicians and researchers probe recordings for conditions described in plain language. Cross-modal retrieval can power EEG archive search, surface similar prior cases, and assist report drafting or quality control. Researchers can also use the released encoders as feature extractors for downstream EEG modeling without retraining from scratch.
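Archive search of the kind described above amounts to ranking stored embeddings by similarity to a query. The following sketch assumes report embeddings have been precomputed and stored as a matrix; the function name, archive size, and synthetic vectors are hypothetical.

```python
import numpy as np

def retrieve_top_k(query_emb, candidate_embs, k=3):
    """Return indices and scores of the k candidates most cosine-similar to the query."""
    query = query_emb / np.linalg.norm(query_emb)
    candidates = candidate_embs / np.linalg.norm(candidate_embs, axis=1, keepdims=True)
    similarities = candidates @ query
    order = np.argsort(-similarities)[:k]         # highest similarity first
    return order, similarities[order]

rng = np.random.default_rng(0)
report_embs = rng.normal(size=(100, 64))                  # stand-in archive of report embeddings
eeg_query = report_embs[42] + 0.1 * rng.normal(size=64)   # EEG embedding placed near report 42

indices, sims = retrieve_top_k(eeg_query, report_embs, k=3)
print(indices, sims.round(3))
```

Swapping the roles, with a report embedding as the query and EEG embeddings as candidates, gives retrieval in the other direction.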

Impact

ELM extends the contrastive multimodal pretraining recipe that reshaped medical imaging into the EEG domain, demonstrating that the clinical reports already produced during routine reading are a powerful and underused source of supervision. By enabling what the authors describe as the first zero-shot EEG classification and EEG–report retrieval, it points toward foundation-model approaches that reduce dependence on costly expert labeling in neurophysiology. As a relatively young, single-institution ICML 2025 contribution, it has yet to demonstrate clinical generalization beyond the TUH corpus and across diverse acquisition setups, but the released code and pretrained encoders provide a concrete starting point for the community.

Citation

EEG-Language Pretraining for Highly Label-Efficient Clinical Phenotyping

Preprint

Gijsen, S. & Ritter, K. (2024) EEG-Language Pretraining for Highly Label-Efficient Clinical Phenotyping. International Conference on Machine Learning.

DOI: 10.48550/arXiv.2409.07480

Recent citations

Papers that recently cited this model.

Flow Matching with In-Context Priors for Out-of-Distribution Brain Dynamics
Sam Gijsen, M. Lukomski, Marc-Andre Schulz, et al.
Jun 2026 · 0 citations

KAST-BAR: Knowledge-Anchored Semantically-Dynamic Topology Brain Autoregressive Modeling for Universal Neural Interpretation
Haoning Wang, Wenchao Yang, Shuai Shen, et al.
May 2026 · 0 citations · Influential

NeuroNarrator: A Generalist EEG-to-Text Foundation Model for Clinical Interpretation via Spectro-Spatial Grounding and Temporal State-Space Reasoning
Guoan Wang, Shihao Yang, Jun-En Ding, et al.
bioRxiv · Feb 2026 · 0 citations

Top citations

The most-cited papers that cite this model.

Large Language Models for EEG: A Comprehensive Survey and Taxonomy
Naseem Babu, Jimson Mathew, A. Vinod
arXiv.org · Jun 2025 · 12 citations · Influential

Brain4FMs: A Benchmark of Foundation Models for Electrical Brain Signal
Fanqi Shen, En-Hui Yang, Jiahe Li, et al.
arXiv.org · Feb 2026 · 4 citations

Neural Signals Generate Clinical Notes in the Wild
Jathurshan Pradeepkumar, Zheng Chen, Jimeng Sun
arXiv.org · Jan 2026 · 2 citations

Bridging Brain with Foundation Models through Self-Supervised Learning
Hamdi Altaheri, Fakhri Karray, Md. Milon Islam, et al.
arXiv.org · Jun 2025 · 1 citation

Bridging neuroscience and AI: a survey on large language models for neurological signal interpretation
Sreejith Chandrasekharan, J. E. Jacob
Frontiers Neuroinformatics · Jun 2025 · 1 citation

Citations

Total Citations: 10

Influential: 2

References: 65

GitHub

Stars: 14

Forks: 2

Open Issues: 1

Contributors: 1

Last Push: 1y ago

Language: Python

Fields of citing research

Computer Science: 100%
Engineering: 50%
Medicine: 50%
Biology: 20%

Share of papers citing this model.

Openness

bio.rodeo openness: Closed · low usability and reproducibility

26 · Closed

Usability (can I run it?): 23

Reproducibility (can I retrain it?): 15

Model Openness Framework

Unclassified

Restrictive license on core components

Resources

GitHub Repository · Research Paper
