bio.rodeo
Biosignals · Language model

ELM (EEG-Language Model)

Charité – Universitätsmedizin Berlin

Multimodal contrastive model aligning clinical EEG recordings with free-text reports, enabling label-efficient and zero-shot EEG phenotyping via text prompts.

Released: September 2024

ELM (EEG-Language Model) is a multimodal framework that learns joint representations of electroencephalography (EEG) recordings and their accompanying free-text clinical reports. Developed by Sam Gijsen and Kerstin Ritter at Charité – Universitätsmedizin Berlin and presented at ICML 2025, ELM addresses a persistent bottleneck in clinical neurophysiology: labeled EEG data is scarce and expensive to annotate, yet hospitals accumulate vast archives of recordings paired with the reports neurologists write during routine reading.

Rather than treating those reports as disposable, ELM uses them as a rich, naturally occurring supervisory signal. Borrowing the contrastive vision-language paradigm popularized by CLIP, ELM aligns EEG signals and clinical text in a shared embedding space, so that a recording and its matching report are pulled together while mismatched pairs are pushed apart. This is, to the authors' knowledge, the first work to enable zero-shot EEG classification through natural-language prompts and bidirectional retrieval between neural signals and reports.
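The contrastive objective described above can be illustrated with a minimal sketch of a CLIP-style symmetric loss. This is not the authors' implementation: the function names, the temperature of 0.07, and the toy two-dimensional embeddings are all assumptions chosen for illustration, and the code uses plain Python lists where a real training loop would use PyTorch tensors.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def log_softmax(row):
    """Numerically stable log-softmax over a list of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]

def clip_style_loss(eeg_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss; row i of each input is assumed to be a matched pair.

    The temperature value is an assumption, not a setting taken from the paper.
    """
    eeg = [l2_normalize(v) for v in eeg_embs]
    txt = [l2_normalize(v) for v in text_embs]
    n = len(eeg)
    # logits[i][j]: cosine similarity of EEG i and report j, scaled by temperature
    logits = [[sum(a * b for a, b in zip(e, t)) / temperature for t in txt] for e in eeg]
    # EEG -> text: each recording should select its own report
    loss_e2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # text -> EEG: the same computation on the transposed similarity matrix
    logits_t = [list(col) for col in zip(*logits)]
    loss_t2e = -sum(log_softmax(logits_t[i])[i] for i in range(n)) / n
    return 0.5 * (loss_e2t + loss_t2e)

# Toy embeddings (hypothetical): correctly paired vs. deliberately mismatched.
aligned = clip_style_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = clip_style_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

With these toy inputs the correctly paired batch yields a much lower loss than the mismatched one, which is the behavior that pulls matching recordings and reports together during training.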

The result is a model that is highly label-efficient: it transfers to downstream clinical phenotyping tasks with far fewer labeled examples than EEG-only baselines, and it can classify recordings for conditions it was never explicitly trained to label, simply by comparing them against textual descriptions.

Key Features

  • EEG–text contrastive alignment: A CLIP-style objective aligns EEG and clinical-report embeddings in a shared space, turning routinely written reports into supervision without manual labeling.
  • Zero-shot classification via text prompts: Recordings can be classified by comparing their embeddings to natural-language descriptions of clinical conditions, with no task-specific labeled training.
  • Cross-modal retrieval: The shared space supports retrieving the most relevant report for a given EEG and vice versa, useful for archive search and decision support.
  • Multiple instance learning for misalignment: Attention-based multiple instance learning, combined with timeseries cropping and text segmentation, handles the loose correspondence between EEG segments and report sentences.
  • Label-efficient transfer: The pretrained encoders deliver strong downstream performance with minimal labeled data, outperforming EEG-only baselines across four clinical evaluation tasks.
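As a rough illustration of zero-shot classification via text prompts, the sketch below compares one EEG embedding against embeddings of candidate prompts and converts the similarities to probabilities. The prompt wording, the three-dimensional vectors, the temperature, and the function names are hypothetical stand-ins; in practice the embeddings would come from the pretrained EEG and text encoders.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def zero_shot_classify(eeg_emb, prompt_embs, temperature=0.07):
    """Return (best_label, probabilities) for one EEG embedding.

    prompt_embs maps a prompt string to its text embedding. The temperature
    is an assumed value, not one taken from the paper.
    """
    labels = list(prompt_embs)
    logits = [cosine(eeg_emb, prompt_embs[k]) / temperature for k in labels]
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # stable softmax
    total = sum(exps)
    probs = {k: e / total for k, e in zip(labels, exps)}
    return max(probs, key=probs.get), probs

# Hypothetical prompt embeddings standing in for text-encoder outputs.
prompts = {
    "normal EEG": [0.9, 0.1, 0.0],
    "abnormal EEG": [0.1, 0.9, 0.2],
}
label, probs = zero_shot_classify([0.2, 0.8, 0.1], prompts)
```

No labeled training examples are involved here: adding a new condition only requires adding another prompt to the dictionary.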

Technical Details

ELM pairs a convolutional EEG encoder (an EEG_ResNet) with a clinical BERT text encoder, trained jointly with a contrastive loss. Pretraining uses roughly 15,000 EEG recordings paired with clinical reports from the Temple University Hospital (TUH) EEG Corpus. Signals are processed as 20-channel longitudinal bipolar (TCP) montages, bandpass-filtered to 0.1–49 Hz and resampled to 100 Hz. To cope with the misalignment between long recordings and multi-sentence reports, ELM combines timeseries cropping, text segmentation, and attention-based multiple instance learning so that clinically informative segments are emphasized without segment-level annotation. The authors release two pretrained encoder checkpoints operating on different epoch lengths (5-second and 60-second windows) as PyTorch .pt files. Evaluated across four clinical phenotyping tasks, ELM substantially outperforms EEG-only baselines, with the largest gains in the low-label regime that characterizes real clinical practice.
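To make the cropping and multiple instance learning steps more concrete, here is a minimal sketch under stated assumptions. It crops a single-channel signal into 5-second windows at 100 Hz (both figures come from the text above), embeds each window with a placeholder function, and pools the window embeddings with softmax attention weights. The placeholder encoder, the linear attention scorer, and the fixed attention vector are illustrative assumptions; the actual model uses a learned EEG encoder and learned attention parameters.

```python
import math

SAMPLING_RATE_HZ = 100  # from the text: signals are resampled to 100 Hz
WINDOW_SECONDS = 5      # one of the two released epoch lengths

def crop_windows(signal, window_len):
    """Split a list of samples into non-overlapping windows, dropping any remainder."""
    last_start = len(signal) - window_len
    return [signal[i:i + window_len] for i in range(0, last_start + 1, window_len)]

def placeholder_encoder(window):
    """Stand-in for the EEG encoder: returns [mean, mean absolute amplitude]."""
    n = len(window)
    return [sum(window) / n, sum(abs(x) for x in window) / n]

def attention_pool(instance_embs, attn_vector):
    """Attention-based MIL pooling: softmax-weighted average of instance embeddings.

    Scores use a simple dot product with attn_vector (a simplification; learned
    attention modules typically add a nonlinearity and trainable weights).
    """
    scores = [sum(w * x for w, x in zip(attn_vector, emb)) for emb in instance_embs]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(instance_embs[0])
    pooled = [sum(w * emb[d] for w, emb in zip(weights, instance_embs)) for d in range(dim)]
    return pooled, weights

# 12 seconds of a synthetic 10 Hz sine wave as a toy single-channel recording.
signal = [math.sin(2 * math.pi * 10 * t / SAMPLING_RATE_HZ) for t in range(1200)]
windows = crop_windows(signal, SAMPLING_RATE_HZ * WINDOW_SECONDS)
embs = [placeholder_encoder(w) for w in windows]
pooled, weights = attention_pool(embs, attn_vector=[0.0, 1.0])
```

The attention weights let windows with higher scores contribute more to the recording-level embedding, which is how informative segments can be emphasized without segment-level labels.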

Applications

ELM is aimed at clinical neurophysiology workflows where annotated EEG is limited but report-paired recordings are abundant. Its label-efficient transfer suits building classifiers for abnormality detection, pathology screening, and related phenotyping tasks from small labeled sets, while zero-shot prompting lets clinicians and researchers probe recordings for conditions described in plain language. Cross-modal retrieval can power EEG archive search, surface similar prior cases, and assist report drafting or quality control. Researchers can use the released encoders as a feature extractor for downstream EEG modeling without retraining from scratch.
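The archive-search use case amounts to nearest-neighbor ranking in the shared embedding space. The following sketch ranks stored report embeddings by cosine similarity to a query EEG embedding; the report identifiers, vectors, and function names are hypothetical, and a real archive would likely use a vector index rather than a full sort.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_emb, candidates, k=2):
    """Return the ids of the k candidates most similar to the query, best first."""
    ranked = sorted(
        candidates,
        key=lambda cid: cosine(query_emb, candidates[cid]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical report embeddings standing in for text-encoder outputs.
report_embs = {
    "report_a": [1.0, 0.0, 0.0],
    "report_b": [0.0, 1.0, 0.0],
    "report_c": [0.7, 0.7, 0.0],
}
top = retrieve([0.9, 0.2, 0.0], report_embs, k=2)
```

Swapping the roles of the two modalities (a text query against stored EEG embeddings) gives retrieval in the other direction with the same function.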

Impact

ELM extends into the EEG domain the contrastive multimodal pretraining recipe that reshaped medical imaging, demonstrating that the clinical reports already produced during routine reading are a powerful and underused source of supervision. By enabling what the authors describe as the first zero-shot EEG classification and EEG–report retrieval, it points toward foundation-model approaches that reduce dependence on costly expert labeling in neurophysiology. Because it is a recent, single-institution contribution (ICML 2025) pretrained on a single corpus, its clinical generalization beyond TUH and across diverse acquisition setups remains to be established, but the released code and pretrained encoders provide a concrete starting point for the community.

Citation

EEG-Language Pretraining for Highly Label-Efficient Clinical Phenotyping

Preprint

Gijsen, S. & Ritter, K. (2024). EEG-Language Pretraining for Highly Label-Efficient Clinical Phenotyping. arXiv preprint arXiv:2409.07480. Presented at the International Conference on Machine Learning (ICML), 2025.

DOI: 10.48550/arXiv.2409.07480

Citations

Total Citations: 9
Influential: 2
References: 65

GitHub

Stars: 14
Forks: 2
Open Issues: 1
Contributors: 1
Last Push: 11 months ago
Language: Python

Openness

bio.rodeo openness: Closed (low usability and reproducibility)
Score: 26 (Closed)
Usability (can I run it?): 23
Reproducibility (can I retrain it?): 15
Model Openness Framework: Unclassified (restrictive license on core components)

Tags

clinical_phenotyping, contrastive_learning, cross_modal_retrieval, eeg, multimodal, resnet, self_supervised, zero_shot_classification

Resources

GitHub Repository · Research Paper