bio.rodeo

The authoritative source for evaluating biological foundation models. No hype, just honest analysis.

© 2026 Pulsatance. All rights reserved.
Built by Pulsatance
Biosignals

CLEF

Nokia Bell Labs

A clinically-guided contrastive foundation model for single-lead ECG analysis, pretrained on 12-lead recordings from 161K patients using SCORE2 risk to weight pairs.

Released: December 2025
Parameters: 296 million (Large variant)

CLEF (Clinically-Guided Contrastive Learning for Electrocardiogram Foundation Models) is a self-supervised foundation model for single-lead ECG analysis. It was developed by researchers at Nokia Bell Labs together with collaborators at University College London, the University of Glasgow, and Tampere University's Heart Hospital, and released as a preprint in December 2025. It is designed for remote health monitoring and wearable devices, where typically only a single ECG lead is available rather than the 12-lead recordings used in clinical settings.

The central problem CLEF addresses is that conventional contrastive self-supervised learning treats every pair of distinct patients as equally dissimilar, ignoring the fact that two people can be physiologically very close or very far apart. CLEF instead uses an established clinical risk score to adaptively weight negative pairs, so that the geometry of the learned embedding space reflects clinically meaningful differences between subjects. This injects domain knowledge into pretraining without requiring per-sample disease labels.
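
One way to picture risk-weighted negatives is an InfoNCE-style loss in which each negative's term in the denominator is scaled by the gap between the two patients' risk scores. The sketch below is an illustrative assumption, not CLEF's published objective: the function name `risk_weighted_info_nce`, the weighting rule `1 + alpha * |risk_i - risk_j|`, and the parameters `alpha` and `temperature` are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def risk_weighted_info_nce(anchors, positives, risks, alpha=1.0, temperature=0.1):
    """Mean InfoNCE-style loss with risk-weighted negatives.

    anchors[i] and positives[i] are embeddings of two views of patient i;
    risks[i] is that patient's risk score scaled to [0, 1].

    Hypothetical weighting (an assumption, not the paper's formula):
        w_ij = 1 + alpha * |risks[i] - risks[j]|
    so negatives from clinically distant patients are pushed away harder.
    """
    n = len(anchors)
    total = 0.0
    for i in range(n):
        pos = math.exp(cosine(anchors[i], positives[i]) / temperature)
        denom = pos
        for j in range(n):
            if j == i:
                continue  # the positive pair is already in the denominator
            weight = 1.0 + alpha * abs(risks[i] - risks[j])
            denom += weight * math.exp(cosine(anchors[i], positives[j]) / temperature)
        total += -math.log(pos / denom)
    return total / n
```

With `alpha=0` every weight is 1 and the function reduces to the ordinary InfoNCE loss, which treats all other patients as equally dissimilar.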

By pretraining on full 12-lead data but randomly sampling a single lead per example, CLEF produces one flexible encoder that can be applied to any lead at inference time. This sidesteps the need to train a separate model for each lead or device, supporting cross-device generalization for the diverse single-lead sensors found in consumer and clinical wearables.
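
The per-example lead sampling described above can be sketched as follows. This is a minimal illustration that assumes uniform sampling and a conventional lead ordering; the card only says a lead is chosen randomly, and `LEAD_NAMES` and `sample_single_lead` are hypothetical names rather than identifiers from the CLEF codebase.

```python
import random

# The standard 12 ECG leads; the ordering within a recording is an assumption.
LEAD_NAMES = ["I", "II", "III", "aVR", "aVL", "aVF",
              "V1", "V2", "V3", "V4", "V5", "V6"]


def sample_single_lead(recording, rng=random):
    """Pick one lead uniformly at random from a 12-lead recording.

    recording: list of 12 sample sequences, one per lead, in LEAD_NAMES order.
    rng: any object with a randrange method (the random module by default).
    Returns a (lead_name, samples) tuple.
    """
    if len(recording) != len(LEAD_NAMES):
        raise ValueError("expected one sequence per lead (12 in total)")
    idx = rng.randrange(len(LEAD_NAMES))
    return LEAD_NAMES[idx], recording[idx]
```

Because the encoder only ever sees one lead at a time during pretraining, the same weights can be applied to whichever lead a given device records.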

#Key Features

  • Clinically-guided contrastive objective: CLEF uses the SCORE2 10-year cardiovascular risk score (derived from age, sex, smoking, blood pressure, diabetes, and cholesterol) to weight negative pairs, aligning embedding similarity with clinical risk distance rather than treating all subjects as equally distinct.
  • Single-model, any-lead deployment: Pretraining randomly selects one of the 12 leads per sample, yielding a single encoder usable on any lead and bridging the gap between 12-lead training data and single-lead wearable deployment.
  • Missing-metadata handling: An explicit mechanism allows the model to exploit risk-score supervision even when some patient metadata fields are absent, a common situation in real-world clinical records.
  • Three model scales: Released in Small (448K), Medium (30.7M), and Large (296M) parameter variants, letting users trade accuracy for the compute and memory budgets of edge or wearable hardware.
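
As a rough guide to the size trade-off, the sketch below estimates weight storage for each variant and picks the largest one that fits a memory budget. It uses the rounded parameter counts listed above and assumes 4 bytes per parameter (float32); it counts weights only, not activations or runtime overhead, and the dictionary keys and function names are hypothetical.

```python
# Rounded parameter counts taken from the list above.
VARIANT_PARAMS = {
    "small": 448_000,
    "medium": 30_700_000,
    "large": 296_000_000,
}


def weight_memory_mb(num_params, bytes_per_param=4):
    """Approximate weight storage in megabytes (float32 by default)."""
    return num_params * bytes_per_param / 1e6


def largest_variant_within(budget_mb, bytes_per_param=4):
    """Return the largest variant whose weights fit the budget, or None."""
    fitting = [
        (params, name)
        for name, params in VARIANT_PARAMS.items()
        if weight_memory_mb(params, bytes_per_param) <= budget_mb
    ]
    return max(fitting)[1] if fitting else None
```

Under these assumptions the Small variant needs under 2 MB for its weights, the Medium variant about 123 MB, and the Large variant about 1.2 GB.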

#Technical Details

CLEF is built on a ResNeXt1D (one-dimensional ResNeXt) convolutional encoder, offered in three depth/width configurations spanning 448K to 296M parameters. It is pretrained on MIMIC-IV-ECG, which comprises 12-lead recordings from roughly 161,000 patients; SCORE2 risk scores computed from the associated metadata modulate the contrastive loss. Across an evaluation suite of 18 downstream clinical tasks drawn from 7 held-out datasets, CLEF reports at least a 2.6% improvement in average AUROC on classification tasks and at least a 3.2% reduction in mean absolute error on regression tasks relative to self-supervised foundation-model baselines. The Medium variant is noted as a strong default. Code is released under a BSD-3-Clause-Clear license, and pretrained checkpoints for all three scales are distributed via Zenodo.
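
For readers unfamiliar with the two headline metrics, the sketch below computes AUROC and mean absolute error from scratch and averages a per-task metric across tasks. It is a generic illustration, not the CLEF evaluation code, and the function names are hypothetical. The card does not state whether the reported percentage gains are absolute or relative, so the sketch stops at the underlying metrics.

```python
def auroc(labels, scores):
    """AUROC as the probability that a positive outscores a negative.

    labels: 0/1 ground-truth labels; scores: predicted scores.
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_absolute_error(targets, predictions):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(targets, predictions)) / len(targets)


def mean_over_tasks(values):
    """Unweighted average of a per-task metric, e.g. AUROC per task."""
    return sum(values) / len(values)
```

The pairwise AUROC loop is quadratic in the number of examples, which is fine for illustration but slow for large test sets; rank-based implementations are the usual choice in practice.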

#Applications

CLEF targets remote and wearable cardiac monitoring, where single-lead ECG from smartwatches, patches, and handheld devices is the norm. Its embeddings can be fine-tuned or probed for a range of downstream tasks including arrhythmia and disease classification and continuous physiological regression, making it useful for researchers building cardiovascular screening tools, clinicians exploring scalable triage from consumer devices, and engineers developing on-device health features. The availability of compact variants makes it practical for deployment on resource-constrained edge hardware.

#Impact

CLEF demonstrates that embedding clinical risk knowledge directly into the contrastive pretraining objective can outperform purely label-agnostic self-supervised approaches for ECG, offering a template for incorporating structured medical priors into biosignal foundation models. By openly releasing code and multi-scale weights and positioning itself for single-lead, any-device use, it lowers the barrier to building robust ECG analysis for remote health monitoring. As a recent preprint, its benchmark claims await peer review and broader independent validation. Its reliance on SCORE2 also ties the clinical guidance to cardiovascular risk specifically, rather than to conditions in general.

Citation

CLEF: Clinically-Guided Contrastive Learning for Electrocardiogram Foundation Models

Preprint

Shu, Y., et al. (2025). CLEF: Clinically-Guided Contrastive Learning for Electrocardiogram Foundation Models. arXiv preprint arXiv:2512.02180.

DOI: 10.48550/arXiv.2512.02180

Citations

Total Citations: 3
Influential: 0
References: 80

GitHub

Stars: 47
Forks: 5
Open Issues: 1
Contributors: 3
Last Push: 6mo ago
Language: Python
License: BSD-3-Clause-Clear

Openness

bio.rodeo openness: Fully open · usable and reproducible
62, Partial
Usability (can I run it?): 69
Reproducibility (can I retrain it?): 66
Model Openness Framework: Unclassified
Restrictive license on core components

Tags

cardiovascular_risk_prediction, cnn, contrastive_learning, ecg_analysis, electrocardiogram, foundation_model, representation_learning, self_supervised

Resources

GitHub Repository, Research Paper, Dataset