CLEF

Single-lead ECG foundation model pretrained on 12-lead recordings, weighting contrastive pairs by clinical risk for cardiovascular risk prediction.

Released: December 2025

Parameters: 296 Million

CLEF (Clinically-Guided Contrastive Learning for Electrocardiogram Foundation Models) is a self-supervised foundation model for single-lead ECG analysis. It was developed by researchers at Nokia Bell Labs together with collaborators at University College London, the University of Glasgow, and Tampere University's Heart Hospital, and released as a preprint in December 2025. It targets remote health monitoring and wearable devices, where typically only a single ECG lead is available instead of the 12-lead recordings used in clinical settings.

The central problem CLEF addresses is that conventional contrastive self-supervised learning treats every pair of distinct patients as equally dissimilar, ignoring the fact that two people can be physiologically very close or very far apart. CLEF instead uses an established clinical risk score to adaptively weight negative pairs, so that the geometry of the learned embedding space reflects clinically meaningful differences between subjects. This injects domain knowledge into pretraining without requiring per-sample disease labels.
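To make the idea concrete, here is a minimal sketch of a risk-weighted InfoNCE-style loss in plain Python. The weighting rule (`1 + alpha * |risk_i - risk_j|`), the temperature, and every function and parameter name are illustrative assumptions, not the formulation used in the CLEF paper. The sketch only shows how a per-pair weight on negatives can enter the contrastive denominator.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def risk_weighted_info_nce(anchors, positives, risks, temperature=0.1, alpha=1.0):
    """InfoNCE loss with negatives weighted by risk-score distance.

    anchors[i] and positives[i] are two views of subject i; risks[i] is
    that subject's risk score scaled to [0, 1]. The weight
    1 + alpha * |risk_i - risk_j| is a hypothetical choice: negatives whose
    risk differs more from the anchor's contribute more to the denominator.
    """
    n = len(anchors)
    total = 0.0
    for i in range(n):
        pos_term = math.exp(cosine(anchors[i], positives[i]) / temperature)
        denom = pos_term
        for j in range(n):
            if j == i:
                continue  # skip the positive pair
            weight = 1.0 + alpha * abs(risks[i] - risks[j])
            denom += weight * math.exp(cosine(anchors[i], positives[j]) / temperature)
        total += -math.log(pos_term / denom)
    return total / n
```

With `alpha=0.0`, or when all subjects share the same risk score, every weight is 1 and the function reduces to a standard InfoNCE loss that treats all negatives as equally dissimilar.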

By pretraining on full 12-lead data but randomly sampling a single lead per example, CLEF produces one flexible encoder that can be applied to any lead at inference time. This sidesteps the need to train a separate model for each lead or device, supporting cross-device generalization for the diverse single-lead sensors found in consumer and clinical wearables.
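The per-example lead sampling can be sketched in a few lines of Python. This is an illustration only: uniform sampling, the lead ordering in `LEAD_NAMES`, and the function name `sample_single_lead` are assumptions, not details confirmed by the CLEF code.

```python
import random

# Standard 12-lead names; the ordering within a recording is assumed here.
LEAD_NAMES = ["I", "II", "III", "aVR", "aVL", "aVF",
              "V1", "V2", "V3", "V4", "V5", "V6"]

def sample_single_lead(recording, rng=random):
    """Pick one lead uniformly at random from a 12-lead recording.

    `recording` is a list of 12 equal-length sample lists, one per lead,
    ordered as in LEAD_NAMES. Returns (lead_name, samples).
    """
    if len(recording) != len(LEAD_NAMES):
        raise ValueError("expected 12 leads")
    idx = rng.randrange(len(LEAD_NAMES))
    return LEAD_NAMES[idx], recording[idx]
```

Because a different lead may be drawn each time a recording is visited, the encoder sees every lead over the course of training, which is what allows a single model to be applied to whichever lead a device provides.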

Key Features

Clinically-guided contrastive objective: CLEF uses the SCORE2 10-year cardiovascular risk score (derived from age, sex, smoking, blood pressure, diabetes, and cholesterol) to weight negative pairs, aligning embedding similarity with clinical risk distance rather than treating all subjects as equally distinct.
Single-model, any-lead deployment: Pretraining randomly selects one of the 12 leads per sample, yielding a single encoder usable on any lead and bridging the gap between 12-lead training data and single-lead wearable deployment.
Missing-metadata handling: An explicit mechanism allows the model to exploit risk-score supervision even when some patient metadata fields are absent, a common situation in real-world clinical records.
Three model scales: Released in Small (448K), Medium (30.7M), and Large (296M) parameter variants, letting users trade accuracy against the compute and memory budgets of edge or wearable hardware.
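As a rough guide to the memory side of that trade-off, the sketch below estimates the size of the weights alone for each variant. The parameter counts come from the list above; 4-byte float32 storage, the `VARIANTS` table, and both helper names are assumptions for illustration, and activations and runtime overhead are not counted.

```python
# Parameter counts are taken from the text; float32 storage is an assumption.
VARIANTS = {"small": 448_000, "medium": 30_700_000, "large": 296_000_000}

def weight_megabytes(n_params, bytes_per_param=4):
    """Approximate size of the weights alone, in megabytes (10**6 bytes)."""
    return n_params * bytes_per_param / 1e6

def largest_variant_within(budget_mb, bytes_per_param=4):
    """Return the largest variant whose weights fit the budget, or None."""
    fitting = [
        (n_params, name)
        for name, n_params in VARIANTS.items()
        if weight_megabytes(n_params, bytes_per_param) <= budget_mb
    ]
    return max(fitting)[1] if fitting else None
```

Under these assumptions the weights occupy roughly 1.8 MB (Small), 123 MB (Medium), and 1.18 GB (Large), so `largest_variant_within(200)` would select the medium variant.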

Technical Details

CLEF is built on a ResNeXt1D convolutional encoder, offered in three depth/width configurations spanning 448K to 296M parameters. It is pretrained on MIMIC-IV-ECG, which comprises 12-lead recordings from roughly 161,000 patients; SCORE2 risk scores computed from the associated metadata modulate the contrastive loss. Across an evaluation suite of 18 downstream clinical tasks drawn from 7 held-out datasets, the authors report at least a 2.6% improvement in average AUROC on classification tasks and at least a 3.2% reduction in mean absolute error on regression tasks relative to self-supervised foundation-model baselines, and note the medium variant as a strong default. Code is released under a BSD-3-Clause-Clear license, and pretrained checkpoints for all three scales are distributed via Zenodo.
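For readers unfamiliar with the two headline metrics, the following plain-Python sketch shows how AUROC and mean absolute error are computed and how a percentage change against a baseline could be derived. The function names are illustrative, and this entry does not state whether the reported 2.6% and 3.2% figures are relative or absolute differences, so `relative_change` represents only one possible reading.

```python
def auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

def mean_absolute_error(targets, predictions):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(targets, predictions)) / len(targets)

def relative_change(new, baseline):
    """Change of `new` versus `baseline`, as a fraction of the baseline."""
    return (new - baseline) / baseline
```

An average AUROC over a task suite is then the mean of the per-task `auroc` values; higher AUROC and lower MAE are better.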

Applications

CLEF targets remote and wearable cardiac monitoring, where single-lead ECG from smartwatches, patches, and handheld devices is the norm. Its embeddings can be fine-tuned or probed for a range of downstream tasks including arrhythmia and disease classification and continuous physiological regression, making it useful for researchers building cardiovascular screening tools, clinicians exploring scalable triage from consumer devices, and engineers developing on-device health features. The availability of compact variants makes it practical for deployment on resource-constrained edge hardware.

Impact

CLEF demonstrates that embedding clinical risk knowledge directly into the contrastive pretraining objective can outperform purely label-agnostic self-supervised approaches for ECG, offering a template for incorporating structured medical priors into biosignal foundation models. By openly releasing code and multi-scale weights and positioning itself for single-lead, any-device use, it lowers the barrier to building robust ECG analysis for remote health monitoring. As a recent preprint, its benchmark claims await peer review and broader independent validation, and its reliance on SCORE2 ties the clinical guidance specifically to cardiovascular risk rather than arbitrary conditions.

Citation

CLEF: Clinically-Guided Contrastive Learning for Electrocardiogram Foundation Models

Preprint

Shu, Y., et al. (2025) CLEF: Clinically-Guided Contrastive Learning for Electrocardiogram Foundation Models. arXiv.org.

DOI: 10.48550/arXiv.2512.02180

Recent citations

Papers that recently cited this model.

ADAPTOOD: Uncertainty-Aware Fine-Tuning for Out-of-Distribution ECG Time Series Models
Sotirios Vavaroutas, Y. Wu, Ali Etemad, et al.
Jun 2026, 0 citations

CogAdapt: Adapting Clinical ECG Foundation Models for Wearable Cognitive Load Assessment
Amir Mousavi, E. Nourbakhsh, Mohammad Sadegh Sirjani, et al.
May 2026, 0 citations

Compact Latent Manifold Translation: A Parameter-Efficient Foundation Model for Cross-Modal and Cross-Frequency Physiological Signal Synthesis
Bo Cui, Xiaowen Song, Yaowen Zhang, et al.
May 2026, 0 citations

Top citations

The most-cited papers that cite this model.

ADAPTOOD: Uncertainty-Aware Fine-Tuning for Out-of-Distribution ECG Time Series Models
Sotirios Vavaroutas, Y. Wu, Ali Etemad, et al.
Jun 2026, 0 citations

PRISM-CTG: A Foundation Model for Cardiotocography Analysis with Multi-View SSL
Sheng Wong, Ravi Shankar, B. Albert, et al.
Apr 2026, 0 citations

Compact Latent Manifold Translation: A Parameter-Efficient Foundation Model for Cross-Modal and Cross-Frequency Physiological Signal Synthesis
Bo Cui, Xiaowen Song, Yaowen Zhang, et al.
May 2026, 0 citations

CogAdapt: Adapting Clinical ECG Foundation Models for Wearable Cognitive Load Assessment
Amir Mousavi, E. Nourbakhsh, Mohammad Sadegh Sirjani, et al.
May 2026, 0 citations

Citations

Total Citations: 4
Influential: 0
References: 80

GitHub

Stars: 55
Forks: 5
Open Issues: 1
Contributors: 3
Last Push: 7mo ago
Language: Python
License: BSD-3-Clause-Clear

Fields of citing research

Computer Science: 100%
Engineering: 100%
Medicine: 100%

Share of papers citing this model.

Openness

bio.rodeo openness: Fully open · usable and reproducible
Score: 62 (Partial)
Usability (can I run it?): 69
Reproducibility (can I retrain it?): 66

Model Openness Framework: Unclassified (restrictive license on core components)

Resources

GitHub Repository, Research Paper, Dataset


