A clinically-guided contrastive foundation model for single-lead ECG analysis, pretrained on 12-lead recordings from 161K patients using SCORE2 risk to weight pairs.
CLEF (Clinically-Guided Contrastive Learning for Electrocardiogram Foundation Models) is a self-supervised foundation model for single-lead ECG analysis. It was developed by researchers at Nokia Bell Labs together with collaborators at University College London, the University of Glasgow, and Tampere University's Heart Hospital, and released as a preprint in December 2025. It targets remote health monitoring and wearable devices, where typically only a single ECG lead is available instead of the 12-lead recordings used in clinical settings.
The central problem CLEF addresses is that conventional contrastive self-supervised learning treats every pair of distinct patients as equally dissimilar, ignoring the fact that two people can be physiologically very close or very far apart. CLEF instead uses an established clinical risk score to adaptively weight negative pairs, so that the geometry of the learned embedding space reflects clinically meaningful differences between subjects. This injects domain knowledge into pretraining without requiring per-sample disease labels.
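The idea of weighting negative pairs by clinical risk can be illustrated with a minimal sketch. The source does not state the exact weighting function CLEF uses, so the form `1 + alpha * |risk_i - risk_j|`, the function name `risk_weighted_info_nce`, and the parameters `temperature` and `alpha` are all assumptions made for illustration on top of a standard InfoNCE-style loss.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def risk_weighted_info_nce(view_a, view_b, risks, temperature=0.1, alpha=1.0):
    """InfoNCE-style loss with risk-weighted negatives (illustrative only).

    view_a[i] and view_b[i] are two embeddings of the same patient i.
    Each negative pair (i, j) is scaled by 1 + alpha * |risk_i - risk_j|,
    a hypothetical weighting, not necessarily the one used by CLEF.
    """
    a = [normalize(v) for v in view_a]
    b = [normalize(v) for v in view_b]
    n = len(a)
    total = 0.0
    for i in range(n):
        pos = math.exp(dot(a[i], b[i]) / temperature)
        neg = 0.0
        for j in range(n):
            if j == i:
                continue
            # Patients with very different risk count as "harder" negatives.
            weight = 1.0 + alpha * abs(risks[i] - risks[j])
            neg += weight * math.exp(dot(a[i], b[j]) / temperature)
        total += -math.log(pos / (pos + neg))
    return total / n
```

With `alpha=0` every weight is 1 and the function reduces to the usual loss that treats all distinct patients as equally dissimilar. A positive `alpha` pushes apart patients whose risk scores differ more.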
By pretraining on full 12-lead data but randomly sampling a single lead per example, CLEF produces one flexible encoder that can be applied to any lead at inference time. This sidesteps the need to train a separate model for each lead or device, supporting cross-device generalization for the diverse single-lead sensors found in consumer and clinical wearables.
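A minimal sketch of the random lead sampling step follows. It assumes each recording is stored as a list of 12 per-lead sample lists in the conventional lead order; the names `LEAD_NAMES` and `sample_single_lead` are hypothetical, and the actual CLEF data pipeline may differ.

```python
import random

# Conventional 12-lead naming; the ordering within a given dataset is an assumption.
LEAD_NAMES = ["I", "II", "III", "aVR", "aVL", "aVF",
              "V1", "V2", "V3", "V4", "V5", "V6"]


def sample_single_lead(recording, rng):
    """Pick one lead uniformly at random from a 12-lead recording.

    recording: list of 12 leads, each a list of samples.
    rng: a random.Random instance, so sampling is reproducible.
    Returns (lead_name, signal).
    """
    if len(recording) != len(LEAD_NAMES):
        raise ValueError("expected a 12-lead recording")
    idx = rng.randrange(len(recording))
    return LEAD_NAMES[idx], recording[idx]


# Usage: a toy recording with 4 samples per lead.
rng = random.Random(0)
toy_recording = [[float(lead)] * 4 for lead in range(12)]
lead_name, signal = sample_single_lead(toy_recording, rng)
```

Because a different lead can be drawn each time a recording is visited, the encoder sees every lead during pretraining while always receiving single-lead input.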
CLEF is built on a ResNeXt1D encoder, a one-dimensional convolutional network, offered in three depth/width configurations spanning 448K to 296M parameters. It is pretrained on MIMIC-IV-ECG, which comprises 12-lead recordings from roughly 161,000 patients; SCORE2 risk scores computed from the associated metadata modulate the contrastive loss. Across an evaluation suite of 18 downstream clinical tasks drawn from 7 held-out datasets, CLEF reports at least a 2.6% improvement in average AUROC on classification tasks and at least a 3.2% reduction in mean absolute error on regression tasks relative to self-supervised foundation-model baselines, with the medium variant noted as a strong default. Code is released under a BSD-3-Clause-Clear license, and pretrained checkpoints for all three scales are distributed via Zenodo.
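For readers unfamiliar with the two reported metrics, the sketch below shows how AUROC and mean absolute error are commonly computed. It is not the paper's evaluation code; the function names are illustrative, and the AUROC here uses the pairwise (Mann-Whitney) formulation, which is simple but quadratic in the number of samples.

```python
def auroc(labels, scores):
    """Probability that a random positive is scored above a random negative.

    labels: iterable of 0/1 ground-truth labels.
    scores: iterable of model scores, higher meaning "more positive".
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_absolute_error(targets, predictions):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(targets, predictions)) / len(targets)


# Usage on toy data.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))           # 0.75
print(mean_absolute_error([1.0, 2.0, 3.0], [1.5, 2.0, 2.0]))  # 0.5
```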
CLEF targets remote and wearable cardiac monitoring, where single-lead ECG from smartwatches, patches, and handheld devices is the norm. Its embeddings can be fine-tuned or probed for a range of downstream tasks including arrhythmia and disease classification and continuous physiological regression, making it useful for researchers building cardiovascular screening tools, clinicians exploring scalable triage from consumer devices, and engineers developing on-device health features. The availability of compact variants makes it practical for deployment on resource-constrained edge hardware.
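Probing typically means keeping the encoder frozen and fitting only a small classifier on its embeddings. The sketch below illustrates that with a plain logistic-regression probe on toy two-dimensional vectors that stand in for encoder outputs. The function names, learning rate, and epoch count are assumptions for illustration, not part of the CLEF release.

```python
import math


def predict_proba(weights, bias, x):
    """Logistic-regression probability for one embedding."""
    z = sum(w * v for w, v in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train_linear_probe(embeddings, labels, lr=0.5, epochs=200):
    """Fit a linear probe on fixed embeddings with batch gradient descent.

    Only the probe's weights and bias are updated; the embeddings are
    treated as constants, mirroring a frozen encoder.
    """
    dim = len(embeddings[0])
    n = len(embeddings)
    weights = [0.0] * dim
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(embeddings, labels):
            err = predict_proba(weights, bias, x) - y
            for k in range(dim):
                grad_w[k] += err * x[k]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Usage: toy embeddings standing in for frozen encoder outputs.
toy_embeddings = [[0.0, 1.0], [0.2, 0.9], [1.0, 0.1], [0.9, 0.0]]
toy_labels = [0, 0, 1, 1]
probe_w, probe_b = train_linear_probe(toy_embeddings, toy_labels)
```

Fine-tuning differs in that the encoder's own parameters would also be updated, which this sketch deliberately leaves out.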
CLEF demonstrates that embedding clinical risk knowledge directly into the contrastive pretraining objective can outperform purely label-agnostic self-supervised approaches for ECG, offering a template for incorporating structured medical priors into biosignal foundation models. By openly releasing code and multi-scale weights and positioning itself for single-lead, any-device use, it lowers the barrier to building robust ECG analysis for remote health monitoring. As a recent preprint, its benchmark claims await peer review and broader independent validation, and its reliance on SCORE2 ties the clinical guidance specifically to cardiovascular risk rather than arbitrary conditions.
Shu, Y., et al. (2025) CLEF: Clinically-Guided Contrastive Learning for Electrocardiogram Foundation Models. arXiv.org.
DOI: 10.48550/arXiv.2512.02180