bio.rodeo

ECG-JEPA

Zuse Institute Berlin / Freie Universität Berlin

A Vision-Transformer Joint-Embedding Predictive Architecture self-supervised on 1M+ ECG records, improving 12-lead ECG classification on PTB-XL.

Released: October 2024

ECG-JEPA is a self-supervised learning framework for the electrocardiogram (ECG) that adapts the Joint-Embedding Predictive Architecture (JEPA) to cardiac time-series. Introduced by Kuba Weimann and Tim O. F. Conrad at the Zuse Institute Berlin (with Freie Universität Berlin) in an October 2024 preprint, it addresses a persistent bottleneck in computational cardiology: high-quality ECG labels are expensive and scarce, while raw recordings are abundant. By pre-training on more than one million unlabeled records, ECG-JEPA learns transferable representations that boost downstream diagnostic classification.

The central idea behind JEPA is to predict the latent representation of a masked target region from a visible context region, rather than reconstructing the raw signal or relying on hand-crafted augmentations. This places ECG-JEPA between two dominant self-supervised paradigms. Unlike generative masked-autoencoder methods, it predicts abstract features instead of reconstructing every sample, which avoids wasting capacity on noise and signal detail that is irrelevant for diagnosis. Unlike invariance-based contrastive methods, it does not require domain-specific augmentations whose physiological validity for ECG is uncertain.
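The latent-prediction idea can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the linear-plus-tanh `encode` function stands in for the ViT encoders, the sizes, the position codes `pos`, and all function names are hypothetical, and the exponential-moving-average update of the target encoder is an assumption borrowed from the wider JEPA family rather than something stated here. The point to notice is that the loss compares predicted and actual latents of the masked patches and never touches the raw waveform.

```python
import numpy as np

# Hypothetical toy sizes, chosen only for illustration.
N_LEADS, N_SAMPLES, PATCH, DIM = 8, 500, 50, 16


def patchify(ecg):
    """Cut a (leads, samples) recording into rows of shape (n_patches, leads * PATCH)."""
    n_patches = ecg.shape[1] // PATCH
    trimmed = ecg[:, : n_patches * PATCH]
    blocks = trimmed.reshape(ecg.shape[0], n_patches, PATCH)
    return blocks.transpose(1, 0, 2).reshape(n_patches, -1)


def encode(patches, weights):
    """Stand-in encoder: a linear map plus tanh (the real model uses a ViT)."""
    return np.tanh(patches @ weights)


def jepa_loss(ecg, w_context, w_target, w_predictor, pos, target_idx):
    """Mean squared error between predicted and actual latents of masked patches."""
    patches = patchify(ecg)
    context_idx = np.setdiff1d(np.arange(len(patches)), target_idx)
    # The context encoder sees only the visible patches.
    context = encode(patches[context_idx], w_context).mean(axis=0)
    # The target encoder embeds the masked patches; its output is the regression target.
    targets = encode(patches[target_idx], w_target)
    # The predictor gets the pooled context plus a position code for each masked patch.
    predicted = (context + pos[target_idx]) @ w_predictor
    return float(np.mean((predicted - targets) ** 2))


def ema_update(w_target, w_context, momentum=0.99):
    """Assumed target-encoder update: exponential moving average of the context encoder."""
    return momentum * w_target + (1.0 - momentum) * w_context


rng = np.random.default_rng(0)
ecg = rng.normal(size=(N_LEADS, N_SAMPLES))
w_context = rng.normal(size=(N_LEADS * PATCH, DIM)) * 0.05
w_target = w_context.copy()
w_predictor = rng.normal(size=(DIM, DIM)) * 0.1
pos = rng.normal(size=(N_SAMPLES // PATCH, DIM)) * 0.1
target_idx = np.array([3, 4, 5])  # one contiguous masked block

loss = jepa_loss(ecg, w_context, w_target, w_predictor, pos, target_idx)
w_target = ema_update(w_target, w_context)
print(f"latent prediction loss: {loss:.4f}")
```

In a real training loop only the context encoder and predictor would receive gradients from this loss; NumPy has no autograd, so the sketch shows the forward computation only.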

The work demonstrates that a representation-prediction objective, originally developed for images, transfers effectively to multi-lead physiological signals and outperforms both invariance-based and generative alternatives on the PTB-XL benchmark.

#Key Features

  • Augmentation-free self-supervision: JEPA learns by predicting masked latent features from visible context, eliminating the need for hand-designed ECG augmentations whose physiological correctness is hard to guarantee.
  • Latent prediction over reconstruction: The model predicts abstract representations rather than reconstructing the raw waveform, focusing capacity on diagnostically relevant structure instead of high-frequency noise.
  • Large-scale unlabeled pretraining: Training draws on over one million ECG records aggregated from ten public databases, far exceeding any single labeled dataset.
  • Strong fine-tuned and linear performance: The pretrained encoder scores well both when fully fine-tuned and under frozen linear evaluation (AUC of 0.945 and 0.938 respectively for ViT-S on the PTB-XL "all statements" task), indicating that the learned features are broadly useful.
  • Open, reproducible code: The full pretraining and evaluation pipeline is released under the MIT license.
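The difference between frozen linear evaluation and full fine-tuning can be sketched as follows. This is an illustrative toy, not the released evaluation pipeline: `frozen_encoder` is a hypothetical stand-in for a pretrained encoder, the data are synthetic, and a single binary label replaces the multi-label PTB-XL setup. Under linear evaluation only the head parameters `w` and `b` are updated; fine-tuning would additionally update the encoder weights.

```python
import numpy as np


def frozen_encoder(x, w_enc):
    """Hypothetical pretrained encoder; its weights are never updated below."""
    return np.tanh(x @ w_enc)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def bce(probs, y, eps=1e-12):
    """Binary cross-entropy averaged over examples."""
    return float(-np.mean(y * np.log(probs + eps) + (1 - y) * np.log(1 - probs + eps)))


def linear_probe(features, y, lr=0.1, steps=200):
    """Train a logistic-regression head on fixed features with gradient descent."""
    w = np.zeros(features.shape[1])
    b = 0.0
    losses = []
    for _ in range(steps):
        probs = sigmoid(features @ w + b)
        losses.append(bce(probs, y))
        grad = probs - y  # gradient of the loss with respect to the logits
        w -= lr * features.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b, losses


rng = np.random.default_rng(0)
x = rng.normal(size=(256, 40))                    # synthetic inputs
w_enc = rng.normal(size=(40, 16)) / np.sqrt(40)   # "pretrained" weights
w_enc_before = w_enc.copy()

features = frozen_encoder(x, w_enc)               # computed once, then held fixed
y = (features[:, 0] > 0).astype(float)            # synthetic binary label
w, b, losses = linear_probe(features, y)
print(f"probe loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Because the features are computed once and reused, a linear probe is cheap to train, which is why it is a common check on representation quality.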

#Technical Details

ECG-JEPA uses a Vision Transformer (ViT) backbone applied to multi-lead ECG, released in three sizes: ViT-XS, ViT-S, and ViT-B. The pretraining corpus combines ten public databases totaling over one million records, dominated by MIMIC-IV-ECG (~800,000 records) and CODE-15 (~128,000), and including Chapman-Shaoxing, CPSC and CPSC-Extra, Georgia, Ningbo, PTB, St-Petersburg, and the PTB-XL training partition.

The JEPA objective trains context and target encoders so that a predictor maps the context embedding to the latent representation of masked target blocks.

On the PTB-XL "all statements" multi-label benchmark, the ViT-S JEPA model reaches an AUC of 0.945 with fine-tuning and 0.938 under linear evaluation; on the superdiagnostic single-label task it reaches 0.935 (fine-tuned) and 0.928 (linear). Across settings the JEPA pretraining consistently surpasses invariance-based and generative self-supervised baselines.

The public repository provides pretraining and evaluation code but does not currently distribute pretrained checkpoints, so users reproduce the encoders by running the released pretraining scripts.
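For readers unfamiliar with how a single AUC is obtained on a multi-label task, the sketch below computes a per-label AUC and averages across labels. Macro averaging is an assumption here, since the page reports only "AUC"; the function names and the tiny label and score matrices are hypothetical and unrelated to the reported results.

```python
def binary_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        return None  # AUC is undefined when a label has only one class present
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auc(label_matrix, score_matrix):
    """Average the per-label AUC over all labels where it is defined."""
    aucs = []
    for j in range(len(label_matrix[0])):
        auc = binary_auc([row[j] for row in label_matrix],
                         [row[j] for row in score_matrix])
        if auc is not None:
            aucs.append(auc)
    return sum(aucs) / len(aucs)


# Four hypothetical recordings, two hypothetical diagnostic labels.
labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
scores = [[0.9, 0.2], [0.1, 0.8], [0.8, 0.3], [0.3, 0.4]]
print(macro_auc(labels, scores))  # 0.875: label 0 scores 1.0, label 1 scores 0.75
```

The pairwise loop is quadratic in the number of recordings, which is fine for a demonstration; a library routine would be the practical choice on a full test set.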

#Applications

ECG-JEPA targets automated interpretation of 12-lead ECGs, a core task in cardiology screening, triage, and large-scale clinical research. Because the pretrained encoder transfers well even under frozen linear evaluation, it is well suited to settings where labeled cardiac data are limited — smaller hospital cohorts, rare-condition detection, or new diagnostic label sets — letting teams adapt a strong representation with modest supervision. Researchers building ECG diagnostic models, biosignal foundation-model developers, and groups studying self-supervised learning for physiological time-series are the primary beneficiaries.

#Impact

ECG-JEPA contributes evidence that joint-embedding predictive pretraining, rather than masked reconstruction or contrastive invariance, is a strong recipe for physiological signals, extending the JEPA family beyond vision into biosignals. Its demonstration that latent-feature prediction outperforms generative and invariance-based self-supervision on PTB-XL provides a useful design signal for the growing field of ECG and biosignal foundation models. The main practical limitation is that no pretrained weights are released, so adoption currently requires the compute to repeat large-scale pretraining; the open MIT-licensed code nonetheless makes the approach fully reproducible.

Citation

Self-Supervised Pre-Training with Joint-Embedding Predictive Architecture Boosts ECG Classification Performance

Preprint

Weimann, K. & Conrad, T. O. F. (2024) Self-Supervised Pre-Training with Joint-Embedding Predictive Architecture Boosts ECG Classification Performance. Comput. Biol. Medicine.

DOI: 10.48550/arXiv.2410.13867

Citations

Total Citations: 10
Influential: 0
References: 60

GitHub

Stars: 15
Forks: 1
Open Issues: 0
Contributors: 1
Last Push: 7mo ago
Language: Python
License: MIT

Openness

bio.rodeo openness: 62 (Partial)
Usability (can I run it?): 64
Reproducibility (can I retrain it?): 62
Model Openness Framework: Unclassified (missing required components)

Tags

ecg_classification, electrocardiogram, foundation_model, joint_embedding_predictive_architecture, representation_learning, self_supervised, vision_transformer

Resources

GitHub Repository, Research Paper