ECG-JEPA

Zuse Institute Berlin / Freie Universität Berlin

Joint-embedding predictive foundation model pretrained on over a million unlabeled ECGs, learning transferable 12-lead representations for diagnosis.

Released: October 2024

ECG-JEPA is a self-supervised learning framework for the electrocardiogram (ECG) that adapts the Joint-Embedding Predictive Architecture (JEPA) to cardiac time-series. Introduced by Kuba Weimann and Tim O. F. Conrad at the Zuse Institute Berlin (with Freie Universität Berlin) in an October 2024 preprint, it addresses a persistent bottleneck in computational cardiology: high-quality ECG labels are expensive and scarce, while raw recordings are abundant. By pre-training on more than one million unlabeled records, ECG-JEPA learns transferable representations that boost downstream diagnostic classification.

The central idea behind JEPA is to predict the latent representation of a masked target region from a visible context region, rather than reconstructing the raw signal or relying on hand-crafted augmentations. This places ECG-JEPA between two dominant self-supervised paradigms. Unlike generative masked-autoencoder methods, it predicts abstract features instead of reconstructing every sample, which avoids wasting capacity on noise and signal detail that is irrelevant for diagnosis. Unlike invariance-based contrastive methods, it does not require domain-specific augmentations whose physiological validity for ECG is uncertain.
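The context/target split described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the patch count, block length, number of blocks, and the choice of an L1 distance in latent space are all assumptions made for the example, and the function names are hypothetical.

```python
import random

def sample_target_block(num_patches, block_len, rng):
    """Pick one contiguous block of patch indices to mask."""
    start = rng.randrange(0, num_patches - block_len + 1)
    return list(range(start, start + block_len))

def split_context_target(num_patches, block_len, num_blocks, rng):
    """Return (context_idx, target_idx); targets are masked, context is the rest."""
    target = set()
    for _ in range(num_blocks):
        # Blocks may overlap, so the union can be smaller than num_blocks * block_len.
        target.update(sample_target_block(num_patches, block_len, rng))
    context = [i for i in range(num_patches) if i not in target]
    return context, sorted(target)

def latent_l1_loss(predicted, target):
    """Mean absolute error between predicted and target latent vectors.

    The loss is computed on embeddings, never on raw ECG samples.
    L1 is an illustrative choice, not confirmed by the source.
    """
    total, count = 0.0, 0
    for p_vec, t_vec in zip(predicted, target):
        for p, t in zip(p_vec, t_vec):
            total += abs(p - t)
            count += 1
    return total / count

# Hypothetical sizes: 50 patches per recording, two masked blocks of 5 patches.
context_idx, target_idx = split_context_target(
    num_patches=50, block_len=5, num_blocks=2, rng=random.Random(0)
)
```

In a full training loop the context encoder would see only the patches in `context_idx`, and a predictor would be trained so that its outputs match the target encoder's embeddings at `target_idx` under a loss such as `latent_l1_loss`.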

The work demonstrates that a representation-prediction objective, originally developed for images, transfers effectively to multi-lead physiological signals and outperforms both invariance-based and generative alternatives on PTB-XL classification benchmarks.

Key Features

Augmentation-free self-supervision: JEPA learns by predicting masked latent features from visible context, eliminating the need for hand-designed ECG augmentations whose physiological correctness is hard to guarantee.
Latent prediction over reconstruction: The model predicts abstract representations rather than reconstructing the raw waveform, focusing capacity on diagnostically relevant structure instead of high-frequency noise.
Large-scale unlabeled pretraining: Training draws on over one million ECG records aggregated from ten public databases, far exceeding any single labeled dataset.
Strong fine-tuned and linear performance: The pretrained encoder yields high accuracy both when fully fine-tuned and under frozen linear evaluation, indicating that the learned features are broadly useful.
Open, reproducible code: The full pretraining and evaluation pipeline is released under the MIT license.

Technical Details

ECG-JEPA uses a Vision Transformer (ViT) backbone applied to multi-lead ECG, released in three sizes: ViT-XS, ViT-S, and ViT-B. The pretraining corpus combines ten public databases totaling over one million records. It is dominated by MIMIC-IV-ECG (~800,000 records) and CODE-15 (~128,000), and also includes Chapman-Shaoxing, CPSC and CPSC-Extra, Georgia, Ningbo, PTB, St-Petersburg, and the PTB-XL training partition.

The JEPA objective trains context and target encoders so that a predictor maps the context embedding to the latent representation of masked target blocks.

On the PTB-XL "all statements" multi-label benchmark, the ViT-S JEPA model reaches an AUC of 0.945 with fine-tuning and 0.938 under linear evaluation; on the superdiagnostic single-label task it reaches 0.935 (fine-tuned) and 0.928 (linear). Across settings the JEPA pretraining consistently surpasses invariance-based and generative self-supervised baselines.

The public repository provides pretraining and evaluation code but does not currently distribute pretrained checkpoints, so users reproduce the encoders by running the released pretraining scripts.
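For readers unfamiliar with how an AUC is obtained on a multi-label task such as PTB-XL "all statements", the following sketch computes a per-label ROC AUC and averages it across labels. The text above only says "AUC", so macro-averaging is an assumption here, as are the function names. The pairwise loop is quadratic and meant for illustration rather than for large test sets.

```python
def roc_auc(labels, scores):
    """Binary ROC AUC via the pairwise (Mann-Whitney) identity; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined without both classes")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def macro_auc(label_rows, score_rows):
    """Average per-label AUC, skipping label columns that lack both classes."""
    num_labels = len(label_rows[0])
    aucs = []
    for j in range(num_labels):
        col_y = [row[j] for row in label_rows]
        col_s = [row[j] for row in score_rows]
        if 0 < sum(col_y) < len(col_y):
            aucs.append(roc_auc(col_y, col_s))
    return sum(aucs) / len(aucs)

# Toy data: 4 recordings, 2 diagnostic labels (values are made up).
y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
y_score = [[0.9, 0.5], [0.1, 0.8], [0.8, 0.3], [0.3, 0.1]]
print(macro_auc(y_true, y_score))  # 0.875
```

In the toy data the first label is ranked perfectly (AUC 1.0) and the second has one misordered pair out of four (AUC 0.75), giving a macro average of 0.875.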

Applications

ECG-JEPA targets automated interpretation of 12-lead ECGs, a core task in cardiology screening, triage, and large-scale clinical research. Because the pretrained encoder transfers well even under frozen linear evaluation, it is well suited to settings where labeled cardiac data are limited — smaller hospital cohorts, rare-condition detection, or new diagnostic label sets — letting teams adapt a strong representation with modest supervision. Researchers building ECG diagnostic models, biosignal foundation-model developers, and groups studying self-supervised learning for physiological time-series are the primary beneficiaries.
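Frozen linear evaluation, mentioned above, means the encoder's weights are left untouched and only a linear classifier is trained on its output features. The sketch below illustrates the idea with a plain logistic-regression probe. The feature vectors are made-up stand-ins for encoder outputs, the task is reduced to a single binary label for brevity, and the function names and hyperparameters are hypothetical rather than taken from the ECG-JEPA repository.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Probability of the positive class for one frozen feature vector."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_linear_probe(features, labels, lr=0.5, epochs=200):
    """Fit logistic regression on fixed features with batch gradient descent.

    Only `weights` and `bias` are updated; the features themselves
    (the encoder outputs) are never modified.
    """
    dim = len(features[0])
    n = len(features)
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            err = predict(weights, bias, x) - y
            for k in range(dim):
                grad_w[k] += err * x[k]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

# Stand-in embeddings for four recordings (illustrative values only).
frozen_features = [[2.0, 0.1], [1.5, -0.2], [-1.8, 0.3], [-2.2, 0.0]]
binary_labels = [1, 1, 0, 0]
probe_w, probe_b = train_linear_probe(frozen_features, binary_labels)
```

Because the probe has so few parameters, it can be fitted on small labeled cohorts, which is the setting the paragraph above describes.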

Impact

ECG-JEPA contributes evidence that joint-embedding predictive pretraining, rather than masked reconstruction or contrastive invariance, is a strong recipe for physiological signals, extending the JEPA family beyond vision into biosignals. Its demonstration that latent-feature prediction outperforms generative and invariance-based self-supervision on PTB-XL provides a useful design signal for the growing field of ECG and biosignal foundation models. The main practical limitation is that no pretrained weights are released, so adoption currently requires the compute to repeat large-scale pretraining; the open MIT-licensed code nonetheless makes the approach fully reproducible.

Citation

Self-Supervised Pre-Training with Joint-Embedding Predictive Architecture Boosts ECG Classification Performance

Preprint

Weimann, K. & Conrad, T. O. F. (2024) Self-Supervised Pre-Training with Joint-Embedding Predictive Architecture Boosts ECG Classification Performance. Comput. Biol. Medicine.

DOI: 10.48550/arXiv.2410.13867

Recent citations

Papers that recently cited this model.

Joint-Embedding Predictive Architecture for Solar PV Panel Fault Classification
Seyyedhamid Azimidokht, M. Monemi, Abdelhak Kharbouch, et al.
Jul 2026
Citations: 0

Hierarchical Self-Supervised Representation Learning Framework for Multivariate Time Series Grounded in ECG Analysis
Siwon Kim
Jul 2026
Citations: 0 · Influential

A Lightweight Self-Supervised Learning Framework for Multivariate Time Series using Hierarchical-JEPA on ECG Data
Siwon Kim
Jul 2026
Citations: 0 · Influential

Top citations

The most-cited papers that cite this model.

ECG-Soup: Harnessing Multi-Layer Synergy for ECG Foundation Models
Phu X. Nguyen, Huy Phan, Hieu Pham, et al.
Aug 2025
Citations: 2

Enhancing Contrastive Learning-Based Electrocardiogram Pretrained Model with Patient Memory Queue
Xiaoyun Sun, Yang Yang, Xunde Dong
IEEE International Conference on Bioinformatics and Biomedicine · May 2025
Citations: 2

Advanced Self-Supervised Learning for Enhanced Heart Disease Prediction
Nesrine Atitallah, Feras Dalou, B. A. A. Al-Ghamdi
International Conference on Wireless Networks and Mobile Communications · Nov 2025
Citations: 1

CGM-JEPA: Learning Consistent Continuous Glucose Monitor Representations via Predictive Self-Supervised Pretraining
H. Muhammad, Zechen Li, Flora D. Salim, et al.
May 2026
Citations: 1

Joint-Embedding Predictive Architecture for Solar PV Panel Fault Classification
Seyyedhamid Azimidokht, M. Monemi, Abdelhak Kharbouch, et al.
Jul 2026
Citations: 0

Citations

Total citations: 17
Influential: 1
References: 60

GitHub

Stars: 16
Forks: 2
Open issues: 0
Contributors: 1
Last push: 9 months ago
Language: Python
License: MIT

Fields of citing research

Share of papers citing this model.

Computer Science: 100%
Medicine: 88%
Engineering: 38%
Environmental Science: 6%

Openness

bio.rodeo openness: Fully open · usable and reproducible
Score: 62 (Partial)
Usability (can I run it?): 64
Reproducibility (can I retrain it?): 62

Model Openness Framework: Unclassified (missing required components)

