ECGFounder

Peking University / Harvard Medical School / Emory University

Convolutional ECG foundation model trained on expert annotations spanning 150 diagnostic categories, with 12-lead and single-lead wearable variants.

Released: October 2024

Parameters: 76.3 Million

ECGFounder is a foundation model for electrocardiogram (ECG) analysis designed to serve as a general-purpose backbone for cardiovascular diagnosis. While deep learning has produced strong task-specific ECG classifiers, most are trained on small, narrowly labeled datasets and generalize poorly across recording domains, devices, and lead configurations. ECGFounder addresses this by pretraining a single model on a very large, expert-annotated corpus and then transferring it to downstream tasks, including the increasingly common single-lead signals captured by wearable devices.

The model was built on the Harvard-Emory ECG Database (HEEDB), using real-world annotations from cardiology experts spanning 150 diagnostic categories. It was developed by a collaboration led by Shenda Hong's group at Peking University together with clinical researchers at Massachusetts General Hospital / Harvard Medical School and Emory University, and first released as a preprint in October 2024. Pretrained checkpoints for both 12-lead and single-lead variants are publicly available.

By coupling the scale of HEEDB with broad diagnostic label coverage, ECGFounder aims to be a reusable starting point for ECG research, lowering the data and compute burden for groups that cannot assemble million-scale labeled datasets of their own.

Key Features

Large-scale expert supervision: Pretrained on over 10 million ECGs with 150 expert-annotated diagnostic categories, capturing a far broader label space than typical ECG datasets.
12-lead and single-lead variants: A standard 12-lead model plus a single-lead model trained with lead augmentation that simulates axis inversion, enabling deployment on wearable and Holter-style devices.
Transfer-ready backbone: Released checkpoints support fine-tuning on downstream tasks such as arrhythmia detection and demographic or clinical-event prediction, demonstrated on MIMIC-ECG and PTB-XL.
Strong cross-domain generalization: Validated externally on CODE-test, PTB-XL, and PhysioNet, maintaining high AUROC across datasets it was not trained on.
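The lead augmentation mentioned in the feature list is not specified in detail here, so the following is only an illustrative sketch and not the authors' method. It assumes the idealized Einthoven triangle, in which lead I lies at 0 degrees in the frontal plane and the vertical component equals (2/sqrt(3)) times aVF. Under that assumption a limb-lead-like signal along any frontal axis can be synthesized from leads I and aVF, and an axis of 180 degrees yields an inverted lead I. The function names `project_frontal_lead` and `augment_single_lead` and the parameter `invert_prob` are hypothetical.

```python
import math
import random


def project_frontal_lead(lead_i, lead_avf, angle_deg):
    """Synthesize a limb-lead-like signal along a frontal-plane axis.

    Assumes the idealized Einthoven triangle: lead I is the horizontal
    component and (2 / sqrt(3)) * aVF is the vertical component.
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    vertical_scale = 2.0 / math.sqrt(3.0)
    return [cos_t * i + sin_t * vertical_scale * f
            for i, f in zip(lead_i, lead_avf)]


def augment_single_lead(lead_i, lead_avf, rng, invert_prob=0.5):
    """Hypothetical augmentation: return lead I, or its inverted-axis version."""
    angle = 180.0 if rng.random() < invert_prob else 0.0
    return project_frontal_lead(lead_i, lead_avf, angle)


# Toy samples (arbitrary values, not real ECG data).
lead_i = [0.0, 0.1, 1.0, -0.2]
lead_ii = [0.0, 0.2, 1.4, -0.3]
# Standard relation between the limb leads: aVF = II - I / 2.
lead_avf = [ii - 0.5 * i for i, ii in zip(lead_i, lead_ii)]

recovered_ii = project_frontal_lead(lead_i, lead_avf, 60.0)   # lead II axis
inverted_i = project_frontal_lead(lead_i, lead_avf, 180.0)    # flipped lead I
augmented = augment_single_lead(lead_i, lead_avf, random.Random(0))
print(recovered_ii, inverted_i, augmented)
```

Projecting at 60 degrees recovers lead II from the toy inputs, which is a quick sanity check that the geometry is consistent. A real pipeline would also have to decide which axis angles are plausible for a given wearable placement, which this sketch does not address.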

Technical Details

ECGFounder uses a RegNet-based 1D convolutional neural network with stage-wise scaling and bottleneck blocks that combine group convolutions and channel-wise attention. The primary variant has 76.3 million parameters; in ablations against 11.7M, 25.6M, and 110M parameter models, the 76.3M configuration performed best.

Training used 7,519,035 ECGs from 1,319,128 patients, with a held-out set of 834,926 ECGs from 146,570 patients, drawn predominantly from 10-second, 12-lead clinical recordings.

On a committee-reviewed internal test set the model reached an average AUROC of 0.968 (95% CI 0.955 to 0.982), exceeding 0.95 for 80 individual diagnoses. External evaluation produced average AUROCs of 0.981 on CODE-test and 0.924 on PTB-XL, and the single-lead variant scored 0.975 for normal sinus rhythm and 0.957 for atrial fibrillation on PhysioNet data.
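To make the reported metrics concrete, here is a minimal sketch of how an average AUROC over diagnostic labels and a count of labels above a threshold could be computed. It assumes the average is an unweighted (macro) mean of per-diagnosis AUROCs, which this page does not confirm. The helper names `auroc` and `macro_auroc`, the label names, and the toy data are all hypothetical.

```python
def auroc(labels, scores):
    """AUROC via the rank-sum (Mann-Whitney U) formulation, with tie handling.

    labels: list of 0/1 ground-truth values; scores: list of model outputs.
    """
    pairs = sorted(zip(scores, labels))
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the run of tied scores starting at index i.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(lab for _, lab in pairs[i:j])
        i = j
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


def macro_auroc(per_label):
    """Unweighted mean of per-label AUROCs, plus the per-label values."""
    aucs = {name: auroc(y, s) for name, (y, s) in per_label.items()}
    return sum(aucs.values()) / len(aucs), aucs


# Toy predictions for two hypothetical diagnostic labels.
toy = {
    "atrial_fibrillation": ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    "sinus_rhythm": ([1, 0, 1, 0], [0.9, 0.2, 0.7, 0.3]),
}
mean_auc, per_label_auc = macro_auroc(toy)
n_above = sum(1 for v in per_label_auc.values() if v > 0.95)
print(mean_auc, per_label_auc, n_above)
```

A macro average treats rare and common diagnoses equally, so a handful of hard, low-prevalence labels can pull the mean down even when most labels score well. That is one reason the per-diagnosis count above 0.95 is reported alongside the average.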

Applications

ECGFounder targets both research and clinical-adjacent workflows. Cardiology and signal-processing groups can fine-tune it for specific tasks (arrhythmia classification, cardiac event detection, demographic inference, or estimation of clinical variables) without training a large model from scratch. The single-lead variant is aimed at consumer wearables and ambulatory monitors, where only one lead is available, while the 12-lead model suits hospital and clinic settings. Downstream studies have already adapted it for laboratory-value estimation and ICD-code-based disease profiling, illustrating its use as a transferable starting point.

Impact

ECGFounder is one of the first ECG foundation models built at the scale of millions of expert-labeled recordings with broad, externally validated diagnostic coverage. Its public 12-lead and single-lead checkpoints have made it a practical base for subsequent work (for example AnyECG and distillation studies). Its main limitation is supervised pretraining on a single institutional database, which may carry annotation and population biases. There is also a licensing discrepancy worth noting: the released code and weights on GitHub and Hugging Face are under the MIT license, while the preprint states CC BY 4.0. Even so, by demonstrating that large-scale expert supervision yields a robust, transferable ECG backbone, it helps push biosignal analysis toward the reusable foundation-model paradigm already common in protein and language modeling.

Citation

An Electrocardiogram Foundation Model Built on over 10 Million Recordings with External Evaluation across Multiple Domains

Preprint

Li, J., et al. (2024) An Electrocardiogram Foundation Model Built on over 10 Million Recordings with External Evaluation across Multiple Domains. arXiv.org.

DOI: 10.48550/arXiv.2410.04133

Recent citations

Papers that recently cited this model.

LSTrans: Efficient Knowledge Transfer for Lightweight and Automated ECG Classification
Yi Zhao, Jiajun Gao, Chenyang Xu, et al.
Jul 2026
Influential citations: 0

Do ECG Foundation Models Transfer to Rare Cardiac Diseases? Evidence from Brugada Syndrome Detection
B. Zanchi, G. Monachino, A. D. Rossi, et al.
Jul 2026
Influential citations: 0

BCG-FM: A Foundation Model for Ambient Cardiac Health Sensing
M. R. Kjaer, Haejun Han, Ashish Neupane, et al.
Jun 2026
Influential citations: 0

Top citations

The most-cited papers that cite this model.

A Systematic Review on Foundation Models for Electrocardiogram Analysis: Initial Strides and Expansive Horizons
Yu Han, V. Murino, Xiaofeng Liu, et al.
Oct 2024
Citations: 11

OpenECG: Benchmarking ECG Foundation Models with Public 1.2 Million Records
Zhijiang Wan, Qianhao Yu, J. Mao, et al.
arXiv.org · Mar 2025
Citations: 9

A Multi-Scale Deep Learning Framework Combining MobileViT-ECA and LSTM for Accurate ECG Analysis
Abduljabbar S. Ba Mahel, Mehdhar S. A. M. Al-Gaashani, R. Alkanhel, et al.
IEEE Access · 2025
Citations: 9

PPGFlowECG: Latent Rectified Flow with Cross-Modal Encoding for PPG-Guided ECG Generation and Cardiovascular Disease Detection
Xiaocheng Fang, Jiarui Jin, Haoyu Wang, et al.
arXiv.org · Sep 2025
Citations: 8

CLEF: Clinically-Guided Contrastive Learning for Electrocardiogram Foundation Models
Yuxuan Shu, P. Charlton, F. Kawsar, et al.
arXiv.org · Dec 2025
Citations: 3

Citations

Total Citations: 34
Influential: 6
References: 47

GitHub

Stars: 141
Forks: 27
Open Issues: 1
Contributors: 2
Last Push: 5mo ago
Language: Python
License: MIT

HuggingFace

Downloads: 118
Likes: 32
Last Modified: 1y ago

Fields of citing research

Computer Science: 100%
Medicine: 100%
Engineering: 53%
Biology: 3%
Physics: 3%

Share of papers citing this model.

Openness

bio.rodeo openness: Open weights (open weights, closed recipe)
Score: 75 (Open)
Usability (can I run it?): 95
Reproducibility (can I retrain it?): 44

Model Openness Framework: Unclassified (missing required components)

Resources

GitHub Repository, Research Paper, HuggingFace Model
