bio.rodeo

The authoritative source for evaluating biological foundation models. No hype, just honest analysis.

© 2026 Pulsatance. All rights reserved.
ECGFounder

Peking University / Harvard Medical School / Emory University

A 1D convolutional foundation model for electrocardiogram analysis, trained on over 10 million expert-annotated recordings across 150 diagnostic categories.

Released: October 2024
Parameters: 76.3 million

ECGFounder is a foundation model for electrocardiogram (ECG) analysis designed to serve as a general-purpose backbone for cardiovascular diagnosis. While deep learning has produced strong task-specific ECG classifiers, most are trained on small, narrowly labeled datasets and generalize poorly across recording domains, devices, and lead configurations. ECGFounder addresses this by pretraining a single model on a very large, expert-annotated corpus and then transferring it to downstream tasks, including the increasingly common single-lead signals captured by wearable devices.

The model was built on the Harvard-Emory ECG Database (HEEDB), using real-world annotations from cardiology experts spanning 150 diagnostic categories. It was developed by a collaboration led by Shenda Hong's group at Peking University together with clinical researchers at Massachusetts General Hospital / Harvard Medical School and Emory University, and first released as a preprint in October 2024. Pretrained checkpoints for both 12-lead and single-lead variants are publicly available.

By coupling the scale of HEEDB with broad diagnostic label coverage, ECGFounder aims to be a reusable starting point for ECG research, lowering the data and compute burden for groups that cannot assemble million-scale labeled datasets of their own.

#Key Features

  • Large-scale expert supervision: Pretrained on over 10 million ECGs with 150 expert-annotated diagnostic categories, capturing a far broader label space than typical ECG datasets.
  • 12-lead and single-lead variants: A standard 12-lead model plus a single-lead model trained with lead augmentation that simulates axis inversion, enabling deployment on wearable and Holter-style devices.
  • Transfer-ready backbone: Released checkpoints support fine-tuning on downstream tasks such as arrhythmia detection and demographic or clinical-event prediction, demonstrated on MIMIC-ECG and PTB-XL.
  • Strong cross-domain generalization: Validated externally on CODE-test, PTB-XL, and PhysioNet, maintaining high AUROC across datasets it was not trained on.
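The AUROC figures reported for ECGFounder are averages over diagnoses. As a minimal sketch, assuming "average" means an unweighted (macro) mean of per-diagnosis AUROCs, the metric can be computed as below. The function names and the toy data are illustrative and are not taken from the ECGFounder code.

```python
from typing import List, Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive example is scored
    above a random negative one; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_auroc(label_rows: List[List[int]],
                score_rows: List[List[float]]) -> Tuple[float, List[float]]:
    """Unweighted mean of per-diagnosis AUROCs.

    Each row is one recording; each column is one diagnosis (multi-label).
    """
    n_classes = len(label_rows[0])
    per_class = [
        auroc([row[c] for row in label_rows], [row[c] for row in score_rows])
        for c in range(n_classes)
    ]
    return sum(per_class) / n_classes, per_class


# Toy data: 4 recordings, 2 hypothetical diagnoses.
toy_labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
toy_scores = [[0.9, 0.2], [0.3, 0.8], [0.6, 0.4], [0.7, 0.1]]

mean_auroc, per_class_auroc = macro_auroc(toy_labels, toy_scores)
print(per_class_auroc)  # [0.75, 1.0]
print(mean_auroc)       # 0.875
```

The pairwise loop is quadratic in the number of recordings, which is fine for illustration; at the scale of millions of ECGs a rank-based implementation would be the practical choice.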

#Technical Details

ECGFounder uses a RegNet-based 1D convolutional neural network with stage-wise scaling and bottleneck blocks that combine group convolutions with channel-wise attention. The primary variant has 76.3 million parameters; in ablations against 11.7M-, 25.6M-, and 110M-parameter variants, the 76.3M model performed best.

Training used 7,519,035 ECGs from 1,319,128 patients, with a held-out set of 834,926 ECGs from 146,570 patients, drawn predominantly from 10-second, 12-lead clinical recordings.

On a committee-reviewed internal test set the model reached an average AUROC of 0.968 (95% CI 0.955–0.982) and exceeded 0.95 for 80 individual diagnoses. External evaluation produced average AUROCs of 0.981 on CODE-test and 0.924 on PTB-XL, and the single-lead variant scored 0.975 for normal sinus rhythm and 0.957 for atrial fibrillation on PhysioNet data.
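The architecture description above mentions group convolutions and channel-wise attention. The sketch below counts learnable parameters for a dense versus a grouped 1D convolution, plus a squeeze-and-excitation style gate as one assumed form of channel-wise attention. The channel width, kernel size, group count, and reduction ratio are hypothetical values chosen for illustration, not ECGFounder's actual configuration.

```python
def conv1d_params(c_in: int, c_out: int, kernel_size: int,
                  groups: int = 1, bias: bool = False) -> int:
    """Learnable parameters in a 1D convolution layer."""
    if c_in % groups or c_out % groups:
        raise ValueError("groups must divide both channel counts")
    # Each output channel only sees c_in / groups input channels.
    weights = (c_in // groups) * c_out * kernel_size
    return weights + (c_out if bias else 0)


def se_gate_params(channels: int, reduction: int = 4) -> int:
    """Parameters in a squeeze-and-excitation style gate:
    two dense layers (with biases) around a bottleneck."""
    hidden = channels // reduction
    squeeze = channels * hidden + hidden
    excite = hidden * channels + channels
    return squeeze + excite


# Hypothetical stage width and kernel size, for illustration only.
CHANNELS, KERNEL = 256, 3
dense = conv1d_params(CHANNELS, CHANNELS, KERNEL)
grouped = conv1d_params(CHANNELS, CHANNELS, KERNEL, groups=16)
gate = se_gate_params(CHANNELS)

print(f"dense conv:    {dense:,} parameters")                              # 196,608
print(f"grouped conv:  {grouped:,} parameters ({dense // grouped}x fewer)")  # 12,288 (16x fewer)
print(f"SE-style gate: {gate:,} parameters")                               # 33,088
```

With these made-up sizes, grouping cuts the convolution's weights by a factor equal to the number of groups, which is how such blocks free parameter budget for extra width, depth, or attention.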

#Applications

ECGFounder targets both research and clinical-adjacent workflows. Cardiology and signal-processing groups can fine-tune it for specific tasks (arrhythmia classification, cardiac event detection, demographic inference, or estimation of clinical variables) without training a large model from scratch. The single-lead variant is aimed at consumer wearables and ambulatory monitors, where only one lead is available, while the 12-lead model suits hospital and clinic settings. Downstream studies have already adapted it for laboratory-value estimation and ICD-code-based disease profiling, illustrating its use as a transferable starting point.
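Feeding a 12-lead recording to a single-lead model means picking one lead, and the feature list mentions a lead augmentation that simulates axis inversion. The sketch below is one simple reading of that idea, not the authors' implementation: it assumes the conventional lead order (I, II, III, aVR, aVL, aVF, V1 to V6) and models inversion as a random polarity flip. `LEAD_ORDER`, `select_lead`, and `maybe_invert` are hypothetical names.

```python
import random
from typing import List, Optional

# Assumed channel order of a 12-lead recording (conventional, not confirmed
# for any particular dataset).
LEAD_ORDER = ["I", "II", "III", "aVR", "aVL", "aVF",
              "V1", "V2", "V3", "V4", "V5", "V6"]


def select_lead(ecg_12lead: List[List[float]], lead: str = "I") -> List[float]:
    """Return a copy of one lead from a [12][n_samples] recording."""
    return list(ecg_12lead[LEAD_ORDER.index(lead)])


def maybe_invert(signal: List[float], p: float = 0.5,
                 rng: Optional[random.Random] = None) -> List[float]:
    """With probability p, flip the sign of every sample.

    This is an assumed stand-in for "axis inversion", e.g. a wearable
    worn so that the electrode polarity is reversed.
    """
    rng = rng or random.Random()
    if rng.random() < p:
        return [-x for x in signal]
    return list(signal)


# Toy recording: lead k holds the constant value k, three samples per lead.
toy_ecg = [[float(k)] * 3 for k in range(12)]

lead_ii = select_lead(toy_ecg, "II")
print(lead_ii)                       # [1.0, 1.0, 1.0]
print(maybe_invert(lead_ii, p=1.0))  # [-1.0, -1.0, -1.0]
print(maybe_invert(lead_ii, p=0.0))  # [1.0, 1.0, 1.0]
```

Passing a seeded `random.Random` as `rng` keeps the augmentation reproducible across training runs.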

#Impact

ECGFounder is one of the first ECG foundation models built at the scale of millions of expert-labeled recordings with broad, externally validated diagnostic coverage, and its public 12-lead and single-lead checkpoints have made it a practical base for subsequent work (for example AnyECG and distillation studies). It has two main limitations. First, supervised pretraining on a single institutional database may carry annotation and population biases. Second, the licensing is inconsistent: the code and weights released on GitHub and Hugging Face are under the MIT license, while the preprint states CC BY 4.0. Even so, by demonstrating that large-scale expert supervision yields a robust, transferable ECG backbone, it helps push biosignal analysis toward the reusable foundation-model paradigm already common in protein and language modeling.

Citation

Li, J., et al. (2024). An Electrocardiogram Foundation Model Built on over 10 Million Recordings with External Evaluation across Multiple Domains. arXiv preprint.

DOI: 10.48550/arXiv.2410.04133

Citations

Total Citations: 32
Influential: 4
References: 47

GitHub

Stars: 124
Forks: 25
Open Issues: 1
Contributors: 2
Last Push: 4 months ago
Language: Python
License: MIT

HuggingFace

Downloads: 130
Likes: 29
Last Modified: 1 year ago

Openness

bio.rodeo openness: Open weights (open weights, closed recipe)
Score: 75 (Open)
Usability (can I run it?): 95
Reproducibility (can I retrain it?): 44
Model Openness Framework: Unclassified (missing required components)

Tags

arrhythmia_detection, cardiology, cnn, ecg_classification, electrocardiogram, foundation_model, regnet, supervised, transfer_learning
