University of Toronto / Vector Institute
An open transformer foundation model for 12-lead electrocardiograms, pretrained on 1.5M ECGs with hybrid contrastive and generative self-supervision.
ECG-FM is an open foundation model for the electrocardiogram (ECG), the most widely recorded cardiac biosignal in clinical medicine. Most ECG machine learning systems are trained from scratch on a single labeled dataset, which limits their performance when labels are scarce and hurts their ability to generalize across institutions and recording devices. ECG-FM addresses this by pretraining a large transformer on 1.5 million ECGs without labels, producing reusable representations that can be fine-tuned for many downstream clinical tasks with comparatively little task-specific data.
The model was developed by Kaden McKeen, Sameer Masood, Augustin Toma, Barry Rubin,
and Bo Wang at the University of Toronto and the Vector Institute, with the
preprint released in August 2024. It is built on the open-source fairseq_signals
framework and adapts the wav2vec 2.0 self-supervised architecture, originally designed
for speech, to multi-lead ECG waveforms.
ECG-FM is deliberately positioned as a fully open release: the code, the model weights, and the preprocessing pipeline are all public, in contrast to many proprietary ECG deep-learning systems. This makes it a practical starting point for cardiology researchers who want a strong pretrained backbone rather than building a waveform model from the ground up.
ECG-FM uses a wav2vec 2.0 transformer encoder with roughly 90.9 million parameters,
operating on standard 12-lead ECG waveforms. Pretraining draws on about 1.5 million
ECGs aggregated from MIMIC-IV-ECG v1.0 and the PhysioNet/Computing in Cardiology 2021
collection. The self-supervised objective adds contrastive multi-segment coding (CMSC) and
random lead masking (RLM) to the base wav2vec 2.0 masked-feature prediction task,
encouraging representations that are consistent across temporal segments and robust to
missing leads. The model was evaluated by fine-tuning and linear probing on downstream
tasks including ECG interpretation labeling, atrial fibrillation detection, and prediction
of reduced left ventricular ejection fraction (LVEF), where it outperformed comparable
supervised baselines, particularly
when labeled data were limited. Released checkpoints include a pretrained backbone and a
MIMIC-IV-ECG fine-tuned variant; weights are loaded through the project's GitHub
instructions rather than the standard transformers library.
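The two augmentations named above can be illustrated with a minimal, standard-library sketch. It is not the project's implementation: the function names random_lead_mask and adjacent_segments, the 0.5 masking probability, the zero-fill strategy, and the keep-at-least-one-lead rule are all assumptions chosen for illustration. The first function hides whole leads at random, and the second cuts one recording into non-overlapping temporal segments, which a multi-segment contrastive objective would treat as positive pairs.

```python
import random


def random_lead_mask(ecg, p_mask=0.5, rng=None):
    """Zero out each lead independently with probability p_mask.

    ecg is a list of leads, each a list of samples (12 x T for a 12-lead ECG).
    Returns (masked_ecg, kept), where kept[i] is False for a masked lead.
    p_mask=0.5 is an illustrative default, not a confirmed ECG-FM setting.
    """
    rng = rng or random.Random()
    kept = [rng.random() >= p_mask for _ in ecg]
    # Assumption: always keep at least one lead so the input is never all zeros.
    if not any(kept):
        kept[rng.randrange(len(ecg))] = True
    masked = [
        list(lead) if keep else [0.0] * len(lead)
        for lead, keep in zip(ecg, kept)
    ]
    return masked, kept


def adjacent_segments(ecg, seg_len):
    """Split a recording into non-overlapping segments along the time axis.

    Each segment keeps all leads. Segments cut from the same recording are the
    candidates a multi-segment contrastive loss would pull together.
    Trailing samples that do not fill a whole segment are dropped.
    """
    n_segments = len(ecg[0]) // seg_len
    return [
        [lead[i * seg_len:(i + 1) * seg_len] for lead in ecg]
        for i in range(n_segments)
    ]
```

For a 12-lead recording stored as twelve equal-length lists, random_lead_mask(ecg) returns a copy with some leads zeroed, and adjacent_segments(ecg, seg_len) returns the list of segments from which positive pairs would be drawn.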
ECG-FM is intended as a reusable backbone for clinical and research ECG analysis. By fine-tuning the pretrained model, researchers can build classifiers for arrhythmia detection, diagnostic interpretation, and prediction of conditions that are not obvious from the waveform to a human reader, such as reduced ejection fraction. Because it performs well with modest labeled datasets and transfers across recording sources, it is especially useful for institutions that lack the large annotated cohorts typically needed to train ECG models from scratch, and for studying rarer cardiac conditions where labels are inherently limited.
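One lightweight way to build such a classifier is a linear probe: the backbone stays frozen and only a logistic-regression head is trained on its embeddings. The sketch below assumes the embeddings have already been extracted by some means and are available as plain lists of floats. It does not call any ECG-FM or fairseq_signals API, and train_linear_probe and predict_proba are hypothetical helper names.

```python
import math


def predict_proba(weights, bias, embedding):
    """Probability of the positive class for one frozen embedding."""
    z = sum(w * x for w, x in zip(weights, embedding)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train_linear_probe(embeddings, labels, lr=0.5, epochs=200):
    """Fit a logistic-regression head with full-batch gradient descent.

    embeddings: list of equal-length float lists (assumed precomputed by a
    frozen backbone); labels: list of 0/1 integers. lr and epochs are
    illustrative defaults, not tuned values.
    """
    dim = len(embeddings[0])
    n = len(embeddings)
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(embeddings, labels):
            err = predict_proba(weights, bias, x) - y  # d(log-loss)/dz
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias
```

Because only dim + 1 parameters are trained, a probe like this can be fit on a small labeled set, which matches the low-label setting described above. Full fine-tuning would instead update the backbone weights as well.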
ECG-FM is one of the first openly released ECG foundation models with public weights,
code, and preprocessing, making strong pretrained cardiac biosignal representations
broadly accessible. Its demonstration that speech-style self-supervised pretraining
transfers effectively to multi-lead ECG supports the broader move toward foundation
models for physiological signals and provides a reproducible baseline for the cardiology
machine-learning community. Practical limitations include the need to load weights
outside the standard transformers ecosystem and the usual caveat that clinical
deployment requires prospective validation beyond the retrospective benchmarks reported.
McKeen, K., et al. (2024) ECG-FM: An Open Electrocardiogram Foundation Model. arXiv.org.
DOI: 10.48550/arXiv.2408.05178