bio.rodeo
© 2026 Pulsatance. All rights reserved.

HeartLang

Peking University

Self-supervised ECG foundation model that treats heartbeats as words and rhythms as sentences, using a QRS-Tokenizer and dual-level pretraining on MIMIC-IV-ECG.

Released: February 2025

HeartLang is a self-supervised foundation model for the electrocardiogram (ECG) that reframes signal modeling as a language-modeling problem. Rather than carving the waveform into fixed-length time windows—the dominant practice in deep ECG models—it treats individual heartbeats as "words" and the sequence of beats that forms a rhythm strip as a "sentence." This semantic segmentation is designed to respect the natural structure of cardiac signals, where the clinically meaningful unit is the heartbeat and its morphology, not an arbitrary slice of time.

The model was introduced in "Reading Your Heart: Learning ECG Words and Sentences via Pre-training ECG Language Model" by Jiarui Jin, Haoyu Wang, Hongyan Li, Jun Li, Jiahui Pan, and Shenda Hong from the PKUDigitalHealth group at Peking University, and accepted to ICLR 2025. It sits within the broader wave of biosignal foundation models that adapt masked-prediction pretraining—popularized by language and vision models—to physiological time series, and is distinguished by its explicitly linguistic, heartbeat-centric tokenization.

By pretraining on a large corpus of unlabeled ECGs and transferring to downstream diagnostic tasks, HeartLang aims to reduce the heavy annotation burden that has limited supervised ECG deep learning, while producing representations that capture both single-beat form and longer-range rhythm context.

#Key Features

  • Heartbeat-as-word tokenization: A QRS-Tokenizer detects QRS complexes and segments the raw signal into individual heartbeats, converting a continuous waveform into a sequence of semantically meaningful, beat-aligned tokens ("ECG sentences").
  • Dual-level representation learning: Pretraining operates at two levels—the form level captures the morphology of individual heartbeats, while the rhythm level captures how beats are arranged over time—mirroring the word/sentence distinction.
  • Vector-quantized heartbeat vocabulary: A Vector-Quantized Heartbeat Reconstruction (VQ-HBR) stage learns a discrete codebook of 8192 entries, building what the authors describe as the largest heartbeat-based ECG vocabulary to date.
  • Masked ECG sentence pretraining: Rhythm-level representations are learned through a masked-prediction objective over sequences of heartbeat tokens, analogous to masked language modeling.
  • Self-supervised and transferable: The model is pretrained without diagnostic labels and fine-tuned for downstream classification, lowering reliance on scarce expert annotations.
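The heartbeat-as-word idea in the first bullet can be illustrated with a minimal sketch. It is not the released QRS-Tokenizer: the threshold-based peak picker, the fixed window width, and the synthetic trace are all assumptions chosen to keep the example self-contained, and the function names (`detect_r_peaks`, `segment_beats`, `synthetic_ecg`) are hypothetical.

```python
import math

def detect_r_peaks(signal, fs, threshold_ratio=0.6, refractory_s=0.2):
    """Naive stand-in for a QRS detector (assumption, not the paper's method).

    Keeps local maxima that reach a fraction of the global maximum and are
    separated from the previous accepted peak by a refractory period.
    """
    threshold = threshold_ratio * max(signal)
    refractory = int(refractory_s * fs)
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if is_local_max and signal[i] >= threshold:
            if not peaks or i - peaks[-1] > refractory:
                peaks.append(i)
    return peaks

def segment_beats(signal, peaks, half_width):
    """Cut a fixed-length window centred on each peak, zero-padding at the edges."""
    beats = []
    for p in peaks:
        window = [
            signal[j] if 0 <= j < len(signal) else 0.0
            for j in range(p - half_width, p + half_width)
        ]
        beats.append(window)
    return beats

def synthetic_ecg(n_beats=5, rr_samples=80, sigma=2.0):
    """Toy single-lead trace: one narrow Gaussian spike per beat."""
    n = n_beats * rr_samples
    sig = [0.0] * n
    for b in range(n_beats):
        centre = b * rr_samples + rr_samples // 2
        for i in range(n):
            sig[i] += math.exp(-((i - centre) ** 2) / (2 * sigma ** 2))
    return sig

fs = 100  # assumed sampling rate in Hz for the toy trace
ecg = synthetic_ecg()
peaks = detect_r_peaks(ecg, fs)
sentence = segment_beats(ecg, peaks, half_width=30)  # one "word" per beat
print(len(sentence), "beats of", len(sentence[0]), "samples each")
```

Each element of `sentence` is a beat-aligned window, so the list plays the role of an "ECG sentence". A real recording would need a robust QRS detector, since the simple threshold used here fails on noisy or abnormal beats.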

#Technical Details

HeartLang uses a transformer backbone trained in two stages. First, the VQ-HBR module encodes each tokenized heartbeat into a discrete code drawn from an 8192-entry codebook, establishing the ECG "vocabulary"; this stage is trained by reconstructing heartbeats through a vector-quantized bottleneck. Second, a masked ECG sentence pretraining stage learns rhythm-level representations by masking heartbeat tokens and predicting them from the rest of the sequence. Pretraining uses the MIMIC-IV-ECG dataset from PhysioNet, a large collection of 12-lead clinical recordings. Training runs for roughly 200 epochs with learning-rate scheduling, and the reference implementation trains VQ-HBR on 8 NVIDIA RTX 4090 GPUs. The model is evaluated across six public ECG datasets, including diagnostic subsets of PTB-XL, CPSC2018, and the Chapman-Shaoxing-Ningbo (CSN) arrhythmia dataset, where the authors report improved downstream classification over prior self-supervised ECG baselines. Code and pretrained checkpoints (the pretraining and VQ-HBR weights) are released under an MIT license.
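The two pretraining stages can be pictured with a toy sketch: a nearest-neighbour lookup that maps each heartbeat embedding to a codebook index, followed by random masking of the resulting token sequence. Everything here is an illustrative assumption rather than the released implementation: the 4-entry, 2-dimensional codebook stands in for the 8192-entry one, the 50% mask ratio is arbitrary, and the helper names (`nearest_code`, `quantize`, `mask_tokens`) are hypothetical.

```python
import random

def nearest_code(vector, codebook):
    """Index of the codebook entry with the smallest squared Euclidean distance."""
    best_idx, best_dist = 0, float("inf")
    for idx, entry in enumerate(codebook):
        dist = sum((v - e) ** 2 for v, e in zip(vector, entry))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx

def quantize(beat_embeddings, codebook):
    """Map every heartbeat embedding to a discrete token (a codebook index)."""
    return [nearest_code(v, codebook) for v in beat_embeddings]

def mask_tokens(tokens, mask_ratio, mask_id, rng):
    """Replace a random subset of tokens with mask_id.

    Returns the masked sequence and a dict {position: original token},
    which is what a masked-prediction objective would try to recover.
    """
    n_mask = max(1, round(mask_ratio * len(tokens)))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    masked = list(tokens)
    for pos in positions:
        masked[pos] = mask_id
    targets = {pos: tokens[pos] for pos in positions}
    return masked, targets

# Toy codebook (assumption): 4 entries instead of 8192, 2-D instead of learned.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
MASK_ID = len(codebook)  # an extra id reserved for the mask token

# Pretend these came from a heartbeat encoder, one vector per beat.
beat_embeddings = [
    [0.1, 0.1], [0.9, 0.2], [0.2, 0.8],
    [0.9, 0.9], [0.0, 0.1], [0.8, 0.1],
]

tokens = quantize(beat_embeddings, codebook)
masked, targets = mask_tokens(tokens, mask_ratio=0.5, mask_id=MASK_ID,
                              rng=random.Random(0))
print("tokens:", tokens)
print("masked:", masked)
print("targets:", targets)
```

In a trained system the codebook is learned jointly with the encoder and decoder rather than fixed by hand, and a transformer predicts the entries of `targets` from the unmasked context.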

#Applications

HeartLang targets automated ECG interpretation tasks such as multi-label diagnostic classification and arrhythmia detection. Because it is pretrained self-supervised on unlabeled recordings, it is particularly useful in settings where labeled ECGs are limited: a hospital or research group can fine-tune the released checkpoints on a modest annotated dataset rather than training from scratch. Beyond classification, its heartbeat-level discrete vocabulary and learned embeddings can serve as reusable features for downstream cardiac analysis, and the framework offers a template for applying language-model-style pretraining to other quasi-periodic physiological signals.
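As a rough illustration of reusing learned embeddings on a small labeled set, the sketch below mean-pools per-beat embeddings into one vector per recording and fits a logistic-regression probe with plain gradient descent. The embeddings are hand-written toy values, not outputs of the released checkpoints, and the probe, its hyperparameters, and the function names (`mean_pool`, `train_probe`, `predict`) are assumptions made for the example. It handles a single binary label; a multi-label setup could train one such probe per label.

```python
import math

def mean_pool(beat_embeddings):
    """Average per-beat embeddings into a single recording-level feature vector."""
    n = len(beat_embeddings)
    dim = len(beat_embeddings[0])
    return [sum(b[d] for b in beat_embeddings) / n for d in range(dim)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_probe(features, labels, lr=0.5, epochs=200):
    """Binary logistic regression fitted with full-batch gradient descent."""
    dim = len(features[0])
    n = len(features)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for d in range(dim):
                grad_w[d] += err * x[d]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    """Return 1 if the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5 else 0

# Toy "frozen" embeddings (assumption): three beats per recording, 2-D each.
recordings = [
    [[0.1, 0.0], [0.2, 0.1], [0.0, -0.1]],   # label 0
    [[0.0, 0.3], [-0.1, 0.2], [0.1, 0.1]],   # label 0
    [[0.3, 0.1], [0.1, 0.2], [0.2, 0.0]],    # label 0
    [[0.9, 1.1], [1.0, 0.9], [0.8, 1.0]],    # label 1
    [[1.1, 0.8], [0.9, 0.7], [1.0, 0.9]],    # label 1
    [[0.7, 0.9], [0.9, 1.0], [0.8, 0.8]],    # label 1
]
labels = [0, 0, 0, 1, 1, 1]

features = [mean_pool(r) for r in recordings]
w, b = train_probe(features, labels)
train_preds = [predict(w, b, x) for x in features]
print("train predictions:", train_preds)
```

Keeping the encoder frozen and training only a small probe is one low-cost option when annotations are scarce; full fine-tuning of the checkpoint, as described above, updates the backbone as well.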

#Impact

By recasting ECG modeling as learning "words" and "sentences," HeartLang contributes a distinctive, biologically motivated tokenization strategy to the rapidly growing space of biosignal foundation models, and its acceptance at ICLR 2025 reflects interest in structure-aware self-supervised approaches. The public release of code, an 8192-entry heartbeat codebook, and pretrained weights lowers the barrier for downstream ECG research. Key limitations include dependence on accurate QRS detection for tokenization—noisy or abnormal beats may be mis-segmented—and evaluation centered on standard public benchmarks, so prospective clinical validation and robustness across diverse populations and devices remain open questions.

Citation

Reading Your Heart: Learning ECG Words and Sentences via Pre-training ECG Language Model

Preprint

Jin, J., et al. (2025) Reading Your Heart: Learning ECG Words and Sentences via Pre-training ECG Language Model. International Conference on Learning Representations.

DOI: 10.48550/arXiv.2502.10707

Citations

Total Citations: 42
Influential: 9
References: 37

GitHub

Stars: 56
Forks: 6
Open Issues: 0
Contributors: 1
Last Push: 11mo ago
Language: Python
License: MIT

HuggingFace

Downloads: 0
Likes: 4
Last Modified: 1y ago

Openness

bio.rodeo openness: 78 (Open). Fully open · usable and reproducible
Usability (can I run it?): 94
Reproducibility (can I retrain it?): 66
Model Openness Framework: Class III (Open Model)

Tags

arrhythmia_detection, ecg_classification, electrocardiogram, foundation_model, representation_learning, self_supervised, transformer, vector_quantization

Resources

GitHub Repository, Research Paper, HuggingFace Model