ECG-LM

Multimodal ECG language model pairing a specialized signal encoder with a biomedical LLM for cardiovascular disease detection and question answering.

Released: February 2025

Parameters: 7 Billion

ECG-LM is a multimodal large language model for interpreting electrocardiograms (ECGs), developed by researchers at the Institute for AI Industry Research (AIR) at Tsinghua University together with Beijing Tsinghua Changgung Hospital and PharMolix Inc., and published in Health Data Science in February 2025. It is presented as the first model to align a specialized ECG signal encoder with a general-purpose biomedical LLM, bridging raw physiological waveforms and free-form clinical language in a single system.

The central problem ECG-LM addresses is that conventional deep-learning ECG models are typically trained as narrow, fixed-label classifiers, which limits their ability to handle the open-ended, conversational questions that arise in clinical practice. Large language models excel at such reasoning over text but cannot natively consume time-series ECG signals. ECG-LM connects the two by projecting features from a dedicated ECG encoder into the text feature space of an LLM, enabling the language model to "read" an ECG and answer questions about it.
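
The alignment step described above can be pictured as a learned projection that maps each ECG feature vector to the LLM's embedding width, after which the projected vectors sit in the input sequence next to the prompt's token embeddings. The sketch below is a minimal plain-Python illustration, assuming a single linear layer and assuming the ECG vectors are prepended to the prompt; the description above specifies neither. The toy dimensions and the names `make_projection`, `project`, and `build_llm_input` are hypothetical.

```python
import random

ECG_DIM = 8    # toy width of one ECG encoder feature vector (assumption)
LLM_DIM = 16   # toy width of one LLM token embedding (assumption)


def make_projection(in_dim, out_dim, seed=0):
    """Randomly initialise a linear layer; in practice these weights are trained."""
    rng = random.Random(seed)
    scale = in_dim ** -0.5
    weight = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
              for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weight, bias


def project(features, weight, bias):
    """Map each ECG feature vector (length in_dim) to LLM width (length out_dim)."""
    return [
        [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weight, bias)]
        for vec in features
    ]


def build_llm_input(ecg_tokens, prompt_embeddings):
    """Prepend projected ECG vectors to the prompt embeddings (one possible layout)."""
    return ecg_tokens + prompt_embeddings


# Four stand-in ECG feature vectors and three stand-in prompt token embeddings.
ecg_features = [[0.1 * (i + j) for i in range(ECG_DIM)] for j in range(4)]
prompt_embeddings = [[0.0] * LLM_DIM for _ in range(3)]

weight, bias = make_projection(ECG_DIM, LLM_DIM)
ecg_tokens = project(ecg_features, weight, bias)
llm_input = build_llm_input(ecg_tokens, prompt_embeddings)
print(len(llm_input), len(llm_input[0]))
```

After projection every vector in the sequence has the same width, so the language model can process ECG-derived positions and text positions uniformly.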

A practical obstacle to this approach is the scarcity of paired text–ECG training data. The authors address this by synthesizing instruction-style training pairs from cardiovascular clinical guidelines and structured ECG report features, allowing the model to learn diagnostic associations without requiring large volumes of manually annotated multimodal data.
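
As a rough illustration of the pairing idea above, the following sketch turns one structured report into an instruction-style training example. It is only a sketch under stated assumptions: the field names, the prompt wording, the `make_instruction_pair` helper, and the placeholder guideline excerpt are all hypothetical and do not reproduce the authors' templates.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReportFeatures:
    """Hypothetical subset of structured ECG report fields."""
    heart_rate_bpm: int
    rhythm: str
    findings: List[str] = field(default_factory=list)


def make_instruction_pair(ecg_id, feats, guideline_excerpt):
    """Turn one structured report plus a guideline excerpt into a training pair."""
    findings = ", ".join(feats.findings) if feats.findings else "none reported"
    response = (
        f"Rhythm: {feats.rhythm}. Heart rate: {feats.heart_rate_bpm} bpm. "
        f"Other findings: {findings}. Guideline context: {guideline_excerpt}"
    )
    return {
        "ecg_id": ecg_id,  # links the text back to the waveform
        "instruction": "Describe the rhythm and notable findings in this ECG.",
        "response": response,
    }


pair = make_instruction_pair(
    "record-0001",  # placeholder identifier
    ReportFeatures(heart_rate_bpm=52, rhythm="sinus bradycardia"),
    "<excerpt from a cardiovascular guideline>",  # placeholder text
)
print(pair["response"])
```

Because each example is generated from fields that already exist in the report, the number of training pairs scales with the number of records rather than with manual annotation effort.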

Key Features

Signal-to-language alignment: A specialized ECG encoder is aligned with the text feature space of an LLM, so cardiac waveforms can be reasoned over alongside natural-language prompts.
Guideline-grounded pretraining: Text–ECG pairs are generated from cardiovascular clinical guidelines and structured report features, mitigating the shortage of paired multimodal data.
Zero- and few-shot detection: The model performs cardiovascular disease detection across diagnostic, rhythm, and form tasks without task-specific retraining, outperforming contrastive few-shot baselines.
Variable-lead input: An improved ResNet-18 encoder accepts a variable number of ECG leads without architectural changes, accommodating diverse recording setups.
Clinical question answering: Beyond classification, ECG-LM answers verify-, choose-, and query-style questions over ECGs, supporting more interactive clinical use.
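
To make the three question styles in the last feature concrete, the sketch below scores a handful of answers separately per style. The examples are invented for illustration and are not taken from any benchmark, and case-insensitive exact string match is an assumed, simplified scoring rule; `accuracy_by_type` is a hypothetical helper.

```python
from collections import defaultdict


def accuracy_by_type(examples):
    """Return exact-match accuracy for each question type."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for ex in examples:
        total[ex["type"]] += 1
        # Simplified scoring: compare normalised strings.
        if ex["prediction"].strip().lower() == ex["answer"].strip().lower():
            correct[ex["type"]] += 1
    return {qtype: correct[qtype] / total[qtype] for qtype in total}


# Invented toy examples: yes/no (verify), pick an option (choose), open-ended (query).
examples = [
    {"type": "verify", "answer": "yes", "prediction": "Yes"},
    {"type": "verify", "answer": "no", "prediction": "yes"},
    {"type": "choose", "answer": "sinus rhythm", "prediction": "sinus rhythm"},
    {"type": "query", "answer": "left bundle branch block", "prediction": "none"},
]

scores = accuracy_by_type(examples)
print(scores)
```

Reporting one score per style, rather than a single pooled number, shows where a model struggles; open-ended query questions are typically harder than yes/no verification.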

Technical Details

ECG-LM couples an improved ResNet-18 convolutional encoder, modified to handle variable input sizes and lead counts, with BioMedGPT-LM-7B, a 7-billion-parameter LLM built on LLaMA2-Chat-7B and pretrained on roughly 4.2 million biomedical articles from the S2ORC corpus.

Training and evaluation draw on PTB-XL (21,799 clinical 12-lead records from over 18,000 patients) and the PTB-XL+ feature dataset, with non-English reports translated and manually validated.

On PTB-XL, the zero-shot ECG-LM outperforms a SimCLR-based few-shot baseline across all three task families: diagnostic (F1 0.647 vs. 0.485), rhythm (F1 0.524 vs. 0.456), and form (F1 0.570 vs. 0.549). On the ECG-QA benchmark it reaches 0.758 accuracy on Single-Verify, 0.574 on Single-Choose, and 0.399 on Single-Query questions, for a 0.577 average across the three question types.
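
The reported numbers can be checked with a few lines of arithmetic. The snippet below uses only the figures quoted above; the variable names are illustrative.

```python
# Scores as quoted in the text above (PTB-XL F1, ECG-QA accuracy).
f1_scores = {
    "diagnostic": (0.647, 0.485),  # (zero-shot ECG-LM, SimCLR few-shot baseline)
    "rhythm": (0.524, 0.456),
    "form": (0.570, 0.549),
}
qa_accuracy = {"Single-Verify": 0.758, "Single-Choose": 0.574, "Single-Query": 0.399}

# Absolute F1 margin of ECG-LM over the baseline, per task family.
f1_margin = {task: round(ours - base, 3) for task, (ours, base) in f1_scores.items()}

# Unweighted mean accuracy across the three question types.
qa_average = round(sum(qa_accuracy.values()) / len(qa_accuracy), 3)

print(f1_margin)
print(qa_average)
```

The unweighted mean reproduces the quoted 0.577 average. The F1 margin is widest on diagnostic tasks (0.162) and narrowest on form tasks (0.021).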

Applications

ECG-LM targets clinical and research settings where ECG interpretation must be combined with natural-language reasoning, such as automated triage, decision support, and interactive question answering over cardiac recordings. By handling diagnostic, rhythm, and form classification within a single conversational interface, it can assist clinicians who need explanations rather than bare labels and support researchers building ECG-aware assistants. Its tolerance for variable lead counts makes it adaptable to settings ranging from standard 12-lead hospital ECGs to reduced-lead acquisitions.
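
The description above does not say how the encoder tolerates different lead counts. One generic way to get that property, shown here purely as an assumption and not as the authors' design, is to compute features per lead and then pool across leads so the output shape no longer depends on how many leads were recorded. The toy descriptors and the `encode` and `synthetic_lead` helpers are hypothetical stand-ins for a real convolutional encoder and real recordings.

```python
import math
from statistics import fmean, pstdev


def lead_features(samples):
    """Toy per-lead descriptor: mean, standard deviation, peak-to-peak amplitude."""
    return [fmean(samples), pstdev(samples), max(samples) - min(samples)]


def encode(ecg):
    """Encode an ECG given as a list of leads; output length ignores lead count."""
    per_lead = [lead_features(lead) for lead in ecg]
    # Average each feature across leads so 1, 3, or 12 leads give the same shape.
    return [fmean(column) for column in zip(*per_lead)]


def synthetic_lead(scale, n_samples=500):
    """Sine wave standing in for a real lead recording."""
    return [scale * math.sin(2 * math.pi * t / 50) for t in range(n_samples)]


twelve_lead = [synthetic_lead(k + 1) for k in range(12)]
single_lead = [synthetic_lead(1)]

vec12 = encode(twelve_lead)
vec1 = encode(single_lead)
print(len(vec12), len(vec1))
```

Both inputs yield a vector of the same length, which is what lets a downstream projection layer stay unchanged across recording setups.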

Impact

ECG-LM is an early demonstration that general biomedical LLMs can be extended to consume physiological signals directly, pointing toward conversational diagnostic tools for cardiology. Its guideline-driven synthetic-pairing strategy offers a template for other signal-to-language alignment problems where paired data are scarce. A key limitation for reproducibility and downstream adoption is that, as of publication, the code and weights had not been released; the authors stated they were preparing all code and data for public release, with parts of the supervised fine-tuning data contingent on hospital data-sharing agreements. No model card or data card is currently available, and independent benchmarking will depend on that pending release.

Citation

ECG-LM: Understanding Electrocardiogram with a Large Language Model

Yang, K., et al. (2025) ECG-LM: Understanding Electrocardiogram with a Large Language Model. Health Data Science.

DOI: 10.34133/hds.0221

Recent citations

Papers that recently cited this model.

Artificial Intelligence-enabled Electrocardiography for Early Detection of Silent Atrial Fibrillation: Advances, Challenges, and Future Perspectives
Haneen Abdul Rasheed, I. Maideen, Maria Elenor Bulatao Lozada, et al.
Indian Journal of Clinical Cardiology · Jul 2026
Citations: 0

Physiology-Aware CNN and Zero-Shot Multimodal LLMs for ECG Image Classification: A Comparative Study
K. Ahammad, D. Abbott, M. Dorraki
Jun 2026
Citations: 0

An Edge–Cloud Collaborative ECG-Assisted Diagnostic System Leveraging Cross-Lead Knowledge Distillation and Large Language Models
Haohan Su, Shuai Wang, Hongxiao Wang, et al.
Italian National Conference on Sensors · Jun 2026
Citations: 0

Top citations

The most-cited papers that cite this model.

Data-Centric Foundation Models in Computational Healthcare: A Survey
Yunkun Zhang, Jin Gao, Zheling Tan, et al.
ACM Computing Surveys · Jan 2024
Citations: 42

GEM: Empowering MLLM for Grounded ECG Understanding with Time Series and Images
Xiang Lan, Feng Wu, Kai He, et al.
arXiv.org · Mar 2025
Citations: 37

Q-HEART: ECG Question Answering via Knowledge-Informed Multimodal LLMs
Hung Manh Pham, Jialu Tang, Aaqib Saeed, et al.
European Conference on Artificial Intelligence · May 2025
Citations: 10 · Influential

SuPreME: A Supervised Pre-training Framework for Multimodal ECG Representation Learning
Mingsheng Cai, Jiuming Jiang, Wenhao Huang, et al.
Conference on Empirical Methods in Natural Language Processing · Feb 2025
Citations: 7

The cost of explainability in artificial intelligence-enhanced electrocardiogram models
K. Patlatzoglou, L. Pastika, Joseph Barker, et al.
npj Digital Medicine · Dec 2025
Citations: 6

Citations

Total Citations: 42

Influential: 1

References: 0

Fields of citing research

Computer Science: 97%
Medicine: 92%
Engineering: 22%
Biology: 5%
Environmental Science: 3%

Share of papers citing this model.

Openness

bio.rodeo openness: Closed · low usability and reproducibility

24 · Closed

Usability (can I run it?): 15

Reproducibility (can I retrain it?): 16

Model Openness Framework: Unclassified (missing required components)

Resources

Research Paper
