PULSE

The Ohio State University / Carnegie Mellon University

Multimodal large language model that interprets 12-lead electrocardiogram images, answering open-ended clinical questions and generating ECG reports.

Released: October 2024

Parameters: 7 Billion

PULSE is a multimodal large language model (MLLM) built to interpret electrocardiograms presented as images rather than as digitized raw waveform signals. In clinical practice, ECGs are most often shared as printed or scanned 12-lead plots — embedded in PDFs, photographed on paper, or exported from monitors — and the underlying numeric signal is frequently unavailable. Most prior ECG-AI systems consume raw signals and target a narrow set of arrhythmias, limiting their use in resource-constrained settings. PULSE instead reasons directly over the ECG image, the same artifact a clinician sees, and answers open-ended questions about it in natural language.

Developed by researchers at The Ohio State University (Ruoqi Liu and Ping Zhang) in collaboration with Carnegie Mellon University (Xiang Yue), the model was introduced in the October 2024 preprint "Teach Multimodal LLMs to Comprehend Electrocardiographic Images." The work contributes three artifacts: ECGInstruct, a large instruction-tuning dataset of ECG images; PULSE itself, a 7B-parameter model fine-tuned on that data; and ECGBench, a standardized evaluation suite for ECG image understanding.

By framing ECG interpretation as an image-to-text task, PULSE connects the fast-moving world of general-purpose vision-language models to a clinically important biosignal modality, and demonstrates that instruction tuning on domain data substantially closes the gap between generalist MLLMs and the specialized reasoning ECG reading demands.

Key Features

Image-native ECG interpretation: Operates on rendered 12-lead ECG plots, the format actually circulated in clinics, rather than requiring raw signal access that is often unavailable.
ECGInstruct dataset: A curated instruction-tuning corpus of over 1 million ECG image-text samples spanning diverse cardiac conditions, sourced from multiple public ECG repositories.
ECGBench evaluation suite: A new benchmark covering four ECG interpretation tasks across nine datasets (including PTB-XL, CODE-15, CPSC, CSN, ECG-QA, and out-of-distribution sets), enabling reproducible comparison.
Strong gains over generalist MLLMs: Achieves an average accuracy improvement of roughly 15–30% over general-purpose multimodal LLMs on the benchmark tasks.
Open weights and demo: Apache-2.0 licensed weights (PULSE-7B), training/evaluation datasets, and a live HuggingFace Space demo are all publicly released.

Technical Details

PULSE-7B is fine-tuned from the open multimodal backbone LLaVA-v1.6 (Vicuna-7B), which pairs a CLIP-style vision encoder with a 7B Vicuna language model through a projection layer. The model is trained via instruction tuning on ECGInstruct, whose 1M+ samples are constructed from existing ECG signal datasets by rendering signals into images and pairing them with task-oriented instructions covering abnormality detection, diagnosis, rhythm and morphology questions, and report generation. Evaluation is performed on ECGBench, which organizes four core tasks — including multiple-choice question answering and report generation — across nine datasets, with held-out and out-of-distribution sets (such as an MMMU-style ECG split) to probe generalization. Across these tasks PULSE outperforms strong proprietary and open MLLM baselines by an average of 15–30% in accuracy, with the largest gains on tasks requiring fine-grained reading of waveform morphology.

Applications

PULSE is most directly useful where ECGs exist only as images and signal data is inaccessible: low-resource clinics, telemedicine, retrospective chart review, and educational settings where students query annotated ECG plots. Its conversational interface lets clinicians or researchers ask targeted questions ("Is there evidence of atrial fibrillation?") or request a structured report from a single photo of a tracing. The released datasets and benchmark also give the research community a shared foundation for building and fairly comparing the next generation of ECG image-understanding models.

Impact

PULSE establishes ECG image interpretation as a tractable multimodal LLM task and provides the field's first large-scale instruction dataset (ECGInstruct) and standardized benchmark (ECGBench) for it, lowering the barrier to entry for follow-on work. The fully open release — Apache-2.0 weights, training and evaluation data, code, and a hosted demo — makes the results reproducible and extensible. Important caveats remain: the model is a research artifact rather than a cleared clinical device, its training images are largely rendered from existing signal datasets and may not capture all real-world scan artifacts, and outputs require expert verification before any diagnostic use.

Citation

Teach Multimodal LLMs to Comprehend Electrocardiographic Images

Preprint

Liu, R., et al. (2024) Teach Multimodal LLMs to Comprehend Electrocardiographic Images. arXiv.org.

DOI: 10.48550/arXiv.2410.19008

Recent citations

Papers that recently cited this model.

Clinical artificial intelligence applications of vision-language foundation models
A. Thirunavukarasu, Siyou Li, Pengyao Qin, et al.
PLOS Digital Health · Jun 2026
0
EVL-ECG: Efficient ECG Interpretation With Multi-Aspect Heterogeneous Knowledge Distillation
Dan Hong, Nhi Ngoc-Yen Nguyen, Huy-Hieu Pham
May 2026
0Influential
Reasoning Before Diagnosis: Physician-Inspired Structured Thinking for ECG Classification
Yang Wu, Xiaoyan Yuan, H. Wong, et al.
May 2026
0

Top citations

The most-cited papers that cite this model.

GEM: Empowering MLLM for Grounded ECG Understanding with Time Series and Images
Xiang Lan, Feng Wu, Kai He, et al.
arXiv.org · Mar 2025
37Influential
A Current Review of Generative AI in Medicine: Core Concepts, Applications, and Current Limitations
Pouria Rouzrokh, Bardia Khosravi, S. Faghani, et al.
Current Reviews in Musculoskeletal Medicine · Apr 2025
21
From Token to Rhythm: A Multi-Scale Approach for ECG-Language Pretraining
Fuying Wang, Jiacheng Xu, Lequan Yu
International Conference on Machine Learning · Jun 2025
16
A Systematic Review on Foundation Models for Electrocardiogram Analysis: Initial Strides and Expansive Horizons
Yu Han, V. Murino, Xiaofeng Liu, et al.
Oct 2024
11
ECG-Byte: A Tokenizer for End-to-End Generative Electrocardiogram Language Modeling
William Jongwon Han, Chaojing Duan, Michael A. Rosenberg, et al.
arXiv.org · Dec 2024
7

Citations

Total Citations30

Influential8

References48

GitHub

Stars67

Forks14

Open Issues9

Contributors2

Last Push1y ago

LanguagePython

HuggingFace

Downloads2K

Likes37

Last Modified1y ago

Pipelineimage-text-to-text

Fields of citing research

Computer Science100%
Medicine93%
Engineering25%
Psychology4%
Linguistics4%
Biology4%

Share of papers citing this model.

Openness

bio.rodeo opennessFully open · usable and reproducible

84Open

Usability — can I run it?100

Reproducibility — can I retrain it?72

Model Openness Framework

Class III

Open Model

Resources

GitHub Repository Research Paper Official Website HuggingFace Model Demo Dataset Dataset

Key Features

Image-native ECG interpretation: Operates on rendered 12-lead ECG plots, the format actually circulated in clinics, rather than requiring raw signal access that is often unavailable.

ECGInstruct dataset: A curated instruction-tuning corpus of over 1 million ECG image-text samples spanning diverse cardiac conditions, sourced from multiple public ECG repositories.

ECGBench evaluation suite: A new benchmark covering four ECG interpretation tasks across nine datasets (including PTB-XL, CODE-15, CPSC, CSN, ECG-QA, and out-of-distribution sets), enabling reproducible comparison.

Strong gains over generalist MLLMs: Achieves an average accuracy improvement of roughly 15–30% over general-purpose multimodal LLMs on the benchmark tasks.

Open weights and demo: Apache-2.0 licensed weights (PULSE-7B), training/evaluation datasets, and a live HuggingFace Space demo are all publicly released.

Technical Details

Applications

Impact

Recent citations

Papers that recently cited this model.

Clinical artificial intelligence applications of vision-language foundation models

A. Thirunavukarasu, Siyou Li, Pengyao Qin, et al.

PLOS Digital Health · Jun 2026

EVL-ECG: Efficient ECG Interpretation With Multi-Aspect Heterogeneous Knowledge Distillation

Dan Hong, Nhi Ngoc-Yen Nguyen, Huy-Hieu Pham

May 2026

0Influential

Reasoning Before Diagnosis: Physician-Inspired Structured Thinking for ECG Classification

Yang Wu, Xiaoyan Yuan, H. Wong, et al.

May 2026

Top citations

The most-cited papers that cite this model.

GEM: Empowering MLLM for Grounded ECG Understanding with Time Series and Images

Xiang Lan, Feng Wu, Kai He, et al.

arXiv.org · Mar 2025

37Influential

A Current Review of Generative AI in Medicine: Core Concepts, Applications, and Current Limitations

Pouria Rouzrokh, Bardia Khosravi, S. Faghani, et al.

Current Reviews in Musculoskeletal Medicine · Apr 2025

From Token to Rhythm: A Multi-Scale Approach for ECG-Language Pretraining

Fuying Wang, Jiacheng Xu, Lequan Yu

International Conference on Machine Learning · Jun 2025

A Systematic Review on Foundation Models for Electrocardiogram Analysis: Initial Strides and Expansive Horizons

Yu Han, V. Murino, Xiaofeng Liu, et al.

Oct 2024

ECG-Byte: A Tokenizer for End-to-End Generative Electrocardiogram Language Modeling

William Jongwon Han, Chaojing Duan, Michael A. Rosenberg, et al.

arXiv.org · Dec 2024

PULSE

#Key Features

#Technical Details

#Applications

#Impact

Citation

Teach Multimodal LLMs to Comprehend Electrocardiographic Images

Recent citations

EVL-ECG: Efficient ECG Interpretation With Multi-Aspect Heterogeneous Knowledge Distillation

Reasoning Before Diagnosis: Physician-Inspired Structured Thinking for ECG Classification

Top citations

A Systematic Review on Foundation Models for Electrocardiogram Analysis: Initial Strides and Expansive Horizons

Related models

Citations

GitHub

HuggingFace

Fields of citing research

Openness

Tags

Resources

PULSE

#Key Features

#Technical Details

#Applications

#Impact

Citation

Teach Multimodal LLMs to Comprehend Electrocardiographic Images

Recent citations

EVL-ECG: Efficient ECG Interpretation With Multi-Aspect Heterogeneous Knowledge Distillation

Reasoning Before Diagnosis: Physician-Inspired Structured Thinking for ECG Classification

Top citations

A Systematic Review on Foundation Models for Electrocardiogram Analysis: Initial Strides and Expansive Horizons

Related models

Citations

GitHub

HuggingFace

Fields of citing research

Openness

Tags

Resources

Key Features

Technical Details

Applications

Impact

Key Features

Technical Details

Applications

Impact