PaPaGei

Open foundation model for photoplethysmography (PPG), learning morphology-aware waveform representations for cardiovascular and wearable health tasks.

Released: October 2024

PaPaGei is the first open foundation model for photoplethysmography (PPG), the optical signal captured by pulse oximeters, smartwatches, and other wearables to measure blood-volume changes. While PPG underlies a growing range of consumer and clinical health monitoring, prior deep-learning efforts were largely task-specific or relied on generic time-series models that ignore the physiological structure of the pulse waveform. PaPaGei was developed by Arvind Pillai, Dimitris Spathis, and colleagues at Nokia Bell Labs and introduced in a preprint released in October 2024 and accepted at ICLR 2025.

The central idea is a representation-learning approach that leverages domain knowledge of PPG signal morphology across individuals, rather than treating each recording as an opaque time series. This morphology-aware objective lets the model learn embeddings that capture clinically meaningful waveform features and generalize across cardiovascular health, sleep, pregnancy, and wellbeing tasks. By releasing both code and pretrained weights trained exclusively on publicly available data, the authors aim to provide a reusable, reproducible backbone for the wearable-sensing community.

A distinguishing feature is that PaPaGei pairs strong benchmark performance with an explicit fairness analysis. The authors evaluate robustness across skin tones (a known failure mode for optical sensors) and establish a benchmark for bias evaluation in future PPG models.
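A per-group breakdown of this kind can be sketched in a few lines. The example below is only an illustration of the general idea, not the paper's evaluation code: the function name `mae_by_group`, the group labels, and all numbers are hypothetical, and mean absolute error is assumed as the metric.

```python
from collections import defaultdict

def mae_by_group(y_true, y_pred, groups):
    """Return mean absolute error per group and the largest gap between groups."""
    errors = defaultdict(list)
    for t, p, g in zip(y_true, y_pred, groups):
        errors[g].append(abs(t - p))
    per_group = {g: sum(v) / len(v) for g, v in errors.items()}
    gap = max(per_group.values()) - min(per_group.values())
    return per_group, gap

# Toy, made-up values purely for illustration (not results from the paper).
y_true = [120.0, 118.0, 130.0, 125.0, 110.0, 115.0]
y_pred = [122.0, 117.0, 126.0, 131.0, 111.0, 113.0]
groups = ["tone_1", "tone_1", "tone_3", "tone_3", "tone_2", "tone_2"]  # hypothetical labels

per_group, gap = mae_by_group(y_true, y_pred, groups)
print(per_group, gap)
```

Reporting the per-group values alongside the gap, rather than a single pooled score, is what makes a disparity between skin-tone groups visible.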

Key Features

First open PPG foundation model: Provides openly released code and pretrained weights (PaPaGei-S, hosted on Zenodo) trained only on public datasets, with no proprietary data.
Morphology-aware pretraining: Uses a self-supervised objective grounded in PPG waveform morphology across individuals, capturing richer representations than conventional contrastive learning.
Frozen-feature transfer: Embeddings are evaluated as frozen features across 20 downstream tasks, so the model serves as a plug-in feature extractor or multimodal encoder without task-specific fine-tuning.
Parameter and data efficiency: Matches or exceeds time-series foundation models up to roughly 70x larger, lowering the compute barrier for wearable applications.
Built-in fairness benchmark: Reports performance across skin tones, addressing a well-documented source of bias in optical physiological sensing.

Technical Details

PaPaGei-S is built on a 1D convolutional backbone: a ResNet-style encoder (ResNet1DMoE) with 18 residual blocks, 32 base filters, kernel size 3, and a mixture-of-experts head of three expert modules with a gating mechanism. It produces 512-dimensional embeddings from single-channel PPG segments. It was pretrained on over 20 million unlabeled PPG segments totaling more than 57,000 hours of recordings (which works out to roughly 10 seconds per segment on average), drawn entirely from publicly available datasets.

The model was evaluated under a frozen-feature linear-probing protocol on 20 tasks spanning 10 datasets, covering cardiovascular health, sleep disorders, pregnancy monitoring, and wellbeing. Against state-of-the-art time-series foundation models and self-supervised baselines, PaPaGei performed best on at least 14 of these tasks, with average improvements of 6.3% in classification metrics and 2.9% in regression metrics, while remaining more data- and parameter-efficient than substantially larger competitors. The released implementation is in PyTorch under a BSD-3-Clause license.
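To make the idea of a gated mixture-of-experts head concrete, here is a minimal NumPy sketch of a generic softmax-gated mixture over three linear experts applied to a 512-dimensional embedding. It is not the released ResNet1DMoE code: the class name `GatedExpertHead`, the use of linear experts, the output dimension, and where the head sits in the network are all assumptions made for illustration.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

class GatedExpertHead:
    """Hypothetical head: a gate weights the outputs of several linear experts."""

    def __init__(self, in_dim=512, out_dim=1, n_experts=3, seed=0):
        rng = np.random.default_rng(seed)
        self.expert_w = rng.normal(0.0, 0.02, size=(n_experts, in_dim, out_dim))
        self.expert_b = np.zeros((n_experts, out_dim))
        self.gate_w = rng.normal(0.0, 0.02, size=(in_dim, n_experts))
        self.gate_b = np.zeros(n_experts)

    def __call__(self, h):
        # h: (batch, in_dim) embeddings from the encoder
        gate = softmax(h @ self.gate_w + self.gate_b)            # (batch, n_experts)
        expert_out = np.einsum("bi,eio->beo", h, self.expert_w)  # (batch, n_experts, out_dim)
        expert_out = expert_out + self.expert_b
        mixed = (gate[:, :, None] * expert_out).sum(axis=1)      # (batch, out_dim)
        return mixed, gate

# Random stand-in for a batch of four 512-dimensional embeddings.
h = np.random.default_rng(1).normal(size=(4, 512))
head = GatedExpertHead()
mixed, gate = head(h)
print(mixed.shape, gate.sum(axis=1))
```

Each row of `gate` sums to one, so the head output is a convex combination of the expert outputs for that input.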

Applications

PaPaGei targets researchers and developers building health-monitoring systems on top of wearable and clinical PPG sensors. Its frozen embeddings can be used directly as input features for downstream models predicting cardiovascular risk markers, sleep stages and disorders, pregnancy-related signals, and wellbeing indicators, removing the need to train PPG encoders from scratch for each new task. Because it functions as a general feature extractor and multimodal encoder, it is well suited to settings with limited labeled data, and its parameter efficiency makes it practical for resource-constrained or on-device deployment.
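The frozen-feature workflow described above can be sketched as a linear probe: the encoder is left untouched and only a small linear model is fit on its embeddings. The example below is a sketch under stated assumptions, not the authors' evaluation pipeline. Random vectors stand in for real 512-dimensional PaPaGei embeddings, the target is synthetic, and `fit_ridge_probe` is a hypothetical helper that uses closed-form ridge regression.

```python
import numpy as np

def fit_ridge_probe(X, y, l2=1.0):
    """Closed-form ridge regression on frozen embeddings (intercept not penalized)."""
    mu_x, mu_y = X.mean(axis=0), y.mean()
    Xc, yc = X - mu_x, y - mu_y
    w = np.linalg.solve(Xc.T @ Xc + l2 * np.eye(X.shape[1]), Xc.T @ yc)
    b = mu_y - mu_x @ w
    return w, b

rng = np.random.default_rng(0)
n_train, n_test, dim = 1000, 200, 512

# Stand-in data: in practice X would hold embeddings from the frozen encoder
# and y a downstream label such as a blood-pressure value.
X = rng.normal(size=(n_train + n_test, dim))
true_w = rng.normal(size=dim) / np.sqrt(dim)
y = X @ true_w + 0.1 * rng.normal(size=n_train + n_test)

X_tr, X_te = X[:n_train], X[n_train:]
y_tr, y_te = y[:n_train], y[n_train:]

w, b = fit_ridge_probe(X_tr, y_tr)
mae_probe = np.mean(np.abs(X_te @ w + b - y_te))
mae_baseline = np.mean(np.abs(y_tr.mean() - y_te))  # always predict the training mean
print(f"probe MAE: {mae_probe:.3f}, mean-baseline MAE: {mae_baseline:.3f}")
```

Because only `w` and `b` are learned, the probe is cheap to fit and depends on relatively few labels, which is the practical appeal of frozen embeddings in low-label settings.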

Impact

By establishing an openly available foundation model for PPG, PaPaGei fills a notable gap in the digital-health and wearable-sensing ecosystem, where reproducible, pretrained backbones have lagged behind those for images, text, and protein sequences. Its demonstration that a compact, morphology-aware model can outperform much larger generic time-series models offers a template for domain-informed biosignal pretraining. Equally important, its explicit skin-tone bias benchmark brings fairness evaluation into the standard reporting for PPG models, an issue with direct consequences for equitable health monitoring. As an early open release in this space, its main limitations are the focus on a single signal modality and reliance on public datasets whose demographic coverage may constrain generalization.

Citation

PaPaGei: Open Foundation Models for Optical Physiological Signals

Preprint

Pillai, A., et al. (2024) PaPaGei: Open Foundation Models for Optical Physiological Signals. International Conference on Learning Representations.

DOI: 10.48550/arXiv.2410.20542

Recent citations

Papers that recently cited this model.

Contactless Arrhythmia Detection via Diversity-Invariant Contrastive mmWave Sensing
Xinmeng Cai, Jinbo Chen, Haoyu Wang, et al.
IEEE Transactions on Mobile Computing · Aug 2026
0 citations
Retrieval-Augmented Personalization with Foundation Models for Wearable Stress Detection
L. Simon, M. Chetouani
Jun 2026
0 citations
SPOTR: Spatio-temporal Pooling One-Token Reconstruction for Universal Physiological Signal Self-supervised Learning
Yiyu Gui, Mingzhi Chen, Yuesheng Zhu, et al.
Jun 2026
0 citations

Top citations

The most-cited papers that cite this model.

Scaling Wearable Foundation Models
Girish Narayanswamy, Xin Liu, Kumar Ayush, et al.
International Conference on Learning Representations · Oct 2024
58 citations
Pulse-PPG: An Open-Source Field-Trained PPG Foundation Model for Wearable Applications across Lab and Field Settings
Mithun Saha, Maxwell A. Xu, Wanting Mao, et al.
Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies · Feb 2025
40 citations · Influential
RelCon: Relative Contrastive Learning for a Motion Foundation Model for Wearable Data
Maxwell A. Xu, Jaya Narain, Gregory Darnell, et al.
International Conference on Learning Representations · Nov 2024
26 citations
A Scoping Review of Deep Learning Methods for Photoplethysmography Data
Guangkun Nie, Jiabao Zhu, G. Tang, et al.
Health Data Science · Jan 2024
25 citations
LSM-2: Learning from Incomplete Wearable Sensor Data
Maxwell A. Xu, Girish Narayanswamy, Kumar Ayush, et al.
arXiv.org · Jun 2025
22 citations

Citations

Total Citations: 81
Influential: 12
References: 70

GitHub

Stars: 174
Forks: 35
Open Issues: 4
Contributors: 3
Last Push: 1y ago
Language: Python
License: BSD-3-Clause

Fields of citing research

Computer Science: 88%
Medicine: 81%
Engineering: 64%
Environmental Science: 6%
Psychology: 3%
Biology: 3%
Physics: 1%

Share of papers citing this model.

Openness

bio.rodeo openness: Fully open · usable and reproducible
Overall: 67 · Partial
Usability (can I run it?): 83
Reproducibility (can I retrain it?): 59
Model Openness Framework: Class III · Open Model

Resources

GitHub Repository · Research Paper · Dataset
