bio.rodeo

PaPaGei

Nokia Bell Labs

The first open foundation model for photoplethysmography (PPG), pretrained on ~20M PPG segments with morphology-aware representation learning for cardiovascular and health tasks.

Released: October 2024

PaPaGei is the first open foundation model for photoplethysmography (PPG), the optical signal captured by pulse oximeters, smartwatches, and other wearables to measure blood-volume changes. Although PPG underlies a growing range of consumer and clinical health monitoring, prior deep-learning efforts were largely task-specific or relied on generic time-series models that ignore the physiological structure of the pulse waveform. PaPaGei was developed by Arvind Pillai, Dimitris Spathis, and colleagues at Nokia Bell Labs. It was introduced in a preprint released in October 2024, and the paper was accepted at ICLR 2025.

The central idea is a representation-learning approach that leverages domain knowledge of PPG signal morphology across individuals, rather than treating each recording as an opaque time series. This morphology-aware objective lets the model learn embeddings that capture clinically meaningful waveform features and generalize across cardiovascular health, sleep, pregnancy, and wellbeing tasks. By releasing both code and pretrained weights trained exclusively on publicly available data, the authors aim to provide a reusable, reproducible backbone for the wearable-sensing community.

A distinguishing feature is that PaPaGei pairs strong benchmark performance with an explicit fairness analysis, evaluating robustness across skin tones, a known failure mode for optical sensors, and establishing a benchmark for bias evaluation in future PPG models.
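
To make the idea of a skin-tone robustness check concrete, the following is a minimal sketch of a per-group error report. It is not the authors' evaluation code: the group labels (`"lighter"`, `"darker"`), the heart-rate values, and the helper names `mae_by_group` and `worst_case_gap` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean


def mae_by_group(y_true, y_pred, groups):
    """Return the mean absolute error for each group label."""
    errors = defaultdict(list)
    for target, prediction, group in zip(y_true, y_pred, groups):
        errors[group].append(abs(target - prediction))
    return {group: mean(errs) for group, errs in errors.items()}


def worst_case_gap(per_group):
    """Difference between the worst and the best per-group error."""
    return max(per_group.values()) - min(per_group.values())


# Toy data (hypothetical): heart-rate targets and predictions in bpm,
# each tagged with an illustrative skin-tone group label.
y_true = [60, 72, 80, 65, 90, 75]
y_pred = [62, 70, 80, 69, 84, 80]
groups = ["lighter", "lighter", "lighter", "darker", "darker", "darker"]

per_group = mae_by_group(y_true, y_pred, groups)
gap = worst_case_gap(per_group)
print(per_group, gap)
```

Reporting the per-group scores alongside the aggregate, rather than only the aggregate, is what lets a gap like this show up at all.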

#Key Features

  • First open PPG foundation model: Provides openly released code and pretrained weights (PaPaGei-S, hosted on Zenodo) trained only on public datasets, with no proprietary data.
  • Morphology-aware pretraining: Uses a self-supervised objective grounded in PPG waveform morphology across individuals, which the authors report yields richer representations than conventional contrastive learning.
  • Frozen-feature transfer: Embeddings are evaluated as frozen features across 20 downstream tasks, so the model serves as a plug-in feature extractor or multimodal encoder without task-specific fine-tuning.
  • Parameter and data efficiency: Matches or exceeds time-series foundation models up to roughly 70x larger, lowering the compute barrier for wearable applications.
  • Built-in fairness benchmark: Reports performance across skin tones, addressing a well-documented source of bias in optical physiological sensing.
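
The frozen-feature transfer described in the list above can be sketched as a linear probe: the encoder stays fixed and only a linear classifier is trained on its outputs. The sketch below is an illustration under stated assumptions, not the PaPaGei pipeline. `frozen_embed` is a hypothetical stand-in that returns three summary statistics instead of the model's 512-dimensional embedding, and the sine-wave "segments" are synthetic.

```python
import math
import random
from statistics import mean, pstdev


def frozen_embed(segment):
    """Hypothetical stand-in for a frozen encoder: fixed features, no training."""
    return [mean(segment), pstdev(segment), max(segment) - min(segment)]


def make_segment(amplitude, rng, n=100):
    """Synthetic pulse-like segment: a noisy sine wave (not real PPG)."""
    return [amplitude * math.sin(2 * math.pi * k / 25) + rng.gauss(0, 0.05)
            for k in range(n)]


def make_dataset(n, rng):
    """Alternate two classes that differ only in waveform amplitude."""
    labels = [i % 2 for i in range(n)]
    segments = [make_segment(2.0 if y else 0.5, rng) for y in labels]
    return [frozen_embed(s) for s in segments], labels


def train_linear_probe(features, labels, epochs=200, lr=0.1):
    """Logistic regression fitted by SGD; only w and b are updated."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def predict(w, b, x):
    return int(sum(wi * xi for wi, xi in zip(w, x)) + b > 0)


rng = random.Random(0)
train_x, train_y = make_dataset(40, rng)
test_x, test_y = make_dataset(20, rng)
w, b = train_linear_probe(train_x, train_y)
accuracy = sum(predict(w, b, x) == y for x, y in zip(test_x, test_y)) / len(test_y)
print(f"held-out accuracy: {accuracy:.2f}")
```

With a real pretrained encoder, `frozen_embed` would be replaced by a forward pass with gradients disabled; the probe itself would stay the same.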

#Technical Details

PaPaGei-S is built on a 1D convolutional backbone: a ResNet-style encoder (ResNet1DMoE) with 18 residual blocks, 32 base filters, and a kernel size of 3. It includes a mixture-of-experts head made up of three expert modules and a gating mechanism, and it produces 512-dimensional embeddings from single-channel PPG segments. The model was pretrained on over 20 million unlabeled PPG segments, totaling more than 57,000 hours of recordings drawn entirely from publicly available datasets.

The model was evaluated under a frozen-feature linear-probing protocol on 20 tasks spanning 10 datasets, covering cardiovascular health, sleep disorders, pregnancy monitoring, and wellbeing. Against state-of-the-art time-series foundation models and self-supervised baselines, PaPaGei came out ahead on at least 14 of the 20 tasks, with average improvements of 6.3% in classification metrics and 2.9% in regression metrics, while remaining more data- and parameter-efficient than substantially larger competitors. The released implementation is in PyTorch under a BSD-3-Clause license.
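
As a rough illustration of what a mixture-of-experts head with three experts and a gate can look like, here is a generic soft-gating sketch in plain Python. It is an assumption-laden toy, not the released ResNet1DMoE code: the class name `SoftGatedMoE`, the linear experts, the softmax gate, and the small dimensions are all chosen for readability, and the text above does not specify how the actual head routes or combines its experts.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Apply a dense layer (a list of weight rows) to vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class SoftGatedMoE:
    """Generic soft mixture of experts: the gate weights each expert's output."""

    def __init__(self, in_dim, out_dim, num_experts=3, seed=0):
        rng = random.Random(seed)
        self.experts = [random_matrix(out_dim, in_dim, rng)
                        for _ in range(num_experts)]
        self.gate = random_matrix(num_experts, in_dim, rng)

    def forward(self, x):
        gate_weights = softmax(linear(self.gate, x))      # one weight per expert
        expert_outs = [linear(expert, x) for expert in self.experts]
        mixed = [sum(g * out[j] for g, out in zip(gate_weights, expert_outs))
                 for j in range(len(expert_outs[0]))]     # gate-weighted sum
        return mixed, gate_weights


# Small illustrative sizes; the real model works with 512-dimensional embeddings.
moe = SoftGatedMoE(in_dim=8, out_dim=4)
output, gates = moe.forward([0.5] * 8)
print(len(output), gates)
```

The gate weights sum to one, so the output is a convex combination of the three expert outputs. In PyTorch, the same structure would typically be expressed with `nn.Linear` layers.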

#Applications

PaPaGei targets researchers and developers building health-monitoring systems on top of wearable and clinical PPG sensors. Its frozen embeddings can be used directly as input features for downstream models predicting cardiovascular risk markers, sleep stages and disorders, pregnancy-related signals, and wellbeing indicators, removing the need to train PPG encoders from scratch for each new task. Because it functions as a general feature extractor and multimodal encoder, it is well suited to settings with limited labeled data, and its parameter efficiency makes it practical for resource-constrained or on-device deployment.
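
For the limited-label setting mentioned above, one simple way to use precomputed embeddings is a nearest-centroid classifier, which needs only a handful of labeled examples per class. This is a sketch under assumptions, not something the authors prescribe: the 4-dimensional vectors stand in for real embeddings, and the labels `"normal"` and `"elevated"` are hypothetical.

```python
import math
from collections import defaultdict


def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def fit_centroids(embeddings, labels):
    """Compute one centroid per class from the labeled embeddings."""
    by_label = defaultdict(list)
    for embedding, label in zip(embeddings, labels):
        by_label[label].append(embedding)
    return {label: centroid(vectors) for label, vectors in by_label.items()}


def predict(centroids, embedding):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], embedding))


# Toy stand-ins for frozen embeddings: two labeled examples per class.
labeled = [
    ([0.1, 0.2, 0.0, 0.1], "normal"),
    ([0.0, 0.1, 0.1, 0.2], "normal"),
    ([0.9, 0.8, 1.0, 0.9], "elevated"),
    ([1.0, 0.9, 0.8, 1.0], "elevated"),
]
centroids = fit_centroids([e for e, _ in labeled], [y for _, y in labeled])

prediction_a = predict(centroids, [0.8, 0.9, 0.9, 0.7])
prediction_b = predict(centroids, [0.2, 0.1, 0.1, 0.0])
print(prediction_a, prediction_b)
```

Because nothing is trained beyond the class means, this kind of baseline is a cheap first check of whether the embeddings separate the classes before investing in a larger downstream model.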

#Impact

By establishing an openly available foundation model for PPG, PaPaGei fills a notable gap in the digital-health and wearable-sensing ecosystem, where reproducible, pretrained backbones have lagged behind those for images, text, and protein sequences. Its demonstration that a compact, morphology-aware model can outperform much larger generic time-series models offers a template for domain-informed biosignal pretraining. Equally important, its explicit skin-tone bias benchmark brings fairness evaluation into the standard reporting for PPG models, an issue with direct consequences for equitable health monitoring. As an early open release in this space, its main limitations are the focus on a single signal modality and reliance on public datasets whose demographic coverage may constrain generalization.

Citation

PaPaGei: Open Foundation Models for Optical Physiological Signals

Preprint

Pillai, A., et al. (2024) PaPaGei: Open Foundation Models for Optical Physiological Signals. International Conference on Learning Representations.

DOI: 10.48550/arXiv.2410.20542


Citations

Total Citations: 70
Influential: 12
References: 70

GitHub

Stars: 164
Forks: 32
Open Issues: 3
Contributors: 3
Last Push: 11 months ago
Language: Python
License: BSD-3-Clause


Openness

bio.rodeo openness: Fully open · usable and reproducible
67 (Partial)
Usability (can I run it?): 83
Reproducibility (can I retrain it?): 59
Model Openness Framework: Class III (Open Model)

Tags

bias_evaluation, cardiovascular_health, cnn, foundation_model, mixture_of_experts, photoplethysmography, representation_learning, self_supervised
