University of California, San Diego
A multimodal foundation model for wearable physiological sensing (PPG, ECG, EEG, GSR, IMU) using channel-aware attention to learn generalizable representations.
NormWear is a multimodal foundation model for wearable physiological sensing, designed to extract generalizable representations from the heterogeneous time-series signals produced by consumer and clinical wearables. Wearable devices capture diverse modalities, including photoplethysmography (PPG), electrocardiography (ECG), electroencephalography (EEG), galvanic skin response (GSR), and inertial measurement (IMU), but real-world data varies widely in sensor placement, sampling rate, channel availability, and device configuration. Most prior models were trained narrowly for a single signal type or task, which limits transfer. NormWear targets this fragmentation by learning a shared representation space that is compatible with arbitrary combinations of sensors and channels.
Developed by Yunfei Luo, Yuliang Chen, Asif Salekin, and Tauhidur Rahman at the MOSAIC mobile-sensing lab at the University of California, San Diego, NormWear was first released as a preprint in December 2024 and has since been accepted (in press, 2026) at ACM Transactions on Computing for Healthcare. It is presented as the first general-purpose foundation model spanning this breadth of wearable modalities, positioning it alongside emerging biosignal foundation models while emphasizing cross-sensor generality rather than single-modality specialization.
The central technical contribution is a channel-aware attention mechanism paired with a shared liaison [CLS] token, which lets the model reason both within individual sensor channels and across multiple sensors simultaneously, regardless of how many channels a given downstream dataset provides.
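To make the two levels of reasoning concrete, here is a minimal NumPy sketch of one way such a block could be arranged. It is an illustration under stated assumptions, not the released implementation: it assumes each channel carries its own copy of the [CLS] token at position 0, uses single-head attention with identity projections, and the names `self_attention` and `channel_aware_block` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens):
    # Single-head scaled dot-product self-attention.
    # Assumption: identity query/key/value projections, to keep the sketch short.
    dim = tokens.shape[-1]
    scores = tokens @ np.swapaxes(tokens, -1, -2) / np.sqrt(dim)
    return softmax(scores, axis=-1) @ tokens

def channel_aware_block(x):
    # x has shape (n_channels, 1 + n_patches, dim); x[:, 0] is each channel's [CLS].
    # Step 1 (intra-channel): every channel attends over its own tokens only.
    x = x + self_attention(x)
    # Step 2 (inter-channel): only the [CLS] tokens attend to each other,
    # so they act as the liaison that carries information between channels.
    cls = x[:, 0, :]
    cls = cls + self_attention(cls)
    out = x.copy()
    out[:, 0, :] = cls
    return out

rng = np.random.default_rng(0)
three_channels = rng.normal(size=(3, 6, 8))  # 3 channels, [CLS] + 5 patches, dim 8
one_channel = rng.normal(size=(1, 6, 8))     # the same block accepts a single channel
print(channel_aware_block(three_channels).shape)
print(channel_aware_block(one_channel).shape)
```

Because the cross-channel step only touches the [CLS] positions, the same weights apply unchanged whether a dataset exposes one channel or many, which is the property the paragraph above describes.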
NormWear is a transformer-based encoder (paired with a decoder during masked pretraining) of roughly 0.2B parameters. Inputs are converted to continuous wavelet transform (CWT) scalograms and tokenized into patches, yielding 768-dimensional patch embeddings and about 365 patches per channel. Pretraining uses masked signal reconstruction as the self-supervised objective, with random or structured masking applied along the time and frequency dimensions. The channel-aware attention layers operate over the per-channel patch sequences, while the shared [CLS] token aggregates information across channels, so the architecture adapts to whatever sensor set a downstream dataset exposes. The model was benchmarked on 11 public wearable-sensing datasets spanning 18 downstream applications in four domains (mental health, body-state inference, vital-sign estimation, and disease-risk evaluation) under zero-shot, partial-shot, and full-shot protocols, where the authors report consistent improvements over state-of-the-art baselines.
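The input pipeline described above (scalogram, patches, masking, reconstruction loss) can be sketched in a few lines of NumPy. This is a toy illustration, not the model's preprocessing code: the Morlet wavelet, the scales, the 64 Hz sampling rate, the 8x16 patch size, and the 75% mask ratio are all assumptions chosen for readability, and the helper names are hypothetical. Only the 768-dimensional embedding width comes from the description above.

```python
import numpy as np

def morlet_scalogram(signal, scales, w0=6.0):
    # Magnitude of a Morlet-wavelet CWT computed by direct convolution.
    # Returns an array of shape (len(scales), len(signal)).
    rows = []
    for s in scales:
        half = int(np.ceil(4 * s))
        t = np.arange(-half, half + 1) / s
        wavelet = np.exp(1j * w0 * t) * np.exp(-0.5 * t**2) / np.sqrt(s)
        full = np.convolve(signal, wavelet, mode="full")
        rows.append(np.abs(full[half:half + len(signal)]))  # centered crop
    return np.stack(rows)

def patchify(image, patch_h, patch_w):
    # Split a (H, W) scalogram into flattened, non-overlapping patches.
    H, W = image.shape
    if H % patch_h or W % patch_w:
        raise ValueError("scalogram size must be divisible by the patch size")
    p = image.reshape(H // patch_h, patch_h, W // patch_w, patch_w)
    return p.transpose(0, 2, 1, 3).reshape(-1, patch_h * patch_w)

def random_mask(n_patches, mask_ratio, rng):
    # Boolean mask; True marks patches hidden from the encoder.
    n_masked = int(round(mask_ratio * n_patches))
    mask = np.zeros(n_patches, dtype=bool)
    mask[rng.choice(n_patches, size=n_masked, replace=False)] = True
    return mask

def masked_mse(pred, target, mask):
    # Reconstruction loss evaluated on the masked patches only.
    return float(np.mean((pred[mask] - target[mask]) ** 2))

rng = np.random.default_rng(0)
t = np.arange(256) / 64.0                      # 4 s at an assumed 64 Hz
signal = np.sin(2 * np.pi * 1.2 * t)           # synthetic pulse-like waveform
scalogram = morlet_scalogram(signal, scales=np.arange(1, 17))  # (16, 256)
patches = patchify(scalogram, patch_h=8, patch_w=16)           # (32, 128)
mask = random_mask(len(patches), mask_ratio=0.75, rng=rng)
# Untrained random projection standing in for the learned patch embedding.
embed = rng.normal(scale=0.02, size=(patches.shape[1], 768))
tokens = patches @ embed                                       # (32, 768)
print(scalogram.shape, patches.shape, tokens.shape, int(mask.sum()))
```

In a real pretraining loop the encoder would see only the unmasked tokens and the decoder's output would be passed to `masked_mse`; the toy sizes here produce 32 patches rather than the roughly 365 per channel quoted above.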
NormWear is intended for researchers and developers building health-inference pipelines on wearable data. Its representations support mental-health assessment, body-state inference (such as activity or stress states), vital-sign estimation (for example heart-rate or respiration-related targets), and disease-risk evaluation. Because the model accepts heterogeneous sensor inputs and supports zero-shot text-aligned classification, it lowers the barrier for teams with small or label-scarce datasets to bootstrap a task without training from scratch, and it provides a common backbone for benchmarking new wearable health applications.
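As a rough illustration of what zero-shot text-aligned classification can look like at inference time, the sketch below picks the label whose text embedding is closest to a signal embedding. It assumes both embeddings already live in a shared space and that cosine similarity is the scoring rule; neither detail is taken from the model's documentation. The function `zero_shot_classify` and the three-dimensional toy vectors are hypothetical stand-ins for real encoder outputs.

```python
import numpy as np

def zero_shot_classify(signal_emb, label_embs, labels):
    # Score each candidate label by cosine similarity to the signal embedding.
    s = signal_emb / np.linalg.norm(signal_emb)
    l = label_embs / np.linalg.norm(label_embs, axis=1, keepdims=True)
    sims = l @ s
    return labels[int(np.argmax(sims))], sims

# Toy, hand-written embeddings (assumption: stand-ins for encoder outputs).
labels = ["stressed", "relaxed"]
label_embs = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]])
signal_emb = np.array([0.9, 0.2, 0.1])

predicted, sims = zero_shot_classify(signal_emb, label_embs, labels)
print(predicted)  # stressed
```

No labeled examples are needed for the new task: changing the task only means changing the list of label descriptions that get embedded.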
NormWear advances the case for general-purpose foundation models in the biosignal domain, where most prior work was siloed by signal type. By demonstrating that one channel-aware encoder can transfer across five modalities and 18 tasks, it offers a reusable starting point for the mobile-health community. The code is released under Apache-2.0, with pretrained weights distributed via Hugging Face and a GitHub release, supporting reproduction and extension. As a recent release, its long-term adoption is still emerging, and reported gains depend on the specific dataset and evaluation regime; broader independent validation across devices and populations will determine how well its generality holds in deployment.
Luo, Y., et al. (2024) Toward Foundation Model for Multivariate Wearable Sensing of Physiological Signals. ACM Transactions on Computing for Healthcare.
DOI: 10.48550/arXiv.2412.09758
Luo, Y., et al. (2024) Toward Foundation Model for Multivariate Wearable Sensing of Physiological Signals. ACM Transactions on Computing for Healthcare.
DOI: 10.1145/3803808