Accelerometry foundation model distilled from PPG encoders on 20M minutes of wearable data to predict health biomarkers from any motion-sensing device.
Wrist-worn accelerometers are ubiquitous in consumer wearables, yet they are typically used for coarse tasks like step counting and activity recognition. The Apple Wearable Accelerometer Foundation Model, introduced by Abbaspourazad and colleagues at Apple in December 2024, asks a more ambitious question: can a single generalist model learn rich representations from raw acceleration signals that transfer to a wide range of downstream health prediction tasks? The paper's answer is yes. Crucially, it also shows that these representations can be imbued with cardiovascular information that is not directly observable in motion data alone.
The central innovation is cross-modal knowledge distillation. Photoplethysmography (PPG), the optical blood-volume sensor found in many smartwatches, carries direct signal about heart rate and heart rate variability, but it is power-hungry and not present on every device. Accelerometry, by contrast, is cheap, low-power, and nearly universal. By training an accelerometry encoder to match the embeddings of a pretrained PPG encoder, the model transfers physiological knowledge from the richer modality into the more accessible one, so that downstream biomarkers can be inferred from accelerometry alone.
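To make the idea of "matching the embeddings of a pretrained PPG encoder" concrete, here is a minimal sketch of one possible distillation objective. It assumes a contrastive (InfoNCE-style) loss in which each accelerometry embedding should be most similar to the PPG embedding from the same time segment; the text above does not spell out the exact loss, so this is an illustration, not the authors' implementation. The function names, the temperature value, and the toy random embeddings are all hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def contrastive_distillation_loss(student_emb, teacher_emb, temperature=0.1):
    """InfoNCE-style loss (assumed objective, not taken from the source).

    student_emb: (B, D) accelerometry embeddings from the trainable student.
    teacher_emb: (B, D) PPG embeddings from the frozen teacher; row i is the
                 same time segment as student row i.
    """
    s = l2_normalize(student_emb)
    t = l2_normalize(teacher_emb)
    logits = (s @ t.T) / temperature                 # (B, B) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The matching pair sits on the diagonal; maximize its log-probability.
    return float(-np.mean(np.diag(log_probs)))

# Toy data standing in for real encoder outputs (purely illustrative).
rng = np.random.default_rng(0)
teacher = rng.normal(size=(8, 16))
aligned_student = teacher + 0.01 * rng.normal(size=(8, 16))
unrelated_student = rng.normal(size=(8, 16))

print("aligned loss:  ", contrastive_distillation_loss(aligned_student, teacher))
print("unrelated loss:", contrastive_distillation_loss(unrelated_student, teacher))
```

In a real training loop the gradient of this loss would update only the accelerometry encoder while the PPG teacher stays fixed. A student whose embeddings track the teacher's should get a lower loss than one whose embeddings are unrelated.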
The work fits into the broader movement toward foundation models for wearable and biosignal data, alongside Apple's earlier PPG and ECG foundation models. It is pretrained on data from the Apple Heart and Movement Study, one of the largest longitudinal wearable cohorts assembled, positioning it as a generalist backbone for population-scale digital health.
The model is an accelerometry encoder trained via knowledge distillation against a PPG teacher encoder, using approximately 20 million minutes of unlabeled wearable recordings collected from about 172,000 participants. Pretraining is self-supervised in the sense that no human health labels are required; the supervisory signal comes entirely from the teacher's embeddings. After pretraining, the frozen accelerometry representations are evaluated on downstream tasks via lightweight probes. Relative to self-supervised and supervised accelerometry baselines, the distilled model delivers at least 23%–49% improved performance on predicting heart rate and heart rate variability, and achieves 99.2% top-1 accuracy on the cross-modal retrieval task of matching accelerometry embeddings to their corresponding PPG embeddings. The paper reports these results across multiple downstream health biomarkers, characterizing the embeddings as a generalist substrate rather than a single-task predictor.
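The cross-modal retrieval metric mentioned above can be sketched as follows. For each accelerometry embedding, the sketch finds the most similar PPG embedding in a gallery and counts a hit when it is the one from the same segment. Cosine similarity, the function name, and the toy data are assumptions made for illustration; the paper's exact evaluation protocol (gallery size, similarity measure) is not described here.

```python
import numpy as np

def top1_retrieval_accuracy(query_emb, gallery_emb):
    """Fraction of queries whose nearest gallery item shares their row index.

    query_emb:   (N, D) e.g. accelerometry embeddings.
    gallery_emb: (N, D) e.g. PPG embeddings; row i pairs with query row i.
    """
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    g = gallery_emb / np.linalg.norm(gallery_emb, axis=1, keepdims=True)
    similarity = q @ g.T                      # (N, N) cosine similarities
    predicted = similarity.argmax(axis=1)     # best gallery match per query
    return float(np.mean(predicted == np.arange(len(q))))

# Toy embeddings standing in for real encoder outputs (illustrative only).
rng = np.random.default_rng(0)
ppg = rng.normal(size=(100, 32))
accel = ppg + 0.05 * rng.normal(size=(100, 32))  # well-aligned student
print("top-1 accuracy:", top1_retrieval_accuracy(accel, ppg))
```

A high top-1 score on held-out data indicates that the two embedding spaces are closely aligned, which is the property the distillation step is meant to produce.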
The model targets continuous, passive health monitoring on consumer wearables. Because it extracts cardiovascular-informative features from accelerometry alone, it can extend heart-rate and HRV-style estimates to contexts where optical sensing is unavailable, unreliable, or too power-intensive — for example on simpler fitness trackers or during periods when the PPG sensor is gated to save battery. Researchers running large digital-health studies benefit from a reusable backbone that produces informative embeddings for many endpoints, reducing the labeled data needed for each new biomarker. More broadly, it illustrates a template for transferring knowledge from expensive, information-rich sensors to cheap, ubiquitous ones.
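As an illustration of how a frozen backbone could be reused for a new endpoint, the sketch below fits a small ridge-regression probe on top of fixed embeddings. The closed-form ridge solver, the regularization strength, and the synthetic "heart rate" target are assumptions chosen for brevity; they are not the probes or data used in the paper.

```python
import numpy as np

def fit_ridge_probe(embeddings, targets, l2=1e-3):
    """Fit a linear probe with L2 regularization on frozen embeddings.

    Returns (weights, bias). The encoder itself is never updated.
    """
    mean_x = embeddings.mean(axis=0)
    mean_y = targets.mean()
    xc = embeddings - mean_x          # center features so the bias is separate
    yc = targets - mean_y
    dim = xc.shape[1]
    weights = np.linalg.solve(xc.T @ xc + l2 * np.eye(dim), xc.T @ yc)
    bias = mean_y - mean_x @ weights
    return weights, bias

def predict(embeddings, weights, bias):
    """Apply the fitted probe to new embeddings."""
    return embeddings @ weights + bias

# Synthetic stand-in for frozen embeddings and a biomarker label.
rng = np.random.default_rng(0)
emb = rng.normal(size=(200, 8))
true_w = rng.normal(size=8)
label = emb @ true_w + 60.0 + 0.01 * rng.normal(size=200)

w, b = fit_ridge_probe(emb[:150], label[:150])
mae = np.mean(np.abs(predict(emb[150:], w, b) - label[150:]))
print("held-out mean absolute error:", mae)
```

Because only the probe's few parameters are learned, each additional biomarker needs far less labeled data than training a full model from scratch, which is the practical benefit described above.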
The work advances the case that wearable accelerometry, often treated as a low-value modality, can carry far more physiological information than its direct measurements suggest when paired with cross-modal distillation. It contributes to a growing family of Apple biosignal foundation models and to the wider effort to build generalist encoders for digital health. A key limitation for the research community is openness: as of publication, neither the model weights nor the training code were released, and the underlying Apple Heart and Movement Study data is not publicly available, so the results cannot be independently reproduced or directly built upon outside Apple. The contribution is therefore best read as a methodological demonstration of cross-modal distillation at population scale rather than a deployable open artifact.
Abbaspourazad, S., et al. (2024) Wearable Accelerometer Foundation Models for Health via Knowledge Distillation. arXiv.org.
DOI: 10.48550/arXiv.2412.11276