A multimodal wearable foundation model trained on 40M hours of six-sensor data from 165K+ people, establishing scaling laws for physiological sensor signals.
LSM (Large Sensor Model) is a multimodal foundation model for consumer wearable data, developed by researchers at Google in collaboration with the University of Washington, MIT, and USC, and presented at ICLR 2025 ("Scaling Wearable Foundation Models," arXiv 2410.13638, October 2024). It addresses a gap in the foundation-model landscape: while text, image, and biological-sequence models have been scaled extensively, the continuous physiological and behavioral signals streamed by wrist-worn devices had not been studied at comparable scale. LSM treats these heterogeneous sensor channels as a single multimodal substrate for self-supervised pretraining.
The model is trained on what the authors describe as the largest wearable-signals dataset assembled to date, comprising up to 40 million hours of per-minute data drawn from more than 165,000 individuals wearing Fitbit and Pixel Watch devices. Six modalities are combined: heart rate, heart rate variability, electrodermal activity, accelerometer, skin temperature, and altimeter. Rather than building a bespoke model per sensor or per task, LSM learns a shared representation that transfers across modalities and downstream objectives.
A central contribution is empirical: the work characterizes how performance improves with data, compute, and model size, establishing scaling laws for the wearable-signal domain analogous to those documented for language and vision. This gives the field a principled basis for investing in larger sensor models and datasets, and it demonstrates that the data-efficiency benefits seen in other modalities also hold for noisy, sparsely sampled physiological time series.
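To make the idea of a fitted scaling law concrete, the following is a minimal sketch of how such a curve could be estimated. It assumes the power-law form loss = a * x^(-b) that is commonly used in language and vision scaling studies; the functional form, the function name `fit_power_law`, and all numbers are illustrative assumptions, not the paper's fitting procedure or reported results.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**(-b) by ordinary least squares in log-log space.

    Taking logs gives log(y) = log(a) - b * log(x), a straight line,
    so the slope and intercept of that line recover b and a.
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((x - mean_x) ** 2 for x in log_x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# SYNTHETIC points generated from a known power law (a=2.0, b=0.1).
# These are NOT measurements from the LSM paper; they only exercise the fit.
hours = [1e4, 1e5, 1e6, 1e7]
loss = [2.0 * h ** -0.1 for h in hours]

a, b = fit_power_law(hours, loss)

# Extrapolate the fitted curve to a larger (hypothetical) dataset size.
predicted_loss_at_40m_hours = a * (4e7 ** -b)
```

In practice the same fit would be repeated along each axis of interest (data, compute, model size), and the extrapolated value is only as trustworthy as the assumption that the power law continues to hold beyond the measured range.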
LSM is a transformer-based model pretrained in a self-supervised, masked-modeling fashion over multimodal sensor windows, learning to reconstruct masked regions across both time and sensor channels. Inputs are minute-resolution streams from the six wearable modalities, drawn from Fitbit and Pixel Watch data. The authors sweep dataset size (up to 40M hours), model size, and compute to fit scaling curves, and they evaluate the learned representations on discriminative tasks (exercise and activity recognition) and generative tasks (imputation across time gaps, interpolation, and extrapolation, including cross-modal reconstruction of a withheld sensor). Across these settings, larger models trained on more data yield better and more sample-efficient downstream behavior, consistent with the fitted scaling laws.
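The masking patterns behind the pretraining objective and the generative evaluations can be sketched as boolean masks over a (minute, channel) grid. This is a simplified illustration rather than the authors' implementation: the 60-minute window, the cell-level masking granularity, the 50% mask ratio, and all function and channel names are hypothetical choices made for the example.

```python
import random

# Hypothetical channel names for the six modalities described above.
CHANNELS = ["heart_rate", "hrv", "eda", "accelerometer", "skin_temp", "altimeter"]


def empty_mask(num_minutes, num_channels):
    """Grid of False values; True will mean 'hidden from the model'."""
    return [[False] * num_channels for _ in range(num_minutes)]


def random_mask(num_minutes, num_channels, ratio, seed=0):
    """Hide a fixed fraction of cells at random (pretraining-style masking)."""
    rng = random.Random(seed)
    cells = [(t, c) for t in range(num_minutes) for c in range(num_channels)]
    mask = empty_mask(num_minutes, num_channels)
    for t, c in rng.sample(cells, int(round(ratio * len(cells)))):
        mask[t][c] = True
    return mask


def time_span_mask(num_minutes, num_channels, start, stop):
    """Hide every channel for minutes in [start, stop).

    An interior span models a gap to impute; a trailing span models
    extrapolation into the future.
    """
    mask = empty_mask(num_minutes, num_channels)
    for t in range(start, stop):
        mask[t] = [True] * num_channels
    return mask


def channel_mask(num_minutes, num_channels, channel_index):
    """Hide one whole sensor (cross-modal reconstruction of a withheld channel)."""
    mask = empty_mask(num_minutes, num_channels)
    for t in range(num_minutes):
        mask[t][channel_index] = True
    return mask


def masked_mse(target, prediction, mask):
    """Mean squared error computed over hidden cells only."""
    errors = [
        (target[t][c] - prediction[t][c]) ** 2
        for t in range(len(mask))
        for c in range(len(mask[t]))
        if mask[t][c]
    ]
    return sum(errors) / len(errors)


minutes = 60  # assumed window length, for illustration only
pretrain = random_mask(minutes, len(CHANNELS), ratio=0.5)
gap = time_span_mask(minutes, len(CHANNELS), start=20, stop=30)
forecast = time_span_mask(minutes, len(CHANNELS), start=50, stop=minutes)
no_eda = channel_mask(minutes, len(CHANNELS), CHANNELS.index("eda"))

# Toy check of the loss: an all-zero prediction against an all-one target.
target = [[1.0] * len(CHANNELS) for _ in range(minutes)]
prediction = [[0.0] * len(CHANNELS) for _ in range(minutes)]
eda_error = masked_mse(target, prediction, no_eda)
```

Scoring reconstruction error only on hidden cells is what lets a single objective cover random masking, temporal gaps, forecasting, and a withheld sensor: only the mask changes between settings.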
LSM targets consumer health and behavioral sensing: its representations can be fine-tuned with relatively few labels for activity and exercise recognition, making it useful for fitness tracking, sleep and stress monitoring, and digital health research that relies on passively collected wearable signals. Its generative imputation and extrapolation capabilities are valuable for handling the gaps, dropped channels, and missing readings that are pervasive in real-world wearable data, supporting more robust downstream analytics and population health studies built on noisy device streams.
LSM is among the first works to systematically establish scaling laws for wearable physiological signals, providing evidence that the foundation-model paradigm extends to multimodal sensor data and offering the field a roadmap for scaling data and model size. Its demonstration of sample-efficient transfer is particularly consequential for health applications, where labeled outcomes are scarce and expensive. A key limitation for external researchers is openness: the model weights and code have not been released, and the underlying dataset is proprietary, so the results are not independently reproducible and the model cannot be directly reused. The work spawned a follow-up, LSM-2, which builds on this original model.
Narayanswamy, G., et al. (2024) Scaling Wearable Foundation Models. International Conference on Learning Representations.
DOI: 10.48550/arXiv.2410.13638