Weizmann Institute of Science / Mohamed bin Zayed University of Artificial Intelligence / NVIDIA
A generative transformer foundation model for continuous glucose monitoring data, pretrained on 10M+ CGM measurements to forecast glycemia and stratify long-term health risk.
GluFormer is a generative foundation model for continuous glucose monitoring (CGM) data, developed by Guy Lutsker, Eran Segal, and colleagues at the Weizmann Institute of Science in collaboration with MBZUAI and NVIDIA. It was first posted as an arXiv preprint in August 2024 and published in Nature in January 2026. CGM sensors produce dense, multi-day glucose time series that are increasingly central to diabetes care and metabolic-health research, yet most analyses still reduce these rich signals to a handful of hand-crafted summary statistics such as time-in-range or the Glucose Management Indicator. GluFormer instead learns general-purpose representations directly from raw glucose traces and transfers them across cohorts, devices, and clinical tasks.
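For context, the two summary statistics named above can each be computed in a few lines. The following is a minimal sketch, assuming the conventional 70 to 180 mg/dL target range for time-in-range and the commonly cited GMI regression (GMI % = 3.31 + 0.02392 × mean glucose in mg/dL). The function names and sample readings are hypothetical and are not taken from the GluFormer paper or repository.

```python
from statistics import mean


def time_in_range(glucose_mg_dl, low=70.0, high=180.0):
    """Fraction of readings that fall within [low, high] mg/dL."""
    if not glucose_mg_dl:
        raise ValueError("need at least one glucose reading")
    in_range = sum(1 for g in glucose_mg_dl if low <= g <= high)
    return in_range / len(glucose_mg_dl)


def glucose_management_indicator(glucose_mg_dl):
    """GMI (%) estimated from mean glucose in mg/dL."""
    return 3.31 + 0.02392 * mean(glucose_mg_dl)


# Hypothetical readings in mg/dL, for illustration only.
readings = [95, 110, 150, 185, 200, 130, 65, 100]
tir = time_in_range(readings)
gmi = glucose_management_indicator(readings)
print(f"time in range: {tir:.1%}, GMI: {gmi:.2f}%")
```

Both metrics collapse an entire trace into a single number, which is the loss of temporal detail that learned representations aim to avoid.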
The model's defining contribution is its scale and demonstrated generalization. It is pretrained on more than 10 million CGM measurements from 10,812 adults—largely without diabetes—drawn from the Human Phenotype Project cohort, then evaluated on 19 external cohorts totaling 6,044 participants spanning multiple countries, ethnicities, CGM devices, and pathophysiological states including prediabetes, type 1 and type 2 diabetes, gestational diabetes, and obesity. This breadth establishes GluFormer as one of the first CGM models to show that a single pretrained backbone can transfer robustly across heterogeneous glucose data sources.
GluFormer sits at the head of a small but growing family of CGM foundation models—including GlucoFM and predictive self-supervised approaches—that bring the pretrain-then-transfer paradigm to wearable metabolic biosignals. Its distinguishing feature is the pairing of a purely generative, autoregressive pretraining objective with extensive external validation tied directly to long-term health outcomes.
GluFormer is a transformer-based foundation model trained in a generative, autoregressive fashion via next-token prediction over tokenized CGM signals, capturing longitudinal glucose dynamics from raw sensor traces. Pretraining used over 10 million glucose measurements from 10,812 adults in the Human Phenotype Project, run on NVIDIA AI infrastructure. Across downstream evaluations, the learned representations consistently outperformed baseline fasting glucose, HbA1c, and standard CGM-derived metrics for forecasting glycemic parameters, and they predicted clinical endpoints more effectively than HbA1c. The published Nature paper does not disclose an exact parameter count or context length, so those specifics are omitted here. An official implementation is released under the Apache-2.0 license, though pretrained weights are not currently distributed in the public repository.
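To make the next-token objective concrete, the sketch below shows one way a glucose trace could be discretized into tokens and split into (context, target) training pairs. It is an illustration under stated assumptions, not the authors' implementation: the 40 to 500 mg/dL clipping range, the 1 mg/dL bin width, the context length of 3, and all function names are hypothetical choices made for this example.

```python
# Assumed (hypothetical) tokenization range; not taken from the paper.
LO_MG_DL = 40
HI_MG_DL = 500
VOCAB_SIZE = HI_MG_DL - LO_MG_DL + 1  # one token per 1 mg/dL bin


def tokenize(glucose_mg_dl, lo=LO_MG_DL, hi=HI_MG_DL):
    """Clip each reading to [lo, hi] and map it to an integer token id."""
    return [int(round(min(max(g, lo), hi))) - lo for g in glucose_mg_dl]


def next_token_pairs(tokens, context_len):
    """Build (context, target) pairs for autoregressive training.

    Each target is the token that immediately follows its context window.
    """
    return [
        (tokens[i - context_len:i], tokens[i])
        for i in range(context_len, len(tokens))
    ]


# Hypothetical trace in mg/dL; the last two values fall outside the range
# and are clipped to the lowest and highest bins respectively.
trace = [100, 104, 111, 120, 35, 510]
tokens = tokenize(trace)
pairs = next_token_pairs(tokens, context_len=3)
for context, target in pairs:
    print(context, "->", target)
```

In an actual training setup, a transformer would be optimized with a cross-entropy loss to predict each target token from its context; that step is omitted here because the architecture details are not disclosed.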
GluFormer is aimed at researchers and clinicians working with CGM data in diabetes and metabolic-health settings. Its pretrained representations can serve as a shared backbone for downstream tasks such as forecasting future glucose, characterizing glycemic control, and stratifying individuals by diabetes and cardiovascular risk—potentially reducing the labeled data needed to build each new predictor. Because the model transfers across cohorts and sensor hardware, it is well suited to studies that pool heterogeneous CGM sources, and its multimodal dietary extension points toward personalized nutrition tools that anticipate an individual's response to specific meals.
Published in Nature, GluFormer is a flagship example of extending the foundation-model paradigm into wearable metabolic monitoring, shifting CGM analysis from hand-crafted summary metrics toward learned, transferable representations. Its unusually broad external validation—linking glucose embeddings to multi-year diabetes and cardiovascular outcomes—offers some of the strongest evidence to date that CGM foundation models can carry clinically meaningful signal. Key limitations include the lack of publicly released pretrained weights and undisclosed architectural specifics, which constrain independent reproduction, and a pretraining cohort drawn largely from a single phenotyping study, leaving open questions about scaling to still larger and more diverse populations.
Lutsker, G., et al. (2024) From Glucose Patterns to Health Outcomes: A Generalizable Foundation Model for Continuous Glucose Monitor Data Analysis. arXiv.org. DOI: 10.48550/arXiv.2408.11876
Lutsker, G., et al. (2026) A foundation model for continuous glucose monitoring data. Nature. DOI: 10.1038/s41586-025-09925-9