Electronic health record (EHR) foundation models have demonstrated that self-supervised pretraining over longitudinal clinical sequences can learn transferable representations of patient health, much as language models learn from text. Most of these models, however, treat the EHR as the sole modality and ignore the genetic factors that shape disease risk. The Verily Multimodal EHR + Genomics Foundation Model, described by Amar and colleagues at Verily Life Sciences in an October 2025 preprint, addresses this gap by making genomics a first-class input alongside the clinical timeline.

The model's central contribution is the integration of polygenic risk scores (PRS) as a foundational data modality. Rather than appending genetic features as static covariates, the architecture fuses PRS into a GPT-2-style autoregressive EHR backbone through a cross-attention mechanism, allowing the model to condition its predictions about a patient's clinical trajectory on inherited risk. It is pretrained on participant data from the All of Us Research Program, a large and demographically diverse U.S. cohort, positioning the work as an early example of genomics-aware multimodal foundation models built on real-world, controlled-access biobank data.
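For readers unfamiliar with the input itself, a polygenic risk score is conventionally a weighted sum of effect-allele dosages, with weights taken from genome-wide association study effect sizes. The sketch below shows that standard calculation on made-up numbers. The variant IDs, weights, and function names are illustrative assumptions, not the scores or pipeline used by the authors.

```python
from statistics import mean, pstdev

# Hypothetical per-variant effect sizes (e.g. log odds ratios from a GWAS).
# The variant IDs and weights are invented for illustration only.
EFFECT_SIZES = {"var_a": 0.12, "var_b": -0.05, "var_c": 0.30}


def polygenic_risk_score(dosages, effect_sizes):
    """Weighted sum of allele dosages (0, 1, or 2 copies of the effect allele)."""
    return sum(beta * dosages.get(variant, 0) for variant, beta in effect_sizes.items())


def standardize(scores):
    """Z-score raw PRS values within a cohort so they share a common scale."""
    mu, sigma = mean(scores), pstdev(scores)
    return [(s - mu) / sigma for s in scores]


# Three synthetic participants with made-up genotypes.
cohort = [
    {"var_a": 2, "var_b": 0, "var_c": 1},
    {"var_a": 0, "var_b": 2, "var_c": 0},
    {"var_a": 1, "var_b": 1, "var_c": 2},
]
raw = [polygenic_risk_score(d, EFFECT_SIZES) for d in cohort]
z = standardize(raw)
```

In a multi-trait setting, each participant would carry one such standardized score per trait, and that vector of scores is the kind of input a fusion mechanism could attend over.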

Once pretrained, the model supports zero-shot disease prediction and can be adapted to downstream classification tasks through transfer learning without retraining the backbone. The authors emphasize Type 2 Diabetes prediction as a demonstration of how combining EHR context with genetic predisposition improves risk estimation over EHR-only baselines.
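Adapting a pretrained model without retraining the backbone is often done with a linear probe: embeddings are extracted once and only a small classifier is trained on top. The following is a minimal sketch of that pattern, assuming a hypothetical `frozen_backbone` function and synthetic patients; it is not the authors' transfer-learning code, and the source does not state which head they use.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def frozen_backbone(patient):
    """Stand-in for the pretrained model: maps a patient record to a fixed embedding.

    Hypothetical: a real backbone would return a learned hidden state. Here the
    'embedding' is two hand-made features so the example runs anywhere.
    """
    return [patient["n_events"] / 10.0, patient["prs_z"]]


def train_linear_probe(embeddings, labels, lr=0.5, epochs=500):
    """Fit logistic regression on fixed embeddings; the backbone is never updated."""
    dim, n = len(embeddings[0]), len(embeddings)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(embeddings, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        # Full-batch gradient descent step on the mean log-loss.
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Synthetic patients and labels, invented purely for illustration.
patients = [
    {"n_events": 3, "prs_z": -1.2},
    {"n_events": 5, "prs_z": -0.4},
    {"n_events": 12, "prs_z": 0.9},
    {"n_events": 15, "prs_z": 1.5},
]
labels = [0, 0, 1, 1]

X = [frozen_backbone(p) for p in patients]  # embeddings are computed once
w, b = train_linear_probe(X, labels)
probs = [predict_proba(w, b, x) for x in X]
```

Because only `w` and `b` are learned, the cost of a new task scales with the size of the probe rather than with the size of the backbone.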

Key Features

PRS as a foundational modality: Polygenic risk scores are integrated into the model via cross-attention rather than treated as auxiliary features, letting genetic predisposition directly inform clinical-sequence modeling.
GPT-2-style EHR backbone: A 155M-parameter autoregressive transformer models longitudinal EHR events, following the language-model paradigm adapted to structured clinical data.
Zero-shot disease prediction: After pretraining, the model can estimate disease risk for conditions such as Type 2 Diabetes without task-specific fine-tuning.
Transfer learning to downstream tasks: Learned representations can be reused for custom classification problems without retraining the full model, lowering the data and compute cost of new applications.
Built on a diverse biobank: Pretraining on the All of Us Research Program (~135,000 participants) grounds the model in a large, demographically varied real-world cohort.
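The source does not describe how zero-shot risk is read out of the model. One plausible approach for an autoregressive sequence model is Monte Carlo rollout: sample many future event sequences from the patient's history and report the fraction in which the target diagnosis appears. The sketch below illustrates that idea with a toy next-event rule. The token names, probabilities, and the `toy_next_event_probs` function are all hypothetical stand-ins for a trained model.

```python
import random

T2D = "T2D_DIAGNOSIS"  # hypothetical token name for the target condition
VOCAB = ["OFFICE_VISIT", "HBA1C_HIGH", T2D, "END"]


def toy_next_event_probs(history, prs_z):
    """Stand-in for the pretrained model's next-event distribution.

    Hypothetical rule: the chance of a T2D diagnosis rises after a high HbA1c
    event and with a higher PRS. Nothing here reflects learned weights.
    """
    p_t2d = 0.02 + (0.10 if "HBA1C_HIGH" in history else 0.0) + 0.03 * max(prs_z, 0.0)
    p_t2d = min(p_t2d, 0.5)
    p_end, p_hba1c = 0.10, 0.15
    p_visit = 1.0 - p_t2d - p_end - p_hba1c
    return [p_visit, p_hba1c, p_t2d, p_end]  # same order as VOCAB


def zero_shot_risk(history, prs_z, n_rollouts=2000, horizon=20, seed=0):
    """Estimate risk as the share of sampled futures that contain the target event."""
    rng = random.Random(seed)
    hits = 0
    for _rollout in range(n_rollouts):
        trajectory = list(history)
        for _step in range(horizon):
            weights = toy_next_event_probs(trajectory, prs_z)
            event = rng.choices(VOCAB, weights=weights)[0]
            if event == T2D:
                hits += 1
                break
            if event == "END":
                break
            trajectory.append(event)
    return hits / n_rollouts


low = zero_shot_risk(["OFFICE_VISIT"], prs_z=-1.0)
high = zero_shot_risk(["OFFICE_VISIT", "HBA1C_HIGH"], prs_z=2.0)
```

No task-specific training happens here: the estimate comes entirely from sampling the generative model, which is what makes the prediction zero-shot.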

Technical Details

The architecture couples a GPT-2-style autoregressive transformer (approximately 155 million parameters), which encodes the longitudinal EHR, with a cross-attention pathway that injects polygenic risk scores into the model's representations. This design lets genetic signal modulate predictions across the patient timeline instead of acting as a one-time input. Pretraining uses data from roughly 135,000 All of Us participants who have both linked EHR records and genomic data. The authors evaluate the model on disease prediction, highlighting Type 2 Diabetes, and on transfer-learning setups where pretrained representations are adapted to new classification targets; reported results indicate that adding the PRS modality improves predictive performance relative to EHR-only configurations. The work is a Verily project and is associated with a Verily–NVIDIA collaboration on precision-health AI.
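To make the fusion step concrete, here is a minimal single-head cross-attention sketch in plain Python. It assumes that EHR token states act as queries and PRS embeddings act as keys and values, with a residual connection back onto the EHR states. The source does not spell out these details, so the direction of attention, the dimensions, and the identity projection matrices are illustrative assumptions.

```python
import math


def matvec(matrix, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(ehr_states, prs_embeddings, w_q, w_k, w_v):
    """Single-head cross-attention with a residual connection.

    Assumption: queries come from EHR token states, keys and values from PRS
    embeddings. The value dimension must match the EHR state dimension so the
    residual addition is well defined.
    """
    keys = [matvec(w_k, p) for p in prs_embeddings]
    values = [matvec(w_v, p) for p in prs_embeddings]
    scale = math.sqrt(len(keys[0]))
    fused, attn_maps = [], []
    for state in ehr_states:
        q = matvec(w_q, state)
        weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
        context = [
            sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))
        ]
        fused.append([s + c for s, c in zip(state, context)])
        attn_maps.append(weights)
    return fused, attn_maps


# Toy inputs: three EHR token states and two PRS embeddings, all hypothetical.
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
ehr_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
prs_embeddings = [[2.0, 0.0], [0.0, 2.0]]
fused, attn = cross_attention(ehr_states, prs_embeddings, IDENTITY, IDENTITY, IDENTITY)
```

Every EHR position receives its own attention weights over the PRS embeddings, which is how genetic signal can influence each point on the timeline rather than entering once as a static covariate.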

Applications

The model is aimed at precision-health and clinical research settings where both phenotypic history and genetic risk are available. Potential use cases include risk stratification for common polygenic conditions, cohort enrichment for clinical studies, and use as a pretrained backbone that downstream teams can adapt to specialized prediction tasks with limited labeled data. Because it learns from linked EHR and genomic records, it is particularly relevant to biobank-scale research programs and health systems exploring genomics-informed decision support.

Impact

The model illustrates a broader trend toward multimodal foundation models that unify clinical and molecular data, and it is among the early efforts to treat polygenic risk as a native modality within an EHR foundation model rather than a bolt-on feature. Its reliance on the controlled-access All of Us Research Program is, however, a key limitation for reproducibility and reuse: because the model is trained on restricted participant data, neither the model weights nor the training code is publicly available. As a result, the work is best read as a methodological demonstration of genomics–EHR fusion whose independent validation and external adoption will depend on future releases or replication on accessible cohorts.

Citation

Integrating Genomics into Multimodal EHR Foundation Models
