bio.rodeo

The authoritative source for evaluating biological foundation models. No hype, just honest analysis.

© 2026 Pulsatance. All rights reserved.

DERIVE

Guangzhou National Laboratory

A multimodal generative model that learns disentangled evolutionary representations to predict viral antigenic change, generalizing zero-shot across viral families.

Released: February 2026

DERIVE is a generative foundation model for predicting viral antigenic evolution — the process by which viruses change their surface proteins to escape host immunity. Anticipating antigenic change is central to vaccine design and surveillance, but it is hard to model because it depends on the interplay of evolutionary history, physicochemical properties, and protein structure. DERIVE addresses this by learning a disentangled latent representation that integrates these complementary signals into a single multimodal model.

The model's core idea is disentanglement: it separates the factors that drive antigenic change so that sequence-homology information, physicochemical features, and structural features each occupy interpretable parts of the latent space. This design enables cross-virus generalization — DERIVE is reported to transfer zero-shot to four viral families, predicting antigenic change for viruses beyond those seen during training. It was developed by Zhang, Lin, Zhong, Zhou, Li, and Yu at the Guangzhou National Laboratory and released as a February 2026 bioRxiv preprint.

DERIVE belongs to an emerging class of evolution-aware viral foundation models. Its emphasis on disentangled, multimodal representations and cross-family transfer distinguishes it from virus-specific antigenic prediction methods that must be retrained for each pathogen.

#Key Features

  • Disentangled latent representation: DERIVE separates evolutionary, physicochemical, and structural factors in its latent space, with the aim of making its representations of antigenic change easier to interpret and to transfer across viruses.
  • Multimodal integration: The model combines sequence homology with physicochemical and structural features rather than relying on sequence alone.
  • Zero-shot cross-virus transfer: DERIVE is reported to generalize to four viral families without retraining, predicting antigenic evolution for viruses unseen during training.
  • Generative formulation: A flow-based generative framework lets the model represent distributions over antigenic variation rather than producing only point predictions.
  • Surveillance-oriented design: By forecasting antigenic change across pathogens, the model targets the practical problem of anticipating immune escape for vaccine and monitoring efforts.
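
To make the disentangled-latent idea concrete, here is a minimal sketch. It is not DERIVE's actual architecture: it simply assumes a flat latent vector laid out as three contiguous blocks, one per factor. The block sizes in `LATENT_LAYOUT` and the helpers `split_latent` and `swap_block` are hypothetical names chosen for illustration only.

```python
from typing import Dict, List

# Hypothetical block sizes (assumption): the real latent dimensions are not given here.
LATENT_LAYOUT: Dict[str, int] = {
    "evolutionary": 4,
    "physicochemical": 3,
    "structural": 2,
}


def split_latent(z: List[float], layout: Dict[str, int] = LATENT_LAYOUT) -> Dict[str, List[float]]:
    """Split a flat latent vector into named, non-overlapping blocks."""
    expected = sum(layout.values())
    if len(z) != expected:
        raise ValueError(f"expected {expected} latent dims, got {len(z)}")
    blocks: Dict[str, List[float]] = {}
    start = 0
    for name, size in layout.items():
        blocks[name] = z[start:start + size]
        start += size
    return blocks


def swap_block(z_a: List[float], z_b: List[float], name: str,
               layout: Dict[str, int] = LATENT_LAYOUT) -> List[float]:
    """Return a copy of z_a whose `name` block is replaced by the one from z_b."""
    a = split_latent(z_a, layout)
    b = split_latent(z_b, layout)
    a[name] = b[name]
    # Reassemble the blocks in layout order.
    return [value for key in layout for value in a[key]]
```

In a well-disentangled space, an intervention such as `swap_block(z_a, z_b, "structural")` would be expected to change only the structural factor of the decoded output; whether DERIVE's latent space supports this kind of probing would need to be checked against the preprint.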

#Technical Details

DERIVE learns a disentangled latent representation by jointly modeling sequence homology together with physicochemical and structural features of viral proteins, using a flow-based generative framework to capture the distribution of antigenic change. The disentanglement is what enables cross-virus predictive modeling: by factoring the latent space into evolutionary and biophysical components, the model can apply patterns learned on some viruses to others. The preprint reports zero-shot generalization to four viral families, indicating transfer beyond the training pathogens. Full architectural specifications, training datasets, parameter counts, and quantitative benchmark results are detailed in the paper, which is released under a CC BY license. Because the work is a recent preprint, check with the authors whether code and trained weights are available.
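
As a rough illustration of what a flow-based generative formulation provides, the sketch below uses the simplest possible invertible map, a one-dimensional affine flow over a standard normal base distribution. This is a stand-in, not DERIVE's model: the class `AffineFlow` and its parameters are hypothetical, and the actual flow used in the preprint is not specified here. The point is only that an invertible map gives both an exact log-density (via the change-of-variables formula) and samples, rather than a single point prediction.

```python
import math
import random
from typing import List


class AffineFlow:
    """Toy 1-D flow: x = shift + exp(log_scale) * z, with z ~ N(0, 1)."""

    def __init__(self, shift: float, log_scale: float) -> None:
        self.shift = shift
        self.log_scale = log_scale

    def forward(self, z: float) -> float:
        """Map a base sample z to data space."""
        return self.shift + math.exp(self.log_scale) * z

    def inverse(self, x: float) -> float:
        """Map a data point x back to the base space."""
        return (x - self.shift) * math.exp(-self.log_scale)

    def log_prob(self, x: float) -> float:
        """Exact log-density: log p_z(z) + log|dz/dx|, where log|dz/dx| = -log_scale."""
        z = self.inverse(x)
        base_log_prob = -0.5 * (z * z + math.log(2.0 * math.pi))
        return base_log_prob - self.log_scale

    def sample(self, n: int, seed: int = 0) -> List[float]:
        """Draw n samples by pushing base noise through the flow."""
        rng = random.Random(seed)
        return [self.forward(rng.gauss(0.0, 1.0)) for _ in range(n)]


# Hypothetical usage: a distribution over a scalar "antigenic distance".
flow = AffineFlow(shift=1.5, log_scale=math.log(0.5))
samples = flow.sample(1000)
spread = max(samples) - min(samples)  # a crude summary of predictive spread
```

A real model would replace the affine map with a learned, conditional transformation, but the two operations shown here (density evaluation and sampling) are the ones a distributional prediction relies on.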

#Applications

DERIVE is intended for viral surveillance and vaccine development, where predicting antigenic change helps anticipate immune escape and guide strain selection. Its cross-family generalization makes it especially relevant for emerging or under-studied pathogens for which limited antigenic data exist, since the model can transfer knowledge from better-characterized viruses. Researchers tracking the evolution of respiratory and other rapidly evolving viruses could use DERIVE to prioritize variants of concern and to interpret which sequence, physicochemical, or structural changes are driving antigenic shifts.

#Impact

DERIVE contributes a disentangled, multimodal approach to a problem usually tackled with virus-specific models, and its reported zero-shot transfer across four viral families suggests a path toward more general antigenic-evolution forecasting. Coming from the Guangzhou National Laboratory, which focuses on respiratory and infectious disease, the work targets a problem of clear public-health relevance. Because the work is a February 2026 preprint, its results have not yet been independently validated, and the practical reliability of cross-family predictions, particularly for viruses very different from those in training, will require further external evaluation.
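
One way an external evaluation of cross-family transfer could be set up is a leave-family-out split, in which every record from the held-out viral families is excluded from training. The sketch below is an assumption about how such a split might look, not the preprint's protocol; the record fields (`"family"`, `"sequence"`) and the placeholder family names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Record = Dict[str, str]


def leave_family_out(records: Iterable[Record],
                     held_out: Iterable[str]) -> Tuple[List[Record], List[Record]]:
    """Split records so that held-out families appear only in the test set."""
    held = set(held_out)
    train: List[Record] = []
    test: List[Record] = []
    for record in records:
        if record["family"] in held:
            test.append(record)
        else:
            train.append(record)
    # Fail loudly if a requested held-out family has no records at all.
    missing = held - {record["family"] for record in test}
    if missing:
        raise ValueError(f"no records for held-out families: {sorted(missing)}")
    return train, test


# Placeholder data (not real sequences or real family names).
example_records = [
    {"family": "family_A", "sequence": "seq_1"},
    {"family": "family_A", "sequence": "seq_2"},
    {"family": "family_B", "sequence": "seq_3"},
    {"family": "family_C", "sequence": "seq_4"},
]
train_set, test_set = leave_family_out(example_records, held_out=["family_C"])
```

Splitting by family rather than by individual sequence avoids near-duplicate sequences leaking across the train/test boundary, which is the main way a zero-shot claim can be overstated.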

Tags

variant_effect_prediction, representation_learning, flow_matching, generative, multimodal, foundation_model, zero_shot, virology