bio.rodeo

The authoritative source for evaluating biological foundation models. No hype, just honest analysis.

© 2026 Pulsatance. All rights reserved.
Built by Pulsatance

Proteina-Complexa

NVIDIA

Partially latent flow-matching generative model for de novo atomistic protein binder design against protein and small-molecule targets, with experimental validation at million-design scale.

Released: March 2026

Proteina-Complexa is a partially latent flow-matching generative model from NVIDIA's GenAIR group for fully atomistic de novo design of protein binders against protein and small-molecule targets. Released as an arXiv preprint in March 2026 alongside large-scale experimental validation results, the model extends NVIDIA's earlier La-Proteina architecture to multi-molecular conditioning and demonstrates inference-time compute scaling: design quality increases monotonically with the number of samples drawn. At million-design scale across 127 targets, it achieves a 68% target-level hit rate.

Notably, the validation campaign included the first reported de novo computational design of carbohydrate binders, addressing a long-standing target class that traditional protein-design tools have struggled with due to the geometric and chemical diversity of glycan ligands.

#Key Features

  • Partially latent flow matching: Generates sequence and full atomistic structure jointly through a flow-matching objective in a partially latent representation, balancing geometric fidelity against computational tractability.
  • Multi-modal conditioning: Conditions on arbitrary target molecules (proteins, peptides, small molecules, and carbohydrates) within a unified framework.
  • Inference-time compute scaling: Sampling more candidates monotonically improves hit rates, enabling tunable trade-offs between compute and design quality.
  • Million-design experimental validation: Validated at unprecedented scale (1 million designs against 127 targets in a single multiplexed yeast-display experiment), with 68% of targets receiving at least one validated binder.
  • First de novo carbohydrate binders: First reported computational design of de novo proteins binding carbohydrate ligands, expanding the addressable target space.
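
To make the inference-time scaling feature concrete, here is a minimal sketch under a simplifying assumption that does not come from the paper: every design for a given target succeeds independently with the same probability. The function name `prob_at_least_one_hit` and the per-design success probability of 0.001 are hypothetical and chosen only for illustration.

```python
def prob_at_least_one_hit(p_single: float, n_designs: int) -> float:
    """Probability that at least one of n independent designs is a binder.

    Assumes each design succeeds independently with probability p_single,
    which is an illustrative simplification, not the paper's model.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must lie in [0, 1]")
    if n_designs < 0:
        raise ValueError("n_designs must be non-negative")
    return 1.0 - (1.0 - p_single) ** n_designs


# Hypothetical sampling budgets and per-design success probability.
budgets = [1, 10, 100, 1_000, 10_000]
curve = [prob_at_least_one_hit(0.001, n) for n in budgets]

for n, p in zip(budgets, curve):
    print(f"{n:>6} designs -> P(at least one hit) = {p:.4f}")
```

Under this toy model the chance of finding a binder never decreases as the sampling budget grows, which is the qualitative behavior the feature list describes. Real designs are neither independent nor equally likely to succeed, so the curve should be read as intuition rather than a prediction.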

#Technical Details

Proteina-Complexa builds on the La-Proteina backbone (an ICLR 2026 paper) by adding target conditioning. The model represents proteins in a partially latent space: backbone frames are modeled explicitly at high precision, while atomic-level coordinates are captured through a learned encoder. Flow matching then transports samples from a prior distribution to the data distribution. Training uses the Protein Data Bank for protein-target complexes and an in-house protein-carbohydrate dataset for glycan binders.
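
The flow-matching step can be illustrated with a toy example. The sketch below is not the Proteina-Complexa implementation: it assumes the common linear interpolation path between a prior sample and a data sample, uses a "data distribution" that is a single hypothetical point, and replaces the learned network with the closed-form velocity field for that case. All names (`interpolate`, `euler_sample`, `toy_velocity`, `x_data`) are illustrative.

```python
import numpy as np


def interpolate(x0, x1, t):
    """Linear path x_t = (1 - t) * x0 + t * x1 and its regression target.

    In flow-matching training, a network is fitted to predict the
    velocity (x1 - x0) from the pair (x_t, t).
    """
    x_t = (1.0 - t) * x0 + t * x1
    target_velocity = x1 - x0
    return x_t, target_velocity


def euler_sample(velocity_fn, x0, n_steps=50):
    """Integrate dx/dt = velocity_fn(x, t) from t = 0 to t = 1 with Euler steps."""
    x = np.array(x0, dtype=float)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt  # t stays strictly below 1, so toy_velocity never divides by zero
        x = x + dt * velocity_fn(x, t)
    return x


# Toy data distribution: one fixed point (hypothetical values).
x_data = np.array([1.0, -2.0, 0.5])


def toy_velocity(x, t):
    """Exact velocity field for the linear path when all data sits at x_data."""
    return (x_data - x) / (1.0 - t)


rng = np.random.default_rng(0)
x_prior = rng.standard_normal(3)            # sample from a Gaussian prior
x_final = euler_sample(toy_velocity, x_prior)
print(np.allclose(x_final, x_data))
```

In a real model the velocity field is a neural network conditioned on the target molecule, and the state holds backbone and latent variables rather than a three-dimensional vector. The integration loop is the part that carries over.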

The validation experiment used a yeast-display assay multiplexed across 127 targets simultaneously, with sequencing readout to identify enriched designs. Hit rates are reported as the fraction of targets with at least one experimentally confirmed binder among the designs tested.
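
The target-level hit rate defined above can be computed as follows. This is a minimal sketch; the function name, the `(target_id, is_confirmed_binder)` record layout, and the example data are hypothetical and do not reflect the paper's actual data format or results.

```python
def target_level_hit_rate(results, all_targets):
    """Fraction of targets with at least one experimentally confirmed binder.

    results: iterable of (target_id, is_confirmed_binder) pairs, one per design.
    all_targets: every target screened, including those with no hits.
    """
    targets = set(all_targets)
    if not targets:
        raise ValueError("all_targets must not be empty")
    # A target counts once, no matter how many of its designs bind.
    targets_with_hit = {target for target, is_binder in results if is_binder}
    return len(targets_with_hit & targets) / len(targets)


# Made-up example: 4 targets, 6 tested designs.
example_results = [
    ("T1", False), ("T1", True),
    ("T2", False),
    ("T3", True), ("T3", True),
    ("T4", False),
]
rate = target_level_hit_rate(example_results, ["T1", "T2", "T3", "T4"])
print(f"target-level hit rate: {rate:.0%}")
```

Because the metric counts targets rather than designs, a target with many confirmed binders contributes the same as a target with exactly one. It therefore says nothing about the per-design success rate.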

#Applications

Proteina-Complexa is useful for early-stage therapeutic discovery in target classes that are hard to address with antibody or small-molecule modalities, including protein-protein interaction interfaces, carbohydrate binders for glycan-based diagnostics or therapeutics, and de novo binders against undruggable targets. The inference-time scaling property makes it well suited to compute-rich industrial settings, where researchers can spend more GPU hours to obtain higher hit rates.

#Impact

Proteina-Complexa establishes inference-time compute scaling as a useful axis in generative protein design and reports the most ambitious experimental validation campaign yet for a de novo binder design model, covering 127 targets in a single experiment. The carbohydrate-binding result opens a previously inaccessible target class to computational design. Paired with La-Proteina (its open-source backbone), Proteina-Complexa positions NVIDIA as a serious contributor in the generative protein design space alongside the Baker Lab (RFdiffusion) and Generate Biomedicines (Chroma).

Citation

Scaling Atomistic Protein Binder Design with Generative Pretraining and Test-Time Compute

Preprint

Didi, K., et al. (2026) Scaling Atomistic Protein Binder Design with Generative Pretraining and Test-Time Compute.

DOI: 10.48550/arXiv.2603.27950

Recent citations

Papers that recently cited this model.

  • Promera: a unified model for biomolecular structure prediction, filtering, and design

    Bowen Jing, Mihir Bafna, Daniel J. Diaz, et al.

    bioRxiv · Jun 2026

    0
  • Few-step Cofolding with All-Atom Flow Maps

    G. Scarpellini, Ron Shprints, Peter Holderrieth, et al.

    Jun 2026

    0
  • AlloGen: Conformation-Selective Binder Generation with Differential State Scoring

    Hanqun Cao, Z. Quinn, Aastha Pal, et al.

    Jun 2026

    0

Top citations

The most-cited papers that cite this model.

  • La-Proteina: Atomistic Protein Generation via Partially Latent Flow Matching

    Tomas Geffner, Kieran Didi, Zhonglin Cao, et al.

    arXiv.org · Jul 2025

    40
  • AlphaFold Database expands to proteome-scale quaternary structures

    Yewon Han, Maxim I. Tsenkov, N. Venanzi, et al.

    bioRxiv · Mar 2026

    7
  • Latent Generative Search unlocks de novo Design of Untapped Biomolecular Interactions at Scale

    Kieran Didi, Danny Reidenbach, Matthew Penner, et al.

    2 · Influential
  • Sharpen Your Flow: Sharpness-Aware Sampling for Flow Matching

    Aditi Gupta, Soon Hoe Lim, Annan Yu, et al.

    May 2026

    1
  • Few-step Cofolding with All-Atom Flow Maps

    G. Scarpellini, Ron Shprints, Peter Holderrieth, et al.

    Jun 2026

    0

Citations

Total Citations: 15
Influential: 0
References: 0

GitHub

Stars: 376
Forks: 65
Open Issues: 19
Contributors: 1
Last Push: 1mo ago
Language: Python

HuggingFace

Downloads: 216
Likes: 5
Last Modified: 3mo ago

Fields of citing research

  • Computer Science: 100%
  • Biology: 80%
  • Chemistry: 13%
  • Physics: 13%
  • Medicine: 13%
  • Materials Science: 13%
  • Mathematics: 7%

Share of papers citing this model.

Openness

bio.rodeo openness
Fully open · usable and reproducible
68 · Partial
Usability (can I run it?): 69
Reproducibility (can I retrain it?): 75
Model Openness Framework: Unclassified
Restrictive license on core components

Tags

all_atom, carbohydrate, de_novo_design, flow_matching, foundation_model, generative, protein_binder_design, protein_ligand_complex, small_molecule_binder_design, structure_generation, transformer

Resources

GitHub Repository · Research Paper · Official Website · HuggingFace Model