bio.rodeo

The authoritative source for evaluating biological foundation models. No hype, just honest analysis.

© 2026 Pulsatance. All rights reserved.

La-Proteina

NVIDIA

Partially latent flow-matching model for joint generation of protein amino-acid sequence and full atomistic structure (backbone plus side chains) for proteins up to 800 residues.

Released: January 2026

La-Proteina is a partially latent flow-matching generative model from NVIDIA's GenAIR group for joint generation of protein amino-acid sequence and full atomistic structure (backbone plus side chains). Originally announced as an ICLR 2026 paper and released publicly on GitHub in early 2026, La-Proteina is the architectural backbone underlying NVIDIA's later Proteina-Complexa target-conditioned binder design system.

The model generates proteins of up to 800 residues, producing the sequence and the complete atomic coordinates jointly. This extends unconditional all-atom protein generation into a length regime where most prior baselines fail to produce designable, foldable proteins.

#Key Features

  • Joint sequence-structure generation: Produces the amino-acid sequence and full atomistic structure (backbone plus side chains) together in a single generative pass.
  • Up to 800 residues: Handles substantially longer proteins than prior all-atom generative baselines, which typically saturate at around 200 to 300 residues.
  • Partially latent flow matching: Combines explicit and latent representations to balance geometric fidelity against computational tractability.
  • All-atom output: Side chains are generated alongside the backbone, removing the need for a separate sequence-design pipeline.
  • Architectural backbone for Proteina-Complexa: Same architecture, with target conditioning, underlies NVIDIA's binder design system.
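The "partially latent" idea in the list above can be pictured as a container with two parts per residue: an explicit 3D coordinate and a learned latent vector. The following is a minimal sketch, not NVIDIA's implementation. The class name, the field names, the latent dimension of 8, and the linear `toy_decode_sequence` stand-in for a learned decoder are all hypothetical.

```python
from dataclasses import dataclass

import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


@dataclass
class PartiallyLatentProtein:
    """Hypothetical container: explicit coarse backbone plus per-residue latents."""

    backbone_coords: np.ndarray  # (L, 3), one explicit Cartesian point per residue
    latents: np.ndarray          # (L, d), per-residue latent vectors

    def __post_init__(self) -> None:
        if self.backbone_coords.ndim != 2 or self.backbone_coords.shape[1] != 3:
            raise ValueError("backbone_coords must have shape (L, 3)")
        if self.latents.ndim != 2 or self.latents.shape[0] != self.backbone_coords.shape[0]:
            raise ValueError("latents must have shape (L, d) with matching L")

    @property
    def num_residues(self) -> int:
        return self.backbone_coords.shape[0]


def sample_prior(num_residues: int, latent_dim: int, rng: np.random.Generator) -> PartiallyLatentProtein:
    """Draw both parts from a standard Gaussian, the starting point for generation."""
    return PartiallyLatentProtein(
        backbone_coords=rng.standard_normal((num_residues, 3)),
        latents=rng.standard_normal((num_residues, latent_dim)),
    )


def toy_decode_sequence(protein: PartiallyLatentProtein, weights: np.ndarray) -> str:
    """Stand-in decoder (assumption): a linear map from latents to residue logits."""
    logits = protein.latents @ weights  # (L, 20)
    return "".join(AMINO_ACIDS[i] for i in logits.argmax(axis=1))
```

In a real system the decoder would be a trained network that also reconstructs side-chain atoms; here it only shows that sequence information can be read out of fixed-size per-residue latents.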

#Technical Details

La-Proteina represents proteins in a partially latent space: the coarse backbone structure (alpha-carbon coordinates) is modeled explicitly in Cartesian space, while the sequence and remaining atomic-level detail are encoded in learned per-residue latent variables of fixed size. Flow matching transports samples from a Gaussian prior to the joint sequence-structure data distribution. The ICLR 2026 paper provides architectural details, a description of the training corpus (PDB-derived), and ablations on representation choices.
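The transport from a Gaussian prior to data can be illustrated with a generic flow-matching sketch. It assumes a straight-line interpolation path and plain Euler integration, which are common choices but not necessarily the ones used in the paper. All function names are hypothetical, and the closed-form `point_target_velocity` stands in for the trained velocity network.

```python
import numpy as np


def interpolate(x0: np.ndarray, x1: np.ndarray, t: float) -> np.ndarray:
    """Point on the straight path from a prior sample x0 (t=0) to a data sample x1 (t=1)."""
    return (1.0 - t) * x0 + t * x1


def target_velocity(x0: np.ndarray, x1: np.ndarray) -> np.ndarray:
    """Regression target for the velocity network along the straight path."""
    return x1 - x0


def euler_sample(velocity_fn, x0: np.ndarray, num_steps: int) -> np.ndarray:
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with fixed Euler steps."""
    x = np.array(x0, dtype=float)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # the largest t used is 1 - dt, so t never reaches 1
        x = x + dt * velocity_fn(x, t)
    return x


def point_target_velocity(target: np.ndarray):
    """Exact velocity field when every data sample equals `target` (toy case only)."""
    def velocity(x: np.ndarray, t: float) -> np.ndarray:
        return (target - x) / (1.0 - t)
    return velocity


rng = np.random.default_rng(0)
prior_sample = rng.standard_normal((10, 3))  # Gaussian noise, e.g. 10 residues in 3D
target = np.ones((10, 3))                    # toy "data" the flow should reach
generated = euler_sample(point_target_velocity(target), prior_sample, num_steps=20)
```

During training, a network would be fit so that its output at `interpolate(x0, x1, t)` matches `target_velocity(x0, x1)`; at inference, that network replaces `point_target_velocity` inside `euler_sample`.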

The released code and weights are available through NVIDIA's research GitHub. Inference runs at moderate compute cost on a single high-end GPU.

#Applications

La-Proteina is suited to unconditional generative protein design at lengths where prior atomic-level baselines fail. Researchers can use it to explore the space of designable proteins, to generate candidates for downstream conditioning or scaffolding, or as a starting point for target-conditioned design through extensions such as Proteina-Complexa.

#Impact

La-Proteina advances the state of the art in unconditional all-atom protein generation by handling longer proteins than prior baselines and by producing designable sequence-structure pairs jointly. As the architectural foundation for Proteina-Complexa, it is a key part of NVIDIA's protein-design ecosystem and a publicly available reference implementation for future flow-matching-based generative protein models.


GitHub

Stars: 299
Forks: 37
Open Issues: 9
Contributors: 1
Last Push: 9mo ago
Language: Python

HuggingFace

Downloads: 134
Likes: 1
Last Modified: 6mo ago


Openness

bio.rodeo openness: 69 (Partial)
Usability (can I run it?): 77
Reproducibility (can I retrain it?): 70
Model Openness Framework: Unclassified (restrictive license on core components)

Resources

GitHub Repository · Research Paper · Official Website · HuggingFace Model · Dataset