BoltzGen

All-atom generative model for de novo protein and peptide binder design against diverse biomolecular targets, wet-lab validated across 26 targets.

Released: November 2025

BoltzGen is an all-atom generative model for de novo binder design, developed by researchers in the Barzilay and Jaakkola labs at MIT and released as a preprint in November 2025. Rather than predicting the structure of an existing complex, BoltzGen designs entirely new protein and peptide binders against a specified target, generating both backbone geometry and sequence. The goal stated by its authors is "universal" binder design: a single model that can produce binders across a wide range of target classes rather than a method tuned for one interaction type.

BoltzGen should not be confused with the earlier Boltz-1 and Boltz-2 models from the same broader research community. Those are structure prediction models: Boltz-1 predicts biomolecular complex structures, and Boltz-2 adds binding affinity estimation. BoltzGen is a different kind of tool: a generative design model whose output is a novel binder candidate, not a predicted structure of a known complex. In practice, BoltzGen relies on a structure predictor (Boltz-2) downstream to evaluate and filter its generated designs, but the design step itself is the model's defining contribution.

What distinguishes BoltzGen is the breadth of targets it addresses within one framework and the scale of experimental validation reported. The authors describe wet-lab testing across 26 targets spanning 8 distinct design campaigns, recovering nanomolar binders for roughly 66% of the targets attempted — a level of cross-target generality that has historically been difficult for binder design methods to achieve.

Key Features

Universal binder design: A single generative model targets diverse biomolecular partners — proteins, peptides, and small molecules — rather than being specialized to one interaction class.
All-atom generation: BoltzGen reasons over full atomic detail, allowing it to design against and around small-molecule ligands and chemically modified sites rather than backbone-only abstractions.
Diffusion plus inverse folding pipeline: A diffusion model proposes binder backbones and an inverse folding model assigns sequences, after which structure prediction and analysis filter large candidate pools down to a shortlist for synthesis.
Specialized design tasks: The released system supports protein-protein and peptide-protein binders, protein-small molecule binders, antibody CDR and nanobody design, symmetric complexes, and protein redesign.
Extensive wet-lab validation: Experimental campaigns across 26 targets yielded nanomolar binders for about 66% of targets, an unusually broad demonstration of real-world efficacy for a generative design model.
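
The data flow in the pipeline feature above can be pictured as three stages chained per candidate, followed by a ranking step. The following is a minimal Python sketch under loud assumptions: the function names (`propose_backbone`, `assign_sequence`, `score_design`), the `Design` record, and the random placeholder logic are all hypothetical stand-ins and not BoltzGen's API. In a real run, the diffusion model, the inverse folding model, and Boltz-2 would take the place of the three stubs.

```python
import random
from dataclasses import dataclass

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


@dataclass
class Design:
    backbone: list       # placeholder: one (x, y, z) tuple per residue
    sequence: str = ""   # filled in by the inverse-folding stand-in
    score: float = 0.0   # filled in by the scoring stand-in


def propose_backbone(length, rng):
    """Hypothetical stand-in for the diffusion step: random coordinates."""
    coords = [(rng.random(), rng.random(), rng.random()) for _ in range(length)]
    return Design(backbone=coords)


def assign_sequence(design, rng):
    """Hypothetical stand-in for inverse folding: one letter per position."""
    design.sequence = "".join(rng.choice(AMINO_ACIDS) for _ in design.backbone)
    return design


def score_design(design, rng):
    """Hypothetical stand-in for structure prediction and analysis."""
    design.score = rng.random()
    return design


def run_pipeline(num_candidates, length, keep, seed=0):
    """Generate candidates, push each through all stages, keep the top few."""
    rng = random.Random(seed)
    pool = []
    for _ in range(num_candidates):
        design = propose_backbone(length, rng)
        design = assign_sequence(design, rng)
        design = score_design(design, rng)
        pool.append(design)
    pool.sort(key=lambda d: d.score, reverse=True)
    return pool[:keep]


shortlist = run_pipeline(num_candidates=200, length=12, keep=5)
```

The point of the sketch is only the shape of the workflow: many cheap proposals, a sequence for each, and a scoring pass that decides which few candidates move on to synthesis.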

Technical Details

BoltzGen uses a diffusion-based backbone generation model paired with an inverse folding model that predicts amino acid sequences for the generated backbones. The default configuration ships two design checkpoints (one optimized for diversity and one for adherence to design constraints) and uses a separate Boltz-2 model for downstream structure prediction and scoring. The model operates on full all-atom representations and uses canonical mmCIF residue indexing, with support for explicit constraints such as disulfide bonds and chemical cross-links.

Pretrained weights total roughly 6 GB and download automatically on first use. The primary training source is Protein Data Bank structural data; an additional distillation dataset used for the larger model variant had not been publicly released at the time of writing.

A typical design run generates on the order of 10,000–60,000 intermediate candidates that are then filtered to a small final set, and the workflow is GPU-based (tested on A100 hardware). The headline experimental result is the recovery of nanomolar binders for approximately 66% of the 26 wet-lab targets across 8 campaigns.
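
The reduction from a large candidate pool to a small final set can be illustrated with a threshold filter followed by a greedy, diversity-aware pick. This is a minimal sketch and not BoltzGen's actual ranking procedure: the metric names (`confidence`, `interface_contacts`), the threshold values, and the sequence-identity cutoff are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    sequence: str
    confidence: float        # hypothetical predicted-structure confidence, 0..1
    interface_contacts: int  # hypothetical count of binder-target contacts


def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def select_final_set(pool, budget, min_confidence=0.7, min_contacts=5,
                     max_identity=0.8):
    """Drop weak candidates, then greedily keep the best non-redundant ones."""
    passing = [c for c in pool
               if c.confidence >= min_confidence
               and c.interface_contacts >= min_contacts]
    passing.sort(key=lambda c: c.confidence, reverse=True)
    chosen = []
    for cand in passing:
        if len(chosen) == budget:
            break
        # Skip candidates that are near-duplicates of something already kept.
        if all(identity(cand.sequence, kept.sequence) <= max_identity
               for kept in chosen):
            chosen.append(cand)
    return chosen


pool = [
    Candidate("d1", "ACDEFGHIKL", 0.92, 12),
    Candidate("d2", "ACDEFGHIKV", 0.90, 11),  # 90% identical to d1
    Candidate("d3", "MNPQRSTVWY", 0.85, 9),
    Candidate("d4", "LLLLLLLLLL", 0.95, 2),   # too few contacts
    Candidate("d5", "GGGGSGGGGS", 0.60, 8),   # confidence below threshold
]
final = select_final_set(pool, budget=2)
print([c.name for c in final])  # ['d1', 'd3']
```

Here `d2` is skipped because it nearly duplicates `d1`, which is the kind of trade-off between quality and diversity that a filtering stage has to make when only a handful of designs can be synthesized.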

Applications

BoltzGen is aimed at researchers who need novel binding proteins or peptides for targets that lack existing binders, including therapeutic antibody and nanobody engineering, peptide therapeutics, biosensor and diagnostic reagent development, and tool-binder generation for basic research. Because it operates at the all-atom level and supports small-molecule targets, it is also applicable to designing proteins that recognize specific ligands or chemically modified epitopes. Its constraint system makes it useful for engineering tasks such as introducing stabilizing disulfides or designing symmetric assemblies. The open-source release, which is pip-installable and includes pretrained weights along with training and inference code (the code under an MIT license), lowers the barrier for academic and biotech groups to incorporate de novo binder design into their pipelines.

Impact

BoltzGen extends the open Boltz ecosystem from structure and affinity prediction into generative design. Its central claim, competitive binder design across many target classes from a single model, backed by validation on 26 experimental targets, addresses one of the most practically valuable and difficult problems in protein engineering. The breadth of reported wet-lab success distinguishes it from binder design methods that demonstrate efficacy on only a handful of targets. Several caveats apply: as a preprint, its results await peer review; some assets (notably the distillation dataset behind the larger variant) had not yet been released; and no weights-specific license was explicitly stated, even though the code carries an MIT license. Even so, the rapid community uptake of the codebase and its integration with the widely used Boltz structure predictors position BoltzGen as an influential entry in the fast-moving field of computational binder design.

Citation

BoltzGen: Toward Universal Binder Design

Preprint

Stark, H., et al. (2025) BoltzGen: Toward Universal Binder Design. bioRxiv.

DOI: 10.1101/2025.11.20.689494

Recent citations

Papers that recently cited this model.

De novo Design of Polymorph-Specific Binders Targeting α-Synuclein Fibrils
Ahmed Sadek, Nolwen L. Rey, Antonin Kunka, et al.
bioRxiv · Jun 2026 · 0 citations

MoE-Bind: Guiding De Novo Protein Binder Generation with Sparse Experts
Dipayan Sarkar, Chiranjib Sarkar
bioRxiv · Jun 2026 · 0 citations

GermRL: Alleviating The Germline Bias In Autoregressive Antibody Language Models Through Reinforcement Learning
Laurent Ludwig, Michael Chungyoun, Jeffrey J. Gray
bioRxiv · Jun 2026 · 0 citations

Top citations

The most-cited papers that cite this model.

PXDesign: Fast, Modular, and Accurate De Novo Design of Protein Binders
Protenix Team, Milong Ren, Jinyuan Sun, et al.
bioRxiv · Dec 2025 · 20 citations

Protenix-v1: Toward High-Accuracy Open-Source Biomolecular Structure Prediction
Yuxuan Zhang, Chengyue Gong, Hanyu Zhang, et al.
bioRxiv · Feb 2026 · 12 citations

Protein Diffusion Models as Statistical Potentials
James P. Roney, Chenxi Ou, Sergey Ovchinnikov
bioRxiv · Mar 2026 · 7 citations

Using a GPT-5-driven autonomous lab to optimize the cost and titer of cell-free protein synthesis
Alexus A. Smith, Edmund Wong, R. Donovan, et al.
bioRxiv · Feb 2026 · 6 citations

Drug-like antibody design against challenging targets with atomic precision
Jacques Boitreaud, Robert Chen, Jack Dent, et al.
bioRxiv · Dec 2025 · 5 citations

Citations

Total Citations: 65

Influential: 4

References: 0

GitHub

Stars: 989

Forks: 242

Open Issues: 157

Contributors: 16

Last Push: 27d ago

Language: Jupyter Notebook

License: MIT

Fields of citing research

Biology: 85%
Computer Science: 77%
Medicine: 51%
Chemistry: 32%
Engineering: 6%
Materials Science: 5%
Mathematics: 3%
Environmental Science: 2%

Share of papers citing this model.

Openness

bio.rodeo openness: Fully open · usable and reproducible

Score: 78 (Open)

Usability (can I run it?): 95

Reproducibility (can I retrain it?): 61

Model Openness Framework: Unclassified (missing required components)

Resources

GitHub Repository
Research Paper
HuggingFace Model
