bio.rodeo

Protein

RFdiffusion3

Institute for Protein Design

All-atom diffusion model for de novo protein design conditioned on ligands, nucleic acids, and arbitrary non-protein atoms, enabling enzyme and DNA binder design.

Released: 2025
Parameters: 168,000,000

Overview

RFdiffusion3 (RFD3) is the third-generation protein design diffusion model from the Baker Lab at the University of Washington Institute for Protein Design, with code and weights released in December 2025. Its central architectural change is all-atom modeling: rather than diffusing over protein backbone frames alone, RFD3 includes every atom of a biomolecular system in the generative process. This allows the model to design proteins conditioned on ligands, nucleic acids, and arbitrary non-protein atoms simultaneously, a capability that prior backbone-only diffusion models could not achieve natively.

The preprint "De novo Design of All-atom Biomolecular Interactions with RFdiffusion3" was posted to bioRxiv in September 2025, and the code and weights were made publicly available through the RosettaCommons Foundry repository in December 2025. RFD3 shares no code with its predecessors RFdiffusion and RFdiffusion2; it is a complete architectural rebuild designed around atom-level representations of multi-molecular systems.

The model achieves these capabilities at approximately one-tenth the inference cost of predecessor models, lowering the barrier to applying diffusion-based design to challenging multi-constraint problems involving enzyme active sites, DNA recognition interfaces, and small-molecule binding pockets.

Key Features

  • All-atom generation: Every backbone and side-chain atom is modeled explicitly rather than as a residue-level frame, enabling precise conditioning on atomic environments of binding pockets, active sites, and nucleic acid interfaces.
  • Multi-molecular conditioning: Conditions simultaneously on proteins, DNA, RNA, and small molecules within a single unified framework, without molecule-type-specific sub-models.
  • Enzyme active-site scaffolding: Supports specification of catalytic residue geometry (e.g., Cys-His-Asp triads) for de novo scaffolding of functional enzymes.
  • DNA binder design: Generates proteins that recognize specific DNA sequences through atomic-level interface modeling.
  • Ten-fold speed improvement: Sparse attention restricted to geometrically adjacent atoms reduces inference cost by an order of magnitude relative to prior RFdiffusion versions.
  • Classifier-free guidance: Applies guidance techniques from image diffusion to improve satisfaction of complex multi-constraint design problems.
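
The classifier-free guidance item above can be illustrated with the generic blending rule used in image diffusion. This is a minimal sketch under stated assumptions, not RFD3's implementation: the function name, the `scale` parameter, and the toy numbers are hypothetical, and the source does not say how RFD3 parameterizes or schedules its guidance.

```python
from typing import List, Sequence


def classifier_free_guidance(
    cond: Sequence[float],
    uncond: Sequence[float],
    scale: float,
) -> List[float]:
    """Blend conditional and unconditional denoiser outputs.

    scale = 0 returns the unconditional prediction, scale = 1 returns the
    conditional one, and scale > 1 extrapolates past the conditional
    prediction, which pushes samples harder toward the conditioning signal.
    """
    if len(cond) != len(uncond):
        raise ValueError("predictions must have the same length")
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


# Toy per-coordinate predictions for a single atom (hypothetical numbers).
cond_pred = [1.0, 2.0, 3.0]    # denoiser output with the design constraints
uncond_pred = [0.5, 2.0, 2.0]  # denoiser output with the constraints dropped
guided = classifier_free_guidance(cond_pred, uncond_pred, scale=2.0)
print(guided)  # [1.5, 2.0, 4.0]
```

In this formulation the same network is evaluated twice per denoising step, once with and once without the conditioning, so guidance roughly doubles the per-step cost in exchange for stronger constraint satisfaction.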

Technical Details

RFdiffusion3 is a 168-million-parameter transformer-based U-Net that operates directly on atomic coordinates. Each residue is represented with 4 backbone atoms and up to 10 side-chain atoms; shorter side chains are padded with virtual atoms at the Cβ position to maintain a uniform 14-atom representation. Attention is restricted to geometrically adjacent atoms rather than all pairs, concentrating computation where it is physically meaningful. The Pairformer module from AlphaFold 3 is reduced from 48 layers to 2 layers, and triangle multiplicative updates and triangle attention are omitted. Together with the sparse attention, these reductions yield the order-of-magnitude speed improvement.
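
The fixed-size residue representation and the locality-restricted attention described above can be sketched in plain Python. This is an illustration under stated assumptions, not the RFD3 code: the helper names `pad_residue` and `local_attention_mask` are hypothetical, the coordinates are toy values, and a k-nearest-neighbour rule is only one possible reading of "geometrically adjacent" (the source does not specify the neighbourhood criterion).

```python
import math
from typing import List, Tuple

Coord = Tuple[float, float, float]

N_BACKBONE = 4      # N, CA, C, O
MAX_SIDECHAIN = 10  # side-chain slots per residue, as stated in the text


def pad_residue(
    backbone: List[Coord], sidechain: List[Coord], cbeta: Coord
) -> Tuple[List[Coord], List[bool]]:
    """Return 14 coordinates plus flags marking which slots hold real atoms.

    Unused side-chain slots are filled with virtual atoms placed at `cbeta`.
    """
    if len(backbone) != N_BACKBONE:
        raise ValueError("expected 4 backbone atoms")
    if len(sidechain) > MAX_SIDECHAIN:
        raise ValueError("expected at most 10 side-chain atoms")
    n_virtual = MAX_SIDECHAIN - len(sidechain)
    coords = list(backbone) + list(sidechain) + [cbeta] * n_virtual
    is_real = [True] * (N_BACKBONE + len(sidechain)) + [False] * n_virtual
    return coords, is_real


def local_attention_mask(coords: List[Coord], k: int) -> List[List[bool]]:
    """mask[i][j] is True when atom j is one of the k atoms nearest to atom i.

    Ties in distance are broken in favour of the atom itself, so every atom
    attends to itself even when virtual atoms share its position.
    """
    n = len(coords)
    mask = [[False] * n for _ in range(n)]
    for i, ci in enumerate(coords):
        order = sorted(range(n), key=lambda j: (math.dist(ci, coords[j]), j != i))
        for j in order[:k]:
            mask[i][j] = True
    return mask


# Toy alanine: four backbone atoms plus a single CB side-chain atom.
backbone = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.4, 0.0), (1.2, 2.3, 0.0)]
cb = (2.0, -0.8, 1.2)
coords, is_real = pad_residue(backbone, [cb], cbeta=cb)
mask = local_attention_mask(coords, k=4)
print(len(coords), sum(is_real), sum(mask[0]))
```

With a fixed k, each atom attends to k others, so the number of attended pairs grows linearly with the atom count instead of quadratically as in all-pairs attention.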

The model was trained on a hierarchical schedule using two data sources: all available Protein Data Bank complexes spanning protein-protein, protein-small molecule, protein-DNA, and protein-RNA interactions, supplemented by AlphaFold 2 self-distillation structures to broaden sequence space coverage. Training ran on 16 NVIDIA H200 GPUs for approximately seven days. Benchmarks show RFD3 outperforming RFdiffusion (v1) on 4 of 5 protein-protein binder targets. Experimentally, 18% of designed cysteine hydrolase scaffolds showed multi-turnover catalytic activity, with the best design achieving a kcat/Km of 3,557 ± 624 M⁻¹ s⁻¹.

Applications

RFdiffusion3 substantially expands the range of problems addressable by diffusion-based protein design. Researchers can scaffold catalytic triads and other active-site geometries into stable protein frameworks for de novo enzyme design without natural enzyme templates. DNA-binding proteins targeting defined sequences are relevant to gene regulation, epigenetic editing, and synthetic biology. Small-molecule binding proteins can serve as biosensors or drug development starting points. The model's unified treatment of molecular heterogeneity makes it particularly well-suited to multi-constraint problems, such as designing a protein that simultaneously scaffolds a catalytic residue and binds a cofactor. Backbone-only approaches require separate, sequential design stages for such tasks. Sequence design remains a separate downstream step using tools such as ProteinMPNN or LigandMPNN.

Impact

RFdiffusion3 advances computational protein design by bringing all-atom awareness to the generative diffusion framework. It is the first model in the RFdiffusion lineage to natively handle multi-molecular systems at atomic resolution, extending diffusion-based design beyond backbone scaffolding into functional site engineering. The model is open-source under the permissive BSD-3-Clause license and distributed through RosettaCommons Foundry alongside training code, supporting community extension. As of its release in December 2025, the underlying paper is a bioRxiv preprint that has not yet undergone formal peer review, and experimental validation covers two design challenges (DNA binders and cysteine hydrolases); performance on other target classes requires independent characterization. The work is part of a broader trend toward all-atom generative models in structural biology, complementing AlphaFold 3 and Boltz-1 in the prediction space.

Citation

De novo Design of All-atom Biomolecular Interactions with RFdiffusion3

Preprint

Butcher, J., et al. (2025) De novo Design of All-atom Biomolecular Interactions with RFdiffusion3. bioRxiv.

DOI: 10.1101/2025.09.18.676967

Metrics

GitHub

Stars: 788
Forks: 138
Open Issues: 65
Contributors: 34
Last Push: 1d ago
Language: Python
License: BSD-3-Clause

Citations

Total Citations: 28
Influential: 1
References: 76

Tags

DNA binding, de novo design, enzyme design, protein design, structure generation, diffusion, all-atom, small molecule

Resources

GitHub Repository, Research Paper, Official Website