bio.rodeo

Protein

RoseTTAFold All-Atom

Baker Lab

Deep network that predicts structures of full biological assemblies containing proteins, nucleic acids, small molecules, metals, and covalent modifications simultaneously.

Released: 2024

Overview

RoseTTAFold All-Atom (RFAA) extends the RoseTTAFold architecture to model the full chemical complexity of biological systems. While AlphaFold 2 and the original RoseTTAFold transformed protein structure prediction, both were limited to polypeptide chains. RFAA removes that constraint by combining residue-level representations of proteins and nucleic acids with an atomic graph representation of small molecules and covalent modifications, enabling joint structure prediction across all major classes of biological macromolecules and their ligands in a single network pass.

Published in Science in March 2024 by the Baker Lab at the University of Washington, RFAA achieves protein monomer structure prediction accuracy comparable to AlphaFold 2 while simultaneously handling interaction partners that no prior generalist method could model. The work also introduced RFdiffusion All-Atom (RFdiffusionAA), a companion generative model fine-tuned from RFAA that designs entirely new protein scaffolds around target small molecules.

The release marked a significant step toward modeling the true chemical complexity of biological assemblies, where proteins rarely act in isolation but instead interact with metabolites, cofactors, nucleic acids, and post-translational modifications.

Key Features

  • Unified all-atom modeling: Simultaneously predicts structures containing proteins, DNA, RNA, small molecules, metals, and covalently modified residues within a single forward pass, without specialized pipelines for each component type.
  • Flexible backbone docking: Excels at ligand docking scenarios where the protein backbone adjusts upon binding, capturing induced-fit effects that rigid-docking approaches miss.
  • Covalent modification support: Models post-translational modifications and other covalently bound chemical groups (such as glycosylation, phosphorylation, and cofactor attachment) with reasonable accuracy.
  • Generative design capability: Fine-tuning on diffusion denoising tasks produces RFdiffusionAA, which generates novel protein scaffolds around target small molecules, enabling de novo ligand-binding protein design.
  • Experimentally validated outputs: RFdiffusionAA-designed proteins binding digoxigenin, heme, and bilin were confirmed by X-ray crystallography and binding assays, demonstrating practical design utility.

Technical Details

RFAA builds on the RoseTTAFold2 three-track architecture, which processes 1D sequence, 2D pairwise distance, and 3D coordinate information in parallel tracks with iterative cross-track attention. The key innovation is a dual input representation: biopolymers (amino acids, DNA/RNA bases) are encoded at residue level, while small molecules, metals, and covalent modifications are encoded as atomic bond graphs fed into the 1D track (element types), 2D track (chemical bonds), and 3D track (chirality). This asymmetric scheme allows efficient polymer processing while preserving full bonded geometry for non-polymer components. Structure generation uses an SE(3)-equivariant transformer to produce all-atom coordinates.
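
The dual input representation described above can be illustrated with a small sketch. This is not RFAA's actual code: the `Token` class, the helper names `build_tokens` and `build_bond_matrix`, and the toy molecule are all hypothetical, chosen only to show one token per polymer residue, one token per ligand atom, and a pair matrix carrying bond orders. Chirality (the 3D-track input) is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Token:
    kind: str   # "residue" for polymer units, "atom" for ligand atoms
    label: str  # residue code (e.g. "ALA") or element symbol (e.g. "C")


def build_tokens(protein_seq, ligand_elements):
    """Return one token per polymer residue, then one token per ligand atom."""
    tokens = [Token("residue", res) for res in protein_seq]
    tokens += [Token("atom", element) for element in ligand_elements]
    return tokens


def build_bond_matrix(n_residues, n_atoms, ligand_bonds):
    """Build a symmetric pair matrix holding ligand bond orders.

    Simplification for this sketch: every entry that is not a ligand
    bond is set to 0, including all residue-residue pairs.
    """
    n = n_residues + n_atoms
    pair = [[0] * n for _ in range(n)]
    for i, j, order in ligand_bonds:
        a, b = n_residues + i, n_residues + j  # offset past the residues
        pair[a][b] = order
        pair[b][a] = order
    return pair


# Toy input: a tripeptide plus the two heavy atoms of a C=O group.
protein_seq = ["GLY", "ALA", "SER"]
ligand_elements = ["C", "O"]
ligand_bonds = [(0, 1, 2)]  # (atom index, atom index, bond order)

tokens = build_tokens(protein_seq, ligand_elements)
pair = build_bond_matrix(len(protein_seq), len(ligand_elements), ligand_bonds)
```

In this sketch the token labels stand in for the 1D-track input and the bond matrix for the 2D-track input. The asymmetry is the point: a protein of several hundred residues adds several hundred tokens rather than thousands of atoms, while each ligand atom keeps its own token and its bonded geometry.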

The model was trained on biological assemblies from the Protein Data Bank, including protein-small molecule complexes, protein-metal complexes, and covalently modified proteins. Common solvents and crystallization additives were filtered from training targets to keep the model focused on biologically meaningful interactions. On standard benchmarks, RFAA achieves protein monomer accuracy comparable to AlphaFold 2, strong performance on flexible backbone docking in CAMEO evaluations, and reasonable accuracy on multi-chain assemblies containing combinations of proteins, nucleic acids, and small molecules simultaneously.
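
The solvent and additive filtering mentioned above could look something like the following sketch. The exclusion list is an assumption made for illustration (a handful of Chemical Component Dictionary codes that commonly appear as solvent or crystallization additives), not the list used to train RFAA, and the function names are hypothetical.

```python
# Hypothetical exclusion list: water, glycerol, ethylene glycol, sulfate,
# a PEG fragment, and DMSO. The real training filter may differ.
EXCLUDED_LIGANDS = {"HOH", "GOL", "EDO", "SO4", "PEG", "DMS"}


def keep_ligand(res_name, excluded=EXCLUDED_LIGANDS):
    """Return True if the ligand code is not on the exclusion list."""
    return res_name.strip().upper() not in excluded


def filter_ligands(ligand_res_names):
    """Drop excluded ligand codes while preserving the input order."""
    return [name for name in ligand_res_names if keep_ligand(name)]


# Toy input: heme, water, glycerol, zinc, and ATP from one hypothetical entry.
ligands = ["HEM", "HOH", "GOL", "ZN", "ATP"]
kept = filter_ligands(ligands)
```

Here `kept` retains the heme, zinc, and ATP entries and drops the water and glycerol, which matches the stated goal of keeping training targets focused on biologically meaningful partners.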

Applications

RFAA is best suited for research problems that require modeling the true chemical context of biological systems. Primary use cases include predicting ligand-bound protein structures where backbone flexibility matters, modeling metalloenzymes and cofactor-bound proteins such as heme proteins or zinc-finger domains, and characterizing covalently modified proteins like glycoproteins. The companion RFdiffusionAA model extends these capabilities into protein design, enabling researchers to generate novel binders for specific small-molecule targets, a workflow relevant to biosensor development, therapeutic protein engineering, and synthetic biology. Together, prediction and design form a practical toolkit for labs working at the chemistry-biology interface.

Impact

RFAA expanded the generalist structure prediction paradigm beyond polypeptides, addressing a longstanding gap where researchers had to chain together specialized tools to model chemically complex assemblies. The experimentally validated small-molecule binder designs demonstrated that all-atom modeling is useful for generating new proteins, not only for predicting existing structures. Limitations remain: RFAA is not a replacement for specialized docking software when the receptor structure is already known, performance decreases for very large or chemically unusual ligands, and all-atom modeling of large assemblies demands substantially more memory than protein-only prediction. Nonetheless, its open-source availability on GitHub and strong benchmark results have made it a widely adopted tool for labs working on protein-small-molecule systems.

Citation

Generalized Biomolecular Modeling and Design with RoseTTAFold All-Atom

Krishna, R., et al. (2024). Generalized Biomolecular Modeling and Design with RoseTTAFold All-Atom. Science, 384, eadl2528.

DOI: 10.1126/science.adl2528

Metrics

GitHub

Stars: 802
Forks: 142
Open Issues: 114
Contributors: 10
Last Push: 11mo ago
Language: Python

Citations

Total Citations: 810
Influential: 39
References: 107

Tags

protein design, structure prediction, multimodal, small molecule

Resources

GitHub Repository, Research Paper, Official Website