bio.rodeo

The authoritative source for evaluating biological foundation models. No hype, just honest analysis.

Protein

A-CODE

University of Illinois Urbana-Champaign / ByteDance Seed

A fully atomic protein co-design model using unified multimodal diffusion to jointly refine atom types and coordinates in a single stage, with support for non-canonical amino acids.

Released: May 2026

A-CODE (Atomic CO-DEsign) is a generative foundation model for protein design that operates entirely at atomic granularity. Rather than treating protein generation as a multi-stage problem—first laying out a backbone, then designing a sequence, then packing side chains—A-CODE casts the full task as a single unified process in which discrete atom types and continuous atom coordinates are refined simultaneously. Amino acid identities emerge from atom-level predictions rather than being assigned in a separate sequence-design step. The work was introduced in a May 2026 preprint by researchers at the University of Illinois Urbana-Champaign and ByteDance Seed.

The central problem A-CODE addresses is the error accumulation and modeling mismatch inherent in cascaded protein co-design pipelines, where decisions made early (such as backbone geometry) constrain later stages and small inconsistencies compound. By unifying everything into one all-atom diffusion process, A-CODE lets sequence and structure inform one another throughout generation. The authors report that this formulation is particularly effective on difficult binder-design problems, where prior one-stage approaches have struggled.
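One way to see why cascaded errors compound is a toy calculation. It assumes, purely for illustration and not as a claim from the preprint, that each of the K stages independently produces a usable output with probability p_k; the value 0.8 is an arbitrary example, not a measured rate.

```latex
P(\text{pipeline succeeds}) = \prod_{k=1}^{K} p_k ,
\qquad \text{e.g. } K = 3,\; p_k = 0.8 \;\Rightarrow\; 0.8^{3} = 0.512 .
```

Under that assumption, three stages that each look reliable in isolation still yield a usable end-to-end design only about half the time, which is the kind of compounding a single-stage formulation is meant to avoid.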

A notable consequence of the fully atomic formulation is that A-CODE is, according to the authors, the first protein co-design model to support non-canonical amino acids (ncAAs). Because the model reasons over atoms rather than a fixed alphabet of 20 residues, it can in principle accommodate chemistries outside the canonical set—an area of growing interest for designing proteins with novel functions.
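The idea of reading residue identity off atoms rather than a fixed alphabet can be illustrated with a toy lookup. This is a minimal sketch and not A-CODE's actual readout: the function name `read_out_residue` and the small composition table are hypothetical, and matching on heavy-atom element counts alone is a simplification, since it ignores bonding and geometry.

```python
from collections import Counter

# Heavy-atom (non-hydrogen) element counts for a few canonical residues as
# they appear inside a chain: backbone N, CA, C, O plus the side chain.
# Illustrative subset only; this is NOT A-CODE's atom vocabulary.
CANONICAL_COMPOSITIONS = {
    "GLY": {"C": 2, "N": 1, "O": 1},
    "ALA": {"C": 3, "N": 1, "O": 1},
    "SER": {"C": 3, "N": 1, "O": 2},
    "CYS": {"C": 3, "N": 1, "O": 1, "S": 1},
    "LEU": {"C": 6, "N": 1, "O": 1},
    "ILE": {"C": 6, "N": 1, "O": 1},
}


def read_out_residue(elements):
    """Return canonical residue names whose heavy-atom composition matches.

    `elements` is a list of element symbols, one per predicted heavy atom.
    An empty result means the atoms match nothing in the table above.
    """
    composition = dict(Counter(elements))
    return sorted(
        name for name, ref in CANONICAL_COMPOSITIONS.items() if ref == composition
    )


print(read_out_residue(["N", "C", "C", "O"]))                      # ['GLY']
print(read_out_residue(["N", "C", "C", "O", "C", "C", "C", "C"]))  # ['ILE', 'LEU']
print(read_out_residue(["N", "C", "C", "O", "C", "Se"]))           # []
```

The second call shows why composition alone is not enough in practice: leucine and isoleucine share the same heavy atoms and differ only in connectivity. The third call, a selenocysteine-like atom set, falls outside the table entirely, which is the situation a residue-alphabet model cannot express but an atom-level model can in principle represent.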

#Key Features

  • Single-stage all-atom co-design: Discrete atom types and continuous coordinates are refined together in one unified diffusion process, removing the cascaded backbone-then-sequence-then-packing pipeline used by most prior methods.
  • Multimodal diffusion framework: The model jointly handles categorical (atom type) and continuous (coordinate) variables, letting structure and sequence constrain each other throughout generation.
  • Non-canonical amino acid support: By modeling proteins at the atomic level, A-CODE can accommodate non-canonical amino acids beyond the standard 20-residue alphabet, which the authors describe as a first for a co-design model.
  • Strong results on hard binder-design tasks: The authors report a roughly tenfold improvement in success rate on hard binder-design tasks over prior one-stage co-design methods, with results that match or exceed two-stage baselines.
  • Unconditional generation and conditional binder design: A-CODE serves both as an unconditional generative model for protein structures and as a conditional model for designing binders against a target.

#Technical Details

A-CODE is built on a multimodal diffusion framework that operates over all atoms in a protein rather than over residue-level tokens or backbone frames. During generation, the model iteratively denoises both the discrete atom-type assignments and the continuous 3D coordinates, with amino acid identity read out from the predicted atomic composition. The model is trained on protein structures from the Protein Data Bank (PDB). The authors evaluate unconditional generation by designability and report superior performance over existing approaches, and on binder design they report a roughly tenfold improvement in success rate on hard tasks relative to prior one-stage co-design methods. The preprint does not state the model's parameter count or the detailed composition of its training set.
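The joint treatment of discrete and continuous variables can be sketched generically. The summary above does not specify A-CODE's noise schedule, discrete corruption process, or network, so everything here is an assumption: Gaussian noising for coordinates, an absorbing "mask" state for atom types, a linear time grid, and the hypothetical names `corrupt`, `sample`, `MASK`, and `denoiser`. It illustrates the general pattern of denoising both modalities in one loop, not the authors' method.

```python
import numpy as np

MASK = -1  # hypothetical sentinel meaning "atom type not yet decided"


def corrupt(coords, types, t, rng):
    """Noise a joint (coordinates, atom types) state at level t in [0, 1].

    Coordinates are blended with Gaussian noise; each atom type is
    independently replaced by MASK with probability t.
    """
    eps = rng.standard_normal(coords.shape)
    noisy_coords = np.sqrt(1.0 - t) * coords + np.sqrt(t) * eps
    noisy_types = np.where(rng.random(types.shape) < t, MASK, types)
    return noisy_coords, noisy_types


def sample(denoiser, n_atoms, n_steps, rng):
    """Iteratively denoise coordinates and atom types together.

    `denoiser(coords, types, t)` stands in for a learned network and must
    return (clean-coordinate estimate, per-atom type logits).
    """
    coords = rng.standard_normal((n_atoms, 3))  # start from pure noise
    types = np.full(n_atoms, MASK)              # start fully masked
    times = np.linspace(1.0, 0.0, n_steps + 1)
    for t, s in zip(times[:-1], times[1:]):
        coords_hat, type_logits = denoiser(coords, types, t)
        # Continuous part: re-noise the clean estimate to the lower level s.
        noise = rng.standard_normal(coords.shape)
        coords = np.sqrt(1.0 - s) * coords_hat + np.sqrt(s) * noise
        # Discrete part: reveal each still-masked atom with prob (t - s) / t.
        reveal = (types == MASK) & (rng.random(n_atoms) < (t - s) / t)
        types = np.where(reveal, type_logits.argmax(axis=-1), types)
    return coords, types
```

Because both updates consume the same denoiser call, the predicted atom types and coordinates are conditioned on each other at every step, which is the property the single-stage formulation relies on. At the final step (s = 0) the coordinates equal the last clean estimate and every remaining masked atom is revealed.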

#Applications

A-CODE targets de novo protein design and protein binder design, that is, designing new proteins that fold to desired structures and bind specified targets. The fully atomic, single-stage formulation is aimed especially at hard binder-design tasks where staged pipelines tend to fail. Its ability to reason about non-canonical amino acids opens potential applications in designing proteins with chemistries outside the natural alphabet, which may interest protein engineers and synthetic biologists exploring expanded functional repertoires. As with all computational design tools, generated candidates would require experimental validation.

#Impact

A-CODE contributes to a broader shift in protein design from frame- or residue-based generative models toward fully atomic representations. Its claimed first-in-class support for non-canonical amino acids in co-design and its reported gains on hard binder tasks position it as a notable step for one-stage co-design. As of the preprint's release, the authors had not released code or weights, citing plans for a responsible release; the paper itself is distributed under a CC BY-NC-SA 4.0 license. Because the work is a preprint and its parameter count and detailed training-data composition are unstated, its real-world impact and reproducibility remain to be established through community evaluation.

Citation

Preprint

DOI: 10.48550/arXiv.2605.03360

Openness

Unclassified
Restrictive license on core components

Tags

binder_design, de_novo_design, diffusion, foundation_model, generative, multimodal, protein_design

Resources

Research Paper