A-CODE

University of Illinois Urbana-Champaign / ByteDance Seed

All-atom protein co-design model that generates sequence and structure together in one unified diffusion process, aimed at hard binder design.

Released: May 2026

A-CODE (Atomic CO-DEsign) is a generative foundation model for protein design that operates entirely at atomic granularity. Rather than treating protein generation as a multi-stage problem—first laying out a backbone, then designing a sequence, then packing side chains—A-CODE casts the full task as a single unified process in which discrete atom types and continuous atom coordinates are refined simultaneously. Amino acid identities emerge from atom-level predictions rather than being assigned in a separate sequence-design step. The work was introduced in a May 2026 preprint by researchers at the University of Illinois Urbana-Champaign and ByteDance Seed.

The central problem A-CODE addresses is the error accumulation and modeling mismatch inherent in cascaded protein co-design pipelines, where decisions made early (such as backbone geometry) constrain later stages and small inconsistencies compound. By unifying everything into one all-atom diffusion process, A-CODE lets sequence and structure inform one another throughout generation. The authors report that this formulation is particularly effective on difficult binder-design problems, where prior one-stage approaches have struggled.

A notable consequence of the fully atomic formulation is that A-CODE is, according to the authors, the first protein co-design model to support non-canonical amino acids (ncAAs). Because the model reasons over atoms rather than a fixed alphabet of 20 residues, it can in principle accommodate chemistries outside the canonical set—an area of growing interest for designing proteins with novel functions.
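To make the idea of reading residue identity out of atoms concrete, here is a minimal sketch. It is not the authors' method: the lookup table, the function names `composition_key` and `read_out_residue`, and the `"UNK"` label are all hypothetical, and the table covers only four residues. It matches on heavy-atom element counts alone, which cannot separate isomers such as leucine and isoleucine, so a real readout would also need bonding or geometry.

```python
from collections import Counter


def composition_key(elements):
    """Order-independent key built from heavy-atom element counts."""
    return tuple(sorted(Counter(elements).items()))


# Illustrative subset only: heavy atoms of in-chain residues
# (backbone N, CA, C, O plus side chain), written as element symbols.
CANONICAL_BY_COMPOSITION = {
    composition_key(["N", "C", "C", "O"]): "GLY",
    composition_key(["N", "C", "C", "O", "C"]): "ALA",
    composition_key(["N", "C", "C", "O", "C", "O"]): "SER",
    composition_key(["N", "C", "C", "O", "C", "S"]): "CYS",
}


def read_out_residue(elements):
    """Return a residue name for a set of predicted atoms.

    Compositions missing from the table are reported as "UNK" instead of
    being forced onto the nearest canonical residue.
    """
    return CANONICAL_BY_COMPOSITION.get(composition_key(elements), "UNK")
```

Calling `read_out_residue(["N", "C", "C", "O", "C", "SE"])`, a selenocysteine-like composition, falls outside the table and is returned as `"UNK"`. This shows why an atom-level representation is not tied to a fixed residue alphabet.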

Key Features

Single-stage all-atom co-design: Discrete atom types and continuous coordinates are refined together in one unified diffusion process, removing the cascaded backbone-then-sequence-then-packing pipeline used by most prior methods.
Multimodal diffusion framework: The model jointly handles categorical (atom type) and continuous (coordinate) variables, letting structure and sequence constrain each other throughout generation.
Non-canonical amino acid support: Because it models proteins at the atomic level, A-CODE can accommodate non-canonical amino acids beyond the standard 20-residue alphabet, which the authors describe as a first for a co-design model.
Binder design on hard targets: The authors report a roughly tenfold improvement in success rate on hard binder-design tasks over prior one-stage co-design methods, with results that rival or exceed two-stage baselines.
Unconditional generation and conditional binder design: A-CODE serves both as an unconditional generative model for protein structures and as a conditional model for designing binders against a target.

Technical Details

A-CODE is built on a multimodal diffusion framework that operates over all atoms in a protein rather than over residue-level tokens or backbone frames. During generation, the model iteratively denoises both the discrete atom-type assignments and the continuous 3D coordinates, with amino acid identity read out from the predicted atomic composition. The model is trained on protein structures from the Protein Data Bank (PDB). The authors evaluate unconditional generation by designability and report superior performance over existing approaches, and on binder design they report a roughly tenfold improvement in success rate on hard tasks relative to prior one-stage co-design methods. The preprint does not state the model's parameter count or the detailed composition of its training set.
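The joint refinement of discrete and continuous variables described above can be illustrated with a toy loop. This is a generic sketch under stated assumptions, not A-CODE's algorithm: the preprint's noise schedules, network, and update rules are not reproduced here. The names `MASK`, `toy_denoiser`, and `joint_denoise` are hypothetical, and the "denoiser" simply returns a known target so that the loop runs end to end without a trained model.

```python
import random

MASK = "?"  # hypothetical placeholder for an undecided atom type


def toy_denoiser(types, coords, target_types, target_coords):
    """Stand-in for a learned network.

    A real model would predict the clean sample from the noisy inputs;
    this toy version ignores them and returns the known target.
    """
    return list(target_types), [tuple(c) for c in target_coords]


def joint_denoise(target_types, target_coords, steps=10, seed=0):
    """Refine discrete atom types and continuous coordinates in one loop."""
    rng = random.Random(seed)
    n = len(target_types)
    # Start from noise: every type masked, coordinates drawn from a Gaussian.
    types = [MASK] * n
    coords = [tuple(rng.gauss(0.0, 1.0) for _ in range(3)) for _ in range(n)]

    for step in range(steps):
        remaining = steps - step
        pred_types, pred_coords = toy_denoiser(
            types, coords, target_types, target_coords
        )

        # Continuous update: move part of the way toward the prediction.
        # alpha grows to 1.0 on the final step.
        alpha = 1.0 / remaining
        coords = [
            tuple(x + alpha * (p - x) for x, p in zip(c, pc))
            for c, pc in zip(coords, pred_coords)
        ]

        # Discrete update: commit a share of the still-masked atom types,
        # and all of them on the final step.
        masked = [i for i, t in enumerate(types) if t == MASK]
        if remaining == 1:
            k = len(masked)
        else:
            k = max(1, len(masked) // remaining)
        for i in rng.sample(masked, min(k, len(masked))):
            types[i] = pred_types[i]

    return types, coords
```

For example, `joint_denoise(["N", "C", "C", "O"], [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.4, 0.0), (1.3, 2.4, 0.0)], steps=5)` returns the four atom types fully unmasked and coordinates that have converged to the target. The point of the sketch is that both modalities are updated within each step, so neither is fixed before the other is generated.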

Applications

A-CODE targets de novo protein design and protein binder design—designing new proteins that fold to desired structures and bind specified targets. The fully atomic, single-stage formulation is especially aimed at hard binder-design tasks where staged pipelines tend to fail. Its ability to reason about non-canonical amino acids opens potential applications in designing proteins with chemistries outside the natural alphabet, of interest to protein engineers and synthetic biologists exploring expanded functional repertoires. As with all computational design tools, generated candidates would require experimental validation.

Impact

A-CODE contributes to a broader shift in protein design from frame- or residue-based generative models toward fully atomic representations, joining other all-atom efforts in generative protein design. Its claimed first-in-class support for non-canonical amino acid co-design and its reported gains on hard binder tasks position it as a notable step for one-stage co-design. As of the preprint's release, the authors had not released code or weights, citing an intent to release them responsibly; the paper is distributed under a CC BY-NC-SA 4.0 license. Because the work is a preprint and neither its training-data composition nor its parameter count is stated, its real-world impact and reproducibility remain to be established through community evaluation.

Citation

A-CODE: Fully Atomic Protein Co-Design with Unified Multimodal Diffusion

Preprint

Cheng, C., et al. (2026) A-CODE: Fully Atomic Protein Co-Design with Unified Multimodal Diffusion. arXiv.

DOI: 10.48550/arXiv.2605.03360

Citations

Total citations: 0

Influential: 0

References: 38

Openness

bio.rodeo openness: 8, Closed (low usability and reproducibility)

Usability (can I run it?): 7

Reproducibility (can I retrain it?): 10

Model Openness Framework: Unclassified (restrictive license on core components)

Resources

Research Paper
