Unified diffusion-based model predicting structures of protein complexes with nucleic acids, small molecules, ions, and modified residues with atomic accuracy.
AlphaFold 3 is Google DeepMind's third-generation structure prediction system, published in Nature in May 2024. Where AlphaFold 2 transformed the prediction of single-chain protein structures, AlphaFold 3 extends that capability to a much wider range of biomolecular interactions: proteins, DNA, RNA, small-molecule ligands, ions, and post-translational modifications can all be modeled together within a single framework. This generality comes from a substantial architectural change rather than an incremental improvement, and it addresses one of the central limitations of its predecessor.
The shift matters in practice because biological function rarely arises from isolated proteins. Enzymes bind substrates, transcription factors grip DNA, ribosomes coordinate RNA and protein, and drug candidates must fit binding pockets shaped by protein flexibility. AlphaFold 3 lets researchers model these interactions computationally at atomic detail, which can shorten the experimental iteration cycle in structural biology and drug discovery.
AlphaFold 3 was released alongside the AlphaFold Server, a free web interface for non-commercial use that lets researchers without local GPU infrastructure submit jobs and receive predicted structures. The inference code was published later, in November 2024, in the Google DeepMind GitHub repository, with model weights available on request for non-commercial use.
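For local runs with the released code, a prediction job is described in a JSON file. The sketch below builds one in Python for a made-up protein plus ATP complex. The field names (name, modelSeeds, sequences, protein, ligand, ccdCodes, dialect, version) are assumptions based on the input format documented in the AlphaFold 3 repository and should be checked against the current documentation. The helper build_job, the job name, and the sequence are hypothetical and exist only for illustration.

```python
import json


def build_job(name, protein_sequence, ligand_ccd_code, seeds=(1,)):
    """Return a job description as a plain dict (hypothetical helper).

    Field names are assumed from the repository's documented input
    format; verify them before use.
    """
    return {
        "name": name,
        "modelSeeds": list(seeds),  # one prediction run per seed
        "sequences": [
            # One protein chain, identified as chain "A".
            {"protein": {"id": "A", "sequence": protein_sequence}},
            # One ligand, given by its Chemical Component Dictionary code.
            {"ligand": {"id": "L", "ccdCodes": [ligand_ccd_code]}},
        ],
        "dialect": "alphafold3",
        "version": 1,
    }


# Illustrative values only: the sequence is not a real target.
job = build_job("toy_protein_atp", "MKTAYIAKQRQISFVKSHFSRQ", "ATP")
job_json = json.dumps(job, indent=2)
print(job_json)
```

Keeping the job as a plain dictionary makes it easy to generate many variants, for example one file per candidate ligand, before serializing each to disk.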
AlphaFold 3's architecture consists of a large trunk that builds single (per-token) and pair representations of the input complex, followed by a diffusion module that generates raw atom coordinates. The Evoformer of AlphaFold 2 is replaced by the Pairformer, which updates the single and pair representations using triangle multiplicative updates and triangle attention, with SwiGLU replacing the ReLU activations used in AlphaFold 2's transition layers. MSA processing is retained but reduced to a small MSA module of four blocks that feeds into the pair representation; the MSA representation itself is not carried through the rest of the trunk. The diffusion module treats structure prediction as a denoising problem: during training the network learns to recover true coordinates from noised ones, and at inference it starts from random atom positions and iteratively removes noise, conditioned on the pair and single representations, until it converges on a 3D structure.
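The denoising loop can be sketched in a few lines of Python. This is a toy, not the AlphaFold 3 sampler. The trained network is replaced by a hypothetical oracle, toy_denoiser, that simply returns known target coordinates. The noise schedule uses a common Karras-style form with illustrative constants, and the update is a plain deterministic Euler step. Conditioning on trunk representations and any additional stochastic steps are omitted.

```python
import random


def noise_schedule(n_steps, sigma_max=160.0, sigma_min=0.0004, rho=7.0):
    """Decreasing noise levels, with a final level of exactly 0.

    Constants are illustrative assumptions, not taken from the source text.
    """
    hi, lo = sigma_max ** (1 / rho), sigma_min ** (1 / rho)
    sigmas = [(hi + i / (n_steps - 1) * (lo - hi)) ** rho for i in range(n_steps)]
    return sigmas + [0.0]


def toy_denoiser(noisy_coords, sigma, target_coords):
    """Hypothetical stand-in for the trained network.

    A real model would predict clean coordinates from the noisy ones and
    the trunk's single/pair representations; this oracle just returns
    the known answer.
    """
    return [list(atom) for atom in target_coords]


def sample_structure(target_coords, n_steps=20, seed=0):
    """Start from random positions and iteratively remove noise."""
    rng = random.Random(seed)
    sigmas = noise_schedule(n_steps)
    # Initial coordinates: pure Gaussian noise at the highest noise level.
    coords = [[rng.gauss(0.0, sigmas[0]) for _ in range(3)] for _ in target_coords]
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        denoised = toy_denoiser(coords, sigma, target_coords)
        # Euler step from noise level sigma down to sigma_next.
        scale = (sigma_next - sigma) / sigma
        coords = [
            [x + scale * (x - d) for x, d in zip(atom, clean)]
            for atom, clean in zip(coords, denoised)
        ]
    return coords


# Three made-up atom positions (in angstroms) as the "true" structure.
target = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0]]
predicted = sample_structure(target)
```

With the oracle denoiser, each step shrinks the remaining error by the ratio of consecutive noise levels, so the final step to zero noise lands on the target. With a learned denoiser the same loop would only approximate the true structure, and different random seeds would give different samples.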
Training data draws from the Protein Data Bank (PDB), with expanded inclusion of small-molecule ligands and nucleic acid structures. The model is trained to predict all heavy atoms across all molecule types from a unified token representation: each standard amino acid residue or nucleotide is a single token, while ligands and modified residues are represented with one token per heavy atom. Input features include MSA-derived evolutionary information, template structures, and a chemical description of each molecular entity. The model was evaluated against held-out PDB structures and dedicated benchmark sets, including the PoseBusters benchmark (version 1) for ligand docking and standard multimer benchmarks for protein complex accuracy.
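The unified token representation can be illustrated with a small sketch. It assumes the tokenization scheme in which a standard residue or nucleotide becomes one token and every heavy atom of a ligand or modified residue becomes its own token. The Component and Token classes, the tokenize function, and the example atom lists are hypothetical and are not taken from the AlphaFold 3 code.

```python
from dataclasses import dataclass
from typing import List

STANDARD_POLYMER_TYPES = {"protein", "dna", "rna"}


@dataclass
class Component:
    """One residue, nucleotide, or ligand in the input (hypothetical)."""
    kind: str                  # "protein", "dna", "rna", or "ligand"
    name: str                  # e.g. "ALA", "SEP", "ACT"
    heavy_atoms: List[str]     # names of the non-hydrogen atoms
    is_modified: bool = False  # True for non-standard residues


@dataclass
class Token:
    """A token and the atoms it is responsible for (hypothetical)."""
    component_name: str
    atoms: List[str]


def tokenize(components):
    """Map components to tokens under the assumed scheme."""
    tokens = []
    for comp in components:
        if comp.kind in STANDARD_POLYMER_TYPES and not comp.is_modified:
            # Standard residue or nucleotide: one token owns all its atoms.
            tokens.append(Token(comp.name, list(comp.heavy_atoms)))
        else:
            # Ligand or modified residue: one token per heavy atom.
            tokens.extend(Token(comp.name, [atom]) for atom in comp.heavy_atoms)
    return tokens


# Illustrative input: alanine, phosphoserine (modified), and an acetate ligand.
components = [
    Component("protein", "ALA", ["N", "CA", "C", "O", "CB"]),
    Component("protein", "SEP",
              ["N", "CA", "C", "O", "CB", "OG", "P", "O1P", "O2P", "O3P"],
              is_modified=True),
    Component("ligand", "ACT", ["C", "O", "OXT", "CH3"]),
]
tokens = tokenize(components)
```

Per-atom tokens let the same network handle arbitrary chemistry without a fixed residue vocabulary, at the cost of longer token sequences for large ligands.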
AlphaFold 3 is directly applicable to structure-based drug discovery, where predicting how a small molecule binds to a protein target can guide medicinal chemistry campaigns. It is equally useful for studying protein-nucleic acid interactions, such as CRISPR-Cas9 guide RNA binding or transcription factor-DNA recognition, which AlphaFold 2 could not model. Structural biologists use AlphaFold 3 predictions as molecular replacement search models and as hypotheses for designing mutagenesis experiments. The AlphaFold Server makes these capabilities accessible to wet-lab researchers who do not maintain dedicated compute infrastructure, lowering the barrier to hypothesis-driven structural work in academic and clinical research.
AlphaFold 3 substantially expands the scope of accurate computational structural biology from isolated proteins to the broader universe of biomolecular complexes. Its performance on protein-ligand docking benchmarks drew immediate attention from the pharmaceutical industry, where docking accuracy directly influences hit identification and lead optimization. The model's release prompted debate about access and openness: the paper initially appeared without code or weights, the server is free but limited to non-commercial use, and the later code release carried restrictions that the research community noted as a departure from the fully open release of AlphaFold 2. A key limitation is that diffusion-based sampling can occasionally produce physically implausible structures, such as atomic clashes or incorrect chirality, and can generate spurious ordered structure in highly flexible or disordered regions. Accuracy on complex RNA tertiary structures also remains limited: in the paper's CASP15 comparison, AlphaFold 3 did not match the best human-expert-aided RNA predictions. Nonetheless, AlphaFold 3 marks a significant step toward a general-purpose structural biology engine.
Sources:
Abramson, J., et al. (2024). Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630, 493–500. DOI: 10.1038/s41586-024-07487-w