ESMFold2 is the structure-prediction and design engine of Biohub's "world model of protein biology," released on May 27, 2026. Where the original ESMFold (Lin et al., 2023, then at Meta AI) showed that an evolutionary-scale protein language model could fold single sequences without multiple sequence alignments, ESMFold2 extends that lineage to atomically resolved 3D structures of full biomolecular complexes: proteins together with DNA, RNA, small molecules, and modified residues. It was released alongside ESMC, the language model whose sequence representations it consumes, and the ESM Atlas of predicted structures.
The model is built and maintained by Biohub, the unified entity formed from CZI Science, CZ Biohub, and the acquired EvolutionaryScale team. Rather than learning structure directly from raw sequence, ESMFold2 translates the evolutionary patterns already encoded in ESMC's embeddings into all-atom coordinates. This lets it inherit the evolutionary breadth of ESMC's training data while devoting its own capacity to geometry. Its central architectural idea is a looped transformer that applies the same blocks repeatedly, so the compute spent on a target can be scaled at inference time rather than being fixed by network depth.
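The looped-trunk idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names (`make_block`, `apply_block`, `looped_trunk`, `count_params`), the toy residual MLP that stands in for the real trunk blocks, and the tensor sizes are hypothetical and are not taken from the ESMFold2 code or preprint. The sketch shows only the property described above: the same weights are reused on every pass, so the number of passes, and with it the compute, is chosen at inference time while the parameter count stays fixed.

```python
import numpy as np


def make_block(dim, seed=0):
    """Create weights for one toy residual MLP block (hypothetical stand-in)."""
    rng = np.random.default_rng(seed)
    return {
        "w1": rng.normal(0.0, 0.02, (dim, 4 * dim)),
        "w2": rng.normal(0.0, 0.02, (4 * dim, dim)),
    }


def layer_norm(x, eps=1e-5):
    """Normalize each residue's feature vector to zero mean and unit variance."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def apply_block(params, x):
    """Pre-norm residual update: x + W2 * relu(W1 * norm(x))."""
    hidden = np.maximum(layer_norm(x) @ params["w1"], 0.0)
    return x + hidden @ params["w2"]


def looped_trunk(blocks, x, num_loops):
    """Apply the same shared blocks num_loops times; no new weights per pass."""
    for _ in range(num_loops):
        for params in blocks:
            x = apply_block(params, x)
    return x


def count_params(blocks):
    """Total number of weights, independent of how many loops are run."""
    return sum(w.size for params in blocks for w in params.values())


if __name__ == "__main__":
    # Assumed toy sizes: 8 residues, 16 features, 2 shared blocks.
    blocks = [make_block(16, seed=s) for s in range(2)]
    x = np.random.default_rng(1).normal(size=(8, 16))
    for loops in (1, 4, 16):
        out = looped_trunk(blocks, x, loops)
        # Compute grows with `loops`; the weight count does not change.
        print(loops, out.shape, count_params(blocks))
```

In this toy version, raising `num_loops` multiplies the number of block applications without touching `count_params(blocks)`, which is the sense in which a looped design decouples inference compute from network depth. Whether extra passes improve accuracy depends on training, which the sketch does not model.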
Beyond folding existing molecules, ESMFold2 is positioned as a design engine: it was used to generate novel protein binders that were then validated experimentally against disease-relevant targets, with the binder search completing in days rather than the months or years typical of conventional campaigns.
ESMFold2 is a transformer that operates on per-residue and pairwise representations derived from ESMC. Its looped trunk iterates a shared set of blocks, so additional recycling passes, and the compute they cost, can be spent to gain accuracy at inference time. It ships in two variants: the full ESMFold2 model, which can be conditioned on optional multiple sequence alignments, and ESMFold2-Fast, an inference-optimized single-sequence model. Both use a training-data cutoff of September 2021 and were trained on experimental structures from the Protein Data Bank together with predicted structures from the AlphaFold Database. Evaluation is reported on FoldBench, where ESMFold2 matches or surpasses AlphaFold 3 on antibody–antigen and general protein–protein complexes; comparisons in the accompanying preprint also place it favorably against Chai-1 (Chai Discovery) and Boltz-1 (MIT). Wet-lab validation across five disease targets (the receptor tyrosine kinases EGFR and PDGFRβ, the immune checkpoints PD-L1 and CTLA-4, and the signaling regulator CD45) produced experimentally confirmed binders, with the design loop running in days.
ESMFold2 serves structural biologists and protein engineers who need accurate complex structures and de novo binders without depending on slow MSA construction or experimental scaffolds. Its complex-prediction accuracy makes it useful for antibody–antigen modeling, protein–protein interaction studies, and structure-based interpretation of biomolecular assemblies that include nucleic acids or small-molecule ligands. As a design engine it supports therapeutic discovery in oncology and immunology, where its demonstrated ability to produce validated mini-binders and antibody-format binders against checkpoint and receptor targets compresses early discovery timelines. The inference-time compute scaling lets users invest more computation in difficult or high-value targets.
ESMFold2 is the structure-and-design pillar of Biohub's first integrated world model of protein biology, extending the original alignment-free ESMFold concept to full complexes and to generative binder design with experimental confirmation. Its open release under the MIT license lowers the barrier to high-accuracy complex prediction and binder design, an area where comparable frontier capability has often sat behind closed commercial models. The reported parity with, or advantage over, AlphaFold 3 on antibody–antigen complexes is notable given how challenging that class has been for prior folding systems. The work is documented in a preprint posted with the release. Because that preprint is distributed as a hosted PDF rather than through a DOI-issuing repository, no DOI is yet available, and the headline benchmark and wet-lab numbers should be read as preprint results pending peer review and independent replication.