FutureHouse / Mila / McGill University
Multimodal diffusion model that co-designs protein sequence and 3D structure around arbitrary biomolecules, demonstrated by designing novel heme enzymes catalyzing carbene-transfer reactions.
DISCO is a multimodal diffusion model that jointly designs protein sequence and three-dimensional structure conditioned on arbitrary non-protein biomolecules (small molecules, cofactors, DNA, RNA). It was released as an arXiv preprint in April 2026 from a collaboration between FutureHouse, Mila, and McGill University. DISCO is notable as the first model to co-design sequence and structure around reactive intermediates rather than ground-state substrates, which enables de novo design of enzymes that catalyze new-to-nature carbene-transfer reactions.
Experimental validation in the preprint demonstrates designed heme enzymes with carbene-transfer activity exceeding what has been achieved through directed evolution, suggesting that intermediate-state conditioning unlocks design solutions that ground-state-constrained methods cannot reach.
DISCO uses a diffusion architecture trained on a multi-task corpus combining the PDB, ligand-bound co-crystal structures, and a curated set of enzyme transition-state geometries. The diffusion process operates jointly on amino acid identity and atomic coordinates, with classifier-free guidance for context conditioning. The arXiv preprint reports architectural details, training data, and ablation studies.
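As a rough illustration of classifier-free guidance applied jointly to the two modalities, the sketch below blends unconditional and context-conditioned predictions for both atomic coordinates and amino acid logits. It uses the standard guidance formula (unconditional prediction plus a scaled difference toward the conditional one); whether DISCO combines its predictions in exactly this way is an assumption, and the function names, array shapes, and `guidance_scale` parameter are hypothetical rather than taken from the preprint.

```python
import numpy as np


def cfg_combine(uncond, cond, guidance_scale):
    """Standard classifier-free guidance blend of two predictions.

    guidance_scale = 0 returns the unconditional prediction,
    guidance_scale = 1 returns the conditional prediction,
    guidance_scale > 1 extrapolates past the conditional prediction.
    """
    uncond = np.asarray(uncond, dtype=float)
    cond = np.asarray(cond, dtype=float)
    return uncond + guidance_scale * (cond - uncond)


def guided_predictions(coord_uncond, coord_cond,
                       logit_uncond, logit_cond,
                       guidance_scale=2.0):
    """Apply the same guidance scale to both modalities (an assumption).

    coord_*: (n_atoms, 3) coordinate predictions from a hypothetical denoiser.
    logit_*: (n_residues, 20) amino acid logits from the same denoiser.
    Returns guided coordinates and per-residue amino acid probabilities.
    """
    coords = cfg_combine(coord_uncond, coord_cond, guidance_scale)
    logits = cfg_combine(logit_uncond, logit_cond, guidance_scale)
    # Numerically stable softmax over the amino acid axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    probs = exp / exp.sum(axis=-1, keepdims=True)
    return coords, probs
```

In a real sampler, the conditional and unconditional inputs would come from two forward passes of the same network, one with the ligand or intermediate context and one with the context dropped; here they are plain arrays so the blending step can be inspected in isolation.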
Experimental validation centers on a panel of designed heme-binding scaffolds tested for carbene-transfer activity in vitro. Reported turnover numbers exceed directed-evolution results from the published literature for the same reaction class.
DISCO is applicable to enzyme design problems where the desired chemistry is poorly represented in nature or where directed evolution cannot reach sufficient activity. Beyond carbene transfer, the framework is intended for any reaction whose mechanism can be specified through a transition-state geometry, a broad class spanning C-H activation, nitrene transfer, cycloadditions, and unnatural amino acid incorporation. The model also supports cofactor-binding protein design (NAD, FAD, metal-coordinated enzymes) and nucleic-acid binder design through the same conditioning interface.
DISCO advances the state of the art in computational enzyme design by formalizing transition-state conditioning as a generative-model interface. The reported activities of the designed carbene-transfer enzymes exceed directed-evolution baselines, which suggests that integrated sequence-structure co-design, properly conditioned, can outperform iterative experimental optimization for some reaction classes. As an arXiv-only release at this stage, the work is pending peer review, but the experimental validation provides a strong empirical anchor.