Energy-based model of protein conformational space derived from diffusion-model likelihoods, usable as a universal statistical potential for many tasks.
ProteinEBM is an energy-based model of protein conformational space that turns a trained protein diffusion model into a universal statistical potential. Generative diffusion models for protein structure are typically used to sample new conformations; ProteinEBM instead shows that the likelihood implied by such a model defines a smooth, differentiable free-energy landscape over protein structures. From a single fixed pretrained model, that landscape can be queried for many different structural-biology tasks without task-specific retraining.
The accompanying preprint, "Protein Diffusion Models as Statistical Potentials," by James P. Roney, Chenxi Ou, and Sergey Ovchinnikov at MIT, was posted to bioRxiv in December 2025. Its central contribution is conceptual as much as practical: it connects denoising diffusion training to classical statistical potentials, learning an energy function over conformations without requiring equivariant network architectures.
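The connection between denoising training and an energy function can be sketched with standard identities from the score-matching literature. The notation below is assumed for illustration and is not taken from the preprint: Gaussian (variance-exploding) noising with level \sigma_t, an energy E_\theta parameterized by a network, and no per-noise-level loss weighting. It shows that an energy whose negative gradient matches the denoising direction is, up to a constant, the negative log-density of the noised data.

```latex
% Assumed notation: x_0 is a clean structure, x_t its noised version,
% \sigma_t the noise level, E_\theta a learned scalar energy.
\begin{align}
x_t &= x_0 + \sigma_t \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, I) \\
E_t(x) &= -\log p_t(x) + \text{const}
  \quad\Longrightarrow\quad \nabla_x \log p_t(x) = -\nabla_x E_t(x) \\
\nabla_{x_t} \log p_t(x_t) &= \frac{\mathbb{E}[x_0 \mid x_t] - x_t}{\sigma_t^2}
  \quad \text{(Tweedie's formula)} \\
\mathcal{L}(\theta) &= \mathbb{E}_{x_0, \varepsilon, t}
  \left\| \nabla_{x_t} E_\theta(x_t, t) - \frac{x_t - x_0}{\sigma_t^2} \right\|^2
\end{align}
```

Minimizing the last line is ordinary denoising score matching with the score written as -\nabla E_\theta. At small noise levels the learned energy then approximates -\log p(x) over clean structures, which is the sense in which it behaves like a statistical potential.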
Because the resulting energy is differentiable and defined everywhere in conformational space, ProteinEBM can rank candidate structures, score the energetic effects of mutations, sample conformational ensembles, predict structures, and even trace folding pathways, all from one model. Across these tasks the authors report performance competitive with or exceeding prior machine-learning and physics-based methods.
ProteinEBM is built on a denoising diffusion model trained on protein structures. The authors show that the diffusion training objective yields an energy-based model whose energies approximate a statistical potential over conformational space. Because the energy is differentiable, its gradients can drive optimization, sampling, and landscape exploration. The same fixed model is applied to structure ranking, mutation-effect prediction, conformational-landscape sampling, structure prediction, and folding-pathway simulation. On these tasks the authors report results competitive with or exceeding previous machine-learning and physics-based potentials, which they frame as a step toward physically grounded learned models for protein science.
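To make the role of energy gradients concrete, here is a minimal sketch of gradient-based minimization and Langevin sampling on a toy landscape. Everything in it is an assumption for illustration: `toy_energy` is a hypothetical one-dimensional double well standing in for a learned energy, its gradient is written by hand where a real model would use automatic differentiation, and the step sizes and temperature are arbitrary. It is not the ProteinEBM implementation.

```python
import math
import random


def toy_energy(x):
    """Hypothetical stand-in for a learned energy: a 1-D double well with minima at -1 and +1."""
    return (x * x - 1.0) ** 2


def toy_energy_grad(x):
    """Analytic gradient of toy_energy (a real model would obtain this by autodiff)."""
    return 4.0 * x * (x * x - 1.0)


def gradient_descent(x, steps=200, lr=0.05):
    """Follow the negative energy gradient to a nearby local minimum."""
    for _ in range(steps):
        x -= lr * toy_energy_grad(x)
    return x


def langevin_sample(x, steps=2000, step_size=0.01, temperature=0.2, seed=0):
    """Unadjusted Langevin dynamics: a gradient step plus Gaussian noise scaled by temperature."""
    rng = random.Random(seed)
    noise_scale = math.sqrt(2.0 * step_size * temperature)
    trajectory = []
    for _ in range(steps):
        x = x - step_size * toy_energy_grad(x) + noise_scale * rng.gauss(0.0, 1.0)
        trajectory.append(x)
    return trajectory


minimum = gradient_descent(0.3)   # optimization: slides into the well at x = +1
samples = langevin_sample(0.3)    # sampling: fluctuates around low-energy regions
```

The same two loops apply to any differentiable energy: descent refines a structure toward a local minimum, while the added noise lets the sampler explore the landscape at a chosen temperature instead of settling into one basin.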
ProteinEBM is relevant to structural biologists and protein engineers who need to rank predicted or designed structures, estimate the stability effects of mutations, or generate conformational ensembles beyond a single static prediction. Its differentiable energy makes it a candidate scoring component within protein-design and structure-refinement pipelines, and its folding-pathway simulations may interest researchers studying protein dynamics and folding mechanisms.
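As an illustration of using an energy as a scoring component, the sketch below ranks candidate structures by energy and converts the energies into Boltzmann weights. The candidate names and energy values are made up for the example, the function names are hypothetical, and the temperature and energy units are arbitrary. In practice the energies would come from evaluating the learned potential on each candidate.

```python
import math


def rank_by_energy(energies):
    """Return candidate names sorted from lowest (most favorable) to highest energy."""
    return sorted(energies, key=energies.get)


def boltzmann_weights(energies, temperature=1.0):
    """Convert energies to normalized Boltzmann weights, shifting by the minimum for stability."""
    e_min = min(energies.values())
    unnormalized = {
        name: math.exp(-(e - e_min) / temperature) for name, e in energies.items()
    }
    total = sum(unnormalized.values())
    return {name: w / total for name, w in unnormalized.items()}


# Made-up energies for three hypothetical candidate structures.
candidate_energies = {"decoy_a": -12.4, "decoy_b": -15.1, "decoy_c": -9.8}

ranking = rank_by_energy(candidate_energies)
weights = boltzmann_weights(candidate_energies)
```

Subtracting the minimum energy before exponentiating leaves the normalized weights unchanged and avoids overflow when energies are large in magnitude.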
By recasting protein diffusion models as statistical potentials, ProteinEBM offers a unifying view that links generative modeling, structure scoring, and mutation analysis under a single learned energy function. This consolidation is notable because these tasks are usually handled by separate, specialized tools. ProteinEBM is a recent preprint from the Ovchinnikov lab, and public code or weights have not been confirmed, so its adoption will depend on their release and on independent reproduction. Even so, it points toward physically grounded, multi-purpose learned potentials for protein modeling.