Northwestern Polytechnical University
Viability-guided diffusion model for de novo design of AAV capsid protein sequences, using a gradient-guided viability classifier to bias generation toward assemblable, packaging-competent variants.
AAVDiffusion is a generative model for designing the protein sequences of adeno-associated virus (AAV) capsids, the protein shells that determine how AAV gene-therapy vectors assemble, package their DNA cargo, and target tissues. Engineering AAV capsids is a central problem in gene therapy because small changes to the capsid surface can dramatically alter tropism and immune evasion. A persistent obstacle is that randomly mutated or naively generated capsid libraries are dominated by nonviable variants that fail to assemble or to package DNA, wasting experimental throughput.
Developed at Northwestern Polytechnical University (Key Laboratory of Big Data Storage and Management) and posted to bioRxiv in January 2026, AAVDiffusion addresses this by making viability an explicit objective during generation. It is a diffusion model that iteratively denoises Gaussian vectors into AAV capsid protein sequences, with a gradient-based viability classifier steering the sampling trajectory toward sequences predicted to be assemblable and packaging-competent. This places it alongside earlier diffusion-based capsid designers (such as AAVDiff) while emphasizing viability-guided sampling.
A key practical advantage is that the model requires no per-target retraining: the same generative process can be guided toward different objectives at inference time. In their computational study, the authors used a selection workflow built on the model to identify 196 candidate capsids predicted to have the potential to cross the blood-brain barrier, a property of strong interest for central-nervous-system gene therapy.
AAVDiffusion is a sequence-generating diffusion model: it learns to reverse a Gaussian noising process over vector representations of AAV capsid protein sequences, producing intermediate latent variables that are progressively denoised into candidate sequences. Generation is steered by a viability classifier whose gradients are injected into the sampling process, a classifier-guidance scheme that pushes samples toward the predicted-viable region of sequence space. In computational benchmarks the authors report that AAVDiffusion outperforms baselines at generating viable AAV sequences. The work is computational; wet-lab validation of the generated capsids is not reported in the preprint, and weights and code are not released at preprint time.
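The classifier-guidance idea described above can be illustrated with a small, self-contained sketch. Because the preprint's code and weights are not released, this is not the authors' implementation. Everything in it is an assumption made for illustration: the denoiser is a closed-form stand-in that is exact only when clean vectors are standard Gaussian, the "viability classifier" is a random logistic model, the noise schedule is linear, and sequences are decoded by a per-position argmax over 20 amino-acid channels. The names toy_noise_pred, viability_log_prob_grad, sample, and decode are hypothetical. What the sketch does show is the mechanism: at each reverse step the denoising mean is shifted by the scaled gradient of the classifier's log-probability.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def toy_noise_pred(x_t, abar_t):
    # Stand-in for a trained denoiser (assumption): if clean data are N(0, I),
    # the optimal noise prediction is sqrt(1 - abar_t) * x_t.
    return np.sqrt(1.0 - abar_t) * x_t


def viability_log_prob_grad(x, w, b):
    # Hypothetical logistic "viability" classifier p(viable | x) = sigmoid(w.x + b).
    # Its log-probability gradient w.r.t. x is (1 - p) * w.
    p = sigmoid(x @ w + b)
    return (1.0 - p)[:, None] * w[None, :]


def sample(n, dim, w, b, guidance_scale, steps=100, seed=0):
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.05, steps)  # assumed linear schedule
    alphas = 1.0 - betas
    abars = np.cumprod(alphas)
    x = rng.normal(size=(n, dim))  # start from pure Gaussian noise
    for t in range(steps - 1, -1, -1):
        eps = toy_noise_pred(x, abars[t])
        # Standard DDPM reverse-step mean.
        mean = (x - betas[t] / np.sqrt(1.0 - abars[t]) * eps) / np.sqrt(alphas[t])
        # Classifier guidance: shift the mean by scale * variance * grad log p.
        # The gradient is evaluated at the current sample as a simplification.
        mean = mean + guidance_scale * betas[t] * viability_log_prob_grad(x, w, b)
        noise = rng.normal(size=x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x


def decode(x, length):
    # Toy decoding (assumption): argmax over 20 channels per position.
    idx = x.reshape(-1, length, len(AMINO_ACIDS)).argmax(axis=-1)
    return ["".join(AMINO_ACIDS[i] for i in row) for row in idx]


length = 8
dim = length * len(AMINO_ACIDS)
w = np.random.default_rng(1).normal(size=dim) / np.sqrt(dim)
b = 0.0

unguided = sample(256, dim, w, b, guidance_scale=0.0)
guided = sample(256, dim, w, b, guidance_scale=5.0)

p_unguided = sigmoid(unguided @ w + b).mean()
p_guided = sigmoid(guided @ w + b).mean()
print(f"mean predicted viability, unguided: {p_unguided:.2f}")
print(f"mean predicted viability, guided:   {p_guided:.2f}")
print(decode(guided[:2], length))
```

In this toy setting the guided samples receive a higher mean score from the classifier than the unguided ones, which mirrors the behavior the authors describe only qualitatively. Swapping in a different classifier gradient while leaving the denoiser untouched is also how a single trained generator could, in principle, be steered toward other objectives at inference time.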
AAVDiffusion targets gene-therapy vector engineers seeking to expand AAV capsid libraries with a higher fraction of functional variants. By front-loading viability prediction into the generative process, it can reduce the experimental burden of screening assembly- and packaging-defective designs and can be steered toward delivery-relevant properties such as blood-brain-barrier crossing, making it a candidate in-silico filter ahead of library synthesis and wet-lab characterization.
AAVDiffusion contributes to a growing line of generative models for AAV capsid engineering by foregrounding viability-guided sampling, addressing the practical bottleneck that most generated capsids are nonviable. Its significance is currently tempered by its status as a computational study with no reported experimental validation and no released code or weights, so the real-world hit rate of its designs, including the 196 BBB-crossing candidates, remains to be confirmed in the laboratory. As a focused, gene-therapy-specific design tool, it nonetheless illustrates how property classifiers can be coupled to diffusion generators for viral vector design.