A diffusion model that generates 3D small molecules conditioned on protein pockets and partial fragments via continuous spatial density-map conditioning.
Sesame (Spatial Evoformer for a Structure-Aware Molecular Engine) is a diffusion-based generative model for 3D small molecules that conditions generation on both a target protein pocket and any partial molecular structure a chemist wants to keep. Developed by Tessel Biosciences, a Cambridge, Massachusetts startup, and released as an arXiv preprint in June 2026, it addresses a core problem in structure-based drug design (SBDD): generating chemically valid, synthetically reasonable molecules that fit a binding site while respecting fragments a medicinal chemist has already committed to.
The model's central novelty is a single conditioning mechanism that unifies two tasks usually handled by separate pipelines. Rather than encoding the pocket and a seed fragment as discrete graphs or point sets, Sesame represents both as continuous spatial density maps: volumetric fields encoding properties such as charge, hydrophobicity, hydrogen-bond donors and acceptors, aromaticity, and van der Waals potential. A spatial pairformer module (the "spatial Evoformer" of the name) reads these maps to steer the diffusion process. Because pocket conditioning and fragment conditioning flow through the same channel, one trained model performs both de novo generation and fragment-conditioned lead optimization (scaffold growing).
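To make the idea of a continuous density map concrete, the sketch below evaluates a six-channel field at an arbitrary 3D point by Gaussian smearing of per-atom property values. This is only an illustration of the general technique, not the preprint's implementation: the Gaussian kernel, the `sigma` width, the channel order, and the names `Atom`, `CHANNELS`, and `density_at` are all assumptions introduced here.

```python
import math
from dataclasses import dataclass

# Six properties named in the text; treating donors and acceptors as separate
# channels, and this particular ordering, are assumptions for illustration.
CHANNELS = (
    "charge",
    "hydrophobicity",
    "hbond_donor",
    "hbond_acceptor",
    "aromaticity",
    "vdw",
)


@dataclass(frozen=True)
class Atom:
    """Hypothetical container: a 3D position plus one value per channel."""
    position: tuple
    features: tuple


def density_at(point, atoms, sigma=1.0):
    """Evaluate the multi-channel density field at one 3D point.

    Each atom contributes its feature values weighted by an isotropic
    Gaussian of width `sigma` (an assumed kernel, not taken from the paper).
    """
    field = [0.0] * len(CHANNELS)
    for atom in atoms:
        dist_sq = sum((p - a) ** 2 for p, a in zip(point, atom.position))
        weight = math.exp(-dist_sq / (2.0 * sigma ** 2))
        for channel, value in enumerate(atom.features):
            field[channel] += weight * value
    return field


# Pocket atoms and fragment atoms pass through the same function, which is
# the sense in which both kinds of conditioning share one channel.
pocket = [Atom((0.0, 0.0, 0.0), (-1.0, 0.0, 0.0, 1.0, 0.0, 1.0))]
fragment = [Atom((3.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0, 1.0, 1.0))]
query = (0.0, 0.0, 0.0)
pocket_field = density_at(query, pocket)
fragment_field = density_at(query, fragment)
```

Because the field is defined at every point in space, a downstream module can sample it wherever it likes instead of being tied to atom positions, which is what distinguishes this representation from a discrete graph or point set.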
Sesame enters a crowded SBDD landscape alongside models such as DiffSBDD, Pocket2Mol, and TargetDiff, but distinguishes itself through its shared density-map conditioning and a trajectory-finetuning scheme that trains on the model's own sampling rollouts to sharpen output quality.
Sesame is built on a MoleculePairformer, a 24-layer, 4-head transformer with a single-representation dimension of 384 and a pair-representation dimension of 128, that ingests the 6-channel density maps by sampling 1,024 points per layer. The forward diffusion process maintains three independent noise channels for position, atom type, and bond type, with optimal-assignment re-pairing of coordinates; inference runs as a 20-step reverse trajectory from a fixed pretrained checkpoint. Pretraining draws on two corpora: roughly 15 billion ligand-only compounds from ZINC22 (filtered to 3–50 heavy atoms) and about 8 million protein-ligand complexes from the SAIR (Structurally-Augmented IC50 Repository) dataset. In the authors' internal ablations, the finetuned model reaches roughly 88.7% molecular validity for de novo (protein-only) generation and 92.4–94.8% validity in fragment-conditioned (protein-plus-fragment) mode, with fragment retention near 95% and 950 of 964 single-fragment cases (about 98.5%) preserving connectivity. The preprint reports no head-to-head comparisons against external SBDD baselines, so these figures cannot yet be placed against competing models.
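The hyperparameters quoted above can be collected into a small configuration object, which also makes the arithmetic behind the connectivity figure explicit. This is a sketch only: the class name `PairformerConfig`, its field names, and the helper `connectivity_rate` are hypothetical, and the per-head width assumes the single representation is split evenly across attention heads, which the source does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairformerConfig:
    """Values are those quoted in the text; field names are hypothetical."""
    num_layers: int = 24
    num_heads: int = 4
    single_dim: int = 384
    pair_dim: int = 128
    density_channels: int = 6
    points_per_layer: int = 1024
    reverse_steps: int = 20

    @property
    def head_dim(self) -> int:
        # Assumption: the single representation is divided evenly among heads.
        if self.single_dim % self.num_heads != 0:
            raise ValueError("single_dim must be divisible by num_heads")
        return self.single_dim // self.num_heads


def connectivity_rate(preserved: int, total: int) -> float:
    """Fraction of single-fragment cases whose connectivity was preserved."""
    if total <= 0:
        raise ValueError("total must be positive")
    return preserved / total


config = PairformerConfig()
rate = connectivity_rate(950, 964)
print(config.head_dim)   # 96 under the even-split assumption
print(f"{rate:.1%}")     # 98.5%
```

Keeping the reported numbers in one frozen object makes it harder to mix up the single and pair dimensions when reading or re-implementing the description.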
Sesame targets structure-based drug discovery, where a known or predicted protein pocket guides the search for binders. Computational and medicinal chemists can use it to enumerate de novo candidates that complement a pocket's physicochemical landscape, or—more distinctively—to perform lead optimization by fixing a privileged fragment or scaffold and growing it into the surrounding site. The unified conditioning makes it well suited to iterative hit-to-lead campaigns, where a chemist repeatedly prunes and regrows a molecule while preserving the substructure responsible for binding.
As a June 2026 preprint from an early-stage company, Sesame has no established downstream influence yet, and its main significance is conceptual: collapsing pocket conditioning and fragment conditioning into a single continuous density-map channel that serves both de novo design and scaffold growing. Adoption is currently limited by a lack of openness: no code, pretrained weights, model card, or data card has been released, and the reported benchmarks are internal ablations without external SBDD comparisons, so independent validation is not yet possible. If weights or a comparable benchmark are released, the density-map conditioning approach could prove a useful template for unifying generative SBDD workflows; until then the work is best read as a promising methodological proposal.
Yatsenko, K. & Thiagarajan, A. (2026) Sesame: Structure-Aware Molecular Generation via Spatial Density-Map Conditioning. arXiv.
DOI: 10.48550/arXiv.2606.23856