A Schrödinger-bridge diffusion model for virtual multiplex staining that translates standard H&E histology into multiplex immunohistochemistry images.
SMILE (Schrödinger-bridge for Multiplex ImmunoLabel Estimation) is a generative diffusion model that performs virtual multiplex staining, computationally translating standard hematoxylin and eosin (H&E) histology images into multiplex immunohistochemistry (mIHC) images. Acquiring true mIHC is expensive, slow, and consumes tissue, while H&E is the cheap, ubiquitous workhorse of pathology. SMILE aims to recover protein-level molecular signal directly from archival H&E slides, turning routine brightfield images into a proxy for multiplexed protein readouts.
The model was developed at Johns Hopkins University and posted to bioRxiv in April 2026. Its primary demonstration targets the pancreatic islet across stages of type 1 diabetes progression, predicting insulin, glucagon, and CD3 staining from H&E. This is a clinically meaningful setting: islet composition and immune infiltration are central to understanding beta-cell loss in type 1 diabetes, yet are difficult to profile at scale from limited donor tissue.
SMILE's key methodological choice is the Schrödinger bridge. Conventional diffusion models reach the target image only by way of an intermediate Gaussian-noise distribution. A Schrödinger-bridge formulation instead learns a direct stochastic transport between the two image domains, so the generative process starts from the source image itself. Avoiding the noise bottleneck is reported to better preserve fine tissue structure during image-to-image translation, which matters when the spatial registration between predicted protein signal and underlying morphology must be trusted.
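The contrast between the two formulations can be illustrated with a minimal NumPy sketch. It is not SMILE's implementation, which the source does not describe at this level. It assumes a Brownian bridge pinned at a paired target and source image, one simple instance of a Schrödinger bridge between fixed endpoints, and a simplified variance-preserving noising schedule for comparison. The function names, the `sigma` parameter, and the random stand-in patches are all hypothetical.

```python
import numpy as np

def bridge_sample(x_target, x_source, t, sigma=1.0, rng=None):
    """Sample x_t on a Brownian bridge pinned at x_target (t=0) and x_source (t=1).

    Assumption: a plain Brownian bridge with noise scale `sigma`; the source
    text does not specify the bridge SMILE actually uses.
    """
    rng = np.random.default_rng() if rng is None else rng
    mean = (1.0 - t) * x_target + t * x_source      # interpolates the two images
    std = sigma * np.sqrt(t * (1.0 - t))            # vanishes at both endpoints
    return mean + std * rng.standard_normal(x_target.shape)

def gaussian_diffusion_sample(x_target, t, rng=None):
    """Simplified variance-preserving noising, for contrast: t=1 is pure noise."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.standard_normal(x_target.shape)
    return np.sqrt(1.0 - t) * x_target + np.sqrt(t) * noise

rng = np.random.default_rng(0)
he = rng.random((3, 64, 64))    # stand-in for an H&E patch (hypothetical data)
mihc = rng.random((3, 64, 64))  # stand-in for the paired mIHC patch

# The bridge endpoint at t=1 is the H&E image itself, so tissue structure is
# present at the start of generation; the Gaussian endpoint carries none of it.
bridge_end = bridge_sample(mihc, he, 1.0, rng=rng)
noise_end = gaussian_diffusion_sample(mihc, 1.0, rng=rng)
bridge_mid = bridge_sample(mihc, he, 0.5, sigma=0.1, rng=rng)
```

In this sketch a learned model would be trained to step from `bridge_mid` toward the mIHC endpoint, whereas a conventional diffusion model would have to reconstruct all structure starting from `noise_end`.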
SMILE is built on a Schrödinger-bridge diffusion framework for paired image-to-image translation. Training used a large, purpose-built cohort of high-fidelity H&E–mIHC image pairs from human pancreatic organ donors, with insulin, glucagon, and CD3 as targets. The dataset was deliberately sampled across type 1 diabetes status, pancreatic anatomical location, donor age, and sex to support generalization. Performance was benchmarked against generative adversarial network (GAN) approaches using a composite evaluation framework that combines texture metrics, distributional comparisons, and antibody-specific measures, complemented by blinded pathologist assessment; SMILE outperformed the GAN baselines. The authors further report that a fixed checkpoint transfers across acquisition sites and to breast cancer tissue, indicating cross-site and cross-tissue robustness.
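The source does not name the specific metrics in the composite evaluation, so the following is only a hypothetical sketch of what a distributional comparison and an antibody-specific measure could look like for a predicted versus real mIHC patch. It assumes channel-first arrays scaled to [0, 1], uses the 1-D Wasserstein distance between per-channel intensity distributions, and uses the absolute error in positive-area fraction under an arbitrary threshold. All function names and the threshold value are illustrative choices.

```python
import numpy as np

def wasserstein_1d(a, b):
    """1-D Wasserstein-1 distance between two equal-sized intensity samples."""
    a = np.sort(np.ravel(a))
    b = np.sort(np.ravel(b))
    if a.size != b.size:
        raise ValueError("this sketch assumes equal sample counts")
    # For equal-sized empirical distributions, W1 is the mean gap between
    # the sorted samples.
    return float(np.mean(np.abs(a - b)))

def positive_fraction(channel, threshold=0.5):
    """Fraction of pixels called marker-positive (threshold is an assumption)."""
    return float(np.mean(np.asarray(channel) > threshold))

def compare_channels(pred, real, names, threshold=0.5):
    """Per-marker report for channel-first arrays of identical shape."""
    report = {}
    for i, name in enumerate(names):
        pred_frac = positive_fraction(pred[i], threshold)
        real_frac = positive_fraction(real[i], threshold)
        report[name] = {
            "w1": wasserstein_1d(pred[i], real[i]),
            "pos_frac_abs_err": abs(pred_frac - real_frac),
        }
    return report

rng = np.random.default_rng(0)
real = rng.random((3, 32, 32))  # stand-in for a real mIHC patch
pred = np.clip(real + 0.05 * rng.standard_normal(real.shape), 0.0, 1.0)
report = compare_channels(pred, real, ("insulin", "glucagon", "CD3"))
```

Measures of this kind capture only intensity statistics and stained area, which is one reason a composite framework would pair them with texture metrics and blinded expert review.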
SMILE targets digital and computational pathology, where it can convert large archives of routine H&E slides into estimated multiplexed protein maps without additional immunolabeling. In type 1 diabetes research, it enables high-throughput profiling of islet endocrine composition (insulin, glucagon) and T-cell infiltration (CD3) across donor cohorts, helping researchers study disease progression at scale from tissue that is scarce and irreplaceable. Its cross-tissue transfer to breast cancer suggests broader applicability to oncology and other settings where mIHC is informative but costly, benefiting pathologists, tissue-atlas efforts, and translational researchers.
By recovering protein-level signal from the most widely available histology stain, SMILE points toward a scalable route for retrospective proteomic analysis of existing pathology archives, lowering the cost and tissue burden of multiplexed imaging. The demonstration that a structure-preserving Schrödinger-bridge diffusion model outperforms GANs on quantitative metrics and blinded pathologist review, and generalizes across sites and tissues from a single checkpoint, adds to growing evidence that diffusion-based virtual staining is maturing into a practical tool. SMILE is a recent preprint, and no public code or weights have been released yet. Its predictions, like all virtual staining output, require careful validation before any diagnostic use.