ARUP Laboratories / University of Utah
A self-supervised foundation model for clinical flow cytometry that produces unified, panel-agnostic specimen-level representations from heterogeneous multi-panel data.
Clinical flow cytometry is a cornerstone of hematopathology, immunology, and the diagnosis of leukemias and lymphomas, measuring the expression of many protein markers across millions of individual cells per specimen. Yet the data is notoriously hard to analyze at scale: different laboratories and assays use different antibody panels, so each panel measures a different, only partially overlapping set of markers. This heterogeneity has historically forced machine learning models to be trained one panel at a time, fragmenting effort and preventing the kind of unified, transferable representations that have transformed other areas of biology.
EventHorizon, developed by researchers at ARUP Laboratories' Institute for Research and Innovation and the University of Utah Department of Pathology and posted to bioRxiv in June 2026, is, to the catalog's knowledge, the first self-supervised foundation model for clinical flow cytometry. It learns a single shared latent space into which specimens from many different panels can be embedded, producing panel-agnostic, specimen-level representations directly from raw multi-panel data. Rather than being trained to predict a fixed set of diagnostic labels, it is pretrained without labels and then evaluated by probing the frozen representations.
The work is notable for translating the self-supervised "foundation model" recipe into a real, high-volume clinical setting, using a corpus of over 100,000 clinical specimens spanning 17 distinct antibody panels drawn from routine diagnostic workflows.
EventHorizon is built on a two-stage hierarchical transformer. Individual cells (events) are first tokenized in a marker-aware fashion, so that markers shared across panels map to a common representation while panel-specific markers are still accommodated. A first transformer stage builds cell-level representations, and a second stage aggregates them into a single specimen-level embedding. Pretraining follows a DINO-style self-distillation scheme, in which a student network is trained to match a momentum-updated teacher across augmented views, adapted with augmentations tailored to the statistical structure of flow cytometry data. The model was pretrained on more than 100,000 clinical specimens spanning 17 distinct antibody panels. Representations were evaluated by freezing the backbone and applying lightweight k-nearest-neighbor probes, which matched the performance of fully supervised models and panel-specific self-supervised baselines on diagnostic classification tasks.
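To make the self-distillation step concrete, the following is a minimal NumPy sketch of the generic DINO recipe: a temperature-sharpened, centered teacher distribution, a cross-entropy loss against the student, and an exponential-moving-average (momentum) teacher update. It is not the authors' implementation. The function names, the temperatures, and the momentum values are illustrative assumptions, and the flow-cytometry-specific augmentations and the transformer backbone are omitted.

```python
import numpy as np

def log_softmax(logits, temperature):
    """Numerically stable log-softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def dino_loss(student_logits, teacher_logits, center,
              student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between the teacher and student output distributions.

    The teacher output is centered and sharpened with a lower temperature;
    in a real training loop no gradient flows through the teacher.
    Temperatures here are assumed, illustrative values.
    """
    teacher_probs = np.exp(log_softmax(teacher_logits - center, teacher_temp))
    student_log_probs = log_softmax(student_logits, student_temp)
    return float(-(teacher_probs * student_log_probs).sum(axis=-1).mean())

def ema_update(teacher_params, student_params, momentum=0.996):
    """Momentum update: the teacher slowly tracks the student weights."""
    return {
        name: momentum * teacher_params[name]
        + (1.0 - momentum) * student_params[name]
        for name in teacher_params
    }

def update_center(center, teacher_logits, momentum=0.9):
    """Running mean of teacher outputs, used to discourage collapse."""
    return momentum * center + (1.0 - momentum) * teacher_logits.mean(axis=0)
```

In a full training loop, the student and teacher would each see a different augmented view of the same specimen, the loss would be backpropagated through the student only, and `ema_update` and `update_center` would run after every optimizer step.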
The model targets clinical diagnostic workflows in hematopathology and immunology, where flow cytometry is used to characterize and classify hematologic malignancies and immune phenotypes. Because a single embedding space spans many panels, EventHorizon could support scalable, reproducible diagnostic decision support that generalizes across laboratories and assay configurations, retrieval of similar historical cases, and downstream classifiers trained with comparatively few labels. Clinical laboratories, pathologists, and researchers analyzing large archives of heterogeneous cytometry data are the primary beneficiaries.
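Retrieval of similar historical cases and k-nearest-neighbor probing both reduce to a similarity search over frozen specimen embeddings. The sketch below assumes the embeddings are already available as a NumPy array; since no code or weights have been released, the embedding source, the function names, the choice of cosine similarity, and the value of `k` are all hypothetical and serve only to illustrate the idea.

```python
import numpy as np
from collections import Counter

def l2_normalize(x):
    """Scale each row to unit length so dot products are cosine similarities."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)

def nearest_specimens(query, bank, k=5):
    """Return indices and cosine similarities of the k closest bank embeddings.

    `query` has shape (d,) and `bank` has shape (n, d); both are assumed to be
    specimen-level embeddings from the same frozen model.
    """
    sims = l2_normalize(bank) @ l2_normalize(query[None, :])[0]
    order = np.argsort(-sims, kind="stable")[:k]
    return order, sims[order]

def knn_predict(query, bank, labels, k=5):
    """Majority vote over the labels of the k nearest specimens."""
    idx, _ = nearest_specimens(query, bank, k)
    votes = Counter(labels[int(i)] for i in idx)
    return votes.most_common(1)[0][0]

# Toy example with made-up 2-D embeddings and placeholder labels.
bank = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
labels = ["A", "A", "B", "B"]
query = np.array([1.0, 0.05])
neighbors, similarities = nearest_specimens(query, bank, k=3)
prediction = knn_predict(query, bank, labels, k=3)
```

Because the backbone stays frozen, adding newly labeled cases only means appending rows to `bank` and `labels`; no retraining is involved, which is what makes this kind of probe lightweight.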
As an early foundation model purpose-built for clinical flow cytometry, EventHorizon demonstrates that self-supervised pretraining can unify one of the most panel-fragmented modalities in laboratory medicine into a single representation space, pointing toward more transferable and label-efficient diagnostic tooling. Its authorship by a high-volume clinical reference laboratory underscores a clear translational intent. The work is an early-stage preprint released under a CC BY-ND license; at the time of writing no public code or model weights had been released, and the reported results rest on retrospective probing rather than prospective clinical validation, so its real-world diagnostic utility remains to be established.
Grespan, M. M., et al. (2026) EventHorizon: A Foundation Model for Clinical Flow Cytometry. bioRxiv.
DOI: 10.64898/2026.06.18.733197