Self-supervised pathology foundation model (ViT-L/16, DINOv2) pretrained on 100M+ H&E tiles from 100,000+ whole-slide images. State-of-the-art on 34 pathology tasks.
UNI is a general-purpose computational pathology foundation model developed by the Mahmood Lab at Harvard Medical School and Brigham and Women's Hospital. Published in Nature Medicine in March 2024, UNI established a new benchmark for self-supervised learning in digital pathology by training on a dataset of unprecedented scale and tissue-type diversity. At the time of publication, it achieved state-of-the-art performance across 34 representative pathology tasks, spanning patch-level classification, slide-level classification, survival prediction, and pathology visual question answering.
The model addresses a longstanding bottleneck in computational pathology: the scarcity of labeled training data. Annotating histopathology images requires expert pathologists and is expensive and time-consuming. By learning rich, transferable representations from unlabeled H&E slides through self-supervised pretraining, UNI enables strong downstream performance with minimal labeled data — including competitive results in few-shot and low-data regimes where prior models struggled.
UNI represents a shift from task-specific or organ-specific pathology models toward a single general-purpose backbone that can be fine-tuned or linearly probed for a wide range of clinical and research tasks. A successor model, UNI2, with a larger ViT-H backbone pretrained on approximately 200 million tiles, was released in January 2025 and represents the current recommended checkpoint for new projects.
UNI uses a Vision Transformer Large (ViT-L/16) backbone with approximately 307 million parameters, trained using the DINOv2 self-supervised learning framework. DINOv2 employs a teacher-student self-distillation objective without any labels, and has been shown to produce feature representations with superior linear probing performance compared to masked image modeling approaches such as MAE — an important advantage for patch-level pathology tasks.
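The self-distillation objective at the core of DINOv2 can be illustrated with a toy sketch: the student's output distribution is pulled toward a centered, sharpened teacher distribution over two augmented views, with no labels involved. This is a minimal pure-Python illustration of the loss shape only, not the actual DINOv2 implementation (which adds multi-crop views, a momentum-updated teacher, and learned prototype heads); all numbers below are made up.

```python
import math

def softmax(logits, temperature):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def dino_loss(student_logits, teacher_logits, center,
              student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between the teacher and student distributions.

    The teacher output is centered (subtract a running mean) and sharpened
    (low temperature) to prevent representational collapse; in the real
    framework gradients flow only through the student -- here we just
    compute the scalar loss for one pair of views.
    """
    t = softmax([l - c for l, c in zip(teacher_logits, center)], teacher_temp)
    s = softmax(student_logits, student_temp)
    # H(teacher, student) = -sum_k p_t(k) * log p_s(k)
    return -sum(pt * math.log(ps) for pt, ps in zip(t, s))

# Two augmented views of the same patch: teacher sees one, student the other.
teacher_out = [2.0, 0.5, -1.0, 0.1]
student_out = [1.8, 0.4, -0.9, 0.2]
center = [0.0, 0.0, 0.0, 0.0]
loss = dino_loss(student_out, teacher_out, center)
```

When the student's distribution agrees with the teacher's, the loss is small; a student that peaks on a different prototype dimension is penalized heavily, which is what drives the two views of the same tile toward the same representation.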
Input patches are 224x224 pixels extracted at 20x magnification (0.5 microns per pixel), the standard resolution used in surgical pathology workflows. The model outputs 1,024-dimensional patch-level embeddings (the ViT-L hidden size) that can be aggregated into slide-level representations using multiple-instance learning (MIL) frameworks such as ABMIL. Training data consisted exclusively of hematoxylin and eosin (H&E) stained slides from the de-identified Mass General Brigham clinical archive, covering 20 tissue types. At publication, UNI outperformed all prior self-supervised pathology models on the majority of its 34-task benchmark suite.
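The ABMIL-style aggregation step can be sketched as follows: each patch embedding receives a learned attention score, and the slide-level representation is the attention-weighted sum of patch embeddings. This is a minimal pure-Python sketch of the non-gated attention pooling from Ilse et al. (2018); the projection weights here are random stand-ins (in practice they are learned on a downstream slide-level task), and toy dimensions replace UNI's 1,024-d embeddings.

```python
import math
import random

def abmil_pool(patch_embeddings, V, w):
    """Attention-based MIL pooling (non-gated variant).

    a_i = softmax_i( w . tanh(V h_i) ),  slide = sum_i a_i * h_i
    V is a hidden x dim projection matrix, w a hidden-size vector.
    """
    scores = []
    for h in patch_embeddings:
        hidden = [math.tanh(sum(V[j][k] * h[k] for k in range(len(h))))
                  for j in range(len(V))]
        scores.append(sum(w[j] * hidden[j] for j in range(len(w))))
    # Softmax over patches so attention weights sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]
    dim = len(patch_embeddings[0])
    slide = [sum(attn[i] * patch_embeddings[i][k] for i in range(len(attn)))
             for k in range(dim)]
    return slide, attn

random.seed(0)
DIM, HIDDEN, N = 8, 4, 5          # toy sizes; UNI embeddings are 1024-d
patches = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N)]
V = [[random.gauss(0, 0.1) for _ in range(DIM)] for _ in range(HIDDEN)]
w = [random.gauss(0, 0.1) for _ in range(HIDDEN)]
slide_emb, attn = abmil_pool(patches, V, w)
```

The attention weights double as an interpretability signal: high-attention patches indicate the tissue regions the slide-level prediction relied on.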
UNI is designed as a general-purpose feature extractor for computational pathology pipelines. Researchers use it for cancer subtyping — for example, distinguishing lung adenocarcinoma from squamous cell carcinoma — and for biomarker prediction tasks such as inferring microsatellite instability or HER2 status directly from H&E morphology. Slide-level prognostic modeling benefits from UNI embeddings aggregated via MIL, enabling survival prediction without molecular assays. The model's strong few-shot capability makes it particularly valuable for rare disease classification, where labeled pathology data is scarce. UNI also serves as a visual backbone for multimodal models that align histology image features with pathology text, supporting pathology report grounding and visual question answering.
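The few-shot workflow mentioned above is often as simple as a nearest-prototype classifier on frozen UNI embeddings: average the embeddings of each class's few labeled examples, then assign a query patch to the class with the most similar prototype. Below is a minimal pure-Python sketch under that assumption, with tiny 4-d vectors standing in for real 1,024-d UNI embeddings; the LUAD/LUSC labels echo the lung subtyping example and the vectors are fabricated for illustration.

```python
import math

def centroid(vectors):
    """Mean embedding of a class's few labeled examples (its prototype)."""
    dim = len(vectors[0])
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def predict(query, prototypes):
    """Assign the query embedding to the most similar class prototype."""
    return max(prototypes, key=lambda label: cosine(query, prototypes[label]))

# Toy 4-d stand-ins for UNI patch embeddings (real ones are 1024-d).
support = {
    "LUAD": [[1.0, 0.1, 0.0, 0.2], [0.9, 0.2, 0.1, 0.1]],
    "LUSC": [[0.0, 1.0, 0.8, 0.1], [0.1, 0.9, 1.0, 0.0]],
}
prototypes = {label: centroid(vecs) for label, vecs in support.items()}
pred = predict([0.95, 0.15, 0.05, 0.15], prototypes)  # near the LUAD examples
```

Because no weights are updated, this style of probe scales to rare-disease settings with only a handful of labeled examples per class, which is where strong pretrained representations matter most.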
UNI's publication in Nature Medicine marked a significant milestone in establishing general-purpose foundation models for digital pathology, demonstrating that large-scale self-supervised pretraining on diverse tissue types could produce representations competitive with or superior to task-specific approaches. The model has been widely adopted by the computational pathology community as a standard feature extractor baseline. Notable limitations include its restriction to H&E staining (performance on IHC or special stains requires adaptation), processing at a single magnification level (20x), and a gated access model on Hugging Face that restricts use to research under agreed terms. The pretraining data reflects the patient population of a major US academic medical center, which may affect generalization to globally diverse datasets. These limitations are areas of active work in the field, and the release of UNI2 in early 2025 reflects the Mahmood Lab's continued effort to scale and improve its pathology foundation models.
Chen, R. J., et al. (2024). Towards a general-purpose foundation model for computational pathology. Nature Medicine.
DOI: 10.1038/s41591-024-02857-3