Self-supervised pathology foundation model (ViT-L/16, DINOv2) pretrained on 100M+ H&E tiles from 100,000+ whole-slide images. State-of-the-art on 34 pathology tasks.
UNI is a general-purpose computational pathology foundation model developed by the Mahmood Lab at Harvard Medical School and Brigham and Women's Hospital. Published in Nature Medicine in March 2024, UNI established a new benchmark for self-supervised learning in digital pathology by training on a dataset of unprecedented scale and tissue-type diversity. At the time of publication, it achieved state-of-the-art performance across 34 representative pathology tasks, spanning patch-level classification, slide-level classification, survival prediction, and pathology visual question answering.
The model addresses a longstanding bottleneck in computational pathology: the scarcity of labeled training data. Annotating histopathology images requires expert pathologists and is expensive and time-consuming. By learning rich, transferable representations from unlabeled H&E slides through self-supervised pretraining, UNI enables strong downstream performance with minimal labeled data — including competitive results in few-shot and low-data regimes where prior models struggled.
UNI represents a shift from task-specific or organ-specific pathology models toward a single general-purpose backbone that can be fine-tuned or linearly probed for a wide range of clinical and research tasks. A successor model, UNI2, with a larger ViT-H backbone pretrained on approximately 200 million tiles, was released in January 2025 and represents the current recommended checkpoint for new projects.
UNI uses a Vision Transformer Large (ViT-L/16) backbone with approximately 307 million parameters, trained using the DINOv2 self-supervised learning framework. DINOv2 employs a teacher-student self-distillation objective without any labels, and has been shown to produce feature representations with superior linear probing performance compared to masked image modeling approaches such as MAE — an important advantage for patch-level pathology tasks.
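The self-distillation objective at the core of DINOv2 can be illustrated with a toy sketch: the student's output distribution is pulled toward a centered, sharpened teacher distribution over two augmented views, with no labels involved. This is a minimal pure-Python illustration of the loss shape only, not the actual DINOv2 implementation (which adds multi-crop views, a momentum-updated teacher, and learned prototype heads); all numbers below are made up.

```python
import math

def softmax(logits, temperature):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def dino_loss(student_logits, teacher_logits, center,
              student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between the teacher and student distributions.

    The teacher output is centered (subtract a running mean) and sharpened
    (low temperature) to prevent representational collapse; in the real
    framework gradients flow only through the student -- here we just
    compute the scalar loss for one pair of views.
    """
    t = softmax([l - c for l, c in zip(teacher_logits, center)], teacher_temp)
    s = softmax(student_logits, student_temp)
    # H(teacher, student) = -sum_k p_t(k) * log p_s(k)
    return -sum(pt * math.log(ps) for pt, ps in zip(t, s))

# Two augmented views of the same patch: teacher sees one, student the other.
teacher_out = [2.0, 0.5, -1.0, 0.1]
student_out = [1.8, 0.4, -0.9, 0.2]
center = [0.0, 0.0, 0.0, 0.0]
loss = dino_loss(student_out, teacher_out, center)
```

When the student's distribution agrees with the teacher's, the loss is small; a student that peaks on a different prototype dimension is penalized heavily, which is what drives the two views of the same tile toward the same representation.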
Input patches are 224x224 pixels extracted at 20x magnification (0.5 microns per pixel), the standard resolution used in surgical pathology workflows. The model outputs 1,024-dimensional patch-level embeddings (the ViT-L hidden size) that can be aggregated into slide-level representations using multiple-instance learning (MIL) frameworks such as ABMIL. Training data consisted exclusively of hematoxylin and eosin (H&E) stained slides from the de-identified Mass General Brigham clinical archive, covering 20 tissue types. At publication, UNI outperformed all prior self-supervised pathology models on the majority of its 34-task benchmark suite.
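The ABMIL-style aggregation step can be sketched as follows: each patch embedding receives a learned attention score, and the slide-level representation is the attention-weighted sum of patch embeddings. This is a minimal pure-Python sketch of the non-gated attention pooling from Ilse et al. (2018); the projection weights here are random stand-ins (in practice they are learned on a downstream slide-level task), and toy dimensions replace UNI's 1,024-d embeddings.

```python
import math
import random

def abmil_pool(patch_embeddings, V, w):
    """Attention-based MIL pooling (non-gated variant).

    a_i = softmax_i( w . tanh(V h_i) ),  slide = sum_i a_i * h_i
    V is a hidden x dim projection matrix, w a hidden-size vector.
    """
    scores = []
    for h in patch_embeddings:
        hidden = [math.tanh(sum(V[j][k] * h[k] for k in range(len(h))))
                  for j in range(len(V))]
        scores.append(sum(w[j] * hidden[j] for j in range(len(w))))
    # Softmax over patches so attention weights sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]
    dim = len(patch_embeddings[0])
    slide = [sum(attn[i] * patch_embeddings[i][k] for i in range(len(attn)))
             for k in range(dim)]
    return slide, attn

random.seed(0)
DIM, HIDDEN, N = 8, 4, 5          # toy sizes; UNI embeddings are 1024-d
patches = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N)]
V = [[random.gauss(0, 0.1) for _ in range(DIM)] for _ in range(HIDDEN)]
w = [random.gauss(0, 0.1) for _ in range(HIDDEN)]
slide_emb, attn = abmil_pool(patches, V, w)
```

The attention weights double as an interpretability signal: high-attention patches indicate the tissue regions the slide-level prediction relied on.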
UNI is designed as a general-purpose feature extractor for computational pathology pipelines. Researchers use it for cancer subtyping — for example, distinguishing lung adenocarcinoma from squamous cell carcinoma — and for biomarker prediction tasks such as inferring microsatellite instability or HER2 status directly from H&E morphology. Slide-level prognostic modeling benefits from UNI embeddings aggregated via MIL, enabling survival prediction without molecular assays. The model's strong few-shot capability makes it particularly valuable for rare disease classification, where labeled pathology data is scarce. UNI also serves as a visual backbone for multimodal models that align histology image features with pathology text, supporting pathology report grounding and visual question answering.
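The few-shot workflow mentioned above is often as simple as a nearest-prototype classifier on frozen UNI embeddings: average the embeddings of each class's few labeled examples, then assign a query patch to the class with the most similar prototype. Below is a minimal pure-Python sketch under that assumption, with tiny 4-d vectors standing in for real 1,024-d UNI embeddings; the LUAD/LUSC labels echo the lung subtyping example and the vectors are fabricated for illustration.

```python
import math

def centroid(vectors):
    """Mean embedding of a class's few labeled examples (its prototype)."""
    dim = len(vectors[0])
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def predict(query, prototypes):
    """Assign the query embedding to the most similar class prototype."""
    return max(prototypes, key=lambda label: cosine(query, prototypes[label]))

# Toy 4-d stand-ins for UNI patch embeddings (real ones are 1024-d).
support = {
    "LUAD": [[1.0, 0.1, 0.0, 0.2], [0.9, 0.2, 0.1, 0.1]],
    "LUSC": [[0.0, 1.0, 0.8, 0.1], [0.1, 0.9, 1.0, 0.0]],
}
prototypes = {label: centroid(vecs) for label, vecs in support.items()}
pred = predict([0.95, 0.15, 0.05, 0.15], prototypes)  # near the LUAD examples
```

Because no weights are updated, this style of probe scales to rare-disease settings with only a handful of labeled examples per class, which is where strong pretrained representations matter most.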
UNI's publication in Nature Medicine marked a significant milestone in establishing general-purpose foundation models for digital pathology, demonstrating that large-scale self-supervised pretraining on diverse tissue types could produce representations competitive with or superior to task-specific approaches. The model has been widely adopted by the computational pathology community as a standard feature extractor baseline. Notable limitations include its restriction to H&E staining (performance on IHC or special stains requires adaptation), processing at a single magnification level (20x), and a gated access model on Hugging Face that restricts use to research under agreed terms. The pretraining data reflects the patient population of a major US academic medical center, which may affect generalization to globally diverse datasets. These limitations are areas of active work in the field, and the release of UNI2 in early 2025 reflects the Mahmood Lab's continued effort to scale and improve its pathology foundation models.
Chen, R. J., et al. (2024). Towards a general-purpose foundation model for computational pathology. Nature Medicine.
DOI: 10.1038/s41591-024-02857-3