Pathology vision foundation model adapting DINOv3-style self-supervised learning to whole-slide histopathology across continuous magnifications and scales.
DaX is a pathology vision foundation model from Alibaba DAMO Academy, introduced in the June 2026 preprint "DaX: Learning General Pathology Representations Across Scales." It adapts DINOv3-style self-supervised learning to whole-slide histopathology, producing a single transferable encoder whose features can be reused across a wide range of diagnostic, biomarker, and prognostic tasks without retraining the backbone.
The central problem DaX targets is the brittleness of pathology encoders to the many sources of variation among digital slides: magnification, staining, scanner type, slide preparation, and input resolution. Most prior pathology foundation models (including UNI, Virchow, CONCH, Hibou, Prov-GigaPath, and DiGePath) are pretrained at a fixed magnification, typically 20x, which limits how robustly their representations transfer when downstream data is captured or tiled differently. DaX instead trains across a continuum of magnifications and scales, aiming for representations that stay stable as the field of view and resolution change.
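To make "a continuum of magnifications" concrete, the following is a minimal sketch of how a training tile could be drawn at a continuous magnification rather than a fixed 20x. It is not taken from the DaX preprint: the log-uniform distribution, the 5x to 40x range, the 40x base scan, the 224-pixel output size, and the function names `sample_magnification` and `source_region_size` are all illustrative assumptions.

```python
import math
import random


def sample_magnification(rng, low=5.0, high=40.0):
    """Draw a magnification log-uniformly from [low, high].

    Assumption: log-uniform sampling and the 5x-40x range are illustrative,
    not values reported for DaX.
    """
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def source_region_size(target_mag, base_mag=40.0, out_px=224):
    """Side length, in base-level pixels, of the square region to read so that
    resizing it to out_px x out_px yields a tile at target_mag.

    Assumption: the slide was scanned at base_mag (hypothetically 40x).
    """
    return round(out_px * base_mag / target_mag)


rng = random.Random(0)
mags = [sample_magnification(rng) for _ in range(1000)]
sizes = [source_region_size(m) for m in mags]
```

Under these assumptions, a 20x tile corresponds to a 448-pixel region at the base level and a 10x tile to an 896-pixel region, so lower magnifications cover a wider field of view at the same output size.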
DaX is initialized from natural-image DINOv3 weights and then continues self-supervised pretraining on histopathology, transferring the strong dense-feature inductive biases of DINOv3 into the pathology domain. The authors position it as a fixed, general-purpose encoder rather than a task-specific model, and evaluate it as such on a new large-scale whole-slide-image benchmark.
DaX is a Vision Transformer encoder trained with a DINOv3-style self-supervised objective, extended with continuous-magnification sampling, cross-scale view pairing, orientation- and acquisition-robust augmentation, and a Gram-anchored dense-consistency term that keeps token-level features coherent across scales. It supports multiple input sizes, consistent with its goal of resolution-robust representations.
The model was evaluated as a frozen, transferable encoder: features are extracted once and reused rather than being fine-tuned per task. Evaluation used a new whole-slide-image benchmark spanning 161 clinically meaningful tasks built from 44 public datasets covering 28,182 patients and 34,394 slides, organized across four clinical domains and nine task categories and scored with patient-level cross-validation and fold-level statistical ranking. Across this benchmark, DaX reported the highest mean performance across tasks among the encoders compared, with consistent rankings spanning diagnostic pathology, biomarker profiling, tissue-context, and prognostic prediction.
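Patient-level cross-validation means that all slides from one patient fall in the same fold, so no patient contributes to both training and test data. The sketch below shows one simple way to build such folds. It is an assumption about the general technique, not the benchmark's actual splitting code; the greedy balancing heuristic, the function name `patient_level_folds`, and the toy slide and patient IDs are hypothetical.

```python
from collections import defaultdict


def patient_level_folds(slide_to_patient, n_folds=5):
    """Assign slides to folds so that every patient's slides share one fold."""
    by_patient = defaultdict(list)
    for slide_id, patient_id in slide_to_patient.items():
        by_patient[patient_id].append(slide_id)

    folds = [[] for _ in range(n_folds)]
    # Greedy balancing: place patients with the most slides first,
    # each into the fold that currently holds the fewest slides.
    ordered = sorted(by_patient, key=lambda p: (-len(by_patient[p]), p))
    for patient_id in ordered:
        smallest = min(range(n_folds), key=lambda i: len(folds[i]))
        folds[smallest].extend(sorted(by_patient[patient_id]))
    return folds


# Toy mapping (hypothetical IDs): six slides from four patients.
slides = {"s1": "pA", "s2": "pA", "s3": "pB", "s4": "pC", "s5": "pC", "s6": "pD"}
folds = patient_level_folds(slides, n_folds=2)
```

Each fold then serves once as the held-out set while a lightweight head is trained on features from the remaining folds; the frozen encoder itself is never updated.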
As a general-purpose, frozen feature extractor, DaX is intended to slot into computational-pathology pipelines in place of an existing encoder such as UNI or Prov-GigaPath. Extracted patch- and token-level embeddings can be aggregated with multiple-instance-learning heads for slide-level tasks like cancer subtyping, biomarker and molecular-status prediction from H&E morphology, tissue-context characterization, and prognostic or survival modeling. Its emphasis on robustness to magnification, staining, and scanner variation is particularly relevant for groups working across heterogeneous datasets and multi-institution cohorts, where slides are captured under differing acquisition conditions.
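The aggregation step described above can be illustrated with a deliberately simplified attention-pooling head over precomputed patch embeddings. This is a sketch of the general multiple-instance-learning idea, not a DaX component: the single scoring vector `w` stands in for the small learned network a real attention-MIL head would use, and the two-dimensional embeddings are toy values.

```python
import math


def attention_pool(patch_embeddings, w):
    """Pool patch embeddings into one slide-level vector.

    Each patch gets a score w . h, scores are softmax-normalized into
    attention weights, and the slide vector is the weighted sum of patches.
    Assumption: a linear scorer replaces the learned attention network.
    """
    scores = [sum(wi * xi for wi, xi in zip(w, h)) for h in patch_embeddings]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(patch_embeddings[0])
    slide_vector = [
        sum(a * h[d] for a, h in zip(weights, patch_embeddings))
        for d in range(dim)
    ]
    return slide_vector, weights


# Toy embeddings standing in for frozen-encoder outputs (hypothetical values).
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
slide_vec, attn = attention_pool(patches, w=[0.0, 0.0])
```

With a zero scoring vector every patch receives equal weight, so the result reduces to mean pooling; a trained scorer would instead concentrate weight on diagnostically relevant patches.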
DaX contributes to a fast-moving line of pathology foundation models by arguing that scale- and magnification-robust pretraining, rather than larger fixed-magnification datasets alone, is a key axis for transferable histopathology representations. Its accompanying 161-task, 44-dataset benchmark is a notable artifact in its own right, offering a broad, patient-level protocol for comparing encoders across diagnostic, biomarker, tissue-context, and prognostic tasks. Important caveats apply: the work is a preprint and has not been peer-reviewed, and as of release the public GitHub repository is essentially a placeholder (Apache-2.0 license and README only, with no released code or model weights), so the reported results cannot yet be independently reproduced. Benchmark numbers and comparative rankings are author-reported. Adoption will depend on the team releasing the encoder weights and training or inference code referenced by the project's benchmark page.
Zhao, B., et al. (2026) DaX: Learning General Pathology Representations Across Scales.
DOI: 10.48550/arXiv.2606.06983