bio.rodeo
The authoritative source for evaluating biological foundation models. No hype, just honest analysis.

© 2026 Pulsatance. All rights reserved.
Built by Pulsatance
Hibou

HistAI

DINOv2-based Vision Transformer foundation models for digital pathology, trained on over 1 million whole-slide images. Available as Hibou-B (86M) and Hibou-L (307M) under Apache 2.0.

Released: June 2024
Parameters: 307 Million

Hibou is a family of Vision Transformer foundation models for digital pathology, developed by HistAI and released in June 2024. The family comprises two variants — Hibou-B and Hibou-L — pretrained on a curated dataset of over 1 million whole-slide images (WSIs) using the DINOv2 self-supervised learning framework with additional register tokens for improved feature quality.

What distinguishes Hibou from competing pathology foundation models is its combination of training scale, stain diversity, and permissive licensing. The pretraining corpus spans H&E-stained slides (936,441 WSIs), non-H&E slides (202,464 WSIs, including immunohistochemistry and special stains), and 2,676 cytology slides, exposing the model to a broad range of the tissue preparation techniques encountered in clinical and research settings. Both variants are released under the Apache 2.0 license, which permits commercial and research use without the gated or non-commercial terms attached to competing pathology foundation models such as Prov-GigaPath and Virchow.
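As a quick check on the "over 1 million" figure, the short sketch below sums the slide counts reported in the Technical Details section and derives each category's share. The dictionary keys are illustrative labels, not names used by HistAI.

```python
# Slide counts as reported in this entry's Technical Details section.
corpus = {
    "H&E": 936_441,
    "non-H&E": 202_464,
    "cytology": 2_676,
}

total_slides = sum(corpus.values())

# Fraction of the corpus contributed by each category.
shares = {name: count / total_slides for name, count in corpus.items()}

print(f"total slides: {total_slides:,}")
for name, share in shares.items():
    print(f"{name:>9}: {share:.1%}")
```

The total comes to roughly 1.14 million slides, with H&E making up about four fifths of the corpus.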

At the time of publication, Hibou-L achieved the highest average accuracy across six standard patch classification datasets and outperformed Prov-GigaPath on all three slide-level WSI classification benchmarks evaluated. Hibou-B, despite having roughly 13 times fewer parameters than Prov-GigaPath, matched or exceeded it on two of the three slide-level tasks, which suggests strong parameter efficiency.

#Key Features

  • Two model sizes: Hibou-B (ViT-B/14, ~86M parameters) and Hibou-L (ViT-L/14, ~307M parameters) cover different compute budgets.
  • DINOv2 with register tokens: Self-supervised pretraining is extended with learnable register tokens that improve attention map quality and reduce artifacts, yielding cleaner patch-level features than standard DINOv2.
  • Multi-stain training corpus: Coverage of H&E, immunohistochemistry, and special stains is intended to help features generalize across staining protocols, unlike models trained exclusively on H&E slides.
  • Apache 2.0 license: Fully permissive for commercial and research use, with no gating or institutional registration requirements beyond a HuggingFace account.
  • HuggingFace integration: Both variants load directly via the transformers library with a single AutoModel.from_pretrained call, simplifying integration into existing PyTorch pipelines.
  • State-of-the-art slide-level benchmarks: Hibou-L outperforms Prov-GigaPath on TCGA-BRCA, TCGA-NSCLC, and TCGA-RCC WSI classification tasks using attention-based multiple instance learning pooling.
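The HuggingFace loading path mentioned in the list above might look like the following sketch. The repository names in `HIBOU_MODEL_IDS`, the need for `trust_remote_code=True`, and the position of the CLS token at index 0 are assumptions based on common DINOv2 conventions, not details confirmed on this page; check the HistAI model cards before relying on them. The helper names `load_hibou` and `embed_patch` are hypothetical.

```python
# Assumed Hub repository names; verify against the HistAI HuggingFace page.
HIBOU_MODEL_IDS = {"B": "histai/hibou-b", "L": "histai/hibou-L"}


def load_hibou(variant="B"):
    """Load the image processor and model for one Hibou variant."""
    # Imported lazily so this module can be imported without transformers.
    from transformers import AutoImageProcessor, AutoModel

    model_id = HIBOU_MODEL_IDS[variant]
    # trust_remote_code=True is assumed to be needed for the custom
    # register-token architecture.
    processor = AutoImageProcessor.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
    return processor, model.eval()


def embed_patch(image, processor, model):
    """Return one embedding per image, taken from the CLS token (assumed index 0)."""
    import torch

    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.last_hidden_state[:, 0]
```

A typical call would be `processor, model = load_hibou("L")` followed by `embed_patch(pil_image, processor, model)` on a PIL tile cropped from a slide.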

#Technical Details

Both Hibou variants are built on the DINOv2 Vision Transformer architecture with a modification to incorporate register tokens — additional learnable tokens appended to the patch sequence that allow the model to offload global information processing away from local patch tokens, improving spatial feature quality. Hibou-B uses a ViT-B/14 backbone (85.7M parameters, 14-pixel patch size) and Hibou-L uses a ViT-L/14 backbone (~307M parameters, 14-pixel patch size). The choice of 14-pixel rather than the more common 16-pixel patch size yields finer spatial resolution per token, which is advantageous for pathology images where cellular-level features at high magnification are diagnostically relevant.
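To make the patch-size argument concrete, the sketch below counts tokens for 14-pixel and 16-pixel patches. The 224×224 input size and the default of four register tokens are assumptions chosen for illustration (four is the count used in the original DINOv2 register-token work); this page does not state either value for Hibou.

```python
def vit_token_counts(image_size, patch_size, num_registers=4):
    """Token counts for a square image split into square patches.

    num_registers=4 is an assumed default, not a value confirmed for Hibou.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    grid = image_size // patch_size
    patch_tokens = grid * grid
    return {
        "grid": grid,
        "patch_tokens": patch_tokens,
        # One CLS token plus the register tokens plus the patch tokens.
        "sequence_length": 1 + num_registers + patch_tokens,
    }


for patch in (14, 16):
    print(patch, vit_token_counts(224, patch))
```

Under these assumptions, a 14-pixel patch gives a 16×16 grid (256 patch tokens) where a 16-pixel patch gives 14×14 (196), so each token covers a smaller area of tissue.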

The pretraining corpus totaled over 1.1 million WSIs: 936,441 H&E slides, 202,464 non-H&E slides, and 2,676 cytology slides, sourced from public and proprietary collections covering multiple human organ systems. Hibou-L trained on approximately 1.2 billion clean patches over 1.175 million iterations on 32 NVIDIA A100-40G GPUs; Hibou-B trained on 512 million patches over 500,000 iterations on 8 A100-80G GPUs. Standard DINOv2 solarization augmentation was deliberately excluded, as it degrades performance on stained tissue images; instead, RandStainNA stain normalization and color jittering were applied. On patch classification benchmarks using linear probing, Hibou-L achieved an average accuracy of 0.890 across six datasets (CRC-100K, PCAM, MHIST, MSI-CRC, MSI-STAD, TIL-DET), surpassing contemporaneous models including Phikon, Kaiko-B8, Virchow, RudolfV, Prov-GigaPath, and H-optimus-0.
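The training figures above imply an approximate number of patches per iteration. The sketch below does that arithmetic under the assumption, not stated in the source, that each reported patch is consumed exactly once; the result should be read as a rough estimate, not as a documented hyperparameter.

```python
# Figures as reported above; the patch counts are approximate.
runs = {
    "Hibou-L": {"patches": 1_200_000_000, "iterations": 1_175_000, "gpus": 32},
    "Hibou-B": {"patches": 512_000_000, "iterations": 500_000, "gpus": 8},
}

estimates = {}
for name, run in runs.items():
    # Assumes every patch is seen once, so patches / iterations ~ global batch.
    per_iteration = run["patches"] / run["iterations"]
    estimates[name] = {
        "patches_per_iteration": per_iteration,
        "patches_per_gpu": per_iteration / run["gpus"],
    }
    print(f"{name}: ~{per_iteration:.0f} patches/iteration, "
          f"~{per_iteration / run['gpus']:.0f} per GPU")
```

Both runs land near 1,024 patches per iteration, which would be consistent with a shared global batch size spread over different GPU counts.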

#Applications

Hibou functions as a general-purpose feature extractor for digital pathology workflows. Downstream tasks include cancer subtyping from WSI patches (e.g., distinguishing IDC from ILC in breast cancer, or LUAD from LUSC in lung cancer), molecular biomarker prediction from H&E slides (microsatellite instability, mutation status), and survival analysis using slide-level aggregated embeddings. The companion CellViT-Hibou-L model, which combines Hibou-L features with the CellViT segmentation framework, performs panoptic nuclei segmentation and, on the PanNuke benchmark, improves on CellViT-SAM-H baselines for the epithelial and dead cell categories. Because Hibou was pretrained on non-H&E stains as well, its representations are expected to transfer more reliably to IHC panels and special stain workflows than those of models trained exclusively on H&E, broadening applicability across clinical laboratory settings.
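Slide-level aggregation of patch embeddings is commonly done with attention-based multiple instance learning pooling. The sketch below is a minimal pure-Python version of the standard formulation (softmax over `w · tanh(V h)`), not HistAI's implementation. The function name `attention_mil_pool` and the toy dimensions are hypothetical, and in practice `V` and `w` would be learned parameters rather than random values.

```python
import math
import random


def attention_mil_pool(embeddings, V, w):
    """Pool N patch embeddings (each of length D) into one slide embedding.

    V is an H x D projection matrix and w a length-H vector.
    Returns (slide_embedding, attention_weights).
    """
    scores = []
    for h in embeddings:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * zi for wi, zi in zip(w, hidden)))

    # Numerically stable softmax over the patch scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    weights = [e / norm for e in exps]

    dim = len(embeddings[0])
    slide = [sum(a * h[d] for a, h in zip(weights, embeddings)) for d in range(dim)]
    return slide, weights


# Toy usage: 5 patches with 8-dim features (real Hibou features are much larger).
rng = random.Random(0)
patches = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(5)]
V = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(4)]
w = [rng.gauss(0, 1) for _ in range(4)]
slide_embedding, attention = attention_mil_pool(patches, V, w)
```

The attention weights sum to one, so the slide embedding is a weighted average of the patch embeddings; a downstream classifier would then operate on that single vector.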

#Impact

Hibou addresses a recognized gap in the pathology foundation model landscape: the combination of open licensing, multi-stain pretraining, and competitive benchmark performance has made it one of the more practically accessible models in the field. Its Apache 2.0 release stands in contrast to the non-commercial or gated licensing of several higher-profile competitors, lowering barriers for both academic research and clinical product development. A notable limitation is that the pretraining WSI dataset is not publicly released, so the pretraining procedure cannot be independently reproduced. Hibou-L was also trained on approximately one-sixth of HistAI's full proprietary dataset at the time of publication, suggesting meaningful headroom for further performance improvement. As with all pathology foundation models, downstream applications require independent clinical validation before deployment in regulated healthcare settings.

Citation

Hibou: A Family of Foundational Vision Transformers for Pathology

Preprint

Nechaev, D., et al. (2024) Hibou: A Family of Foundational Vision Transformers for Pathology. arXiv.org.

DOI: 10.48550/arXiv.2406.05074

Citations

Total Citations: 76
Influential: 8
References: 16

GitHub

Stars: 78
Forks: 9
Open Issues: 1
Contributors: 1
Last Push: 1y ago
Language: Python
License: Apache-2.0

HuggingFace

Downloads: 45.8K
Likes: 20
Last Modified: 1y ago
Pipeline: image-feature-extraction

Openness

bio.rodeo openness: Open weights (open weights, closed recipe)
Score: 46 (Partial)
Usability (can I run it?): 87
Reproducibility (can I retrain it?): 4
Model Openness Framework: Unclassified (restrictive license on core components)

Tags

foundation_model, histology, self_supervised, vision_model

Resources

GitHub Repository, Research Paper, Official Website, HuggingFace Model