bio.rodeo
Imaging

Cellpose 2.0

HHMI Janelia Research Campus

Human-in-the-loop cell segmentation framework enabling custom model training from as few as 100-200 corrected annotations.

Released: 2022

Overview

Cellpose 2.0 is a cell segmentation framework developed by Marius Pachitariu and Carsen Stringer at HHMI Janelia Research Campus, published in Nature Methods in November 2022. It extends the original Cellpose algorithm with a human-in-the-loop active learning pipeline, a graphical user interface for iterative annotation and retraining, and a curated model zoo of nine specialized pretrained checkpoints.

A persistent barrier to adopting deep learning for microscopy segmentation is the cost of generating labeled training data. Prior methods demanded tens of thousands of annotated regions of interest (ROI) to generalize across cell types — a bottleneck that is impractical for individual laboratories. Cellpose 2.0 addresses this directly: by initializing from pretrained models and applying an active learning cycle, the paper demonstrates that only 100-200 manually corrected ROI are sufficient to produce models that match or exceed tools trained on datasets two to three orders of magnitude larger.

The primary benchmark used the TissueNet dataset, which contains 2,601 training and 1,249 test images across six tissue types and six fluorescence platforms. Fine-tuned Cellpose models at 1,000 training ROI outperformed Mesmer — a specialized model trained on 200,000 ROI per tissue category — on this dataset. The human-in-the-loop pipeline with only 167 ROI matched the performance of offline annotation requiring 663 ROI, confirming that iterative active learning is more label-efficient than batch annotation.

Key Features

  • Human-in-the-loop active learning: An iterative cycle of segmenting, correcting errors in the GUI, and retraining for 100 epochs takes under one minute on a GPU, enabling rapid convergence to a custom model without large labeled datasets.
  • Minimal annotation requirements: Effective custom models can be trained with 100-200 corrected ROI using the active learning pipeline, or 500-1,000 ROI with offline annotation — both are substantial reductions from the tens of thousands typically required.
  • Model zoo of nine pretrained checkpoints: Includes the original cyto/cyto2 generalist models, TissueNet-trained models (TN1, TN2, TN3), LiveCell-trained models (LC1, LC2, LC3, LC4), and a cross-dataset model (CPx), organized into style clusters to guide initialization choice.
  • GUI-integrated annotation and training: The Cellpose GUI provides lasso and brush tools for correcting cell boundaries and launches retraining without command-line interaction, making the workflow accessible to researchers without deep learning expertise.
  • 3D segmentation support: Inherits plane-by-plane volumetric segmentation from Cellpose v1 without requiring 3D-labeled training data.
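The human-in-the-loop cycle described above can be pictured as a simple loop: segment, let the user fix the remaining errors, fold the corrections into the training set, and retrain briefly. The sketch below is illustrative only — `ToyModel` and `review_in_gui` are hypothetical stand-ins, not the Cellpose API, and the real workflow runs inside the Cellpose GUI.

```python
class ToyModel:
    """Stand-in for a pretrained checkpoint that makes fewer
    mistakes as it is retrained on more corrected ROI."""
    def __init__(self):
        self.n_train_rois = 0

    def segment(self, images):
        # Toy error model: error count shrinks with training-set size.
        return {"n_errors": max(0, 60 - self.n_train_rois // 2)}

    def retrain(self, rois, epochs=100):
        # In Cellpose 2.0 a 100-epoch retrain takes under a minute on a GPU.
        self.n_train_rois = len(rois)


def review_in_gui(result):
    # Pretend the user corrects every remaining error with the
    # lasso/brush tools, yielding that many newly labelled ROI.
    return [object() for _ in range(result["n_errors"])]


def human_in_the_loop(model, images, max_rounds=10, tolerance=5):
    """Iterate segment -> correct -> retrain until few errors remain."""
    labelled = []
    for rounds in range(1, max_rounds + 1):
        corrections = review_in_gui(model.segment(images))
        if len(corrections) < tolerance:
            break                          # model is "good enough", stop
        labelled.extend(corrections)
        model.retrain(labelled, epochs=100)
    return rounds, len(labelled)
```

Under these toy assumptions the loop stops after five rounds with roughly 113 accumulated ROI — consistent in spirit with the paper's finding that 100-200 iteratively corrected ROI suffice.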

Technical Details

Cellpose 2.0 uses the same U-Net-based architecture as the original Cellpose. The network predicts three output channels: a cell probability map and X/Y spatial gradient flows pointing every pixel toward the nearest cell center. At inference, pixels with high probability are grouped by integrating the flow field to convergence and clustered by convergence point to recover instance boundaries.
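The flow-following step can be illustrated with a minimal NumPy sketch — a toy re-implementation of the idea, not the library's optimized dynamics code: each foreground pixel is advected along the predicted flow field, and pixels that converge to the same point are merged into one instance.

```python
import numpy as np

def follow_flows(prob, flow_y, flow_x, n_iter=50, thresh=0.5):
    """Toy Cellpose-style mask recovery: move every foreground pixel
    along the flow field, then label pixels by convergence point."""
    ys, xs = np.nonzero(prob > thresh)          # foreground pixels
    py, px = ys.astype(float), xs.astype(float)
    H, W = prob.shape
    for _ in range(n_iter):
        # Sample the flow at each point's current (rounded) position.
        iy = np.clip(np.round(py).astype(int), 0, H - 1)
        ix = np.clip(np.round(px).astype(int), 0, W - 1)
        py += flow_y[iy, ix]
        px += flow_x[iy, ix]
    # Pixels sharing a convergence point belong to the same cell.
    centres = {}
    labels = np.zeros_like(prob, dtype=int)
    for y, x, cy, cx in zip(ys, xs, np.round(py), np.round(px)):
        key = (int(cy), int(cx))
        labels[y, x] = centres.setdefault(key, len(centres) + 1)
    return labels
```

On a synthetic field with two attractors (two "cell centres"), every pixel ends up labelled by the centre its flow vectors point to, recovering two instances.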

A key architectural element is the 256-dimensional style vector, computed by global average pooling over the deepest encoder features. This vector is broadcast and added to feature maps at later convolutional layers, allowing the network to adapt its prediction style to image-level context. Style vectors from the nine pretrained models were analyzed with t-SNE and Leiden clustering to verify that each captures a meaningfully distinct segmentation style. The generalist models were trained for 500 epochs (batch size 8, weight decay 1e-5, learning rate 0.2 with 10-epoch warmup) on TissueNet and LiveCell. Human-in-the-loop retraining cycles use only 100 epochs to keep iteration times short. Fine-tuned models at 1,000 ROI achieve approximately 0.73 average precision (AP at IoU=0.5) on TissueNet, outperforming Mesmer despite using far less training data.
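The pool-and-broadcast mechanism behind the style vector is easy to sketch. In the toy version below, `proj` stands in for a learned projection matrix and the tensor shapes are illustrative, not Cellpose's actual layer dimensions:

```python
import numpy as np

def apply_style(deep_feats, later_feats, proj):
    """Global-average-pool the deepest encoder features into a single
    style vector, project it to the later layer's channel count, and
    broadcast-add it at every spatial position."""
    # deep_feats: (C_deep, H, W) -> style: (C_deep,), e.g. 256-dim
    style = deep_feats.mean(axis=(1, 2))
    # proj: (C_later, C_deep) assumed learned weights -> bias: (C_later,)
    bias = proj @ style
    # Same per-channel shift applied across the whole feature map.
    return later_feats + bias[:, None, None]
```

Because the style vector is pooled over all spatial positions, it carries image-level context (staining, modality, morphology) rather than local detail — which is why it serves as a compact signature for clustering the model zoo.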

Applications

Cellpose 2.0 is primarily used to adapt segmentation to imaging conditions not covered by generalist models — unusual cell lines, non-standard fluorophore combinations, proprietary imaging platforms, or morphologies absent from public training data. In pharmaceutical high-content screening, fine-tuning from a TissueNet or LiveCell checkpoint on a small annotated set from the target cell line yields substantially better segmentation than a generalist model. The LiveCell-trained checkpoints (LC1-LC4) provide strong initialization for label-free phase-contrast and brightfield microscopy, where generalist fluorescence models underperform. The released TissueNet human-in-the-loop dataset and annotation efficiency benchmarks also serve as a reference for laboratories evaluating how to structure their annotation workflows.
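Since the model zoo is organized into style clusters, checkpoint selection for a new dataset can be pictured as nearest-style matching. The helper below is a hypothetical sketch — Cellpose does not expose this exact function — that picks the checkpoint whose representative style vector is most cosine-similar to the style of a new image:

```python
import numpy as np

def nearest_checkpoint(image_style, zoo_styles):
    """Return the name of the pretrained checkpoint whose style vector
    best matches 'image_style'. 'zoo_styles' maps a checkpoint name
    (e.g. 'TN1', 'LC1') to a representative style vector; both the
    helper and the mapping are illustrative assumptions."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(zoo_styles, key=lambda name: cos(image_style, zoo_styles[name]))
```

The intuition matches the paper's clustering analysis: a phase-contrast image should land nearer the LiveCell cluster than the TissueNet one, making an LC checkpoint the better fine-tuning start.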

Impact

Cellpose 2.0 extended one of the most widely used cell segmentation tools in biological imaging by dramatically reducing the annotation burden for custom cell types. The paper demonstrated that active learning with a suitable pretrained initialization can close the gap with specialized models trained on orders-of-magnitude more data, a finding with broad implications for how biological imaging laboratories approach model fine-tuning. The framework is distributed through the standard Cellpose Python package (pip install cellpose), with model zoo checkpoints downloaded automatically on first use. Notable limitations include a dependency on GPU hardware for practical retraining speeds and evaluation concentrated on TissueNet, meaning annotation requirements for more challenging scenarios such as bacteria, electron microscopy, or 3D volumetric data remain less well characterized.

Citation

Cellpose 2.0: how to train your own model

Pachitariu, M. & Stringer, C. (2022). Cellpose 2.0: how to train your own model. Nature Methods, 19, 1634–1641.

DOI: 10.1038/s41592-022-01663-4

Metrics

GitHub

Stars: 2.2K
Forks: 609
Open Issues: 96
Contributors: 61
Last Push: 2d ago
Language: Python
License: BSD-3-Clause

Citations

Total Citations: 988
Influential: 136
References: 60

Tags

instance segmentation · segmentation · active learning · transfer learning · cell biology · fluorescence microscopy

Resources

GitHub Repository · Research Paper · Official Website · Documentation