ScribblePrompt

MIT CSAIL / Massachusetts General Hospital

Interactive foundation model for biomedical image segmentation, prompted with scribbles, clicks, and bounding boxes to segment unseen structures.

Released: July 2024

ScribblePrompt is an interactive segmentation foundation model for biomedical imaging that lets a user delineate anatomical or pathological structures by drawing scribbles, placing clicks, or dragging bounding boxes, rather than training a new model for each task. It addresses a persistent bottleneck in medical image analysis: manual annotation is slow and expensive, and task-specific models do not generalize to the long tail of structures, modalities, and acquisition protocols that clinicians and researchers actually encounter. By treating segmentation as a promptable, iterative process, the model produces accurate masks for structures and image types it never saw during training.

The model was developed by Hallee Wong, Marianne Rakic, John Guttag, and Adrian Dalca at MIT CSAIL, in affiliation with Massachusetts General Hospital, and was presented at ECCV 2024 (preprint December 2023). It belongs to a wave of promptable segmentation systems inspired by the Segment Anything Model (SAM), but is purpose-built for the heterogeneity of biomedical data, where SAM and similar natural-image models tend to underperform.

ScribblePrompt is released in two variants: ScribblePrompt-UNet, an efficient fully-convolutional network, and ScribblePrompt-SAM, which adapts the SAM architecture. Both are designed for fast, responsive inference so that a human can refine a prediction in real time.

#Key Features

  • Multi-modal prompting: Accepts scribbles, positive/negative clicks, and bounding boxes, alone or in combination, giving annotators flexible control over how they specify a target region.
  • Generalization to unseen tasks: Segments structures, modalities, and datasets absent from training, outperforming comparable interactive methods on held-out biomedical datasets.
  • Two architectures: A lightweight UNet variant for fast CPU/GPU inference and a SAM-based variant, so users can choose the faster fully-convolutional model or the one built on the SAM architecture.
  • Trained for iterative use: A simulation algorithm generates realistic scribble, click, and box interactions during training, so the model behaves well across successive correction rounds rather than only on a single prompt.
  • Measured annotation speedup: In a user study with domain experts, ScribblePrompt cut annotation time by 28% while improving Dice score by 15% compared with the next best method.
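The Dice score cited in the user-study result measures the overlap between a predicted mask and a reference mask: twice the size of their intersection divided by the sum of the two mask sizes. A minimal sketch in dependency-free Python is shown here; the function name `dice_score`, the nested-list mask format, and the convention for two empty masks are illustrative assumptions and are not taken from the ScribblePrompt codebase.

```python
def dice_score(pred, target):
    """Dice overlap between two binary masks given as nested lists of 0/1."""
    pred_flat = [bool(v) for row in pred for v in row]
    target_flat = [bool(v) for row in target for v in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must have the same number of pixels")
    # Count pixels that are foreground in both masks.
    intersection = sum(p and t for p, t in zip(pred_flat, target_flat))
    total = sum(pred_flat) + sum(target_flat)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy 3x3 example: 3 predicted pixels, 2 reference pixels, 2 shared.
pred = [[1, 1, 0],
        [0, 1, 0],
        [0, 0, 0]]
target = [[1, 1, 0],
          [0, 0, 0],
          [0, 0, 0]]
example_dice = dice_score(pred, target)  # 2 * 2 / (3 + 2) = 0.8
```

In practice masks would usually be NumPy arrays or tensors; nested lists are used here only to keep the sketch self-contained.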

#Technical Details

ScribblePrompt is trained on a collection of 65 diverse biomedical imaging datasets spanning many modalities (including MRI, CT, ultrasound, X-ray, and microscopy), combining real labels with synthetically generated ones to broaden coverage. A central methodological contribution is the algorithm that simulates human interactions during training: it produces varied, realistic scribbles, clicks, and bounding boxes so the network learns to interpret partial, ambiguous, and iteratively refined prompts. ScribblePrompt-UNet uses an efficient fully-convolutional encoder-decoder, while ScribblePrompt-SAM fine-tunes the Segment Anything Model backbone. In evaluation on unseen datasets and in a controlled user study, it was more accurate than baselines including SAM and SAM-Med2D while remaining fast enough for interactive use.
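To make the idea of simulated interactions concrete, the sketch below derives two kinds of prompt from a ground-truth mask: a loosened bounding box, and a correction click sampled from the pixels where a prediction disagrees with the label. This is a deliberately simplified illustration and not the authors' simulation algorithm, which also generates scribbles. All function names, the outward-only box jitter, and the nested-list mask format are assumptions made for the example.

```python
import random


def bounding_box(mask):
    """Tight box (row_min, col_min, row_max, col_max) around foreground pixels."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return min(rows), min(cols), max(rows), max(cols)


def jitter_box(box, height, width, max_shift, rng):
    """Push each edge outward by a random amount to mimic an imprecise box."""
    r0, c0, r1, c1 = box
    r0 = max(0, r0 - rng.randint(0, max_shift))
    c0 = max(0, c0 - rng.randint(0, max_shift))
    r1 = min(height - 1, r1 + rng.randint(0, max_shift))
    c1 = min(width - 1, c1 + rng.randint(0, max_shift))
    return r0, c0, r1, c1


def correction_click(pred, target, rng):
    """Sample (row, col, is_positive) from pixels where pred and target differ.

    A missed foreground pixel yields a positive click; a false-positive
    pixel yields a negative click. Returns None when there is no error.
    """
    errors = [
        (r, c, bool(t))
        for r, (pred_row, target_row) in enumerate(zip(pred, target))
        for c, (p, t) in enumerate(zip(pred_row, target_row))
        if bool(p) != bool(t)
    ]
    return rng.choice(errors) if errors else None


# Toy 4x4 label with a 2x2 foreground square, and an empty first prediction.
rng = random.Random(0)
target = [[0, 0, 0, 0],
          [0, 1, 1, 0],
          [0, 1, 1, 0],
          [0, 0, 0, 0]]
empty_pred = [[0] * 4 for _ in range(4)]

tight_box = bounding_box(target)                    # (1, 1, 2, 2)
loose_box = jitter_box(tight_box, 4, 4, 1, rng)     # contains tight_box
click = correction_click(empty_pred, target, rng)   # a positive click
```

During training, prompts like these would be generated on the fly and fed back to the network over several rounds, which is what lets a model learn to respond to successive corrections instead of a single prompt.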

#Applications

ScribblePrompt is aimed at researchers and clinicians who need to annotate or segment biomedical images at scale, such as building labeled datasets for downstream models, quantifying lesions or organs in research studies, or prototyping segmentation for a new modality without collecting task-specific training data. Its browser-based demo and lightweight UNet variant make it practical for labs without large compute budgets, and its interactive design fits naturally into human-in-the-loop annotation pipelines where an expert verifies and corrects each mask.
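The human-in-the-loop workflow described above can be pictured as a loop of predict, review, and correct. The sketch below is a hypothetical outline of such a loop, not ScribblePrompt's API: `annotate`, `segment`, `review`, and the `(row, col, is_positive)` prompt format are all invented for illustration, and the two stub functions stand in for a real model and a real expert.

```python
def annotate(image, segment, review, first_prompt, max_rounds=5):
    """Run predict -> review -> correct rounds.

    segment(image, prompts) returns a mask; review(image, mask) returns
    None to accept the mask or a new prompt to correct it. Returns
    (mask, prompts, accepted). accepted is False when the correction
    budget runs out before the reviewer accepts.
    """
    prompts = [first_prompt]
    mask = segment(image, prompts)
    for _ in range(max_rounds):
        correction = review(image, mask)
        if correction is None:
            return mask, prompts, True
        prompts.append(correction)
        mask = segment(image, prompts)
    return mask, prompts, False


# Stubs for demonstration only: the "model" marks exactly the positively
# clicked pixels, and the "expert" clicks the first pixel still missing.
reference = {(0, 0), (0, 1)}


def stub_segment(image, prompts):
    return {(r, c) for r, c, is_positive in prompts if is_positive}


def stub_review(image, mask):
    missing = sorted(reference - mask)
    return None if not missing else (missing[0][0], missing[0][1], True)


final_mask, used_prompts, accepted = annotate(
    None, stub_segment, stub_review, first_prompt=(0, 0, True)
)
```

Keeping the model and the reviewer behind plain callables means the same loop can be driven by a person in a labeling tool or by a simulated annotator during evaluation.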

#Impact

ScribblePrompt demonstrated that a single promptable model, trained with carefully simulated interactions, can serve as a general-purpose annotation tool across the fragmented landscape of biomedical imaging, where most prior work was narrowly task-specific. By releasing both model weights (Apache 2.0) and an interactive demo, the authors made the approach immediately usable, and the measured reductions in annotation time point to concrete value for dataset creation and clinical research. The accompanying MedScribble dataset of multi-annotator scribble annotations also provides a benchmark resource for the interactive-segmentation community. As a biomedical counterpart to SAM-style promptable models, it remains a reference point for scribble- and click-based medical image segmentation.

Citations

ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image

Wong, H. E., et al. (2024) ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image. European Conference on Computer Vision.

DOI: 10.1007/978-3-031-73661-2_12

ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image

Preprint

Wong, H. E., et al. (2023) ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image. arXiv preprint arXiv:2312.07381.

DOI: 10.48550/arXiv.2312.07381

Citations

Total Citations: 64
Influential: 10
References: 145

GitHub

Stars: 217
Forks: 21
Open Issues: 10
Contributors: 2
Last Push: 11mo ago
Language: Jupyter Notebook
License: Apache-2.0

Openness

bio.rodeo openness: Open weights (open weights, closed recipe)
Score: 63 (Partial)
Usability (can I run it?): 87
Reproducibility (can I retrain it?): 38
Model Openness Framework: Unclassified (restrictive license on core components)

Tags

cnn, foundation_model, interactive_segmentation, microscopy, radiology, segmentation, vision_transformer, zero_shot
