ScribblePrompt

MIT CSAIL / Massachusetts General Hospital

Interactive foundation model for biomedical image segmentation, prompted with scribbles, clicks, and bounding boxes to segment unseen structures.

Released: July 2024

ScribblePrompt is an interactive segmentation foundation model for biomedical imaging that lets a user delineate anatomical or pathological structures by drawing scribbles, placing clicks, or dragging bounding boxes, rather than training a new model for each task. It addresses a persistent bottleneck in medical image analysis: manual annotation is slow and expensive, and task-specific models do not generalize to the long tail of structures, modalities, and acquisition protocols that clinicians and researchers actually encounter. By treating segmentation as a promptable, iterative process, the model produces accurate masks for structures and image types it never saw during training.

The model was developed by Hallee Wong, Marianne Rakic, John Guttag, and Adrian Dalca at MIT CSAIL, with clinical affiliation to Massachusetts General Hospital, and was presented at ECCV 2024 (preprint December 2023). It is part of a wave of promptable segmentation systems inspired by the Segment Anything Model (SAM), but is purpose-built for the heterogeneity of biomedical data, where SAM and similar natural-image models tend to underperform.

ScribblePrompt is released in two variants: ScribblePrompt-UNet, an efficient fully-convolutional network, and ScribblePrompt-SAM, which adapts the SAM architecture. Both are designed for fast, responsive inference so that a human can refine a prediction in real time.

Key Features

Multi-modal prompting: Accepts scribbles, positive/negative clicks, and bounding boxes, alone or in combination, giving annotators flexible control over how they specify a target region.
Generalization to unseen tasks: Segments structures, modalities, and datasets absent from training, outperforming comparable interactive methods on held-out biomedical datasets.
Two architectures: A lightweight UNet variant for fast CPU/GPU inference and a SAM-based variant, so users can choose the one that fits their latency and hardware constraints.
Trained for iterative use: A simulation algorithm generates realistic scribble, click, and box interactions during training, so the model behaves well across successive correction rounds rather than only on a single prompt.
Measured annotation speedup: In a user study with domain experts, it cut annotation time by 28% while improving Dice score by 15% compared with the next best method.
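
The Dice score mentioned in the user study is the standard overlap metric 2|A ∩ B| / (|A| + |B|) between a predicted mask A and a reference mask B. The following is a minimal sketch of that metric, not code from the ScribblePrompt repository; the function name `dice_score` and the use of nested Python lists (rather than array tensors) are assumptions made to keep the example dependency-free.

```python
def dice_score(pred, target):
    """Dice overlap between two binary masks given as nested lists of 0/1."""
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    # Total foreground pixels across both masks.
    total = sum(1 for p in flat_pred if p) + sum(1 for t in flat_target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: the prediction misses one of four foreground pixels.
pred = [[0, 1, 1],
        [0, 1, 0],
        [0, 0, 0]]
target = [[0, 1, 1],
          [0, 1, 1],
          [0, 0, 0]]
print(round(dice_score(pred, target), 3))
```

Here three foreground pixels overlap, with 3 predicted and 4 reference foreground pixels, so the score is 2·3 / (3 + 4) = 6/7.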

Technical Details

ScribblePrompt is trained on a collection of 65 diverse biomedical imaging datasets spanning many modalities (including MRI, CT, ultrasound, X-ray, and microscopy), combining real labels with synthetically generated ones to broaden coverage. A central methodological contribution is the algorithm that simulates human interactions during training: it produces varied, realistic scribbles, clicks, and bounding boxes so the network learns to interpret partial, ambiguous, and iteratively refined prompts. ScribblePrompt-UNet uses an efficient fully-convolutional encoder-decoder, while ScribblePrompt-SAM fine-tunes the Segment Anything Model backbone. Evaluation across unseen datasets and a controlled user study showed it surpassing baselines including SAM and SAM-Med2D on accuracy while remaining fast enough for interactive use.
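
To make the idea of simulated interactions concrete, the sketch below derives three kinds of prompts from a ground-truth mask: a tight bounding box, a correction click sampled from the region where a prediction disagrees with the label, and a crude scribble. It is an illustration under stated assumptions, not the authors' simulation algorithm: the helper names (`bounding_box`, `sample_correction_click`, `random_walk_scribble`) are hypothetical, and the random-walk scribble is a simple stand-in for the more realistic scribbles described above.

```python
import random


def bounding_box(mask):
    """Tight (row_min, col_min, row_max, col_max) box around foreground, or None."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return min(rows), min(cols), max(rows), max(cols)


def sample_correction_click(pred, target, rng):
    """Pick one pixel where pred and target disagree.

    Returns (row, col, is_positive): positive for missed foreground,
    negative for wrongly included background. Returns None if there is no error.
    """
    errors = [(r, c, bool(t))
              for r, (pred_row, target_row) in enumerate(zip(pred, target))
              for c, (p, t) in enumerate(zip(pred_row, target_row))
              if bool(p) != bool(t)]
    return rng.choice(errors) if errors else None


def random_walk_scribble(mask, rng, steps=10):
    """Hypothetical scribble: a 4-connected random walk kept inside the foreground."""
    inside = {(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v}
    if not inside:
        return []
    point = rng.choice(sorted(inside))
    path = [point]
    for _ in range(steps):
        r, c = point
        neighbours = [(r + dr, c + dc)
                      for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                      if (r + dr, c + dc) in inside]
        if not neighbours:
            break  # isolated pixel: the walk cannot continue
        point = rng.choice(neighbours)
        path.append(point)
    return path


target = [[0, 0, 0, 0],
          [0, 1, 1, 0],
          [0, 1, 1, 0],
          [0, 0, 0, 0]]
pred = [[0, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 1, 1, 1],
        [0, 0, 0, 0]]

rng = random.Random(0)  # seeded so the example is repeatable
box = bounding_box(target)
click = sample_correction_click(pred, target, rng)
scribble = random_walk_scribble(target, rng, steps=5)
print(box, click, scribble)
```

In an iterative training loop of this kind, the correction click would be fed back to the model together with the previous prediction, which is one way a network can learn to behave sensibly across successive rounds rather than only on a first prompt.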

Applications

ScribblePrompt is aimed at researchers and clinicians who need to annotate or segment biomedical images at scale, such as building labeled datasets for downstream models, quantifying lesions or organs in research studies, or prototyping segmentation for a new modality without collecting task-specific training data. Its browser-based demo and lightweight UNet variant make it practical for labs without large compute budgets, and its interactive design fits naturally into human-in-the-loop annotation pipelines where an expert verifies and corrects each mask.

Impact

ScribblePrompt demonstrated that a single promptable model, trained with carefully simulated interactions, can serve as a general-purpose annotation tool across the fragmented landscape of biomedical imaging, where most prior work was narrowly task-specific. By releasing both model weights (Apache 2.0) and an interactive demo, the authors made the approach immediately usable, and the measured reductions in annotation time point to concrete value for dataset creation and clinical research. The accompanying MedScribble dataset of multi-annotator scribble annotations also provides a benchmark resource for the interactive-segmentation community. As a biomedical counterpart to SAM-style promptable models, it remains a reference point for scribble- and click-based medical image segmentation.

Citations

ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image

Wong, H. E., et al. (2024) ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image. European Conference on Computer Vision.

DOI: 10.1007/978-3-031-73661-2_12

ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image

Preprint

Wong, H. E., et al. (2023) ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image. arXiv preprint arXiv:2312.07381.

DOI: 10.48550/arXiv.2312.07381

Recent citations

Papers that recently cited this model.

Segment anything model for medical image segmentation: A review
Hanguang Xiao, Shuai Liu, Xingyu Liu, et al.
Computerized Medical Imaging and Graphics · Jun 2026
Citations: 0

RoadGIE: Towards A Global-Scale Aerial Benchmark for Generalizable Interactive Road Extraction
Chenxu Peng, Chenxu Wang, Yimian Dai, et al.
May 2026
Citations: 0 · Influential

SILSM: A Sustainable Interactive Level Set Method for Progressive Refinement
Jiacheng Song, Dazhi Zhang, Fanghui Song, et al.
May 2026
Citations: 0

Top citations

The most-cited papers that cite this model.

MedCLIP-SAMv2: Towards Universal Text-Driven Medical Image Segmentation
Taha Koleilat, Hojat Asgariandehkordi, H. Rivaz, et al.
Medical Image Anal. · Sep 2024
Citations: 51

nnInteractive: Redefining 3D Promptable Segmentation
Fabian Isensee, Maximilian Rokuss, Lars Krämer, et al.
arXiv.org · Mar 2025
Citations: 50 · Influential

Label-Efficient Deep Learning in Medical Image Analysis: Challenges and Future Directions
Cheng Jin, Zhengrui Guo, Yi Lin, et al.
arXiv.org · Mar 2023
Citations: 17

Medical Image Analysis
Wenhui Lei, Wei Xu, Kang Li, et al.
Citations: 16

SegAnyPET: Universal Promptable Segmentation from Positron Emission Tomography Images
Yichi Zhang, Le Xue, Wenbo Zhang, et al.
IEEE International Conference on Computer Vision · Feb 2025
Citations: 14

Citations

Total Citations: 66

Influential: 10

References: 145

GitHub

Stars: 220

Forks: 21

Open Issues: 10

Contributors: 2

Last Push: 1y ago

Language: Jupyter Notebook

License: Apache-2.0

Fields of citing research

Computer Science: 98%
Medicine: 85%
Engineering: 34%
Biology: 5%
Physics: 3%
Environmental Science: 3%
Geography: 2%
Art: 2%

Share of papers citing this model.

Openness

bio.rodeo openness: Open weights (open weights, closed recipe)

Overall: 63 (Partial)

Usability (can I run it?): 87

Reproducibility (can I retrain it?): 38

Model Openness Framework: Unclassified (restrictive license on core components)

Resources

GitHub Repository · Research Paper · Official Website · Google Colab Demo
