MedSAM

Bowang Lab / University Health Network / University of Toronto / Vector Institute / Western University / New York University / Yale University

Promptable foundation model for universal medical image segmentation, fine-tuned from SAM on 1.57M image-mask pairs across 10 imaging modalities.

Released: January 2024

MedSAM is a promptable foundation model for universal medical image segmentation, developed by Jun Ma, Bo Wang, and colleagues at the University Health Network, University of Toronto, the Vector Institute, and collaborating institutions, and published in Nature Communications in January 2024. It adapts Meta AI's Segment Anything Model (SAM) — a general-purpose natural-image segmentation model — to the medical domain, where SAM's zero-shot performance is unreliable because biomedical images differ sharply from web photographs in contrast, texture, and object boundaries.

Image segmentation is a foundational step in nearly every quantitative medical imaging workflow, from delineating tumors for radiotherapy planning to measuring organ volumes. Historically this required a separate specialist model per task and modality, each demanding a large annotated dataset and proving brittle under distribution shift. MedSAM instead provides a single model that segments arbitrary structures across modalities when given a bounding-box prompt, collapsing dozens of task-specific pipelines into one interactive tool.

By training on the largest and most diverse medical segmentation corpus assembled at the time, MedSAM demonstrated that the prompt-driven, foundation-model paradigm transfers to medicine. It has become one of the most widely adopted reference points for promptable segmentation in biomedical imaging and seeded a family of follow-up work.

Key Features

Bounding-box promptable segmentation: Users specify a target by drawing a box, and MedSAM returns a mask — enabling interactive, human-in-the-loop annotation rather than fully automated black-box output.
Modality-agnostic coverage: A single model handles 10 imaging modalities: CT, MRI, endoscopy, ultrasound, pathology, fundus, dermoscopy, mammography, OCT, and chest X-ray.
Strong generalization: On external validation, MedSAM maintained consistent accuracy on unseen datasets where task-specific U-Net and DeepLabV3+ models degraded sharply.
Annotation acceleration: In a human study, MedSAM reduced expert tumor annotation time by roughly 82%, a direct practical benefit for building labeled datasets.
Open and accessible: Code (Apache-2.0) and pretrained ViT-B weights are publicly released, with CLI, Jupyter, and GUI inference paths plus a HuggingFace checkpoint.
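
The box-prompt interface described in the features above can be illustrated with a small sketch. It derives a bounding box from an existing binary mask and optionally loosens it by a random margin, which is one common way to imitate an imprecise hand-drawn box when preparing prompts. The function name, the `max_shift` parameter, and the `(x_min, y_min, x_max, y_max)` box layout are assumptions made for illustration. They are not taken from the MedSAM code.

```python
import random
from typing import List, Optional, Tuple

Mask = List[List[int]]            # rows of 0/1 values, indexed as mask[y][x]
Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max), inclusive pixel indices


def box_from_mask(mask: Mask, max_shift: int = 0,
                  rng: Optional[random.Random] = None) -> Box:
    """Return the tight bounding box of the foreground, optionally loosened.

    Each side is pushed outward by a random amount in [0, max_shift] and
    clipped to the image, imitating an imprecise hand-drawn box.
    """
    rng = rng or random.Random()
    height, width = len(mask), len(mask[0])
    # Rows and columns that contain at least one foreground pixel.
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for x in range(width) if any(row[x] for row in mask)]
    if not ys:
        raise ValueError("mask has no foreground pixels")
    x_min = max(0, xs[0] - rng.randint(0, max_shift))
    y_min = max(0, ys[0] - rng.randint(0, max_shift))
    x_max = min(width - 1, xs[-1] + rng.randint(0, max_shift))
    y_max = min(height - 1, ys[-1] + rng.randint(0, max_shift))
    return (x_min, y_min, x_max, y_max)


# Toy 4x5 mask with a small foreground blob.
mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
tight = box_from_mask(mask)  # no perturbation: the tight box
```

With `max_shift=0` the result is the tight box around the blob. A positive `max_shift` yields a box that always contains the tight box and never leaves the image.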

Technical Details

MedSAM retains SAM's three-part architecture: a ViT-Base image encoder, a prompt encoder, and a lightweight mask decoder. It was initialized from pretrained SAM weights. During fine-tuning, the prompt encoder was frozen while the image encoder and mask decoder were updated, and only bounding-box prompts were used to keep the interface simple and clinically practical. Training used 1,570,263 image-mask pairs curated from publicly available sources, spanning 10 modalities and over 30 cancer types, and ran on 20 A100 GPUs.

Evaluation covered 86 internal and 60 external segmentation tasks. Across these, MedSAM substantially outperformed the original SAM (improvements ranging from roughly 15% to over 50% on hard tasks such as nasopharynx cancer) and was competitive with or superior to specialist segmentation networks while generalizing far better to out-of-distribution data.
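
The fine-tuning split described above (prompt encoder frozen, image encoder and mask decoder updated) can be sketched as a simple partition of parameter names. This is a dependency-free illustration, not the authors' training script. The prefixes assume the three components are exposed as `image_encoder`, `prompt_encoder`, and `mask_decoder`, and the example parameter names are illustrative only.

```python
from typing import Dict, Iterable, List

# Assumed submodule prefixes for the three components; adjust them if a
# checkpoint wraps or renames the components.
TRAINABLE_PREFIXES = ("image_encoder.", "mask_decoder.")
FROZEN_PREFIXES = ("prompt_encoder.",)


def split_parameters(names: Iterable[str]) -> Dict[str, List[str]]:
    """Partition parameter names into trainable and frozen groups."""
    groups: Dict[str, List[str]] = {"trainable": [], "frozen": []}
    for name in names:
        if name.startswith(TRAINABLE_PREFIXES):
            groups["trainable"].append(name)
        elif name.startswith(FROZEN_PREFIXES):
            groups["frozen"].append(name)
        else:
            # Fail loudly rather than silently training or freezing something.
            raise ValueError(f"unexpected parameter name: {name}")
    return groups


# Illustrative parameter names, not an exhaustive or verified listing.
example_names = [
    "image_encoder.blocks.0.attn.qkv.weight",
    "prompt_encoder.point_embeddings.0.weight",
    "mask_decoder.output_upscaling.0.weight",
]
groups = split_parameters(example_names)
```

In a PyTorch training loop, the same split would typically be applied by setting `requires_grad = False` on the frozen parameters and passing only the trainable ones to the optimizer.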

Applications

MedSAM serves radiologists, pathologists, and medical imaging researchers as a general-purpose segmentation backbone. Typical uses include accelerating manual annotation of tumors and organs, generating training labels for downstream specialist models, supporting radiotherapy and surgical planning, and providing a strong baseline for new segmentation benchmarks. Because it accepts simple box prompts, it integrates cleanly into interactive annotation tools and clinical research pipelines without per-task retraining.
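
As a rough sketch of how a box-prompted segmenter slots into a label-generation pipeline, the loop below takes one box per structure and merges the returned masks into a single integer label map. The `Segmenter` callable is a hypothetical interface, and `fill_box_segmenter` is a stand-in that simply fills the box. A real pipeline would call the model there instead.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[float]]
Mask = List[List[int]]
Box = Tuple[int, int, int, int]            # (x_min, y_min, x_max, y_max), inclusive
Segmenter = Callable[[Image, Box], Mask]   # hypothetical box-prompt interface


def boxes_to_label_map(image: Image, prompts: Dict[int, Box],
                       segment: Segmenter) -> Mask:
    """Run one box prompt per label and merge the masks into a label map.

    Labels must be non-zero integers; 0 is reserved for background.
    Where masks overlap, the earlier prompt keeps the pixel.
    """
    height, width = len(image), len(image[0])
    label_map = [[0] * width for _ in range(height)]
    for label, box in prompts.items():
        mask = segment(image, box)
        for y in range(height):
            for x in range(width):
                if mask[y][x] and label_map[y][x] == 0:
                    label_map[y][x] = label
    return label_map


def fill_box_segmenter(image: Image, box: Box) -> Mask:
    """Stand-in segmenter: marks every pixel inside the box as foreground."""
    x_min, y_min, x_max, y_max = box
    return [[1 if x_min <= x <= x_max and y_min <= y <= y_max else 0
             for x in range(len(image[0]))]
            for y in range(len(image))]


image = [[0.0] * 6 for _ in range(4)]
prompts = {1: (0, 0, 1, 1), 2: (1, 1, 3, 2)}
label_map = boxes_to_label_map(image, prompts, fill_box_segmenter)
```

Because the segmenter is passed in as a plain callable, the same loop works for any structure or modality without per-task changes. That independence from the task is the property the paragraph above relies on.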

Impact

MedSAM was among the first works to demonstrate that the promptable foundation-model paradigm transfers effectively to medical imaging, and it has been heavily cited and adopted as a reference baseline across the field. Its public code and weights catalyzed a wave of follow-up models, including video- and 3D-oriented successors such as MedSAM2, and helped establish interactive, prompt-driven segmentation as a practical standard for biomedical image analysis.

Limitations remain. The model depends on a human-supplied prompt rather than fully automatic detection, performs best on well-bounded structures, and inherits coverage gaps from its training modalities. Fully automated and 3D/temporal segmentation therefore remain active areas of extension.

Citation

Segment anything in medical images

Ma, J., et al. (2024) Segment anything in medical images. Nature Communications.

DOI: 10.1038/s41467-024-44824-z

Recent citations

Papers that recently cited this model.

Organ-aware mixture-of-experts framework for generalized pan-tumor segmentation
Hancang Mi, H. Gan, Dong Ma, et al.
Biomedical Signal Processing and Control · Oct 2026
0
Automatic prompt generation via reinforcement learning guided contrastive purification for label-free SAM segmentation
Yansong Zhang, Hangbei Cheng, Jianan Zhang, et al.
Biomedical Signal Processing and Control · Oct 2026
0
Computer tomography image segmentation using Trans-Kronecker encoder and fusion loss function
G. Santoshi, Ratnakar Dash
Biomedical Signal Processing and Control · Oct 2026
0

Top citations

The most-cited papers that cite this model.

Comparison of Vision Transformers and Convolutional Neural Networks in Medical Image Analysis: A Systematic Review
Satoshi Takahashi, Yusuke Sakaguchi, Nobuji Kouno, et al.
Journal of medical systems · Sep 2024
270
Segment Anything for Microscopy
Anwai Archit, Sushmita Nair, Nabeel Khalid, et al.
bioRxiv · Aug 2023
255 · Influential
A foundation model for joint segmentation, detection and recognition of biomedical objects across nine modalities
Theodore Zhao, Yu Gu, Jianwei Yang, et al.
Nature Methods · May 2024
162 · Influential
The Multi-modality Cell Segmentation Challenge: Towards Universal Solutions
Jun Ma, Ronald Xie, Shamini Ayyadhury, et al.
Nature Methods · Aug 2023
151
Development and validation of an autonomous artificial intelligence agent for clinical decision-making in oncology
Dyke Ferber, O. E. El Nahhas, G. Wölflein, et al.
Nature Cancer · Jun 2025
122 · Influential

Citations

Total Citations: 1.4K
Influential: 82
References: 54

GitHub

Stars: 4.4K
Forks: 594
Open Issues: 17
Contributors: 9
Last Push: 1y ago
Language: Jupyter Notebook
License: Apache-2.0

HuggingFace

Downloads: 1.8K
Likes: 24
Last Modified: 3y ago
Pipeline: mask-generation

Fields of citing research

Computer Science: 28%
Medicine: 26%
Engineering: 11%
Environmental Science: 2%
Materials Science: 1%
Biology: 1%
Physics: 1%
Geology: 0%

Share of papers citing this model.

Openness

bio.rodeo openness: Fully open · usable and reproducible
Overall score: 82 (Open)
Usability (can I run it?): 100
Reproducibility (can I retrain it?): 58
Model Openness Framework: Class III (Open Model)

Resources

GitHub Repository · Research Paper · HuggingFace Model
