A medical-imaging adaptation of the Segment Anything Model, fine-tuned on ~4.6M images and 19.7M masks for promptable 2D segmentation across modalities.
SAM-Med2D adapts Meta AI's Segment Anything Model (SAM), a promptable, general-purpose image segmentation foundation model, to the medical imaging domain. The original SAM was trained on natural images and performs worse on medical scans, where boundaries are subtle, contrast is low, and objects differ markedly from everyday photographs. SAM-Med2D narrows this domain gap through large-scale fine-tuning on curated medical data. It was released in August 2023 by researchers at Shanghai AI Laboratory (OpenGVLab) together with academic collaborators.
The core contribution is the assembly of one of the largest medical segmentation corpora to date and a fine-tuning recipe that injects medical domain knowledge into SAM while keeping inference promptable. The authors collected and curated approximately 4.6 million images and 19.7 million masks (the SA-Med2D-20M dataset) spanning 10 imaging modalities, 4 anatomical structure groups plus lesions, and 31 major human organs. Rather than retraining from scratch, they freeze the heavy image encoder and insert lightweight learnable adapter layers, then fine-tune the prompt encoder and mask decoder interactively.
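To see why inserting adapters is cheap compared with updating a whole encoder, the following sketch counts the parameters of a generic bottleneck adapter. It is an illustration under stated assumptions, not the authors' adapter design: the function name, the 768-wide embedding, the 12 blocks, and the bottleneck width of 192 are hypothetical values chosen to resemble a ViT-B-sized encoder.

```python
def adapter_param_count(dim: int, bottleneck: int) -> int:
    """Count parameters in a generic bottleneck adapter.

    Assumption: the adapter is a dense down-projection followed by a dense
    up-projection, each with a bias. This is a common adapter layout and is
    not necessarily the one used in SAM-Med2D.
    """
    down = dim * bottleneck + bottleneck  # weights + bias
    up = bottleneck * dim + dim           # weights + bias
    return down + up


# Hypothetical sizes resembling a ViT-B-style encoder (assumed, not from the source).
EMBED_DIM = 768
NUM_BLOCKS = 12
BOTTLENECK = 192

per_block = adapter_param_count(EMBED_DIM, BOTTLENECK)
total_adapter_params = per_block * NUM_BLOCKS

print(f"adapter parameters per block: {per_block:,}")
print(f"adapter parameters in total:  {total_adapter_params:,}")
```

With these assumed sizes the adapters add about 3.6 million trainable parameters, a small fraction of a ViT-B-scale encoder whose tens of millions of weights stay frozen.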
SAM-Med2D fits into the rapidly growing family of medical segmentation foundation models (alongside efforts such as MedSAM) that aim to provide a single, promptable model usable across organs and modalities, reducing the need to train a bespoke network for every new segmentation task.
SAM-Med2D retains SAM's ViT-based image encoder, prompt encoder, and lightweight mask decoder. During adaptation the image encoder is frozen and augmented with learnable adapter modules in each Transformer block, while the prompt encoder and mask decoder are fine-tuned through interactive, multi-prompt training at a default 256×256 input resolution. In the authors' evaluation, the model reaches roughly 79.3% Dice with bounding-box prompts and about 70.0% Dice with a single point prompt, and generalization is validated on 9 MICCAI 2023 challenge datasets. The released checkpoints run at interactive speeds (around 35 FPS). The accompanying SA-Med2D-20M dataset is distributed under CC-BY-NC-SA-4.0, while the model code and weights are Apache-2.0.
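The Dice figures quoted above measure overlap between a predicted mask and a reference mask. The sketch below shows the standard Dice coefficient on small binary masks. The function name and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; it is not the authors' evaluation code.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]


def dice_score(pred: Mask, target: Mask) -> float:
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0
    pred_area = 0
    target_area = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p_on, t_on = bool(p), bool(t)
            intersection += int(p_on and t_on)
            pred_area += int(p_on)
            target_area += int(t_on)

    if pred_area + target_area == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / (pred_area + target_area)


pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]

# Two pixels overlap; each mask has three foreground pixels: 2*2 / (3+3).
print(round(dice_score(pred, target), 4))
```

A score of 1.0 means the masks coincide exactly and 0.0 means they share no foreground pixels.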
SAM-Med2D is suited to interactive and semi-automatic annotation of 2D medical images, helping radiologists, pathologists, and biomedical researchers segment organs, lesions, and structures across CT, MR, X-ray, and other modalities with minimal prompting. It can accelerate the creation of labeled datasets for downstream supervised models, serve as a strong initialization for task-specific segmentation, and act as a general-purpose backbone for medical image analysis pipelines where training a dedicated model per task is impractical.
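Promptable models of this kind take a bounding box or a click as input. When experimenting without a human annotator, such prompts are often simulated from an existing mask. The sketch below derives a tight box and a single foreground point from a binary mask. The helper names and the (x, y) pixel convention are hypothetical and are not part of the SAM-Med2D code base.

```python
from typing import List, Sequence, Tuple

Mask = Sequence[Sequence[int]]


def foreground_pixels(mask: Mask) -> List[Tuple[int, int]]:
    """Return (x, y) coordinates of all non-zero pixels, in row-major order."""
    return [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]


def mask_to_box(mask: Mask) -> Tuple[int, int, int, int]:
    """Tight bounding box as (x_min, y_min, x_max, y_max), inclusive."""
    pixels = foreground_pixels(mask)
    if not pixels:
        raise ValueError("mask has no foreground pixels")
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return min(xs), min(ys), max(xs), max(ys)


def mask_to_point(mask: Mask) -> Tuple[int, int]:
    """Foreground pixel closest to the mask centroid, as (x, y).

    Snapping to a foreground pixel keeps the click inside the object even
    when the centroid of a concave shape falls on background.
    """
    pixels = foreground_pixels(mask)
    if not pixels:
        raise ValueError("mask has no foreground pixels")
    cx = sum(x for x, _ in pixels) / len(pixels)
    cy = sum(y for _, y in pixels) / len(pixels)
    return min(pixels, key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)


example_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

box_prompt = mask_to_box(example_mask)
point_prompt = mask_to_point(example_mask)
print("box prompt:", box_prompt)
print("point prompt:", point_prompt)
```

If the image is resized before inference, the box and point coordinates need to be scaled by the same factors.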
By pairing SAM with one of the largest curated medical mask collections, SAM-Med2D became a widely referenced baseline for evaluating how segmentation foundation models transfer to medical imaging, and a practical starting point for groups building promptable annotation tools. Its open code, public checkpoints, and the SA-Med2D-20M dataset lowered the barrier to research on medical segmentation foundation models. Key limitations include its 2D-only scope (it does not natively model 3D volumetric context), reliance on user prompts for best performance, and the non-commercial license on the underlying dataset, which constrains some downstream uses.