A fully 3D, promptable segmentation foundation model for volumetric medical images, trained on ~22K 3D scans and ~143K masks for general-purpose anatomy segmentation.
SAM-Med3D adapts the promptable segmentation paradigm of Meta AI's Segment Anything Model (SAM) to volumetric medical imaging by rebuilding the architecture to be fully 3D. The original SAM and its medical 2D derivative SAM-Med2D operate slice-by-slice, processing each axial plane independently and requiring a prompt on every slice to segment a 3D structure. This discards the rich spatial context that runs through CT and MR volumes. SAM-Med3D instead encodes whole volumes natively, capturing inter-slice context and allowing a clinician to segment an entire 3D anatomy from as few as one prompt point.
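The difference in prompting burden can be made concrete with a toy volume. The sketch below is only an illustration on synthetic data, not part of SAM-Med3D: the helper `slicewise_prompt_count` and the spherical "organ" are hypothetical, and the code simply counts how many axial slices a slice-wise method would need a prompt on, compared with the single prompt a volumetric model can start from.

```python
import numpy as np


def slicewise_prompt_count(mask: np.ndarray, axis: int = 0) -> int:
    """Count slices along `axis` that contain at least one foreground voxel.

    Hypothetical helper: a slice-wise model needs (at least) one prompt on
    each of these slices to cover the whole 3D structure.
    """
    other_axes = tuple(i for i in range(mask.ndim) if i != axis)
    return int(np.any(mask, axis=other_axes).sum())


# Synthetic 64x64x64 volume containing a sphere of radius 10 voxels.
zz, yy, xx = np.ogrid[:64, :64, :64]
sphere = (zz - 32) ** 2 + (yy - 32) ** 2 + (xx - 32) ** 2 <= 10 ** 2

slice_prompts = slicewise_prompt_count(sphere, axis=0)  # one per occupied slice
volume_prompts = 1  # best case for a volumetric model: a single 3D point

print(f"slice-wise prompts: {slice_prompts}, volumetric prompts: {volume_prompts}")
```

For this sphere the slice-wise count equals the number of axial slices the structure spans, so the gap widens as structures grow or slice spacing shrinks.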
The model was released in October 2023 by researchers at Shanghai AI Laboratory (General Vision Group) and academic collaborators, with code and weights distributed through the uni-medical GitHub organization. Its central contribution is twofold: a fully volumetric re-implementation of SAM's image encoder, prompt encoder, and mask decoder using 3D operations, and the assembly of one of the largest volumetric medical segmentation corpora to date for training it.
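To picture what "using 3D operations" means at the input of the image encoder, the following sketch splits a volume into non-overlapping cubic patches, the 3D analogue of the square patches a 2D vision transformer consumes. It is a minimal illustration under assumptions: the 128-voxel volume and 16-voxel patch size are chosen for the example, `patchify_3d` is a hypothetical helper, and a real encoder would project each patch with a learned 3D convolution rather than flattening raw intensities.

```python
import numpy as np


def patchify_3d(volume: np.ndarray, patch: int) -> np.ndarray:
    """Split a (D, H, W) volume into flattened, non-overlapping cubic patches.

    Returns an array of shape (num_patches, patch**3). Hypothetical helper
    for illustration; not the repository's implementation.
    """
    d, h, w = volume.shape
    if d % patch or h % patch or w % patch:
        raise ValueError("each volume dimension must be divisible by the patch size")
    gd, gh, gw = d // patch, h // patch, w // patch
    # Separate each axis into (grid index, offset inside the patch).
    blocks = volume.reshape(gd, patch, gh, patch, gw, patch)
    # Bring the three grid indices to the front, the three offsets to the back.
    blocks = blocks.transpose(0, 2, 4, 1, 3, 5)
    return blocks.reshape(gd * gh * gw, patch ** 3)


# Assumed sizes for illustration: a 128^3 volume and 16^3 patches.
volume = np.random.default_rng(0).random((128, 128, 128), dtype=np.float32)
tokens = patchify_3d(volume, patch=16)
print(tokens.shape)  # 8 * 8 * 8 patches, each holding 16**3 voxels
```

Each token here covers a small cube spanning several slices, which is how a volumetric encoder sees inter-slice context that a per-slice 2D encoder never receives.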
SAM-Med3D sits within the family of medical segmentation foundation models (alongside MedSAM and SAM-Med2D) that aim to replace bespoke per-task networks with a single promptable model. By moving from 2D to genuine 3D, it targets the dominant data modalities in radiology, where volumetric reasoning is essential.
SAM-Med3D mirrors SAM's three-component design (image encoder, prompt encoder, and mask decoder) but replaces 2D operations with 3D counterparts, so the network reasons over volumetric patches and produces 3D masks directly. It is trained in two stages on the SA-Med3D-140K dataset, which aggregates roughly 22K 3D scans and 143K masks across 245 categories from public and licensed private sources. Because a single point prompt propagates spatially through the volume, the model needs far fewer interactions than slice-wise methods. The authors report substantial Dice improvements over SAM and SAM-Med2D at matched or lower prompt budgets, at a fraction of their inference time for 3D targets. A subsequent SAM-Med3D-turbo checkpoint, fine-tuned on 44 datasets, further improves general-purpose accuracy and is the recommended weight in the repository.
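The Dice score used in these comparisons measures voxel overlap between a predicted mask and the reference mask: twice the intersection divided by the sum of the two mask sizes. A minimal sketch on synthetic 3D masks follows; the function name `dice_score` and the toy cubes are assumptions for illustration and are not taken from the authors' evaluation code.

```python
import numpy as np


def dice_score(pred: np.ndarray, target: np.ndarray) -> float:
    """Dice coefficient between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks empty: treat as perfect agreement (a common convention).
        return 1.0
    return float(2.0 * intersection / denom)


# Reference: a 16^3 cube. Prediction: the same cube shifted 4 voxels along x.
target = np.zeros((32, 32, 32), dtype=bool)
target[8:24, 8:24, 8:24] = True
pred = np.zeros_like(target)
pred[8:24, 8:24, 12:28] = True

score = dice_score(pred, target)
print(f"Dice: {score:.2f}")
```

Because the score is computed over the whole volume, a prediction that is accurate on some slices but misses others is penalized in proportion to the missing voxels.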
SAM-Med3D is well suited to interactive and semi-automatic segmentation of CT and MR volumes, helping radiologists and biomedical researchers delineate organs, tumors, and other structures with minimal prompting. Because it operates on whole volumes, it accelerates 3D annotation pipelines, provides strong initialization for task-specific volumetric segmentation models, and can serve as a general-purpose backbone for medical image analysis workflows where training a dedicated 3D network per task is impractical.
SAM-Med3D demonstrated that a genuinely 3D promptable foundation model can outperform slice-wise adaptations of SAM on volumetric data while drastically reducing the prompting burden, making it a widely cited reference point for medical segmentation foundation models and spawning follow-up work such as SAM-Med3D-MoE. Its open code, public checkpoints, and the SA-Med3D-140K dataset lowered the barrier to research on volumetric promptable segmentation. Key limitations include continued reliance on user prompts for best performance, sensitivity to modalities and structures underrepresented in training, and licensing constraints on portions of the underlying private data.
Wang, H., et al. (2023) SAM-Med3D: Towards General-Purpose Segmentation Models for Volumetric Medical Images. ECCV Workshops.
DOI: 10.1007/978-3-031-91721-9_4
Wang, H., et al. (2023) SAM-Med3D: Towards General-Purpose Segmentation Models for Volumetric Medical Images. ECCV Workshops.
DOI: 10.48550/arXiv.2310.15161