A Segment Anything Model (SAM) variant finetuned on diverse medical images, delivering a reusable promptable checkpoint for interactive and automatic medical image segmentation.
MedicoSAM adapts the Segment Anything Model (SAM), Meta AI's prompt-driven vision foundation model, to the medical imaging domain. SAM revolutionized natural-image segmentation by producing high-quality masks from simple prompts such as points or bounding boxes, but its zero-shot performance degrades on medical data, where contrast, modality, and anatomical structures differ sharply from everyday photographs. MedicoSAM closes much of that gap by finetuning SAM on large, diverse collections of publicly available medical images, yielding a reusable promptable checkpoint that practitioners can drop into existing SAM-based annotation tools.
The model was developed by Anwai Archit, Luca Freckmann, and Constantin Pape of the Computational Cell Analytics group at the University of Göttingen. It was introduced in a January 2025 preprint and published later that year in IEEE Transactions on Medical Imaging. Beyond releasing weights, the paper is a systematic study of how best to transfer SAM to medical imaging: it compares finetuning strategies across many datasets and evaluates them on both interactive (prompt-based) and automatic semantic segmentation.
A central, candid finding is that medical pretraining yields large, consistent improvements for interactive segmentation but only modest gains for automatic semantic segmentation. This nuance positions MedicoSAM as a practical accelerator for human-in-the-loop annotation rather than a turnkey replacement for task-specific segmentation networks.
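To make "improvement" concrete, the sketch below computes the Dice coefficient, a widely used overlap score for comparing a predicted mask against a reference mask. The choice of Dice is an assumption made for illustration (the text above does not name the metric used in the paper), and `dice_score` is a hypothetical helper, not part of any MedicoSAM release.

```python
import numpy as np


def dice_score(pred, gt):
    """Dice coefficient between two binary masks of the same shape.

    Illustrative helper (an assumption, not the paper's evaluation code).
    Returns a value in [0, 1]; 1.0 means the masks are identical.
    """
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    total = pred.sum() + gt.sum()
    if total == 0:
        # Both masks are empty: treat this as perfect agreement.
        return 1.0
    overlap = np.logical_and(pred, gt).sum()
    return float(2.0 * overlap / total)


# Toy example: the reference object covers 16 pixels,
# the prediction recovers only the left half (8 pixels).
gt_mask = np.zeros((8, 8), dtype=bool)
gt_mask[2:6, 2:6] = True
pred_mask = np.zeros_like(gt_mask)
pred_mask[2:6, 2:4] = True

score = dice_score(pred_mask, gt_mask)  # 2 * 8 / (8 + 16)
```

In this toy case the score is 2·8 / (8 + 16) ≈ 0.67. Comparing such scores for vanilla SAM and a finetuned model on the same held-out images is one simple way to quantify the kind of gain described above.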
MedicoSAM works with the micro-sam annotator, so existing SAM-based pipelines can adopt it with minimal change.
The vit_b_medicosam.pt checkpoint is released under the permissive MIT license for downstream finetuning and deployment.
MedicoSAM is built on SAM's ViT-B image encoder paired with SAM's lightweight mask decoder, and adds a separately pretrained semantic-segmentation decoder. Finetuning is performed on a large, heterogeneous corpus of publicly available medical images spanning multiple modalities and anatomies, using an interactive training scheme that simulates user prompts during optimization. The authors benchmark several transfer strategies and report that the resulting interactive model improves markedly over both vanilla SAM and prior medical SAM variants on held-out interactive segmentation tasks, while gains on automatic semantic segmentation remain comparatively limited. The released artifact is a single vit_b_medicosam.pt checkpoint distributed via the group's ownCloud server.
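The idea of simulating user prompts during training can be sketched in a few lines of NumPy: prompts are derived from a ground-truth mask, and corrective clicks are drawn from the region where the current prediction is wrong. This is a minimal illustration under assumptions, not the authors' training code. The helper names (`box_from_mask`, `sample_point`, `correction_point`) and the uniform sampling strategy are hypothetical. Coordinates are given as (x, y) and boxes as (x_min, y_min, x_max, y_max), matching the convention SAM uses for point and box prompts.

```python
import numpy as np


def box_from_mask(mask):
    """Tight bounding box (x_min, y_min, x_max, y_max) around a binary mask."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask is empty")
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


def sample_point(mask, rng):
    """Sample one (x, y) pixel uniformly from the True region of a mask."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask is empty")
    i = rng.integers(ys.size)
    return int(xs[i]), int(ys[i])


def correction_point(gt, pred, rng):
    """Simulate a corrective click where prediction and ground truth disagree.

    Returns ((x, y), label) with label 1 for missed foreground (positive
    click) and 0 for a false positive (negative click), or None if the
    prediction already matches the ground truth.
    """
    error = gt != pred
    if not error.any():
        return None
    x, y = sample_point(error, rng)
    return (x, y), int(gt[y, x])


# Toy example: a 4x4 object, and a prediction that misses its right half.
rng = np.random.default_rng(0)
gt = np.zeros((8, 8), dtype=bool)
gt[2:6, 3:7] = True
pred = np.zeros_like(gt)
pred[2:6, 3:5] = True

box = box_from_mask(gt)                      # initial box prompt
click = sample_point(gt, rng)                # initial positive point prompt
correction = correction_point(gt, pred, rng) # follow-up corrective click
```

In an actual training loop, the box or first click would be passed to the model, the predicted mask compared with the ground truth, and a corrective click fed back in for the next iteration. Here the prediction is a fixed array so the prompt logic can be inspected on its own.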
MedicoSAM is most valuable as an annotation accelerator for radiologists, clinical researchers, and imaging scientists who must label large volumes of CT, MRI, or X-ray data. By turning a few clicks into accurate masks, it reduces the manual effort of building segmentation ground truth, supports rapid dataset curation for downstream model training, and integrates into interactive tools for both 2D and 3D workflows. Its permissive license also makes it a convenient starting point for groups finetuning their own modality- or task-specific medical segmentation models.
By rigorously characterizing where SAM-style pretraining helps in medicine (strongly for interactive use, weakly for fully automatic semantic segmentation), MedicoSAM provides the community with both a practical checkpoint and an honest evaluation that tempers expectations around medical foundation-model segmentation. Released openly under MIT and compatible with the widely adopted micro-sam ecosystem, it lowers the barrier to high-quality medical image annotation. Its main limitation is the modest improvement it offers for automatic semantic segmentation, where dedicated task-specific architectures still tend to lead.
Archit, A., Freckmann, L., and Pape, C. (2025). MedicoSAM: Robust Improvement of SAM for Medical Imaging. IEEE Transactions on Medical Imaging.
DOI: 10.1109/TMI.2025.3644811