A SAM-based foundation model for promptable ultrasound image segmentation, trained on US-43d, the largest public ultrasound segmentation dataset assembled at the time of its release.
Ultrasound is one of the most widely used clinical imaging modalities, yet it remains comparatively underserved by deep learning because of its low signal-to-noise ratio, speckle artifacts, low contrast, and operator-dependent acquisition. General-purpose segmentation foundation models such as the Segment Anything Model (SAM) and its medical adaptation MedSAM transfer poorly to this domain, where boundaries are diffuse and anatomical appearance varies dramatically across probes and body regions. UltraSam addresses this gap with a deliberately data-centric strategy: rather than designing a bespoke architecture, the authors assemble the largest public ultrasound segmentation corpus to date and use it to specialize a promptable SAM-style model for ultrasound.
UltraSam was developed by the CAMMA group at the University of Strasbourg (ICube, CNRS/INSERM) and IHU Strasbourg, and introduced by Adrien Meyer, Aditya Murali, Didier Mutter, and Nicolas Padoy in a preprint released in November 2024 and subsequently published in the International Journal of Computer Assisted Radiology and Surgery (IJCARS) in 2025. The model accepts point and bounding-box prompts to produce segmentation masks, and its pretrained weights serve as a strong initialization for a wide range of downstream ultrasound tasks.
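To make the two prompt types concrete, here is a minimal sketch of how a point prompt and a box prompt might be derived from a ground-truth mask, a common way to simulate user input when evaluating promptable models. It is an illustration, not the UltraSam codebase: the helper names (`foreground_pixels`, `box_prompt_from_mask`, `point_prompt_from_mask`), the inclusive `(x_min, y_min, x_max, y_max)` box convention, and the centroid-snapping rule are all assumptions. Masks are plain nested lists so the example runs without dependencies; a real pipeline would use NumPy or PyTorch arrays.

```python
from typing import List, Tuple

# Assumption: a mask is a list of rows holding 0/1 values (stand-in for an image array).
Mask = List[List[int]]


def foreground_pixels(mask: Mask) -> List[Tuple[int, int]]:
    """Return (x, y) coordinates of every nonzero pixel, in row-major order."""
    return [
        (x, y)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value
    ]


def box_prompt_from_mask(mask: Mask) -> Tuple[int, int, int, int]:
    """Return the tight bounding box (x_min, y_min, x_max, y_max), inclusive."""
    pixels = foreground_pixels(mask)
    if not pixels:
        raise ValueError("mask has no foreground pixels")
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return min(xs), min(ys), max(xs), max(ys)


def point_prompt_from_mask(mask: Mask) -> Tuple[int, int]:
    """Return the foreground pixel (x, y) closest to the mask centroid.

    Snapping to an actual foreground pixel keeps the point inside the
    structure even when the shape is concave or ring-like.
    """
    pixels = foreground_pixels(mask)
    if not pixels:
        raise ValueError("mask has no foreground pixels")
    cx = sum(x for x, _ in pixels) / len(pixels)
    cy = sum(y for _, y in pixels) / len(pixels)
    return min(pixels, key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)


# Toy 8x8 mask with a rectangular "lesion" covering rows 2-4 and columns 3-6.
toy_mask = [[0] * 8 for _ in range(8)]
for y in range(2, 5):
    for x in range(3, 7):
        toy_mask[y][x] = 1

box = box_prompt_from_mask(toy_mask)
point = point_prompt_from_mask(toy_mask)
print("box prompt:", box)
print("point prompt:", point)
```

In interactive use the point or box would come from a clinician's click or drag instead; deriving prompts from masks is only a convenient way to benchmark a promptable model without a human in the loop.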
The central contribution is twofold: US-43d, a unified collection of 43 open-access ultrasound segmentation datasets, and the UltraSam checkpoint trained on it. Together they provide the ultrasound community with both a reusable benchmark dataset and a ready-to-fine-tune foundation model, lowering the barrier for building segmentation and classification systems across diverse anatomical applications.
UltraSam is built on the Segment Anything Model, which couples a Vision Transformer (ViT) image encoder with a prompt encoder and a lightweight mask decoder. The authors fine-tune this architecture end-to-end on US-43d, retraining the encoder so that learned representations reflect ultrasound-specific statistics rather than the natural-image distribution SAM was originally trained on. US-43d itself comprises 43 publicly available ultrasound segmentation datasets, totaling more than 280,000 images and corresponding masks for over 50 anatomical structures, making it the largest public ultrasound segmentation collection assembled at the time of release. In evaluations, UltraSam delivers substantially higher segmentation accuracy under point- and box-prompted settings than SAM and MedSAM, and when used as a pretrained backbone it improves downstream segmentation and classification performance over ImageNet, SAM, and MedSAM initializations across held-out ultrasound tasks.
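The text above does not say which accuracy metric the comparison uses. As an assumption, the sketch below computes the two overlap measures most commonly reported for segmentation, the Dice coefficient and intersection over union (IoU), between a predicted and a reference mask. The function name `dice_and_iou` and the convention that two empty masks score 1.0 are illustrative choices, not details taken from the UltraSam paper.

```python
from typing import List, Tuple

# Assumption: a mask is a list of rows holding 0/1 values (stand-in for an image array).
Mask = List[List[int]]


def dice_and_iou(pred: Mask, target: Mask) -> Tuple[float, float]:
    """Return (Dice, IoU) for two binary masks of identical shape."""
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("pred and target must have the same shape")

    intersection = pred_area = target_area = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p, t = bool(p), bool(t)
            pred_area += p
            target_area += t
            intersection += p and t

    union = pred_area + target_area - intersection
    if union == 0:
        # Both masks are empty; treated here as perfect agreement (a convention).
        return 1.0, 1.0
    dice = 2.0 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


def square_mask(size: int, top: int, left: int, side: int) -> Mask:
    """Build a size x size mask containing one filled square."""
    return [
        [1 if top <= y < top + side and left <= x < left + side else 0
         for x in range(size)]
        for y in range(size)
    ]


# Reference: 4x4 square. Prediction: the same square shifted down by one row.
target_mask = square_mask(6, top=1, left=1, side=4)
pred_mask = square_mask(6, top=2, left=1, side=4)
dice, iou = dice_and_iou(pred_mask, target_mask)
print(f"Dice = {dice:.2f}, IoU = {iou:.2f}")
```

Here the masks share 12 of their 16 pixels each, so Dice is 2·12/32 = 0.75 and IoU is 12/20 = 0.60. Dice is always at least as large as IoU for the same pair of masks, which is worth remembering when comparing numbers across papers.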
UltraSam is intended as a general-purpose starting point for ultrasound image analysis. Clinicians and researchers can apply it directly for interactive, prompt-driven delineation of anatomy and lesions, or fine-tune it as a backbone for task-specific segmentation and classification pipelines in cardiology, obstetrics, breast and thyroid screening, regional anesthesia (nerve localization), and musculoskeletal imaging. Because it ships with a harmonized multi-dataset corpus assembled from open-access sources, it is also valuable to the methods community as a benchmark for developing and comparing new ultrasound segmentation approaches.
By demonstrating that a data-centric strategy, aggregating heterogeneous open datasets and specializing a strong general segmentation model, can yield a versatile ultrasound foundation model, UltraSam offers a practical template for under-resourced imaging modalities. The release of US-43d is itself a meaningful community contribution, consolidating fragmented public data into a reusable resource and lowering the cost of entry for ultrasound AI research. As a foundation-model initialization that outperforms widely used SAM and MedSAM baselines, UltraSam is positioned to accelerate development of downstream clinical ultrasound tools. Its principal limitations stem from reliance on assembled public datasets, which may carry uneven annotation quality and demographic or scanner biases, and on prompt-based interaction rather than fully automatic segmentation.
Meyer, A., et al. (2025) UltraSam: a foundation model for ultrasound using large open-access segmentation datasets. International Journal of Computer Assisted Radiology and Surgery.
DOI: 10.1007/s11548-025-03517-8