MIT CSAIL / Cornell University / Massachusetts General Hospital / Harvard Medical School
An in-context learning model that segments unseen medical imaging tasks from a few labeled examples, with no retraining or fine-tuning.
UniverSeg is a universal medical image segmentation model that solves new segmentation tasks at inference time without any additional training or fine-tuning. Conventional medical segmentation networks are trained for a single task — one anatomy, one modality, one label — and generalize poorly to unseen tasks, forcing practitioners to collect new labeled data and retrain a model for every new problem. UniverSeg instead reframes segmentation as an in-context learning problem: given a query image and a small support set of example image–label pairs, a single frozen model produces a segmentation for the query, adapting on the fly to anatomies, modalities, and labels it never saw during training.
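The contrast between the two formulations can be written compactly. The notation below is assumed for illustration and does not appear on this page:

```latex
% Conventional approach: a separate parametric function is trained per task t
\hat{y} = f^{t}_{\theta_t}(x)

% In-context approach: one shared function conditioned on a support set
\hat{y} = f_{\theta}\left(x^{t}, S^{t}\right), \qquad
S^{t} = \left\{ \left(x^{t}_{j},\, y^{t}_{j}\right) \right\}_{j=1}^{N}
```

Here x^t is the query image for task t, S^t is its support set of N image–label pairs, and θ stays fixed at inference. Switching to a new task means changing S^t, not θ.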
The model was developed by Victor Ion Butoi, Jose Javier Gonzalez Ortiz, Tianyu Ma, Mert R. Sabuncu, John Guttag, and Adrian V. Dalca at MIT CSAIL and Cornell University (with affiliations at Massachusetts General Hospital and Harvard Medical School), and was published at ICCV 2023. It is one of the earliest demonstrations that the in-context learning paradigm popularized in language modeling can be applied to dense biomedical prediction tasks, enabling a single global model to act as a flexible segmentation tool rather than a fixed, task-specific network.
UniverSeg uses a convolutional encoder–decoder backbone built from CrossBlock modules. Each CrossBlock pairs the query feature map with the feature map of every support image–label entry, processes the pairs with shared weights, and averages the resulting interactions across the support set to update the query representation. Inputs are normalized to the [0,1] range and resized to 128×128 pixels. To achieve generalization, the authors assembled MegaMedical, a standardized collection of 53 open-access medical segmentation datasets totaling over 22,000 scans spanning diverse anatomies and modalities (including MRI, CT, ultrasound, and microscopy). Training uses a task-sampling strategy that exposes the model to many distinct segmentation problems, so it learns to condition on the support set rather than memorize specific tasks. On held-out datasets, UniverSeg substantially outperforms few-shot baselines and approaches the performance of task-specific models trained directly on the target task, with accuracy improving as the support set grows.
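The query/support interaction described above can be sketched in a few lines of NumPy. This is a simplified illustration, not the authors' implementation: it uses 1×1 convolutions and ReLU for brevity, and the function names (`conv1x1`, `cross_block`), array shapes, and random weights are all assumptions made for this example.

```python
import numpy as np


def conv1x1(x, weight):
    """Apply a 1x1 convolution, i.e. per-pixel channel mixing.

    x: (..., C_in, H, W); weight: (C_out, C_in).
    """
    return np.einsum("oc,...chw->...ohw", weight, x)


def relu(x):
    return np.maximum(x, 0.0)


def cross_block(query, support, w_cross, w_support):
    """One query/support interaction step (illustrative sketch only).

    query:     (C, H, W) feature map of the query image.
    support:   (S, C, H, W) feature maps, one per support image-label pair.
    w_cross:   (C_out, 2 * C) weights shared by every query/support pair.
    w_support: (C_out, C_out) weights used to update each support entry.
    """
    n_support = support.shape[0]
    # Pair the query with every support entry by concatenating channels.
    tiled_query = np.broadcast_to(query, (n_support,) + query.shape)
    pairs = np.concatenate([tiled_query, support], axis=1)  # (S, 2C, H, W)
    # The same weights process every pair.
    interactions = relu(conv1x1(pairs, w_cross))  # (S, C_out, H, W)
    # Averaging over the support axis yields the updated query features.
    new_query = interactions.mean(axis=0)  # (C_out, H, W)
    # Each support entry is updated from its own interaction map.
    new_support = relu(conv1x1(interactions, w_support))  # (S, C_out, H, W)
    return new_query, new_support


# Toy inputs: 4 feature channels, 8x8 maps, 3 support entries (all assumed).
rng = np.random.default_rng(0)
query = rng.normal(size=(4, 8, 8))
support = rng.normal(size=(3, 4, 8, 8))
w_cross = rng.normal(size=(6, 8))
w_support = rng.normal(size=(6, 6))

new_query, new_support = cross_block(query, support, w_cross, w_support)
print(new_query.shape, new_support.shape)
```

Because the query update is a mean over the support axis, the result does not depend on the order of the support entries, and the same weights apply to a support set of any size.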
UniverSeg is aimed at researchers and clinicians who need to segment new structures in medical images but lack the labeled data or engineering resources to train a dedicated model. By supplying a handful of annotated examples, a user can immediately obtain segmentations for a novel anatomy or modality, making it useful for rapid prototyping, annotation acceleration in labeling pipelines, and exploratory analysis in low-resource clinical and research settings. It is particularly valuable for rare conditions or emerging imaging tasks where large annotated training sets do not exist.
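In practice, supplying a handful of annotated examples means packaging them together with the query image in the form the model expects. The sketch below applies the [0,1] normalization and 128×128 size stated for the model's inputs. Everything else is an assumption for illustration: the helper names (`normalize_unit_range`, `resize_nearest`, `prepare_task`), the nearest-neighbour resize, the binarization of labels, and the batch-and-support axis layout. Check the released code for the layout it actually expects.

```python
import numpy as np

TARGET_SIZE = 128  # input side length stated for the model


def normalize_unit_range(image):
    """Rescale intensities to [0, 1]; a constant image maps to all zeros."""
    image = np.asarray(image, dtype=np.float32)
    low, high = image.min(), image.max()
    if high == low:
        return np.zeros_like(image)
    return (image - low) / (high - low)


def resize_nearest(image, size=TARGET_SIZE):
    """Nearest-neighbour resize of a 2D array to (size, size).

    Assumption: chosen for simplicity; a real pipeline may interpolate.
    """
    rows = np.arange(size) * image.shape[0] // size
    cols = np.arange(size) * image.shape[1] // size
    return image[np.ix_(rows, cols)]


def prepare_task(query_image, support_pairs):
    """Package one query and its labelled examples as arrays.

    support_pairs: list of (image, label) 2D arrays.
    Returns arrays shaped (1, 1, H, W), (1, S, 1, H, W), (1, S, 1, H, W),
    an assumed layout with a batch axis first and a support axis second.
    """
    if not support_pairs:
        raise ValueError("at least one labelled example is required")
    query = resize_nearest(normalize_unit_range(query_image))
    images = [resize_nearest(normalize_unit_range(img)) for img, _ in support_pairs]
    # Assumption: labels are binary masks, so any positive value is foreground.
    labels = [
        resize_nearest((np.asarray(lab) > 0).astype(np.float32))
        for _, lab in support_pairs
    ]
    return (
        query[None, None],
        np.stack(images)[None, :, None],
        np.stack(labels)[None, :, None],
    )


# Synthetic stand-ins for a scan and four annotated examples.
rng = np.random.default_rng(0)
query_scan = rng.integers(0, 4096, size=(200, 160))
examples = [
    (rng.integers(0, 4096, size=(200, 160)), rng.integers(0, 2, size=(200, 160)))
    for _ in range(4)
]
query, support_images, support_labels = prepare_task(query_scan, examples)
print(query.shape, support_images.shape, support_labels.shape)
```

Adding more annotated examples only lengthens the support axis; nothing about the model itself changes.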
UniverSeg helped popularize in-context learning for dense biomedical prediction and has become a widely cited reference point for universal and few-shot medical segmentation. Its release of code, pretrained weights, and the MegaMedical task formulation provided a reusable foundation for subsequent work on promptable and foundation-model-style segmentation. While its fixed 128×128 input resolution and 2D operation limit fine-detail and volumetric tasks, and segmentation quality depends on the relevance of the provided support set, UniverSeg demonstrated that a single frozen model can flexibly generalize across the highly heterogeneous landscape of medical imaging — a capability that has shaped the trajectory of later medical segmentation foundation models.
Butoi, V., et al. (2023). UniverSeg: Universal Medical Image Segmentation. IEEE/CVF International Conference on Computer Vision (ICCV).
DOI: 10.1109/ICCV51070.2023.01960