STU-Net

Scalable and transferable U-Net family (14M–1.4B parameters) for 3D medical image segmentation, supervised-pretrained on TotalSegmentator.

Released: April 2023

Parameters: up to 1.46 billion (STU-Net-H)

STU-Net is a family of scalable, transferable convolutional models for 3D medical image segmentation, introduced in April 2023 by researchers at Shanghai AI Laboratory and collaborating institutions (Ziyan Huang, Junjun He, Yu Qiao, and colleagues). It addresses a long-standing gap in medical imaging: while large-scale supervised pre-training had transformed natural-image and language tasks, segmentation models for CT and other volumetric modalities remained small and were typically trained from scratch on each new dataset.

The model builds directly on the widely used nnU-Net framework, whose self-configuring pipeline is a de facto standard for biomedical segmentation. The key contribution is making nnU-Net's convolutional blocks scalable, then systematically growing the network from 14 million up to 1.4 billion parameters—the largest medical image segmentation model reported at the time of release. By pre-training this family on TotalSegmentator, the largest public annotated CT dataset, the authors deliver checkpoints that can be applied off the shelf or fine-tuned, lowering the barrier to strong segmentation on new clinical targets.

STU-Net sits at the intersection of classical segmentation engineering and the foundation-model paradigm, demonstrating that the "scale plus pre-training" recipe transfers to dense volumetric prediction, not only to classification and generative tasks.

Key Features

Scalable architecture: Refined nnU-Net convolutional blocks let the same design span four sizes—S (14.6M), B (58.3M), L (440.3M), and H (1.46B parameters)—matched to different compute budgets.
Joint depth-width scaling: The authors' empirical study finds that scaling network depth and width together works better than scaling either alone, with larger models yielding consistent performance gains.
Large-scale supervised pre-training: All variants are pre-trained on TotalSegmentator (1,204 CT scans, 104 anatomical structures spanning organs, bones, muscles, and vessels).
Strong transferability: Evaluated on 14 downstream datasets for direct inference and 3 for fine-tuning, across multiple modalities and segmentation targets.
Open release: Code and all four pretrained checkpoints are released under the Apache 2.0 license.

Technical Details

STU-Net is a fully convolutional encoder-decoder built on the nnU-Net framework, with nnU-Net's default convolutional blocks redesigned so that depth and width can be increased without breaking the self-configuring pipeline. Models range from STU-Net-S (14.6M parameters) to STU-Net-H (1,457M parameters). Pre-training is fully supervised and runs for 4,000 epochs on TotalSegmentator, which provides 1,204 CT images annotated across 104 structures (27 organs, 59 bones, 10 muscles, 8 vessels), using mirror data augmentation. The authors find that increasing model size improves accuracy on the upstream task and also yields better transfer: larger pretrained models reach higher segmentation accuracy on downstream datasets, including in limited-data fine-tuning regimes where data efficiency matters most.
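To make the effect of depth and width scaling on model size concrete, the sketch below counts parameters for a toy 3D convolutional encoder. It is an illustration only: the stage layout, the base widths, and the helper names `conv3d_params` and `encoder_params` are assumptions chosen for readability, not STU-Net's actual block configuration.

```python
def conv3d_params(c_in, c_out, kernel=3):
    """Weights plus biases of one 3D convolution with a cubic kernel."""
    return kernel ** 3 * c_in * c_out + c_out


def encoder_params(base_width, blocks_per_stage, in_channels=1, n_stages=6):
    """Parameter count of a toy encoder (hypothetical layout, not STU-Net's).

    Each stage doubles the channel width and stacks `blocks_per_stage`
    blocks of two 3x3x3 convolutions. Normalization layers, the decoder,
    and the segmentation head are ignored to keep the arithmetic visible.
    """
    total = 0
    c_in = in_channels
    for stage in range(n_stages):
        width = base_width * 2 ** stage
        for _ in range(blocks_per_stage):
            total += conv3d_params(c_in, width)   # first conv may change width
            total += conv3d_params(width, width)  # second conv keeps width
            c_in = width
    return total


# Hypothetical settings: width = channels in the first stage,
# depth = number of two-conv blocks per stage.
baseline = encoder_params(base_width=16, blocks_per_stage=1)
wider = encoder_params(base_width=32, blocks_per_stage=1)
deeper = encoder_params(base_width=16, blocks_per_stage=2)
both = encoder_params(base_width=32, blocks_per_stage=2)

for name, count in [("baseline", baseline), ("2x width", wider),
                    ("2x depth", deeper), ("2x width and depth", both)]:
    print(f"{name:>20}: {count / 1e6:8.2f}M params ({count / baseline:.2f}x)")
```

In this toy setting, doubling the width roughly quadruples the parameter count, because convolution weights grow with the product of input and output channels, while adding blocks grows the count roughly in proportion to the number of blocks. Scaling both together compounds the two effects, which is how a single design can plausibly span two orders of magnitude in size.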

Applications

STU-Net targets researchers and clinical-imaging teams who need accurate 3D segmentation of anatomical structures and lesions in CT and related modalities. The pretrained checkpoints can be used for direct inference on TotalSegmentator-covered anatomy, as initialization for fine-tuning on new organs, tumors, or modalities, or as a strong backbone for benchmarking. Because it inherits nnU-Net's automatic configuration, it slots into existing segmentation workflows with minimal manual tuning, benefiting groups building radiology pipelines, surgical planning tools, and downstream quantitative analyses.
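When a pretrained checkpoint is used as initialization for a new target, the output layer usually has a different number of classes than the pre-training task, so it cannot be copied over. A common practice, sketched below, is to transfer only the parameters whose names and shapes match and re-initialize the rest. This is a framework-agnostic illustration: parameters are represented by their shapes only, the layer names are hypothetical, and the 105-channel head assumes 104 structures plus a background class. It is not code from the STU-Net repository.

```python
def select_transferable(pretrained, target):
    """Split pretrained entries into those that can be copied into `target`
    (same name and same shape) and those that must be re-initialized."""
    transferable, skipped = {}, []
    for name, shape in pretrained.items():
        if name in target and target[name] == shape:
            transferable[name] = shape
        else:
            skipped.append(name)
    return transferable, skipped


# Hypothetical parameter names mapped to shapes (stand-ins for real tensors).
pretrained = {
    "encoder.stage0.conv.weight": (32, 1, 3, 3, 3),
    "decoder.stage0.conv.weight": (32, 64, 3, 3, 3),
    "seg_head.weight": (105, 32, 1, 1, 1),  # assumed: 104 structures + background
}
target = {
    "encoder.stage0.conv.weight": (32, 1, 3, 3, 3),
    "decoder.stage0.conv.weight": (32, 64, 3, 3, 3),
    "seg_head.weight": (3, 32, 1, 1, 1),    # e.g. background, organ, tumor
}

transferable, skipped = select_transferable(pretrained, target)
print("copied:", sorted(transferable))
print("re-initialized:", skipped)
```

With real tensors, the same filter would typically be applied to the checkpoint's state dictionary before loading it non-strictly into the new model, so that the encoder and decoder start from pretrained weights while the segmentation head trains from scratch.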

Impact

STU-Net demonstrated that large-scale supervised pre-training scales effectively to volumetric medical segmentation, providing one of the first openly released billion-parameter segmentation backbones for the field. Its pretrained variants have since been adopted as baselines and initialization in subsequent benchmarking efforts, including the Touchstone and SegBook studies, and its Apache-2.0 release has made it a practical starting point for transfer learning. The work helped motivate the broader move toward reusable, pretrained segmentation foundation models rather than per-dataset training from scratch, though its CT-centric pre-training means downstream gains are strongest for anatomy and modalities close to the TotalSegmentator distribution.

Citation

STU-Net: Scalable and Transferable Medical Image Segmentation Models Empowered by Large-Scale Supervised Pre-training

Preprint

Huang, Z., et al. (2023) STU-Net: Scalable and Transferable Medical Image Segmentation Models Empowered by Large-Scale Supervised Pre-training. arXiv.org.

DOI: 10.48550/arXiv.2304.06716

Recent citations

Papers that recently cited this model.

LETT-NeXt: A Lightweight RECIST-Guided Model for 3D CT Lesion Segmentation
Sebastian Aas, E. Stenhede, A. Ranjbar
Jun 2026 · 0 citations

MRI-based quantification of intratumoral heterogeneity for predicting recurrence risk in ER+/HER2− breast cancer
Yang Chen, Jie Shi, Jing Chen, et al.
Insights into Imaging · Jun 2026 · 0 citations

Multi-Granularity 3D Kidney Lesion Characterization from CT Volumes
Renjie Liang, Z. Fan, Jinqian Pan, et al.
Jun 2026 · 0 citations

Top citations

The most-cited papers that cite this model.

U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation
Jun Ma, Feifei Li, Bo Wang
arXiv.org · Jan 2024 · 821 citations

Customized Segment Anything Model for Medical Image Segmentation
Kaiwen Zhang, Dong Liu
arXiv.org · Apr 2023 · 474 citations

Medical Image Analysis
Zongwei Zhou, V. Sodha, Jiaxuan Pang, et al.
458 citations · Influential

nnU-Net Revisited: A Call for Rigorous Validation in 3D Medical Image Segmentation
Fabian Isensee, Tassilo Wald, Constantin Ulrich, et al.
International Conference on Medical Image Computing and Computer-Assisted Intervention · Apr 2024 · 403 citations

CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection
Jie Liu, Yixiao Zhang, Jieneng Chen, et al.
IEEE International Conference on Computer Vision · Jan 2023 · 351 citations

Citations

Total Citations: 159
Influential: 23
References: 46

GitHub

Stars: 372
Forks: 38
Open Issues: 25
Contributors: 3
Last Push: 1y ago
Language: Python
License: Apache-2.0

Fields of citing research

Computer Science: 95%
Medicine: 94%
Engineering: 48%
Biology: 2%
Environmental Science: 1%
Physics: 1%
Materials Science: 1%

Share of papers citing this model.

Openness

bio.rodeo openness: Fully open · usable and reproducible

Overall score: 82 (Open)

Usability (can I run it?): 100

Reproducibility (can I retrain it?): 72

Model Openness Framework: Class III (Open Model)

Resources

GitHub Repository
Research Paper
