PCRLv2

The University of Hong Kong / ShanghaiTech University

Self-supervised pretraining framework for medical imaging that unifies pixel restoration with contrastive learning across 2D and 3D image backbones.

Released: January 2023

PCRLv2 (Preservational Contrastive Representation Learning v2) is a self-supervised pre-training framework for medical image analysis developed by Hong-Yu Zhou, Chixiang Lu, Chaoqi Chen, Sibei Yang, and Yizhou Yu at the University of Hong Kong and ShanghaiTech University, published in IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) in 2023. It produces transferable visual backbones from unlabeled radiology images, addressing the chronic scarcity of expert annotations in medical imaging by learning general-purpose representations that can be fine-tuned for many downstream tasks.

The core insight is that purely contrastive self-supervised methods, while effective on natural images, tend to discard fine-grained pixel-level and scale information that matters for medical tasks such as lesion segmentation and nodule detection. PCRLv2 unifies three complementary preservation objectives (pixel restoration, siamese feature comparison, and scale preservation over a feature pyramid) so that the learned features retain both high-level semantics and low-level visual detail. It extends the earlier PCRLv1 framework (ICCV 2021), which introduced preservational learning via context reconstruction.

By operating natively on both 2D images (chest X-rays) and 3D volumes (CT scans), PCRLv2 offers a single recipe that spans the two dominant data formats in clinical radiology, making it a practical starting point for label-efficient medical imaging pipelines.

Key Features

Unified preservation objective: Combines pixel restoration, contrastive siamese comparison, and multi-scale feature-pyramid learning so representations encode semantics, pixel detail, and scale simultaneously.
Non-skip U-Net backbone: Builds an explicit feature pyramid using a non-skip U-Net so that multi-scale information is preserved and supervised at multiple resolutions.
Sub-crop for 3D: Replaces the multi-crop augmentation used on natural images with a sub-crop strategy better suited to volumetric 3D medical data.
2D and 3D support: A single framework pre-trains on both 2D chest X-rays and 3D CT volumes, with released weights for each modality.
Label efficiency: Outperforms self-supervised baselines on downstream tasks, sometimes by large margins, when annotated data is limited.
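
The sub-crop idea listed above can be illustrated with a small coordinate-only sketch. It assumes one plausible reading of the strategy: a global crop is sampled from the volume first, and local crops are then sampled inside that global crop so that local and global views stay spatially related. The function names, crop sizes, and sampling details are hypothetical and are not taken from the released code.

```python
import random

def random_box(bounds, size, rng):
    """Sample an axis-aligned box of `size` lying fully inside `bounds`.

    bounds: ((z0, y0, x0), (z1, y1, x1)) with an exclusive upper corner.
    Returns (start, end) in the same convention.
    """
    lo, hi = bounds
    # randint is inclusive on both ends, so the last valid start is hi - size.
    start = tuple(rng.randint(l, h - s) for l, h, s in zip(lo, hi, size))
    end = tuple(a + s for a, s in zip(start, size))
    return start, end

def sub_crop(volume_shape, global_size, local_size, n_local, seed=0):
    """Hypothetical sub-crop: local boxes are drawn inside one global box."""
    rng = random.Random(seed)
    whole = ((0, 0, 0), tuple(volume_shape))
    global_box = random_box(whole, global_size, rng)
    local_boxes = [random_box(global_box, local_size, rng) for _ in range(n_local)]
    return global_box, local_boxes

global_box, local_boxes = sub_crop(
    volume_shape=(128, 128, 64),   # assumed CT volume size in voxels
    global_size=(64, 64, 32),
    local_size=(32, 32, 16),
    n_local=4,
)
```

By contrast, multi-crop as used on natural images samples local views independently of the global views; constraining local boxes to the global box is the design choice this sketch is meant to show.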

Technical Details

PCRLv2 frames pre-training as multi-task optimization over a feature pyramid built by a non-skip U-Net encoder–decoder. The restoration branch reconstructs perturbed input pixels, which forces pixel-level information into high-level features; the contrastive branch applies siamese feature comparison; and a scale-preservation term operates across pyramid levels. The 3D variant uses a sub-crop scheme in place of multi-crop.

Pre-training datasets include LUNA16 (3D chest CT), NIH ChestX-ray14, and CheXpert (2D chest X-rays), with released pretrained weights for each.

The authors evaluate transfer performance on brain tumor segmentation (BraTS 2018), pulmonary nodule detection (LUNA), abdominal organ segmentation (LiTS), and chest pathology classification (NIH ChestX-ray, CheXpert). They report consistent gains over contrastive and restoration-based self-supervised counterparts, with the largest improvements observed under limited-annotation regimes.
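
As a rough illustration of how restoration and siamese comparison might be combined across pyramid levels, the sketch below averages a per-level loss made of a mean-squared restoration error and a negative cosine similarity between two views' features. The specific loss forms, the equal default weights, and the dictionary keys (`restored`, `clean`, `feat_a`, `feat_b`) are assumptions made for this example; the paper's exact formulation may differ.

```python
import math

def mse(pred, target):
    """Mean squared error between two equal-length sequences of floats."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def neg_cosine(a, b, eps=1e-12):
    """Negative cosine similarity; -1.0 when the two vectors are aligned."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return -dot / (norm_a * norm_b + eps)

def pyramid_loss(levels, w_restore=1.0, w_compare=1.0):
    """Average a restoration + comparison loss over feature-pyramid levels.

    Each level is a dict with (hypothetical) keys:
      restored / clean : flattened reconstructed and original pixels
      feat_a / feat_b  : feature vectors of two augmented views
    """
    total = 0.0
    for level in levels:
        total += w_restore * mse(level["restored"], level["clean"])
        total += w_compare * neg_cosine(level["feat_a"], level["feat_b"])
    return total / len(levels)

# Two pyramid levels with perfect restoration and identical view features.
levels = [
    {"restored": [1.0, 2.0], "clean": [1.0, 2.0], "feat_a": [3.0, 4.0], "feat_b": [3.0, 4.0]},
    {"restored": [0.5], "clean": [0.5], "feat_a": [1.0, 0.0], "feat_b": [1.0, 0.0]},
]
loss = pyramid_loss(levels)  # close to -1.0, the minimum of this toy objective
```

Supervising every level rather than only the final output is what ties the objective to scale: coarse and fine levels each have to support both restoration and comparison.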

Applications

PCRLv2 serves as a pre-trained backbone for medical imaging researchers and developers who need strong models but have few labeled examples. Practitioners can initialize segmentation, detection, or classification networks from the released LUNA16, NIH ChestX-ray, or CheXpert weights and fine-tune on their own datasets, reducing annotation burden for tasks like tumor and organ segmentation, lung nodule detection, and chest disease screening. Because it covers both 2D and 3D, it fits radiology workflows spanning plain-film X-ray and volumetric CT.
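
When initializing a downstream network from a pretrained checkpoint, a common first step is to keep only the parameters whose names and shapes match the new model, since pre-training heads and decoders often do not transfer. The helper below is a minimal, framework-agnostic sketch of that filtering step. The parameter names are invented for illustration, and the `module.` prefix handling assumes a checkpoint saved from a wrapped model; the actual key layout of the released weights is not specified here.

```python
from types import SimpleNamespace

def select_transferable(pretrained, target, strip_prefix="module."):
    """Split pretrained parameters into those that fit `target` and the rest.

    pretrained, target: mappings from parameter name to any object with a
    `.shape` attribute (for example, tensors in a state dict).
    Returns (matched, skipped): a dict keyed by target names, and a list of
    pretrained names that had no same-shaped counterpart.
    """
    matched, skipped = {}, []
    for name, tensor in pretrained.items():
        key = name[len(strip_prefix):] if name.startswith(strip_prefix) else name
        if key in target and tuple(tensor.shape) == tuple(target[key].shape):
            matched[key] = tensor
        else:
            skipped.append(name)
    return matched, skipped

def fake(*shape):
    """Stand-in for a tensor so the example runs without a deep learning library."""
    return SimpleNamespace(shape=shape)

# Hypothetical parameter names for a 3D encoder and a new segmentation head.
pretrained = {
    "module.encoder.conv1.weight": fake(64, 1, 3, 3, 3),
    "module.decoder.out.weight": fake(1, 64, 1, 1, 1),
}
target = {
    "encoder.conv1.weight": fake(64, 1, 3, 3, 3),
    "head.weight": fake(3, 64, 1, 1, 1),
}
matched, skipped = select_transferable(pretrained, target)
```

In a PyTorch workflow, the `matched` dictionary could then be passed to `model.load_state_dict(matched, strict=False)`, leaving unmatched layers such as the new task head at their random initialization.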

Impact

PCRLv2 contributed to the line of medical-imaging self-supervised learning research that demonstrated annotation-efficient pre-training can rival or exceed supervised ImageNet transfer for radiology. Published in a top venue (IEEE TPAMI) with MIT-licensed code and downloadable pretrained weights, it has been adopted as a baseline and starting point in subsequent medical foundation model work. Its main limitations are scope-related: the released models target chest X-ray and CT radiology rather than the full breadth of medical imaging modalities, and the framework predates the larger transformer-based medical foundation models that followed.

Citation

A Unified Visual Information Preservation Framework for Self-supervised Pre-Training in Medical Image Analysis

Zhou, H., et al. (2023) A Unified Visual Information Preservation Framework for Self-supervised Pre-Training in Medical Image Analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence.

DOI: 10.1109/TPAMI.2023.3234002

Recent citations

Papers that recently cited this model.

Nexus: Neuro-guided expert-routed pre-training for brain representation learning from sMRI
Hu Yu, Yiyu Zhang, Si-Yue Fu, et al.
Expert Systems with Applications · Jun 2026
Citations: 0

ASAP: Advancing Medical Volumetric Representation Learning with Anatomy-aware Semantically-adaptive Pre-training
Rongsheng Wang, Fenghe Tang, Zihang Jiang, et al.
May 2026
Citations: 0 · Influential

Benchmarking transferability of SSL pretraining to same and different modality segmentation tasks
Jue Jiang, H. Veeraraghavan
May 2026
Citations: 0

Top citations

The most-cited papers that cite this model.

nnFormer: Volumetric Medical Image Segmentation via a 3D Transformer
Hong-Yu Zhou, J. Guo, Yinghao Zhang, et al.
IEEE Transactions on Image Processing · Sep 2021
Citations: 692

A Survey on Graph Neural Networks and Graph Transformers in Computer Vision: A Task-Oriented Perspective
Chaoqi Chen, Yushuang Wu, Qiyuan Dai, et al.
IEEE Transactions on Pattern Analysis and Machine Intelligence · Sep 2022
Citations: 156

VoCo: A Simple-Yet-Effective Volume Contrastive Learning Framework for 3D Medical Image Analysis
Linshan Wu, Jiaxin Zhuang, Hao Chen
Computer Vision and Pattern Recognition · Feb 2024
Citations: 109 · Influential

Advancing Radiograph Representation Learning with Masked Record Modeling
Hong-Yu Zhou, Chenyu Lian, Lian-cheng Wang, et al.
International Conference on Learning Representations · Jan 2023
Citations: 101

Enhancing representation in radiography-reports foundation model: a granular alignment algorithm using masked contrastive learning
Weijian Huang, Cheng Li, Hong-Yu Zhou, et al.
Nature Communications · Sep 2023
Citations: 72

Citations

Total Citations: 83
Influential: 11
References: 50

GitHub

Stars: 100
Forks: 9
Open Issues: 13
Contributors: 2
Last Push: 2y ago
Language: Python
License: MIT

Fields of citing research

Computer Science: 99%
Medicine: 90%
Engineering: 21%
Materials Science: 4%
Psychology: 1%

Share of papers citing this model.

Openness

bio.rodeo openness: Open weights (open weights, closed recipe)
Overall score: 71 (Open)
Usability (can I run it?): 91
Reproducibility (can I retrain it?): 48
Model Openness Framework: Class III (Open Model)

Resources

GitHub Repository
Research Paper
