The University of Hong Kong / ShanghaiTech University
Self-supervised pre-training framework for medical image analysis that unifies pixel restoration with contrastive feature comparison across 2D and 3D modalities.
PCRLv2 (Preservational Contrastive Representation Learning v2) is a self-supervised pre-training framework for medical image analysis developed by Hong-Yu Zhou, Chixiang Lu, Chaoqi Chen, Sibei Yang, and Yizhou Yu at the University of Hong Kong and ShanghaiTech University, published in IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) in 2023. It produces transferable visual backbones from unlabeled radiology images, addressing the chronic scarcity of expert annotations in medical imaging by learning general-purpose representations that can be fine-tuned for many downstream tasks.
The core insight is that purely contrastive self-supervised methods, while effective on natural images, tend to discard fine-grained pixel-level and scale information that matters for medical tasks such as lesion segmentation and nodule detection. PCRLv2 unifies three complementary preservation objectives (pixel restoration, siamese feature comparison, and scale preservation over a feature pyramid) so that the learned features retain both high-level semantics and low-level visual detail. It extends the earlier PCRLv1 framework (ICCV 2021), which introduced preservational learning via context reconstruction.
By operating natively on both 2D images (chest X-rays) and 3D volumes (CT scans), PCRLv2 offers a single recipe that spans the two dominant data formats in clinical radiology, making it a practical starting point for label-efficient medical imaging pipelines.
PCRLv2 frames pre-training as multi-task optimization over a feature pyramid built by a non-skip U-Net encoder–decoder. The restoration branch reconstructs perturbed input pixels to force pixel-level information into high-level features; the contrastive branch applies siamese feature comparison; and a scale-preservation term operates across pyramid levels. The 3D variant uses a sub-crop scheme in place of multi-crop. Pre-training datasets include LUNA16 (3D chest CT), NIH ChestX-ray14, and CheXpert (2D chest X-rays), with released pretrained weights for each. The authors evaluate transfer performance on brain tumor segmentation (BraTS 2018), pulmonary nodule detection (LUNA16), abdominal organ segmentation (LiTS), and chest pathology classification (NIH ChestX-ray14, CheXpert). They report consistent gains over contrastive and restoration-based self-supervised counterparts, with the largest improvements under limited-annotation regimes.
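The multi-task objective described above can be illustrated with a small, framework-free sketch. It is not the authors' implementation: it assumes that each pyramid level contributes one restoration term (mean squared error) and one comparison term (a SimSiam-style negative cosine similarity between two augmented views), and that levels are averaged with equal weight. The function names, dictionary keys, and weights `w_restore` and `w_compare` are hypothetical.

```python
import math

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def neg_cosine(p, z):
    """Negative cosine similarity; -1.0 when p and z point the same way."""
    dot = sum(x * y for x, y in zip(p, z))
    norm = math.sqrt(sum(x * x for x in p)) * math.sqrt(sum(y * y for y in z))
    return -dot / norm

def pyramid_loss(levels, w_restore=1.0, w_compare=1.0):
    """Average restoration + comparison terms over all pyramid levels.

    Each level is a dict with hypothetical keys:
      'restored', 'target': flattened reconstruction and clean input
      'feat_a', 'feat_b':   feature vectors of two augmented views
    The equal per-level weighting is an assumption, not the paper's setting.
    """
    total = 0.0
    for lvl in levels:
        restore = mse(lvl["restored"], lvl["target"])
        compare = neg_cosine(lvl["feat_a"], lvl["feat_b"])
        total += w_restore * restore + w_compare * compare
    return total / len(levels)
```

In this sketch, a level whose reconstruction is perfect and whose two views yield parallel features contributes the minimum value of -1.0. Imperfect restoration or misaligned features raise the loss, so both kinds of information are pushed into the same features.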
PCRLv2 serves as a pre-trained backbone for medical imaging researchers and developers who need strong models but have few labeled examples. Practitioners can initialize segmentation, detection, or classification networks from the released LUNA16, NIH ChestX-ray, or CheXpert weights and fine-tune on their own datasets, reducing annotation burden for tasks like tumor and organ segmentation, lung nodule detection, and chest disease screening. Because it covers both 2D and 3D, it fits radiology workflows spanning plain-film X-ray and volumetric CT.
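Initializing a downstream network from pretrained weights usually means copying only the parameters whose names and shapes match, and leaving the task-specific head at its fresh initialization. The sketch below shows that pattern with plain dictionaries standing in for checkpoint and model state. The parameter names and the `(shape, values)` representation are hypothetical and do not reflect the layout of the released PCRLv2 checkpoints.

```python
def transfer_weights(pretrained, target):
    """Copy entries from `pretrained` into a copy of `target` when both the
    parameter name and the shape match. Returns (merged, skipped_names).

    Both arguments map a parameter name to a (shape, values) tuple; this
    stands in for a framework state dict and is an illustrative assumption.
    """
    merged = dict(target)
    skipped = []
    for name, (shape, values) in pretrained.items():
        if name in target and target[name][0] == shape:
            merged[name] = (shape, values)
        else:
            skipped.append(name)  # missing in target or shape mismatch
    return merged, skipped

# Hypothetical example: the encoder transfers, the pre-training head does not.
pretrained = {
    "encoder.conv1": ((2, 2), [0.1, 0.2, 0.3, 0.4]),
    "restore_head.out": ((1, 2), [0.5, 0.6]),
}
target = {
    "encoder.conv1": ((2, 2), [0.0, 0.0, 0.0, 0.0]),
    "seg_head.out": ((3, 2), [0.0] * 6),
}
merged, skipped = transfer_weights(pretrained, target)
```

Here `merged["encoder.conv1"]` carries the pretrained values, `seg_head.out` keeps its fresh initialization, and `restore_head.out` is reported in `skipped`. Checking that list before fine-tuning is a cheap way to catch a checkpoint that did not load as intended.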
PCRLv2 contributed to the line of medical-imaging self-supervised learning research that demonstrated annotation-efficient pre-training can rival or exceed supervised ImageNet transfer for radiology. Published in a top venue (IEEE TPAMI) with MIT-licensed code and downloadable pretrained weights, it has been adopted as a baseline and starting point in subsequent medical foundation model work. Its main limitations are scope-related: the released models target chest X-ray and CT radiology rather than the full breadth of medical imaging modalities, and the framework predates the larger transformer-based medical foundation models that followed.
Zhou, H., et al. (2023) A Unified Visual Information Preservation Framework for Self-supervised Pre-Training in Medical Image Analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence.
DOI: 10.1109/TPAMI.2023.3234002