RETFound

University College London / Google DeepMind

Self-supervised foundation model for retinal imaging, pretrained on 1.6 million unlabelled fundus and OCT scans to detect ocular and systemic disease.

Released: September 2023

RETFound is a self-supervised foundation model for retinal imaging, developed by researchers at the UCL Institute of Ophthalmology, Moorfields Eye Hospital, and Google DeepMind, and published in Nature in September 2023. It addresses a persistent bottleneck in ophthalmic AI: high-performing disease classifiers traditionally require large, expensively annotated datasets, which limits their applicability to the long tail of rare conditions and to settings where labels are scarce. RETFound sidesteps this by learning rich representations from 1.6 million unlabelled retinal images before any task-specific fine-tuning.

The model is one of the first foundation models built specifically for a medical imaging modality, and it demonstrated that the masked-modelling paradigm behind language and natural-image foundation models transfers well to clinical retinal data. After self-supervised pretraining, RETFound can be adapted with a small number of labelled examples to a wide range of downstream tasks, consistently outperforming models initialised with conventional supervised pretraining on ImageNet.

Crucially, RETFound's representations capture signal relevant not only to eye disease but also to systemic conditions. Because the retina offers a non-invasive window onto the body's microvasculature and neural tissue, the model can be fine-tuned to predict the future onset of cardiovascular and neurodegenerative disorders from a single retinal photograph.

Key Features

Self-supervised pretraining at scale: A masked autoencoder learns to reconstruct heavily masked retinal images across 1.6 million scans, building a general-purpose representation without any disease labels.
Two imaging modalities: Separate checkpoints are released for colour fundus photography (CFP) and optical coherence tomography (OCT), covering the two dominant retinal imaging modalities in clinical practice.
High label efficiency: RETFound reaches strong performance after fine-tuning on relatively few labelled examples, reducing the annotation burden for new tasks and rare diseases.
Ocular and systemic prediction: The model is validated on diabetic retinopathy and glaucoma grading, neovascular AMD prognosis, and incident prediction of heart failure, myocardial infarction, ischaemic stroke, and Parkinson's disease.
Free for non-commercial research: The full training and fine-tuning pipeline is on GitHub under a non-commercial CC-BY-NC-4.0 license (not OSI-approved, so not "open" code), and the pretrained CFP/OCT weights are distributed on HuggingFace behind account registration and an access agreement.
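
The masking step behind the first feature can be illustrated with a minimal sketch. It is not taken from the RETFound repository: the 224-pixel input size, 16-pixel patch size, and 75% mask ratio are assumptions borrowed from common MAE defaults, and `random_patch_mask` is a hypothetical helper name.

```python
import random


def random_patch_mask(image_size=224, patch_size=16, mask_ratio=0.75, seed=0):
    """Split an image grid into patches and pick a random subset to hide.

    Returns (visible, masked): two sorted lists of patch indices.
    All default values are illustrative assumptions, not confirmed settings.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    per_side = image_size // patch_size
    num_patches = per_side * per_side          # 14 * 14 = 196 with the defaults
    num_masked = int(num_patches * mask_ratio)

    # Shuffle patch indices with a seeded generator so the split is repeatable.
    order = list(range(num_patches))
    random.Random(seed).shuffle(order)

    masked = sorted(order[:num_masked])        # patches the decoder must reconstruct
    visible = sorted(order[num_masked:])       # patches the encoder actually sees
    return visible, masked


visible, masked = random_patch_mask()
print(len(visible), len(masked))  # 49 147
```

Under these assumed settings the encoder processes only 49 of 196 patches per image, which is why heavy masking also makes pretraining cheaper per sample.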

Technical Details

RETFound uses a Vision Transformer (ViT-Large) backbone trained with the masked autoencoder (MAE) objective: a large fraction of image patches is masked and the model learns to reconstruct the missing content, which forces it to encode the underlying retinal anatomy. Pretraining used roughly 904,000 CFP images and 736,000 OCT images (about 1.6 million in total), drawn largely from the Moorfields MEH-MIDAS dataset and supplemented with public retinal image collections; the Moorfields-AlzEye and UK Biobank cohorts were used for the systemic-disease experiments. For downstream adaptation, the MAE decoder is discarded, a task-specific classification head is attached to the encoder, and the whole network is fine-tuned end to end. On diabetic retinopathy classification, RETFound achieved AUROCs of 0.943, 0.822, and 0.884 on the Kaggle APTOS-2019, IDRiD, and MESSIDOR-2 benchmarks respectively, surpassing supervised ImageNet-21k baselines, and it showed comparable gains on glaucoma detection and systemic-disease prognosis. The released ViT-Large checkpoints are roughly 300 MB each.
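
For readers unfamiliar with the AUROC figures quoted above, the following is a minimal sketch of how a binary AUROC can be computed from model scores. It uses the pairwise (Mann-Whitney) formulation and toy data; it is not the evaluation code used by the authors, and multi-class tasks such as diabetic retinopathy grading would additionally need some averaging scheme across classes, which is not shown here.

```python
def auroc(labels, scores):
    """Binary AUROC: probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: 1 = disease present, 0 = disease absent (made-up scores).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, scores))  # 0.75
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the reported values should be read on that scale.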

Applications

RETFound is designed as a starting point for clinical and research teams building retinal-image classifiers, particularly where labelled data is limited. Ophthalmologists and AI developers can fine-tune it to screen for diabetic retinopathy, glaucoma, and age-related macular degeneration, or to triage patients in community screening programmes. Beyond the eye, epidemiologists and clinical researchers can adapt it for oculomics, the use of the retina as a biomarker for cardiovascular and neurodegenerative risk, which could open low-cost, non-invasive screening pathways for systemic disease.
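
When labelled data is limited, one practical question is how to assemble a small fine-tuning set without letting the majority class dominate. The sketch below shows one possible approach, class-balanced subsampling; it is an illustrative assumption rather than a protocol described by the RETFound authors, and `balanced_subset` and the label names are hypothetical.

```python
import random
from collections import defaultdict


def balanced_subset(labels, per_class, seed=0):
    """Return sorted indices with at most `per_class` examples for each label."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    chosen = []
    for label in sorted(by_class):
        indices = by_class[label]
        # Rare classes may have fewer than `per_class` examples; take them all.
        k = min(per_class, len(indices))
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)


# Toy, heavily imbalanced label list (made-up class names and counts).
labels = ["healthy"] * 90 + ["dr"] * 8 + ["glaucoma"] * 2
subset = balanced_subset(labels, per_class=5)
print(len(subset))  # 12
```

Here the subset holds five healthy, five diabetic retinopathy, and both glaucoma examples, so every class is represented even though the source list is dominated by healthy cases.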

Impact

RETFound established a template for medical-imaging foundation models and became a widely cited reference point in ophthalmic AI, spurring a wave of follow-on retinal models and benchmark studies. By releasing both code and weights for non-commercial research, the team enabled groups worldwide to build on the representation rather than retrain from scratch. Subsequent work has probed its label efficiency, deployed it in real-world community screening, and trained more compute-efficient successors. Known limitations fall into three groups. First, the non-commercial CC-BY-NC-4.0 code license and the gated, access-controlled weights together fall short of an open release. Second, downstream performance is sensitive to dataset shift across populations and imaging devices. Third, systemic-disease prediction, while promising, requires prospective clinical validation before deployment.

Citation

A foundation model for generalizable disease detection from retinal images

Zhou, Y., et al. (2023) A foundation model for generalizable disease detection from retinal images. Nature.

DOI: 10.1038/s41586-023-06555-x

Recent citations

Papers that recently cited this model.

A comparative analysis of modern CNN and transformer architectures for multi-class retinal disease classification with statistical validation and explainability
M. Balcı, Ahmet Alkan
Neurocomputing · Oct 2026
Citations: 0

Fundus image-based glaucoma screening via retinal knowledge-oriented dynamic multi-level feature integration
Chi Liu, Yuzhuo Zhou, Sheng Shen, et al.
Knowledge-Based Systems · Sep 2026
Citations: 0

Classification of Retinal Diseases in Fundus Images Using Hybrid Deep Learning Based on Multiscale Feature Fusion
Hüseyin Yanık, Bensu Beğenilmiş, E. Değirmenci
Black Sea Journal of Engineering and Science · Jul 2026
Citations: 0

Top citations

The most-cited papers that cite this model.

A whole-slide foundation model for digital pathology from real-world data
Hanwen Xu, N. Usuyama, Jaspreet Bagga, et al.
Nature · May 2024
Citations: 842

A Pathology Foundation Model for Cancer Diagnosis and Prognosis Prediction
Xiyue Wang, Junhan Zhao, Eliana Marostica, et al.
Nature · Sep 2024
Citations: 484

On the Challenges and Perspectives of Foundation Models for Medical Image Analysis
Shaoting Zhang, Dimitris N. Metaxas
Medical Image Anal. · Jun 2023
Citations: 304

A Vision-Language Foundation Model for Precision Oncology
Jinxi Xiang, Xiyue Wang, Xiaoming Zhang, et al.
Nature · Jan 2025
Citations: 245

Vision–language foundation model for echocardiogram interpretation
M. Christensen, M. Vukadinovic, Neal Yuan, et al.
Nature Medicine · Apr 2024
Citations: 202

Citations

Total Citations: 938
Influential: 106
References: 69

GitHub

Stars: 660
Forks: 159
Open Issues: 5
Contributors: 3
Last Push: 7mo ago
Language: Python

HuggingFace

Downloads: 105
Likes: 30
Last Modified: 1y ago

Fields of citing research

Medicine: 47%
Computer Science: 42%
Engineering: 13%
Biology: 2%
Environmental Science: 1%
Chemistry: 1%
Physics: 1%
Mathematics: 0%

Share of papers citing this model.

Openness

bio.rodeo openness: Closed · low usability and reproducibility

Overall: 30 (Closed)
Usability (can I run it?): 24
Reproducibility (can I retrain it?): 21

Model Openness Framework: Unclassified (restrictive license on core components)

Resources

GitHub Repository · Research Paper · HuggingFace Model
