bio.rodeo

UniBrain

Shanghai Jiao Tong University / University of Science and Technology of China / Shanghai AI Laboratory / Shanghai Sixth People's Hospital

Hierarchical knowledge-enhanced vision-language pre-training model for universal brain MRI diagnosis across 10+ diseases from multi-modal scans and reports.

Released: September 2023

UniBrain is a vision-language pre-training framework for universal brain MRI diagnosis, developed by researchers at Shanghai Jiao Tong University, the University of Science and Technology of China, Shanghai AI Laboratory, and Shanghai Sixth People's Hospital. First released as a preprint in September 2023 and later published in Computerized Medical Imaging and Graphics in 2025, it targets a central limitation of brain MRI deep learning: most models are trained narrowly on a single disease or modality and fail to generalize across the wide spectrum of conditions encountered in routine clinical practice.

Rather than relying on costly per-disease manual annotation, UniBrain learns directly from 24,770 routinely collected imaging-report pairs, which pair four transverse MRI modalities (T1WI, T2WI, T2FLAIR, and DWI) with their free-text radiology reports. It bridges the gap between unstructured clinical prose and structured visual features with a hierarchical alignment scheme that links images and reports at multiple levels of granularity, so that a single framework can diagnose more than ten common brain diseases.

The work sits within the broader trend of medical vision-language foundation models (alongside efforts like CheXzero and MedCLIP in chest imaging), but is distinctive in tackling volumetric, multi-modal brain MRI and in deriving diagnostic supervision automatically from radiology reports.

#Key Features

  • Hierarchical alignment: Aligns imaging and report features at the modality, concatenated-feature, and global levels, rather than relying on a single global image-report contrast, which improves fine-grained correspondence.
  • Automatic Report Decomposition: Extracts structured diagnostic knowledge from free-text clinical reports, removing the need for exhaustive manual disease labels.
  • Multi-modal input: Jointly processes four transverse MRI sequences (T1WI, T2WI, T2FLAIR, DWI), reflecting how radiologists read brain studies.
  • Universal diagnosis: Covers 10+ common brain conditions in one model, with a reported average AUC of about 90.7% on in-domain evaluation.
  • Released weights: Pre-trained image and text encoder checkpoints and an inference SDK are publicly available for reuse.
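The "average AUC" quoted in the list above is most naturally read as a per-disease AUC averaged over the target diseases. That reading is an assumption, since this page does not state the averaging scheme. A minimal pure-Python sketch of that computation, using the rank-based (Mann-Whitney) definition of AUC and made-up labels and scores (the disease names and numbers are illustrative only, not UniBrain results):

```python
def auc(labels, scores):
    """AUC as the probability that a random positive case outscores a
    random negative case; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_auc(labels_by_disease, scores_by_disease):
    """Unweighted mean of the per-disease AUCs (assumed averaging scheme)."""
    per_disease = {
        disease: auc(labels_by_disease[disease], scores_by_disease[disease])
        for disease in labels_by_disease
    }
    return sum(per_disease.values()) / len(per_disease), per_disease


# Toy data: four studies, two hypothetical diseases.
labels = {"disease_a": [1, 0, 1, 0], "disease_b": [0, 0, 1, 1]}
scores = {"disease_a": [0.9, 0.2, 0.4, 0.6], "disease_b": [0.1, 0.3, 0.8, 0.7]}

mean_auc, per_disease = macro_auc(labels, scores)
print(per_disease)  # {'disease_a': 0.75, 'disease_b': 1.0}
print(mean_auc)     # 0.875
```

An unweighted mean treats rare and common diseases equally; a prevalence-weighted mean would be a different, equally plausible choice.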

#Technical Details

UniBrain combines a convolutional image encoder for volumetric MRI with a transformer-based text encoder, trained with a hierarchical contrastive vision-language objective over 24,770 imaging-report pairs drawn from routine diagnostics. The hierarchical alignment proceeds in three stages: it first matches modality-wise imaging-report features, then projects the concatenated multi-modal features into a shared vision-language semantic space, and finally aligns global imaging-report representations. An Automatic Report Decomposition module structures the free-text reports into diagnosis-relevant knowledge that serves as supervision. On in-domain evaluation the model reports an average AUC of roughly 90.7% across its target diseases. It is also validated on out-of-domain datasets, where the authors report that it consistently surpasses prior state-of-the-art diagnostic baselines and reaches radiologist-level performance on certain disease categories.
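The three alignment levels described above can be illustrated with a toy loss. This is a sketch of one plausible reading, not the authors' implementation. It assumes a symmetric InfoNCE term at each level, equal weighting of the three terms, mean pooling for the global image feature, and fixed projection matrices standing in for learned ones. All function names, the temperature value, and the 2-dimensional features are hypothetical.

```python
import math

MODALITIES = ["T1WI", "T2WI", "T2FLAIR", "DWI"]


def normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def info_nce(img_batch, txt_batch, temperature=0.07):
    """Symmetric InfoNCE: item i of img_batch is the positive for item i of txt_batch."""
    img = [normalize(v) for v in img_batch]
    txt = [normalize(v) for v in txt_batch]
    logits = [[sum(a * b for a, b in zip(i_vec, t_vec)) / temperature
               for t_vec in txt] for i_vec in img]

    def diag_cross_entropy(rows):
        # Cross-entropy with the diagonal entry as the target (stable log-sum-exp).
        total = 0.0
        for i, row in enumerate(rows):
            peak = max(row)
            log_norm = peak + math.log(sum(math.exp(x - peak) for x in row))
            total += log_norm - row[i]
        return total / len(rows)

    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (diag_cross_entropy(logits) + diag_cross_entropy(transposed))


def project(vec, weight):
    """Linear map into the shared space; weight is a list of output rows."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]


def concat(by_modality, study):
    return [x for m in MODALITIES for x in by_modality[m][study]]


def mean_pool(by_modality, study):
    dim = len(by_modality[MODALITIES[0]][study])
    return [sum(by_modality[m][study][k] for m in MODALITIES) / len(MODALITIES)
            for k in range(dim)]


def hierarchical_loss(img, txt, txt_global, w_img, w_txt):
    n = len(txt_global)
    # Level 1: modality-wise image-report alignment, averaged over modalities.
    level_modality = sum(info_nce(img[m], txt[m]) for m in MODALITIES) / len(MODALITIES)
    # Level 2: concatenated features projected into a shared space.
    level_concat = info_nce([project(concat(img, i), w_img) for i in range(n)],
                            [project(concat(txt, i), w_txt) for i in range(n)])
    # Level 3: global image feature (mean-pooled here) vs. global report feature.
    level_global = info_nce([mean_pool(img, i) for i in range(n)], txt_global)
    return level_modality + level_concat + level_global


# Toy batch of two studies with 2-dimensional features per modality.
studies = [[1.0, 0.0], [0.0, 1.0]]
img = {m: [list(v) for v in studies] for m in MODALITIES}
txt = {m: [list(v) for v in studies] for m in MODALITIES}
dim = 2
w = [[1.0 if j % dim == r else 0.0 for j in range(dim * len(MODALITIES))]
     for r in range(dim)]

aligned = hierarchical_loss(img, txt, studies, w, w)
swapped_txt = {m: txt[m][::-1] for m in MODALITIES}
misaligned = hierarchical_loss(img, swapped_txt, studies[::-1], w, w)
print(aligned < misaligned)  # True
```

When each report is paired with its own study the loss is close to zero. Swapping the reports between the two studies drives every level's term up, which is the behavior a contrastive alignment objective is meant to penalize.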

#Applications

UniBrain is aimed at automated brain MRI screening and decision support, where a single model can flag multiple candidate diagnoses from a standard multi-sequence study and its accompanying report. Because it learns from routinely generated reports rather than bespoke annotations, it is well suited to institutions with large unlabeled MRI archives, and the released inference SDK lets researchers and clinical informatics teams run predictions on NIfTI inputs across the four supported modalities. Its out-of-domain validation suggests utility as a transferable backbone for downstream neuroimaging tasks.
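Because the model expects all four modalities, a pre-flight check on each study folder can catch incomplete inputs before inference. The sketch below does not call the UniBrain SDK, whose interface is not described on this page. The file naming (`<modality>.nii.gz` or `<modality>.nii`) and the function name `collect_study` are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

REQUIRED_MODALITIES = ("T1WI", "T2WI", "T2FLAIR", "DWI")


def collect_study(study_dir):
    """Map each required modality to its NIfTI file, or raise if any is missing.

    Assumes a hypothetical layout with one file per modality, named
    '<modality>.nii.gz' or '<modality>.nii'.
    """
    study_dir = Path(study_dir)
    found, missing = {}, []
    for modality in REQUIRED_MODALITIES:
        candidates = [study_dir / f"{modality}.nii.gz", study_dir / f"{modality}.nii"]
        match = next((p for p in candidates if p.is_file()), None)
        if match is None:
            missing.append(modality)
        else:
            found[modality] = match
    if missing:
        raise FileNotFoundError(f"missing modalities: {', '.join(missing)}")
    return found


# Demo on a temporary folder with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for modality in REQUIRED_MODALITIES:
        (Path(tmp) / f"{modality}.nii.gz").touch()
    complete_modalities = sorted(collect_study(tmp))

    # Remove one sequence to show the failure path.
    (Path(tmp) / "DWI.nii.gz").unlink()
    try:
        collect_study(tmp)
        incomplete_error = None
    except FileNotFoundError as exc:
        incomplete_error = str(exc)

print(complete_modalities)  # ['DWI', 'T1WI', 'T2FLAIR', 'T2WI']
print(incomplete_error)     # missing modalities: DWI
```

Failing fast on a missing sequence is a design choice for this sketch; whether the released model tolerates absent modalities is not stated here.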

#Impact

By demonstrating that report-supervised, hierarchically aligned pre-training can deliver universal, multi-disease brain MRI diagnosis at radiologist-comparable accuracy on some conditions, UniBrain contributes to the growing body of medical vision-language foundation models and offers a template for exploiting routine clinical reports as supervision. Its public weights and SDK lower the barrier for follow-on neuroimaging research. Key limitations include reliance on the four specific transverse modalities and on report quality, and validation confined to the disease set and cohorts represented in the training and test data, so generalization to rarer conditions and other scanner protocols remains to be established.

Citation

UniBrain: Universal Brain MRI diagnosis with hierarchical knowledge-enhanced pre-training

Lei, J., et al. (2025) UniBrain: Universal Brain MRI diagnosis with hierarchical knowledge-enhanced pre-training. Computerized Medical Imaging and Graphics.

DOI: 10.1016/j.compmedimag.2025.102516

Citations

Total Citations: 0
Influential: 0
References: 39

GitHub

Stars: 39
Forks: 1
Open Issues: 7
Contributors: 1
Last Push: 1y ago
Language: Python

Openness

bio.rodeo openness: Open weights · open weights, closed recipe
35 Closed
Usability (can I run it?): 54
Reproducibility (can I retrain it?): 0, not reproducible
Model Openness Framework: Unclassified
Restrictive license on core components

Tags

brain_mri, cnn, contrastive_learning, diagnosis, multimodal, neuroimaging, report_generation, transformer, vision_language_pre_training, zero_shot

Resources

GitHub Repository
Research Paper