
NeuroVFM

University of Michigan / University of Cologne

A generalist neuroimaging vision foundation model pretrained on 5.24M clinical MRI and CT volumes for radiologic diagnosis and report generation.

Released: November 2025

NeuroVFM is a generalist vision foundation model for clinical neuroimaging, designed to interpret brain MRI and CT studies across the full diversity of pathology seen in routine practice. Most prior medical imaging foundation models depend on carefully curated, annotation-rich datasets that are expensive to assemble and narrow in scope. NeuroVFM instead embraces "health system learning": it is pretrained directly on uncurated clinical data accumulated through ordinary patient care, allowing it to learn from the messy, heterogeneous, real-world distribution of scans that radiologists actually encounter.

The model was developed by the Machine Learning in Neurosurgery (MLiNS) Lab at the University of Michigan, with collaborators from University of Michigan Radiology, Computational Medicine and Bioinformatics, and Computer Science and Engineering, and from the Department of Neurosurgery at the University of Cologne. It was released as a preprint in November 2025 (Kondepudi et al.). The authors report that frontier general-purpose vision-language models underperform on neuroimaging tasks, whereas their domain-specific model, trained at scale on clinical data, achieves state-of-the-art results.

NeuroVFM occupies a distinctive niche in the landscape of biomedical foundation models: rather than a protein or genomics model, it is a volumetric imaging encoder that pairs self-supervised visual pretraining with a downstream findings language model, bridging pixel-level brain anatomy and natural-language radiology reporting.

#Key Features

  • Health system learning at scale: Pretrained on 5.24 million clinical MRI and CT volumes drawn from roughly 567,000 imaging studies spanning more than 20 years of routine care, without manual curation or expert annotation.
  • Volumetric self-supervision (Vol-JEPA): Uses a Volumetric Joint-Embedding Predictive Architecture, a 3D extension of the JEPA paradigm that predicts masked regions in latent space rather than reconstructing raw voxels.
  • Broad diagnostic coverage: Study-level diagnostic heads span 74 MRI and 82 CT diagnoses, built with multiple-instance learning over slice-level features.
  • Radiology report generation: A fine-tuned findings language model produces draft radiology reports that, in the authors' evaluations, are more accurate than those of frontier models and contain fewer hallucinated findings.
  • Interpretable grounding: Exhibits emergent neuroanatomic understanding with visual grounding that links predictions back to relevant regions of the scan.
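The Vol-JEPA idea in the list above (hide a region of a 3D volume, then predict its representation in latent space instead of reconstructing voxels) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function names, the patch and block sizes, the single-block masking scheme, and the squared-error loss are hypothetical and are not taken from the NeuroVFM code or paper. The embeddings are random stand-ins for what a target encoder and a predictor network would produce.

```python
import random

def patch_grid(volume_shape, patch_shape):
    """Number of patches along each axis of a 3D volume (depth, height, width)."""
    return tuple(v // p for v, p in zip(volume_shape, patch_shape))

def sample_block_mask(grid, block, rng):
    """Pick one contiguous 3D block of patch indices to hide from the context encoder."""
    starts = [rng.randrange(g - b + 1) for g, b in zip(grid, block)]
    return {
        (d, h, w)
        for d in range(starts[0], starts[0] + block[0])
        for h in range(starts[1], starts[1] + block[1])
        for w in range(starts[2], starts[2] + block[2])
    }

def latent_loss(predicted, target, masked):
    """Mean squared error between predicted and target embeddings, masked patches only.

    No voxel values appear here: the loss lives entirely in embedding space.
    (Squared error is an assumption; the actual objective may differ.)
    """
    total, count = 0.0, 0
    for idx in masked:
        for p, t in zip(predicted[idx], target[idx]):
            total += (p - t) ** 2
            count += 1
    return total / count

rng = random.Random(0)
grid = patch_grid((32, 64, 64), (8, 16, 16))  # (4, 4, 4) patches
masked = sample_block_mask(grid, (2, 2, 2), rng)

# Stand-in embeddings. In a real JEPA setup the targets would come from a
# target encoder that sees the full volume, and the predictions from a
# predictor network that sees only the unmasked context.
dim = 4
all_idx = [(d, h, w) for d in range(grid[0])
           for h in range(grid[1]) for w in range(grid[2])]
target = {idx: [rng.gauss(0, 1) for _ in range(dim)] for idx in all_idx}
predicted = {idx: [t + 0.1 for t in vec] for idx, vec in target.items()}

loss = latent_loss(predicted, target, masked)
```

The point of the sketch is the shape of the objective: the loss is computed only at masked patch positions and only between embedding vectors, which is what distinguishes a joint-embedding predictive objective from voxel reconstruction.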

#Technical Details

NeuroVFM is built on a 3D Vision Transformer encoder trained with the Vol-JEPA objective, which masks volumetric patches and predicts their representations in embedding space to learn anatomy and pathology without labels. Downstream capabilities are added in stages. First, multiple-instance learning trains study-level diagnostic heads covering 74 MRI and 82 CT diagnoses. Second, a Perceiver-style connector links the visual encoder to a large language model, which is then fine-tuned with supervision to generate radiology findings. The training corpus comprises 5.24 million MRI and CT volumes from approximately 567,000 studies collected through clinical care. Released model variants include an encoder (neurovfm-encoder), a CT diagnostic head (neurovfm-dx-ct), and a findings LLM (neurovfm-llm). The authors' reported evaluations show state-of-the-art diagnostic accuracy and report quality relative to general-purpose frontier models. Code is released under the MIT license, while model weights are distributed under CC-BY-NC-SA 4.0 and gated behind institutional-email access approval.
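To make the multiple-instance learning step concrete, the following is a minimal sketch of attention-based MIL pooling, one common way to turn a variable-size bag of per-instance features into a single study-level prediction. It is an illustration under stated assumptions, not the released implementation: the function names, the linear attention scorer, the toy weights, and the independent-sigmoid (multi-label) head are all hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(instances, attn_weights):
    """Score each instance, normalize scores, and return the weighted mean.

    instances: list of feature vectors from one imaging study (the "bag").
    attn_weights: hypothetical learned vector that scores each instance.
    """
    scores = [sum(w * x for w, x in zip(attn_weights, inst)) for inst in instances]
    alphas = softmax(scores)
    dim = len(instances[0])
    pooled = [sum(a * inst[i] for a, inst in zip(alphas, instances))
              for i in range(dim)]
    return pooled, alphas

def diagnosis_probs(pooled, head_weights, head_bias):
    """One independent sigmoid per diagnosis (assumed multi-label setup)."""
    probs = []
    for row, b in zip(head_weights, head_bias):
        logit = sum(w * x for w, x in zip(row, pooled)) + b
        probs.append(1.0 / (1.0 + math.exp(-logit)))
    return probs

# Toy bag of three instance features; the first is the most "salient"
# under the hypothetical attention vector.
instances = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
pooled, alphas = attention_pool(instances, attn_weights=[2.0, 0.0])

# Two hypothetical diagnoses with toy weights.
probs = diagnosis_probs(pooled, head_weights=[[1.0, 0.0], [0.0, 1.0]],
                        head_bias=[0.0, -1.0])
```

Because only the study carries a label, the attention weights are what let the head learn which instances matter. They also offer a coarse form of interpretability, since the instances with the highest weights can be inspected.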

#Applications

NeuroVFM targets clinical and research neuroimaging workflows. Its encoder provides general-purpose volumetric representations that can be adapted to classification, detection, or retrieval tasks, while its diagnostic heads and findings model support automated triage, draft report generation, and decision support for radiologists and neurosurgeons. Researchers can use the released encoder as a feature backbone for downstream neuroimaging studies, and clinical informatics teams can explore it as a template for building generalist medical AI from existing health-system archives. The authors emphasize research-only use and explicitly note it is not a medical device.
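As an illustration of the retrieval use case mentioned above, the sketch below ranks previously embedded studies by cosine similarity to a query embedding. It assumes that each study has already been reduced to a fixed-length vector by some encoder. The study names, the three-dimensional toy vectors, and the `retrieve` helper are hypothetical and do not reflect the NeuroVFM API.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query, index, k=2):
    """Return the names of the k indexed studies most similar to the query."""
    ranked = sorted(index.items(),
                    key=lambda item: cosine(query, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]

# Toy index of precomputed study embeddings (placeholder values).
index = {
    "study_a": [1.0, 0.0, 0.0],
    "study_b": [0.9, 0.1, 0.0],
    "study_c": [0.0, 1.0, 0.0],
}

top = retrieve([1.0, 0.0, 0.0], index, k=2)
```

A linear scan like this is adequate for small archives. At health-system scale an approximate nearest-neighbor index would be the more practical choice.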

#Impact

NeuroVFM advances the case that uncurated, real-world clinical archives are a viable, and in some respects superior, substrate for training generalist medical AI, challenging the assumption that high-quality models require expensive curated datasets. By reporting that a domain-specialized volumetric model can outperform frontier vision-language systems on radiologic diagnosis and report generation while reducing hallucinations, it offers a concrete blueprint for "health system learning" in neuroimaging and beyond. As a recent preprint with gated weights, its broader adoption and external validation remain to be established, but it sharpens an important direction for safe, clinically grounded foundation models.

Citations

Kondepudi, A., et al. (2025) Health system learning achieves generalist neuroimaging models. Research Square.

DOI: 10.21203/rs.3.rs-8166797/v1

Kondepudi, A., et al. (2025) Health system learning achieves generalist neuroimaging models. arXiv (preprint).

DOI: 10.48550/arXiv.2511.18640

Citations

Total Citations: 4
Influential: 1
References: 0

GitHub

Stars: 20
Forks: 3
Open Issues: 0
Contributors: 2
Last Push: 3mo ago
Language: Python
License: MIT

Openness

bio.rodeo openness: 57 (Partial). Open weights, closed recipe.
Usability (can I run it?): 70
Reproducibility (can I retrain it?): 36
Model Openness Framework: Unclassified (restrictive license on core components)

Tags

ct, foundation_model, joint_embedding_predictive_architecture, mri, multimodal, neuroimaging, radiologic_diagnosis, report_generation, representation_learning, self_supervised, vision_transformer

Resources

GitHub Repository · Research Paper · Official Website · HuggingFace Model · Demo