bio.rodeo

The authoritative source for evaluating biological foundation models. No hype, just honest analysis.

© 2026 Pulsatance. All rights reserved.
Imaging

SpineAgent

University of Washington

Multi-sequence spine-MRI foundation model with paired DINOv3 encoders, supporting 17-condition classification, pathology localization, image-report retrieval, and report generation.

Released: June 2026

Spine MRI is central to diagnosing back pain, spinal stenosis, trauma, and tumors, but its interpretation is slow and complex, requiring radiologists to synthesize findings across multiple imaging sequences (T1-weighted, T2-weighted, STIR, Dixon) and anatomical levels. SpineAgent is a multi-sequence spine-MRI foundation model and accompanying multi-agent system that learns transferable visual representations from routine clinical imaging and applies them across the full spectrum of interpretation tasks, from condition classification to draft report generation.

SpineAgent was developed by Zhiping Xiao, Nathan M. Cross, Sheng Wang, and colleagues at the University of Washington (with collaborators at Peking University, UW–Madison, and NYU) and posted to bioRxiv in June 2026. It is built on a self-supervised foundation model pretrained on one of the largest spine-MRI corpora reported to date: 32,047 patients, 453,683 series, and 13.4 million slices from University of Washington Medicine. Its core is a pair of DINOv3-based Vision Transformer encoders trained separately on T1- and T2-weighted data, whose slice-level outputs are aggregated into fixed patient-level embeddings that are reused across many downstream agents.

SpineAgent decomposes radiology reporting into clinically grounded subtasks, each handled by a specialized agent that draws on the shared encoders. The reported results indicate that a single imaging foundation model can generalize across scanner manufacturers and to an external cohort, a recurring challenge for medical-imaging deep learning.

Key Features

  • Paired DINOv3 encoders: Two Vision Transformers are pretrained independently with the DINOv3 self-supervised objective (with Gram anchoring), one on T1-weighted slices and one on T2-weighted slices. A lightweight synthesizer module learns layer-wise fusion to adapt to other sequence types (STIR, Dixon) via continual training.
  • 17-condition classification: Patient-level embeddings drive classification across 17 spinal conditions spanning degenerative changes, alignment abnormalities, lesions/masses, trauma/compression, and canal or foraminal narrowing, with labels distilled from clinical reports under a presence/absence/ambiguity scheme.
  • Pathology localization: A two-phase pipeline first selects clinically relevant slices per condition, then regresses bounding boxes to localize pathology, evaluated against expert-annotated key slices on an RSNA spine-MRI benchmark.
  • Multimodal retrieval: Image-to-text and text-to-image retrieval agents align case-level MRI embeddings with report embeddings via CLIP-style training using a BiomedBERT text encoder, enabling case lookup and concordant-slice retrieval.
  • Draft report generation: Attention-pooled image tokens are concatenated with text tokens and passed to a LLaMA-3.1-8B decoder to produce a draft radiology report, integrating visual features with structured semantic signals.
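
The CLIP-style alignment mentioned for the retrieval agents can be illustrated with a symmetric contrastive (InfoNCE) loss over a batch of matched image and report embeddings. This is a minimal pure-Python sketch of the general technique, not SpineAgent's training code: the function names, the toy two-dimensional embeddings, and the temperature of 0.07 (the initial value used by the original CLIP model) are all assumptions made for illustration.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def log_softmax(row):
    """Numerically stable log-softmax over a list of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]

def clip_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE loss; pair i of image_embs matches pair i of text_embs.

    The temperature value is an assumption, not taken from the SpineAgent paper.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[i][j]: temperature-scaled cosine similarity of image i and text j
    logits = [[sum(a * b for a, b in zip(imgs[i], txts[j])) / temperature
               for j in range(n)] for i in range(n)]
    # Image-to-text direction: each row should put its mass on the diagonal.
    i2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text-to-image direction: same idea over the columns.
    cols = [[logits[i][j] for i in range(n)] for j in range(n)]
    t2i = -sum(log_softmax(cols[j])[j] for j in range(n)) / n
    return 0.5 * (i2t + t2i)

# Toy check: correctly paired embeddings give a much lower loss than swapped ones.
aligned = clip_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = clip_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

Minimizing a loss of this shape pulls each case's image embedding toward its own report embedding and away from the other reports in the batch, which is what makes nearest-neighbor lookup work in both directions.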

Technical Details

SpineAgent pretrains its two ViT encoders with DINOv3 on roughly 4.5 million T1 slices and 4.5 million T2 slices, respectively, then aligns image and text representations through a CLIP-style stage using a BiomedBERT language encoder. For inference, slice-level embeddings from the sequence-specific encoders (or the synthesizer for other sequences) are concatenated and aggregated by an attention-pooling projector into a fixed set of patient-level image tokens. Across the 17 classification tasks, SpineAgent is reported to improve AUROC by 10.8% over the strongest baselines (with a 13.4% AUPRC gain) when using all available sequences, and continual training of the synthesizer yields an 11.1% AUROC improvement on non-T1/T2 sequences. On retrieval, it achieves a 56.4% relative improvement in Recall@5 on the UW Medicine dataset over the next-best method. Cross-manufacturer evaluation (training on one scanner vendor, testing across all) and cross-cohort evaluation on the external RSNA LumbarDISC cohort both show consistent gains, evidence of robustness to scanner and population shift.
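
The attention-pooling step, which turns a variable number of slice embeddings into a fixed set of patient-level tokens, can be sketched as follows. This is a simplified single-head version under stated assumptions: the `queries` stand in for learned query vectors, there are no key/value projections, and the dimensions are toy values. The actual projector architecture is not specified on this page.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(slice_embs, queries):
    """Pool any number of slice embeddings into len(queries) output tokens.

    `queries` are hypothetical learned vectors; each one attends over all
    slices and returns an attention-weighted average of the slice embeddings.
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)  # standard scaled dot-product attention
    tokens = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, s)) for s in slice_embs]
        weights = softmax(scores)
        tokens.append([sum(w * s[d] for w, s in zip(weights, slice_embs))
                       for d in range(dim)])
    return tokens

# Two patients with different slice counts map to the same number of tokens.
queries = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.0]]
patient_a = attention_pool([[1.0, 2.0], [0.5, 0.5]], queries)
patient_b = attention_pool([[1.0, 2.0]] * 7, queries)
print(len(patient_a), len(patient_b))
```

Because the number of output tokens depends only on the number of queries, downstream agents can consume a fixed-size representation regardless of how many slices or sequences a study contains.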

Applications

SpineAgent is aimed at radiology workflows for spine imaging: it can triage and classify spinal pathology, highlight the slices and regions most relevant to a suspected condition, retrieve similar prior cases or matching reports, and generate a structured draft report to accelerate read times. Its patient-level embeddings also serve as a reusable representation for researchers building downstream spine-MRI models without retraining an encoder from scratch. Because the encoders generalize across manufacturers and to an external cohort, the system is relevant to multi-site clinical and research settings where imaging hardware and protocols vary.
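
Similar-case lookup over patient-level embeddings, and the Recall@k metric used to score it, can be sketched with plain cosine similarity. The example below is illustrative only: the function names, the toy embeddings, and the use of brute-force ranking are assumptions, and no pretrained SpineAgent embeddings are involved.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query, gallery, k):
    """Indices of the k gallery embeddings most similar to the query."""
    ranked = sorted(range(len(gallery)),
                    key=lambda i: cosine(query, gallery[i]),
                    reverse=True)
    return ranked[:k]

def recall_at_k(queries, gallery, targets, k):
    """Fraction of queries whose true match (targets[i]) appears in the top k."""
    hits = sum(1 for q, t in zip(queries, targets) if t in top_k(q, gallery, k))
    return hits / len(queries)

# Toy gallery of report embeddings and two image queries with known matches.
gallery = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
queries = [[0.9, 0.1], [0.1, 0.9]]
print(top_k(queries[0], gallery, 2))
print(recall_at_k(queries, gallery, targets=[0, 1], k=1))
```

Recall@5, the retrieval figure quoted under Technical Details, corresponds to calling `recall_at_k` with `k=5` over a held-out set of image-report pairs.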

Impact

SpineAgent shows that self-supervised foundation modeling on large routine clinical corpora can unify spine-MRI interpretation tasks that were previously addressed by separate, narrowly trained models, while generalizing across the manufacturer and cohort shifts that often degrade medical imaging models. The training pipeline (DINOv3 encoders, CLIP alignment, synthesizer routing, and the report-generation stack) is released under Apache-2.0 on GitHub. However, the work is a June 2026 bioRxiv preprint that has not yet been peer reviewed. The underlying clinical imaging data cannot be shared for privacy reasons, and pretrained weights are not yet publicly available, which currently limits external reproducibility and direct clinical deployment.

Citation

A multi-agent system for spine MRI report generation from multi-sequence imaging

Xiao, Z., et al. (2026) A multi-agent system for spine MRI report generation from multi-sequence imaging. bioRxiv.

DOI: 10.64898/2026.06.07.730703


Citations

Total Citations: 0
Influential: 0
References: 52

GitHub

Stars: 4
Forks: 0
Open Issues: 0
Contributors: 1
Last Push: 2mo ago
Language: Jupyter Notebook
License: Apache-2.0


Openness

bio.rodeo openness: Open weights (open weights, closed recipe)
Score: 55 (Partial)
Usability (can I run it?): 63
Reproducibility (can I retrain it?): 48
Model Openness Framework: Unclassified (missing required components)

Tags

image_classification, segmentation, image_text_retrieval, report_generation, vision_transformer, transformer, foundation_model, self_supervised, multimodal, contrastive_learning, spine_mri, radiology

Resources

GitHub Repository
Research Paper