Models (1)
23—12
A medical vision-language model that accepts visual-referring multimodal input and produces pixel-grounded multimodal output, jointly answering questions about and segmenting medical images.
Imaging, Language model