All Competitors

Every biological foundation model, evaluated and ranked by the bio.rodeo team

Showing 1-24 of 35 filtered models

  • MIMO

    1223
    Peking University +1 other | October 11, 2025 | histology, instruction_tuning, multimodal +6

    A medical vision-language model that accepts visual-referring multimodal input and produces pixel-grounded multimodal output, jointly answering and segmenting medical images.

    Imaging, Language model
    Openness: 11
  • Chongqing University of Technology | July 1, 2025 | histology, instruction_tuning, medical_image_understanding +5

    A 4.2B-parameter lightweight biomedical vision-language assistant built on Phi-2 that outperforms larger LLaVA-Med models on medical visual question answering.

    Imaging, Language model
    Openness: 15
  • EyeCLIP

    8147
    The Hong Kong Polytechnic University +5 others | June 21, 2025 | clip, contrastive_learning, cross_modal_retrieval +11

    A CLIP-based visual-language foundation model for multi-modal ophthalmic imaging, enabling zero-shot disease detection across 11 modalities including fundus, OCT, and slit-lamp.

    Imaging, Language model
    Openness: 15
  • QoQ-Med

    5232391
    MIT | May 31, 2025 | ecg, foundation_model, histology +7

    Open multimodal clinical foundation model that jointly reasons over medical images, ECG time-series, and text reports, trained with domain-aware reinforcement learning.

    Imaging, Biosignals, Language model
    Openness: 74
  • UniBiomed

    619215
    Hong Kong University of Science and Technology +2 others | April 30, 2025 | foundation_model, histology, multimodal +6

    Universal foundation model that jointly generates diagnostic text and segments the corresponding targets across ten biomedical imaging modalities.

    Imaging, Language model
    Openness: 64
  • GMAI-VL-R1

    1828
    Shanghai AI Laboratory +6 others | April 2, 2025 | histology, language_model, medical_image_diagnosis +7

    A reinforcement-learning-enhanced general medical vision-language model that adds step-by-step reasoning for medical image diagnosis and visual question answering.

    Imaging, Language model
    Openness: 17
  • MedVLM-R1

    29171419
    Technical University of Munich +2 others | February 26, 2025 | medical_reasoning, multimodal, radiology +4

    A 2B-parameter medical vision-language model that uses reinforcement learning (GRPO) to produce explicit, human-interpretable reasoning for radiology visual question answering.

    Imaging, Language model
    Openness: 83
  • HealthGPT

    1.6K10022
    Zhejiang University +4 others | February 14, 2025 | histology, image_reconstruction, medical_image_generation +7

    Medical large vision-language model unifying image comprehension and generation in one autoregressive framework via heterogeneous LoRA knowledge adaptation.

    Pathology, Imaging
    Openness: 68
  • EndoChat

    504219
    Chinese University of Hong Kong +5 others | January 20, 2025 | endoscopy, grounded_dialogue, instruction_tuning +6

    Grounded multimodal large language model for endoscopic surgery, supporting visual dialogue, region-based question answering, and bounding-box grounding across surgical scene understanding tasks.

    Imaging, Language model
    Openness: 22
  • MUSK

    229246
    Stanford University +1 other | January 8, 2025 | cross_modal_retrieval, foundation_model, histology +9

    A vision-language foundation model for precision oncology that pretrains on 50M pathology images and 1B text tokens via unified masked modeling.

    Pathology, Language model
    Openness: 12
  • MedPLIB

    130337
    Baidu +3 others | December 12, 2024 | histology, medical_image_grounding, mixture_of_experts +8

    Biomedical multimodal LLM with pixel-level insight, combining visual question answering, pixel-grounded prompts, and segmentation via a mixture-of-experts design.

    Imaging, Language model
    Openness: 80
  • BiMediX2

    731620
    Mohamed bin Zayed University of Artificial Intelligence | December 10, 2024 | histology, instruction_tuning, language_model +7

    A bilingual (Arabic-English) bio-medical large multimodal model built on Llama 3.1 for medical image understanding and clinical text conversation.

    Language model, Imaging, Pathology
    Openness: 11
  • MedRegA

    452613
    Hong Kong University of Science and Technology +1 other | October 24, 2024 | histology, image_classification, instruction_tuning +8

    Region-aware bilingual (Chinese-English) medical multimodal LLM that handles image- and region-level vision-language tasks across eight imaging modalities.

    Pathology, Language model
    Openness: 65
  • BiomedGPT

    709373
    Lehigh University +9 others | August 7, 2024 | foundation_model, histology, image_captioning +7

    Open-source, lightweight generalist vision-language foundation model for diverse biomedical imaging and text tasks.

    Language model, Imaging, Pathology
    Openness: 33
  • LLaVA-Tri

    409843
    UC Santa Cruz +3 others | August 6, 2024 | foundation_model, histology, multimodal +5

    A medical multimodal large language model pretrained on the 25M-image MedTrinity-25M dataset, achieving state-of-the-art accuracy on biomedical visual question answering.

    Language model, Pathology
    Openness: 30
  • PathChat

    420
    Mahmood Lab +4 others | July 10, 2024 | cancer, diagnosis, foundation_model +7

    A multimodal vision-language copilot for human pathology that analyzes histology images and answers diverse pathology queries in natural language.

    Pathology, Language model
    Openness: 35
  • Shenzhen Research Institute of Big Data +1 other | June 27, 2024 | histology, instruction_tuning, medical_image_understanding +5

    A family of open medical multimodal LLMs (7B and 34B) trained on PubMedVision, a 1.3M-sample medical VQA dataset distilled from PubMed image-text pairs.

    Pathology, Language model
    Openness: 52
  • MAIRA-2

    1393.3K
    Microsoft Research | June 6, 2024 | chest_x_ray, instruction_tuning, multimodal +6

    Microsoft Research multimodal LLM for grounded chest X-ray report generation, localizing each described finding with bounding boxes on the image.

    Imaging, Language model
    Openness: 35
  • EyeFound

    40
    The Hong Kong Polytechnic University +3 others | May 18, 2024 | disease_diagnosis, foundation_model, masked_autoencoder +7

    A multimodal generalist foundation model for ophthalmic imaging, self-supervised on 2.78M images across 11 modalities for diagnosis, prognosis, and visual question answering.

    Imaging
    Openness: 4
  • MedDr

    982655
    Hong Kong University of Science and Technology | April 23, 2024 | foundation_model, histology, instruction_tuning +7

    A 40B-parameter generalist medical vision-language foundation model spanning radiology, pathology, dermatology, retinography, and endoscopy.

    Imaging, Language model
    Openness: 69
  • Med-MoE

    15882
    Zhejiang University +2 others | April 16, 2024 | histology, image_classification, instruction_tuning +5

    A lightweight mixture-of-experts medical vision-language model that routes between domain-specific experts for VQA and image classification while activating only 30-50% of parameters.

    Imaging, Language model, Pathology
    Openness: 81
  • M3D

    442159873
    Beijing Academy of Artificial Intelligence | March 31, 2024 | ct, image_text_retrieval, instruction_tuning +9

    A multimodal large language model for 3D medical imaging, handling retrieval, report generation, VQA, positioning, and segmentation on CT volumes.

    Imaging, Language model
    Openness: 77
  • CheXagent

    226711.3K
    Stanford University | January 22, 2024 | chest_x_ray, foundation_model, image_classification +7

    An instruction-tuned vision-language foundation model from Stanford for interpreting and summarizing chest X-rays across eight clinical task types.

    Imaging, Language model
    Openness: 32
  • Alibaba Group | October 27, 2023 | histology, instruction_tuning, medical_image_captioning +5

    The first Chinese medical large vision-language model, pairing a pretrained ViT with an LLM to interpret medical images and answer clinical questions in Chinese.

    Language model, Pathology
    Openness: 44