Industry

Microsoft

Multinational technology corporation

22 models (10 Protein, 8 Language model, 6 Imaging, 3 Small molecule, 3 Biosignals, 2 Pathology, 1 Single-cell)

Labs & Groups (3)

Models (22)

Vermeer

Microsoft Research / Broad Institute / Harvard University

Released June 1, 2026

1

Channel-adaptive autoregressive generative model that synthesizes in-silico fluorescence microscopy of protein subcellular localization from amino-acid sequence and cellular landmark stains.

Imaging, Protein

GEMGen

Westlake University / Microsoft Research Asia

Released January 3, 2026

1

A language model that generates small-molecule structures directly from transcriptomic phenotypes — gene up/down-regulation signatures — for phenotype-driven drug discovery.

Small molecule, Single-cell

LLaVA-Rad

Microsoft Research

Released February 20, 2025

631.2K58

Lightweight 7B vision-language foundation model from Microsoft Research that generates radiology findings from chest X-rays; released for research use only under the Microsoft Research License.

Imaging, Language model

NatureLM

Microsoft Research AI for Science

Released February 11, 2025

2828

Unified science foundation model from Microsoft Research treating molecules, proteins, RNA, DNA, and materials as a shared sequence language for cross-domain generation.

Language model, Small molecule, Protein

BioEmu-1

Microsoft

Released December 5, 2024

292830

Generative deep learning model from Microsoft Research that emulates protein equilibrium ensembles at 100,000x the speed of molecular dynamics simulation.

Protein

BiomedParse

Microsoft Research

Released November 18, 2024

1606.8K671

A biomedical foundation model for joint segmentation, detection, and recognition across nine imaging modalities using natural language prompts.

Imaging

SFM-Protein

Microsoft Research

Released October 31, 2024

3

A transformer protein language model using integrative co-evolutionary pre-training to capture both short-range and long-range residue interactions from sequence alone.

Protein

NeuroLM

Shanghai Jiao Tong University / Microsoft

Released August 27, 2024

92156

A multi-task EEG foundation model that treats brain signals as a foreign language, pairing a text-aligned neural tokenizer with a GPT-2 backbone.

Biosignals, Language model

BioT5+

Microsoft Research Asia

Released August 1, 2024

165126

An enhanced T5-based encoder-decoder that unifies molecule, protein, and text understanding via IUPAC integration and multi-task instruction tuning.

Language model, Small molecule, Protein

AlphaFlow-Lit

Microsoft Research

Released July 8, 2024

12

Lightweight variant of AlphaFlow achieving ~47x faster conformational ensemble sampling by fine-tuning only AlphaFold's structure module while keeping the Evoformer frozen.

Protein

MAIRA-2

Microsoft Research

Released June 6, 2024

1393.2K

Microsoft Research multimodal LLM for grounded chest X-ray report generation, localizing each described finding with bounding boxes on the image.

Imaging, Language model

Prov-GigaPath

Microsoft Research

Released May 22, 2024

83861.3K617

Whole-slide pathology foundation model pretrained on 1.3 billion tiles from 171,189 clinical WSIs. Achieves state-of-the-art performance on 25 of 26 pathology benchmark tasks.

Pathology

Distributional Graphormer

Microsoft Research

Released May 13, 2024

1492.5K

Deep learning framework predicting equilibrium distributions of molecular systems, enabling efficient ensemble generation and conformation sampling.

Protein

EEGFormer

Microsoft / ShanghaiTech University

Released January 11, 2024

60

A transformer EEG foundation model pretrained with vector-quantized self-supervision on 1.7 TB of EEG, yielding transferable, interpretable discrete representations.

Biosignals

MMM

Microsoft / South China University of Technology

Released December 10, 2023

82109

A masked-autoencoder EEG pretraining framework that maps any electrode layout to a unified topology for topology-agnostic, cross-dataset representations.

Biosignals

MAIRA-1

Microsoft Research

Released November 22, 2023

87

Microsoft Research multimodal LLM that generates the findings section of a chest X-ray report from a single frontal image using a CXR-specific vision encoder and Vicuna-7B.

Imaging, Language model

EvoDiff

Microsoft Research

Released September 12, 2023

209669

Sequence-first protein generation framework using discrete diffusion over evolutionary alignments, enabling controllable de novo design without structure.

Protein

ABGNN

Huazhong University of Science and Technology / Microsoft Research

Released August 6, 2023

2555

Graph neural network framework for antigen-specific antibody CDR design, combining a pre-trained antibody language model with one-shot sequence and structure generation.

Protein

LLaVA-Med

Microsoft Research

Released June 1, 2023

1.8K 20.5K 2.2K

A biomedical vision-language assistant from Microsoft Research, adapted from LLaVA via curriculum learning on PubMed Central figure-caption pairs and GPT-4-generated instructions.

Pathology, Language model

BiomedCLIP

Microsoft Research

Released March 1, 2023

598898.5K123

Multimodal biomedical foundation model trained on 15M PubMed Central figure-caption pairs via contrastive learning, achieving state-of-the-art zero-shot performance across imaging modalities.

Imaging

BioGPT

Microsoft Research Asia / Microsoft Research

Released October 19, 2022

1.4K 137.9K 4.5K

A GPT-2-based generative transformer pretrained on 15M PubMed abstracts for biomedical text generation and mining, including relation extraction and question answering.

Language model

CARP

Microsoft Research

Released May 19, 2022

259

CNN-based protein language model series showing convolutions match transformer performance on sequence pretraining while scaling linearly with sequence length.

Protein