Models (11)
A VAE trained on scRNA-seq reference data and applied with frozen weights at inference to impute unmeasured genes and denoise spatial transcriptomics profiles.
Latent flow-matching foundation model that predicts pan-cancer spatially-resolved single-cell gene expression directly from routine H&E histology slides.
A LoRA adapter on ProstT5 that predicts per-residue probability distributions over Foldseek 3Di tokens, capturing sequence-encoded conformational flexibility from MD trajectories.
Scooby
Technical University of Munich / Helmholtz Munich / Harvard Medical School / Broad Institute / Harvard University
Released October 1, 2025
Predicts single-cell-resolution scRNA-seq coverage and scATAC-seq insertion profiles directly from DNA sequence by adapting the Borzoi predictor with a cell-specific decoder.
cxt (Coalescence and Translation LM)
University of Oregon / Technical University of Munich
Released June 24, 2025
Decoder-only transformer that reframes ancestral recombination graph inference as next-token prediction, estimating coalescence times from genetic variation at scale.
EndoChat
Chinese University of Hong Kong / Huawei / Technical University of Munich / University of Strasbourg / Shandong University / Chinese Academy of Sciences
Released January 20, 2025
Grounded multimodal large language model for endoscopic surgery, supporting visual dialogue, region-based question answering, and bounding-box grounding across surgical scene understanding tasks.
Transformer foundation model pretrained on 110M single-cell and spatially resolved transcriptomics profiles, enabling spatial context prediction for dissociated cells.
A bilingual protein language model that translates bidirectionally between amino acid sequences and the 3Di structural alphabet, enabling inverse folding and structure-aware embeddings.
Masked DNA language model trained on 800+ species spanning 500M years of evolution, using explicit species conditioning to capture conserved regulatory elements.
Optimized protein language model that exceeds prior state-of-the-art performance while using fewer than 10% of the parameters of comparable models.
A suite of six protein language models, including ProtBERT and ProtT5, trained on up to 393 billion amino acids using large-scale HPC infrastructure.