bio.rodeo

Imaging · Single-cell · Spatial omics

COSMIC

EPFL

Bidirectional generative framework linking single-cell nuclear morphology and gene expression, built on a morphology foundation model trained on 21M+ segmented nuclei.

Released: January 2026

COSMIC is a bidirectional generative framework that quantifies the relationship between the physical morphology of a single cell and its underlying gene expression program. Developed by Si Wen, Ramon Viñas, Maria Brbić, Bart Deplancke, and colleagues at EPFL and posted to bioRxiv in January 2026, the model addresses a long-standing gap in cell biology: although microscopy and transcriptomics each describe cell state, the field has lacked both large paired datasets and computational tools capable of modeling how transcriptional programs give rise to cellular form, and vice versa.

The central problem COSMIC tackles is cross-modal decomposition. Given a cell's image, how much of its morphological variance is explained by gene expression, and given its transcriptome, how much variance is reflected in morphology? COSMIC answers these questions generatively by coupling a morphology foundation model, pretrained on over 21 million segmented nuclei, with existing transcriptomic embeddings. This bidirectional design lets the model translate in either direction between the two modalities, rather than treating one as a fixed predictor of the other.
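One way to make the two questions concrete is to measure, in each direction, the fraction of held-out variance in one embedding that a regression from the other embedding recovers. The sketch below is not COSMIC's generative method. It is a linear ridge-regression baseline on synthetic data, and every name in it (`ridge_fit`, `variance_explained`, `morph`, `expr`, the penalty `alpha`) is an illustrative assumption rather than something taken from the preprint.

```python
import numpy as np

def ridge_fit(X, Y, alpha=1.0):
    """Closed-form ridge weights mapping centered X to centered Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ Y)

def variance_explained(X, Y, alpha=1.0, train_frac=0.8, seed=0):
    """Held-out fraction of total variance in Y recovered by a ridge map from X."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_train = int(train_frac * len(X))
    tr, te = idx[:n_train], idx[n_train:]
    # Center with training statistics only, so no test information leaks in.
    x_mu, y_mu = X[tr].mean(axis=0), Y[tr].mean(axis=0)
    W = ridge_fit(X[tr] - x_mu, Y[tr] - y_mu, alpha)
    pred = (X[te] - x_mu) @ W + y_mu
    ss_res = np.sum((Y[te] - pred) ** 2)
    ss_tot = np.sum((Y[te] - Y[te].mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in for paired cells (hypothetical data, not IRIS measurements):
# a shared latent state drives both modalities, plus modality-specific noise.
rng = np.random.default_rng(42)
n_cells = 2000
shared = rng.normal(size=(n_cells, 5))
morph = shared @ rng.normal(size=(5, 16)) + 2.0 * rng.normal(size=(n_cells, 16))
expr = shared @ rng.normal(size=(5, 32)) + 0.5 * rng.normal(size=(n_cells, 32))

r2_expr_to_morph = variance_explained(expr, morph)  # morphology explained by expression
r2_morph_to_expr = variance_explained(morph, expr)  # expression reflected in morphology
print(f"expression -> morphology: {r2_expr_to_morph:.2f}")
print(f"morphology -> expression: {r2_morph_to_expr:.2f}")
```

The two numbers need not match: a modality with more private noise is harder to predict, which is why the question has to be asked separately in each direction.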

A key enabler is IRIS, a measurement technology that captures high-resolution images and matched transcriptomes from the same single cells at scale. The resulting paired dataset provides the cell-level supervision COSMIC needs to learn cross-modal structure. This places COSMIC alongside emerging multimodal cell foundation models, while its explicit focus on nuclear morphology sets it apart.

#Key Features

  • Bidirectional generation: COSMIC models the flow of information in both directions, decomposing how much transcriptional variance is reflected in morphology and how much morphological variance is explained by gene expression.
  • Morphology foundation model backbone: The framework builds on a foundation model pretrained on more than 21 million segmented nuclei, giving it a rich, transferable representation of cellular form before any transcriptomic coupling.
  • Paired single-cell supervision via IRIS: Training leverages a newly generated dataset acquired with IRIS, which captures matched high-resolution images and transcriptomes from the same individual cells, enabling true cross-modal learning.
  • Continuous state modeling: Beyond discrete cell-type identity, COSMIC captures continuous dynamics such as cell-cycle progression, linking gradual morphological change to gene expression trajectories.
  • Disease-relevant discovery: In prostate cancer cells, the model distinguished chemotherapy-responsive from chemotherapy-resistant populations and surfaced morphology-associated genes tied to tumor state.

#Technical Details

COSMIC is a generative framework that couples a pretrained nuclear-morphology foundation model with transcriptomic embeddings to learn a shared cross-modal representation. The morphology encoder is pretrained with self-supervision on over 21 million segmented single-cell nuclei, and the multimodal coupling is trained on paired image-transcriptome measurements produced by the IRIS platform. The preprint reports that the model reconstructs cell-type identity and recovers continuous processes such as cell-cycle progression, and that in prostate cancer it separated drug-responsive from drug-resistant cells while nominating morphology-linked genes. Precise architecture details, parameter counts, and the name of the underlying nucleus-morphology foundation model are not disclosed in the preprint.

#Applications

COSMIC is aimed at researchers studying the relationship between cell form and function across basic and translational settings. By generating one modality from the other, it allows morphological imaging, which is cheap and high-throughput, to serve as a proxy for transcriptional state, and conversely lets expression data inform expected morphology. Demonstrated applications include cell-type classification, cell-cycle inference, and oncology use cases such as distinguishing chemotherapy-resistant from responsive prostate cancer cells and identifying candidate genes associated with tumor state, which could inform mechanistic studies and drug-response profiling.
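The idea of using imaging as a proxy for transcriptional state can be illustrated with a much simpler, non-generative baseline: given a paired reference set, estimate the expression of an image-only cell by averaging the expression of its nearest reference cells in morphology-embedding space. This is a minimal sketch under stated assumptions, not COSMIC's approach. The function `knn_impute_expression`, the embedding dimensions, and the synthetic two-state data are all hypothetical.

```python
import numpy as np

def knn_impute_expression(query_morph, ref_morph, ref_expr, k=10):
    """Estimate expression for image-only cells by averaging the expression of
    their k nearest paired reference cells in morphology-embedding space."""
    # Squared Euclidean distances, shape (n_query, n_ref).
    d2 = (
        np.sum(query_morph ** 2, axis=1, keepdims=True)
        - 2.0 * query_morph @ ref_morph.T
        + np.sum(ref_morph ** 2, axis=1)
    )
    # Indices of the k smallest distances per query (unordered within the k).
    nn = np.argpartition(d2, kth=k - 1, axis=1)[:, :k]
    return ref_expr[nn].mean(axis=1)

# Hypothetical paired reference: two cell states with distinct form and expression.
rng = np.random.default_rng(0)
n_ref, n_morph_dims, n_genes = 500, 8, 20
state = rng.integers(0, 2, size=n_ref)
morph_centers = np.array([np.zeros(n_morph_dims), np.full(n_morph_dims, 3.0)])
expr_centers = np.array([np.zeros(n_genes), np.full(n_genes, 2.0)])
ref_morph = morph_centers[state] + rng.normal(size=(n_ref, n_morph_dims))
ref_expr = expr_centers[state] + rng.normal(size=(n_ref, n_genes))

# Image-only query cells drawn from state 1; their expression is never observed.
query_morph = morph_centers[1] + rng.normal(size=(50, n_morph_dims))
imputed = knn_impute_expression(query_morph, ref_morph, ref_expr, k=10)
print(imputed.shape, round(float(imputed.mean()), 2))
```

A baseline like this only works to the extent that morphology actually carries expression information, which is the quantity a cross-modal model is meant to estimate.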

#Impact

COSMIC illustrates that paired single-cell imaging and transcriptomics, combined with a morphology foundation model, can capture the bidirectional information flow between cellular form and gene expression rather than treating morphology as a downstream readout. As a January 2026 preprint, its long-term adoption is not yet established, and important caveats apply: the work has not been peer reviewed, no public code or model weights are available, the underlying morphology foundation model is unnamed and unlinked, and the preprint is released under a CC BY-NC license that restricts commercial reuse. These openness gaps currently limit independent reproduction, but the framework points toward a promising direction for mechanistic and predictive single-cell biology.

Openness

bio.rodeo openness: 4, Closed · low usability and reproducibility
Usability (can I run it?): 7
Reproducibility (can I retrain it?): 0, not reproducible
Model Openness Framework: Unclassified
Restrictive license on core components

Tags

generative, cell_type_annotation, gene_expression, diffusion, autoencoder, foundation_model, multimodal, representation_learning, cell_biology, histology

Resources

Research Paper