MIMYR

Carnegie Mellon University

A generative framework that reconstructs missing spatial transcriptomics tissue regions by jointly predicting cell locations, cell types, and gene expression.

Released: November 2025

MIMYR is a generative framework for reconstructing missing regions of spatial transcriptomics (ST) tissue maps. Spatial transcriptomics measures gene expression while preserving the physical coordinates of cells within a tissue, but real datasets are frequently incomplete: tissue sections tear during handling, regions are excluded for quality reasons, or capture areas simply do not cover the full anatomy of interest. These gaps undermine downstream analyses of tissue organization, cell-cell interaction, and spatial gene-expression gradients. MIMYR addresses this by treating reconstruction as a conditional generation problem, filling in absent regions in a way that is consistent with the surrounding observed tissue.

MIMYR was developed by Ajinkya Deshpande, Zhilei Bei, Jian Ma, and Spencer Krieger in the Jian Ma lab at Carnegie Mellon University and released as a bioRxiv preprint in November 2025. It decomposes the reconstruction task into three coupled sub-problems that mirror the structure of ST data itself: where cells sit, what type each cell is, and which genes each cell expresses. Rather than predicting a single modality in isolation, the framework chains specialized models so that each downstream prediction is conditioned on the upstream ones, with the aim of producing spatially coherent and biologically plausible reconstructions.
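
One way to read this chaining, offered as an interpretation rather than the authors' stated formulation, is as a chain-rule factorization of the joint distribution over the missing region. The symbols are introduced here for illustration only: O is the observed surrounding tissue, X the generated cell coordinates, C the cell types, and E the expression profiles.

```latex
p(X, C, E \mid O) \;=\; p(X \mid O)\; p(C \mid X, O)\; p(E \mid X, C, O)
```

Each factor corresponds loosely to one stage: the location model, the cell-type classifier, and the expression model. The identity holds for any joint distribution; whether each MIMYR module is trained as a full probabilistic model of its factor is not stated here.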

This staged, modality-aware design distinguishes MIMYR from imputation methods that operate purely on expression matrices without explicitly modeling cellular spatial arrangement, and positions it as a tool for completing partial tissue atlases.

#Key Features

  • Guided diffusion for cell placement: A diffusion model generates the spatial coordinates of cells in the missing region, guided by the observed surrounding tissue so that reconstructed cell densities and arrangements match the local context.
  • Cell-type classification: A classifier assigns a cell type to each generated location, anchoring the reconstruction in the discrete cellular identities that organize real tissue.
  • Conditioned expression model: A transformer predicts gene expression for each cell conditioned on its inferred location and cell type, completing the full ST profile rather than coordinates or labels alone.
  • Staged, composable pipeline: The three modules run in sequence so each prediction informs the next, allowing the framework to capture dependencies between spatial position, identity, and molecular state.
  • Plug-and-play checkpoints: Pretrained weights download automatically at inference time, letting users reproduce results without retraining, with optional fine-tuning available for adapting to new tissue samples.
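
The staged design in the list above can be sketched as three functions called in sequence, each consuming the previous stage's output. This is a toy illustration only: every function name is hypothetical, and the stage bodies are deliberately trivial stand-ins (jittered resampling, nearest-neighbor labels, per-type mean expression), not MIMYR's diffusion model, classifier, or transformer.

```python
import numpy as np

def sample_locations(observed_xy, n_cells, rng):
    """Stage 1 stand-in: jitter randomly chosen observed cells (NOT a diffusion model)."""
    idx = rng.integers(0, len(observed_xy), size=n_cells)
    return observed_xy[idx] + rng.normal(scale=0.5, size=(n_cells, 2))

def classify_cell_types(new_xy, observed_xy, observed_types):
    """Stage 2 stand-in: copy the label of the nearest observed cell."""
    dists = np.linalg.norm(new_xy[:, None, :] - observed_xy[None, :, :], axis=-1)
    return observed_types[dists.argmin(axis=1)]

def predict_expression(new_types, observed_types, observed_expr):
    """Stage 3 stand-in: mean observed expression of the assigned type.

    A real expression model would also condition on location; this toy ignores it.
    """
    means = {t: observed_expr[observed_types == t].mean(axis=0)
             for t in np.unique(observed_types)}
    return np.stack([means[t] for t in new_types])

def reconstruct(observed_xy, observed_types, observed_expr, n_cells, seed=0):
    """Run the three stages in order; each one feeds the next."""
    rng = np.random.default_rng(seed)
    xy = sample_locations(observed_xy, n_cells, rng)
    types = classify_cell_types(xy, observed_xy, observed_types)
    expr = predict_expression(types, observed_types, observed_expr)
    return xy, types, expr

# Usage with synthetic "observed" tissue: 50 cells, 3 types, 5 genes.
rng = np.random.default_rng(1)
obs_xy = rng.uniform(0, 10, size=(50, 2))
obs_types = rng.integers(0, 3, size=50)
obs_expr = rng.poisson(2.0, size=(50, 5)).astype(float)
xy, types, expr = reconstruct(obs_xy, obs_types, obs_expr, n_cells=20)
```

The point of the sketch is the data flow: coordinates are fixed before types are assigned, and types are fixed before expression is predicted, so an error in an early stage propagates to the later ones.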

#Technical Details

MIMYR is built from three components that run in sequence. The location module is a denoising diffusion probabilistic model (DDPM) that generates two-dimensional cell coordinates for the missing region; it can optionally incorporate kernel-density biological priors that bias placement toward realistic spatial distributions. The cell-type module is a neural-network classifier that labels each generated position. The expression module is a transformer that produces a per-cell gene-expression vector conditioned on the predicted location and cell type.

Inputs and outputs use the AnnData .h5ad format, holding spatial coordinates, cluster labels, and expression matrices. The released pretrained checkpoints currently cover mouse brain tissue only, and in inference mode the framework automatically downloads the required data and model weights. The repository's evaluation reports per-slice reconstruction metrics such as soft accuracy. To adapt the model to their own samples, users can run the training mode, which produces new checkpoints.
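
For readers unfamiliar with DDPMs, the following minimal NumPy sketch shows the standard forward-noising and reverse-sampling equations applied to 2D coordinates. It is generic DDPM, not MIMYR's implementation: the linear schedule, the step count, all function names, and the oracle noise predictor are assumptions made for illustration, and both the context guidance and the kernel-density prior are omitted.

```python
import numpy as np

T = 100                                  # number of diffusion steps (assumed)
betas = np.linspace(1e-4, 0.02, T)       # linear noise schedule (assumed)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def q_sample(x0, t, eps):
    """Forward process: noise clean coordinates x0 to step t in closed form."""
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps

def p_step(x_t, t, eps_hat, rng):
    """One reverse (denoising) step given a noise estimate eps_hat."""
    mean = (x_t - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
    if t == 0:
        return mean                      # no noise is added at the final step
    return mean + np.sqrt(betas[t]) * rng.standard_normal(x_t.shape)

def sample(eps_model, n_cells, rng):
    """Start from Gaussian noise and denoise step by step into 2D coordinates."""
    x = rng.standard_normal((n_cells, 2))
    for t in range(T - 1, -1, -1):
        x = p_step(x, t, eps_model(x, t), rng)
    return x

# Usage: an oracle that knows the target coordinates stands in for a trained network.
target = np.array([[1.0, 2.0], [3.0, -1.0]])

def oracle(x, t):
    # Exact noise implied by x if the clean coordinates were `target`.
    return (x - np.sqrt(alpha_bars[t]) * target) / np.sqrt(1.0 - alpha_bars[t])

coords = sample(oracle, n_cells=2, rng=np.random.default_rng(0))
```

With the oracle predictor, the final reverse step returns the target coordinates up to floating-point error, which makes the sketch easy to sanity-check. In a real model a trained network would replace `oracle`, and guidance terms would modify the noise estimate or the step mean.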

#Applications

MIMYR is aimed at researchers working with spatial transcriptomics atlases who need to recover incomplete or damaged tissue sections. By reconstructing missing regions with coherent cell positions, types, and expression, it can help restore continuity in tissue maps for studies of spatial organization, regional gene-expression patterns, and cellular neighborhoods, and can support the assembly of more complete reference atlases. Because the released checkpoints cover mouse brain only, the model is most directly applicable to neuroscience-oriented ST work; applying it to other tissues requires fine-tuning.

#Impact

MIMYR contributes a modality-decomposed generative approach to a practical and underaddressed problem in spatial transcriptomics: completing tissue maps that are partial by accident or design. As a recent (November 2025) bioRxiv preprint, its broader adoption remains to be established, and the released weights are limited to mouse brain, so generalization to other tissues depends on fine-tuning with suitable data. Its explicit modeling of cell location, identity, and expression as a coupled generative pipeline offers a template that may influence future work on spatially aware reconstruction and imputation. The code is openly available on GitHub, though no software license has been declared, and the preprint is released under CC BY-NC-ND.

Citation

MIMYR: Generative modeling of missing tissue in spatial transcriptomics

Preprint

Deshpande, A., et al. (2025) MIMYR: Generative modeling of missing tissue in spatial transcriptomics. bioRxiv.

DOI: 10.1101/2025.11.24.690239

Recent citations

Papers that recently cited this model.

  • Reconstructing True 3D Spatial Omics at Single-Cell Resolution
    Yuhang Yang, Yiming Luo, Kai Zhang, et al.
    bioRxiv · May 2026 · Citations: 0
  • MORPHE: Bridging Image Generation and Spatial Omics for Tissue Synthesis
    Yuan Feng, Zachary Robers, Leyla Rasheed, et al.
    bioRxiv · Mar 2026 · Citations: 2

Citations

Total Citations: 2
Influential: 0
References: 28

GitHub

Stars: 0
Forks: 0
Open Issues: 0
Contributors: 1
Last Push: 1 month ago
Language: Python

Fields of citing research

  • Biology: 100%
  • Computer Science: 100%
  • Medicine: 50%

Share of papers citing this model.

Openness

bio.rodeo openness: Closed · low usability and reproducibility
Score: 16 (Closed)
Usability (can I run it?): 16
Reproducibility (can I retrain it?): 18
Model Openness Framework: Unclassified
Restrictive license on core components

Tags

cell_biology, cell_type_annotation, data_imputation, diffusion, gene_expression_prediction, generative, self_supervised, spatial_transcriptomics, transformer