bio.rodeo

The authoritative source for evaluating biological foundation models. No hype, just honest analysis.

© 2026 Pulsatance. All rights reserved.
Built by Pulsatance
Single-cell foundation models
Single-cell · Spatial omics

OmniCell

BGI Research

A foundation model jointly pretrained on 67M single-cell and spatial transcriptomic profiles to model intra-cellular expression and inter-cellular spatial dependencies.

Released: December 2025

OmniCell is a transcriptomic foundation model developed by BGI Research (Shenzhen) and released as a preprint in December 2025. It addresses a structural gap in the single-cell foundation-model landscape: most existing models treat each cell in isolation, learning representations from gene-expression vectors while discarding the spatial context in which cells actually reside. Yet tissue function emerges from how cells are arranged and how they communicate with their neighbors. OmniCell is presented as the first foundation model to jointly model intra-cellular gene expression and inter-cellular spatial dependencies within a single unified architecture.

The model is pretrained on a corpus of 67 million combined single-cell and spatial transcriptomic profiles, spanning dissociated single-cell RNA sequencing data and spatially resolved transcriptomics. By learning from both data types together, OmniCell aims to capture not only the regulatory and co-expression structure inside individual cells but also the organizational logic of how cells of different types are positioned relative to one another in tissue. This dual objective is intended to produce representations that transfer across both dissociated and spatial assays.
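To make "how cells are positioned relative to one another" concrete, the sketch below builds a k-nearest-neighbor graph over spot coordinates, a common way to encode spatial context in spatial transcriptomics. It is an illustration only: the source does not describe how OmniCell represents neighborhoods, and the function name `knn_spatial_graph`, the toy coordinates, and the choice of `k` are all assumptions.

```python
import math

def knn_spatial_graph(coords, k):
    """For each spot, return the indices of its k nearest spots (Euclidean)."""
    neighbors = []
    for i, (xi, yi) in enumerate(coords):
        # Distance from spot i to every other spot; the spot itself is excluded.
        dists = [
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(coords)
            if j != i
        ]
        dists.sort()
        neighbors.append([j for _, j in dists[:k]])
    return neighbors

# Hypothetical toy coordinates for five spots on a tissue section.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0), (6.0, 5.0)]
graph = knn_spatial_graph(coords, k=2)
print(graph)
```

A model that uses spatial context could, in principle, aggregate expression over each spot's neighbor list; whether OmniCell does so in this way should be checked against the preprint.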

OmniCell sits alongside single-cell foundation models such as scGPT, Geneformer, and UCE on the expression side and spatial-transcriptomics methods on the tissue side, but distinguishes itself by unifying the two regimes rather than specializing in either. It targets zero-shot use on cell-type deconvolution, spatial domain delineation, and co-expression reconstruction, without task-specific retraining.

#Key Features

  • Joint intra- and inter-cellular modeling: OmniCell simultaneously learns within-cell gene-expression structure and between-cell spatial dependencies, rather than treating cells as independent observations as most single-cell foundation models do.
  • Large-scale unified pretraining: The model is pretrained on 67 million single-cell and spatial transcriptomic profiles, drawing on both dissociated and spatially resolved assays within one corpus.
  • Zero-shot cell-type deconvolution: OmniCell can deconvolve mixed-cell spatial spots into their constituent cell types without dataset-specific training.
  • Spatial domain delineation: The learned representations support identification of spatially coherent tissue domains, organizing a tissue section into regions of shared molecular identity.
  • Co-expression reconstruction: The model reconstructs gene co-expression relationships, recovering regulatory and interaction structure that can be degraded by sparse or noisy measurements.
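For readers unfamiliar with the deconvolution task listed above, the following sketch shows a classical baseline: non-negative least squares solved by projected gradient descent, which expresses one spot as a mixture of reference cell-type profiles. This is not OmniCell's method (the model is described as zero-shot and does not require such references); the function `deconvolve_spot`, the signature values, and the step size are hypothetical and chosen only to illustrate what "constituent cell types" means numerically.

```python
def deconvolve_spot(spot, signatures, steps=2000, lr=0.1):
    """Estimate non-negative cell-type proportions for one spot.

    spot:       expression values, one per gene
    signatures: one reference profile per cell type, in the same gene order
    lr:         step size; it must be tuned to the scale of the signatures
    """
    n_types = len(signatures)
    n_genes = len(spot)
    weights = [1.0 / n_types] * n_types
    for _ in range(steps):
        # Residual between the current mixture and the observed spot.
        residual = [
            sum(weights[t] * signatures[t][g] for t in range(n_types)) - spot[g]
            for g in range(n_genes)
        ]
        for t in range(n_types):
            grad = sum(residual[g] * signatures[t][g] for g in range(n_genes))
            # Gradient step, then projection back onto non-negative values.
            weights[t] = max(0.0, weights[t] - lr * grad)
    total = sum(weights)
    # Normalize to proportions; leave the weights as-is if they are all zero.
    return [w / total for w in weights] if total > 0 else weights

# Hypothetical reference profiles for two cell types over three genes.
signatures = [
    [1.0, 0.0, 0.5],  # type A
    [0.0, 1.0, 0.5],  # type B
]
# A synthetic spot built as 70% type A and 30% type B.
spot = [0.7, 0.3, 0.5]
proportions = deconvolve_spot(spot, signatures)
print([round(p, 3) for p in proportions])
```

On this synthetic spot the estimate should recover a mixture close to 70% type A and 30% type B; real spots are noisier, and reference-based baselines like this depend heavily on how well the signatures match the tissue.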

#Technical Details

OmniCell is pretrained in a self-supervised fashion on 67 million single-cell and spatial transcriptomic profiles. Its defining design choice is the joint treatment of intra-cellular expression and inter-cellular spatial relationships, which lets one backbone serve both dissociated single-cell and spatially resolved data. The preprint reports zero-shot performance on cell-type deconvolution, spatial domain delineation, and gene co-expression reconstruction, positioning OmniCell as a general-purpose representation learner for transcriptomics rather than a single-task model. Architecture specifications, parameter counts, and full benchmark tables are given in the preprint; precise figures should be confirmed against the published version once the work has been peer reviewed.

#Applications

OmniCell is intended for researchers working across single-cell and spatial transcriptomics who need a single pretrained backbone that operates in both regimes. Practical use cases include deconvolving spatial spots into cell-type composition, mapping spatial tissue domains in development and disease, and reconstructing co-expression networks from sparse data. Because the reported tasks are zero-shot, the model could lower the barrier for groups that lack the labeled data or compute needed to train task-specific models, particularly in spatial-omics settings where annotated references are scarce.
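As a point of reference for the co-expression use case, the sketch below computes a plain gene-by-gene Pearson correlation matrix from a cells-by-genes table. This is the conventional baseline whose estimates degrade when measurements are sparse, not a description of how OmniCell reconstructs co-expression; the helper names and the toy expression values are assumptions made for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        # A constant gene has no defined correlation; reported as 0 here.
        return 0.0
    return cov / math.sqrt(vx * vy)

def coexpression_matrix(expr):
    """expr is a list of cells, each a list of gene values.

    Returns a genes-by-genes correlation matrix.
    """
    genes = list(zip(*expr))  # transpose to one tuple per gene
    return [[pearson(gi, gj) for gj in genes] for gi in genes]

# Hypothetical toy data: four cells, three genes.
expr = [
    [1.0, 2.0, 5.0],
    [2.0, 4.0, 3.0],
    [3.0, 6.0, 4.0],
    [4.0, 8.0, 1.0],
]
corr = coexpression_matrix(expr)
print([[round(v, 2) for v in row] for row in corr])
```

In this toy table the first two genes rise together across cells while the third falls, so the first pair correlates positively and the third gene correlates negatively with both. With dropout-heavy single-cell data, many entries are zero and such raw correlations become unreliable, which is the gap a learned model is meant to address.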

#Impact

OmniCell is notable for unifying dissociated and spatially resolved transcriptomics under one pretraining objective, a direction the field has been moving toward as spatial assays proliferate. Its real influence will depend on independent validation and broader adoption. A significant caveat is availability: at the time of writing, the preprint provides no public code or model weights and is released under an all-rights-reserved license, which constrains reproducibility and reuse until the authors release artifacts or relax the terms.

Openness

bio.rodeo openness: Closed · low usability and reproducibility
9 · Closed
Usability (can I run it?): 7
Reproducibility (can I retrain it?): 10
Model Openness Framework: Unclassified
Restrictive license on core components

Tags

cell_type_annotation, spatial_domain_identification, gene_expression, transformer, foundation_model, self_supervised, zero_shot, transcriptomics, spatial_transcriptomics

Resources

Research Paper