OKR-CELL

Cross-modal single-cell foundation model that aligns gene-expression profiles with LLM-enriched cell descriptions in a shared embedding space.

Released: January 2026

OKR-CELL is a cross-modal single-cell foundation model that aligns gene-expression profiles with natural-language descriptions of cells, released as a preprint in January 2026 by researchers at BGI Research. Most single-cell foundation models learn purely from expression data; OKR-CELL instead builds a shared embedding space between cells and text, so that a cell's transcriptome can be matched to a description of its type, tissue of origin, and biological context, and vice versa. The "OKR" in the name reflects two ideas central to the model: incorporating Open-world Knowledge and achieving Robust alignment across the two modalities.

The model's first innovation is how it produces the text side of each cell-text pair. Cell metadata alone is sparse, so OKR-CELL uses a large-language-model workflow with retrieval-augmented generation (RAG) to enrich each cell's textual description with open-world biological knowledge drawn from the literature, turning a handful of metadata fields into a richer narrative description. The second innovation addresses the noise this introduces: automatically generated descriptions and metadata are imperfect, so naively aligning cells to text would propagate errors.
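The enrichment step can be pictured with a toy retrieval-augmented prompt builder. This is a minimal sketch under stated assumptions, not the authors' pipeline: the corpus, the metadata field names, the word-overlap scoring rule, and the function names `tokenize`, `retrieve`, and `build_prompt` are all hypothetical, and the actual LLM call is left out.

```python
# Toy sketch of retrieval-augmented description building.
# Assumption: the corpus, field names, and scoring rule are illustrative only.

def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return set(cleaned.split())

def retrieve(query, corpus, k=2):
    """Rank passages by word overlap with the query and return the top k."""
    q = tokenize(query)
    ranked = sorted(corpus, key=lambda p: len(q & tokenize(p)), reverse=True)
    return ranked[:k]

def build_prompt(metadata, corpus, k=2):
    """Combine sparse metadata with retrieved passages into one prompt string."""
    query = " ".join(str(v) for v in metadata.values())
    passages = retrieve(query, corpus, k)
    fields = "; ".join(f"{key}: {value}" for key, value in metadata.items())
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Describe this cell using the metadata and the background passages.\n"
        f"Metadata: {fields}\n"
        f"Background:\n{context}"
    )

corpus = [
    "CD8 T cells are cytotoxic lymphocytes that kill infected cells.",
    "Hepatocytes are the main parenchymal cells of the liver.",
    "Alveolar macrophages reside in the lung and clear inhaled particles.",
]
metadata = {"cell_type": "hepatocyte", "tissue": "liver"}
prompt = build_prompt(metadata, corpus, k=1)
print(prompt)
```

A real system would presumably replace the word-overlap ranking with a learned retriever over the literature and pass the prompt to an LLM; the sketch only shows how a few metadata fields become a knowledge-grounded request.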

To handle that noise, OKR-CELL introduces a Cross-modal Robust Alignment (CRA) objective that combines sample-reliability assessment, curriculum learning, and coupled-momentum contrastive learning. This makes the alignment resistant to mislabeled or low-quality pairs. Pretrained on roughly 32 million cell-text pairs, OKR-CELL reports state-of-the-art results across a broad suite of single-cell tasks and adds genuinely cross-modal capabilities, such as retrieving cells from a text query.
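To make the idea of reliability-aware alignment concrete, the following sketch weights a standard InfoNCE contrastive loss by a per-pair reliability score and uses a linearly relaxed cutoff as a simple curriculum. It is an illustration under assumptions, not the CRA objective itself: the loss covers only the cell-to-text direction, the reliability scores are given rather than estimated, the coupled-momentum component is omitted, and all function names and default values are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def info_nce_row(sim_row, positive_index, temperature):
    """Negative log-softmax of the positive entry, computed stably."""
    logits = [s / temperature for s in sim_row]
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_denominator - logits[positive_index]

def weighted_contrastive_loss(cell_embs, text_embs, reliability,
                              threshold=0.0, temperature=0.1):
    """Cell-to-text InfoNCE, weighted by reliability.

    Pairs whose reliability is below `threshold` are skipped entirely.
    """
    sims = [[cosine(c, t) for t in text_embs] for c in cell_embs]
    total, weight_sum = 0.0, 0.0
    for i, w in enumerate(reliability):
        if w < threshold:
            continue
        total += w * info_nce_row(sims[i], i, temperature)
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0

def curriculum_threshold(epoch, total_epochs, start=0.8, end=0.2):
    """Linearly relax the reliability cutoff so noisier pairs enter later."""
    fraction = epoch / max(total_epochs - 1, 1)
    return start + (end - start) * fraction

# Three toy pairs; the third text is deliberately mismatched and marked unreliable.
cells = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
texts = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
reliability = [0.9, 0.9, 0.1]

loss_all = weighted_contrastive_loss(cells, texts, reliability, threshold=0.0)
loss_strict = weighted_contrastive_loss(cells, texts, reliability, threshold=0.5)
print(loss_all, loss_strict)
```

In this toy setup the mismatched pair dominates the loss when it is included and contributes nothing once the cutoff excludes it, which is the behavior a noise-tolerant objective is meant to exploit early in training.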

Key Features

Cell-language alignment: Learns a shared embedding space between single-cell expression profiles and textual descriptions, enabling bidirectional cell-text retrieval in addition to standard single-cell tasks.
Open-world knowledge enrichment: Uses an LLM-based retrieval-augmented generation workflow to expand sparse cell metadata into knowledge-rich textual descriptions grounded in the biomedical literature.
Cross-modal Robust Alignment (CRA): A noise-tolerant training objective combining sample-reliability assessment, curriculum learning, and coupled-momentum contrastive learning to guard against imperfect, automatically generated text labels.
Broad task coverage: A single pretrained model supports cell clustering, cell-type annotation, batch-effect correction, few-shot annotation, zero-shot annotation, and bidirectional cell-text retrieval.

Technical Details

OKR-CELL pairs a transformer-based cell encoder, which represents each cell as a sequence of gene tokens with expression and gene-identity embeddings, with a text encoder over the LLM-generated descriptions; the two are aligned by the contrastive CRA objective. The text pipeline retrieves relevant literature and synthesizes enriched descriptions for each cell, filtered for semantic consistency, while the CRA objective weights training pairs by an estimated reliability score and schedules them via curriculum learning, with a coupled-momentum memory bank providing stable negative samples. The model is pretrained on approximately 32 million cell-text pairs curated from public single-cell repositories together with their associated metadata.
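The momentum and memory-bank ingredients can be sketched with the generic MoCo-style recipe: a slowly moving copy of the encoder parameters updated by exponential moving average, plus a fixed-size queue of past embeddings used as negatives. This is a swapped-in, well-known technique shown for orientation only. How OKR-CELL couples the momentum encoders across the two modalities is not described here, and the names `momentum_update` and `MemoryBank`, along with the momentum value, are assumptions.

```python
from collections import deque

def momentum_update(online_params, momentum_params, m=0.99):
    """Exponential moving average: p_momentum <- m * p_momentum + (1 - m) * p_online."""
    return [m * pm + (1.0 - m) * po
            for pm, po in zip(momentum_params, online_params)]

class MemoryBank:
    """Fixed-size FIFO queue of embeddings used as negative samples."""

    def __init__(self, capacity):
        # deque with maxlen silently drops the oldest entries when full
        self.queue = deque(maxlen=capacity)

    def enqueue(self, embeddings):
        """Add a batch of embeddings, evicting the oldest if over capacity."""
        self.queue.extend(embeddings)

    def negatives(self):
        """Return the stored embeddings, oldest first."""
        return list(self.queue)

# Toy usage: the momentum parameters drift slowly toward the online ones.
online = [1.0, 2.0]
slow = [0.0, 0.0]
for _ in range(3):
    slow = momentum_update(online, slow, m=0.9)

bank = MemoryBank(capacity=3)
bank.enqueue([[0.1, 0.2], [0.3, 0.4]])
bank.enqueue([[0.5, 0.6], [0.7, 0.8], [0.9, 1.0]])
print(slow, bank.negatives())
```

Because the momentum copy changes slowly, embeddings stored in the queue stay comparable across training steps, which is what makes them usable as stable negatives.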

Across six downstream tasks (cell clustering, cell-type annotation, batch-effect correction, few-shot annotation, zero-shot annotation, and bidirectional cell-text retrieval), the authors report state-of-the-art performance relative to existing single-cell foundation models and cross-modal baselines, along with robustness to gene dropout. Exact parameter counts and full per-task tables are given in the paper. At the time of writing, no public weights, code, or license were available, which limits independent reproduction. A separately posted bioRxiv version of this work was subsequently withdrawn, so the arXiv preprint is treated as the reference here.
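One way a reader might probe robustness to gene dropout is to zero out a random fraction of genes in a profile before embedding it and then compare downstream results. The sketch below shows only the perturbation step. It assumes that "gene dropout" means independently zeroing expression values, which may differ from the paper's protocol, and the function names and toy profile are hypothetical.

```python
import random

def apply_gene_dropout(expression, rate, seed=0):
    """Zero out each gene's value independently with probability `rate`."""
    rng = random.Random(seed)  # seeded for reproducible perturbations
    return [0.0 if rng.random() < rate else x for x in expression]

def fraction_retained(original, perturbed):
    """Share of originally nonzero genes that are still nonzero."""
    nonzero = [i for i, x in enumerate(original) if x != 0]
    if not nonzero:
        return 1.0
    kept = sum(1 for i in nonzero if perturbed[i] != 0)
    return kept / len(nonzero)

# Toy expression profile (arbitrary values, not real data).
profile = [5.0, 0.0, 2.0, 7.0, 1.0, 0.0, 3.0, 4.0]
for rate in (0.0, 0.25, 0.5):
    perturbed = apply_gene_dropout(profile, rate, seed=42)
    print(rate, fraction_retained(profile, perturbed))
```

Sweeping the rate and re-running an annotation or retrieval metric on the perturbed profiles would give a simple degradation curve for any encoder under test.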

Applications

OKR-CELL is aimed at single-cell transcriptomics workflows where labeled reference atlases are scarce or where annotation must transfer to new datasets. Its zero-shot and few-shot annotation capabilities let researchers assign cell types without a matched, fully labeled reference, and its batch-effect correction supports integrating data across studies and platforms. The cell-text retrieval capability is distinctive: a researcher can query the cell space with a natural-language description, or retrieve a textual characterization for an observed cell state, which is useful for hypothesis generation and for navigating large, heterogeneous single-cell collections.
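Once cells and text live in one embedding space, both zero-shot annotation and text-to-cell retrieval reduce to nearest-neighbor search by cosine similarity. The sketch below illustrates that idea with hand-written three-dimensional vectors. The embeddings, label names, and function names are made up for illustration, since no public encoder was available to produce real ones.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def zero_shot_annotate(cell_embedding, label_embeddings):
    """Return the label whose text embedding is closest to the cell embedding."""
    return max(label_embeddings,
               key=lambda name: cosine(cell_embedding, label_embeddings[name]))

def retrieve_cells(text_embedding, cell_embeddings, k=2):
    """Return indices of the k cells most similar to a text query embedding."""
    ranked = sorted(range(len(cell_embeddings)),
                    key=lambda i: cosine(text_embedding, cell_embeddings[i]),
                    reverse=True)
    return ranked[:k]

# Hypothetical embeddings standing in for text-encoder and cell-encoder outputs.
label_embeddings = {
    "T cell": [0.9, 0.1, 0.0],
    "hepatocyte": [0.0, 0.2, 0.9],
    "macrophage": [0.1, 0.9, 0.1],
}
cell_embeddings = [
    [0.8, 0.2, 0.1],
    [0.1, 0.1, 0.95],
    [0.2, 0.85, 0.0],
]

labels = [zero_shot_annotate(c, label_embeddings) for c in cell_embeddings]
hits = retrieve_cells(label_embeddings["macrophage"], cell_embeddings, k=2)
print(labels, hits)
```

The same similarity matrix serves both directions: taking the best label per cell gives annotation, and taking the best cells per description gives retrieval.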

Impact

OKR-CELL extends single-cell foundation models into the cross-modal, language-aware regime, joining a broader trend of grounding biological representations in text and open-world knowledge. Its emphasis on noise-robust alignment is a practical contribution, since LLM-generated and metadata-derived labels are inherently imperfect, and the CRA objective offers one recipe for learning from them without overfitting to errors. The model's longer-term influence will depend on the release of artifacts and on independent benchmarking. As of the preprint, no code, weights, or license were public, evaluation rests on the authors' own task suite, and a parallel preprint version was withdrawn. Together, these leave reproducibility as the key open question.

Citation

Open World Knowledge Aided Single-Cell Foundation Model with Robust Cross-Modal Cell-Language Pre-training

Preprint

Wang, H., et al. (2026) Open World Knowledge Aided Single-Cell Foundation Model with Robust Cross-Modal Cell-Language Pre-training. arXiv.org.

DOI: 10.48550/arXiv.2601.05648


Citations

Total citations: 0

Influential: 0

References: 59


Openness

bio.rodeo openness: Closed (low usability and reproducibility)

Score: 23 (Closed)

Usability (can I run it?): 15

Reproducibility (can I retrain it?): 18

Model Openness Framework: Unclassified (missing required components)

Resources

Research Paper
