CLM-X

Multimodal single-cell foundation model whose multiway Transformer jointly models scRNA-seq and scATAC-seq from RNA-only, ATAC-only, or paired inputs.

Released: February 2026

CLM-X is a multimodal single-cell foundation model that jointly models single-cell RNA sequencing (scRNA-seq) and single-cell ATAC sequencing (scATAC-seq) within a single architecture. Transformer-based cell language models (CLMs) have become powerful tools for learning transferable cell representations, but most operate on a single modality. As multimodal single-cell profiling grows, the field has lacked a unified, flexible foundation model that handles gene-expression and chromatin-accessibility data together and still works on datasets where only one modality is available.

Developed by Bowen Li and colleagues at the Hangzhou Institute of Medicine, Chinese Academy of Sciences, and posted to bioRxiv in February 2026, CLM-X is built on a multiway Transformer architecture. It uses a harmonized tokenization design and a stage-wise masked-reconstruction pretraining strategy so that RNA-only, ATAC-only, and paired RNA-ATAC inputs can all be processed within one framework. The multiway design lets the model route different modalities through shared and modality-specific pathways, learning representations that transfer across data types.
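The routing idea can be illustrated with a toy example. The sketch below is not the CLM-X implementation; this page does not describe the layer layout, so it assumes a common multiway pattern in which self-attention is shared across all tokens and each token then passes through a feed-forward "expert" chosen by its modality. All names (`multiway_block`, `experts`, the `"rna"`/`"atac"` tags) and the tiny dimensions are hypothetical.

```python
import math
import random


def linear(x, w):
    """Multiply a vector x (length d_in) by a weight matrix w (d_in x d_out)."""
    return [sum(xi * w[i][j] for i, xi in enumerate(x)) for j in range(len(w[0]))]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def shared_attention(tokens):
    """Single-head self-attention without learned projections (shared pathway)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        weights = softmax([sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens])
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out


def multiway_block(tokens, modalities, experts):
    """Shared attention over all tokens, then a modality-specific feed-forward expert."""
    mixed = shared_attention(tokens)
    out = []
    for h, x, m in zip(mixed, tokens, modalities):
        h = [a + b for a, b in zip(h, x)]                   # residual around attention
        ff = [max(0.0, v) for v in linear(h, experts[m])]   # expert picked by modality tag
        out.append([a + b for a, b in zip(h, ff)])          # residual around the expert
    return out


def random_matrix(rng, d):
    """Small random d x d weight matrix standing in for learned parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(d)]


rng = random.Random(0)
D = 4  # toy embedding width
experts = {"rna": random_matrix(rng, D), "atac": random_matrix(rng, D)}
rna_tokens = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(3)]
atac_tokens = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(2)]

# The same block accepts RNA-only input or a concatenated paired input.
rna_only = multiway_block(rna_tokens, ["rna"] * 3, experts)
paired = multiway_block(rna_tokens + atac_tokens, ["rna"] * 3 + ["atac"] * 2, experts)
```

In this sketch an RNA-only cell never touches the ATAC expert, which is one way a single set of weights could serve unimodal and paired inputs alike.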

CLM-X is pretrained on million-scale unimodal and multimodal datasets and evaluated on five downstream tasks across ten benchmark datasets. The work is distributed under a CC BY-NC 4.0 license.

Key Features

Multiway Transformer: A flexible multiway architecture jointly models scRNA-seq and scATAC-seq, accommodating RNA-only, ATAC-only, and paired inputs in one framework.
Harmonized tokenization: A unified tokenization design represents expression and accessibility consistently, enabling shared modeling across modalities.
Stage-wise masked pretraining: A staged masked-reconstruction self-supervised strategy pretrains the model on million-scale data.
Five downstream tasks: Evaluated on batch correction, modality integration, cross-modal translation, cell type annotation, and perturbation prediction.
Strength in translation and perturbation: Reports particularly clear advantages in RNA-ATAC cross-modal translation and genetic-perturbation-response prediction.
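To make the harmonized tokenization feature concrete, here is a minimal sketch of one way expression and accessibility could share a token format. The actual CLM-X scheme is not specified on this page, so everything here is an assumption for illustration: the `Token` fields, the log-scale binning in `bin_expression`, the choice to binarize accessibility, and the example gene and peak names.

```python
import math
from typing import Dict, List, NamedTuple, Optional


class Token(NamedTuple):
    modality: str   # "rna" or "atac"
    feature: str    # gene symbol or peak identifier
    value_bin: int  # discretized measurement


def bin_expression(count: float, n_bins: int = 8, max_log: float = 8.0) -> int:
    """Map a raw count to one of n_bins bins on a log1p scale (hypothetical scheme)."""
    scaled = min(math.log1p(count), max_log) / max_log
    return min(int(scaled * n_bins), n_bins - 1)


def tokenize_cell(rna: Optional[Dict[str, float]] = None,
                  atac: Optional[Dict[str, int]] = None) -> List[Token]:
    """Build one token list from whichever modalities are present for a cell."""
    tokens: List[Token] = []
    for gene, count in sorted((rna or {}).items()):
        if count > 0:  # keep only expressed genes
            tokens.append(Token("rna", gene, bin_expression(count)))
    for peak, is_open in sorted((atac or {}).items()):
        if is_open:  # keep only accessible peaks
            tokens.append(Token("atac", peak, 1))
    return tokens


rna = {"CD3E": 12, "MS4A1": 0, "GAPDH": 150}
atac = {"chr1:1000-1500": 1, "chr2:500-900": 0}

paired = tokenize_cell(rna=rna, atac=atac)  # paired RNA-ATAC input
rna_only = tokenize_cell(rna=rna)           # RNA-only input
atac_only = tokenize_cell(atac=atac)        # ATAC-only input
```

Because every token carries its modality tag, the three input modes differ only in which tokens are present, not in the token format.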

Technical Details

CLM-X is a multiway Transformer foundation model pretrained with a stage-wise masked-reconstruction objective on million-scale unimodal and multimodal single-cell datasets. A harmonized tokenization scheme lets the model encode scRNA-seq and scATAC-seq consistently, while the multiway design supports RNA-only, ATAC-only, and paired RNA-ATAC inputs without separate models. The authors benchmark CLM-X on ten datasets across five tasks: batch correction, modality integration, cross-modal translation, cell type annotation, and perturbation prediction. They report that it consistently outperforms existing multimodal methods and unimodal foundation models, with the clearest gains in RNA-ATAC cross-modal translation and genetic-perturbation-response prediction. The preprint does not disclose a specific parameter count, and the availability of code and weights is not specified at the time of writing.
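A masked-reconstruction objective of the kind named above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' training code: the stage order in `STAGES`, the `MASK` sentinel, the 30% mask ratio, and the mean-of-visible-values "model" are all placeholders, since this page does not give the actual stages or hyperparameters.

```python
import random

MASK = -1  # hypothetical sentinel; real values here are non-negative bins

# Assumed stage order: unimodal RNA, unimodal ATAC, then paired cells.
STAGES = [("rna",), ("atac",), ("rna", "atac")]


def mask_values(values, mask_ratio, rng):
    """Hide a random subset of positions and return (corrupted copy, masked positions)."""
    n_mask = max(1, round(mask_ratio * len(values)))
    positions = sorted(rng.sample(range(len(values)), n_mask))
    corrupted = list(values)
    for p in positions:
        corrupted[p] = MASK
    return corrupted, positions


def masked_mse(predicted, target, positions):
    """Mean squared error computed only at the masked positions."""
    return sum((predicted[p] - target[p]) ** 2 for p in positions) / len(positions)


def stage_batches(cells, stages):
    """Yield, per stage, the cells whose available modalities match that stage."""
    for modalities in stages:
        yield modalities, [c for c in cells if set(c) == set(modalities)]


rng = random.Random(0)
values = [2, 5, 0, 1, 3, 4, 1, 0, 2, 6]
corrupted, positions = mask_values(values, mask_ratio=0.3, rng=rng)

# Stand-in "model": predict the mean of the visible values at every masked slot.
visible = [v for v in corrupted if v != MASK]
mean_visible = sum(visible) / len(visible)
predicted = [mean_visible if v == MASK else v for v in corrupted]
loss = masked_mse(predicted, values, positions)

cells = [{"rna": [1, 2]}, {"atac": [1, 0]}, {"rna": [3, 0], "atac": [0, 1]}]
schedule = list(stage_batches(cells, STAGES))
```

Scoring the loss only at masked positions is what makes the task self-supervised: the model is graded on values it could not see, so no labels are needed.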

Applications

CLM-X targets integrative single-cell analysis where researchers combine transcriptomic and epigenomic measurements. Its unified modeling supports common workflows: correcting batch effects, integrating modalities, annotating cell types, translating between RNA and ATAC, and predicting responses to genetic perturbations. It is especially useful when datasets are partially paired or single-modality, a frequent situation in real multimodal atlases. Computational biologists building or analyzing single-cell multi-omic atlases are the primary beneficiaries.

Impact

CLM-X extends the single-cell foundation-model paradigm from unimodal expression toward unified RNA-plus-ATAC modeling, addressing an underexplored gap in flexible multimodal pretraining. Its reported advantages in cross-modal translation and perturbation prediction point toward foundation models that reason jointly about gene regulation and expression. CLM-X is currently a February 2026 bioRxiv preprint, and the release of code and weights is not yet confirmed. Independent benchmarking against established multimodal integration tools will determine how broadly it is adopted, and the CC BY-NC 4.0 license restricts commercial use.

Citation

CLM-X: A multimodal single-cell foundation model with flexible multi-way Transformer for unified scRNA-seq and scATAC-seq analysis

Li, B., et al. (2026) CLM-X: A multimodal single-cell foundation model with flexible multi-way Transformer for unified scRNA-seq and scATAC-seq analysis. bioRxiv.

DOI: 10.64898/2026.02.17.704943

Citations

Total citations: 0

Influential: 0

References: 52

Openness

bio.rodeo openness: Closed (low usability and reproducibility)

Overall score: 4 (Closed)

Usability (can I run it?): 6

Reproducibility (can I retrain it?): 0 (not reproducible)

Model Openness Framework: Unclassified (restrictive license on core components)
