An 800M-parameter single-cell foundation model pre-trained on 100 million human cells via a RetNet architecture for cell annotation, perturbation prediction, and gene analysis.
CellFM is a large-scale single-cell foundation model developed by researchers at Sun Yat-sen University and collaborating Chinese institutions. At 800 million parameters trained on approximately 102 million human cells, it represents an eight-fold increase in scale over prior single-species single-cell models and is among the largest models trained exclusively on human transcriptomics data.
The central design argument is human specificity. Prior single-cell foundation models such as UCE and GeneCompass were trained on multi-species datasets; CellFM's authors hypothesize that mixing human and non-human data dilutes the representation of human-specific gene programs and cellular states. By restricting the training corpus to human scRNA-seq data, the model devotes its full capacity to the structure of human gene expression.
CellFM was first posted to bioRxiv in June 2024 and published in Nature Communications in May 2025. Pre-trained weights are available on HuggingFace, and fine-tuning code is available via the project's GitHub repository.
CellFM is built on an ERetNet backbone, a modified Retentive Network that adapts RetNet's retention mechanism to single-cell transcriptomics. Two key architectural changes distinguish CellFM from the base RetNet: a gated bilinear network replaces the standard feedforward sublayer, improving representational capacity for sparse gene expression profiles, and DeepNorm normalization substitutes for conventional LayerNorm to stabilize training at depth. Ablation studies in the published paper show that each modification contributes independently: removing them degrades average AUPR by 0.8% and 0.9%, respectively, on gene function prediction benchmarks.
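The two modifications can be illustrated in a minimal NumPy sketch. The gated-bilinear form below (a sigmoid gate branch modulating a value branch, GLU-style) and the specific DeepNorm alpha are assumptions for illustration; the paper's exact parameterization may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def gated_bilinear_ffn(x, W_gate, W_val, W_out):
    """Gated bilinear sublayer (GLU-style; exact form is an assumption):
    a sigmoid gate branch modulates a linear value branch elementwise
    before the output projection."""
    gate = 1.0 / (1.0 + np.exp(-(x @ W_gate)))  # sigmoid gate, shape (B, d_ff)
    val = x @ W_val                              # value branch, shape (B, d_ff)
    return (gate * val) @ W_out                  # back to model width, (B, d)

def deepnorm_residual(x, sublayer_out, alpha):
    """DeepNorm residual: the residual stream is scaled by alpha before
    LayerNorm, which stabilizes training in very deep stacks."""
    y = alpha * x + sublayer_out
    mu = y.mean(axis=-1, keepdims=True)
    sigma = y.std(axis=-1, keepdims=True)
    return (y - mu) / (sigma + 1e-5)

d, d_ff = 8, 16
x = rng.standard_normal((4, d))                  # 4 tokens of width d
W_gate = rng.standard_normal((d, d_ff)) * 0.1
W_val = rng.standard_normal((d, d_ff)) * 0.1
W_out = rng.standard_normal((d_ff, d)) * 0.1

# DeepNorm prescribes alpha = (2N)^(1/4) for an N-layer encoder;
# N = 24 here is purely illustrative, not CellFM's actual depth.
alpha = (2 * 24) ** 0.25
out = deepnorm_residual(x, gated_bilinear_ffn(x, W_gate, W_val, W_out), alpha)
print(out.shape)  # (4, 8)
```

The gating lets the network suppress uninformative dimensions per token, which is one plausible reason it helps on sparse expression profiles.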
Genes are treated as tokens with expression levels encoded as continuous input features. The model was pre-trained on Huawei's MindSpore framework using distributed training across a large compute cluster; PyTorch-compatible weights are available via HuggingFace. On downstream benchmarks, CellFM outperforms Geneformer, scGPT, scFoundation, UCE, and GeneCompass on cell type annotation (1.6–1.94% AUPR improvement over nearest competitors), perturbation prediction (~1% PCC improvement over scFoundation), and gene function prediction, while also achieving top performance on gene-gene relationship inference tasks.
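The gene-as-token scheme with continuous expression input can be sketched as follows. The additive combination of a gene-identity embedding with a linearly projected scalar expression value is an assumption about the encoding; CellFM's actual input pipeline (normalization, any MLP on the value) may differ.

```python
import numpy as np

rng = np.random.default_rng(1)

n_genes, d = 1000, 8
gene_emb = rng.standard_normal((n_genes, d)) * 0.02  # learned gene-identity table
w_expr = rng.standard_normal((1, d)) * 0.02          # projects a scalar expression value

def encode_cell(gene_ids, expr_values):
    """One token per gene: identity embedding plus the continuous
    expression value projected into the same space (no discretization)."""
    expr = np.asarray(expr_values, dtype=float)[:, None]  # (L, 1)
    return gene_emb[gene_ids] + expr @ w_expr             # (L, d)

# Hypothetical cell with three expressed genes (IDs and values made up).
tokens = encode_cell([3, 17, 256], [0.0, 2.3, 5.1])
print(tokens.shape)  # (3, 8)
```

Encoding expression continuously avoids the information loss of rank- or bin-based schemes used by some earlier models.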
CellFM is suited for computational biologists analyzing human single-cell RNA sequencing data. Its primary use cases include automated cell type annotation for large-scale atlas projects and rare cell type identification, perturbation response prediction for drug discovery and functional genomics screens, gene ontology function inference from expression context, and gene regulatory network reconstruction from learned co-expression embeddings. The model can also generate unified cell embeddings across heterogeneous datasets for batch-corrected comparison of cell states.
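For the unified cell-embedding use case, one common recipe is to mean-pool the final-layer gene-token states into a single vector per cell. Whether CellFM itself pools this way is not stated here; masked mean pooling is an assumption, shown as a minimal sketch.

```python
import numpy as np

def cell_embedding(token_states, mask):
    """Masked mean-pool over gene tokens -> one embedding per cell.
    token_states: (B, L, d) final-layer hidden states.
    mask: (B, L), 1 for real gene tokens, 0 for padding."""
    m = mask[..., None].astype(float)                       # (B, L, 1)
    return (token_states * m).sum(axis=1) / np.clip(m.sum(axis=1), 1.0, None)

rng = np.random.default_rng(2)
states = rng.standard_normal((2, 5, 4))                     # toy batch of 2 cells
mask = np.array([[1, 1, 1, 0, 0],                           # cell 0: 3 genes + padding
                 [1, 1, 1, 1, 1]])                          # cell 1: 5 genes
emb = cell_embedding(states, mask)
print(emb.shape)  # (2, 4)
```

Such pooled embeddings can then feed standard downstream steps, e.g. nearest-neighbor annotation transfer or clustering across batches.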
CellFM's publication in Nature Communications and the availability of pre-trained weights on HuggingFace have made it accessible to the broader single-cell community. Its scale and human specificity offer a meaningful benchmark advance over prior models, particularly on perturbation and gene function tasks where biological signal is subtle. Notable limitations constrain its scope: the model is not applicable to non-human organisms; it targets scRNA-seq specifically and does not natively handle ATAC-seq, spatial transcriptomics, or protein-level data; and performance may degrade with very low-depth sequencing where dropout effects are severe. The original MindSpore training environment adds friction for PyTorch-native workflows, though this is mitigated by the HuggingFace weight release.
Zeng, Y., et al. (2025). CellFM: a large-scale foundation model pre-trained on transcriptomics of 100 million human cells. Nature Communications. DOI: 10.1038/s41467-025-59926-5. First posted as a preprint on bioRxiv, 2024.