GRN-informed single-cell foundation model combining gene regulatory hierarchy priors with long-sequence Mamba modeling for clustering, batch integration, perturbation modeling, and drug response prediction.
RegFormer is a single-cell foundation model developed by BGI Research and first posted to bioRxiv in January 2025 (later published in Nature Communications in 2026) that combines gene regulatory network (GRN) priors with long-sequence Mamba state-space modeling. By incorporating regulatory-hierarchy priors derived from public GRN databases, RegFormer biases its representations toward biologically meaningful regulatory dependencies rather than relying purely on data-driven attention.
The Mamba backbone enables efficient long-sequence modeling, operating over thousands of gene tokens per cell, at lower computational cost than full self-attention transformers. Across clustering, batch integration, perturbation modeling, and drug response prediction benchmarks, the authors report that RegFormer consistently outperforms scGPT, Geneformer, scFoundation, and scBERT.
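To make the cost contrast with self-attention concrete, here is a minimal sketch of a scalar linear state-space recurrence. It is not RegFormer's or Mamba's actual implementation: Mamba uses input-dependent (selective) parameters and a hardware-aware parallel scan, whereas the fixed coefficients `a`, `b`, `c` and the helper names below are illustrative assumptions. The point is only that a recurrence does one state update per token, while full self-attention scores every token pair.

```python
def ssm_scan(x, a=0.9, b=1.0, c=1.0):
    """Run a scalar linear state-space recurrence over a sequence.

    h_t = a * h_{t-1} + b * x_t
    y_t = c * h_t

    Illustrative only: fixed a, b, c stand in for the learned,
    input-dependent parameters a real Mamba layer would use.
    """
    h = 0.0
    ys = []
    for x_t in x:
        h = a * h + b * x_t  # one state update per token
        ys.append(c * h)
    return ys


def scan_step_count(seq_len):
    """State updates needed by the recurrence: linear in sequence length."""
    return seq_len


def attention_pair_count(seq_len):
    """Query-key scores computed by full self-attention: quadratic."""
    return seq_len * seq_len


if __name__ == "__main__":
    # An impulse decays geometrically through the hidden state.
    print(ssm_scan([1.0, 0.0, 0.0], a=0.5))
    # Rough operation counts for a cell with 4,000 gene tokens
    # (4,000 is a hypothetical length chosen for illustration).
    print(scan_step_count(4000), attention_pair_count(4000))
```

For a hypothetical cell with 4,000 gene tokens, the recurrence performs 4,000 state updates while full attention computes 16,000,000 pairwise scores per head, which is why state-space backbones are attractive for sequences of this length.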
RegFormer uses a Mamba-based state-space backbone with GRN-derived priors integrated through gene-token embeddings. Pretraining is self-supervised over a large pan-tissue scRNA-seq corpus. The published paper reports architectural details, training schedule, GRN preprocessing, and comprehensive benchmark comparisons against prior single-cell foundation models (FMs).
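The summary above does not spell out how the regulatory hierarchy enters the gene tokens, so the following is only one plausible sketch, not the published method. It assumes the GRN is given as a directed, acyclic list of (regulator, target) edges, assigns each gene a hierarchy level equal to the longest chain of regulators above it, and orders a cell's gene tokens from upstream regulators to downstream targets. The gene names and the functions `hierarchy_levels` and `order_tokens` are hypothetical.

```python
from collections import defaultdict, deque


def hierarchy_levels(edges):
    """Assign each gene a level: the longest regulator chain above it.

    Level 0 means no known upstream regulator. Assumes the GRN is a
    directed acyclic graph; raises ValueError if a cycle is found.
    """
    targets = defaultdict(list)
    indegree = defaultdict(int)
    genes = set()
    for regulator, target in edges:
        targets[regulator].append(target)
        indegree[target] += 1
        genes.update((regulator, target))

    level = {g: 0 for g in genes}
    # Kahn's algorithm: start from genes with no upstream regulator.
    queue = deque(sorted(g for g in genes if indegree[g] == 0))
    visited = 0
    while queue:
        g = queue.popleft()
        visited += 1
        for t in targets[g]:
            level[t] = max(level[t], level[g] + 1)
            indegree[t] -= 1
            if indegree[t] == 0:
                queue.append(t)
    if visited != len(genes):
        raise ValueError("GRN contains a cycle; this sketch assumes a DAG")
    return level


def order_tokens(expressed_genes, level):
    """Sort gene tokens upstream-to-downstream; genes absent from the GRN go last."""
    last = max(level.values(), default=0) + 1
    return sorted(expressed_genes, key=lambda g: (level.get(g, last), g))


if __name__ == "__main__":
    # Toy GRN with hypothetical gene names (not real regulatory edges).
    toy_edges = [
        ("TF_A", "TF_B"),
        ("TF_B", "GENE_1"),
        ("TF_A", "GENE_1"),
        ("TF_A", "GENE_2"),
    ]
    levels = hierarchy_levels(toy_edges)
    print(order_tokens(["GENE_1", "GENE_X", "TF_B", "TF_A", "GENE_2"], levels))
```

In a model, such a level could also be looked up in a learned embedding table and added to each gene-token embedding instead of, or alongside, determining token order. Either choice is an assumption here; the paper's GRN preprocessing section is the authoritative description.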
RegFormer is suited for single-cell research groups working on perturbation response prediction, drug response modeling, and integrated multi-batch analysis. The GRN-priored architecture is particularly valuable when ground-truth regulatory knowledge is available for the system under study and when interpretable representations are desired.
RegFormer is among the first single-cell foundation models to combine state-space architectures (Mamba) with explicit biological priors (GRNs), establishing a useful template for knowledge-augmented single-cell FMs. The consistent improvements over scGPT, Geneformer, scFoundation, and scBERT on multiple downstream tasks suggest that informative biological priors continue to provide meaningful signal even at the foundation-model scale.
Hu, L., et al. (2026) RegFormer: a single-cell foundation model powered by gene regulatory hierarchies. Nature Communications.
DOI: 10.1038/s41467-026-72198-x
Hu, L., et al. (2025) RegFormer: A Single-Cell Foundation Model Powered by Gene Regulatory Hierarchies. bioRxiv.
DOI: 10.1101/2025.01.24.634217