bio.rodeo

Single-cell

RegFormer

Chinese Academy of Sciences

GRN-informed single-cell foundation model combining gene regulatory hierarchy priors with long-sequence Mamba modeling for clustering, batch integration, perturbation modeling, and drug response prediction.

Released: 2026

Overview

RegFormer is a single-cell foundation model published in Nature Communications in 2026 that combines gene regulatory network (GRN) priors with long-sequence Mamba state-space modeling. By incorporating regulatory-hierarchy priors derived from public GRN databases, RegFormer biases its representations toward biologically meaningful regulatory dependencies rather than relying purely on data-driven attention.

The Mamba backbone enables efficient long-sequence modeling over thousands of gene tokens per cell, with compute that scales linearly in sequence length rather than quadratically as in full self-attention transformers. Across clustering, batch integration, perturbation modeling, and drug response prediction benchmarks, the paper reports that RegFormer consistently outperforms scGPT, Geneformer, scFoundation, and scBERT.
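
To make the cost argument concrete, here is a minimal sketch of a plain single-channel diagonal state-space recurrence. It is not Mamba's selective scan and not RegFormer's implementation; the function name ssm_scan and the parameters a, b, and c are illustrative assumptions. It shows that one pass over L gene tokens keeps a fixed-size state and does work proportional to L, whereas full self-attention scores all L × L token pairs.

```python
def ssm_scan(x, a, b, c):
    """Run a diagonal linear state-space recurrence over one input channel.

    x: list of floats, one value per gene token (length L)
    a: per-state decay factors (length n); |a_i| < 1 keeps the state bounded
    b: per-state input weights (length n)
    c: per-state output weights (length n)

    Illustrative only: the parameters are fixed here, whereas Mamba
    makes them depend on the input at each step.
    """
    h = [0.0] * len(a)  # hidden state: size n, independent of L
    y = []
    for x_t in x:  # single left-to-right pass: O(L * n) work
        h = [a_i * h_i + b_i * x_t for a_i, h_i, b_i in zip(a, h, b)]
        y.append(sum(c_i * h_i for c_i, h_i in zip(c, h)))
    return y


# Toy usage: three tokens, a single state dimension.
outputs = ssm_scan([1.0, 1.0, 1.0], a=[0.5], b=[1.0], c=[1.0])
```

For a cell represented by 4,096 gene tokens, this loop takes 4,096 steps, while full self-attention would compute roughly 16.8 million pairwise scores per head.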

Key Features

  • Gene regulatory network priors: Regulatory hierarchies from public GRN databases shape model architecture and training, biasing learned representations toward known regulatory dependencies.
  • Mamba long-sequence backbone: State-space architecture enables efficient processing of thousands of gene tokens per cell without quadratic attention cost.
  • Multi-task SOTA: Reported to consistently outperform scGPT, Geneformer, scFoundation, and scBERT on clustering, batch integration, perturbation modeling, and drug response prediction.
  • Knowledge-data integration: Demonstrates that regulatory priors provide signal beyond what scale alone delivers.
  • Open code and weights: Published in Nature Communications with code and model weights released for community use.

Technical Details

RegFormer uses a Mamba-based state-space backbone with GRN-derived priors integrated through gene-token embeddings. Pretraining is self-supervised over a large pan-tissue scRNA-seq corpus. The published paper reports architectural details, the training schedule, GRN preprocessing, and comprehensive benchmark comparisons against prior single-cell foundation models.
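
One way a regulatory-hierarchy prior could enter gene-token embeddings is sketched below. This is an assumption for illustration only, and the paper's actual GRN preprocessing and integration may differ. The sketch assigns each gene a hierarchy level from a toy regulator-to-target edge list (assumed acyclic) and adds a per-level vector to that gene's embedding. The names hierarchy_levels and add_hierarchy_prior, along with the gene labels, are hypothetical.

```python
from collections import defaultdict, deque


def hierarchy_levels(genes, edges):
    """Level 0 for genes with no regulators; otherwise 1 + the deepest regulator.

    edges: (regulator, target) pairs. Assumes the network is a DAG and
    raises ValueError if a cycle is found.
    """
    targets = defaultdict(list)
    indegree = {g: 0 for g in genes}
    for regulator, target in edges:
        targets[regulator].append(target)
        indegree[target] += 1

    level = {g: 0 for g in genes}
    queue = deque(g for g in genes if indegree[g] == 0)
    visited = 0
    while queue:  # Kahn-style topological traversal
        g = queue.popleft()
        visited += 1
        for t in targets[g]:
            level[t] = max(level[t], level[g] + 1)
            indegree[t] -= 1
            if indegree[t] == 0:
                queue.append(t)
    if visited != len(genes):
        raise ValueError("GRN contains a cycle; this sketch assumes a DAG")
    return [level[g] for g in genes]


def add_hierarchy_prior(gene_emb, levels, level_emb):
    """Add the vector for each gene's level to that gene's embedding."""
    return [
        [g + p for g, p in zip(gene_emb[i], level_emb[lvl])]
        for i, lvl in enumerate(levels)
    ]


# Toy network with hypothetical gene labels.
genes = ["TF_A", "TF_B", "GENE_C", "GENE_D"]
edges = [("TF_A", "TF_B"), ("TF_B", "GENE_C"), ("TF_A", "GENE_C")]
levels = hierarchy_levels(genes, edges)
```

Real regulatory networks often contain feedback loops, so a practical pipeline would need to handle cycles, for example by collapsing strongly connected components, before computing levels.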

Applications

RegFormer is suited to single-cell research groups working on perturbation response prediction, drug response modeling, and integrated multi-batch analysis. The GRN-informed architecture is particularly valuable when ground-truth regulatory knowledge is available for the system under study and when interpretable representations are desired.

Impact

RegFormer is among the first single-cell foundation models to combine state-space architectures (Mamba) with explicit biological priors (GRNs), establishing a useful template for knowledge-augmented single-cell foundation models. The reported improvements over scGPT, Geneformer, scFoundation, and scBERT on multiple downstream tasks suggest that informative biological priors continue to provide meaningful signal even at foundation-model scale.

Citation

Hu, L., et al. (2026) RegFormer: a single-cell foundation model powered by gene regulatory hierarchies. Nature Communications.

DOI: 10.1038/s41467-026-72198-x

Metrics

Citations

Total Citations: 0
Influential: 0
References: 0

Tags

cell clustering, batch integration, perturbation modeling, drug response prediction, state-space model, Mamba, self-supervised, foundation model, single-cell transcriptome, gene regulatory network

Resources

Research Paper