RegVelo is a Bayesian deep generative model that integrates gene regulatory networks into RNA velocity inference, enabling cell fate mapping and in silico perturbation of transcription factors.
Understanding how cells choose their fates during development requires knowing not only which genes are active but also the regulatory logic that drives changes in gene expression over time. RNA velocity methods estimate the direction and speed of transcriptional change from a single-cell snapshot by comparing unspliced and spliced RNA abundances, but they model each gene independently and ignore the gene regulatory networks (GRNs) that coordinate transcriptional programs. RegVelo addresses this gap by embedding a GRN directly into the dynamics of a Bayesian generative model, producing velocity estimates that are mechanistically grounded in how transcription factors regulate their targets.
RegVelo was developed through a collaboration between the Computational Health Center at Helmholtz Munich (led by Prof. Fabian J. Theis) and the Stowers Institute for Medical Research (led by Dr. Tatjana Sauka-Spengler), with first authorship from doctoral researcher Weixu Wang. The work was published in Cell on May 11, 2026. The method builds on the scVelo and veloVI lineage but departs fundamentally from the one-gene-at-a-time ODE approach: instead of inferring independent splicing kinetics per gene, RegVelo parameterizes the transcription rate of each gene as a function of regulator expression and a learned regulatory weight matrix, coupling all gene dynamics through a shared GRN.
This design makes RegVelo both more biologically realistic and more actionable. Because the regulatory connections are explicit parameters of the model, the framework supports counterfactual reasoning: a transcription factor can be silenced in silico by zeroing out its regulatory edges, and the resulting perturbation to the velocity field reveals which cell fates are suppressed or redirected. Applied to zebrafish neural crest development, RegVelo uncovered tfec as an early driver of pigment cell fate and identified elf1 as a novel pigment lineage regulator, predictions subsequently validated experimentally.
RegVelo is a variational autoencoder that takes unspliced and spliced RNA counts per cell as input. An encoder network maps each cell's expression profile to the parameters of a low-dimensional latent representation (following the veloVI architecture). A decoder then maps a sample from this representation to gene- and cell-specific latent times, along with a splicing rate (β_g) and a degradation rate (γ_g) for each gene. The key innovation is in the transcription rate: rather than treating α_g as a free parameter per cell, RegVelo parameterizes it as α_g = h([W s(t) + b]_g), where W is a learned regulatory weight matrix encoding the GRN, s(t) is the spliced abundance vector at latent time t, b is a base transcription bias term, h is a nonlinear activation function, and [·]_g denotes the g-th component of the vector. This makes transcription a function of regulator activity, explicitly coupling all gene dynamics through the GRN structure.
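The transcription-rate formula above can be illustrated with a small NumPy sketch. It is not RegVelo's implementation: the softplus choice for h, the toy three-gene network, and the function names are assumptions made for illustration, and the splicing kinetics (du/dt = α − βu, ds/dt = βu − γs) are the standard RNA velocity equations used by scVelo-style models rather than something stated in this text.

```python
import numpy as np

def softplus(x):
    # Smooth, non-negative activation; an assumed stand-in for the unspecified h.
    return np.logaddexp(0.0, x)

def transcription_rate(W, s, b):
    # alpha_g = h([W s + b]_g); W[g, r] is the weight of regulator r on target g.
    return softplus(W @ s + b)

def velocity(u, s, W, b, beta, gamma):
    # Standard splicing kinetics with a GRN-dependent transcription rate.
    alpha = transcription_rate(W, s, b)
    du_dt = alpha - beta * u        # transcription minus splicing
    ds_dt = beta * u - gamma * s    # splicing minus degradation
    return du_dt, ds_dt

# Hypothetical 3-gene cascade: gene 0 activates gene 1, gene 1 represses gene 2.
W = np.array([[0.0,  0.0, 0.0],
              [2.0,  0.0, 0.0],
              [0.0, -2.0, 0.0]])
b = np.zeros(3)
beta = np.ones(3)
gamma = np.full(3, 0.5)
u = np.array([0.5, 0.1, 0.1])
s = np.array([1.0, 0.2, 0.2])

alpha = transcription_rate(W, s, b)
du_dt, ds_dt = velocity(u, s, W, b, beta, gamma)
```

In this toy network the activated gene receives a higher transcription rate than the unregulated one, and the repressed gene a lower one, which is the coupling through W that the per-gene models lack.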
Training uses a Gaussian likelihood over predicted spliced and unspliced abundances, optimized by ELBO maximization via stochastic variational inference. The GRN weight matrix W can be initialized from prior knowledge (e.g., ChIP-seq or ATAC-seq-derived networks) or learned de novo from expression data. Across benchmarks on human hematopoiesis and mouse pancreatic endocrinogenesis, RegVelo produced more consistent fate mapping and identification of putative driver genes than both scVelo and veloVI. The package is distributed as a Python library (pip install regvelo, requiring Python ≥3.10) built on the scverse ecosystem and integrates directly with AnnData objects.
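As a rough illustration of the training objective, the sketch below computes a single-sample ELBO from a Gaussian reconstruction term over unspliced and spliced abundances and a KL penalty on the latent variables. The standard normal prior, the diagonal Gaussian posterior, the per-gene scale parameters, and all function names are assumptions for this example; RegVelo's actual objective may include terms not shown here.

```python
import numpy as np

def gaussian_log_likelihood(x, mean, scale):
    # Sum of log N(x | mean, scale^2) over all cells and genes.
    return -0.5 * np.sum(np.log(2.0 * np.pi * scale**2) + ((x - mean) / scale) ** 2)

def kl_standard_normal(mu, log_var):
    # KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over cells and latent dims.
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

def elbo(u_obs, s_obs, u_pred, s_pred, scale_u, scale_s, z_mu, z_log_var):
    # Reconstruction term for both modalities minus the KL regularizer.
    recon = (gaussian_log_likelihood(u_obs, u_pred, scale_u)
             + gaussian_log_likelihood(s_obs, s_pred, scale_s))
    return recon - kl_standard_normal(z_mu, z_log_var)

# Toy data: 5 cells, 3 genes, 2 latent dimensions (all hypothetical).
rng = np.random.default_rng(0)
u_obs = rng.gamma(2.0, 0.5, size=(5, 3))
s_obs = rng.gamma(2.0, 1.0, size=(5, 3))
scale = np.ones(3)
z_mu = np.zeros((5, 2))
z_log_var = np.zeros((5, 2))

elbo_perfect = elbo(u_obs, s_obs, u_obs, s_obs, scale, scale, z_mu, z_log_var)
elbo_biased = elbo(u_obs, s_obs, u_obs + 1.0, s_obs, scale, scale, z_mu, z_log_var)
```

Predictions that match the observed abundances give a higher ELBO than biased ones, which is what gradient-based maximization exploits during training.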
RegVelo is designed for single-cell transcriptomics studies where understanding the regulatory basis of cell fate decisions is the primary goal. It is particularly suited to developmental biology datasets where investigators want to move beyond descriptive trajectory analysis to mechanistic questions: which transcription factors initiate a lineage commitment, and what happens to cell fate allocation if those factors are removed? By providing calibrated perturbation scores for every regulon in the model, RegVelo enables systematic in silico screens that prioritize candidates for experimental validation, reducing the number of costly and time-consuming knockout experiments. The framework is also applicable to hematopoiesis research, reprogramming studies, and cancer biology wherever GRN-driven state transitions are of interest.
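The idea of an in silico screen can be sketched as follows, assuming a transcription rate of the form h(W s + b) as described earlier. Each candidate regulator is silenced by zeroing its outgoing edges (its column of W, under the convention that W[g, r] is the effect of regulator r on target g), and the change in the velocity field is summarized by one number. The score used here, one minus the mean cosine similarity of the unspliced velocity before and after knockout, is a hypothetical stand-in and not RegVelo's calibrated perturbation score; the softplus activation and the toy data are likewise assumptions.

```python
import numpy as np

def softplus(x):
    # Assumed stand-in for the activation h.
    return np.logaddexp(0.0, x)

def unspliced_velocity(U, S, W, b, beta):
    # U, S: (cells, genes). alpha = h(W s + b) evaluated for every cell.
    alpha = softplus(S @ W.T + b)
    return alpha - beta * U

def knock_out(W, tf_index):
    # Silence a regulator by zeroing its outgoing edges; W itself is not modified.
    W_ko = W.copy()
    W_ko[:, tf_index] = 0.0
    return W_ko

def perturbation_score(U, S, W, b, beta, tf_index):
    # Hypothetical score: 1 - mean per-cell cosine similarity of the velocities.
    v_ref = unspliced_velocity(U, S, W, b, beta)
    v_ko = unspliced_velocity(U, S, knock_out(W, tf_index), b, beta)
    num = np.sum(v_ref * v_ko, axis=1)
    den = np.linalg.norm(v_ref, axis=1) * np.linalg.norm(v_ko, axis=1) + 1e-12
    return float(np.mean(1.0 - num / den))

# Toy data: gene 0 is the only regulator (activates gene 1, represses gene 2).
rng = np.random.default_rng(0)
n_cells, n_genes = 50, 4
S = rng.gamma(2.0, 1.0, size=(n_cells, n_genes))
U = rng.gamma(2.0, 0.5, size=(n_cells, n_genes))
W = np.zeros((n_genes, n_genes))
W[1, 0] = 1.5
W[2, 0] = -1.5
b = np.zeros(n_genes)
beta = np.ones(n_genes)

# Screen every gene and rank candidates by how strongly the field changes.
scores = {tf: perturbation_score(U, S, W, b, beta, tf) for tf in range(n_genes)}
ranking = sorted(scores, key=scores.get, reverse=True)
```

In a real analysis the perturbed dynamics would be propagated to downstream fate estimates rather than compared at the level of instantaneous velocities; the sketch only shows the edge-zeroing mechanics and how a per-regulator ranking arises.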
RegVelo's publication in Cell represents a significant conceptual advance for the RNA velocity field, demonstrating that integrating prior regulatory knowledge into the dynamical model yields more biologically interpretable and experimentally predictive results. Its joint modeling of GRN structure and splicing kinetics closes a longstanding gap between trajectory inference methods and gene regulatory network models, two areas of the single-cell toolkit that had largely developed in parallel. The experimental validation in zebrafish neural crest development, which identified both known regulators and novel candidates such as elf1, provides a compelling proof of concept that in silico perturbation predictions carry genuine mechanistic content. Built on the widely adopted scverse ecosystem and released with comprehensive documentation and tutorials, RegVelo is positioned for broad adoption in developmental and cell biology laboratories already working with scVelo or CellRank pipelines.
Wang, W., et al. (2024) RegVelo: gene-regulatory-informed dynamics of single cells. bioRxiv.
DOI: 10.1101/2024.12.11.627935