Columbia University / Chan Zuckerberg Biohub New York
A causal multimodal Transformer that embeds the do-operator within attention to predict single-cell responses to gene perturbations, including unseen ones.
DoFormer is a causal multimodal Transformer for predicting how single cells respond to genetic perturbations. Single-cell transcriptomics combined with high-throughput perturbation screens (such as Perturb-seq) can measure the effects of knocking out or activating individual genes, but the combinatorial space of possible perturbations is far larger than any experiment can cover. The central challenge is to predict the transcriptional consequences of perturbations that were never directly measured, including in cell types or contexts outside the training data. DoFormer addresses this by learning a generalizable map from a perturbation to its downstream effect on gene expression.
The model's defining idea is to embed Pearl's causal do-operator directly into the attention mechanism, allowing the network to distinguish observational data (what is seen in unperturbed cells) from interventional data (what happens when a gene is experimentally forced to a new state). Rather than requiring an explicit causal graph (DAG) to be specified or inferred in advance, DoFormer represents interventions natively within its architecture, sidestepping the strong structural assumptions that limit many causal-inference methods in genomics. It is trained once on broad perturbational scRNA-seq data and then applied to new inputs, supporting in-silico perturbation prediction for previously unseen perturbations.
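One way to picture an intervention inside attention is the sketch below. It is an illustrative assumption, not the preprint's published mechanism: in Pearl's framework, do(X = x) cuts the incoming edges to X, so the sketch blocks an intervened gene token from attending to any token but itself, while other gene tokens can still attend to it. The function name `do_masked_attention`, the single-head layout, and the masking rule are all hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; -inf entries become exactly zero weight.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def do_masked_attention(h, wq, wk, wv, intervened):
    """Single-head self-attention over gene tokens with a hypothetical do-mask.

    h:          (n_genes, d) token embeddings
    wq, wk, wv: (d, d) projection matrices
    intervened: (n_genes,) boolean array, True where do(gene = value) is applied
    Returns the (n_genes, d) outputs and the (n_genes, n_genes) attention weights.
    """
    q, k, v = h @ wq, h @ wk, h @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])

    # Assumed analogue of graph surgery: rows of intervened genes keep only
    # their diagonal entry, so no other gene can influence them.
    n = h.shape[0]
    off_diagonal = ~np.eye(n, dtype=bool)
    blocked = intervened[:, None] & off_diagonal
    scores = np.where(blocked, -np.inf, scores)

    attn = softmax(scores, axis=-1)
    return attn @ v, attn
```

With this mask, the intervened gene's output depends only on its own embedding, while every other gene's output still depends on the intervened gene. That asymmetry is the observational/interventional distinction the paragraph above describes, in its simplest form.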
DoFormer was introduced in a 2026 bioRxiv preprint by Karbalayghareh, Paull, and Califano from the Califano lab at Columbia University, affiliated with the Chan Zuckerberg Biohub New York.
DoFormer is a Transformer architecture whose attention is modified to encode the causal do-operator, separating observational from interventional conditioning. It is trained on broad perturbational single-cell RNA-seq data that pairs applied gene perturbations with their measured transcriptional outcomes, and learns to predict expression responses to new perturbations at single-cell resolution. According to the preprint, DoFormer outperforms established single-cell and perturbation-modeling baselines, including scGPT, Geneformer, and GEARS, on perturbation-response prediction tasks. The work is a bioRxiv preprint (released 2026-05-04 under CC BY-NC 4.0), so these results have not yet undergone peer review, and detailed hyperparameters and training-corpus composition should be confirmed against the source.
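The preprint's exact evaluation metrics are not stated here. As a hedged illustration, one score commonly used for perturbation-response prediction is the Pearson correlation between predicted and observed expression change relative to unperturbed control cells. The function name `delta_pearson` and the array layout are assumptions made for this sketch.

```python
import numpy as np

def delta_pearson(pred, observed, control):
    """Pearson correlation between predicted and observed expression change.

    pred, observed: (n_cells, n_genes) expression under one perturbation
    control:        (n_control_cells, n_genes) expression of unperturbed cells
    """
    control_mean = control.mean(axis=0)
    # Per-gene mean shift induced by the perturbation, relative to control.
    delta_pred = pred.mean(axis=0) - control_mean
    delta_obs = observed.mean(axis=0) - control_mean
    return float(np.corrcoef(delta_pred, delta_obs)[0, 1])
```

Scoring the change rather than raw expression matters because most genes barely move under a single perturbation, so a model that simply reproduces the control profile would otherwise look deceptively accurate.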
DoFormer is aimed at computational and experimental biologists who use perturbation screens to dissect gene-regulatory networks and disease mechanisms. By predicting the effects of perturbations that have not been experimentally tested, it can prioritize candidate targets, guide the design of follow-up Perturb-seq experiments, and support in-silico exploration of intervention effects across cell states. This is particularly valuable for target discovery and mechanistic studies where exhaustive experimental coverage of the perturbation space is infeasible.
By formalizing interventions through the do-operator inside a Transformer, DoFormer offers a causally grounded alternative to purely correlational single-cell foundation models for perturbation prediction. Its reported gains over scGPT, Geneformer, and GEARS suggest that explicitly modeling the observational/interventional distinction can improve generalization to unseen perturbations. As of this writing, the work is a preprint, no public code or weights have been confirmed, and it is released under a non-commercial license (CC BY-NC 4.0), so independent reproduction and broader adoption remain to be established.
Karbalayghareh, A., et al. (2026) DoFormer: Causal Transformer for Gene Perturbation. bioRxiv.
DOI: 10.64898/2026.05.02.722054