Large causal cell model trained on cancer perturbation data that generalizes zero-shot to patient-derived cells for therapeutic target prioritization.
A central challenge in drug discovery is bridging the gap between perturbation experiments performed in laboratory cell lines and the biology of real patients. Virtual cell models that predict how cells respond to genetic or chemical perturbations could accelerate target identification, but most are trained and evaluated within the same data distribution and struggle to generalize to cell types they have never seen. Identifying which molecular regulators drive a cell from a diseased to a healthy state, in a way that transfers to patient-derived contexts, remains difficult.
TwinCell, developed by the Paris-based biotech DeepLife and posted as a preprint in early 2026, is a "large causal cell model" trained on in vitro cancer perturbation data and designed to generalize zero-shot to patient-derived cell types for therapeutic target prioritization. Rather than fitting only the observed responses, the model is framed around recovering causal regulators of cellular state transitions. It conditions a multiomics interactome on single-cell foundation model (scFM) embeddings, combining a regulatory-network view with learned representations of cell state.
To assess this kind of model, DeepLife also introduces TwinBench, a benchmarking framework that evaluates virtual cell models using recommendation-system metrics, treating target prioritization as a ranking problem. Across five therapeutic areas, TwinCell recovers known drug targets and disease pathways, including interferon signaling in lupus, without disease-specific training.
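The ranking framing can be illustrated with a few standard recommendation-system metrics. The sketch below is not TwinBench itself, whose code is not public and whose exact metric set is not specified here; the choice of precision@k, recall@k, and NDCG@k, the function names, and the placeholder gene identifiers are all assumptions made for illustration.

```python
import math
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked candidates that are known targets."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for gene in ranked[:k] if gene in relevant)
    return hits / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all known targets that appear in the top k."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant:
        return 0.0
    hits = sum(1 for gene in ranked[:k] if gene in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Binary-relevance NDCG: rewards placing known targets near the top."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Discounted gain of the actual ranking (positions are 0-indexed).
    dcg = sum(
        1.0 / math.log2(i + 2)
        for i, gene in enumerate(ranked[:k])
        if gene in relevant
    )
    # Best achievable gain: all known targets packed at the top.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical model output: candidate targets sorted by predicted score.
ranked_targets = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
# Hypothetical ground truth: targets with known therapeutic relevance.
known_targets = {"GENE_A", "GENE_C", "GENE_F"}

p3 = precision_at_k(ranked_targets, known_targets, 3)
r3 = recall_at_k(ranked_targets, known_targets, 3)
n3 = ndcg_at_k(ranked_targets, known_targets, 3)
print(f"precision@3={p3:.3f} recall@3={r3:.3f} ndcg@3={n3:.3f}")
```

In this toy case two of the top three candidates are known targets, so precision@3 and recall@3 are both 2/3, while NDCG@3 is lower than 1 because one known target sits at rank 3 and another is missing from the list entirely. Metrics of this kind score a model by where true targets land in its ranking rather than by how closely it reconstructs expression profiles.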
TwinCell is a causal cell model that conditions a multiomics interactome on single-cell foundation model (scFM) embeddings and is trained on in vitro cancer perturbation data. It is evaluated with TwinBench, which casts therapeutic target prioritization as a ranking task and applies recommendation-system metrics to score how well a model surfaces true targets. In zero-shot transfer to patient-derived cell types across five therapeutic areas, the model recovers established drug targets and disease pathways without disease-specific training; interferon signaling in lupus is the highlighted example. The preprint is released under a CC-BY license; no public code or model weights accompany the release.
TwinCell is aimed at target discovery and prioritization in pharmaceutical and biotech research, particularly where teams want to translate cell-line perturbation screens into hypotheses about patient biology. Because it is reported to generalize zero-shot to patient-derived cell types, it can suggest candidate regulators for a disease without requiring perturbation data specific to that disease. The TwinBench framework additionally gives practitioners a quantitative way to compare virtual cell models on target ranking, which is useful for benchmarking internal and external methods.
TwinCell reflects a broader push toward "virtual cell" models that move beyond fitting observed expression to reasoning about causal interventions and their translation to patients. Pairing the model with TwinBench addresses a recurring weakness in the field, the lack of standardized, decision-relevant evaluation, by scoring target prioritization with ranking metrics. The main limitation for external adopters is practical: as a recent preprint from an industry group, it ships without public code or weights, so its zero-shot claims and benchmark results await independent validation.