A signed heterogeneous graph foundation model pretrained on the SIGMA-KG knowledge graph for zero-shot drug mode-of-action, clinical response, and drug-drug interaction prediction.
FLASH (Fast Lightweight Architecture for Signed Heterogeneous GNN) is a graph foundation model designed to predict how drugs act on biological systems without task-specific fine-tuning. Drug discovery increasingly relies on integrating many incompatible data modalities (chemogenomic perturbation, transcriptomics, proteomics, and clinical records) into a single reasoning substrate. FLASH addresses this by learning general-purpose representations over a large, signed, heterogeneous biomedical knowledge graph, then applying them zero-shot to several distinct downstream tasks.
The model was developed by Mottaqi, Zhang, Adoremos, Zhang, and Xie in the Lei Xie lab at Hunter College, CUNY, and released as a bioRxiv preprint in May 2026. Its central contribution is the pairing of a lightweight signed heterogeneous graph-neural-network architecture with SIGMA-KG (SIGned Multi-omics Atlas Knowledge Graph), a purpose-built training substrate that encodes the direction of biological effects, for example whether a perturbation up- or down-regulates a target, rather than treating all relationships as undirected and unsigned.
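To make the idea of a signed, typed edge concrete, the sketch below shows one way such a relationship could be represented in Python. It is an illustration only: the `SignedEdge` class, the `group_by_polarity` helper, the `"regulates"` relation name, and node names such as `drug_A` and `gene_X` are all hypothetical and are not taken from SIGMA-KG, whose actual schema the preprint summary here does not describe.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SignedEdge:
    """A hypothetical signed, typed edge (not the SIGMA-KG schema)."""
    source: str    # e.g. a drug or perturbation node
    relation: str  # edge type, e.g. "regulates"
    target: str    # e.g. a gene or protein node
    sign: int      # +1 = up-regulates, -1 = down-regulates

    def __post_init__(self):
        if self.sign not in (1, -1):
            raise ValueError("sign must be +1 or -1")


def group_by_polarity(edges):
    """Index edges by (relation, sign) so each polarity can be handled separately."""
    index = defaultdict(list)
    for edge in edges:
        index[(edge.relation, edge.sign)].append(edge)
    return dict(index)


# Toy data, invented for illustration.
edges = [
    SignedEdge("drug_A", "regulates", "gene_X", -1),
    SignedEdge("drug_A", "regulates", "gene_Y", +1),
    SignedEdge("drug_B", "regulates", "gene_X", -1),
]
index = group_by_polarity(edges)
down_regulators = [e.source for e in index[("regulates", -1)]]
print(down_regulators)  # ['drug_A', 'drug_B']
```

An unsigned graph would collapse all three edges into a single "regulates" bucket, losing the distinction between the drug that raises `gene_Y` and the two that lower `gene_X`.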
FLASH sits within the emerging class of biomedical graph foundation models that aim to replace narrow, single-task predictors with one pretrained model transferable across drug-related problems. By emphasizing signed edges and a compact architecture, it targets both biological fidelity and practical efficiency.
FLASH is pretrained via self-supervised learning on SIGMA-KG, which integrates 16 heterogeneous biomedical sources spanning chemogenomic perturbation, transcriptomics, proteomics, and clinical data into roughly 127,000 nodes and 3.8 million signed edges. The signed, heterogeneous structure lets the network distinguish edge polarity and type during message passing. Evaluated zero-shot against nine baselines across drug mode-of-action, clinical response, and drug-drug interaction prediction, FLASH matched or exceeded competing methods without any fine-tuning. For inductive drug repurposing across four diseases, its predictions reached a 69.6% external clinical validation rate, indicating that a substantial fraction of prioritized drug-disease pairs were supported by independent clinical evidence.
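As a rough illustration of how a network can distinguish edge polarity and type during message passing, the following sketch implements one generic round of relation- and sign-specific aggregation in plain Python. It is not the FLASH architecture, which is not detailed here. The function `signed_hetero_layer`, the mean aggregation, the ReLU activation, and the per-`(relation, sign)` weight matrices are assumptions chosen to show the general technique.

```python
from collections import defaultdict


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def signed_hetero_layer(features, edges, weights, self_weight):
    """One generic round of signed, typed message passing (illustrative only).

    features:    {node: [float, ...]}
    edges:       list of (source, relation, sign, target) tuples
    weights:     {(relation, sign): matrix}, one transform per type and polarity
    self_weight: matrix applied to each node's own features
    """
    # Bucket incoming neighbor features by target node and (relation, sign).
    buckets = defaultdict(lambda: defaultdict(list))
    for src, rel, sign, dst in edges:
        buckets[dst][(rel, sign)].append(features[src])

    updated = {}
    for node, x in features.items():
        out = matvec(self_weight, x)
        for key, messages in buckets[node].items():
            dim = len(messages[0])
            mean = [sum(m[i] for m in messages) / len(messages) for i in range(dim)]
            contribution = matvec(weights[key], mean)
            out = [o + c for o, c in zip(out, contribution)]
        updated[node] = [max(0.0, v) for v in out]  # ReLU
    return updated


# Toy graph: one drug up-regulates one gene and down-regulates another.
features = {"drug": [1.0, 0.0], "gene_up": [0.0, 0.0], "gene_down": [0.0, 0.0]}
edges = [
    ("drug", "regulates", +1, "gene_up"),
    ("drug", "regulates", -1, "gene_down"),
]
identity = [[1.0, 0.0], [0.0, 1.0]]
weights = {
    ("regulates", +1): [[1.0, 0.0], [0.0, 0.0]],  # positive edges write to dim 0
    ("regulates", -1): [[0.0, 0.0], [1.0, 0.0]],  # negative edges write to dim 1
}
hidden = signed_hetero_layer(features, edges, weights, identity)
print(hidden["gene_up"], hidden["gene_down"])  # [1.0, 0.0] [0.0, 1.0]
```

Because each polarity has its own transform, the two genes receive different representations from the same drug. With a single shared weight matrix, they would be indistinguishable.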
FLASH supports computational drug discovery workflows, including hypothesizing a candidate's mode of action, anticipating clinical response, flagging potential drug-drug interactions, and prioritizing repurposing candidates for specific diseases. Because it operates zero-shot, it can be applied to new drugs and indications without assembling labeled training sets for each task, which benefits pharmacology, translational research, and systems-biology groups seeking mechanism-aware predictions grounded in multi-omics and clinical evidence.
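One common way to use pretrained embeddings zero-shot is to rank candidates by similarity to a query, with no task-specific training step. The sketch below shows that pattern for repurposing-style prioritization. The scoring function FLASH actually uses is not stated here, so cosine similarity, the `rank_candidates` helper, and the toy vectors are assumptions for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_candidates(disease_vec, drug_vecs, top_k=3):
    """Rank drugs by similarity to a disease embedding, with no fine-tuning."""
    scored = [(name, cosine(vec, disease_vec)) for name, vec in drug_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Toy embeddings, invented for illustration; real ones would come from the model.
drug_vecs = {"drug_A": [1.0, 0.0], "drug_B": [0.0, 1.0], "drug_C": [1.0, 1.0]}
ranking = rank_candidates([1.0, 0.0], drug_vecs, top_k=2)
top_names = [name for name, _ in ranking]
print(top_names)  # ['drug_A', 'drug_C']
```

A new drug can be scored this way as soon as it has an embedding, which is the sense in which no labeled training set is required for each task.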
FLASH demonstrates that a single signed heterogeneous graph foundation model, pretrained on an integrated multi-omics knowledge graph, can serve several distinct drug-prediction tasks competitively without fine-tuning, and that its inductive repurposing predictions can be corroborated by external clinical evidence at a meaningful rate. As a recent bioRxiv preprint, its results await peer review, and as of release no public code or model weights have been confirmed; the work is distributed under a non-commercial CC BY-NC 4.0 license, which constrains commercial reuse. These factors temper, but do not negate, its contribution toward unified, mechanism-aware foundation models for drug discovery.
Mottaqi, M., et al. (2026) A Scalable Sign-Aware Multi-Omics Knowledge Graph Foundation Model for Mechanistic Drug Action and Clinical Response Predictions. bioRxiv.
DOI: 10.64898/2026.04.29.721775