bio.rodeo
Single-cell · Protein

Shusi

Zhejiang University

An LLM-enhanced variational graph autoencoder pretrained on pan-cancer single-cell networks to predict context-specific protein-protein interactions at single-cell resolution.

Released: April 2025

Protein-protein interactions (PPIs) are highly context dependent: the same two proteins may associate in one cell state and not in another, and these rewirings are central to how tumors progress and resist therapy. Most reference PPI databases, however, are static and aggregated across tissues, leaving the cell-state-specific networks that drive cancer largely unmapped. Shusi addresses this gap by inferring context-specific protein networks directly at single-cell resolution, turning single-cell transcriptomes into testable hypotheses about which interactions are active in a given cell.

Developed by Jiajun Yu and colleagues at Zhejiang University and released as a bioRxiv preprint in April 2025, Shusi is an LLM-enhanced variational graph autoencoder. It couples a graph neural network over cell-specific interaction graphs with protein representations distilled from a large language model, allowing the model to reason about interactions using both expression context and prior biological knowledge encoded in text.

Shusi is pretrained once on a large pan-cancer corpus and then applied without per-dataset retraining, positioning it as a reusable foundation model for single-cell network biology rather than a one-off classifier fit to a single study.

Key Features

  • Context-specific PPI inference: Predicts which protein interactions are active in individual cell states rather than returning a single static consensus network, capturing rewiring across the tumor microenvironment.
  • LLM-enhanced graph model: Combines a graph isomorphism network over cell-level interaction graphs with protein and gene representations derived from a large language model, fusing expression context with prior knowledge.
  • Pan-cancer pretraining: Trained on 71,575 networks spanning 23 cancer types, giving broad coverage of malignant and microenvironmental cell states.
  • Two inference modes: A "discovery mode" generates de novo PPI predictions, while a "benchmark mode" masks known edges to evaluate reconstruction accuracy (reported as precision@10,000).
  • No per-dataset retraining: Inference loads a single pretrained checkpoint (shusi.pth) and runs directly on new samples, lowering the barrier to applying the model to fresh datasets.
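
The benchmark mode described above can be illustrated with a minimal sketch of precision@k on masked edges. This is the generic definition of the metric, not code from the Shusi repository: the function name `precision_at_k`, the toy gene pairs, and the scores are all hypothetical, and Shusi's exact masking and ranking procedure may differ.

```python
def precision_at_k(scored_pairs, held_out_edges, k):
    """Fraction of the k highest-scoring candidate pairs that are held-out true edges.

    scored_pairs: list of ((protein_a, protein_b), score) tuples.
    held_out_edges: known interactions that were masked before scoring.
    """
    # Treat interactions as undirected: (a, b) and (b, a) are the same edge.
    truth = {frozenset(edge) for edge in held_out_edges}
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)
    top = ranked[:k]
    if not top:
        return 0.0
    hits = sum(1 for pair, _ in top if frozenset(pair) in truth)
    # Divides by the number of pairs actually returned, which is k
    # whenever at least k candidates were scored.
    return hits / len(top)


# Hypothetical scores for five candidate pairs in one cell-state network.
scores = [
    (("TP53", "MDM2"), 0.97),
    (("EGFR", "GRB2"), 0.91),
    (("MYC", "ACTB"), 0.80),
    (("BRCA1", "BARD1"), 0.75),
    (("GAPDH", "KRAS"), 0.10),
]
# Known edges hidden from the model before scoring (toy ground truth).
masked = [("MDM2", "TP53"), ("BRCA1", "BARD1"), ("EGFR", "GRB2")]

p_at_3 = precision_at_k(scores, masked, k=3)
print(f"precision@3 = {p_at_3:.3f}")
```

In the reported metric k is 10,000; the toy example uses k = 3 only so the result can be checked by hand (two of the top three pairs are masked true edges).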

Technical Details

Shusi is built as a variational graph autoencoder. Each cell is represented as a graph whose nodes are genes or proteins; node features integrate expression values with two precomputed embedding maps distilled from a large language model (a gene-level and a sentence/annotation-level embedding). A graph isomorphism network encodes these graphs into a latent space from which the decoder reconstructs edges, framing PPI prediction as a self-supervised link-reconstruction task.

The model was pretrained on 71,575 pan-cancer single-cell networks drawn from 23 cancer types, and the released pipeline ships a single pretrained checkpoint plus the two embedding feature maps.

The implementation uses PyTorch and PyTorch Geometric, runs on GPU with automatic CPU fallback, and reports benchmark performance via a precision@10,000 metric on masked edges. The preprint does not report a single headline accuracy number, so quantitative comparisons should be read from the paper directly.
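
For readers unfamiliar with the two building blocks named above, the equations below give the textbook forms of a graph isomorphism network layer and of the variational graph autoencoder objective. They are offered as background under stated assumptions: the inner-product decoder and standard-normal prior are the usual defaults for this model family, and this page does not confirm that Shusi uses exactly these choices.

```latex
% GIN update for node v at layer k: an MLP over the node's own state
% plus the sum of its neighbours' states.
h_v^{(k)} = \mathrm{MLP}^{(k)}\Big( \big(1 + \epsilon^{(k)}\big)\, h_v^{(k-1)}
            + \sum_{u \in \mathcal{N}(v)} h_u^{(k-1)} \Big)

% Variational encoder: one Gaussian per node, with mean and variance
% produced by the graph encoder from features X and adjacency A.
q(Z \mid X, A) = \prod_{i=1}^{N} \mathcal{N}\big(z_i \mid \mu_i,\ \operatorname{diag}(s_i^2)\big)

% Inner-product decoder (assumed): probability of an edge between i and j.
p(A_{ij} = 1 \mid z_i, z_j) = \operatorname{sigmoid}\big(z_i^{\top} z_j\big)

% Training objective: edge reconstruction minus a KL regulariser,
% with prior p(Z) = \prod_i \mathcal{N}(z_i \mid 0, I).
\mathcal{L} = \mathbb{E}_{q(Z \mid X, A)}\big[\log p(A \mid Z)\big]
              - \mathrm{KL}\big[\, q(Z \mid X, A) \,\|\, p(Z) \,\big]
```

Here $h_v^{(0)}$ would be the node feature vector (expression combined with the LLM-derived embeddings), and $\mu_i$, $s_i$ are read off the final encoder layer. Masking known edges in $A$ before encoding and scoring them with the decoder is what turns this objective into a link-reconstruction benchmark.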

Applications

Shusi is aimed at cancer biologists and computational researchers who want to move from cell-type catalogs to mechanism. By surfacing the protein interactions that are specifically active within tumor or microenvironmental cell states, it can nominate candidate signaling axes, prioritize potential therapeutic targets, and generate hypotheses about how interaction networks differ between responders and non-responders or across tumor subtypes. Because it consumes standard single-cell expression inputs and runs without retraining, it can be layered onto existing single-cell analysis pipelines as a network-inference step.
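
One way to turn per-cell predictions into the group-level hypotheses described above is to compare mean edge scores between two sets of cells. The sketch below is illustrative only: the function `differential_edges`, the group labels, and the scores are hypothetical and not part of the Shusi pipeline, and a real analysis would add a statistical test with multiple-testing correction.

```python
from statistics import mean


def differential_edges(scores_by_group, group_a, group_b, top_n=3):
    """Rank edges by the difference in mean predicted score between two groups.

    scores_by_group maps a group label to {edge: [score per cell, ...]}.
    Returns (edge, mean_a - mean_b) tuples, largest absolute difference first.
    """
    a, b = scores_by_group[group_a], scores_by_group[group_b]
    # Only edges scored in both groups can be compared.
    shared = set(a) & set(b)
    diffs = {edge: mean(a[edge]) - mean(b[edge]) for edge in shared}
    return sorted(diffs.items(), key=lambda kv: abs(kv[1]), reverse=True)[:top_n]


# Hypothetical per-cell edge scores for two groups of tumor cells.
toy_scores = {
    "responder": {
        ("EGFR", "GRB2"): [0.9, 0.8],
        ("TP53", "MDM2"): [0.5, 0.5],
    },
    "non_responder": {
        ("EGFR", "GRB2"): [0.2, 0.4],
        ("TP53", "MDM2"): [0.5, 0.6],
    },
}

ranked = differential_edges(toy_scores, "responder", "non_responder")
for edge, delta in ranked:
    print(edge, round(delta, 2))
```

Edges at the top of such a ranking are candidates for follow-up, not findings in themselves.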

Impact

Shusi contributes to a growing effort to make protein interaction networks dynamic and cell-state aware rather than static, and it is notable for marrying graph-based single-cell modeling with LLM-derived protein knowledge in a single pretrained framework. As a 2025 preprint, its real-world adoption and independent validation are still emerging, and several practical caveats apply: the work has not yet been peer reviewed, the code repository does not declare a license, and the pretrained weights are distributed via Google Drive rather than a persistent model registry, which may complicate long-term reproducibility. Researchers should treat its predictions as hypotheses for experimental follow-up.

Citation

Systematic discovery of single-cell protein networks in cancer with Shusi

Preprint

Zhang, T., et al. (2025) Systematic discovery of single-cell protein networks in cancer with Shusi. bioRxiv.

DOI: 10.1101/2025.04.27.649905

Citations

Total citations: 0
Influential: 0
References: 78

GitHub

Stars: 1
Forks: 0
Open issues: 0
Contributors: 1
Last push: 5mo ago
Language: Python

Openness

bio.rodeo openness: Closed (low usability and reproducibility)
Score: 20 (Closed)
Usability (can I run it?): 23
Reproducibility (can I retrain it?): 6
Model Openness Framework: Unclassified (restrictive license on core components)

Tags

cancer, foundation_model, graph_neural_network, protein_protein_interaction_prediction, self_supervised, single_cell_transcriptomics, variational_graph_autoencoder

Resources

  • GitHub Repository
  • Research Paper