Stoic

Predicts protein complex stoichiometry from amino acid sequence alone, ranking copy numbers in seconds and exporting AlphaFold3-ready JSON files.

Released: March 2026

Modern structure-prediction systems such as AlphaFold-Multimer and AlphaFold3 have transformed protein complex modeling, but they require the stoichiometry of a complex — the copy number of each distinct protein entity — to be specified in advance. For the many complexes whose composition is unknown, the standard workaround is a brute-force search that runs structure prediction across many candidate stoichiometry combinations, an approach that is both computationally expensive and frequently inaccurate.
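To give a rough sense of why the brute-force search is costly, the sketch below counts candidate stoichiometries for a complex with a given number of distinct entities. The per-entity cap `max_copies` and both function names are illustrative assumptions, not limits or code taken from any particular pipeline.

```python
from itertools import product


def candidate_stoichiometries(n_entities: int, max_copies: int):
    """Yield every assignment of 1..max_copies copies to each entity."""
    return product(range(1, max_copies + 1), repeat=n_entities)


def search_space_size(n_entities: int, max_copies: int) -> int:
    """Number of candidates a brute-force sweep would have to evaluate."""
    return max_copies ** n_entities


if __name__ == "__main__":
    # Hypothetical cap of 6 copies per entity, chosen only for illustration.
    for n in (1, 2, 3, 4):
        print(n, "entities ->", search_space_size(n, max_copies=6), "candidates")
```

Each candidate would need at least one structure-prediction run, so the cost grows exponentially with the number of distinct entities.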

Stoic, developed by Daniil Litvinov, Janani Durairaj, Torsten Schwede and colleagues at the University of Basel (Biozentrum and SIB), addresses this gap by predicting complex stoichiometry directly from amino acid sequence, with no structure prediction in the loop. Posted to bioRxiv in March 2026, it reframes stoichiometry as a sequence-level learning problem and produces ranked copy-number predictions in seconds, along with AlphaFold3-ready JSON files that can be fed directly into downstream structure prediction.

By learning to recognize interface-relevant features rather than relying on global sequence statistics, Stoic offers a fast, accessible front end for assembling protein complexes whose composition was previously a bottleneck.

Key Features

Sequence-only prediction: Estimates per-entity copy numbers from amino acid sequences alone, removing the need for expensive brute-force structure prediction over candidate stoichiometries.
Interface-aware representation: Learns to identify residues that participate in protein-protein interactions rather than depending on global sequence features, improving discrimination of homomeric versus heteromeric assemblies.
AlphaFold3-ready output: Exports JSON specifying predicted stoichiometries so results plug directly into AF3 structure-prediction pipelines.
Ranked top-N predictions: Returns multiple ranked stoichiometry hypotheses, supporting downstream evaluation when the top prediction is uncertain.
Open and hosted: Released under the MIT license with pretrained weights on HuggingFace, a hosted web demo, and a Colab notebook for no-install use.
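
As a rough illustration of ranked top-N output, the sketch below turns per-entity copy-number probabilities into ranked joint stoichiometries. It assumes each entity is scored independently and that candidates are ranked by the product of their per-entity probabilities. How Stoic itself combines per-entity scores is not described here, so the function name and input layout are hypothetical.

```python
import heapq
from itertools import product
from math import exp, log


def rank_stoichiometries(probs, top_n=3):
    """Rank joint copy-number assignments by summed log-probability.

    probs maps entity name -> {copy_number: probability}.
    Returns a list of (log_score, {entity: copy_number}) tuples, best first.
    """
    entities = list(probs)
    scored = []
    for combo in product(*(probs[e].items() for e in entities)):
        # combo holds one (copy_number, probability) pair per entity
        if any(p <= 0.0 for _, p in combo):
            continue  # skip impossible assignments instead of taking log(0)
        log_score = sum(log(p) for _, p in combo)
        assignment = {e: copies for e, (copies, _) in zip(entities, combo)}
        scored.append((log_score, assignment))
    return heapq.nlargest(top_n, scored, key=lambda item: item[0])


# Made-up probabilities for two entities, for illustration only.
example_probs = {
    "entity_1": {1: 0.1, 2: 0.7, 4: 0.2},
    "entity_2": {1: 0.6, 2: 0.4},
}
ranked = rank_stoichiometries(example_probs, top_n=3)

if __name__ == "__main__":
    for log_score, assignment in ranked:
        print(assignment, round(exp(log_score), 3))
```

Keeping several hypotheses is useful because a downstream structure-prediction run can then be spent on the few most plausible stoichiometries rather than on one uncertain guess.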

Technical Details

Stoic uses ESM2-650M to compute residue-level embeddings for each unique protein entity in a complex, then aggregates them into fixed-length per-entity representations via a learned weighted pooling mechanism. These pooled embeddings serve as node features in a fully connected graph that is passed to a graph convolutional network (GCN), which outputs copy numbers as node labels. The task is cast as multi-class classification over 14 copy-number classes, allowing prediction of both homomeric and heteromeric stoichiometries. The pipeline is available as a command-line tool (stoic_predict_stoichiometry), a Python API, and a HuggingFace Space, accepting FASTA input and emitting top-N ranked predictions plus AF3-ready JSON.
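
The data flow described above can be sketched in plain Python. This is a toy under stated assumptions, not the authors' implementation: random vectors stand in for ESM2-650M residue embeddings, the pooling weights are a softmax over a single learned vector, and a simple message-passing layer with separate self and neighbor weights stands in for the GCN. All names, dimensions, and parameters below are hypothetical; only the 14 output classes come from the text.

```python
import math
import random

N_CLASSES = 14  # copy-number classes, as stated in the text


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, x) for row in W]


def weighted_pool(residue_embs, w):
    """Pool residue vectors into one vector using softmax(w . h_i) weights."""
    alphas = softmax([dot(w, h) for h in residue_embs])
    dim = len(residue_embs[0])
    return [sum(a * h[d] for a, h in zip(alphas, residue_embs)) for d in range(dim)]


def graph_layer(nodes, W_self, W_nbr):
    """One message-passing step on a fully connected graph, followed by ReLU."""
    out = []
    for i, x in enumerate(nodes):
        others = [n for j, n in enumerate(nodes) if j != i]
        if others:
            mean_nbr = [sum(col) / len(others) for col in zip(*others)]
        else:
            mean_nbr = [0.0] * len(x)  # single-entity complex: no neighbors
        h = [s + n for s, n in zip(matvec(W_self, x), matvec(W_nbr, mean_nbr))]
        out.append([max(0.0, v) for v in h])
    return out


def predict_copy_numbers(entities, params):
    """Return one probability vector over N_CLASSES per entity (node)."""
    nodes = [weighted_pool(e, params["pool_w"]) for e in entities]
    hidden = graph_layer(nodes, params["W_self"], params["W_nbr"])
    return [softmax(matvec(params["W_out"], h)) for h in hidden]


rng = random.Random(0)
DIM, HID = 8, 6  # toy sizes; real residue embeddings are much wider


def rand_matrix(rows, cols):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


params = {
    "pool_w": [rng.gauss(0.0, 1.0) for _ in range(DIM)],
    "W_self": rand_matrix(HID, DIM),
    "W_nbr": rand_matrix(HID, DIM),
    "W_out": rand_matrix(N_CLASSES, HID),
}
# Two entities with 30 and 45 residues; random stand-ins for embeddings.
entities = [rand_matrix(30, DIM), rand_matrix(45, DIM)]
probs = predict_copy_numbers(entities, params)
```

Each entity ends up with its own distribution over the 14 classes, which is what lets a single forward pass label both homomeric and heteromeric complexes. The separate self weight matters in this sketch: with a plain symmetric-normalized GCN on a fully connected graph, every node would receive the same aggregated input and therefore the same output.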

Applications

Stoic is aimed at structural biologists and computational researchers who need to model protein complexes whose composition is not known a priori. It can serve as a rapid pre-processing step ahead of AlphaFold3 or AlphaFold-Multimer, narrowing the space of stoichiometries to evaluate and avoiding combinatorial structure-prediction sweeps. Use cases include interpreting interactomics and cross-linking data, prioritizing assembly hypotheses for cryo-EM or crystallography, and large-scale annotation of complexes across proteomes.
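
To make the hand-off to AlphaFold3 concrete, the sketch below builds an input JSON from sequences and predicted copy numbers. It follows the publicly documented AlphaFold3 input layout (`name`, `modelSeeds`, `sequences`, `dialect`, `version`), in which a list of chain IDs on one protein entry requests several copies of that sequence. The exact fields Stoic emits may differ, and the function name, entity names, and sequences here are hypothetical.

```python
import json
from string import ascii_uppercase


def af3_input(name, sequences, copies, seeds=(1,)):
    """Build an AlphaFold3-style input dict.

    sequences maps entity name -> amino acid sequence.
    copies maps entity name -> predicted copy number.
    """
    total = sum(copies[entity] for entity in sequences)
    if total > len(ascii_uppercase):
        # This sketch only assigns single-letter chain IDs A..Z.
        raise ValueError("more than 26 chains is not handled in this sketch")
    chain_ids = iter(ascii_uppercase)
    entries = []
    for entity, seq in sequences.items():
        ids = [next(chain_ids) for _ in range(copies[entity])]
        entries.append({"protein": {"id": ids, "sequence": seq}})
    return {
        "name": name,
        "modelSeeds": list(seeds),
        "sequences": entries,
        "dialect": "alphafold3",
        "version": 1,
    }


# Placeholder sequences and copy numbers, for illustration only.
job = af3_input(
    "demo_complex",
    {"entity_1": "MKTAYIAKQR", "entity_2": "GSSHHHHHHS"},
    {"entity_1": 2, "entity_2": 1},
)

if __name__ == "__main__":
    print(json.dumps(job, indent=2))
```

Writing one such file per ranked hypothesis keeps the structure-prediction step limited to the handful of stoichiometries worth testing.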

Impact

By decoupling stoichiometry inference from structure prediction, Stoic targets a long-standing practical limitation of complex modeling that becomes acute at scale. Its lightweight, openly licensed implementation — with hosted inference and AF3-compatible exports — lowers the barrier for routine use within existing structure-prediction workflows. As one of several 2025–2026 efforts tackling stoichiometry prediction, it contributes to a maturing toolkit for moving from sequence to assembled complex structures with less manual trial and error.

Citation

Litvinov, D., et al. (2026) Stoic: Fast and accurate protein stoichiometry prediction. bioRxiv.

DOI: 10.64898/2026.03.13.711535

Citations

Total Citations: 6
Influential: 1
References: 0

GitHub

Stars: 15
Forks: 0
Open Issues: 0
Contributors: 2
Last Push: 4 months ago
Language: Python
License: MIT

HuggingFace

Downloads: 160
Likes: 0
Last Modified: 4 months ago
Pipeline: graph-ml

Openness

bio.rodeo openness: Open weights (open weights, closed recipe)
Overall: 59 (Partial)
Usability (can I run it?): 93
Reproducibility (can I retrain it?): 29

Model Openness Framework: Unclassified (restrictive license on core components)

Resources

GitHub Repository
Research Paper
HuggingFace Model
Demo
