bio.rodeo

AI-IDP

German Center for Neurodegenerative Diseases (DZNE)

A sequence-to-ensemble predictor that generates experiment-consistent conformational ensembles of intrinsically disordered proteins by pairing deep-learning fragment prediction with physics-aware assembly.

Released: March 2026

Intrinsically disordered proteins (IDPs) and disordered regions defy the single-structure paradigm that powers tools like AlphaFold: rather than folding into one dominant conformation, they sample broad, dynamic ensembles of interconverting states. Capturing this heterogeneity is essential for understanding signaling, phase separation, and the aggregation processes implicated in neurodegeneration, yet ensemble generation has traditionally required slow molecular dynamics simulations or laborious experiment-restrained modeling on a protein-by-protein basis.

AI-IDP, introduced in 2026 by Anton Abyzov and Markus Zweckstetter at the German Center for Neurodegenerative Diseases (DZNE), reframes ensemble prediction as a sequence-to-ensemble problem. The framework combines deep-learning prediction of local fragment structure with a physical-restraints-aware assembly step that stitches fragments into full-length, experiment-consistent conformational ensembles. Crucially, it operates zero-shot: a single trained model is applied to new sequences without any per-sequence retraining or fitting.
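To make the "fragment first, assemble later" idea concrete, here is a minimal sketch of one generic way to split a sequence into overlapping windows before any per-fragment prediction. It is not taken from the preprint: the function name `split_into_fragments`, the fragment length, the stride, and the use of overlapping windows at all are hypothetical choices made for illustration.

```python
from typing import List, Tuple


def split_into_fragments(
    sequence: str, length: int = 9, stride: int = 3
) -> List[Tuple[int, str]]:
    """Split a sequence into overlapping windows.

    Returns (start_index, fragment) pairs. `length` and `stride` are
    hypothetical values chosen for illustration, not AI-IDP settings.
    """
    if length <= 0 or stride <= 0:
        raise ValueError("length and stride must be positive")
    if len(sequence) <= length:
        # Short sequences are treated as a single fragment.
        return [(0, sequence)]

    last_start = len(sequence) - length
    starts = list(range(0, last_start + 1, stride))
    # Add a final window so the C-terminal residues are always covered.
    if starts[-1] != last_start:
        starts.append(last_start)
    return [(s, sequence[s:s + length]) for s in starts]


# Toy 12-residue sequence, 5-residue windows, stride 3.
fragments = split_into_fragments("ACDEFGHIKLMN", length=5, stride=3)
for start, frag in fragments:
    print(start, frag)
```

In a pipeline of this general shape, each fragment would be passed to a local structure predictor, and a separate assembly step would combine the per-fragment predictions into full-length conformers.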

In the accompanying preprint, "Decoding conformational heterogeneity across disordered proteomes," the authors apply AI-IDP at scale to more than 3,000 disordered regions spanning human and non-human proteomes. This proteome-wide view distinguishes it from per-target ensemble methods and positions it as a foundation for surveying conformational behavior across entire organisms.

Key Features

  • Sequence-to-ensemble prediction: Maps an amino acid sequence directly to a population of conformations, rather than a single static structure, capturing the dynamic nature of disordered proteins.
  • Hybrid deep-learning plus physics assembly: Pairs a deep-learning fragment structure predictor with a physical-restraints-aware assembly framework, so generated ensembles respect physical plausibility rather than relying on learned priors alone.
  • Zero-shot, proteome-scale application: Generalizes to thousands of new disordered regions across human and non-human proteomes without per-sequence retraining, enabling proteome-wide conformational surveys.
  • Experimentally validated observables: Ensembles are benchmarked against solution-state measurements including NMR chemical shifts and SAXS, grounding predictions in experimental data rather than structure alone.
  • Resolves transient secondary structure: Recovers residue-level secondary structure propensities, identifying pervasive transient alpha-helices and polyproline-II conformations that the authors report are evolutionarily tuned.
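The residue-level propensities in the last feature above can be illustrated with a small sketch that classifies backbone dihedral angles into alpha-helix and polyproline-II basins and counts how often each residue visits each basin across an ensemble. The basin boundaries and the function names are rough, illustrative assumptions, not values or code from the preprint.

```python
from typing import Dict, List, Tuple

Dihedrals = Tuple[float, float]  # (phi, psi) in degrees


def classify(phi: float, psi: float) -> str:
    """Assign a (phi, psi) pair to a coarse Ramachandran basin.

    The rectangular boundaries are illustrative assumptions only.
    """
    if -160.0 <= phi <= -20.0 and -120.0 <= psi <= 50.0:
        return "helix"
    if -110.0 <= phi <= -50.0 and 100.0 <= psi <= 180.0:
        return "ppii"
    return "other"


def propensities(ensemble: List[List[Dihedrals]]) -> List[Dict[str, float]]:
    """Per-residue fraction of conformers found in each basin."""
    n_conf = len(ensemble)
    n_res = len(ensemble[0])
    result = []
    for i in range(n_res):
        counts = {"helix": 0, "ppii": 0, "other": 0}
        for conformer in ensemble:
            phi, psi = conformer[i]
            counts[classify(phi, psi)] += 1
        result.append({k: v / n_conf for k, v in counts.items()})
    return result


# Toy ensemble: 4 conformers, 2 residues each (made-up angles).
toy = [
    [(-60.0, -45.0), (-75.0, 145.0)],
    [(-65.0, -40.0), (-70.0, 150.0)],
    [(-120.0, 130.0), (-60.0, -45.0)],
    [(-62.0, -41.0), (-78.0, 148.0)],
]
for index, fractions in enumerate(propensities(toy)):
    print(index, fractions)
```

In this toy input, the first residue is helical in three of four conformers and the second is in the polyproline-II basin in three of four, which is the kind of transient, partial population the feature describes.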

Technical Details

AI-IDP decomposes the ensemble-generation task into deep-learning prediction of short fragment conformations followed by a physics-aware assembly stage that combines fragments under physical restraints into full-length conformer ensembles. Validation focuses on agreement with experimental observables: the ensembles reproduce NMR-derived measures (such as chemical shift–based secondary structure propensities, exemplified on disordered proteins like c-Myc and ACTR) and small-angle X-ray scattering (SAXS) data reporting on global dimensions. The preprint applies the method to over 3,000 disordered regions to characterize the prevalence and evolutionary conservation of transient helical and polyproline-II structure across disordered proteomes. The work is currently a bioRxiv preprint (v1 March 2026; v2 June 2026) and has not yet been peer reviewed.
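As a simplified illustration of checking an ensemble's global dimensions, the sketch below computes the radius of gyration of each conformer from its coordinates and combines them as a root-mean-square value, which is commonly the quantity compared with a Guinier-derived radius of gyration from SAXS. It assumes equal-weight points (for example, one per residue) and made-up coordinates. A real comparison would compute full scattering profiles, and none of the function names come from the AI-IDP work.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def radius_of_gyration(coords: Sequence[Point]) -> float:
    """Radius of gyration of one conformer, with equal weight per point."""
    n = len(coords)
    if n == 0:
        raise ValueError("conformer has no coordinates")
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    mean_sq = sum(
        (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in coords
    ) / n
    return math.sqrt(mean_sq)


def ensemble_rg(ensemble: Sequence[Sequence[Point]]) -> float:
    """Root-mean-square radius of gyration over all conformers."""
    squared = [radius_of_gyration(conf) ** 2 for conf in ensemble]
    return math.sqrt(sum(squared) / len(squared))


def relative_deviation(predicted: float, experimental: float) -> float:
    """Signed fractional difference between predicted and measured values."""
    return (predicted - experimental) / experimental


# Two toy "conformers" with two points each (arbitrary units).
compact = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
extended = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
print(ensemble_rg([compact, extended]))
```

The root-mean-square average weights extended conformers more heavily than a plain mean would, which is why a few expanded states can dominate the apparent size of a disordered ensemble.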

Applications

AI-IDP supports researchers studying the structure–function relationships of disordered proteins, which make up a large fraction of eukaryotic proteomes and are central to transcription, signaling, and biomolecular condensate formation. By producing experiment-consistent ensembles directly from sequence, it can prioritize candidate regions with functionally relevant transient structure, generate hypotheses for binding and aggregation studies, and provide structural context for variants in proteins that conventional folding predictors leave largely featureless. Its proteome-scale, zero-shot operation suits it to comparative and evolutionary analyses of disorder across organisms, which is particularly relevant to the neurodegeneration research mission at DZNE.

Impact

AI-IDP extends sequence-based structure prediction into the disordered fraction of the proteome, which single-state predictors serve poorly. It complements generative ensemble approaches such as IDPForge and sequence-design methods built on protein language models. Its emphasis on physical restraints and on validation against NMR and SAXS observables aims to keep predicted ensembles experimentally grounded at scale. Because it is a recent preprint, downstream adoption is still emerging. Uptake may be limited by the restrictive preprint license and by the absence of publicly released code or model weights at the time of writing. Independent benchmarking and peer review will be important for assessing how reliably its ensembles generalize across diverse disordered sequences.

Citation

Abyzov, A. & Zweckstetter, M. (2026) Decoding conformational heterogeneity across disordered proteomes. bioRxiv.

DOI: 10.64898/2026.03.13.711260

Citations

Total citations: 0
Influential: 0
References: 57

Openness

bio.rodeo openness: Closed (low usability and reproducibility)
Score: 4 (Closed)
Usability (can I run it?): 7
Reproducibility (can I retrain it?): 0 (not reproducible)
Model Openness Framework: Unclassified (restrictive license on core components)

Tags

conformational_ensemble_generation, intrinsically_disordered_proteins, proteomics, structure_prediction, transformer, zero_shot

Resources

Research Paper