bio.rodeo

RigidSSL

Chinese University of Hong Kong

Rigidity-aware geometric pretraining framework that front-loads SE(3) geometry learning to improve protein backbone generation, motif scaffolding, and conformational ensemble modeling.

Released: March 2026

RigidSSL is a geometric pretraining framework for protein structure generation that front-loads geometry learning before generative finetuning. Presented at ICLR 2026 and released as a March 2026 bioRxiv preprint by researchers at the Chinese University of Hong Kong and collaborators, it addresses a gap in protein backbone generators: most diffusion and flow-matching models learn geometry implicitly during generative training, which can limit designability and physical realism.

The framework learns from residue-level rigid-body representations in SE(3) space using a two-phase, self-supervised strategy. Phase I (RigidSSL-Perturb) learns geometric priors from 432K structures in the AlphaFold Protein Structure Database with simulated perturbations, and Phase II (RigidSSL-MD) refines those representations on 1.3K molecular dynamics trajectories to capture physically realistic structural transitions. The learned representations then serve as a better starting point for downstream generative protein design.
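To make "residue-level rigid-body representations" concrete, here is a minimal sketch of how a rigid frame (a rotation plus a translation) can be built from one residue's N, CA, and C atoms using the Gram-Schmidt construction common in AlphaFold-style pipelines, followed by a simple random perturbation of that frame. This is an illustration under assumptions, not RigidSSL's actual code: the function names, the noise model, and the `rot_sigma` and `trans_sigma` defaults are hypothetical, and the real Phase I perturbation scheme may differ.

```python
import math
import random

def sub(a, b):
    return [a[i] - b[i] for i in range(3)]

def dot(a, b):
    return sum(a[i] * b[i] for i in range(3))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def backbone_frame(n_xyz, ca_xyz, c_xyz):
    """Build a rigid frame (R, t) from backbone atoms via Gram-Schmidt.

    The columns of R are the orthonormal axes e1, e2, e3; t is the CA position.
    """
    e1 = normalize(sub(c_xyz, ca_xyz))           # first axis: CA -> C
    u = sub(n_xyz, ca_xyz)                       # CA -> N
    proj = dot(u, e1)
    e2 = normalize([u[i] - proj * e1[i] for i in range(3)])  # remove e1 component
    e3 = cross(e1, e2)                           # right-handed third axis
    R = [[e1[i], e2[i], e3[i]] for i in range(3)]
    return R, list(ca_xyz)

def axis_angle_matrix(axis, angle):
    """Rodrigues formula: rotation by `angle` radians about `axis`."""
    x, y, z = normalize(axis)
    c, s = math.cos(angle), math.sin(angle)
    k = 1.0 - c
    return [[c + x * x * k,     x * y * k - z * s, x * z * k + y * s],
            [y * x * k + z * s, c + y * y * k,     y * z * k - x * s],
            [z * x * k - y * s, z * y * k + x * s, c + z * z * k]]

def perturb_frame(R, t, rng, rot_sigma=0.1, trans_sigma=0.5):
    """Hypothetical perturbation: small random rotation plus Gaussian shift.

    rot_sigma (radians) and trans_sigma (same units as t) are illustrative
    values, not taken from the RigidSSL paper.
    """
    axis = [rng.gauss(0.0, 1.0) for _ in range(3)]
    angle = rng.gauss(0.0, rot_sigma)
    R_new = matmul(axis_angle_matrix(axis, angle), R)
    t_new = [x + rng.gauss(0.0, trans_sigma) for x in t]
    return R_new, t_new

# Example residue placed so that its frame is the identity at the origin.
R, t = backbone_frame([-0.525, 1.363, 0.0], [0.0, 0.0, 0.0], [1.526, 0.0, 0.0])
R_noisy, t_noisy = perturb_frame(R, t, random.Random(0))
```

A clean frame and its perturbed copy form the kind of paired views that a self-supervised objective can relate to each other. Because the noise is applied as a rotation and a translation, the perturbed frame remains a valid rigid body.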

By treating geometry as an explicit pretraining objective rather than a byproduct of generation, RigidSSL complements backbone generators such as FrameDiff, whose codebase it builds on alongside OpenFold, and extends them toward improved designability and conformational modeling.

#Key Features

  • Rigidity-aware flow matching: A bi-directional flow-matching objective jointly optimizes the translational and rotational dynamics of residue-level rigid bodies in SE(3) space to maximize mutual information between conformations.
  • Two-phase pretraining: Phase I learns geometric priors from 432K AFDB structures with simulated perturbations; Phase II refines them on 1.3K molecular dynamics trajectories for physically realistic transitions.
  • Improved designability: The authors report up to a 43% improvement in designability and a 5.8% gain in zero-shot motif scaffolding success rate over baselines.
  • Conformational ensembles: Captures more biophysically realistic conformational ensembles, demonstrated on GPCR conformational states.
  • Open implementation: Code is released under the MIT license, with pretrained checkpoints and processed datasets hosted on Hugging Face.

#Technical Details

RigidSSL operates on residue-level rigid-body frames in SE(3). Its bi-directional, rigidity-aware flow-matching objective jointly models the translation and rotation of each frame to maximize mutual information between conformations. Pretraining proceeds in two phases: RigidSSL-Perturb learns geometric priors from 432K AlphaFold Protein Structure Database structures with simulated perturbations, and RigidSSL-MD then refines the representations on 1.3K molecular dynamics trajectories. The implementation builds on the OpenFold and FrameDiff codebases. Empirically, the authors report that RigidSSL variants improve designability by up to 43% and raise zero-shot motif scaffolding success by 5.8% over baselines, while enhancing novelty and diversity in unconditional generation and improving biophysical realism in GPCR conformational ensembles. The code is released under the MIT license, and pretrained checkpoints and processed datasets are available on Hugging Face.
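As a rough illustration of what jointly modeling translation and rotation can look like, the sketch below interpolates between two rigid frames: translations move along a straight line, and rotations move along the geodesic on SO(3). It also returns the constant velocities that a generic SE(3) flow-matching model would be trained to regress. This is a textbook interpolant under stated assumptions, not the paper's bi-directional objective. All function names are hypothetical, and the logarithm map used here is not robust for relative rotations close to 180 degrees.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def rot_exp(w):
    """Rotation vector (axis * angle) -> rotation matrix (Rodrigues formula)."""
    angle = math.sqrt(sum(x * x for x in w))
    if angle < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    x, y, z = (comp / angle for comp in w)
    c, s = math.cos(angle), math.sin(angle)
    k = 1.0 - c
    return [[c + x * x * k,     x * y * k - z * s, x * z * k + y * s],
            [y * x * k + z * s, c + y * y * k,     y * z * k - x * s],
            [z * x * k - y * s, z * y * k + x * s, c + z * z * k]]

def rot_log(R):
    """Rotation matrix -> rotation vector. Not robust for angles near pi."""
    cos_a = (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0
    angle = math.acos(max(-1.0, min(1.0, cos_a)))
    if angle < 1e-12:
        return [0.0, 0.0, 0.0]
    f = angle / (2.0 * math.sin(angle))
    return [f * (R[2][1] - R[1][2]),
            f * (R[0][2] - R[2][0]),
            f * (R[1][0] - R[0][1])]

def interpolate_frame(R0, x0, R1, x1, t):
    """Interpolate from frame (R0, x0) to frame (R1, x1) at time t in [0, 1].

    Returns the interpolated rotation and translation plus the constant
    translational velocity and body-frame angular velocity along the path.
    """
    w = rot_log(matmul(transpose(R0), R1))       # relative rotation, R0's frame
    R_t = matmul(R0, rot_exp([t * wi for wi in w]))
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    v_trans = [b - a for a, b in zip(x0, x1)]
    return R_t, x_t, v_trans, w

# Example: identity frame at the origin -> 90-degree turn about z, shifted in x.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
quarter_turn = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
R_half, x_half, v_trans, v_rot = interpolate_frame(
    identity, [0.0, 0.0, 0.0], quarter_turn, [2.0, 0.0, 0.0], 0.5)
```

At the midpoint of this example the frame has turned 45 degrees about the z-axis and sits halfway along the translation. Handling the two components separately but under one time variable is what keeps every intermediate state a valid rigid body.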

#Applications

RigidSSL benefits protein designers working on de novo backbone generation, motif scaffolding for functional-site grafting, and modeling of conformational ensembles. As a pretraining framework, it can supply improved geometric initialization for downstream generative pipelines, helping produce more designable and diverse backbones. Its demonstrated GPCR conformational modeling is particularly relevant for researchers studying flexible or multi-state proteins where single static structures are insufficient.

#Impact

RigidSSL contributes a self-supervised, geometry-first perspective to protein backbone generation, showing that explicitly pretraining on rigid-body geometry and MD-derived dynamics can yield substantial designability and scaffolding gains. Its acceptance at ICLR 2026 and release of code, checkpoints, and datasets support reproducibility and downstream adoption. The reliance on a relatively small set of 1.3K MD trajectories for the dynamics phase is a noted scope limitation that future work may expand.

Tags

protein_design, motif_scaffolding, conformational_ensemble_generation, flow_matching, self_supervised, representation_learning, generative, protein_structure