University of Illinois Urbana-Champaign
Physics-informed generative model that pairs flow matching with a Mamba state-space backbone for linear-time protein backbone design, scaling to 2,000+ residues.
PI-Mamba (Physics-Informed Mamba) is a generative model for protein backbone design that targets a persistent trade-off in the field: existing methods tend to rely on iterative refinement, quadratic-attention transformers, or post-hoc geometry correction, forcing a choice between computational efficiency and structural fidelity. PI-Mamba instead enforces exact local covalent geometry by construction while running inference in linear time, allowing it to generate very long proteins on modest hardware. It was introduced by Tianyu Wu (Center for Biophysics and Quantitative Biology) and Lin Zhu (School of Information Sciences) at the University of Illinois Urbana-Champaign in a preprint posted to arXiv in March 2026.
The model sits at the intersection of two recent trends in structural generative modeling: flow matching, which has become a popular alternative to diffusion for SE(3) backbone generation (as in FrameFlow and FoldFlow), and state-space sequence models such as Mamba, whose linear-time recurrence offers a scalable substitute for self-attention. PI-Mamba's distinguishing idea is to make the generative process physics-informed: it initializes the state-space dynamics from a Rouse polymer model and embeds differentiable geometric constraints directly in the sampler, so that generated structures respect covalent bond geometry without a separate relaxation step.
PI-Mamba couples a 16-layer Mamba backbone (model dimension 512, state dimension 32, expansion factor 2) with a structure module of four Invariant Point Attention layers using eight heads. The state-transition matrices are spectrally initialized from the Rouse eigenvalues λ_p = 4·sin²(pπ/(2L)), where p is the mode index and L the chain length, as A_p = exp(−λ_p·Δt/τ(x)). The learned relaxation parameter τ(x) stratifies by secondary structure (τ_helix < τ_sheet < τ_loop, statistical significance p < 0.01). Training uses structures from CATH 4.2 filtered at 40% sequence identity (18,205 training, 1,200 validation, and 1,200 test domains, excluding structures above 3.0 Å resolution or with more than 10% missing residues), with a length curriculum expanding from L∈[50,100] to [50,500] over the first 20,000 optimizer steps. On the L=100 benchmark (n=100), PI-Mamba reaches designability of scTM = 0.91 ± 0.03 with 0.0% final violations in 9.4 s, compared to RFdiffusion (scTM 0.97, 37.2 s), Genie2 (0.93, 60.0 s), FrameFlow (0.70), and Chroma (0.60). Per-sample inference runs at 2.3 s (L=100), 10.5 s (L=500), and 21.3 s (L=1000), a roughly 21× speedup over RFdiffusion at L=500, while diffusion baselines show superlinear scaling.
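The Rouse-based spectral initialization can be illustrated with a minimal sketch. It is not the authors' implementation, whose code had not been released at the time of writing. The function names, the choice to index modes from p = 1, the default Δt = 1, and the use of a fixed scalar in place of the learned τ(x) are all assumptions made for illustration.

```python
import math


def rouse_eigenvalues(chain_length, num_modes):
    """Rouse eigenvalues lambda_p = 4 * sin^2(p * pi / (2 * L)).

    Assumption: modes are indexed p = 1..num_modes; the source does not
    state the indexing convention.
    """
    return [
        4.0 * math.sin(p * math.pi / (2.0 * chain_length)) ** 2
        for p in range(1, num_modes + 1)
    ]


def init_state_transition(chain_length, num_modes, dt=1.0, tau=1.0):
    """Diagonal entries A_p = exp(-lambda_p * dt / tau).

    Assumption: tau is a fixed positive scalar standing in for the
    learned, input-dependent tau(x) described in the text.
    """
    if tau <= 0.0:
        raise ValueError("tau must be positive")
    return [
        math.exp(-lam * dt / tau)
        for lam in rouse_eigenvalues(chain_length, num_modes)
    ]


# Example: 32 modes (matching the reported state dimension) for L = 100.
diag = init_state_transition(chain_length=100, num_modes=32)
```

Under these assumptions every A_p lies strictly between 0 and 1, so each mode decays rather than grows. Low-index modes (small λ_p) decay slowly and high-index modes decay quickly. A larger τ pushes every A_p toward 1, which means slower relaxation for the corresponding region.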
PI-Mamba is aimed at researchers generating de novo protein backbones, particularly when target proteins are long enough that attention-based generators become memory- or time-prohibitive. Its linear-time inference and low memory footprint (well under 1 GB even at L=1000) make large-scaffold sampling feasible on a single commodity GPU rather than a multi-GPU cluster, lowering the hardware barrier for backbone generation. As with other backbone generators, designs are intended to feed a downstream pipeline of sequence design (e.g., ProteinMPNN) followed by structure-prediction filtering before experimental testing, and the guaranteed-valid local geometry removes a relaxation step that such pipelines typically require.
PI-Mamba is an early demonstration that state-space models can serve as the generative backbone for protein structure design, pairing the linear-time scaling of Mamba with the geometric guarantees usually obtained only through diffusion plus post-processing. Its physics-informed initialization is a notable bridge between polymer-physics theory and modern deep generative modeling. As a single-version preprint, it has not yet accumulated independent benchmarking or downstream adoption, and important caveats remain: its 0.91 scTM trails RFdiffusion's 0.97 on the short-length benchmark, and the work is not yet peer-reviewed. The code and distilled dataset are referenced in the paper as forthcoming but were not publicly released at the time of writing, so reproduction currently depends on the architectural details reported in the preprint.