SE(3)-invariant masked autoencoder pretrained on ~370K AlphaFold-DB structures for protein fold representation learning, enabling frozen-feature and zero-shot fold classification.
MiAE (Masked Invariant Autoencoders) is a self-supervised framework for learning representations of protein structure, introduced by Dexiong Chen, Andrei Manolache, Mathias Niepert, and Karsten Borgwardt of the Borgwardt Lab at ETH Zurich in the paper "Protein Fold Classification at Scale: Benchmarking and Pretraining" (arXiv:2605.18552, ICML 2026 Spotlight). The model addresses a core scalability problem in structural biology: with hundreds of millions of predicted structures now available, classifying proteins into their evolutionary folds requires representation learning methods that generalize beyond small, curated training sets.
The central idea is to pretrain an SE(3)-invariant encoder by reconstructing backbone geometry from heavily masked inputs, then reuse the resulting fixed checkpoints as feature extractors for downstream fold classification. Because the encoder is invariant to global rotations and translations, it learns geometry-aware embeddings that transfer well without per-task retraining. The authors release MiAE alongside TEDBench, a companion large-scale, non-redundant benchmark for protein fold classification.
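The two ingredients named above, invariance to rigid motions and masking of the input, can be illustrated with a small, self-contained sketch. It is not MiAE's encoder: the source does not say which invariant features the model uses, so pairwise Cα distances stand in as one common example of a quantity that rotations and translations leave unchanged. The toy coordinates, the z-axis rotation (one special case of a general rotation), and the 0.75 mask ratio are all assumptions made for illustration; the source only says the inputs are heavily masked.

```python
import math
import random


def rotate_z_then_translate(coords, angle, shift):
    """Apply a rigid motion: rotate about the z-axis, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in coords
    ]


def pairwise_distances(coords):
    """Distance matrix; it depends only on relative positions of the points."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def mask_residues(n_residues, mask_ratio, rng):
    """Pick a random subset of residue indices to hide from the encoder."""
    n_masked = int(round(mask_ratio * n_residues))
    return sorted(rng.sample(range(n_residues), n_masked))


rng = random.Random(0)

# Toy C-alpha trace: 8 random 3D points, not a real protein.
ca = [tuple(rng.uniform(-10, 10) for _ in range(3)) for _ in range(8)]
moved = rotate_z_then_translate(ca, angle=1.1, shift=(5.0, -3.0, 2.0))

# Largest change in any pairwise distance after the rigid motion.
d_before = pairwise_distances(ca)
d_after = pairwise_distances(moved)
max_diff = max(
    abs(a - b) for row_a, row_b in zip(d_before, d_after) for a, b in zip(row_a, row_b)
)

# Assumed mask ratio of 0.75: the encoder would see only the visible residues.
masked = mask_residues(len(ca), mask_ratio=0.75, rng=rng)
visible = [i for i in range(len(ca)) if i not in masked]

print(f"max distance change under rigid motion: {max_diff:.2e}")
print(f"masked residues: {masked}, visible residues: {visible}")
```

The distance change is zero up to floating-point error, which is the property an invariant encoder relies on: two copies of the same structure in different poses map to the same input features, so the embedding cannot depend on the arbitrary coordinate frame.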
A notable finding is that MiAE features, used in a zero-shot or frozen-feature setting, outperform supervised baselines and prior state-of-the-art methods on TEDBench and on a CATH v4.4 transfer set spanning 965 CATH topology classes. This indicates that self-supervised structural pretraining can yield representations that surpass task-specific supervised models.
Checkpoints can be loaded with tedbench.load_model("miae-b"), with code released under a BSD-3-Clause license.
MiAE pairs an SE(3)-invariant transformer encoder with a lightweight decoder that reconstructs backbone N/Cα/C coordinates. The four variants scale from 6 layers, 512 hidden dim, 8 heads (MiAE-S, 29M) through 12 layers, 768 hidden dim, 12 heads (MiAE-B/MiAE-B+seq, 102M) to 24 layers, 1,024 hidden dim, 16 heads (MiAE-L, 339M); the B+seq variant additionally conditions on amino-acid sequence. Pretraining uses 369,740 AlphaFold-DB structures (Foldseek-clustered, pLDDT-filtered) for training, with 46,217 validation and 46,218 test structures. The downstream task is classification over 965 CATH topology (T-level) classes, with an external transfer set of 27,638 experimental structures from CATH v4.4. Across these benchmarks, frozen MiAE features exceed both supervised counterparts and existing structure-representation baselines.
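The variant sizes and split counts quoted above can be collected into a small configuration table for quick sanity checks. The sketch below is bookkeeping only: the EncoderConfig class and its field names are hypothetical and are not part of the released code, and the derived quantities (per-head dimension, split fractions) are simple arithmetic on the reported numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncoderConfig:
    """Hypothetical container for the reported hyperparameters of one variant."""

    name: str
    layers: int
    hidden_dim: int
    heads: int
    reported_size_m: int  # reported model size, in millions

    @property
    def head_dim(self) -> int:
        """Per-head width, derived as hidden_dim / heads."""
        return self.hidden_dim // self.heads


VARIANTS = [
    EncoderConfig("MiAE-S", layers=6, hidden_dim=512, heads=8, reported_size_m=29),
    EncoderConfig("MiAE-B", layers=12, hidden_dim=768, heads=12, reported_size_m=102),
    EncoderConfig("MiAE-B+seq", layers=12, hidden_dim=768, heads=12, reported_size_m=102),
    EncoderConfig("MiAE-L", layers=24, hidden_dim=1024, heads=16, reported_size_m=339),
]

# Structure counts for the pretraining splits, as reported.
SPLITS = {"train": 369_740, "val": 46_217, "test": 46_218}
total = sum(SPLITS.values())
fractions = {split: count / total for split, count in SPLITS.items()}

for cfg in VARIANTS:
    print(f"{cfg.name}: {cfg.layers} layers, head_dim={cfg.head_dim}")
print(f"total structures: {total}")
print({split: round(frac, 4) for split, frac in fractions.items()})
```

Two things fall out of the arithmetic: every variant has the same per-head width, so the models scale by adding heads and layers rather than widening heads, and the three splits are in an 80/10/10 proportion.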
MiAE is aimed at researchers performing large-scale structural bioinformatics: annotating the fold of newly predicted or experimentally solved structures, clustering structural space, detecting remote homologs, and curating non-redundant structural datasets. Because frozen embeddings work without fine-tuning, the model is practical as a drop-in feature extractor for proteome-scale pipelines, while the fine-tuned classifier variants serve users who want an off-the-shelf fold predictor. The TEDBench benchmark itself provides a standardized, leakage-controlled evaluation for groups developing competing structure-representation methods.
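A drop-in use of frozen embeddings might look like the sketch below, which classifies structures by comparing each embedding to per-class centroids. This is an illustration of the general frozen-feature pattern, not the authors' evaluation protocol: the nearest-centroid rule with cosine similarity is a technique chosen here for simplicity, the embeddings are synthetic Gaussian clusters standing in for encoder outputs, and the fold labels are made up. No tedbench calls are used, since the source does not document the API beyond the load call.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def fit_centroids(embeddings, labels):
    """Average the frozen embeddings of each class; no encoder weights change."""
    sums, counts = {}, {}
    for emb, lab in zip(embeddings, labels):
        acc = sums.setdefault(lab, [0.0] * len(emb))
        for i, x in enumerate(emb):
            acc[i] += x
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [x / counts[lab] for x in acc] for lab, acc in sums.items()}


def predict(centroids, emb):
    """Return the label whose centroid is most similar to the embedding."""
    return max(centroids, key=lambda lab: cosine(centroids[lab], emb))


def synthetic_embeddings(rng, n_per_class, class_means, noise=0.1):
    """Stand-in for encoder outputs: Gaussian clusters around per-class means."""
    features, labels = [], []
    for lab, mean in class_means.items():
        for _ in range(n_per_class):
            features.append([m + rng.gauss(0.0, noise) for m in mean])
            labels.append(lab)
    return features, labels


rng = random.Random(0)

# Hypothetical fold labels and 4-dimensional toy embeddings.
means = {
    "fold_a": [1.0, 0.0, 0.0, 0.0],
    "fold_b": [0.0, 1.0, 0.0, 0.0],
    "fold_c": [0.0, 0.0, 1.0, 0.0],
}
x_train, y_train = synthetic_embeddings(rng, 20, means)
x_test, y_test = synthetic_embeddings(rng, 10, means)

centroids = fit_centroids(x_train, y_train)
correct = sum(predict(centroids, x) == y for x, y in zip(x_test, y_test))
accuracy = correct / len(y_test)
print(f"nearest-centroid accuracy on synthetic data: {accuracy:.2f}")
```

The appeal of this pattern for proteome-scale pipelines is that embeddings are computed once and cached; adding a new class or re-running the classifier only touches the cheap centroid step, never the encoder.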
By showing that self-supervised, geometry-invariant pretraining produces frozen features that outpace supervised fold classifiers at the scale of the AlphaFold Database, MiAE offers a strong recipe for structural representation learning in the post-AlphaFold era. The paired release of open weights across four sizes, a simple inference API, BSD-3-Clause code, and a rigorously deduplicated benchmark (TEDBench) lowers the barrier for reproducible fold-classification research. Its ICML 2026 Spotlight selection reflects the field's interest in scalable, transferable structural embeddings; as with any AlphaFold-DB-pretrained model, performance on highly novel or low-confidence structures should be validated against experimental references.