PLM-SAE

Shanghai Smart Logic Technology Co., Ltd.

Sparse autoencoders trained on protein language model embeddings to expose interpretable features and drive zero-shot variant effect prediction.

Released: May 2026

PLM-SAE (Protein Language Model Sparse Autoencoder) is a mechanistic-interpretability framework that applies sparse autoencoders (SAEs) to the dense internal representations of protein language models such as the ESM series, including ESM-3. Modern protein language models encode rich biological signal in high-dimensional embeddings, but those representations are entangled and difficult to interpret, which limits both scientific insight and fine-grained control. PLM-SAE addresses this by decomposing the dense embeddings into a much larger set of discrete, sparsely activating features, each of which tends to correspond to a more human-interpretable concept.

Beyond interpretation, the framework uses these disentangled features for variant effect prediction (VEP) by steering or intervening on individual features and measuring how the model's outputs shift in response. Crucially, the SAE is trained once, frozen as a pretrained checkpoint, and then applied zero-shot to new sequences, so no labeled mutational data is required at inference time. The work was developed by researchers at Shanghai Smart Logic Technology Co., Ltd. (corresponding author Zhixiang Ren) and posted as a bioRxiv preprint in May 2026.

PLM-SAE sits within the growing line of work that brings mechanistic interpretability, originally developed for large language models, into computational biology. It connects the protein-LM literature (ESM, ESM-3) with sparse dictionary learning as a tool for both understanding and steering biological foundation models.

Key Features

Sparse feature disentanglement: Trains sparse autoencoders on ESM-series embeddings to convert dense, entangled representations into discrete features that are more interpretable than the raw activations.
Zero-shot variant effect prediction: The pretrained SAE checkpoint is applied unsupervised to new sequences, predicting variant effects without task-specific labels or retraining.
Feature steering and intervention: Identified sparse features can be directly activated or suppressed to probe and modulate model behavior, linking interpretation to actionable control.
Broad empirical validation: Evaluated across 114 deep mutational scanning (DMS) datasets, demonstrating generality beyond a single protein or assay.
Optional target-specific gating: A differentiable gating mechanism can be added as an enhancement to tailor features to a specific target, but it is not required for the core zero-shot method.
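
The feature steering item above can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration: the sizes, the random weights standing in for a trained SAE and for the language model's output head, and the names `encode`, `decode`, `steer`, and `intervention_shift` are hypothetical and do not come from the preprint. It shows only the general pattern of rescaling one sparse feature and comparing output log-probabilities before and after; how PLM-SAE turns such shifts into variant scores is not described here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: embedding width, dictionary size, amino-acid vocabulary.
D_MODEL, N_FEATURES, VOCAB = 16, 64, 20

# Random stand-ins for a trained SAE and for the PLM output head (assumptions).
W_enc = rng.normal(scale=0.1, size=(D_MODEL, N_FEATURES))
b_enc = np.zeros(N_FEATURES)
W_dec = rng.normal(scale=0.1, size=(N_FEATURES, D_MODEL))
b_dec = np.zeros(D_MODEL)
W_head = rng.normal(scale=0.1, size=(D_MODEL, VOCAB))


def encode(h):
    """Map a dense embedding to non-negative sparse feature activations."""
    return np.maximum(h @ W_enc + b_enc, 0.0)


def decode(z):
    """Map feature activations back to the embedding space."""
    return z @ W_dec + b_dec


def log_softmax(x):
    """Numerically stable log-softmax over the last axis."""
    x = x - x.max(axis=-1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=-1, keepdims=True))


def steer(z, feature, scale):
    """Return a copy of z with one feature rescaled (scale=0.0 ablates it)."""
    z_steered = z.copy()
    z_steered[..., feature] *= scale
    return z_steered


def intervention_shift(h, feature, scale=0.0):
    """Change in output log-probabilities caused by rescaling one feature."""
    z = encode(h)
    baseline = log_softmax(decode(z) @ W_head)
    steered = log_softmax(decode(steer(z, feature, scale)) @ W_head)
    return steered - baseline


# One synthetic per-residue embedding; a real one would come from the frozen PLM.
h = rng.normal(size=D_MODEL)
shift = intervention_shift(h, feature=3, scale=0.0)
```

Both the baseline and the steered outputs pass through the SAE decoder, so the difference isolates the effect of the chosen feature rather than the SAE's reconstruction error.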

Technical Details

The core method couples a frozen protein language model with a sparse autoencoder trained to reconstruct the model's internal embeddings under a sparsity constraint, yielding an overcomplete dictionary of features. These features are used for variant effect prediction by intervention, comparing model behavior with and without specific features active. In unsupervised zero-shot evaluation, PLM-SAE was benchmarked on 114 DMS datasets, reporting an 80.8% relative improvement on the HECD1 dataset when built on ESM-3 representations, and establishing new state-of-the-art results on 17 VenusMutHub datasets. The optional target-specific differentiable gating module is presented as an enhancement layered on top of the fixed SAE rather than a requirement of the baseline approach.
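
As a rough illustration of reconstructing embeddings under a sparsity constraint with an overcomplete dictionary, the following NumPy sketch trains a small sparse autoencoder on synthetic vectors. It assumes a ReLU encoder, a linear decoder, and an L1 penalty on the activations, which is one common SAE formulation and not necessarily the one used in the preprint. The function names, sizes, and hyperparameters (`l1_coeff`, `lr`, `steps`) are hypothetical, and the random `embeddings` array merely stands in for activations from a frozen protein language model.

```python
import numpy as np


def init_sae(d_model, n_features, rng):
    """Small random weights; n_features > d_model makes the dictionary overcomplete."""
    return {
        "W_enc": rng.normal(scale=0.1, size=(d_model, n_features)),
        "b_enc": np.zeros(n_features),
        "W_dec": rng.normal(scale=0.1, size=(n_features, d_model)),
        "b_dec": np.zeros(d_model),
    }


def sae_forward(p, x):
    """Return pre-activations, sparse codes, and reconstructions."""
    pre = x @ p["W_enc"] + p["b_enc"]
    z = np.maximum(pre, 0.0)
    x_hat = z @ p["W_dec"] + p["b_dec"]
    return pre, z, x_hat


def sae_loss(x, z, x_hat, l1_coeff):
    """Mean squared reconstruction error plus an L1 sparsity penalty."""
    recon = np.mean(np.sum((x_hat - x) ** 2, axis=1))
    sparsity = np.mean(np.sum(np.abs(z), axis=1))
    return recon + l1_coeff * sparsity


def train_sae(x, n_features, l1_coeff=1e-3, lr=0.01, steps=200, seed=0):
    """Full-batch gradient descent with hand-derived gradients."""
    rng = np.random.default_rng(seed)
    p = init_sae(x.shape[1], n_features, rng)
    n = x.shape[0]
    losses = []
    for _ in range(steps):
        pre, z, x_hat = sae_forward(p, x)
        losses.append(sae_loss(x, z, x_hat, l1_coeff))
        d_xhat = 2.0 * (x_hat - x) / n
        # z is non-negative, so the L1 subgradient is 1 where z > 0 and 0 elsewhere.
        d_z = d_xhat @ p["W_dec"].T + (l1_coeff / n) * (z > 0)
        d_pre = d_z * (pre > 0)
        grads = {
            "W_dec": z.T @ d_xhat,
            "b_dec": d_xhat.sum(axis=0),
            "W_enc": x.T @ d_pre,
            "b_enc": d_pre.sum(axis=0),
        }
        for name in p:
            p[name] -= lr * grads[name]
    return p, losses


# Synthetic stand-in for frozen PLM embeddings: 256 vectors of width 8.
embeddings = np.random.default_rng(1).normal(size=(256, 8))
params, losses = train_sae(embeddings, n_features=32)
```

The language model never appears in the loop: only the SAE parameters are updated, which matches the description of the model as frozen. Raising `l1_coeff` trades reconstruction quality for sparser codes.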

Applications

PLM-SAE is most directly useful for researchers studying protein function and stability who need to predict the effects of mutations from sequence alone, particularly in settings where labeled deep mutational scanning data is unavailable. Because it operates zero-shot, it can be applied to new proteins and assays without retraining, supporting variant prioritization in protein engineering, enzyme characterization, and the interpretation of clinically relevant variants. The interpretability and steering capabilities also make it a tool for scientists who want to understand what protein language models have learned and to intervene on those representations in a controlled way.

Impact

By bringing sparse-autoencoder interpretability to protein language models, PLM-SAE contributes to a broader effort to make biological foundation models more transparent and controllable rather than treating them as black boxes. The reported zero-shot gains (an 80.8% relative improvement on HECD1 and new state-of-the-art results across 17 VenusMutHub datasets) suggest that disentangled sparse features can sharpen variant effect prediction while also yielding mechanistic insight. As a recent preprint, its findings await peer review and broader independent validation. No public code repository or trained weights have been confirmed at the time of writing. The preprint itself is released under CC BY, but the license of any accompanying software remains unconfirmed.

Citation

Wang, M., et al. (2026) Improving Variant Effect Prediction by Steering Sparse Mechanistic Features in Protein Language Models. bioRxiv.

DOI: 10.64898/2026.05.12.724472

Citations

Total citations: 0

Influential: 0

References: 21

Openness

bio.rodeo openness: Closed (22), low usability and reproducibility

Usability (can I run it?): 15

Reproducibility (can I retrain it?): 14

Model Openness Framework: Unclassified (missing required components)

