A sequence-based learning-to-rank variant effect predictor that aligns and aggregates ~1,100 overlapping deep mutational scanning assays into an assay-agnostic measure of mutational tolerance.
ESMRank is a sequence-based variant effect predictor developed at the Telethon Institute of Genetics and Medicine (TIGEM) and posted to bioRxiv in February 2026. It targets a fundamental obstacle in learning from multiplexed assays of variant effect (MAVEs): the many available deep mutational scanning (DMS) experiments measure different molecular phenotypes (stability, abundance, binding, enzymatic activity) on different scales, so their scores are difficult to compare or combine directly. ESMRank reconciles this heterogeneity by reframing variant effect prediction as a learning-to-rank problem, in which only the relative ordering of variants matters, rather than a regression onto raw, assay-specific scores.
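To make the contrast with regression concrete, the sketch below implements a pairwise logistic (RankNet-style) loss computed within a single assay. This is one standard learning-to-rank objective, chosen here purely for illustration: the source does not state which ranking loss ESMRank uses, and the function names and toy scores are hypothetical. Because only the ordering of the measured scores enters the loss, any order-preserving rescaling of an assay leaves it unchanged.

```python
import math
from itertools import combinations


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def pairwise_logistic_loss(predicted, measured):
    """Mean pairwise logistic loss over variant pairs from ONE assay.

    Only the ordering of `measured` is used, never its magnitude, so the
    assay's units and scale do not matter. (Illustrative objective; not
    necessarily the loss used by ESMRank.)
    """
    total, n_pairs = 0.0, 0
    for i, j in combinations(range(len(measured)), 2):
        if measured[i] == measured[j]:
            continue  # tied measurements carry no ordering information
        # +1 if variant i outranks variant j in the assay, otherwise -1
        sign = 1.0 if measured[i] > measured[j] else -1.0
        # Penalize predicted score differences that contradict the assay order
        total += softplus(-sign * (predicted[i] - predicted[j]))
        n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


# Toy example: the same three variants reported on two different scales.
predicted = [0.2, 1.1, 2.5]
stability_scores = [-3.0, -1.2, 0.1]  # hypothetical stability-like readout
rescaled_scores = [5.0, 40.0, 900.0]  # same ordering, different units

loss_a = pairwise_logistic_loss(predicted, stability_scores)
loss_b = pairwise_logistic_loss(predicted, rescaled_scores)
print(loss_a == loss_b)
```

A regression loss such as mean squared error would give very different values for the two score lists, which is the problem the ranking formulation sidesteps.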
The core idea is an overlap-aware framework the authors call variant soundness. Many proteins are covered by more than one DMS assay, and the shared variants between overlapping assays provide anchors for aligning their internal rankings. ESMRank uses these overlaps to align within-assay rankings and then aggregate them across experiments, deriving an assay-agnostic measure of mutational tolerance that does not depend on any single assay's units or readout. The predictor itself integrates protein language model representations—from the ESM family—with physicochemical descriptors of residues and substitutions.
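The align-then-aggregate idea can be sketched in a few lines. The example below is a deliberately simple stand-in, not the authors' variant-soundness procedure, whose details are not given here: it converts each assay to within-assay rank percentiles, checks how consistently two assays order their shared variants, and averages percentiles per variant across assays. All function names, variant labels, and scores are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def percentile_ranks(scores):
    """Map {variant: raw score} to {variant: rank percentile in [0, 1]}.

    Tied scores receive the average rank of their tie group.
    """
    ordered = sorted(scores, key=scores.get)
    n = len(ordered)
    ranks = {}
    i = 0
    while i < n:
        j = i
        while j + 1 < n and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        avg_rank = (i + j) / 2  # average 0-based rank of the tie group
        for k in range(i, j + 1):
            ranks[ordered[k]] = avg_rank / (n - 1) if n > 1 else 0.5
        i = j + 1
    return ranks


def pairwise_concordance(assay_a, assay_b):
    """Fraction of shared-variant pairs that both assays order the same way.

    Returns None when the assays share no untied pair of variants.
    """
    shared = sorted(set(assay_a) & set(assay_b))
    agree = total = 0
    for u, v in combinations(shared, 2):
        diff_a = assay_a[u] - assay_a[v]
        diff_b = assay_b[u] - assay_b[v]
        if diff_a == 0 or diff_b == 0:
            continue
        total += 1
        agree += (diff_a > 0) == (diff_b > 0)
    return agree / total if total else None


def aggregate(assays):
    """Average within-assay percentiles per variant across all assays."""
    pooled = defaultdict(list)
    for scores in assays:
        for variant, pct in percentile_ranks(scores).items():
            pooled[variant].append(pct)
    return {variant: sum(p) / len(p) for variant, p in pooled.items()}


# Two toy assays on the same protein, reported in unrelated units.
stability = {"A10G": -2.0, "L25P": -4.5, "K40R": -0.1}
binding = {"A10G": 0.6, "L25P": 0.05, "K40R": 0.95, "D52N": 0.8}

print(pairwise_concordance(stability, binding))
print(aggregate([stability, binding]))
```

In this toy case the three shared variants are ordered identically by both assays, so pooling their percentiles is well justified. A low concordance on the overlap would be a signal that two assays measure different things and should not be naively averaged.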
Applied to roughly 1,100 MAVEdb score sets spanning over 2 million variants, ESMRank recovers a coherent, transferable constraint landscape. The learned axis of mutational constraint is enriched for structural-stability determinants such as residue burial, packing-perturbation magnitude, and domain architecture, suggesting the model captures a biophysically meaningful and generalizable signal rather than overfitting to individual assays.
ESMRank is a sequence-based learning-to-rank predictor that integrates protein language model representations with physicochemical descriptors of residues and substitutions. Its variant-soundness framework exploits the overlap structure of multiplexed assays: where multiple MAVEs measure the same protein, the variants shared between assays are used to align the within-assay rankings, which are then aggregated across assays into a single, assay-agnostic mutational-tolerance scale. The model was applied to approximately 1,100 MAVEdb score sets encompassing over 2 million variants. The resulting constraint landscape is enriched for structural-stability determinants, including residue burial, the magnitude of the packing perturbation introduced by a substitution, and domain architecture, indicating that the recovered axis reflects biophysical determinants of mutational tolerance. The preprint is released under a CC BY-NC-ND license; the authors do not report publicly released model weights at the time of posting.
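As a rough picture of what combining language model representations with physicochemical descriptors can look like, the sketch below concatenates a per-position embedding with a few substitution descriptors. Everything specific here is an assumption made for illustration: the embedding is a caller-supplied placeholder rather than real ESM output, the descriptors (Kyte-Doolittle hydropathy and formal side-chain charge) are not confirmed to be the ones ESMRank uses, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Approximate formal side-chain charge at neutral pH (others treated as 0).
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}


def parse_variant(variant):
    """Split a substitution such as 'K2E' into (wild type, position, mutant)."""
    return variant[0], int(variant[1:-1]), variant[-1]


def substitution_descriptors(wt, mut):
    """Illustrative descriptors of the residues and of the change between them."""
    return [
        KD_HYDROPATHY[wt],
        KD_HYDROPATHY[mut],
        KD_HYDROPATHY[mut] - KD_HYDROPATHY[wt],  # hydropathy shift
        CHARGE.get(mut, 0) - CHARGE.get(wt, 0),  # charge shift
    ]


def variant_features(variant, position_embeddings):
    """Concatenate the embedding at the mutated position with descriptors.

    `position_embeddings` is a list with one vector per residue (1-based
    positions map to index position - 1). In practice it would come from a
    protein language model; here it is a placeholder supplied by the caller.
    """
    wt, position, mut = parse_variant(variant)
    embedding = position_embeddings[position - 1]
    return list(embedding) + substitution_descriptors(wt, mut)


# Toy two-residue protein with made-up 2-dimensional embeddings.
toy_embeddings = [[0.1, 0.2], [0.3, 0.4]]
features = variant_features("K2E", toy_embeddings)
print(len(features))
```

The resulting vector would then feed whatever scoring model is trained with a ranking objective; the design point is simply that learned context (the embedding) and explicit chemistry (the descriptors) enter as separate, concatenated feature groups.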
ESMRank is intended for interpreting the functional impact of protein-coding variants, a central task in clinical genetics, protein engineering, and basic protein science. By producing an assay-agnostic measure of mutational tolerance, it can help prioritize variants of uncertain significance, guide stability-focused protein design, and provide a common reference frame for combining evidence across the many DMS assays now deposited in MAVEdb. Its enrichment for structural-stability determinants also makes it useful for studying how sequence position and biophysical context shape tolerance to mutation across diverse proteins.
ESMRank contributes a principled answer to a growing data-integration problem: as MAVEdb accumulates heterogeneous DMS assays (roughly 1,100 score sets were used in this work), methods that can align and pool them become increasingly valuable. The variant-soundness approach offers a reusable strategy for turning many incomparable assays into a single transferable constraint axis, and the model's biophysical interpretability strengthens confidence that the learned signal generalizes. Because it is a February 2026 preprint released under a non-commercial license, with no public model weights reported so far, its downstream adoption and head-to-head performance against established variant effect predictors remain to be demonstrated.