ESM-1v is a 650-million parameter protein language model developed by Meta AI (then Facebook AI Research) and released in July 2021 alongside the paper "Language models enable zero-shot prediction of the effects of mutations on protein function" (Meier et al., NeurIPS 2021). The model addresses a long-standing challenge in protein biology: predicting how single amino acid changes alter a protein's function, without requiring expensive experimental assays for every variant of interest.
The core insight behind ESM-1v is that a language model trained on the evolutionary record of protein sequences implicitly encodes the fitness landscape of proteins. Because natural selection has acted on sequences over billions of years, the probability a model assigns to a given amino acid at a given position reflects functional constraint at that site. A mutation that the model considers unlikely — inconsistent with the patterns learned from evolution — is predicted to be deleterious, while a high-probability substitution is predicted to be tolerated. This allows ESM-1v to score mutation effects through a simple log-odds calculation at inference time, with no additional training on experimental data.
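Concretely, the paper's best-performing "masked-marginals" scheme masks the mutated position and compares the log-probabilities the model assigns to the mutant and wild-type residues there. Below is a minimal sketch using the open-source fair-esm package; the position convention and the commented example call are illustrative assumptions, not code from the paper.

```python
import torch
import esm  # fair-esm package: pip install fair-esm

def masked_marginal_score(model, alphabet, seq: str, pos: int, wt: str, mut: str) -> float:
    """Log-odds score for a single substitution at 0-based position `pos`.

    Masks the site, then compares the model's log-probabilities for the
    mutant versus the wild-type residue (the masked-marginals scheme).
    """
    assert seq[pos] == wt, "wild-type residue mismatch"
    batch_converter = alphabet.get_batch_converter()
    _, _, tokens = batch_converter([("protein", seq)])
    tokens[0, pos + 1] = alphabet.mask_idx  # +1 skips the prepended BOS token
    with torch.no_grad():
        logits = model(tokens)["logits"]
    log_probs = torch.log_softmax(logits[0, pos + 1], dim=-1)
    return (log_probs[alphabet.get_idx(mut)] - log_probs[alphabet.get_idx(wt)]).item()

# Usage: load one of the released models, then score a substitution.
model, alphabet = esm.pretrained.esm1v_t33_650M_UR90S_1()
model.eval()
# score = masked_marginal_score(model, alphabet, my_seq, 38, "V", "A")  # V39A
```

A negative score means the model finds the mutant residue less plausible than the wild type at that site, i.e. a predicted-deleterious mutation.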
ESM-1v shares its transformer architecture with the earlier ESM-1b model but was trained on UniRef90, a database that clusters proteins at 90% sequence identity. This choice proved critical: the 90% threshold retains more within-family sequence variation than the 50% identity clustering used for ESM-1b's training set, and Meier et al. found that it significantly improved the model's ability to capture functionally relevant variation. Meta AI released five independently trained ESM-1v models with different random seeds to enable ensemble scoring, which provides the strongest zero-shot performance.
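Ensemble scoring can then be expressed as a simple average over the five seeds. The sketch below reuses masked_marginal_score from the snippet above; the loader names follow the fair-esm repository.

```python
import esm

# Loader functions for the five released seeds, as named in fair-esm.
LOADERS = [
    esm.pretrained.esm1v_t33_650M_UR90S_1,
    esm.pretrained.esm1v_t33_650M_UR90S_2,
    esm.pretrained.esm1v_t33_650M_UR90S_3,
    esm.pretrained.esm1v_t33_650M_UR90S_4,
    esm.pretrained.esm1v_t33_650M_UR90S_5,
]

def ensemble_score(seq: str, pos: int, wt: str, mut: str) -> float:
    """Average the per-model log-odds scores across the five-seed ensemble."""
    scores = []
    for load in LOADERS:
        model, alphabet = load()
        model.eval()
        scores.append(masked_marginal_score(model, alphabet, seq, pos, wt, mut))
    return sum(scores) / len(scores)
```

In practice one would keep all five models in memory or score whole variant batches per model rather than reloading per call; this sketch favors brevity over throughput.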
ESM-1v is a 650-million parameter transformer with 33 layers, trained with the masked language modeling (MLM) objective on the UniRef90 2020-03 release (approximately 98 million protein sequences). The model uses the same architecture as ESM-1b — including multi-head self-attention with learned positional embeddings — but its training data was chosen specifically to suit variant effect prediction. Perplexity on held-out sequences is 7.29, reflecting strong generalization to unseen protein families.
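For reference, the MLM objective minimizes the negative log-likelihood of masked residues given the rest of the sequence; in the standard formulation (notation here is generic rather than copied from the paper):

\[
\mathcal{L}_{\mathrm{MLM}} \;=\; \mathbb{E}_{x \sim X}\;\mathbb{E}_{M}\;\sum_{i \in M} -\log p\left(x_i \mid x_{/M}\right),
\]

where \(M\) is the randomly sampled set of masked positions and \(x_{/M}\) is the sequence with those positions masked out. The masked-marginal scoring described above queries this same conditional distribution at inference time.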
Benchmark evaluation across 41 deep mutational scanning (DMS) datasets — spanning fluorescent proteins, enzymes, antibodies, and viral proteins — showed that ESM-1v achieves an average Spearman rank correlation of approximately 0.51, matching MSA-based state-of-the-art methods including EVMutation and DeepSequence (both ~0.51 average Spearman ρ), without any task-specific model training. The ensemble of five models outperforms single-model scoring and exceeds DeepSequence on 17 of the 41 DMS datasets. ESM-1v substantially outperforms earlier single-sequence protein language models including TAPE, UniRep, ProtBERT-BFD, and ESM-1b (ρ ≈ 0.46 average), demonstrating the benefit of training at the 90% sequence identity clustering level.
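The evaluation metric is straightforward to reproduce for any protein with DMS data: rank-correlate model scores against measured fitness. A toy example with hypothetical numbers, using SciPy:

```python
from scipy.stats import spearmanr

# Hypothetical parallel arrays: ESM-1v log-odds scores and experimental
# DMS fitness measurements for the same five variants.
predicted = [-3.1, -0.2, 0.5, -1.8, 0.1]
measured = [0.12, 0.95, 1.10, 0.30, 0.88]

rho, pvalue = spearmanr(predicted, measured)
print(f"Spearman rho = {rho:.3f}")  # prints 0.900 for this toy data
```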
ESM-1v is broadly applicable wherever researchers need to prioritize protein variants for experimental testing. In protein engineering, it allows rapid in silico screening of large mutant libraries — identifying which substitutions are most likely to preserve or improve function before any wet-lab work is performed. In clinical genetics, it can help interpret the pathogenicity of missense variants in disease-relevant proteins, complementing tools like SIFT and PolyPhen. In antibody engineering, ESM-1v scoring can identify stability-maintaining mutations in the complementarity-determining regions (CDRs). The model also serves as a pretraining foundation for supervised variant effect predictors: fine-tuning ESM-1v on a small set of labeled DMS measurements for a protein of interest further boosts accuracy beyond zero-shot performance, making it a practical starting point for targeted protein optimization campaigns.
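To make the screening use case concrete, the sketch below scores every single substitution in a sequence from one unmasked forward pass. This is the paper's cheaper "wt-marginals" scheme (masked marginals score slightly better but require one pass per position); the sequence and the commented ranking step are illustrative.

```python
import torch
import esm

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

model, alphabet = esm.pretrained.esm1v_t33_650M_UR90S_1()
model.eval()
batch_converter = alphabet.get_batch_converter()

def scan_single_mutants(seq: str) -> dict:
    """Score all 19 * len(seq) substitutions with a single forward pass."""
    _, _, tokens = batch_converter([("protein", seq)])
    with torch.no_grad():
        log_probs = torch.log_softmax(model(tokens)["logits"][0], dim=-1)
    scores = {}
    for i, wt in enumerate(seq):
        row = log_probs[i + 1]  # +1 skips the BOS token
        for mut in AMINO_ACIDS:
            if mut != wt:
                key = f"{wt}{i + 1}{mut}"  # e.g. "V39A", 1-based position
                scores[key] = (row[alphabet.get_idx(mut)]
                               - row[alphabet.get_idx(wt)]).item()
    return scores

# Rank the library: the highest-scoring substitutions are those the model
# predicts are most likely to be tolerated.
# ranked = sorted(scan_single_mutants(my_seq).items(), key=lambda kv: -kv[1])
```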
ESM-1v established that large protein language models trained solely on sequence data could match the variant effect prediction accuracy of methods that require explicit co-evolutionary modeling via MSAs. This finding challenged the prevailing assumption that MSA generation was a necessary preprocessing step for unsupervised fitness prediction. The NeurIPS 2021 paper has been widely cited in the protein engineering and variant effect prediction literature, and ESM-1v remains a standard baseline in benchmarks such as ProteinGym. Its release as part of the broader ESM model family (alongside ESM-1b and later ESM-2 and ESMFold) helped establish Meta AI as a leading force in protein language model research. A notable limitation is that ESM-1v was designed and benchmarked for single amino acid substitutions; its additive scoring assumption becomes less reliable for combinations of many mutations or in highly epistatic fitness landscapes.
Meier, J., Rao, R., Verkuil, R., Liu, J., Sercu, T., & Rives, A. (2021). Language models enable zero-shot prediction of the effects of mutations on protein function. Advances in Neural Information Processing Systems 34 (NeurIPS 2021). Preprint: bioRxiv, DOI: 10.1101/2021.07.09.450648.