ESM-1v is a 650-million parameter protein language model developed by Meta AI (then Facebook AI Research) and released in July 2021 alongside the paper "Language models enable zero-shot prediction of the effects of mutations on protein function" (Meier et al., NeurIPS 2021). The model addresses a long-standing challenge in protein biology: predicting how single amino acid changes alter a protein's function, without requiring expensive experimental assays for every variant of interest.
The core insight behind ESM-1v is that a language model trained on the evolutionary record of protein sequences implicitly encodes the fitness landscape of proteins. Because natural selection has acted on sequences over billions of years, the probability a model assigns to a given amino acid at a given position reflects functional constraint at that site. A mutation that the model considers unlikely — inconsistent with the patterns learned from evolution — is predicted to be deleterious, while a high-probability substitution is predicted to be tolerated. This allows ESM-1v to score mutation effects through a simple log-odds calculation at inference time, with no additional training on experimental data.
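Concretely, the paper's best-performing "masked-marginals" scheme masks the mutated position and compares the log-probabilities the model assigns to the mutant and wild-type residues there. Below is a minimal sketch using the open-source fair-esm package; the position convention and the commented example call are illustrative assumptions, not code from the paper.

```python
import torch
import esm  # fair-esm package: pip install fair-esm

def masked_marginal_score(model, alphabet, seq: str, pos: int, wt: str, mut: str) -> float:
    """Log-odds score for a single substitution at 0-based position `pos`.

    Masks the site, then compares the model's log-probabilities for the
    mutant versus the wild-type residue (the masked-marginals scheme).
    """
    assert seq[pos] == wt, "wild-type residue mismatch"
    batch_converter = alphabet.get_batch_converter()
    _, _, tokens = batch_converter([("protein", seq)])
    tokens[0, pos + 1] = alphabet.mask_idx  # +1 skips the prepended BOS token
    with torch.no_grad():
        logits = model(tokens)["logits"]
    log_probs = torch.log_softmax(logits[0, pos + 1], dim=-1)
    return (log_probs[alphabet.get_idx(mut)] - log_probs[alphabet.get_idx(wt)]).item()

# Usage: load one of the released models, then score a substitution.
model, alphabet = esm.pretrained.esm1v_t33_650M_UR90S_1()
model.eval()
# score = masked_marginal_score(model, alphabet, my_seq, 38, "V", "A")  # V39A
```

A negative score means the model finds the mutant residue less plausible than the wild type at that site, i.e. a predicted-deleterious mutation.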
ESM-1v shares its transformer architecture with the earlier ESM-1b model but was trained on UniRef90, a database that clusters proteins at 90% sequence identity. This choice proved critical: the 90% threshold retains more within-family sequence variation than the 50% identity clustering used for ESM-1b's training set, and Meier et al. found that it significantly improved the model's ability to capture functionally relevant variation. Meta AI released five independently trained ESM-1v models with different random seeds to enable ensemble scoring, which provides the strongest zero-shot performance.
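Ensemble scoring can then be expressed as a simple average over the five seeds. The sketch below reuses masked_marginal_score from the snippet above; the loader names follow the fair-esm repository.

```python
import esm

# Loader functions for the five released seeds, as named in fair-esm.
LOADERS = [
    esm.pretrained.esm1v_t33_650M_UR90S_1,
    esm.pretrained.esm1v_t33_650M_UR90S_2,
    esm.pretrained.esm1v_t33_650M_UR90S_3,
    esm.pretrained.esm1v_t33_650M_UR90S_4,
    esm.pretrained.esm1v_t33_650M_UR90S_5,
]

def ensemble_score(seq: str, pos: int, wt: str, mut: str) -> float:
    """Average the per-model log-odds scores across the five-seed ensemble."""
    scores = []
    for load in LOADERS:
        model, alphabet = load()
        model.eval()
        scores.append(masked_marginal_score(model, alphabet, seq, pos, wt, mut))
    return sum(scores) / len(scores)
```

In practice one would keep all five models in memory or score whole variant batches per model rather than reloading per call; this sketch favors brevity over throughput.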
ESM-1v is a 650-million parameter transformer with 33 layers, trained with the masked language modeling (MLM) objective on the UniRef90 2020-03 release (approximately 98 million protein sequences). The model uses the same architecture as ESM-1b — including multi-head self-attention with learned positional embeddings — but its training data was chosen specifically to suit variant effect prediction. Perplexity on held-out sequences is 7.29, reflecting strong generalization to unseen protein families.
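For reference, the MLM objective minimizes the negative log-likelihood of masked residues given the rest of the sequence; in the standard formulation (notation here is generic rather than copied from the paper):

\[
\mathcal{L}_{\mathrm{MLM}} \;=\; \mathbb{E}_{x \sim X}\;\mathbb{E}_{M}\;\sum_{i \in M} -\log p\left(x_i \mid x_{/M}\right),
\]

where \(M\) is the randomly sampled set of masked positions and \(x_{/M}\) is the sequence with those positions masked out. The masked-marginal scoring described above queries this same conditional distribution at inference time.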
Benchmark evaluation across 41 deep mutational scanning (DMS) datasets — spanning fluorescent proteins, enzymes, antibodies, and viral proteins — showed that ESM-1v achieves an average Spearman rank correlation of approximately 0.51, matching MSA-based state-of-the-art methods including EVMutation and DeepSequence (both ~0.51 average Spearman ρ), without any task-specific model training. The ensemble of five models outperforms single-model scoring and exceeds DeepSequence on 17 of the 41 DMS datasets. ESM-1v substantially outperforms earlier single-sequence protein language models including TAPE, UniRep, ProtBERT-BFD, and ESM-1b (ρ ≈ 0.46 average), demonstrating the benefit of training at the 90% sequence identity clustering level.
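The evaluation metric is straightforward to reproduce for any protein with DMS data: rank-correlate model scores against measured fitness. A toy example with hypothetical numbers, using SciPy:

```python
from scipy.stats import spearmanr

# Hypothetical parallel arrays: ESM-1v log-odds scores and experimental
# DMS fitness measurements for the same five variants.
predicted = [-3.1, -0.2, 0.5, -1.8, 0.1]
measured = [0.12, 0.95, 1.10, 0.30, 0.88]

rho, pvalue = spearmanr(predicted, measured)
print(f"Spearman rho = {rho:.3f}")  # prints 0.900 for this toy data
```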
ESM-1v is broadly applicable wherever researchers need to prioritize protein variants for experimental testing. In protein engineering, it allows rapid in silico screening of large mutant libraries — identifying which substitutions are most likely to preserve or improve function before any wet-lab work is performed. In clinical genetics, it can help interpret the pathogenicity of missense variants in disease-relevant proteins, complementing tools like SIFT and PolyPhen. In antibody engineering, ESM-1v scoring can identify stability-maintaining mutations in the complementarity-determining regions (CDRs). The model also serves as a pretraining foundation for supervised variant effect predictors: fine-tuning ESM-1v on a small set of labeled DMS measurements for a protein of interest further boosts accuracy beyond zero-shot performance, making it a practical starting point for targeted protein optimization campaigns.
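To make the screening use case concrete, the sketch below scores every single substitution in a sequence from one unmasked forward pass. This is the paper's cheaper "wt-marginals" scheme (masked marginals score slightly better but require one pass per position); the sequence and the commented ranking step are illustrative.

```python
import torch
import esm

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

model, alphabet = esm.pretrained.esm1v_t33_650M_UR90S_1()
model.eval()
batch_converter = alphabet.get_batch_converter()

def scan_single_mutants(seq: str) -> dict:
    """Score all 19 * len(seq) substitutions with a single forward pass."""
    _, _, tokens = batch_converter([("protein", seq)])
    with torch.no_grad():
        log_probs = torch.log_softmax(model(tokens)["logits"][0], dim=-1)
    scores = {}
    for i, wt in enumerate(seq):
        row = log_probs[i + 1]  # +1 skips the BOS token
        for mut in AMINO_ACIDS:
            if mut != wt:
                key = f"{wt}{i + 1}{mut}"  # e.g. "V39A", 1-based position
                scores[key] = (row[alphabet.get_idx(mut)]
                               - row[alphabet.get_idx(wt)]).item()
    return scores

# Rank the library: the highest-scoring substitutions are those the model
# predicts are most likely to be tolerated.
# ranked = sorted(scan_single_mutants(my_seq).items(), key=lambda kv: -kv[1])
```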
ESM-1v established that large protein language models trained solely on sequence data could match the variant effect prediction accuracy of methods that require explicit co-evolutionary modeling via MSAs. This finding challenged the prevailing assumption that MSA generation was a necessary preprocessing step for unsupervised fitness prediction. The NeurIPS 2021 paper has been widely cited in the protein engineering and variant effect prediction literature, and ESM-1v remains a standard baseline in benchmarks such as ProteinGym. Its release as part of the broader ESM model family (alongside ESM-1b and later ESM-2 and ESMFold) helped establish Meta AI as a leading force in protein language model research. A notable limitation is that ESM-1v was designed and benchmarked for single amino acid substitutions; its additive scoring assumption becomes less reliable for combinations of many mutations or in highly epistatic fitness landscapes.
Meier, J., Rao, R., Verkuil, R., Liu, J., Sercu, T., & Rives, A. (2021). Language models enable zero-shot prediction of the effects of mutations on protein function. Advances in Neural Information Processing Systems 34 (NeurIPS 2021). Preprint: bioRxiv, DOI: 10.1101/2021.07.09.450648.