All Competitors
Every biological foundation model, evaluated and ranked by the bio.rodeo team
AlphaGenome
Google DeepMind
Predicts thousands of functional genomic tracks at single base-pair resolution from megabase-scale DNA sequences.
Evo 2
Arc Institute
Genomic foundation model trained on 9.3 trillion DNA base pairs spanning all domains of life, with 40B parameters and a 1-million-token context window.
Borzoi
Calico Life Sciences
Deep learning model predicting cell-type-specific RNA-seq coverage at 32 bp resolution from 524 kb of DNA sequence, jointly modeling transcription, splicing, and polyadenylation.
SPIRED-Fitness
Tsinghua University
End-to-end framework predicting protein structure and mutational fitness from a single sequence, with 5x faster inference than ESMFold at comparable accuracy.
Caduceus
Kuleshov Lab
Bidirectional, reverse-complement equivariant DNA language model built on Mamba SSMs; it outperforms models 10x larger on long-range variant effect prediction.
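As a sketch of what reverse-complement (RC) equivariance means in practice, the property check below is illustrative, not the Caduceus API: `model` stands in for any map from one-hot DNA of shape (batch, length, 4) in A/C/G/T channel order to per-position outputs of the same shape.

```python
import torch

def reverse_complement(x: torch.Tensor) -> torch.Tensor:
    # For one-hot DNA (batch, length, 4) in A, C, G, T order, flipping the
    # channel axis swaps A<->T and C<->G (complement), and flipping the
    # length axis reverses the sequence.
    return torch.flip(x, dims=(-2, -1))

def is_rc_equivariant(model, x: torch.Tensor, atol: float = 1e-5) -> bool:
    # RC equivariance: predicting on the reverse-complement strand yields
    # the reverse-complement of the original prediction, so scores do not
    # depend on which strand the sequence is read from.
    return torch.allclose(model(reverse_complement(x)),
                          reverse_complement(model(x)), atol=atol)
```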
GPN
Song Lab
A DNA language model for unsupervised genome-wide variant effect prediction, trained on multispecies genomes via masked language modeling without functional annotation labels.
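A minimal sketch of that unsupervised recipe: mask the variant site, then score alt vs. ref by the masked-LM log-likelihood ratio. The checkpoint id, the one-token-per-base assumption, and the no-leading-special-token offset are assumptions to verify against the released GPN code.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Example checkpoint id; the Song lab publishes GPN weights on Hugging Face
# with custom model code, hence trust_remote_code.
MODEL_ID = "songlab/gpn-brassicales"
tok = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForMaskedLM.from_pretrained(MODEL_ID, trust_remote_code=True).eval()

@torch.no_grad()
def variant_llr(seq: str, pos: int, ref: str, alt: str) -> float:
    """log P(alt) - log P(ref) with `pos` masked; lower scores flag deleterious alleles."""
    assert seq[pos] == ref
    ids = tok(seq, return_tensors="pt")["input_ids"]
    # Assumes one token per base and no prepended special token; shift the
    # index if the tokenizer you load adds one.
    ids[0, pos] = tok.mask_token_id
    logp = torch.log_softmax(model(input_ids=ids).logits[0, pos], dim=-1)
    # Assumes a lowercase nucleotide vocabulary; adjust case if yours differs.
    return (logp[tok.convert_tokens_to_ids(alt.lower())]
            - logp[tok.convert_tokens_to_ids(ref.lower())]).item()
```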
GPN-MSA
UC Berkeley
Transformer-based DNA language model using whole-genome multispecies alignments for genome-wide variant effect prediction across coding and non-coding regions.
SaProt
Westlake University
Protein language model combining amino acid and Foldseek 3Di structural tokens, outperforming ESM-2 across 10 downstream tasks including mutation effect prediction.
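A toy sketch of that structure-aware vocabulary idea: pair each amino acid with a Foldseek 3Di state so a single token carries sequence and local structure jointly. The alphabets and '#' mask symbol follow the paper's description; the real tokenizer ships with SaProt's released code.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY" + "#"   # '#' = unknown/masked residue
FOLDSEEK_3DI = "acdefghiklmnpqrstvwy" + "#"  # '#' = unknown/masked structure

# Every (amino acid, 3Di state) pair becomes one combined token.
pairs = [aa + s for aa in AMINO_ACIDS for s in FOLDSEEK_3DI]
vocab = {token: i for i, token in enumerate(pairs)}

print(len(vocab))   # 441 = 21 residue symbols x 21 structure symbols
print(vocab["Ad"])  # e.g. alanine whose local backbone is 3Di state 'd'
```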
seq2cells
GSK.ai
Transfer learning framework that predicts single-cell gene expression from ~200 kb DNA sequences using Enformer embeddings and a lightweight MLP.
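A minimal sketch of that transfer-learning head, assuming precomputed embeddings for a gene's TSS window: 3072 is Enformer's published embedding width, but the hidden size, cell count, and Softplus output are placeholders, not the paper's exact head.

```python
import torch.nn as nn

class CellHead(nn.Module):
    """Maps a frozen Enformer TSS embedding to per-cell expression."""
    def __init__(self, emb_dim: int = 3072, hidden: int = 512, n_cells: int = 10_000):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(emb_dim, hidden), nn.GELU(),
            nn.Linear(hidden, n_cells), nn.Softplus(),  # non-negative expression
        )

    def forward(self, tss_embedding):   # (batch, 3072), Enformer kept frozen
        return self.mlp(tss_embedding)  # (batch, n_cells)
```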
Nucleotide Transformer
InstaDeep
A family of DNA foundation models (500M–2.5B parameters) trained on 3,200+ human genomes and 850 species for genomic sequence understanding and variant effect prediction.
CARP
Microsoft Research
CNN-based protein language model series showing that convolutions can match transformer performance on sequence pretraining while scaling linearly with sequence length.
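A conceptual sketch of the linear-scaling point, using a ByteNet-style dilated 1-D convolution stack like CARP's (layer sizes here are illustrative): each layer costs O(L·k) versus self-attention's O(L²), while dilation grows the receptive field exponentially with depth.

```python
import torch.nn as nn

def dilated_block(channels: int, dilation: int) -> nn.Module:
    # Effective kernel span is (5 - 1) * dilation + 1, so padding of
    # 2 * dilation keeps the sequence length unchanged.
    return nn.Sequential(
        nn.Conv1d(channels, channels, kernel_size=5,
                  padding=2 * dilation, dilation=dilation),
        nn.GELU(),
    )

# Dilations 1, 2, 4, ..., 128: receptive field roughly doubles per layer.
# Input is (batch, channels, length), as Conv1d expects.
encoder = nn.Sequential(*[dilated_block(128, 2 ** i) for i in range(8)])
```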
Enformer
Google DeepMind
Transformer model that predicts gene expression and regulatory activity from 200 kb DNA sequences, capturing enhancer-promoter interactions up to 100 kb away.
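The published Enformer shapes behind that description, as a quick one-hot encoding sketch: 196,608 bp of sequence in; 5,313 human tracks over 896 bins of 128 bp out, covering the central ~115 kb of the input.

```python
import numpy as np

SEQ_LEN, BIN_SIZE, N_BINS, N_TRACKS = 196_608, 128, 896, 5_313

def one_hot(seq: str) -> np.ndarray:
    """(L, 4) one-hot over A, C, G, T; unknown bases become all-zero rows."""
    idx = np.frombuffer(seq.upper().encode(), dtype=np.uint8)
    lut = np.zeros((256, 4), dtype=np.float32)
    for i, base in enumerate(b"ACGT"):
        lut[base, i] = 1.0
    return lut[idx]

x = one_hot("N" * SEQ_LEN)          # (196608, 4) model input
print(x.shape, (N_BINS, N_TRACKS))  # output shape per species head: (896, 5313)
print(N_BINS * BIN_SIZE)            # 114688 bp of predicted territory
```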
ESM-1v
Meta AI
Protein language model for zero-shot prediction of mutation effects, achieving state-of-the-art accuracy on deep mutational scanning benchmarks without MSA generation.
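A sketch of that zero-shot scoring in the masked-marginal style the ESM-1v paper describes, assuming the fair-esm package (the checkpoint below is one of the five released ESM-1v models): mask the mutated position and compare log-probabilities of mutant vs. wild-type residue, with no MSA in sight.

```python
import torch
import esm

model, alphabet = esm.pretrained.esm1v_t33_650M_UR90S_1()
model.eval()
batch_converter = alphabet.get_batch_converter()

@torch.no_grad()
def mutation_score(seq: str, pos: int, wt: str, mt: str) -> float:
    """log P(mt) - log P(wt) at 0-based `pos`; higher favors the mutation."""
    assert seq[pos] == wt
    _, _, tokens = batch_converter([("wt", seq)])
    tokens[0, pos + 1] = alphabet.mask_idx          # +1 for the prepended BOS/CLS
    logits = model(tokens)["logits"][0, pos + 1]
    logp = torch.log_softmax(logits, dim=-1)
    return (logp[alphabet.get_idx(mt)] - logp[alphabet.get_idx(wt)]).item()
```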
ESM-1b
Meta AI
Transformer protein language model trained on 250 million protein sequences that learns structural and functional representations without supervision.