All Competitors
Every biological foundation model, evaluated and ranked by the bio.rodeo team
AlphaGenome
Google DeepMind
Predicts thousands of functional genomic tracks at single base-pair resolution from megabase-scale DNA sequences.
Evo 2
Arc Institute
Genomic foundation model trained on 9.3 trillion DNA base pairs spanning all domains of life, with 40B parameters and a 1-million-token context window.
Evo
Arc Institute
A 7B-parameter genomic foundation model using the StripedHyena architecture to model prokaryotic DNA, RNA, and proteins at single-nucleotide resolution with a 131k-token context window.
gLM
Harvard University / MIT
Genomic language model trained on metagenomic scaffolds that learns protein co-regulation and function by modeling gene context and operon structure.
Caduceus
Kuleshov Lab
A family of bidirectional, reverse-complement equivariant DNA language models built on Mamba SSMs, outperforming models 10x larger on long-range variant effect prediction.
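A minimal sketch of the reverse-complement (RC) symmetry Caduceus is built around. Here the symmetry is imposed post hoc by averaging an arbitrary scorer over both strands; Caduceus instead bakes RC equivariance into its parameter sharing. The scorer and sequence below are purely illustrative.

```python
# RC symmetry by two-strand averaging: f(seq) == f(reverse_complement(seq)).
# Caduceus achieves this inside the architecture; this only shows the property.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}

def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[b] for b in reversed(seq))

def strand_naive_score(seq: str) -> float:
    # Toy stand-in for a model: position-weighted GC content (not RC-symmetric on its own).
    return sum((i + 1) * (b in "GC") for i, b in enumerate(seq)) / len(seq)

def rc_symmetric_score(seq: str) -> float:
    # Averaging over both strands guarantees the same score for either strand.
    return 0.5 * (strand_naive_score(seq) + strand_naive_score(reverse_complement(seq)))

if __name__ == "__main__":
    s = "ACGTTGCAGGT"
    assert rc_symmetric_score(s) == rc_symmetric_score(reverse_complement(s))
    print(rc_symmetric_score(s))
```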
GPN
Song Lab
A DNA language model for unsupervised genome-wide variant effect prediction, trained on multispecies genomes via masked language modeling without functional annotation labels.
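The zero-shot recipe GPN popularized: mask the variant position and compare the model's probability of the alternate versus the reference base. The sketch below uses the Hugging Face masked-LM interface; the checkpoint name and the assumption that the tokenizer emits one token per nucleotide with a mask token and no added special tokens are illustrative, so check the released gpn package for the exact API.

```python
# Zero-shot variant effect scoring with a masked DNA language model (sketch).
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL_NAME = "songlab/gpn-brassicales"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME, trust_remote_code=True).eval()

def variant_effect(window: str, pos: int, ref: str, alt: str) -> float:
    """log P(alt) - log P(ref) at pos (0-based within window); more negative = more deleterious."""
    ids = tokenizer(window, return_tensors="pt")["input_ids"]
    ids[0, pos] = tokenizer.mask_token_id              # mask the variant site
    with torch.no_grad():
        logits = model(input_ids=ids).logits[0, pos]
    logp = torch.log_softmax(logits, dim=-1)
    ref_id = tokenizer.convert_tokens_to_ids(ref)
    alt_id = tokenizer.convert_tokens_to_ids(alt)
    return (logp[alt_id] - logp[ref_id]).item()
```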
GPN-MSA
UC Berkeley
Transformer-based DNA language model using whole-genome multispecies alignments for genome-wide variant effect prediction across coding and non-coding regions.
HyenaDNA
HazyResearch
Genomic foundation model using the Hyena operator to process DNA at single-nucleotide resolution with context lengths up to 1 million tokens, 500x longer than transformer-based predecessors.
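The long-convolution core of the Hyena operator is what lets HyenaDNA scale to million-token contexts: a length-L convolution computed with the FFT costs O(L log L) rather than the O(L²) of dense attention. The sketch below shows only that piece; Hyena's implicit filter parametrization and data-controlled gating are omitted, and the filter here is a toy stand-in.

```python
# FFT-based long convolution (the core of the Hyena operator), in numpy.
import numpy as np

def fft_long_conv(x: np.ndarray, h: np.ndarray) -> np.ndarray:
    """Convolve signal x with filter h, both of length L, in O(L log L)."""
    L = x.shape[-1]
    n = 2 * L                                  # zero-pad so circular conv matches linear conv
    X = np.fft.rfft(x, n=n)
    H = np.fft.rfft(h, n=n)
    return np.fft.irfft(X * H, n=n)[..., :L]

if __name__ == "__main__":
    L = 1_000_000                              # single-nucleotide tokens at HyenaDNA scale
    x = np.random.randn(L).astype(np.float32)
    h = np.exp(-np.arange(L) / 256.0)          # toy decaying filter standing in for the learned one
    y = fft_long_conv(x, h)
    print(y.shape)
```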
DNABERT-2
MAGICS Lab
Multi-species genomic foundation model replacing k-mer tokenization with byte-pair encoding (BPE), achieving state-of-the-art performance with 21x fewer parameters than prior leading models.
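A toy illustration of BPE on DNA, the tokenization DNABERT-2 adopts in place of overlapping k-mers: starting from single bases, the most frequent adjacent pair is repeatedly merged into a new token. The real vocabulary is learned over whole genomes; the corpus and merge count below are illustrative.

```python
# Minimal BPE merge loop over DNA strings (illustration only).
from collections import Counter

def learn_bpe(seqs: list[str], num_merges: int) -> list[tuple[str, str]]:
    corpus = [list(s) for s in seqs]               # start from single-nucleotide tokens
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for toks in corpus:
            pairs.update(zip(toks, toks[1:]))
        if not pairs:
            break
        (a, b), _ = pairs.most_common(1)[0]        # most frequent adjacent pair
        merges.append((a, b))
        for toks in corpus:                        # apply the merge left to right
            i = 0
            while i < len(toks) - 1:
                if toks[i] == a and toks[i + 1] == b:
                    toks[i : i + 2] = [a + b]
                else:
                    i += 1
    return merges

if __name__ == "__main__":
    print(learn_bpe(["TATATATAGC", "GCGCTATA", "TATAAT"], num_merges=4))
```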
GENA-LM
AIRI Institute
A family of transformer-based DNA language models supporting context lengths up to 36,000 bp via BPE tokenization and BigBird sparse attention.
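The BigBird attention pattern GENA-LM relies on mixes a local sliding window, a few global tokens, and random connections, so the number of attended positions per token grows linearly with sequence length instead of quadratically. The sketch builds such a boolean mask; the window size, global-token count, and random-connection count are illustrative, not GENA-LM's settings.

```python
# BigBird-style sparse attention mask (sketch): window + global + random.
import numpy as np

def bigbird_mask(seq_len: int, window: int = 3, n_global: int = 2,
                 n_random: int = 2, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    for i in range(seq_len):
        lo, hi = max(0, i - window), min(seq_len, i + window + 1)
        mask[i, lo:hi] = True                               # sliding window
        mask[i, rng.choice(seq_len, size=n_random)] = True  # random connections
    mask[:, :n_global] = True                               # every token attends to global tokens
    mask[:n_global, :] = True                               # global tokens attend to everything
    return mask

if __name__ == "__main__":
    m = bigbird_mask(16)
    print(m.sum(axis=1))  # attended positions per token stays small and roughly constant
```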
Nucleotide Transformer
InstaDeep
A family of DNA foundation models (500M–2.5B parameters) trained on 3,200+ human genomes and 850 species for genomic sequence understanding and variant effect prediction.
MoDNA
University of Texas at Arlington
Motif-oriented DNA pre-training framework using an ELECTRA-style generator-discriminator architecture to learn biologically informed genomic representations.
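The ELECTRA-style signal MoDNA builds on: a generator fills masked positions and a discriminator labels every position as original or replaced. The sketch below only constructs the discriminator's targets, using a uniform random sampler as a stand-in generator; MoDNA's actual generator and its motif-oriented objectives are not modeled here.

```python
# Replaced-token-detection targets, ELECTRA-style (toy generator).
import random

BASES = "ACGT"

def electra_example(seq: str, mask_rate: float = 0.15, seed: int = 0):
    rng = random.Random(seed)
    corrupted, labels = [], []
    for base in seq:
        if rng.random() < mask_rate:
            proposal = rng.choice(BASES)          # toy "generator" fills the masked site
            corrupted.append(proposal)
            labels.append(int(proposal != base))  # 1 = replaced, 0 = generator reproduced the original
        else:
            corrupted.append(base)
            labels.append(0)
    return "".join(corrupted), labels

if __name__ == "__main__":
    corrupted, labels = electra_example("ACGTAGCTAGGCTTACGATCG")
    print(corrupted)
    print(labels)  # discriminator targets: which positions were replaced
```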
DNABERT
Northwestern University
BERT-based pre-trained model for DNA sequences using k-mer tokenization. Achieves state-of-the-art performance on promoter, splice site, and transcription factor binding prediction.
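A minimal sketch of the overlapping k-mer tokenization DNABERT uses (k = 3 to 6 in the paper): each token is a k-length window shifted by one base, so adjacent tokens share k-1 nucleotides.

```python
# Overlapping k-mer tokenization of a DNA sequence.
def kmer_tokenize(seq: str, k: int = 6) -> list[str]:
    return [seq[i : i + k] for i in range(len(seq) - k + 1)]

if __name__ == "__main__":
    print(kmer_tokenize("ATGGCTAGCTA", k=6))
    # ['ATGGCT', 'TGGCTA', 'GGCTAG', 'GCTAGC', 'CTAGCT', 'TAGCTA']
```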