All Competitors
Every biological foundation model, evaluated and ranked by the bio.rodeo team
AlphaGenome
Google DeepMind
Model that predicts thousands of functional genomic tracks at single base-pair resolution from megabase-scale DNA sequences.
Evo 2
Arc Institute
Genomic foundation model trained on 9.3 trillion DNA base pairs spanning all domains of life, with 40B parameters and a 1-million-token context window.
Borzoi
Calico Life Sciences
Deep learning model predicting cell-type-specific RNA-seq coverage at 32 bp resolution from 524 kb of DNA sequence, jointly modeling transcription, splicing, and polyadenylation.
Evo
Arc Institute
A 7B parameter genomic foundation model using StripedHyena architecture to model prokaryotic DNA, RNA, and proteins at single-nucleotide resolution with 131k token context.
gLM
Harvard University / MIT
Genomic language model trained on metagenomic scaffolds that learns protein co-regulation and function by modeling gene context and operon structure.
Caduceus
Kuleshov Lab
Bidirectional, reverse-complement equivariant DNA language models built on Mamba SSMs, outperforming models 10x larger on long-range variant effect prediction.
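Reverse-complement equivariance means the model treats a DNA sequence and its reverse complement as the same biological object: per-base outputs on one strand are the mirrored outputs on the other. A minimal sketch of the property (with a hypothetical stand-in `score` function, not the Caduceus model itself):

```python
# Sketch of the reverse-complement (RC) equivariance property Caduceus
# builds in architecturally. `score` is a hypothetical per-base scorer
# used only to illustrate the symmetry; it is not the real model.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def score(seq: str) -> list[float]:
    # Hypothetical strand-symmetric per-base score: G/C -> 1.0, A/T -> 0.0.
    return [1.0 if base in "GC" else 0.0 for base in seq]

seq = "ATGCGT"
rc = reverse_complement(seq)  # "ACGCAT"
# RC equivariance: scoring the RC strand gives the reversed
# per-base outputs of the forward strand.
assert score(rc) == list(reversed(score(seq)))
```

An equivariant architecture guarantees this identity for the learned model, rather than hoping it is picked up from data augmentation.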
DNABERT-S
MAGICS Lab
Species-aware DNA embedding model built on DNABERT-2, using contrastive learning to cluster and differentiate genomic sequences by species without labeled data.
GPN
Song Lab
A DNA language model for unsupervised genome-wide variant effect prediction, trained on multispecies genomes via masked language modeling without functional annotation labels.
GPN-MSA
UC Berkeley
Transformer-based DNA language model using whole-genome multispecies alignments for genome-wide variant effect prediction across coding and non-coding regions.
seq2cells
GSK.ai
Transfer learning framework that predicts single-cell gene expression from ~200kb DNA sequences using Enformer embeddings and a lightweight MLP.
EpiGePT
Tsinghua University
Transformer model predicting context-specific epigenomic signals across cell types using DNA sequence and transcription factor activity profiles.
DNAGPT
Tencent AI Lab Healthcare
A GPT-based foundation model pre-trained on 200B+ base pairs from mammalian genomes, supporting DNA sequence generation, classification, and regression.
HyenaDNA
Hazy Research
Genomic foundation model using the Hyena operator to process DNA at single-nucleotide resolution with context lengths up to 1 million tokens, 500x longer than transformer-based predecessors.
DNABERT-2
MAGICS Lab
Multi-species genomic foundation model replacing k-mer tokenization with BPE, achieving state-of-the-art performance with 21x fewer parameters than prior leading models.
GENA-LM
AIRI Institute
A family of transformer-based DNA language models supporting context lengths up to 36,000 bp via BPE tokenization and BigBird sparse attention.
Species-Aware DNA Language Model
Technical University of Munich
Masked DNA language model trained on 800+ species spanning 500M years of evolution, using explicit species conditioning to capture conserved regulatory elements.
Nucleotide Transformer
InstaDeep
A family of DNA foundation models (500M–2.5B parameters) trained on 3,200+ human genomes and 850 species for genomic sequence understanding and variant effect prediction.
Microbial Gene NLP
Burstein Lab
A word2vec-based language model trained on 360 million microbial genes that predicts gene function from genomic context without sequence homology.
MoDNA
University of Texas at Arlington
Motif-oriented DNA pre-training framework using an ELECTRA-style generator-discriminator architecture to learn biologically informed genomic representations.
Enformer
Google DeepMind
Transformer model that predicts gene expression and regulatory activity from 200kb DNA sequences, capturing enhancer-promoter interactions up to 100kb away.
DNABERT
Northwestern University
BERT-based pre-trained model for DNA sequences using k-mer tokenization. Achieves state-of-the-art performance on promoter, splice site, and transcription factor binding prediction.
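DNABERT's k-mer tokenization slides a window of length k across the sequence with stride 1, so adjacent tokens overlap by k-1 bases. A minimal sketch of that scheme:

```python
# Sketch of DNABERT-style overlapping k-mer tokenization: every
# contiguous length-k substring becomes one token (stride 1), so
# neighboring tokens share k-1 bases.
def kmer_tokenize(seq: str, k: int = 6) -> list[str]:
    """Split a DNA sequence into overlapping k-mer tokens."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

print(kmer_tokenize("ATGGCTA", k=3))
# → ['ATG', 'TGG', 'GGC', 'GCT', 'CTA']
```

The overlap preserves local context but inflates sequence length and vocabulary (4^k tokens), which is the inefficiency DNABERT-2's BPE tokenizer was designed to remove.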