All Competitors
Every biological foundation model, evaluated and ranked by the bio.rodeo team
Evo 2
Arc Institute
Genomic foundation model trained on 9.3 trillion DNA base pairs spanning all domains of life, with 40B parameters and a 1-million-token context window.
Evo
Arc Institute
A 7B-parameter genomic foundation model using the StripedHyena architecture to model prokaryotic DNA, RNA, and proteins at single-nucleotide resolution with a 131k-token context window.
DNABERT-S
MAGICS Lab
Species-aware DNA embedding model built on DNABERT-2, using contrastive learning to cluster and differentiate genomic sequences by species without labeled data.
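The core trick in DNABERT-S is worth seeing concretely: fragments drawn from the same genome are treated as positive pairs and pulled together in embedding space, while fragments from other species in the batch act as negatives. Below is a minimal PyTorch sketch of that InfoNCE-style objective; the embeddings are random stand-ins, and the published method adds curriculum and mixing strategies on top of this basic loss.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.07) -> torch.Tensor:
    """Contrastive loss over a batch of paired embeddings.

    z1[i] and z2[i] embed two fragments from the same genome (a positive
    pair); every other row in the batch acts as a negative. This is the
    generic InfoNCE objective, not the exact DNABERT-S training recipe.
    """
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    logits = z1 @ z2.T / temperature       # (B, B) similarity matrix
    targets = torch.arange(z1.size(0))     # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

# Toy usage: 8 positive pairs of 128-d fragment embeddings.
z1, z2 = torch.randn(8, 128), torch.randn(8, 128)
print(info_nce_loss(z1, z2))
```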
CaLM
OPIG (Oxford Protein Informatics Group)
Codon-level masked language model that captures genomic signals invisible to amino-acid models, outperforming billion-parameter protein language models with just 86M parameters.
GPN
Song Lab
A DNA language model for unsupervised genome-wide variant effect prediction, trained on multispecies genomes via masked language modeling without functional annotation labels.
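GPN's label-free variant scoring (also used by GPN-MSA, next) reduces to a masked-LM likelihood ratio: mask the variant site, then compare the model's probability for the alternate allele against the reference. A minimal sketch of that recipe, assuming a HuggingFace-style masked language model with single-nucleotide tokens; the checkpoint name is deliberately left as a placeholder rather than the authors' exact release.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

def llr_variant_score(model, tokenizer, seq: str, pos: int, ref: str, alt: str) -> float:
    """Score a variant as log P(alt) - log P(ref) at a masked position.

    More negative scores mean the model finds the alternate allele less
    plausible in context, a common proxy for deleteriousness.
    """
    assert seq[pos] == ref, "reference allele must match the input sequence"
    masked = seq[:pos] + tokenizer.mask_token + seq[pos + 1:]
    inputs = tokenizer(masked, return_tensors="pt")
    mask_pos = (inputs.input_ids[0] == tokenizer.mask_token_id).nonzero().item()
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    log_probs = logits.log_softmax(dim=-1)
    return (log_probs[tokenizer.convert_tokens_to_ids(alt)]
            - log_probs[tokenizer.convert_tokens_to_ids(ref)]).item()

# Usage (checkpoint name is a placeholder, not a real release path):
# tok = AutoTokenizer.from_pretrained("some/dna-mlm-checkpoint")
# mlm = AutoModelForMaskedLM.from_pretrained("some/dna-mlm-checkpoint")
# print(llr_variant_score(mlm, tok, "ACGTACGTACG", pos=5, ref="C", alt="T"))
```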
GPN-MSA
UC Berkeley
Transformer-based DNA language model using whole-genome multispecies alignments for genome-wide variant effect prediction across coding and non-coding regions.
DNAGPT
Tencent AI Lab Healthcare
A GPT-based foundation model pre-trained on 200B+ base pairs from mammalian genomes, supporting DNA sequence generation, classification, and regression.
DNABERT-2
MAGICS Lab
Multi-species genomic foundation model replacing overlapping k-mer tokenization with byte-pair encoding (BPE), achieving state-of-the-art performance with 21x fewer parameters than prior leading models.
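The tokenization swap is the headline change: DNABERT (last entry below) slides an overlapping k-mer window over the sequence, so adjacent tokens share k-1 bases, while DNABERT-2 learns variable-length BPE merges from genome data. A toy contrast of the two schemes using the HuggingFace tokenizers library; the corpus and vocabulary size here are illustrative, not the published training setup.

```python
from tokenizers import Tokenizer, models, trainers

def kmer_tokenize(seq: str, k: int = 6) -> list[str]:
    """DNABERT-style overlapping k-mers: adjacent tokens share k-1 bases."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

print(kmer_tokenize("ATGCGTACG"))
# ['ATGCGT', 'TGCGTA', 'GCGTAC', 'CGTACG']

# DNABERT-2-style BPE: learn merges from raw sequence instead.
# Toy corpus and vocab size; the real model trains on multi-species genomes.
tokenizer = Tokenizer(models.BPE(unk_token="[UNK]"))
trainer = trainers.BpeTrainer(vocab_size=64, special_tokens=["[UNK]"])
corpus = ["ATGCGTACGATCGATCGTACGATCG" * 4, "GGGCCCATATATGCGCGC" * 4]
tokenizer.train_from_iterator(corpus, trainer)
print(tokenizer.encode("ATGCGTACGATCG").tokens)  # variable-length tokens
```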
GENA-LM
AIRI (Artificial Intelligence Research Institute)
A family of transformer-based DNA language models supporting context lengths up to 36,000 bp via BPE tokenization and BigBird sparse attention.
Nucleotide Transformer
InstaDeep
A family of DNA foundation models (500M–2.5B parameters) trained on 3,200+ human genomes and 850 species for genomic sequence understanding and variant effect prediction.
MoDNA
University of Texas at Arlington
Motif-oriented DNA pre-training framework using an ELECTRA-style generator-discriminator architecture to learn biologically informed genomic representations.
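ELECTRA-style pre-training, which MoDNA adapts to genomics, pairs a small generator that fills in masked tokens with a discriminator trained to detect which tokens were replaced. A compressed PyTorch sketch of that replaced-token-detection objective on a 4-letter nucleotide vocabulary; the two tiny networks are stand-ins for MoDNA's motif-aware models, and the loss weighting follows the original ELECTRA paper's convention.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB = 4  # A, C, G, T token ids
generator = nn.Sequential(nn.Embedding(VOCAB, 32), nn.Linear(32, VOCAB))  # tiny MLM head
discriminator = nn.Sequential(nn.Embedding(VOCAB, 32), nn.Linear(32, 1))  # replaced-token head

tokens = torch.randint(0, VOCAB, (2, 16))      # toy batch of DNA token ids
mask = torch.rand(tokens.shape) < 0.15         # ~15% of positions get corrupted
mask[:, 0] = True                              # guarantee at least one masked slot

# 1) Generator proposes replacements at the masked positions.
gen_logits = generator(tokens)
sampled = torch.distributions.Categorical(logits=gen_logits).sample()
corrupted = torch.where(mask, sampled, tokens)

# 2) Discriminator labels every token as original vs. replaced.
is_replaced = (corrupted != tokens).float()
disc_logits = discriminator(corrupted).squeeze(-1)

gen_loss = F.cross_entropy(gen_logits[mask], tokens[mask])  # MLM loss on masked slots
disc_loss = F.binary_cross_entropy_with_logits(disc_logits, is_replaced)
loss = gen_loss + 50.0 * disc_loss             # ELECTRA weights the discriminator heavily
print(loss.item())
```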
BioSeq-BLM
Beijing Institute of Technology
An integrated platform implementing 155 biological language models for analyzing DNA, RNA, and protein sequences across residue-level and sequence-level tasks.
Enformer
Google DeepMind
Transformer model that predicts gene expression and regulatory activity from 200kb DNA sequences, capturing enhancer-promoter interactions up to 100kb away.
DNABERT
Northwestern University
BERT-based pre-trained model for DNA sequences using overlapping k-mer tokenization. Achieved state-of-the-art performance at release on promoter, splice site, and transcription factor binding site prediction.