All Competitors

Every biological foundation model, evaluated and ranked by the bio.rodeo team

14 models listed below.

DNA & Gene

Evo 2

Arc Institute

Genomic foundation model trained on 9.3 trillion DNA base pairs spanning all domains of life, with 40B parameters and a 1-million-token context window.

DNA & Gene

Evo

Arc Institute

A 7B-parameter genomic foundation model using the StripedHyena architecture to model prokaryotic DNA, RNA, and proteins at single-nucleotide resolution, with a 131k-token context window.

DNA & Gene

DNABERT-S

MAGICS Lab

Species-aware DNA embedding model built on DNABERT-2, using contrastive learning to cluster and differentiate genomic sequences by species without labeled data.

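The contrastive objective behind a model like DNABERT-S can be illustrated with a generic InfoNCE loss: two augmented "views" of the same sequence are pulled together in embedding space while other sequences in the batch act as negatives. This is a minimal sketch with random vectors standing in for model embeddings, not the exact DNABERT-S training objective.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.07):
    """Contrastive (InfoNCE) loss over L2-normalized embeddings.

    Each anchor's positive is the row at the same index in `positives`;
    every other row in the batch serves as a negative. Generic sketch,
    not the published DNABERT-S objective.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                   # pairwise cosine similarities
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_softmax))            # NLL of each matching pair

# Hypothetical embeddings: a second "view" is the same vector plus small noise.
rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 32))
noise = emb + 0.05 * rng.normal(size=emb.shape)
loss = info_nce(emb, noise)
```

Minimizing this loss clusters embeddings of the same underlying sequence, which is what lets the model separate species without labels.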
Protein

CaLM

Oxpig

Codon-level BERT model that captures genomic signals invisible to amino acid models, outperforming billion-parameter PLMs with just 86M parameters.

DNA & Gene

GPN

Song Lab

A DNA language model for unsupervised genome-wide variant effect prediction, trained on multispecies genomes via masked language modeling without functional annotation labels.

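Unsupervised variant effect prediction with a masked language model typically reduces to a log-likelihood ratio: mask the variant site, ask the model for a distribution over nucleotides, and compare the alternate allele's probability to the reference's. A sketch of that scoring step, with the model's output distribution supplied by hand since no trained model is bundled here (the values are illustrative, not GPN outputs):

```python
import math

def variant_effect_score(probs, ref, alt):
    """Log-likelihood ratio of the alternate vs. reference allele at a
    masked position. `probs` is the masked-LM distribution over
    nucleotides at that site. Negative scores mean the model finds the
    variant less likely than the reference, a common proxy for
    deleteriousness.
    """
    return math.log(probs[alt]) - math.log(probs[ref])

# Hypothetical model output at a masked site in a conserved motif:
site_probs = {"A": 0.90, "C": 0.04, "G": 0.04, "T": 0.02}
score = variant_effect_score(site_probs, ref="A", alt="T")  # negative score
```

Because the score needs only the model's predicted distribution, no functional annotation labels are required, which is the point of the unsupervised setup.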
DNA & Gene

GPN-MSA

UC Berkeley

Transformer-based DNA language model using whole-genome multispecies alignments for genome-wide variant effect prediction across coding and non-coding regions.

DNA & Gene

DNAGPT

TencentAILabHealthcare

A GPT-based foundation model pre-trained on 200B+ base pairs from mammalian genomes, supporting DNA sequence generation, classification, and regression.

DNA & Gene

DNABERT-2

MAGICS Lab

Multi-species genomic foundation model replacing k-mer tokenization with BPE, achieving state-of-the-art performance with 21x fewer parameters than prior leading models.

DNA & Gene

GENA-LM

AIRI Institute

A family of transformer-based DNA language models supporting context lengths up to 36,000 bp via BPE tokenization and BigBird sparse attention.

DNA & Gene

Nucleotide Transformer

InstaDeep

A family of DNA foundation models (500M–2.5B parameters) trained on 3,200+ human genomes and 850 species for genomic sequence understanding and variant effect prediction.

DNA & Gene

MoDNA

University of Texas at Arlington

Motif-oriented DNA pre-training framework using an ELECTRA-style generator-discriminator architecture to learn biologically informed genomic representations.

Multimodalities

BioSeq-BLM

Beijing Institute of Technology

An integrated platform implementing 155 biological language models for analyzing DNA, RNA, and protein sequences across residue-level and sequence-level tasks.

DNA & Gene

Enformer

Google DeepMind

Transformer model that predicts gene expression and regulatory activity from 200kb DNA sequences, capturing enhancer-promoter interactions up to 100kb away.

DNA & Gene

DNABERT

Northwestern University

BERT-based pre-trained model for DNA sequences using k-mer tokenization. Achieves state-of-the-art performance on promoter, splice site, and transcription factor binding prediction.

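The k-mer tokenization that DNABERT's description refers to is simple to sketch: slide a window of length k over the sequence with stride 1, so each nucleotide appears in up to k overlapping tokens.

```python
def kmer_tokenize(seq, k=6):
    """Split a DNA sequence into overlapping k-mers (stride 1), the
    tokenization scheme DNABERT applies before BERT-style pre-training.
    """
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

tokens = kmer_tokenize("ATGCGTA", k=3)
# → ['ATG', 'TGC', 'GCG', 'CGT', 'GTA']
```

The overlap is also what DNABERT-2's BPE scheme was designed to avoid: overlapping k-mers inflate sequence length and leak information between adjacent masked tokens.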