bio.rodeo

RNA · DNA & Gene

MOJO

InstaDeep

A 52.3M-parameter bimodal masked language model that jointly learns representations of bulk RNA-seq expression and DNA methylation for cancer genomics.

Released: June 2025
Parameters: 52.3 Million

MOJO (MultiOmics JOint representation learning) is a bimodal foundation model developed by InstaDeep that jointly learns representations of two complementary omics layers: bulk RNA-seq gene expression and DNA methylation. Cancer is driven by both transcriptional dysregulation and aberrant epigenetic states, yet most omics foundation models operate on a single modality. MOJO addresses this gap by training a shared encoder that captures the coordinated signal across expression and methylation, producing patient-level embeddings useful for downstream oncology tasks.

Released in June 2025 as a bioRxiv preprint and presented at the ICML 2025 Workshop on Generative AI and Biology, MOJO builds directly on InstaDeep's earlier BulkRNABert, a unimodal bulk RNA-seq encoder. Where BulkRNABert demonstrated that masked language modeling over binned expression profiles yields transferable embeddings, MOJO extends that recipe to a second modality and tackles the practical reality that paired multi-omics data is often incomplete in clinical settings.

The model sits in the growing landscape of transcriptomics and epigenomics foundation models, but is distinguished by its explicit bimodal training objective and its engineering for robustness when one modality is missing at inference time.

#Key Features

  • Bimodal joint training: MOJO is trained with simultaneous masked language modeling across both bulk RNA-seq and DNA methylation, learning a unified representation rather than concatenating two independently trained encoders.
  • Missing-modality robustness: A mutual-information minimization term applied during fine-tuning encourages modality-specific and shared information to be disentangled, allowing the model to produce useful embeddings even when only expression or only methylation is available.
  • Cancer-focused benchmarks: The model is demonstrated on cancer-type classification and survival analysis, two clinically relevant tasks where multi-omics integration is expected to add value over single-modality baselines.
  • Drop-in embeddings: Pretrained weights load through the standard Hugging Face Transformers API, so practitioners can extract patient embeddings without any re-training.
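  • Public code and weights: Both MOJO and its BulkRNABert predecessor are released with code and downloadable checkpoints. The license on core components is listed as restrictive under Openness, so availability does not imply unrestricted use.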

#Technical Details

MOJO is a 52.3M-parameter transformer encoder trained with a BERT-style masked language modeling objective applied jointly to bulk RNA-seq and DNA methylation inputs. Expression and methylation values are discretized into bins and tokenized, and the model reconstructs masked tokens in both modalities simultaneously, which forces the shared encoder to integrate cross-modal structure. Pretraining uses paired multi-omics samples from The Cancer Genome Atlas (TCGA). The BulkRNABert predecessor (approximately 6M parameters), by contrast, was pretrained on expression data that also included GTEx and ENCODE. During fine-tuning, a mutual-information minimization regularizer is added to improve resilience to missing modalities. Evaluation centers on TCGA cancer-type classification and survival analysis, where the bimodal representation is compared against unimodal baselines including BulkRNABert.
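The binning and masking steps described above can be sketched in a few lines. This is a minimal illustration and not MOJO's actual preprocessing: the equal-width bins, the bin count, the value ranges, the 15% masking rate, and the `MASK_TOKEN` sentinel are all assumptions chosen for readability, and every function name is hypothetical.

```python
import bisect
import random

MASK_TOKEN = -1  # hypothetical sentinel id for a masked position


def make_bin_edges(low, high, n_bins):
    """Return the n_bins - 1 interior edges of equal-width bins over [low, high]."""
    width = (high - low) / n_bins
    return [low + width * i for i in range(1, n_bins)]


def tokenize(values, edges):
    """Map each continuous value to the index of the bin it falls into."""
    return [bisect.bisect_right(edges, v) for v in values]


def mask_tokens(tokens, mask_prob, rng):
    """Replace a random subset of tokens with MASK_TOKEN.

    Returns the corrupted sequence and the positions to be reconstructed.
    """
    corrupted, targets = list(tokens), []
    for i in range(len(tokens)):
        if rng.random() < mask_prob:
            corrupted[i] = MASK_TOKEN
            targets.append(i)
    return corrupted, targets


# Toy paired sample (assumed scales): log-scaled expression and
# methylation beta values in [0, 1].
expression = [0.0, 2.5, 7.9, 12.0]
methylation = [0.05, 0.40, 0.95, 0.30]

rna_tokens = tokenize(expression, make_bin_edges(0.0, 12.0, n_bins=4))
meth_tokens = tokenize(methylation, make_bin_edges(0.0, 1.0, n_bins=4))

# Both modalities are masked in the same pass, so a joint encoder would have
# to reconstruct hidden tokens from context in either modality.
rng = random.Random(0)
rna_masked, rna_targets = mask_tokens(rna_tokens, mask_prob=0.15, rng=rng)
meth_masked, meth_targets = mask_tokens(meth_tokens, mask_prob=0.15, rng=rng)
```

In a real training loop, the masked sequences would be fed to the encoder and a cross-entropy loss computed only at the target positions of each modality. The sketch stops before that step because the source does not describe the loss weighting or the model's input layout.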

#Applications

MOJO is aimed at computational oncology and translational research groups working with TCGA-style multi-omics cohorts. Its patient-level embeddings support cancer-type classification, survival and risk modeling, and exploratory analyses such as patient stratification or subtype discovery. Because the model degrades gracefully when methylation or expression data is absent, it is well suited to real-world clinical datasets where complete paired profiling is the exception rather than the rule. The Transformers-compatible interface lets researchers integrate MOJO embeddings into existing scikit-learn or PyTorch pipelines as features for downstream predictors.
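To illustrate what "embeddings as features for downstream predictors" means in practice, the sketch below fits a nearest-centroid classifier on fixed-length patient vectors. It is a standard-library stand-in for a scikit-learn or PyTorch predictor, not part of MOJO. The three-dimensional vectors are made-up placeholders rather than real model outputs, the TCGA cohort codes BRCA and LUAD are used only as example labels, and the function names are hypothetical.

```python
import math
from collections import defaultdict


def fit_centroids(embeddings, labels):
    """Average the embeddings of each class into one centroid per label."""
    grouped = defaultdict(list)
    for vector, label in zip(embeddings, labels):
        grouped[label].append(vector)
    return {
        label: [sum(dim) / len(vectors) for dim in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(centroids, vector):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vector))


# Placeholder "patient embeddings"; real ones would come from the pretrained
# encoder and have a much higher dimension.
train_embeddings = [
    [0.9, 0.1, 0.0],
    [1.1, -0.1, 0.2],
    [-1.0, 0.8, 0.1],
    [-0.8, 1.2, -0.1],
]
train_labels = ["BRCA", "BRCA", "LUAD", "LUAD"]

centroids = fit_centroids(train_embeddings, train_labels)
prediction = predict(centroids, [1.0, 0.0, 0.1])
print(prediction)
```

The same pattern carries over to any predictor that accepts a feature matrix: the encoder stays frozen, and only the lightweight model on top is fitted to the cohort at hand.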

#Impact

MOJO contributes to the emerging class of multi-omics foundation models by showing that joint masked modeling across expression and methylation produces representations that outperform single-modality encoders on cancer-type classification and survival analysis while remaining robust to incomplete inputs. Released with accessible weights and code alongside its BulkRNABert predecessor, it lowers the barrier for groups seeking pretrained embeddings for oncology genomics. Because it is a 2025 preprint, its benchmark comparisons and downstream adoption are still maturing, and validation on cohorts beyond TCGA remains an open direction. The model's license has not been formally confirmed; users should verify terms before deployment.

Citations

Bimodal masked language modeling for bulk RNA-seq and DNA methylation representation learning

Preprint

Gélard, M., et al. (2025) Bimodal masked language modeling for bulk RNA-seq and DNA methylation representation learning. bioRxiv.

DOI: 10.1101/2025.06.25.661237

BulkRNABert: Cancer prognosis from bulk RNA-seq based language models

Preprint

Gélard, M., et al. (2024) BulkRNABert: Cancer prognosis from bulk RNA-seq based language models. bioRxiv.

DOI: 10.1101/2024.06.18.599483

Openness

Unclassified
Restrictive license on core components

Tags

bert, dna_methylation, foundation_model, gene_expression, masked_language_model, multimodal, representation_learning, survival_analysis, transcriptomics, transformer

Resources

  • GitHub Repository
  • Research Paper
  • HuggingFace Model
  • HuggingFace Model