bio.rodeo

Melody

Shandong University / University of Electronic Science and Technology of China / Chinese Academy of Sciences

A deep learning framework that predicts locus-specific DNA methylation across 39 human tissues from genomic sequence, with a scRNA-seq-augmented variant for unseen cell types.

Released: November 2025

DNA methylation is a fundamental epigenetic modification in which methyl groups are added at CpG dinucleotides, helping to regulate gene expression, maintain genome stability, and establish tissue-specific cellular identity across processes such as embryonic development, differentiation, and aging. A long-standing question is how much of this methylation landscape is determined by the genomic sequence itself, and whether that sequence-to-methylation relationship can be learned well enough to predict methylation at specific loci in specific tissues without measuring it directly.

Melody, developed by Junru Jin, Leyi Wei, and colleagues at Shandong University (Joint SDU-NTU Centre for Artificial Intelligence Research) together with collaborators at the University of Electronic Science and Technology of China and the Chinese Academy of Sciences, is a deep learning framework that predicts locus-specific DNA methylation from genomic sequence across 39 human tissues. First posted to bioRxiv in November 2025, Melody takes 10-kb sequence windows as input. These are far longer than the short windows (often ~41 bp) used by earlier methods such as DeepCpG, CpGenie, and iDNA-ABF, and they allow the model to capture long-range regulatory dependencies that influence methylation state.
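To make the difference in input context concrete, here is a minimal sketch of how a sequence window around a target CpG might be extracted and one-hot encoded. Only the 10-kb and ~41-bp window sizes come from the description above; the function names, the N-padding at chromosome ends, and the A/C/G/T channel order are illustrative assumptions, not Melody's published preprocessing.

```python
# Sketch only: window sizes come from the text; everything else is assumed.
WINDOW = 10_000       # Melody's stated input length (bp)
SHORT_WINDOW = 41     # typical window for earlier short-context methods
BASES = "ACGT"        # assumed channel order


def extract_window(chrom_seq: str, center: int, width: int) -> str:
    """Return `width` bases centered on `center`, padding with N at the ends."""
    start = center - width // 2
    end = start + width
    left_pad = "N" * max(0, -start)
    right_pad = "N" * max(0, end - len(chrom_seq))
    core = chrom_seq[max(0, start):min(len(chrom_seq), end)]
    return left_pad + core + right_pad


def one_hot(seq: str) -> list[list[int]]:
    """Encode each base as [A, C, G, T]; unknown bases (N) become all zeros."""
    return [[int(base == b) for b in BASES] for base in seq.upper()]


chrom = "ACGT" * 5_000                       # toy 20-kb "chromosome"
long_ctx = extract_window(chrom, 12_000, WINDOW)
short_ctx = extract_window(chrom, 12_000, SHORT_WINDOW)
print(len(long_ctx), len(short_ctx))         # 10000 41
```

The long window carries roughly 240 times more sequence per prediction than the short one, which is what gives a convolutional model room to see distal elements at all.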

Unlike prior tools designed for a single cell line or a handful of samples, Melody is built around the extensive cell-type heterogeneity of methylation. The framework includes several variants tuned to different scenarios, including an extended model, Melody-G, that augments sequence input with embeddings derived from a single-cell RNA-seq (scRNA-seq) foundation model, enabling zero-shot generalization of methylation prediction to cell types not seen during training.
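The description above does not say how Melody-G fuses the scRNA-seq embedding with the sequence features. One common pattern, shown here purely as an assumption, is to append the same cell-type embedding to every sequence position so that downstream layers see both. The function name and the toy dimensions are hypothetical.

```python
def condition_on_cell(seq_features, cell_embedding):
    """Append one cell-type embedding to every position's feature vector.

    This broadcast-and-concatenate scheme is an assumed stand-in; the
    preprint's actual conditioning mechanism may differ.
    """
    return [list(row) + list(cell_embedding) for row in seq_features]


# Toy inputs: 3 sequence positions with 2 channels each, and a 3-dim embedding
# standing in for the output of a single-cell RNA-seq foundation model.
seq_features = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
unseen_cell = [1.0, 0.0, -1.0]

conditioned = condition_on_cell(seq_features, unseen_cell)
print(len(conditioned), len(conditioned[0]))   # 3 5
```

Under this kind of scheme, zero-shot inference amounts to swapping in an embedding for a cell type that was absent from training while leaving the sequence pathway unchanged.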

#Key Features

  • Locus-specific, tissue-aware prediction: Melody predicts cell-type-specific methylation profiles across 39 human tissues directly from genomic sequence, reportedly outperforming existing state-of-the-art approaches.
  • Long-range sequence context: By operating on 10-kb input windows rather than short flanking sequences, the model captures distal regulatory signals that short-window predictors miss.
  • scRNA-seq-augmented generalization (Melody-G): An extended variant integrates embeddings from a single-cell RNA-seq foundation model, enabling zero-shot inference of methylation states for unseen cell types directly from transcriptomic context.
  • Cross-task transfer to meQTLs: Melody generalizes beyond methylation prediction to methylation quantitative trait loci (meQTL) variant effect prediction, estimating how sequence variants alter methylation.
  • Interpretable sequence motifs: The framework surfaces key sequence motifs associated with methylation variability, offering mechanistic insight into the genomic logic of methylation specificity.
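The meQTL use case listed above is usually framed as scoring the reference and alternate alleles with the same model and taking the difference. The sketch below illustrates that general recipe. Because no Melody code or weights are public, `toy_predictor` is a hypothetical stand-in (plain CpG density) and not the model itself; `apply_snv` and `variant_effect` are likewise illustrative names.

```python
from typing import Callable


def apply_snv(seq: str, pos: int, ref: str, alt: str) -> str:
    """Return `seq` with the base at 0-based `pos` changed from ref to alt."""
    if seq[pos] != ref:
        raise ValueError(f"reference mismatch at {pos}: expected {ref}, found {seq[pos]}")
    return seq[:pos] + alt + seq[pos + 1:]


def variant_effect(predict: Callable[[str], float], seq: str,
                   pos: int, ref: str, alt: str) -> float:
    """Predicted methylation on the alt allele minus that on the ref allele."""
    return predict(apply_snv(seq, pos, ref, alt)) - predict(seq)


def toy_predictor(seq: str) -> float:
    """Hypothetical stand-in for a trained model: CpG density, NOT Melody."""
    return seq.count("CG") / max(1, len(seq) - 1)


# A G>A change at position 3 destroys the only CpG in this toy sequence.
effect = variant_effect(toy_predictor, "AACGTT", 3, "G", "A")
print(round(effect, 3))   # -0.2
```

Passing the predictor in as a callable keeps the scoring logic independent of whichever model is eventually plugged in.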

#Technical Details

Melody is a 1D fully convolutional, U-Net-style encoder–decoder trained on DNA methylation profiles spanning 39 human tissues. It takes 10-kb genomic sequence windows as input and predicts locus-specific methylation. The encoder is built from inverted residual blocks and downsamples aggressively, so the receptive field expands rapidly enough to capture long-range genomic dependencies, including distal regulatory elements, while residual learning keeps training stable. The decoder upsamples back to base resolution.

The framework is organized into variants optimized for distinct tasks. The single-track configuration (Melody-ST) uses the 1D U-Net backbone with a single output channel in its final 1×1 convolution and retains auxiliary heads that predict CpG counts and regional methylation levels at 100-bp resolution. The multi-task formulations and the scRNA-seq-augmented Melody-G reuse the same convolutional backbone for per-tissue prediction, generalization to unseen cell types, and variant effect estimation. For zero-shot extension to new cell types, Melody-G conditions methylation prediction on embeddings produced by a single-cell RNA-seq foundation model, effectively using transcriptomic state as a proxy for cellular identity.

Melody is currently a bioRxiv preprint (v2, released under a CC BY-NC license). Exact hyperparameters such as parameter count are described in the manuscript, and at the time of cataloging no public code repository or trained weights had been released; reported benchmark gains therefore await independent reproduction and peer review.
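The link between downsampling and receptive field can be checked with standard convolution arithmetic: each layer widens the receptive field by (kernel - 1) times the product of all preceding strides. The sketch below applies that rule. The kernel size of 3 and stride of 2 are assumptions chosen for illustration, since Melody's actual layer configuration is not given here.

```python
def receptive_field(layers: list[tuple[int, int]]) -> int:
    """Receptive field, in input positions, of stacked 1D convolutions.

    `layers` is a list of (kernel_size, stride) pairs, input side first.
    """
    rf, jump = 1, 1          # jump = input positions per step at this depth
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


def layers_to_cover(window: int, kernel: int, stride: int) -> int:
    """Smallest number of identical layers whose receptive field spans `window`."""
    n = 0
    while receptive_field([(kernel, stride)] * n) < window:
        n += 1
    return n


# Assumed hyperparameters: kernel 3, with and without stride-2 downsampling.
print(receptive_field([(3, 1)] * 6))      # 13
print(receptive_field([(3, 2)] * 6))      # 127
print(layers_to_cover(10_000, 3, 2))      # 13
```

Without striding the receptive field grows linearly (1 + 2n for kernel 3), so spanning 10 kb would take thousands of layers; with stride 2 it roughly doubles per layer, which is why an aggressively downsampled encoder can cover the full window at modest depth.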

#Applications

Melody is aimed at epigenomics, regulatory-genomics, and statistical-genetics researchers who need methylation estimates where direct measurement is unavailable or incomplete. By predicting methylation from sequence alone, it can impute tissue- and cell-type-specific methylation for loci or cell types that were never assayed and, through Melody-G, extend those predictions to new cell types using only scRNA-seq context. Its cross-task transfer to meQTL effect prediction makes it useful for interpreting non-coding variants and prioritizing candidate regulatory mutations, supporting fine-mapping and functional annotation efforts in disease genetics.

#Impact

Melody contributes to a growing class of models that treat the genome as a learnable code for the epigenome, and it advances the field by combining long-range sequence context, broad tissue coverage, and transcriptome-conditioned generalization in a single framework. If its reported improvements over prior methylation predictors and its zero-shot cell-type generalization hold up under peer review, Melody could make cell-type-specific methylation estimates broadly accessible and strengthen the interpretation of methylation-associated variants. As a preprint without released code or weights, its results currently require independent validation, but its emphasis on locus-specific, cross-tissue, and cross-task prediction marks it as a notable entry in epigenomic sequence modeling.

Citation

Melody: Decoding the Sequence Determinants of Locus-Specific DNA Methylation Across Human Tissues

Preprint

Jin, J., et al. (2025) Melody: Decoding the Sequence Determinants of Locus-Specific DNA Methylation Across Human Tissues. bioRxiv.

DOI: 10.1101/2025.11.23.689975

Citations

Total Citations: 0
Influential: 0
References: 32

Openness

bio.rodeo openness: Closed · low usability and reproducibility
8 · Closed
Usability (can I run it?): 7
Reproducibility (can I retrain it?): 10
Model Openness Framework: Unclassified
Restrictive license on core components

Tags

cnn, dna_methylation, epigenomics, methylation_prediction, multimodal, unet, variant_effect_prediction, zero_shot

Resources

Research Paper · Official Website