bio.rodeo
© 2026 Pulsatance. All rights reserved.
DNA & Gene foundation models

MethylSeqNet

University of California, Berkeley / University of Washington

Conditions a pretrained DNA sequence embedding on CpG methylation to predict gene regulation across cell types and alleles, generalizing zero-shot to imprinting, X-inactivation, and accessibility.

Released: June 2026

MethylSeqNet is a model of gene regulation that conditions a fixed, pretrained DNA sequence embedding on CpG methylation signal, learning how the same genomic sequence behaves differently depending on its epigenetic state. Rather than treating methylation as a target to impute or a binary modification to classify, MethylSeqNet uses it as a conditioning input that modulates a sequence representation, allowing a single model to capture cell-type-specific and allele-specific regulation directly from the interaction of sequence and methylation. It was introduced in a 2026 bioRxiv preprint from collaborators at the University of California, Berkeley (the Ioannidis and Streets labs) and the University of Washington (the Stergachis lab).

The central claim is a foundation-model one: a single trained checkpoint generalizes across distinct regulatory phenomena without per-task retraining. The authors apply the same model to at least five held-out biological settings (cell-type-specific chromatin accessibility, transcription, parent-of-origin genomic imprinting, random monoallelic expression, and X-chromosome inactivation) and report that it recovers each from the shared sequence-plus-methylation representation. They present this cross-phenomenon transfer as the main evidence that MethylSeqNet has learned a general grammar of methylation-dependent regulation rather than a narrow, task-specific mapping.

This positions MethylSeqNet distinctly within the DNA-methylation modeling landscape. It is not an expression-to-methylation imputation tool (as in MethylProphet) and not a modification-site classifier (as in MuLan-Methyl); instead it treats methylation as a regulatory variable that reshapes how sequence is read, unifying allelic and cell-type regulation under one framework.

#Key Features

  • Methylation as a conditioning input: MethylSeqNet conditions a fixed pretrained DNA sequence embedding on CpG methylation, modeling how identical sequence produces different regulatory outcomes depending on epigenetic state.
  • Single checkpoint, many phenomena: One trained model is applied without retraining to chromatin accessibility, transcription, imprinting, random monoallelic expression, and X-inactivation, demonstrating cross-phenomenon generalization.
  • Allele-resolved regulation: By coupling sequence with allele-specific methylation, the model captures monoallelic regulatory behavior, including parent-of-origin imprinting and X-chromosome inactivation.
  • Cell-type-specific predictions: Because methylation patterns differ across cell types, conditioning on them lets the model predict cell-type-specific accessibility and transcription from a shared backbone.
  • Foundation-model framing: The zero-shot transfer across regulatory tasks is treated as the primary evidence of a learned, general representation of methylation-dependent gene regulation.
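
The allele-resolved idea in the list above can be illustrated with a toy calculation: hold the sequence fixed, vary only the per-allele methylation, and compare the predicted activity of the two alleles. This is a minimal sketch and not the authors' method. No code or weights have been released, so `toy_predict`, `allelic_ratio`, the scalar `seq_score` input, and the repressive logistic form are all hypothetical stand-ins.

```python
import math


def toy_predict(seq_score, methylation):
    """Hypothetical stand-in for a trained model.

    Assumption: higher mean CpG methylation lowers predicted activity.
    """
    mean_meth = sum(methylation) / len(methylation)
    return 1.0 / (1.0 + math.exp(-(seq_score - 4.0 * mean_meth)))


def allelic_ratio(predict, seq_score, meth_a, meth_b):
    """Fraction of predicted activity attributed to allele A (0.5 = balanced)."""
    a = predict(seq_score, meth_a)
    b = predict(seq_score, meth_b)
    return a / (a + b)


# Same sequence on both alleles; only methylation differs (imprinting-like toy case).
unmethylated = [0.05, 0.10, 0.00, 0.05]
methylated = [0.95, 0.90, 1.00, 0.95]

skewed = allelic_ratio(toy_predict, 1.0, unmethylated, methylated)
balanced = allelic_ratio(toy_predict, 1.0, unmethylated, unmethylated)
```

In this toy setup, identical methylation on both alleles gives a balanced ratio, while a methylated second allele pushes the ratio toward the unmethylated one. Any real predictor with the same call shape could be passed in place of `toy_predict`.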

#Technical Details

MethylSeqNet builds on a fixed, pretrained DNA sequence embedding (the underlying base model is not named in the preprint abstract) and adds a conditioning pathway that injects per-CpG methylation signal, so that the sequence representation is modulated by epigenetic state. Critically, evaluation uses the same trained checkpoint across all reported settings, with no per-task fine-tuning. Performance on each of the at least five held-out phenomena (cell-type-specific chromatin accessibility, transcription, parent-of-origin imprinting, random monoallelic expression, and X-chromosome inactivation) therefore reflects zero-shot transfer from a single model. The work was released as a bioRxiv preprint on June 7, 2026 under a CC BY license. At the time of writing, no public code or model weights had been released, and neither the parameter count nor the identity of the pretrained sequence backbone is stated in the abstract.
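
The preprint abstract does not describe how the conditioning pathway is built, so the following is only one plausible shape for it: a FiLM-style (feature-wise scale and shift) modulation of a frozen per-position embedding. The function name `film_condition`, the parameters `w_gamma` and `w_beta`, and the choice of FiLM itself are assumptions made for illustration, not details from the paper.

```python
def film_condition(seq_embedding, methylation, w_gamma, w_beta):
    """Scale and shift a frozen sequence embedding by per-position methylation.

    seq_embedding: list of positions, each a list of d floats
                   (assumed output of a frozen backbone)
    methylation:   one methylation fraction in [0, 1] per position
                   (0.0 at unmethylated or non-CpG positions in this toy setup)
    w_gamma, w_beta: length-d lists; the only trainable parameters in this sketch
    """
    if len(seq_embedding) != len(methylation):
        raise ValueError("one methylation value is required per position")
    conditioned = []
    for h, m in zip(seq_embedding, methylation):
        # With m == 0 the scale is 1 and the shift is 0, so the embedding passes through.
        conditioned.append(
            [(1.0 + wg * m) * x + wb * m for x, wg, wb in zip(h, w_gamma, w_beta)]
        )
    return conditioned


embedding = [[1.0, 2.0], [0.5, -1.0]]  # two positions, d = 2
w_gamma = [0.2, -0.4]
w_beta = [0.1, 0.0]

unmethylated = film_condition(embedding, [0.0, 0.0], w_gamma, w_beta)
half_methylated = film_condition(embedding, [0.5, 0.0], w_gamma, w_beta)
```

The design point this sketch tries to capture is that the backbone stays fixed: the same sequence embedding yields different downstream representations only through the methylation-dependent scale and shift.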

#Applications

MethylSeqNet is aimed at researchers studying epigenetic gene regulation who want to connect DNA methylation to functional outcomes such as accessibility, transcription, and allele-specific expression. Because a single model spans cell-type-specific and allele-specific regulation, it can serve as a unified interpretive tool for analyzing imprinted loci, X-inactivation, and random monoallelic genes, and for asking how methylation changes might reshape regulatory state. It is most relevant to computational epigenomics and regulatory-genomics groups working with paired sequence and methylation data.
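
Asking how methylation changes might reshape regulatory state amounts to an in silico perturbation: change the methylation input, rerun the predictor, and compare against the baseline. The sketch below shows that loop under stated assumptions. `toy_predict` and its `weights` are a hypothetical logistic stand-in for a trained model, and `methylation_scan` is an illustrative helper, not part of any released tool.

```python
import math


def toy_predict(methylation, weights, bias=1.0):
    """Hypothetical stand-in: logistic readout of per-CpG methylation.

    Negative weights mean methylation at that site lowers predicted activity.
    """
    z = bias + sum(w * m for w, m in zip(weights, methylation))
    return 1.0 / (1.0 + math.exp(-z))


def methylation_scan(predict, methylation, new_value):
    """Set each CpG to new_value in turn and rank sites by effect size.

    Returns (index, change in prediction) pairs, largest absolute change first.
    """
    baseline = predict(methylation)
    effects = []
    for i in range(len(methylation)):
        perturbed = list(methylation)  # copy so the observed profile is untouched
        perturbed[i] = new_value
        effects.append((i, predict(perturbed) - baseline))
    return sorted(effects, key=lambda e: abs(e[1]), reverse=True)


weights = [-0.2, -3.0, -0.5]   # toy per-site sensitivities (assumed)
observed = [0.1, 0.1, 0.1]     # mostly unmethylated locus
ranked = methylation_scan(lambda m: toy_predict(m, weights), observed, 1.0)
```

Here `ranked` orders the CpG sites by how strongly fully methylating each one changes the toy prediction. With a real model, the same loop would point to the sites whose methylation state matters most for the predicted outcome.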

#Impact

By reframing methylation as a conditioning variable that modulates a sequence model, and by showing that one checkpoint generalizes zero-shot across imprinting, X-inactivation, monoallelic expression, accessibility, and transcription, MethylSeqNet offers a unifying account of methylation-dependent regulation. Prior tools, which target a single task such as methylation imputation or modification-site classification, do not provide this. As a 2026 preprint without released code or weights, its downstream adoption remains to be established, but the cross-phenomenon generalization result is a notable proof of concept for foundation-model approaches to epigenetic regulation.

Citation

Dixon-Luinenburg, O., et al. (2026) Epigenetic conditioning improves sequence-based modeling of gene regulation across cell types and alleles. openRxiv.

DOI: 10.64898/2026.06.02.729723

Openness

bio.rodeo openness: 18 (Closed), low usability and reproducibility
Usability (can I run it?): 15
Reproducibility (can I retrain it?): 10
Model Openness Framework: Unclassified (missing required components)

Tags

chromatin_accessibility_prediction, dna_methylation, epigenetics, foundation_model, gene_regulation, self_supervised, transformer, variant_effect_prediction, zero_shot

Resources

Research Paper