University of California, Berkeley / University of Washington
Conditions a pretrained DNA sequence embedding on CpG methylation to predict gene regulation across cell types and alleles, generalizing zero-shot to imprinting, X-inactivation, and accessibility.
MethylSeqNet is a model of gene regulation that conditions a fixed, pretrained DNA sequence embedding on CpG methylation signal, learning how the same genomic sequence behaves differently depending on its epigenetic state. Rather than treating methylation as a target to impute or a binary modification to classify, MethylSeqNet uses it as a conditioning input that modulates a sequence representation, allowing a single model to capture cell-type-specific and allele-specific regulation directly from the interaction of sequence and methylation. It was introduced in a 2026 bioRxiv preprint from collaborators at the University of California, Berkeley (the Ioannidis and Streets labs) and the University of Washington (the Stergachis lab).
The central claim is the one typically made for foundation models: a single trained checkpoint generalizes across distinct regulatory phenomena without per-task retraining. The authors apply the same model to at least five held-out biological settings (cell-type-specific chromatin accessibility, transcription, parent-of-origin genomic imprinting, random monoallelic expression, and X-chromosome inactivation) and report that it recovers each from the shared sequence-plus-methylation representation. This cross-phenomenon transfer is the main evidence offered that MethylSeqNet has learned a general grammar of methylation-dependent regulation rather than a narrow, task-specific mapping.
This positions MethylSeqNet distinctly within the DNA-methylation modeling landscape. It is not an expression-to-methylation imputation tool (as in MethylProphet) and not a modification-site classifier (as in MuLan-Methyl); instead it treats methylation as a regulatory variable that reshapes how sequence is read, unifying allelic and cell-type regulation under one framework.
MethylSeqNet builds on a fixed, pretrained DNA sequence embedding and adds a conditioning pathway that injects per-CpG methylation signal, so that the sequence representation is modulated by epigenetic state. Evaluation uses the same trained checkpoint across all reported settings, with no per-task fine-tuning, so performance on each of the at least five held-out phenomena (cell-type-specific chromatin accessibility, transcription, parent-of-origin imprinting, random monoallelic expression, and X-chromosome inactivation) reflects zero-shot transfer from a single model. The work was released as a bioRxiv preprint on June 7, 2026 under a CC BY license. At the time of writing, no public code or model weights had been released, and neither the parameter count nor the identity of the pretrained sequence backbone is stated in the preprint abstract.
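The preprint abstract does not describe how the conditioning pathway is implemented. As one way to picture the idea, the minimal sketch below uses feature-wise scale-and-shift modulation (a FiLM-style technique, swapped in here purely for illustration): a fixed per-position sequence embedding is scaled and shifted by values derived from that position's methylation level. The function name `condition_embedding`, the parameters `gamma_w` and `beta_w`, and all numeric values are hypothetical and are not taken from MethylSeqNet.

```python
from typing import List


def condition_embedding(
    seq_emb: List[List[float]],  # fixed embedding, one vector per position
    methylation: List[float],    # methylation fraction in [0, 1] per position
    gamma_w: List[float],        # hypothetical scale weights, one per channel
    beta_w: List[float],         # hypothetical shift weights, one per channel
) -> List[List[float]]:
    """FiLM-style modulation: out = emb * (1 + gamma_w * m) + beta_w * m."""
    if len(seq_emb) != len(methylation):
        raise ValueError("need one methylation value per embedded position")
    out = []
    for vec, m in zip(seq_emb, methylation):
        # With m == 0 the embedding passes through unchanged.
        out.append([x * (1.0 + g * m) + b * m
                    for x, g, b in zip(vec, gamma_w, beta_w)])
    return out


# Same sequence embedding under two epigenetic states (e.g. two alleles).
emb = [[1.0, 2.0], [0.5, -1.0]]
gamma_w = [0.5, -0.5]
beta_w = [0.1, 0.0]

unmethylated = condition_embedding(emb, [0.0, 0.0], gamma_w, beta_w)
methylated = condition_embedding(emb, [1.0, 1.0], gamma_w, beta_w)
```

In this toy setup the sequence embedding is never updated; only the modulation weights would be learned, which mirrors the "fixed embedding plus conditioning input" framing without claiming to reproduce the authors' design.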
MethylSeqNet is aimed at researchers studying epigenetic gene regulation who want to connect DNA methylation to functional outcomes such as accessibility, transcription, and allele-specific expression. Because a single model spans cell-type-specific and allele-specific regulation, it can serve as a unified interpretive tool for analyzing imprinted loci, X-inactivation, and random monoallelic genes, and for asking how methylation changes might reshape regulatory state. It is most relevant to computational epigenomics and regulatory-genomics groups working with paired sequence and methylation data.
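One way to ask how a methylation change might reshape regulatory state is an in-silico perturbation: hold the sequence fixed, edit the methylation profile inside a window, and compare predictions before and after. The sketch below assumes only that a model can be called as `predict(sequence, methylation)`. `toy_predict` is a hypothetical stand-in (not MethylSeqNet, whose code is not public), and `methylation_effect` and `cpg_positions` are illustrative helper names.

```python
from typing import Callable, List, Sequence

Predictor = Callable[[str, Sequence[float]], float]


def cpg_positions(sequence: str) -> List[int]:
    """Indices of the C in every CpG dinucleotide."""
    return [i for i in range(len(sequence) - 1) if sequence[i:i + 2] == "CG"]


def methylation_effect(predict: Predictor, sequence: str,
                       methylation: Sequence[float],
                       start: int, end: int, new_level: float) -> float:
    """Prediction change when CpGs in [start, end) are set to new_level."""
    edited = list(methylation)  # copy so the caller's profile is untouched
    for i in cpg_positions(sequence):
        if start <= i < end:
            edited[i] = new_level
    return predict(sequence, edited) - predict(sequence, methylation)


def toy_predict(sequence: str, methylation: Sequence[float]) -> float:
    """Hypothetical stand-in: activity falls with mean CpG methylation."""
    sites = cpg_positions(sequence)
    if not sites:
        return 1.0
    return 1.0 - sum(methylation[i] for i in sites) / len(sites)


seq = "ACGTTCGA"
baseline = [0.0] * len(seq)  # fully unmethylated profile
delta = methylation_effect(toy_predict, seq, baseline, 0, 4, 1.0)
```

A negative `delta` here means the toy predictor's activity dropped after methylating the CpG in the window; with a real model, the same comparison could be run over imprinted or X-linked loci.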
By reframing methylation as a conditioning variable that modulates a sequence model, and by showing that one checkpoint generalizes zero-shot across imprinting, X-inactivation, monoallelic expression, accessibility, and transcription, MethylSeqNet offers a unifying account of methylation-dependent regulation that prior tools, which target a single task such as methylation imputation or modification-site classification, do not provide. Because it is a 2026 preprint without released code or weights, its downstream adoption remains to be established, but the cross-phenomenon generalization result is a notable proof of concept for foundation-model approaches to epigenetic regulation.
Dixon-Luinenburg, O., et al. (2026) Epigenetic conditioning improves sequence-based modeling of gene regulation across cell types and alleles. openRxiv.
DOI: 10.64898/2026.06.02.729723