Wisteria

DNA language model combining Mamba state-space layers, gated dilated convolutions, and Fourier attention to capture multi-scale regulatory patterns.

Released: May 2026

Wisteria is a pretrained DNA language model introduced in May 2026 by researchers at Inner Mongolia University, spanning its College of Computer Science and School of Life Sciences. It tackles a persistent tension in genomic sequence modeling: regulatory information in DNA is distributed across radically different length scales, from short sequence motifs spanning a handful of bases to long-range interactions that stretch across hundreds of thousands of base pairs. Most prior genomic foundation models lean on a single inductive bias — transformer attention, convolution, or a state-space backbone — that captures one end of this spectrum well while underperforming on the other.

The model's central contribution is a unified multi-scale feature-learning framework that combines three complementary mechanisms in a single architecture. Mamba state-space layers provide efficient long-range sequence modeling, gated dilated convolutions capture local motifs and regulatory patterns, and a Fourier-based attention mechanism adds frequency-domain modeling to support periodic structure and length generalization. The design goal is a single pretrained backbone whose learned representations transfer across diverse genomic tasks without task-specific retraining.
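The role of the gated dilated convolutions can be illustrated with a minimal single-channel sketch. The source does not specify Wisteria's gating form, kernel sizes, or dilation schedule, so the tanh-times-sigmoid gate, the function names `receptive_field` and `gated_dilated_conv`, and the example weights below are all assumptions chosen for illustration, not the authors' implementation.

```python
import math


def receptive_field(kernel_size, dilations):
    """Input positions visible to one output of a stack of dilated 1D convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def gated_dilated_conv(x, w_filter, w_gate, dilation):
    """Single-channel gated dilated convolution with zero padding.

    Assumed gate (not taken from the source):
        out[t] = tanh(filter response) * sigmoid(gate response)
    """
    k = len(w_filter)
    center = k // 2
    out = []
    for t in range(len(x)):
        f = g = 0.0
        for j in range(k):
            idx = t + (j - center) * dilation  # taps are spaced `dilation` apart
            if 0 <= idx < len(x):              # positions outside the input count as zero
                f += w_filter[j] * x[idx]
                g += w_gate[j] * x[idx]
        out.append(math.tanh(f) * (1.0 / (1.0 + math.exp(-g))))
    return out


# Four stacked layers with kernel size 3 and doubling dilation.
rf = receptive_field(3, [1, 2, 4, 8])

# A single impulse at position 2, seen through a tap two positions back.
y = gated_dilated_conv([0, 0, 1, 0, 0, 0, 0], [1, 0, 0], [0, 0, 0], dilation=2)
```

Doubling the dilation at each layer grows the receptive field exponentially with depth, which is why dilated stacks are a common choice for motif-scale and mid-range context; the state-space layers are what the source credits with the longest-range dependencies.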

Wisteria sits alongside recent genomic language models such as DNABERT-2, the Nucleotide Transformer, HyenaDNA, and Caduceus, but distinguishes itself by hybridizing state-space, convolutional, and spectral components rather than committing to one. It is evaluated in a frozen, transfer-learning setting across four benchmark families.

Key Features

Multi-scale hybrid architecture: Mamba state-space layers, gated dilated convolutions, and Fourier-based attention are combined so that local motifs and long-range regulatory dependencies are modeled jointly rather than by a single mechanism.
Fourier-based attention: A frequency-domain attention mechanism supports periodic extension and length generalization, helping the model handle sequence lengths and periodic patterns beyond those emphasized during pretraining.
Large-scale human pretraining: Pretrained on roughly 35 billion nucleotide tokens drawn from the human reference genome (hg38), partitioned into 34,021 segments of up to approximately one million base pairs.
Transfer without retraining: Evaluated as a frozen feature extractor across multiple benchmark suites, which the authors present as evidence that its pretrained representations transfer broadly without task-specific architectural changes.
Strong cross-benchmark results: Reported as the best method on 6 of 8 Genomic Benchmarks tasks and 16 of 18 Nucleotide Transformer tasks, with additional evaluation on BEND and variant effect prediction.
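Why a frequency-domain view helps with periodic structure can be shown with a plain discrete Fourier transform. This is not Wisteria's Fourier-based attention, whose formulation the source does not give; it is only a sketch of the underlying idea, and the helper names `dft_power` and `dominant_period` are hypothetical.

```python
import cmath


def dft_power(signal):
    """Power spectrum of a real-valued signal via a direct O(n^2) DFT."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal))) ** 2
        for k in range(n)
    ]


def dominant_period(seq, base):
    """Period (in bases) of the strongest non-zero frequency for one nucleotide."""
    signal = [1.0 if b == base else 0.0 for b in seq]  # indicator track for `base`
    power = dft_power(signal)
    n = len(signal)
    # Skip k = 0 (the mean) and search up to the Nyquist frequency.
    k = max(range(1, n // 2 + 1), key=lambda i: power[i])
    return n / k


# A toy sequence in which "A" recurs every third base.
period = dominant_period("ACG" * 10, "A")
```

A repeat that is spread across the whole sequence in position space collapses to a single peak in frequency space, which is the property a spectral component can exploit regardless of sequence length.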

Technical Details

Wisteria augments a Mamba-based backbone with gated dilated convolutions to capture local motifs and regulatory patterns, while gated multilayer perceptrons refine global dependencies; a Fourier-based attention mechanism adds frequency-domain modeling for periodic extension and length generalization. Pretraining uses self-supervised learning over approximately 35 billion nucleotide tokens from hg38, organized into 34,021 segments of up to roughly one million base pairs.
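The pretraining figures can be cross-checked with simple arithmetic. The source says segments are "up to" roughly one million base pairs, so the product of segment count and segment length is an upper bound on the token count. Reading "one million" as 2^20 is an assumption made here for comparison, not something the source states.

```python
N_SEGMENTS = 34_021  # segment count reported in the source


def max_tokens(segment_len):
    """Upper bound on tokens if every segment reached the given length."""
    return N_SEGMENTS * segment_len


upper_decimal = max_tokens(1_000_000)  # 34,021,000,000
upper_binary = max_tokens(2 ** 20)     # 35,673,604,096 (assumed 2^20 cap)
```

Both bounds fall between 34 and 36 billion, which agrees with the reported figure of roughly 35 billion nucleotide tokens.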

The model is assessed across four benchmark families — Genomic Benchmarks, the Nucleotide Transformer task suite, BEND, and variant effect prediction — using its pretrained representations without retraining. On Genomic Benchmarks it is reported as the top method on 6 of 8 tasks, and on the Nucleotide Transformer suite as the top method on 16 of 18 tasks. The preprint does not report a single production parameter count; configurations evaluated across the benchmarks span a wide range, from roughly 550,000 to about 100 million parameters, so the operative model size depends on the benchmark setting rather than a fixed checkpoint.

Applications

Wisteria targets the standard downstream tasks of genomic foundation models: classifying regulatory elements such as promoters and enhancers, predicting histone modifications and other epigenetic marks, annotating splice sites and regulatory regions, and scoring the functional impact of sequence variants. Because it is designed to be used as a frozen feature extractor, computational biologists can apply its embeddings to new labeling tasks without retraining the backbone, and the long pretraining context makes it relevant for tasks where regulatory signal spans large genomic windows.
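The frozen-feature-extractor workflow can be sketched as follows. Because no Wisteria weights or code are public, `frozen_embed` is a hypothetical stand-in that returns normalized 2-mer frequencies rather than real model embeddings, and the nearest-centroid probe and toy sequences are likewise assumptions chosen to keep the example self-contained.

```python
from collections import Counter
from itertools import product

KMERS = ["".join(p) for p in product("ACGT", repeat=2)]  # 16 dinucleotides


def frozen_embed(seq):
    """Stand-in for a frozen backbone: normalized 2-mer frequencies (never trained)."""
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = max(len(seq) - 1, 1)
    return [counts[k] / total for k in KMERS]


def fit_centroids(seqs, labels):
    """Train only the lightweight probe: one mean embedding per class."""
    n = Counter(labels)
    sums = {}
    for s, y in zip(seqs, labels):
        e = frozen_embed(s)
        acc = sums.setdefault(y, [0.0] * len(e))
        for i, v in enumerate(e):
            acc[i] += v
    return {y: [v / n[y] for v in acc] for y, acc in sums.items()}


def predict(centroids, seq):
    """Assign the class whose centroid is closest in embedding space."""
    e = frozen_embed(seq)
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(e, centroids[y])),
    )


train_seqs = ["GCGCGCGC", "CCGGCCGG", "ATATATAT", "TTAATTAA"]
train_labels = ["gc_rich", "gc_rich", "at_rich", "at_rich"]
centroids = fit_centroids(train_seqs, train_labels)
label = predict(centroids, "GGCCGCGC")
```

Only the probe is fitted to the new task; the embedding function is never updated, which mirrors the frozen evaluation protocol described above. With a real backbone, `frozen_embed` would be replaced by a forward pass through the pretrained model.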

Impact

Wisteria contributes to an active line of research arguing that genomic foundation models benefit from architectures that explicitly span multiple length scales rather than relying on a single modeling primitive. Its strong reported results across Genomic Benchmarks and the Nucleotide Transformer suite, obtained under a frozen transfer-learning protocol, suggest that combining state-space, convolutional, and spectral components is a promising direction for DNA representation learning. Important caveats apply: the work is a preprint under review, no public model weights or code have been released, and the paper itself is distributed under a CC BY-NC-ND 4.0 license. The absence of a single reported parameter count and of an open implementation currently limits independent reproduction and direct head-to-head comparison with openly released genomic models.

Citations

Wang, W., et al. (2026) Wisteria: A unified multi-scale feature learning framework for DNA language model. Pattern Recognition. DOI: 10.1016/j.patcog.2026.114289

Wang, W., et al. (2026) Wisteria: A Unified Multi-Scale Feature Learning Framework for DNA Language Model. Preprint. DOI: 10.48550/arXiv.2605.05913

Citation counts

Total citations: 8
Influential: 1
References: 0

Openness

bio.rodeo openness: Closed (low usability and reproducibility)
Score: 10 (Closed)
Usability (can I run it?): 7
Reproducibility (can I retrain it?): 14

Model Openness Framework: Unclassified (restrictive license on core components)

Resources

Research Paper
