Multimodal masked-modeling genomics foundation model that unifies sequence-to-function prediction, DNA language modeling, and generative regulatory design in one pretrained architecture.
Nona is a multimodal genomics foundation model from Genentech's BRAID (Biology Research | AI Development) group, introduced in a November 2025 bioRxiv preprint. It addresses a persistent fragmentation in computational genomics: sequence-to-function prediction, DNA language modeling, and generative design of regulatory elements have historically been tackled by separate, specialized models, each with its own architecture, training regime, and assumptions. Nona instead pursues a single pretrained model that performs all three within one framework.
The model's central idea is a multimodal masked-modeling objective that jointly trains on raw DNA sequence and base-resolution functional genomics measurements. By treating both the underlying sequence and its measured functional readouts as modalities to be reconstructed from masked inputs, Nona learns representations that couple genomic sequence to regulatory activity at single-nucleotide resolution. The same pretrained backbone supports discriminative tasks (predicting function from sequence) and generative tasks (designing new sequences with desired regulatory properties), the latter cast as masked discrete diffusion over the sequence modality.
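The joint masking idea described above can be illustrated with a toy example. This is a minimal sketch, not Nona's implementation (no public code was available): the function name `mask_modalities`, the mask symbol, the per-modality masking rates, and the independent per-position masking scheme are all assumptions made for illustration.

```python
import random

MASK = "N"  # hypothetical mask symbol for the sequence modality


def mask_modalities(seq, track, seq_rate, track_rate, rng):
    """Independently mask positions in a DNA sequence and an aligned per-base track.

    Returns the corrupted inputs plus the indices that a model would be
    asked to reconstruct in each modality. Illustrative only.
    """
    if len(seq) != len(track):
        raise ValueError("sequence and track must be aligned base-for-base")

    # Each position is masked with the given probability, separately per modality.
    seq_idx = [i for i in range(len(seq)) if rng.random() < seq_rate]
    track_idx = [i for i in range(len(track)) if rng.random() < track_rate]
    seq_set, track_set = set(seq_idx), set(track_idx)

    masked_seq = "".join(MASK if i in seq_set else b for i, b in enumerate(seq))
    masked_track = [None if i in track_set else v for i, v in enumerate(track)]
    return masked_seq, masked_track, seq_idx, track_idx
```

Under this toy scheme, the extremes of the two rates resemble the different task settings: `seq_rate=0.0, track_rate=1.0` leaves the sequence intact and hides the whole track (a sequence-to-function setup), while a high `seq_rate` asks for sequence to be reconstructed (closer to language modeling or design). Whether Nona samples masks this way is not stated in the text above.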
A defining claim of the work is generality without retraining: the authors demonstrate three distinct downstream applications drawn directly from the single pretrained architecture, rather than fine-tuning bespoke models per task. This positions Nona within the broader movement — alongside genomic foundation models such as Enformer-style sequence-to-function predictors and DNA language models like the Nucleotide Transformer and Evo lineage — toward unified, reusable backbones for regulatory genomics.
Nona is a multimodal masked-modeling framework that unifies three capabilities (sequence-to-function prediction, DNA language modeling, and generative regulatory element design) in one pretrained model. Pretraining jointly uses DNA sequence and base-resolution functional genomics data under a masked-reconstruction objective. Generation is performed via masked discrete diffusion, a class of discrete generative methods that starts from masked tokens and progressively unmasks them, which makes it well suited to sequence design. The work is described in a bioRxiv preprint posted on 6 November 2025 (version 2, 18 November 2025; DOI 10.1101/2025.11.06.687036). Architectural details such as parameter count, context length, and the specific functional-genomics assays used for training are reported in the preprint; the figures are not restated here because they have not been confirmed against the primary source.
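To make "progressively unmask tokens" concrete, the following is a minimal sketch of a generic iterative-unmasking sampler. It is not Nona's sampler: `toy_denoiser` is a hypothetical stand-in that returns uniform base probabilities, and the random reveal order and even per-step schedule are assumptions chosen for simplicity.

```python
import random

ALPHABET = "ACGT"
MASK = "_"  # hypothetical mask token


def toy_denoiser(tokens):
    """Stand-in for a trained network: per-position probabilities over ALPHABET.

    Uniform here; a real model would condition on the unmasked context.
    """
    return [[0.25, 0.25, 0.25, 0.25] for _ in tokens]


def sample_by_unmasking(length, steps, rng, denoiser=toy_denoiser):
    """Generate a sequence by revealing masked positions over a fixed number of steps."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    tokens = [MASK] * length
    for step in range(steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        # Reveal an even share of what is still masked; everything left on the last step.
        remaining_steps = steps - step
        n_reveal = -(-len(masked) // remaining_steps)  # ceiling division
        probs = denoiser(tokens)
        for i in rng.sample(masked, n_reveal):
            tokens[i] = rng.choices(ALPHABET, weights=probs[i], k=1)[0]
    return "".join(tokens)
```

Calling `sample_by_unmasking(50, 8, random.Random(1))` returns a 50-base string with no mask tokens left. With a trained denoiser in place of the uniform stand-in, each step's predictions would depend on the bases already revealed, which is what lets this family of samplers steer a design toward desired properties.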
Nona targets researchers in regulatory and functional genomics who need both predictive and generative capabilities from a common model. Sequence-to-function prediction supports interpreting how genomic sequence — including variants — shapes regulatory activity, relevant to prioritizing noncoding variants and dissecting gene regulation. The DNA language modeling capability provides general-purpose sequence representations for downstream analysis, while the masked-diffusion generative mode enables de novo design of regulatory elements such as promoters or enhancers with intended properties, of interest to synthetic biology and therapeutic discovery teams. Because the three applications derive from one pretrained backbone without retraining, the model is positioned to streamline workflows that would otherwise require assembling several specialized tools.
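One common way sequence-to-function models are used for variant interpretation is to compare predictions for the reference and alternate alleles. The sketch below shows that pattern only; `toy_predict` is a hypothetical placeholder (it simply scores G/C bases) and `variant_effect` is an illustrative helper, neither of which comes from Nona.

```python
def toy_predict(seq):
    """Hypothetical stand-in predictor: per-base activity of 1.0 at G/C, else 0.0."""
    return [1.0 if base in "GC" else 0.0 for base in seq]


def variant_effect(ref_seq, pos, alt_base, predict=toy_predict):
    """Score a single-nucleotide variant as summed alt prediction minus summed ref prediction."""
    if not 0 <= pos < len(ref_seq):
        raise ValueError("pos is outside the sequence")
    if ref_seq[pos] == alt_base:
        raise ValueError("alt_base matches the reference base")
    # Build the alternate sequence by substituting one base.
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    return sum(predict(alt_seq)) - sum(predict(ref_seq))
```

For example, `variant_effect("AATAA", 2, "G")` is positive under the toy predictor because the substitution adds a G. With a real model, `predict` would return the predicted functional track for the window around the variant.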
Nona contributes to the consolidation of regulatory genomics around unified foundation models by showing that prediction, language modeling, and generative design can share a single multimodal masked-modeling backbone grounded in base-resolution functional data. As a Genentech BRAID release, it reflects industry investment in reusable genomic foundation models for therapeutic discovery. As of June 2026, no public code or model weights were located: the previously expected genentech/nona GitHub and Hugging Face repositories return 404. The preprint is released under CC BY-NC 4.0, and the license for any future model weights is unspecified. These openness gaps, together with the preprint's not-yet-peer-reviewed status, are the primary caveats for prospective users; independent benchmarking and reproducibility checks await the release of artifacts or peer review.
Nair, S., et al. (2025) Nona: A unifying multimodal masking framework for functional genomics. bioRxiv.
DOI: 10.1101/2025.11.06.687036