A foundation model for global 3D genome architecture that uses masked locus modeling over genome-wide contact profiles to represent chromosome-scale organization.
ARCH3D is a foundation model for the three-dimensional architecture of the genome, developed by researchers at the University of Michigan and posted to bioRxiv in early 2026. The spatial folding of chromosomes, captured experimentally by Hi-C and related chromosome-conformation assays, encodes how distant regulatory elements and genes come into contact. Most computational models of this data, however, operate locally or on individual contact maps. ARCH3D instead aims to represent genome structure at a global, genome-wide scale.
The model is trained with a masked locus modeling objective: the genome-wide contact profiles of some loci are masked out, and the model learns to predict them from the profiles of the remaining loci across the whole genome. This self-supervised approach lets the model build representations that capture long-range, chromosome-spanning organization rather than only nearby interactions.
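The shape of a masked locus modeling objective can be illustrated with a toy sketch. This is not the ARCH3D implementation: the synthetic contact matrix, the 15% mask fraction, the function names, and the mean-profile baseline that stands in for a learned network are all assumptions made for illustration. The sketch only shows the general pattern of hiding the contact profiles of some loci and scoring a prediction on the hidden loci alone.

```python
import numpy as np


def make_toy_contacts(n_loci, rng):
    """Build a symmetric toy contact matrix (hypothetical stand-in for Hi-C data)."""
    idx = np.arange(n_loci)
    dist = np.abs(idx[:, None] - idx[None, :])
    base = 1.0 / (1.0 + dist)  # contact strength decays with genomic distance
    noise = rng.normal(0.0, 0.01, size=(n_loci, n_loci))
    noise = (noise + noise.T) / 2.0  # symmetrize so the matrix stays symmetric
    return base + noise


def mask_loci(contacts, mask_fraction, rng):
    """Zero out the contact profiles (rows) of a random subset of loci."""
    n = contacts.shape[0]
    n_masked = max(1, int(round(mask_fraction * n)))
    masked_idx = np.sort(rng.choice(n, size=n_masked, replace=False))
    masked = contacts.copy()
    masked[masked_idx, :] = 0.0
    return masked, masked_idx


def baseline_predict(masked_contacts, masked_idx):
    """Placeholder predictor: fill each masked profile with the mean visible profile.

    A real model would be a learned network; this baseline is an assumption
    used only to make the objective concrete.
    """
    n = masked_contacts.shape[0]
    visible = np.setdiff1d(np.arange(n), masked_idx)
    mean_profile = masked_contacts[visible].mean(axis=0)
    pred = masked_contacts.copy()
    pred[masked_idx, :] = mean_profile
    return pred


def masked_loss(pred, target, masked_idx):
    """Mean squared error computed only over the masked loci."""
    diff = pred[masked_idx] - target[masked_idx]
    return float(np.mean(diff ** 2))


rng = np.random.default_rng(0)
contacts = make_toy_contacts(64, rng)
masked, masked_idx = mask_loci(contacts, 0.15, rng)
pred = baseline_predict(masked, masked_idx)
loss = masked_loss(pred, contacts, masked_idx)
print(f"masked loci: {len(masked_idx)}, reconstruction MSE: {loss:.4f}")
```

In a training loop, the loss on the masked loci would be the quantity minimized with respect to the model's parameters; scoring only the hidden profiles is what makes the task self-supervised.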
Because it preserves spatial genomic organization and reconstructs chromosome-scale interactions even when data are limited, ARCH3D is positioned as a building block toward computational systems that can simulate genome structure and function.
ARCH3D is trained via masked locus modeling on genome-wide contact profiles, learning to infer the contact profile of each masked locus from the profiles of loci distributed across the whole genome. The preprint reports that the model preserves spatial genomic organization, reconstructs chromosome-spanning interactions despite limited data availability, and identifies multi-way interactions. The work is released under a CC BY license. Because it is a recent preprint, specific architectural details, parameter counts, training datasets, and the availability of released weights and code should be confirmed against the manuscript.
ARCH3D is intended for genome-biology and computational-genomics researchers studying 3D genome organization. Potential uses include representing and reconstructing chromosome-scale contact structure from sparse or incomplete measurements, detecting multi-way chromatin interactions, and providing learned embeddings of genome architecture that can support downstream analyses or simulations of genome function.
ARCH3D extends the foundation-model paradigm from genomic sequence to the spatial, 3D organization of the genome, addressing a global modeling gap left by methods that work locally or on one contact map at a time. Because it is a recent preprint, its broader adoption and independent validation remain to be established, but it points toward foundation models that could underpin simulations of how genome structure shapes function.