Overview

CryoFM is a generative foundation model for cryo-electron microscopy (cryo-EM) density maps, developed by ByteDance Seed. Rather than training separate supervised models for each cryo-EM processing task, CryoFM learns a generalizable prior over the distribution of high-quality biomolecular density maps using flow matching. At inference time, this prior is applied to multiple downstream tasks — including denoising, map sharpening, and missing wedge restoration — without any task-specific fine-tuning, using a technique called flow posterior sampling.

The original CryoFM model was released as an arXiv preprint in October 2024 and subsequently accepted at ICLR 2025. A substantially expanded version, CryoFM-v2, was released as a bioRxiv preprint in December 2025 with an extended Bayesian inference framework that adds support for preferred-orientation correction, spatially heterogeneous signal-to-noise handling, and cryo-electron tomography (cryo-ET) artifact correction.

The model was trained on a curated subset of the Electron Microscopy Data Bank (EMDB), comprising 3,447 high-quality single-particle cryo-EM half-map pairs filtered to a reported resolution better than 3.0 Angstroms. A held-out set of 32 entries was reserved for benchmarking. The EMDB IDs used for both training and testing are publicly available via Figshare.

Key Features

Flow posterior sampling: Uses CryoFM as a Bayesian prior combined with task-specific likelihood models to perform denoising, sharpening, and missing wedge restoration at inference time without fine-tuning.
Zero-shot generalization: A single trained model addresses multiple cryo-EM processing tasks, eliminating the need for task-specific supervised training data.
Dual model scales: CryoFM-S captures fine local structural detail at high resolution; CryoFM-L emphasizes global shape at medium-to-low resolution.
Cryo-ET support: Extends to cryo-electron tomography, including correction of missing wedge artifacts that distort density along the beam axis in tomographic reconstructions.
Hallucination resistance: Posterior sampling is explicitly constrained by dataset-derived likelihood models, reducing the risk of fabricated structural features compared to unconstrained generative methods.
Fine-tuning ready: CryoFM-v2 weights on HuggingFace support fine-tuning into conditional generative models for specialized density post-processing tasks.
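
The Bayesian reading behind the flow posterior sampling and hallucination resistance features can be written compactly. The notation below is introduced here for illustration and is not taken from the CryoFM papers: x is the clean density map, y the observed degraded data, A a degradation operator (for example a Fourier-space mask for the missing wedge), and sigma a noise level. The Gaussian likelihood is an assumed example; the actual likelihood models used by CryoFM may differ.

```latex
p(x \mid y) \;\propto\; p(y \mid x)\, p(x),
\qquad
p(y \mid x) = \mathcal{N}\!\left(y;\; A x,\; \sigma^{2} I\right)
```

In this reading, p(x) is the prior learned once by the generative model, while p(y | x) is swapped per task. That separation is what would allow a single trained model to serve several tasks without fine-tuning.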

Technical Details

CryoFM uses a 3D U-Net backbone trained with flow matching. The U-Net combines neighborhood attention (NA), which operates on localized spatial windows to capture fine local structure efficiently, with global attention (GA) layers that integrate information across the full volume. This hybrid design keeps attention cost manageable at the high-resolution input and output layers while still supplying long-range context at the bottleneck.
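
To make the cost trade-off concrete, the following sketch counts query-key pairs for global versus neighborhood attention on a cubic grid of tokens. The grid sizes (32 and 8 per side) and the window size of 7 are illustrative assumptions, not CryoFM's published configuration, and the function names are hypothetical.

```python
def global_attention_pairs(side: int) -> int:
    """Query-key pairs when every token attends to every token."""
    tokens = side ** 3
    return tokens * tokens


def neighborhood_attention_pairs(side: int, window: int) -> int:
    """Query-key pairs when each token attends to a window^3 neighborhood."""
    if window > side:
        raise ValueError("window must not exceed the grid side length")
    tokens = side ** 3
    return tokens * window ** 3


if __name__ == "__main__":
    # Hypothetical sizes: a large grid near the input, a small one at the bottleneck.
    for side in (32, 8):
        g = global_attention_pairs(side)
        n = neighborhood_attention_pairs(side, window=7)
        print(f"side={side}: global={g:,} neighborhood={n:,} ratio={g / n:.1f}")
```

Under these assumed sizes, global attention is far more expensive than windowed attention on the large grid but only modestly more expensive on the small one, which is consistent with reserving global attention for the coarsest levels of the network.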

Flow matching trains the model to learn a continuous vector field that maps samples from a simple Gaussian distribution to the complex target distribution of high-quality cryo-EM half maps. At inference, flow posterior sampling conditions this generative process on observed degraded data by incorporating explicit likelihood models that describe the specific experimental degradation (e.g., spectral noise from measurement, anisotropic angular sampling, or the missing wedge in tomography). CryoFM-v2 extends the likelihood framework with additional components covering preferred-orientation artifacts and spatially heterogeneous signal-to-noise ratios.

In benchmarks evaluated with Fourier shell correlation (FSC) and structural similarity (SSIM) metrics, CryoFM achieves state-of-the-art or competitive performance on denoising, map sharpening, and missing wedge restoration, outperforming supervised baselines such as DeepEMhancer and crefDenoiser while requiring no task-specific training data.
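
The flow matching objective and the sampling step can be sketched on a scalar toy problem. This is a minimal illustration assuming the common straight-line path between noise and data, not CryoFM's actual training code: the real model operates on 3D volumes with a U-Net, and the likelihood-guided correction used in flow posterior sampling is deliberately left out. The names flow_matching_loss, euler_sample, and optimal_velocity are hypothetical.

```python
import random


def flow_matching_loss(velocity_fn, targets, rng):
    """Monte Carlo estimate of the flow matching loss for scalar targets.

    Uses the straight path x_t = (1 - t) * x0 + t * x1 with x0 ~ N(0, 1),
    whose regression target is the constant velocity x1 - x0.
    """
    total = 0.0
    for x1 in targets:
        x0 = rng.gauss(0.0, 1.0)      # noise sample
        t = rng.random()              # time drawn from [0, 1)
        xt = (1.0 - t) * x0 + t * x1  # point on the straight path
        target = x1 - x0              # velocity along that path
        total += (velocity_fn(xt, t) - target) ** 2
    return total / len(targets)


def euler_sample(velocity_fn, x0, n_steps=50):
    """Integrate dx/dt = v(x, t) from t = 0 (noise) to t = 1 (sample)."""
    x = x0
    dt = 1.0 / n_steps
    for k in range(n_steps):
        x += dt * velocity_fn(x, k * dt)
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    c = 3.0  # toy "dataset" in which every target equals c

    def optimal_velocity(x, t):
        # Closed-form minimizer when the target distribution is a point mass at c.
        return (c - x) / (1.0 - t)

    print("loss:", flow_matching_loss(optimal_velocity, [c] * 1000, rng))
    print("sample:", euler_sample(optimal_velocity, rng.gauss(0.0, 1.0)))
```

With a point-mass target the optimal velocity field is known in closed form, so the loss is zero up to rounding and integration carries any noise draw to the target value. A posterior sampler would additionally adjust each step using the gradient of a task-specific likelihood.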

Applications

CryoFM is relevant wherever cryo-EM density maps suffer from noise, limited angular sampling, or reconstruction artifacts. Structural biologists processing single-particle cryo-EM datasets can use it for map denoising and sharpening to improve downstream model-building interpretability. Cryo-ET practitioners can apply it to correct missing wedge artifacts that are unavoidable in tomographic data collection. The model is also applicable to samples with preferred orientations — a persistent challenge for membrane proteins and other flat or elongated complexes — and to datasets with spatially non-uniform signal-to-noise, such as flexible assemblies or small particles on thin ice. For groups developing specialized density modification pipelines, CryoFM-v2 provides a foundation for fine-tuning on domain-specific data.
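
The geometry of the missing wedge mentioned above is simple to state in Fourier space. The sketch below assumes single-axis tilting about the y axis with the untilted beam along z, and uses a tilt range of plus or minus 60 degrees purely as an illustrative value. The function names are hypothetical and are not part of any CryoFM API.

```python
import math


def is_sampled(kx: float, kz: float, max_tilt_deg: float) -> bool:
    """True if Fourier coordinate (kx, kz) lies on some tilted central slice.

    A projection at tilt angle a samples the line through the origin that
    makes angle a with the kx axis, so coverage depends only on direction.
    """
    if kx == 0.0 and kz == 0.0:
        return True  # the origin lies on every slice
    angle = math.degrees(math.atan2(abs(kz), abs(kx)))
    return angle <= max_tilt_deg


def missing_fraction(max_tilt_deg: float) -> float:
    """Fraction of directions in the kx-kz plane that no tilt covers."""
    return 1.0 - max_tilt_deg / 90.0


if __name__ == "__main__":
    max_tilt = 60.0  # illustrative tilt range of +/-60 degrees
    directions = [math.radians(d + 0.5) for d in range(360)]
    missed = sum(
        not is_sampled(math.cos(a), math.sin(a), max_tilt) for a in directions
    )
    print(f"analytic missing fraction: {missing_fraction(max_tilt):.3f}")
    print(f"empirical missing fraction: {missed / len(directions):.3f}")
```

The unsampled directions cluster around the kz axis, which is why the resulting distortion appears along the beam direction in real space. A restoration method has to fill this region from prior knowledge, since the measurement provides no information there.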

Impact

CryoFM represents a principled shift in cryo-EM image processing from task-specific supervised models to a single generalist generative prior, demonstrating that zero-shot generalization across structurally distinct processing problems is achievable in the cryo-EM domain. Acceptance at ICLR 2025 marks formal peer-reviewed validation of the original framework. The release of CryoFM-v2 on HuggingFace broadens accessibility and establishes a publicly available foundation for the community to build upon.

Key limitations include a resolution ceiling imposed by training data restricted to maps with a reported resolution better than 3.0 Angstroms, the exclusion of helical reconstructions and subtomogram averaging workflows from the training set, and the computational overhead of iterative flow posterior sampling relative to single-pass supervised methods. CryoFM-v2, released in December 2025, remains a preprint pending peer review.

Citations

Zhou, Y., et al. (2024) CryoFM: A Flow-based Foundation Model for Cryo-EM Densities. International Conference on Learning Representations. Preprint DOI: 10.48550/arXiv.2410.08631

Li, Y., et al. (2025) A Generative Foundation Model for Cryo-EM Densities. bioRxiv. DOI: 10.64898/2025.12.29.696802
