ProtNHF

Neural Hamiltonian flow for protein sequence generation with inference-time control over composition and net charge via analytical bias potentials.

Released: March 2026

Generative models for protein sequences can sample plausible proteins, but steering them toward specific physicochemical targets—a desired amino acid composition, a particular net charge, or other compositional constraints—usually means conditioning the model during training or fine-tuning it for each new objective. ProtNHF takes a different route, framing controllable protein generation as a problem in Hamiltonian dynamics so that control can be applied entirely at inference time.

Developed at Oak Ridge National Laboratory and posted to bioRxiv in March 2026, ProtNHF (Protein Neural Hamiltonian Flows) learns a symplectic transport map that moves samples from a simple latent distribution to the space of protein embeddings. A transformer parameterizes a learned potential-energy function that, combined with a kinetic term, defines Hamiltonian dynamics integrated with a leapfrog scheme. Because the dynamics are governed by an energy function, external "bias potentials" can be added directly into the Hamiltonian at generation time to nudge sampling toward desired properties without altering or retraining the underlying model.
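
The leapfrog-plus-bias idea described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_grad_u` is a hypothetical quadratic stand-in for the gradient of the transformer potential, the kinetic term is assumed to be unit-mass, and `with_bias` and its `strength` parameter are names invented here.

```python
from typing import Callable, List, Tuple

Vector = List[float]
GradFn = Callable[[Vector], Vector]


def leapfrog(q: Vector, p: Vector, grad_u: GradFn,
             step: float, n_steps: int) -> Tuple[Vector, Vector]:
    """Integrate dq/dt = p, dp/dt = -grad U(q) with the leapfrog scheme."""
    q, p = list(q), list(p)
    g = grad_u(q)
    for _ in range(n_steps):
        p = [pi - 0.5 * step * gi for pi, gi in zip(p, g)]  # half kick
        q = [qi + step * pi for qi, pi in zip(q, p)]        # full drift
        g = grad_u(q)
        p = [pi - 0.5 * step * gi for pi, gi in zip(p, g)]  # half kick
    return q, p


def toy_grad_u(q: Vector) -> Vector:
    """Gradient of U(q) = 0.5 * |q|^2 (assumption: stand-in for the learned potential)."""
    return list(q)


def with_bias(grad_u: GradFn, grad_bias: GradFn, strength: float) -> GradFn:
    """Return the gradient of U + strength * B without touching grad_u itself."""
    def total(q: Vector) -> Vector:
        return [a + strength * b for a, b in zip(grad_u(q), grad_bias(q))]
    return total


if __name__ == "__main__":
    # Hypothetical bias B(q) = 0.5 * (q[0] - 2)^2 pulling the first coordinate toward 2.
    biased = with_bias(toy_grad_u, lambda q: [q[0] - 2.0, 0.0], strength=3.0)
    print(leapfrog([1.0, -0.5], [0.3, 0.8], toy_grad_u, 0.1, 50))
    print(leapfrog([1.0, -0.5], [0.3, 0.8], biased, 0.1, 50))
```

Because the bias enters only through the gradient passed to the integrator, the base potential is left unchanged, which mirrors the retraining-free control described in the text. Running the integrator forward, negating the momentum, and running it again returns to the starting point up to floating-point error, illustrating the reversibility of the scheme.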

This places ProtNHF in the family of flow- and energy-based generative models for proteins, but with an unusual emphasis: smooth, quantitative, post-hoc control derived from the physics-inspired structure of Hamiltonian flows rather than from learned conditioning.

Key Features

Inference-time controllability: Analytical bias potentials are injected into the Hamiltonian at sampling time, so properties like composition and net charge can be tuned without modifying or retraining the learned model.
Symplectic transport map: The model learns a volume-preserving, reversible (and approximately energy-conserving) map from a latent distribution to protein embeddings, integrated with leapfrog dynamics.
Transformer potential: A transformer parameterizes the potential-energy function, supplying the expressive, sequence-aware energy landscape that drives the flow.
Smooth property steering: Control over amino acid composition and net charge is continuous and quantitative, allowing graded rather than all-or-nothing constraints while preserving sequence diversity.
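
To make "analytical bias potential" concrete, the sketch below defines a harmonic bias on expected net charge together with its closed-form gradient. Everything here is an assumption for illustration: the source does not say how properties are read out from embeddings, so the per-position logits over the 20 amino acids, the simplified integer charge table, and the names `expected_net_charge`, `charge_bias`, and `charge_bias_grad` are all hypothetical.

```python
import math
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Assumption: simplified charges near neutral pH (K, R = +1; D, E = -1; others 0).
CHARGE = {aa: 0.0 for aa in AMINO_ACIDS}
CHARGE.update({"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0})


def softmax(logits: List[float]) -> List[float]:
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def expected_net_charge(logits: List[List[float]]) -> float:
    """Expected net charge given per-position logits over AMINO_ACIDS."""
    total = 0.0
    for row in logits:
        probs = softmax(row)
        total += sum(p * CHARGE[aa] for p, aa in zip(probs, AMINO_ACIDS))
    return total


def charge_bias(logits: List[List[float]], target: float, k: float = 1.0) -> float:
    """Harmonic bias B = 0.5 * k * (Q - target)^2 on the expected net charge Q."""
    return 0.5 * k * (expected_net_charge(logits) - target) ** 2


def charge_bias_grad(logits: List[List[float]], target: float,
                     k: float = 1.0) -> List[List[float]]:
    """Closed-form gradient of charge_bias with respect to every logit."""
    q = expected_net_charge(logits)
    grad = []
    for row in logits:
        probs = softmax(row)
        mean_c = sum(p * CHARGE[aa] for p, aa in zip(probs, AMINO_ACIDS))
        # d E[c] / d logit_a = p_a * (c_a - E[c]) for a softmax distribution.
        grad.append([k * (q - target) * p * (CHARGE[aa] - mean_c)
                     for p, aa in zip(probs, AMINO_ACIDS)])
    return grad
```

The stiffness `k` gives the graded control mentioned above: a small value nudges the expected charge toward `target`, while a large value enforces it more tightly. A composition bias could follow the same pattern by replacing the charge table with per-residue indicator weights.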

Technical Details

ProtNHF combines a learned, transformer-based potential energy with a kinetic term to construct a Hamiltonian whose dynamics define an invertible, volume-preserving flow. Training fits this neural Hamiltonian so that deterministic leapfrog integration maps samples from a latent distribution onto the distribution of protein sequence embeddings. At inference, analytical bias functions—encoding objectives such as target amino acid composition or net charge—are added to the Hamiltonian, biasing the integrated trajectories toward sequences satisfying those constraints. The authors report that this steering maintains sequence validity and diversity while delivering continuous control, demonstrating the approach on compositional and charge targets without any retraining of the base model.
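
In symbols, one plausible reading of this construction is given below. The notation is not taken from the preprint: q denotes the embedding-space position, p the momentum, U_θ the transformer potential, B an analytical bias, λ a hypothetical bias strength, and ε the step size; a unit-mass kinetic term is assumed.

```latex
\begin{aligned}
H_\lambda(q, p) &= U_\theta(q) + \tfrac{1}{2}\, p^\top p + \lambda\, B(q), \\
p_{t+1/2} &= p_t - \tfrac{\epsilon}{2}\, \nabla_q \big[ U_\theta + \lambda B \big](q_t), \\
q_{t+1} &= q_t + \epsilon\, p_{t+1/2}, \\
p_{t+1} &= p_{t+1/2} - \tfrac{\epsilon}{2}\, \nabla_q \big[ U_\theta + \lambda B \big](q_{t+1}).
\end{aligned}
```

Setting λ = 0 recovers the unbiased flow. Each of the three updates is a shear with unit Jacobian determinant, so the composed map stays volume-preserving and invertible for any differentiable B.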

Applications

ProtNHF is intended for protein designers who need to bias sequence generation toward specific biophysical properties—for example, tuning net charge for solubility or purification, or constraining amino acid composition to meet expression or formulation requirements. Because control is applied at inference time, a single trained model can serve many design campaigns with different objectives, which is attractive for high-throughput in silico screening and for exploratory design where targets shift frequently. The approach is most directly useful to computational protein engineers and groups exploring energy- and physics-inspired generative methods.
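
For screening generated sequences against such targets after sampling, a simple post-hoc check might look like the following. This is a sketch, not part of ProtNHF: the helper names are hypothetical, and net charge is approximated by counting K and R as +1 and D and E as -1, ignoring histidine, termini, and pH-dependent effects.

```python
from collections import Counter
from typing import Dict, List

POSITIVE = set("KR")  # assumption: counted as +1 near neutral pH
NEGATIVE = set("DE")  # assumption: counted as -1 near neutral pH


def net_charge(seq: str) -> int:
    """Approximate net charge as (#K + #R) - (#D + #E)."""
    return sum((aa in POSITIVE) - (aa in NEGATIVE) for aa in seq)


def composition(seq: str) -> Dict[str, float]:
    """Fraction of each amino acid present in the sequence."""
    if not seq:
        return {}
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in counts.items()}


def filter_by_charge(seqs: List[str], lo: int, hi: int) -> List[str]:
    """Keep sequences whose approximate net charge lies in [lo, hi]."""
    return [s for s in seqs if lo <= net_charge(s) <= hi]


if __name__ == "__main__":
    candidates = ["MKKRDE", "DDDEEA", "AKDG"]
    for s in candidates:
        print(s, net_charge(s), composition(s))
    print(filter_by_charge(candidates, 0, 1))
```

A filter like this is useful mainly as a sanity check on whether inference-time steering actually moved the sampled population toward the intended charge or composition window.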

Impact

ProtNHF illustrates how physics-inspired generative architectures can deliver controllability "for free" at inference, decoupling property steering from model training and avoiding the cost of retraining for each new objective. If the approach generalizes, it offers a template for adding interpretable, analytically specified constraints to protein generative models more broadly. As an early preprint without a confirmed public code or weights release, its practical performance relative to conditioned diffusion and autoregressive designers remains to be benchmarked independently, and the demonstrated controls so far focus on composition and net charge rather than structure-level objectives.

Citation

ProtNHF: Neural Hamiltonian Flows for Controllable Protein Sequence Generation

Raghavan, B. & Rogers, D. M. (2026) ProtNHF: Neural Hamiltonian Flows for Controllable Protein Sequence Generation. bioRxiv.

DOI: 10.64898/2026.03.04.709305

Citations

Total citations: 0
Influential: 0
References: 44

Openness

bio.rodeo openness: Open weights (open weights, closed recipe)
Score: 64 (Partial)
Usability (can I run it?): 99
Reproducibility (can I retrain it?): 29

Model Openness Framework: Class III (Open Model)

Resources

Research Paper
