bio.rodeo


ProtFlow

Zhejiang University

A flow-matching generative model for peptide sequence design that learns the protein semantic distribution, with antimicrobial-peptide fine-tuning.

Released: February 2026

ProtFlow is a generative model for protein and peptide sequence design that uses rectified flow matching to learn the underlying semantic distribution of the protein design space. It was developed by researchers in the College of Computer Science and Technology at Zhejiang University and posted to bioRxiv in early 2026. Many recent sequence-design methods rely on autoregressive language models or diffusion. ProtFlow instead applies flow matching, a continuous-time generative paradigm that learns to transport noise to data along straight (rectified) paths, to the problem of proposing functional peptide sequences.
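
For readers unfamiliar with the paradigm, the generic rectified flow formulation can be written as below. This is the standard textbook form and may differ from ProtFlow's exact objective. The symbols are notation chosen here, not taken from the preprint: $x_0$ is a noise sample, $x_1$ a data sample, and $v_\theta$ the learned velocity field.

```latex
\begin{align}
  x_t &= (1 - t)\,x_0 + t\,x_1, \qquad t \in [0, 1] \\
  \mathcal{L}(\theta) &= \mathbb{E}_{x_0,\,x_1,\,t}
      \bigl\| v_\theta(x_t, t) - (x_1 - x_0) \bigr\|^2 \\
  \frac{\mathrm{d}x_t}{\mathrm{d}t} &= v_\theta(x_t, t),
      \qquad x_0 \sim \mathcal{N}(0, I)
\end{align}
```

The first line defines the straight path between noise and data, the second trains the network to predict that path's constant velocity, and the third is the ODE integrated from $t = 0$ to $t = 1$ at sampling time.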

A central design choice is to model the protein "semantic distribution" through a semantic integration network, so that generation is grounded in learned representations of sequence meaning rather than raw token statistics alone. The authors pretrain on a large corpus of peptide sequences and then fine-tune toward a concrete therapeutic objective: the design of antimicrobial peptides (AMPs) active against a range of pathogens.

ProtFlow sits within the fast-growing space of generative protein-design models, contributing a flow-matching approach aimed at efficient, high-quality peptide generation with controllable functional properties.

#Key Features

  • Rectified flow matching: Uses a flow-matching generative process to capture the protein design manifold efficiently along rectified transport paths.
  • Semantic distribution learning: A semantic integration network grounds generation in learned representations of protein sequence semantics.
  • Antimicrobial-peptide focus: Fine-tuned to design AMPs with desired activity profiles across multiple pathogens.
  • Functional controllability: Targets generation of high-quality peptides with specified activity rather than unconditioned sampling.

#Technical Details

ProtFlow employs a rectified flow-matching algorithm together with a semantic integration network to model the distribution over peptide sequences. According to the preprint, the model is pretrained on roughly 2.6 million peptide sequences and then fine-tuned on antimicrobial peptides, after which it is evaluated on its ability to generate high-quality peptides with desired antimicrobial activity across various pathogens. The paper reports that ProtFlow generates peptides that compare favorably to prior approaches on these design objectives. It is released under a CC BY-NC-ND license. Because the work is a recent preprint, exact parameter counts, full hyperparameters, and the availability of released weights and code should be confirmed against the manuscript.
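
To make the rectified flow-matching idea concrete, here is a minimal NumPy sketch of its three generic ingredients: straight-line interpolation, the velocity regression loss, and Euler sampling. It is not ProtFlow's implementation. The function names (`interpolate`, `flow_matching_loss`, `euler_sample`) are hypothetical, the vectors are toy data, and how ProtFlow maps sequences to continuous representations is not covered here.

```python
import numpy as np

def interpolate(x0, x1, t):
    """Point at time t in [0, 1] on the straight path from x0 (noise) to x1 (data)."""
    t = np.asarray(t, dtype=float).reshape(-1, 1)  # one time value per batch row
    return (1.0 - t) * x0 + t * x1

def flow_matching_loss(velocity_fn, x0, x1, t):
    """Mean squared error between predicted velocity and the path velocity x1 - x0."""
    xt = interpolate(x0, x1, t)
    pred = velocity_fn(xt, t)
    return float(np.mean(np.sum((pred - (x1 - x0)) ** 2, axis=1)))

def euler_sample(velocity_fn, x0, n_steps=10):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with fixed-step Euler."""
    x = np.array(x0, dtype=float)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = np.full(x.shape[0], k * dt)
        x = x + dt * velocity_fn(x, t)
    return x

# Toy check (assumption: the "data" is the noise shifted by a constant vector,
# so the ideal velocity field is that constant and stands in for a trained network).
rng = np.random.default_rng(0)
shift = np.array([1.0, -2.0, 0.5])
x0 = rng.normal(size=(4, 3))
x1 = x0 + shift

def ideal_velocity(x, t):
    return np.broadcast_to(shift, x.shape)

loss = flow_matching_loss(ideal_velocity, x0, x1, rng.uniform(size=4))
samples = euler_sample(ideal_velocity, x0, n_steps=10)
```

In a real model `ideal_velocity` would be a neural network trained by minimizing `flow_matching_loss`. Because the target paths are straight, few integration steps are needed when the learned field is close to constant along each path, which is the usual efficiency argument for rectified flows.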

#Applications

ProtFlow is intended for researchers designing functional peptides, with antimicrobial peptides as the primary demonstrated use case. Such models propose and triage candidate sequences computationally, for example AMPs targeting drug-resistant pathogens, before synthesis and experimental assays, narrowing large design spaces to promising leads.
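
As an illustration of what computational triage can look like downstream of any generator, the sketch below filters and ranks candidate peptides by two crude physicochemical heuristics. It is not part of ProtFlow. The function names, residue groupings, and thresholds are all assumptions chosen for illustration, motivated only by the general observation that many AMPs are cationic and partly hydrophobic.

```python
POSITIVE = set("KR")           # residues counted as +1 near neutral pH
NEGATIVE = set("DE")           # residues counted as -1 near neutral pH
HYDROPHOBIC = set("AILMFWV")   # illustrative hydrophobic set (an assumption)

def net_charge(seq):
    """Crude net charge: +1 per K/R, -1 per D/E; histidine and termini are ignored."""
    return sum(aa in POSITIVE for aa in seq) - sum(aa in NEGATIVE for aa in seq)

def hydrophobic_fraction(seq):
    """Fraction of residues in the illustrative hydrophobic set."""
    return sum(aa in HYDROPHOBIC for aa in seq) / len(seq) if seq else 0.0

def triage(candidates, min_charge=2, hydro_range=(0.3, 0.6), max_len=50):
    """Keep candidates passing all heuristic filters, ranked by net charge (highest first)."""
    low, high = hydro_range
    kept = [
        s for s in candidates
        if len(s) <= max_len
        and net_charge(s) >= min_charge
        and low <= hydrophobic_fraction(s) <= high
    ]
    return sorted(kept, key=net_charge, reverse=True)

# Hypothetical candidate sequences, not taken from the preprint.
candidates = ["KRKRAIAI", "DDEEDDEE", "KKLLKKLLKK", "AAAAAAAAAK"]
shortlist = triage(candidates)
```

Filters like these only reduce the number of sequences sent to synthesis. They do not predict activity, so experimental assays remain the deciding step.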

#Impact

ProtFlow adds flow matching to the toolbox of generative peptide-design methods, emphasizing semantic-distribution learning and a concrete antimicrobial-peptide application. Because it is a recent preprint under a non-commercial license, its broader adoption and independent experimental validation remain to be established. It nonetheless reflects growing interest in flow-based generative models for protein and peptide design.

Tags

protein_design, peptide_design, de_novo_design, flow_matching, generative, foundation_model, antimicrobial_peptides