RFdiffusion for de novo protein binder design
RFdiffusion generates novel protein structures conditioned on a target binding site, enabling de novo protein design and computational binder discovery without starting from any known scaffold or natural template.
Diffusion-based backbone generation for protein design
RFdiffusion is a generative model that produces protein backbone structures by reversing a noise diffusion process. Trained on the Protein Data Bank, it has learned the distribution of physically realizable protein folds and can generate novel backbone coordinates that are geometrically complementary to a specified target surface.
The key insight: RFdiffusion does not search existing sequence databases. It generates backbone structures that need not resemble any natural protein, constrained only by what the model has learned about physically realizable folds and by the geometry of the target binding site.
Forward noising. During training, Gaussian noise is progressively added to known protein structures until the coordinates are indistinguishable from random noise.
Learned denoising. The trained model reverses this process step by step, generating valid backbone coordinates from random starting points.
Conditioning. The target structure and hotspot residues constrain generation, producing backbones that engage the specified epitope.
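The forward noising step can be illustrated with a toy sketch. It assumes a standard DDPM-style linear noise schedule applied to bare 3D coordinates; the schedule values, function names, and the four-point trace are illustrative assumptions, not RFdiffusion's actual settings. As we understand the published method, RFdiffusion also perturbs residue orientations, which this sketch leaves out.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances rising linearly (an assumed, generic schedule)."""
    if num_steps < 2:
        raise ValueError("need at least two steps")
    width = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * width for i in range(num_steps)]

def signal_retained(betas):
    """alpha_bar[t]: running product of (1 - beta), the surviving signal variance."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def noise_coordinates(coords, alpha_bar, rng):
    """Draw x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps, eps ~ N(0, 1)."""
    keep = math.sqrt(alpha_bar)
    spread = math.sqrt(1.0 - alpha_bar)
    return [tuple(keep * c + spread * rng.gauss(0.0, 1.0) for c in xyz)
            for xyz in coords]

# Hypothetical 4-point C-alpha trace, spaced 3.8 units apart along x.
backbone = [(3.8 * i, 0.0, 0.0) for i in range(4)]
alpha_bars = signal_retained(linear_beta_schedule(1000))

rng = random.Random(0)
lightly_noised = noise_coordinates(backbone, alpha_bars[0], rng)   # almost unchanged
fully_noised = noise_coordinates(backbone, alpha_bars[-1], rng)    # essentially pure noise
```

After the first step the trace is nearly intact, while at the last step almost none of the original signal remains. Generation runs the other way: a trained network starts from noise like `fully_noised` and removes it one step at a time.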
Structure-guided, not random
Not all epitope residues contribute equally to binding energy. Hotspot residues are identified through computational alanine scanning, evolutionary conservation, and interface energy decomposition.
Conditioning RFdiffusion on these hotspot residues steers generated backbones toward contacts at the most energetically productive positions. Compared with unconstrained generation, this increases the fraction of designs that survive downstream validation.
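As a minimal sketch of how hotspots might be shortlisted from one of these signals, the snippet below ranks residues by alanine-scanning ΔΔG. The function name, the 1.0 kcal/mol cutoff, and the example values are all hypothetical. A real campaign would weigh conservation and energy decomposition as well.

```python
def select_hotspots(ddg_by_residue, min_ddg=1.0, max_hotspots=5):
    """Return residue IDs whose alanine-scan ddG meets the cutoff, strongest first."""
    candidates = [(res, ddg) for res, ddg in ddg_by_residue.items() if ddg >= min_ddg]
    # Sort by descending ddG; break ties by residue ID for a stable order.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [res for res, _ in candidates[:max_hotspots]]

# Hypothetical per-residue ddG values (kcal/mol) for an interface on chain A.
example_ddg = {"A59": 2.4, "A83": 1.7, "A91": 1.1, "A95": 0.3, "A102": -0.1}
hotspots = select_hotspots(example_ddg, min_ddg=1.0, max_hotspots=3)
print(hotspots)  # ['A59', 'A83', 'A91']
```

Capping the list keeps the conditioning focused. Passing every interface residue as a hotspot would blur the distinction the ranking is meant to capture.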
Tuning the design campaign
Hotspot residues. Target surface residues that the generated binder must contact; they define the binding geometry.
Binder length. Number of amino acids in the generated backbone, typically 60-150 residues. Shorter binders are more compact; longer binders offer more interface surface area.
Diffusion steps. Number of denoising steps. More steps can improve backbone quality at the cost of longer runtime. We optimize this per campaign.
Symmetry. For homodimeric or multimeric targets, symmetry operations enforce identical binding at each subunit.
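To make the parameters above concrete, here is a hedged sketch that assembles command-line overrides for a binder run. The option names (`contigmap.contigs`, `ppi.hotspot_res`, `diffuser.T`, `inference.num_designs`) follow the public RFdiffusion README as best we recall and should be checked against the installed version. The helper `build_overrides` and all example values are hypothetical, and symmetry settings are left out.

```python
def build_overrides(target_chain, target_range, binder_len_range,
                    hotspots, num_steps, num_designs):
    """Assemble Hydra-style override strings for a binder design run (assumed names)."""
    start, end = target_range
    shortest, longest = binder_len_range
    if not 0 < shortest <= longest:
        raise ValueError("binder length range must satisfy 0 < min <= max")
    if not hotspots:
        raise ValueError("at least one hotspot residue is required")
    # Target segment, then a chain break ("/0 "), then the binder length range.
    contig = f"{target_chain}{start}-{end}/0 {shortest}-{longest}"
    return [
        f"contigmap.contigs=[{contig}]",
        f"ppi.hotspot_res=[{','.join(hotspots)}]",
        f"diffuser.T={num_steps}",
        f"inference.num_designs={num_designs}",
    ]

# Hypothetical campaign: chain A residues 1-150, a 70-100 residue binder.
overrides = build_overrides("A", (1, 150), (70, 100), ["A59", "A83", "A91"], 50, 10)
for item in overrides:
    print(item)
```

Each string would be passed as one quoted argument to the inference script, alongside the input structure and output prefix.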
What RFdiffusion does not do
Not sequence design. RFdiffusion generates backbone coordinates only. Amino acid sequences must be designed separately using ProteinMPNN or similar inverse folding tools.
Target structure required. Performance depends on the quality of the input target structure. Low-confidence AlphaFold models or poorly resolved regions produce lower-quality outputs.
No affinity prediction. RFdiffusion generates structurally plausible backbones, not binding affinity estimates. Downstream validation with structure prediction and experimental screening is required.
Where RFdiffusion fits
RFdiffusion is one of three generative models in our design pipeline, alongside BindCraft and BoltzGen. Its backbone outputs feed into ProteinMPNN for sequence design, then pass through structural validation with Boltz-2, ESMFold, and ColabFold before any candidate advances to synthesis.
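One plausible way to gate candidates at the validation stage is a consensus filter across predictors, sketched below. The `Prediction` record, the metric names, and both thresholds are hypothetical illustrations, not necessarily how the pipeline described here scores designs.

```python
from dataclasses import dataclass

@dataclass
class Prediction:
    predictor: str         # e.g. "Boltz-2", "ESMFold", "ColabFold"
    plddt: float           # mean confidence on a 0-100 scale
    rmsd_to_design: float  # backbone RMSD to the designed model, in angstroms

def passes_consensus(predictions, min_plddt=80.0, max_rmsd=2.0):
    """True only if every predictor is confident and agrees with the design."""
    if not predictions:
        return False  # no evidence means no advancement
    return all(p.plddt >= min_plddt and p.rmsd_to_design <= max_rmsd
               for p in predictions)

# Hypothetical results for two designed binders.
design_a = [Prediction("Boltz-2", 88.0, 1.1),
            Prediction("ESMFold", 84.5, 1.6),
            Prediction("ColabFold", 90.2, 0.9)]
design_b = [Prediction("Boltz-2", 86.0, 1.3),
            Prediction("ESMFold", 62.0, 4.8),
            Prediction("ColabFold", 85.1, 1.4)]

print(passes_consensus(design_a))  # True
print(passes_consensus(design_b))  # False
```

Requiring agreement from every predictor is deliberately strict: a design that only one model refolds correctly is more likely to reflect that model's bias than a genuinely stable fold.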
Ready to run an RFdiffusion campaign?
Send us your target structure. We will assess the binding site, define hotspots, and propose a design campaign.