ProteinMPNN inverse folding sequence design
ProteinMPNN solves the inverse-folding problem: given a backbone, what amino acid sequence will fold into that structure? It is a message-passing neural network that has become the standard pairing for RFdiffusion, BoltzGen, and other de novo backbone generators in modern protein design pipelines.
Run sequence design on a generated backbone. Temperature sampling, fixed-position interface constraints, and stability scoring are all exposed as standard arguments.
Based on Dauparas et al., Science 2022. Used inside every Ranomics wet-lab binder campaign that passes through RFdiffusion or BoltzGen.
From backbone coordinates to scored sequences
Input backbone
Provide backbone coordinates from RFdiffusion, BoltzGen, or any de novo generator. Optionally pin hotspot contact residues as fixed positions.
MPNN inference
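The message-passing network encodes each residue position in the context of its local backbone geometry, then decodes amino acid identities one position at a time, conditioning each choice on the identities already assigned to neighboring residues.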
Sampled sequences
Generate 8-16 sequences per backbone at temperatures 0.1-0.5. Each sequence receives a model-confidence score reflecting predicted foldability.
Filter and rank
Apply solubility, aggregation, and motif filters; rank by MPNN score; advance the highest-confidence designs to AlphaFold2 structural validation.
How we configure ProteinMPNN in production runs
Five settings define a Ranomics ProteinMPNN job. These are the same defaults we use in wet-lab campaigns and that the tools-hub UI exposes for self-serve users.
Sequences per backbone
For each backbone we generate 8-16 candidate sequences. This produces sequence diversity across a single topology, so the downstream AF2 validator can pick the best-folding variant rather than betting on a single sample.
Temperature sampling
Low T (0.1) yields conservative, high-confidence sequences; high T (0.4-0.5) explores more diversity at the cost of foldability. Our default of 0.2-0.3 is the empirical sweet spot for binder design.
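To make the temperature trade-off concrete, here is a minimal sketch of temperature-scaled sampling at a single residue position. It is an illustration, not the ProteinMPNN implementation: the function names and the toy logits are hypothetical, and the real model produces its logits autoregressively from the backbone graph.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 canonical residues, one-letter codes

def temperature_softmax(logits, temperature):
    """Turn per-residue logits into probabilities at a sampling temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_residue(logits, temperature, rng):
    """Draw one amino acid from the temperature-scaled distribution."""
    probs = temperature_softmax(logits, temperature)
    return rng.choices(AMINO_ACIDS, weights=probs, k=1)[0]

# Toy logits (hypothetical): the model mildly prefers aspartate ("D").
toy_logits = [0.0] * 20
toy_logits[AMINO_ACIDS.index("D")] = 2.0

p_low = temperature_softmax(toy_logits, 0.1)   # sharply peaked on "D"
p_high = temperature_softmax(toy_logits, 0.5)  # flatter, more diverse
```

Dividing the logits by a small temperature sharpens the distribution toward the top-scoring residue, which is why low T gives conservative sequences and higher T gives more varied ones.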
Interface position constraints
Hotspot contact residues identified during backbone generation are locked. ProteinMPNN optimizes everything else while preserving the binding geometry at the designed interface.
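As a rough sketch of how locked positions can be represented, the snippet below builds a per-backbone record of fixed residues and the matching designable mask. The function names and the record layout are assumptions for illustration; the layout is loosely modeled on the JSONL fixed-position files used by the open-source ProteinMPNN scripts, and the exact schema should be checked against the version you run.

```python
import json

def fixed_positions_record(backbone_name, hotspots_by_chain):
    """Serialize fixed positions as one JSON line.

    hotspots_by_chain maps a chain ID to 1-based residue indices to lock.
    The {name: {chain: [indices]}} layout is an assumed schema.
    """
    record = {
        backbone_name: {
            chain: sorted(set(positions))
            for chain, positions in hotspots_by_chain.items()
        }
    }
    return json.dumps(record)

def designable_mask(chain_length, fixed_positions):
    """Return True for positions the designer may change, False for locked ones."""
    fixed = set(fixed_positions)
    return [index not in fixed for index in range(1, chain_length + 1)]

# Hypothetical example: lock residues 5 and 12 on chain A of backbone "bb1".
line = fixed_positions_record("bb1", {"A": [12, 5, 12]})
mask = designable_mask(5, [2, 4])
```

Positions marked False keep their input identity, so the binding geometry at those residues is untouched while every True position is open to redesign.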
Stability and foldability ranking
Each sequence receives a score reflecting the model confidence that it will fold into the target backbone. We rank within-backbone and advance only the top fraction to AlphaFold2 validation.
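The within-backbone ranking can be sketched as follows. This assumes each design is a dictionary with hypothetical `backbone` and `score` keys, and that a lower score means higher confidence (the public ProteinMPNN code reports an average negative log-probability, so lower is better there). The 25% cutoff is a placeholder, not our production value.

```python
import math
from collections import defaultdict

def top_fraction_per_backbone(designs, fraction=0.25):
    """Keep the best-scoring fraction of sequences for each backbone.

    Assumes lower "score" is better. At least one sequence is kept
    per backbone so that no topology is dropped entirely.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    by_backbone = defaultdict(list)
    for design in designs:
        by_backbone[design["backbone"]].append(design)

    kept = []
    for group in by_backbone.values():
        ranked = sorted(group, key=lambda d: d["score"])
        n_keep = max(1, math.ceil(len(ranked) * fraction))
        kept.extend(ranked[:n_keep])
    return kept

# Hypothetical pool: four sequences on one backbone, one on another.
pool = [
    {"backbone": "bb1", "seq": "s1", "score": 1.10},
    {"backbone": "bb1", "seq": "s2", "score": 0.85},
    {"backbone": "bb1", "seq": "s3", "score": 0.95},
    {"backbone": "bb1", "seq": "s4", "score": 1.30},
    {"backbone": "bb2", "seq": "s5", "score": 1.50},
]
advanced = top_fraction_per_backbone(pool, fraction=0.25)
```

Ranking inside each backbone rather than across the whole pool keeps topologically distinct designs in play even when one backbone scores systematically better than another.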
Solubility and motif checks
Predicted solubility, aggregation propensity, free cysteines, and N-glycosylation sites are screened before sequences leave the design pool. Problematic motifs are filtered out, not just flagged.
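Two of these checks are simple enough to sketch from sequence alone. The example below looks for the canonical N-glycosylation sequon (N-X-S/T, where X is any residue except proline) and counts cysteines. It is a simplified illustration with hypothetical function names: whether a cysteine is actually free depends on structure, and solubility and aggregation prediction need dedicated tools that are not shown here.

```python
import re

# Lookahead so overlapping sequons (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")

def n_glyc_sites(sequence):
    """Return 1-based positions of the Asn in each N-X-S/T sequon (X != Pro)."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]

def motif_report(sequence):
    """Summarize the sequence-level liabilities covered by this sketch."""
    sequence = sequence.upper()
    return {
        "n_glyc_sites": n_glyc_sites(sequence),
        "cysteines": sequence.count("C"),
    }

def passes_motif_filter(sequence, max_cysteines=0):
    """Reject sequences with any sequon or more cysteines than allowed.

    max_cysteines=0 is a deliberately strict placeholder threshold.
    """
    report = motif_report(sequence)
    return not report["n_glyc_sites"] and report["cysteines"] <= max_cysteines

report = motif_report("MNGSANPTC")  # "NGS" is a sequon; "NPT" is excluded by Pro
```

Returning a hard pass/fail mirrors the filtering behavior described above: a sequence that trips a motif check is removed from the pool rather than carried forward with a warning.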
Where ProteinMPNN sits in a de novo design run
Between backbone gen and AF2
Backbone generators produce coordinates without amino acid identities. ProteinMPNN assigns the sequence; AlphaFold2 then refolds each sequence end-to-end and we keep only the ones whose predicted structure matches the input backbone. Three discrete stages, three different models.
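The final keep-or-discard step can be sketched as a self-consistency filter. The snippet assumes the backbone RMSD between the AlphaFold2 prediction and the input backbone has already been computed, along with a mean pLDDT; the field names and both thresholds are illustrative placeholders rather than our production cutoffs.

```python
def passes_self_consistency(design, max_rmsd=2.0, min_plddt=80.0):
    """True if the predicted structure matches the input backbone closely enough.

    design["ca_rmsd"]: RMSD in angstroms between prediction and input backbone.
    design["plddt"]: mean AlphaFold2 confidence on a 0-100 scale.
    Both thresholds are assumptions chosen for illustration.
    """
    return design["ca_rmsd"] <= max_rmsd and design["plddt"] >= min_plddt

def keep_matching(designs, **thresholds):
    """Filter a list of validated designs down to the self-consistent ones."""
    return [d for d in designs if passes_self_consistency(d, **thresholds)]

# Hypothetical validation results for three designed sequences.
validated = [
    {"seq": "s1", "ca_rmsd": 0.9, "plddt": 91.0},
    {"seq": "s2", "ca_rmsd": 3.4, "plddt": 88.0},  # refolds to a different shape
    {"seq": "s3", "ca_rmsd": 1.2, "plddt": 62.0},  # low-confidence prediction
]
survivors = keep_matching(validated)
```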
Pairs with RFdiffusion and BoltzGen
Both RFdiffusion and BoltzGen output sequence-agnostic backbones, and ProteinMPNN is the standard inverse-folder for both. The RFdiffusion paper and the broader open-source binder design community treat MPNN as the default sequence design step.
Skip for BindCraft
BindCraft handles its own sequence design inside an iterative co-optimization loop against AlphaFold2. Running ProteinMPNN on BindCraft output is redundant and can hurt scores. Use BindCraft’s native designs directly, validate, and move on.
The default sequence design step for de novo binders
If you are running RFdiffusion or BoltzGen, you need ProteinMPNN. Backbone generators do not assign amino acid identities, and naive sequence design from a Rosetta-style energy function recovers far fewer foldable, expressible designs than a learned inverse folder.
ProteinMPNN replaces hand-tuned design with a single inference pass that produces ranked, scored, ready-to-validate sequences. Dauparas et al. (Science, 2022) report 52.4% sequence recovery on native backbones, compared with 32.9% for Rosetta, and the model has now been validated experimentally in dozens of published binder design campaigns.
Just generated backbones with RFdiffusion and need sequences before validation
Running BoltzGen and want to redesign sequences with explicit interface constraints
Optimizing a known binder backbone with a different target or stability profile
Designing soluble variants of a structural scaffold while preserving the fold
Comparing inverse-folding outputs against a physics-based baseline on the same backbone
From designed sequences to validated binders
ProteinMPNN gives you sequences. Wet-lab validation tells you which ones actually bind. Two entry points depending on scope.
Validate your MPNN-designed sequences
The Binder Pilot is a short, fixed-scope campaign with one round of design, a smaller pool, ranked hits, and a technical report. Scoped for academic labs, seed biotech, industrial SMBs, and student research groups who already have a target and want validated binders without committing to a full multi-round program.
See the Binder Pilot →
Flagship program
Multi-algorithm de novo campaign
The AI Binder Sprint runs RFdiffusion, BindCraft, and BoltzGen in parallel over 6-8 weeks with milestone check-ins and a 100% binder guarantee. ProteinMPNN handles sequence design for the RFdiffusion and BoltzGen branches inside this pipeline.
See the AI Binder Sprint →
Run ProteinMPNN on your backbone
Upload a backbone, set the temperature, pin your hotspot positions. Eight to sixteen scored sequences returned, ready for AlphaFold2 validation or wet-lab handoff.