ProteinMPNN and the sequence design problem: what it does and why it matters

Generating a protein backbone and designing a protein sequence are two separate computational problems. RFdiffusion solves the first: given a target surface, generate a backbone geometry that could engage it. ProteinMPNN solves the second: given that backbone, which amino acid sequences will actually fold into it?

This distinction matters because a backbone without a sequence is not a protein. Many plausible-looking backbone geometries from diffusion models cannot be realized by any sequence that would actually fold stably. ProteinMPNN’s role is to propose sequences that the model judges compatible with the backbone geometry, and to do so with enough diversity that you get a pool of candidates to screen rather than a single prediction.

The inverse folding problem

Structure prediction (AlphaFold, Boltz-2) takes a sequence and predicts its structure. Inverse folding runs in the opposite direction: it takes a structure and asks which sequences are consistent with that geometry. The problem is underdetermined. For any given backbone, many sequences are at least partially compatible, but only a fraction of them will fold stably into it.

ProteinMPNN addresses this using a message-passing neural network trained on structure-sequence pairs from the PDB. Given backbone coordinates (N, C-alpha, C and O positions, plus a virtual C-beta placed from the backbone atoms), it predicts which amino acids at each position are consistent with the surrounding structural context. Sequences are decoded autoregressively: the model outputs a probability distribution over the 20 amino acids at each position, conditioned on the backbone geometry and on the residues already chosen at other positions.
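
Because a designed backbone carries no side chains, the C-beta position has to be inferred from the backbone atoms. The sketch below shows one way to do that with a fixed linear combination of backbone vectors. The coefficients are the ones commonly seen in public inverse-folding code and should be treated as an assumption to verify against the implementation you run; the residue coordinates are idealized example values, not data from a real structure.

```python
import math

def virtual_cbeta(n, ca, c):
    """Place an idealized C-beta from N, C-alpha and C coordinates (angstroms)."""
    b = [ca[i] - n[i] for i in range(3)]       # N -> CA bond vector
    c_vec = [c[i] - ca[i] for i in range(3)]   # CA -> C bond vector
    a = [                                      # normal to the N-CA-C plane
        b[1] * c_vec[2] - b[2] * c_vec[1],
        b[2] * c_vec[0] - b[0] * c_vec[2],
        b[0] * c_vec[1] - b[1] * c_vec[0],
    ]
    # Assumed coefficients for an ideal tetrahedral C-alpha geometry.
    return tuple(
        -0.58273431 * a[i] + 0.56802827 * b[i] - 0.54067466 * c_vec[i] + ca[i]
        for i in range(3)
    )

# Idealized backbone atoms for a single residue (example values only).
n_atom = (-0.525, 1.363, 0.0)
ca_atom = (0.0, 0.0, 0.0)
c_atom = (1.526, 0.0, 0.0)

cb_atom = virtual_cbeta(n_atom, ca_atom, c_atom)
ca_cb_distance = math.dist(ca_atom, cb_atom)
print(cb_atom, round(ca_cb_distance, 2))
```

The resulting C-alpha to C-beta distance should land close to a typical carbon-carbon single bond length, which is a quick sanity check on the geometry.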

At inference time, sequences are sampled from this distribution. The sampling temperature controls the sharpness of the distribution: low temperature concentrates sampling on the highest-probability residue at each position (more conservative, higher model confidence), while high temperature flattens the distribution and samples more diversely (lower average confidence, higher sequence diversity).
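
The effect of temperature can be illustrated with a temperature-scaled softmax over per-position scores. This is a minimal sketch of the general technique, not ProteinMPNN's own code: the logits are made-up values for a single position, and `softmax_with_temperature` and `sample_residue` are hypothetical helper names.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                          # subtract max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_residue(logits, temperature, rng):
    """Draw one amino acid letter from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(AMINO_ACIDS, weights=probs, k=1)[0]

# Made-up logits for one position: the model mildly prefers Leu, then Ile.
logits = [0.0] * 20
logits[AMINO_ACIDS.index("L")] = 2.0
logits[AMINO_ACIDS.index("I")] = 1.5

p_cold = softmax_with_temperature(logits, 0.1)
p_hot = softmax_with_temperature(logits, 1.0)

rng = random.Random(0)
cold_samples = [sample_residue(logits, 0.1, rng) for _ in range(10)]
hot_samples = [sample_residue(logits, 1.0, rng) for _ in range(10)]
print(cold_samples, hot_samples)
```

At T = 0.1 nearly all of the probability mass sits on the top-ranked residue, while at T = 1.0 the same logits leave substantial mass on the alternatives.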

How ProteinMPNN fits into a de novo binder design pipeline

After RFdiffusion generates backbone structures, ProteinMPNN is run on each backbone to generate candidate sequences. A typical production workflow:

1. RFdiffusion produces 10,000-50,000 backbone structures, each geometrically conditioned on the target hotspot.
2. An initial quality filter removes backbones with low diffusion confidence scores or implausible geometry.
3. ProteinMPNN generates 8-16 sequences per remaining backbone, sampled at multiple temperatures.
4. The sequence pool (tens of thousands of candidates) is passed to structure prediction (Boltz-2, ESMFold, ColabFold) for validation.
5. Candidates are filtered by predicted complex quality (ipTM, PAE, interface pLDDT).
6. The filtered pool (typically 500-2,000 sequences) advances to synthesis.
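
The funnel above is mostly arithmetic, and a small calculation makes the pool sizes concrete. In the sketch below the backbone and sequence counts come from the ranges quoted in the workflow, while both pass rates are hypothetical placeholders chosen only to show how the numbers combine; real pass rates depend on the target and the filters used.

```python
def design_funnel(n_backbones, backbone_pass_rate, seqs_per_backbone, validation_pass_rate):
    """Estimate how many candidates survive each stage of the pipeline."""
    kept_backbones = round(n_backbones * backbone_pass_rate)
    designed_sequences = kept_backbones * seqs_per_backbone
    validated_sequences = round(designed_sequences * validation_pass_rate)
    return {
        "backbones_generated": n_backbones,
        "backbones_kept": kept_backbones,
        "designed_sequences": designed_sequences,
        "validated_sequences": validated_sequences,
    }

# 10,000 backbones and 8 sequences per backbone are from the workflow above;
# the 50% and 2% pass rates are illustrative assumptions.
stages = design_funnel(
    n_backbones=10_000,
    backbone_pass_rate=0.5,
    seqs_per_backbone=8,
    validation_pass_rate=0.02,
)
for stage, count in stages.items():
    print(f"{stage}: {count}")
```

With these placeholder rates the sequence pool sits in the tens of thousands and the validated pool falls inside the 500-2,000 range quoted above.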

ProteinMPNN is fast relative to structure prediction: running it on thousands of backbones takes minutes. The computational bottleneck in the pipeline is the complex structure prediction step, not sequence design.


Fixed-position residues: locking interface contacts

One of the most useful features of ProteinMPNN is the ability to fix specific residue identities and allow the model to design only the non-fixed positions. In binder design, this is used to preserve contacts at predicted hotspot positions.

If the backbone places a residue in direct contact with a hotspot on the target, and structural analysis suggests that a specific amino acid (e.g., an Arg or Tyr forming a hydrogen bond network with the target interface) should be conserved, that position can be fixed. ProteinMPNN will design the rest of the sequence freely while respecting the constraint.

This is particularly useful when:

A pilot screen has identified a weak binder and you want to preserve the key contact while redesigning the scaffold
Structural modeling suggests a specific interaction motif (aromatic stacking, charged-charged, etc.) that should be maintained
You are doing affinity maturation of a confirmed hit and want to hold the known binding contacts while diversifying the surrounding scaffold
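
The idea of holding some positions while redesigning the rest can be sketched as masked sampling. This is a simplified illustration under stated assumptions: the reference sequence, the fixed positions, and the uniform per-position probabilities are all invented, and `sample_with_fixed_positions` is a hypothetical helper. In the real model the fixed residues also inform the predictions at the designed positions, which this sketch does not capture.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def sample_with_fixed_positions(reference_seq, fixed_positions, position_probs, rng):
    """Keep reference residues at fixed (1-indexed) positions; sample the others."""
    designed = []
    for index, reference_aa in enumerate(reference_seq, start=1):
        if index in fixed_positions:
            designed.append(reference_aa)  # locked interface contact
        else:
            weights = position_probs[index - 1]
            designed.append(rng.choices(AMINO_ACIDS, weights=weights, k=1)[0])
    return "".join(designed)

# Invented example: a 12-residue segment with an Arg at 4 and a Tyr at 9 held fixed.
reference = "MKTRLEELYKKA"
fixed = {4, 9}
# Stand-in for model output: a uniform distribution at every position.
uniform = [[1.0] * len(AMINO_ACIDS) for _ in reference]

rng = random.Random(42)
designs = [sample_with_fixed_positions(reference, fixed, uniform, rng) for _ in range(5)]
for design in designs:
    print(design)
```

Every sampled sequence keeps R at position 4 and Y at position 9, while the remaining positions vary from sample to sample.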

Temperature sampling: managing the diversity-stability trade-off

The sampling temperature parameter is the main lever for controlling sequence diversity in ProteinMPNN output.

At low temperature (T = 0.1-0.2): sequences are close to the model’s maximum-likelihood prediction. High average sequence confidence, high predicted stability, low diversity. Suitable when you want a small number of high-confidence predictions per backbone.

At standard temperature (T = 0.3-0.5): the working default for most production runs (a pipeline convention, not necessarily the value the ProteinMPNN script ships with). Reasonable diversity with acceptable predicted quality. Generates sequences that are distinct from each other without straying far from what the backbone supports.

At high temperature (T = 0.7-1.0): high sequence diversity across the output pool. More exploration of sequence space, but a larger fraction of sequences will fail the downstream structural validation step. Useful when you want to characterize the sequence landscape around a scaffold topology or when you suspect the low-temperature output is too conservative.

In practice, running ProteinMPNN at two temperatures (e.g., T = 0.2 and T = 0.5) per backbone and pooling the outputs provides both confident predictions and a diversity buffer without substantially increasing compute time.
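
Pooling outputs from two temperatures usually means merging the runs and dropping duplicates, since the low-temperature run tends to repeat itself. A minimal sketch, assuming each run is available as a plain list of sequence strings; the sequences shown are invented and `pool_by_temperature` is a hypothetical helper.

```python
def pool_by_temperature(runs):
    """Merge sequences from several temperature runs, removing duplicates.

    runs maps a sampling temperature to the list of sequences it produced.
    Each unique sequence is tagged with the lowest temperature that produced it.
    """
    pooled = {}
    for temperature in sorted(runs):
        for sequence in runs[temperature]:
            pooled.setdefault(sequence, temperature)
    return pooled

# Invented outputs for one backbone at T = 0.2 and T = 0.5.
runs = {
    0.5: ["MKELLKRA", "MRELIEKA", "SKDLLRRA"],
    0.2: ["MKELLKRA", "MKELLKRA", "MKELLERA"],
}

pool = pool_by_temperature(runs)
for sequence, temperature in pool.items():
    print(sequence, temperature)
```

Tagging each sequence with its source temperature keeps the option of comparing downstream pass rates between the conservative and the exploratory samples.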

What ProteinMPNN does not do

ProteinMPNN does not predict binding affinity. It scores sequence-backbone compatibility (how well a sequence fits the provided structure according to the model), not whether the resulting protein will bind the intended target. A favorable ProteinMPNN score suggests the sequence is a plausible fit for the backbone; it says nothing about the quality of the binding interface.
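
It helps to be concrete about what such a score measures. A common convention for inverse-folding models, assumed here, is to report the mean negative log-probability the model assigns to the chosen residues, so lower is better. The probabilities below are invented for illustration, and nothing in the calculation refers to the target protein at all.

```python
import math

def mean_negative_log_prob(chosen_probs):
    """Average negative log-probability of the chosen residue at each position."""
    if not chosen_probs:
        raise ValueError("need at least one position")
    return -sum(math.log(p) for p in chosen_probs) / len(chosen_probs)

# Invented per-position probabilities of the residue that was actually chosen.
confident_design = [0.90, 0.80, 0.95, 0.85]
uncertain_design = [0.30, 0.20, 0.40, 0.25]

confident_score = mean_negative_log_prob(confident_design)
uncertain_score = mean_negative_log_prob(uncertain_design)
print(round(confident_score, 3), round(uncertain_score, 3))
```

The score only summarizes how well the sequence matches the backbone under the model, which is why a separate complex prediction step is needed to assess the interface.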

This is why structural validation against the target (Boltz-2 / ESMFold / ColabFold complex prediction) is a required next step, not an optional one. Sequences that ProteinMPNN scores favorably for backbone compatibility may still have weak, misaligned, or absent binding interfaces when the full complex is predicted.
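
The validation filter can be sketched as a threshold check on the complex-prediction metrics named earlier (ipTM, PAE, interface pLDDT). The cutoffs and the candidate records below are hypothetical; suitable thresholds depend on the predictor and the target, and `ComplexPrediction` is an illustrative container rather than any tool's output format.

```python
from dataclasses import dataclass

@dataclass
class ComplexPrediction:
    name: str
    iptm: float             # interface predicted TM-score, higher is better
    interface_pae: float    # predicted aligned error at the interface, lower is better
    interface_plddt: float  # mean pLDDT of interface residues, higher is better

def passes_filters(pred, min_iptm=0.7, max_pae=10.0, min_plddt=80.0):
    """Apply all three cutoffs; every default value here is an assumption."""
    return (
        pred.iptm >= min_iptm
        and pred.interface_pae <= max_pae
        and pred.interface_plddt >= min_plddt
    )

# Invented candidates for illustration.
candidates = [
    ComplexPrediction("design_001", iptm=0.82, interface_pae=6.5, interface_plddt=88.0),
    ComplexPrediction("design_002", iptm=0.45, interface_pae=18.0, interface_plddt=62.0),
    ComplexPrediction("design_003", iptm=0.74, interface_pae=12.3, interface_plddt=85.0),
]

survivors = [c.name for c in candidates if passes_filters(c)]
print(survivors)
```

Requiring all three metrics to pass is a conservative choice: `design_003` has a good ipTM but is dropped because its interface PAE exceeds the cutoff.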

The pipeline (RFdiffusion, ProteinMPNN, complex prediction, experimental screen) works because each step filters on a different property. No single computational step is sufficient. The experimental display screen is the only step that measures actual binding.
