Ranomics
Sequence design

ProteinMPNN protein sequence design

Inverse folding for amino acid sequence design across computationally generated protein backbones

Sequences per backbone: 8-16
Temperature sampling range: 0.1-0.5
Interface constraints: fixed positions
Filtering to validation: ranked by score
The inverse folding problem

ProteinMPNN designs sequences for generated backbones

Backbone generation and sequence design are separate problems in computational protein engineering. RFdiffusion and Boltzgen produce backbone coordinates — the three-dimensional shape of the protein — but do not assign amino acid identities. ProteinMPNN solves the complementary problem: given a backbone, what amino acid sequences will fold into that structure?

ProteinMPNN uses a message-passing neural network architecture that considers the local geometric environment of each residue position. It learns which amino acid identities are compatible with the backbone geometry at each position, producing sequences that are predicted to fold stably into the target structure.
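
To make "local geometric environment" concrete, the sketch below builds the kind of nearest-neighbor graph that a message-passing network operates on. It is an illustration only, not ProteinMPNN's code: the function name `knn_graph`, the toy coordinates, and the choice of `k=2` are assumptions made for this example (the published model, as we understand it, uses a much larger neighborhood and richer backbone features than C-alpha distances alone).

```python
import math


def knn_graph(ca_coords, k):
    """For each residue, return the indices of its k nearest other residues.

    ca_coords: list of (x, y, z) C-alpha coordinates, one per residue.
    Distances are plain Euclidean distances between C-alpha atoms.
    """
    neighbors = []
    for i, ci in enumerate(ca_coords):
        # Distance from residue i to every other residue j
        dists = [(math.dist(ci, cj), j) for j, cj in enumerate(ca_coords) if j != i]
        dists.sort()
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


# Toy backbone (hypothetical): five residues on a straight line, 3.8 angstroms apart
toy_ca = [(3.8 * i, 0.0, 0.0) for i in range(5)]
graph = knn_graph(toy_ca, k=2)

for residue, nbrs in enumerate(graph):
    print(f"residue {residue} exchanges messages with {nbrs}")
```

Each residue only "sees" the residues in its neighbor list, which is why the amino acid chosen at a position depends on nearby geometry rather than on the whole chain at once.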

How Ranomics uses it

Multiple sequences per backbone

For each backbone generated by RFdiffusion or Boltzgen, we generate 8-16 sequences using temperature sampling. This produces sequence diversity across a single backbone topology. BindCraft is the exception: it handles its own sequence design internally as part of its iterative co-optimization loop, so its backbones do not go through this step.

Fixed-position constraints are applied at predicted hotspot contact residues. These positions are locked to preserve the binding geometry at the interface, while allowing ProteinMPNN to optimize the remaining positions for stability and foldability.
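
The effect of fixed-position constraints can be sketched as follows. This is a simplified illustration, not ProteinMPNN itself: the real model decodes residues autoregressively, so each position's distribution depends on residues already chosen, whereas this sketch samples each free position independently from a given distribution. The function `sample_sequences`, the uniform toy probabilities, and the fixed residues are all hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sample_sequences(position_probs, fixed, n_sequences, seed=0):
    """Sample sequences while keeping constrained positions locked.

    position_probs: one list of 20 weights per position (order = AMINO_ACIDS).
    fixed: dict mapping 0-based position -> residue that must be preserved.
    """
    rng = random.Random(seed)  # seeded so the example is reproducible
    sequences = []
    for _ in range(n_sequences):
        residues = []
        for pos, probs in enumerate(position_probs):
            if pos in fixed:
                # Locked hotspot contact: never resampled
                residues.append(fixed[pos])
            else:
                residues.append(rng.choices(AMINO_ACIDS, weights=probs, k=1)[0])
        sequences.append("".join(residues))
    return sequences


# Toy input (hypothetical): 6 positions with uniform preferences,
# positions 1 and 4 locked as interface hotspots
uniform = [1.0] * len(AMINO_ACIDS)
designs = sample_sequences([uniform] * 6, fixed={1: "W", 4: "Y"}, n_sequences=8)

for seq in designs:
    print(seq)
```

Every sampled sequence carries the locked residues at positions 1 and 4, while the other positions vary from design to design.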

Temperature sampling

Controlling the diversity-stability tradeoff

Low T (0.1)

Conservative sequences. Higher predicted stability. Less sequence diversity. Best when the backbone geometry is well-suited to the target.

Medium T (0.2-0.3)

Balanced sampling. Our default for most campaigns. Good diversity with reasonable predicted foldability.

High T (0.4-0.5)

Diverse sequences. More exploration. Higher risk of non-folding designs, but occasionally finds unexpected solutions.
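
The tradeoff above follows from how temperature reshapes a probability distribution before sampling. The sketch below shows the standard temperature-scaled softmax on a toy set of per-residue logits. The logit values and the function name are assumptions chosen for illustration, not outputs of ProteinMPNN.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Toy logits (hypothetical) for four candidate residues at one position
toy_logits = {"A": 2.0, "L": 1.5, "G": 0.5, "K": 0.0}

for t in (0.1, 0.3, 0.5):
    probs = softmax_with_temperature(list(toy_logits.values()), t)
    summary = ", ".join(f"{aa}={p:.3f}" for aa, p in zip(toy_logits, probs))
    print(f"T={t}: {summary}")
```

At low temperature nearly all probability mass sits on the top-ranked residue, so repeated sampling returns similar sequences. Raising the temperature flattens the distribution and lets lower-ranked residues through, which is the source of both the added diversity and the added folding risk.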

Scoring & filtering

Stability and foldability metrics

Each ProteinMPNN output receives a score reflecting how compatible the model judges the sequence to be with the target backbone. We rank all sequences per backbone and advance the top-ranked candidates to structural validation.

Additional filters include predicted solubility, aggregation propensity, and absence of known problematic sequence motifs (e.g., free cysteines, N-glycosylation sites in non-glycosylated contexts).
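
A minimal sketch of motif filtering followed by ranking is shown below. It is not our production filter set. The helper names, the candidate records, and the scores are hypothetical, and two assumptions are built in: an odd cysteine count is used as a crude proxy for a free cysteine (an even count does not guarantee that all cysteines are paired), and lower scores are treated as better, which matches the negative log-likelihood convention of the public ProteinMPNN implementation as we understand it. The `lower_is_better` flag flips that if a pipeline uses the opposite convention.

```python
import re

# N-glycosylation sequon: Asn, any residue except Pro, then Ser or Thr
NGLYC_SEQUON = re.compile(r"N[^P][ST]")


def has_nglyc_sequon(sequence):
    return NGLYC_SEQUON.search(sequence) is not None


def has_unpaired_cysteine(sequence):
    # Heuristic: an odd number of cysteines leaves at least one unpaired
    return sequence.count("C") % 2 == 1


def rank_and_filter(candidates, top_n, lower_is_better=True):
    """Drop candidates with flagged motifs, then return the top_n by score."""
    passing = [
        c for c in candidates
        if not has_nglyc_sequon(c["sequence"])
        and not has_unpaired_cysteine(c["sequence"])
    ]
    passing.sort(key=lambda c: c["score"], reverse=not lower_is_better)
    return passing[:top_n]


# Toy candidates (hypothetical sequences and scores)
candidates = [
    {"name": "d1", "sequence": "MKTAYIAKQR", "score": 0.92},
    {"name": "d2", "sequence": "MKNGSAYIAK", "score": 0.85},  # contains N-G-S
    {"name": "d3", "sequence": "MKCAYIAKQR", "score": 0.80},  # single cysteine
    {"name": "d4", "sequence": "MKTAYLAKQE", "score": 0.88},
]

selected = rank_and_filter(candidates, top_n=2)
print([c["name"] for c in selected])
```

Filtering before ranking keeps a well-scored sequence with a liability motif from taking a validation slot. Solubility and aggregation predictors would slot in as additional predicates in the same list comprehension.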

Pipeline position

Between backbone generation and structural validation

Input: Backbone coordinates from RFdiffusion or Boltzgen

ProteinMPNN: Sequence design + scoring

Output: Scored sequences for validation

See how ProteinMPNN fits into the full pipeline

ProteinMPNN is one step in our integrated design-to-screening workflow.