If your team has an AlphaFold model of a target protein and no structural biologist on staff, you are in the same position as most seed-stage biotech companies that want to run a de novo binder design campaign. AlphaFold and AlphaFold 3 have made structural models easy to get. The harder question is: can you use this model to actually design binders, and if so, where do you target them? This post walks through the decisions in plain language, without assuming a structural biology background.
Step 1: Decide Whether Your Model Is Usable
Not every AlphaFold model is a valid starting point for binder design. The key number to look at is pLDDT — AlphaFold’s per-residue confidence score, from 0 to 100. Roughly:
- pLDDT above 90: high confidence. Treat this region as you would a crystal structure.
- pLDDT 70-90: reasonable confidence. Usable, but check the backbone against any available experimental data.
- pLDDT 50-70: low confidence. Usable only for rough context; do not design against this region.
- pLDDT below 50: effectively no structural information. This region is almost certainly disordered or poorly predicted.
For binder design, what matters is the local pLDDT around the region you want to target, not the average over the whole protein. A target with an overall pLDDT of 85 but a pLDDT of 55 in the exact surface patch you want to bind is not a usable starting point for that patch.
How to check this without a structural biologist: open the PDB file in a viewer that color-codes by pLDDT (ChimeraX does this automatically; PyMOL can be scripted; Mol* in a web browser is the fastest). Look at the regions near the surface you care about. If they are red or orange (low pLDDT), pick a different region or invest in experimental structure determination first.
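If you would rather check this numerically than by eye, AlphaFold-format PDB files store the per-residue pLDDT in the B-factor column, so a few lines of plain Python are enough. This is a minimal sketch; the file name and residue numbers are placeholders for your own target.

```python
"""Check local pLDDT around a candidate patch.

AlphaFold-format PDB files store per-residue pLDDT in the
B-factor column, so no extra tooling is needed. The file name
and residue numbers below are placeholders for your own target.
"""

def plddt_by_residue(pdb_path, chain="A"):
    """Map residue number -> pLDDT, read from CA-atom B-factors."""
    scores = {}
    with open(pdb_path) as fh:
        for line in fh:
            if (line.startswith("ATOM")
                    and line[12:16].strip() == "CA"
                    and line[21] == chain):
                resnum = int(line[22:26])
                scores[resnum] = float(line[60:66])
    return scores

def patch_plddt(pdb_path, patch_residues, chain="A"):
    """Mean pLDDT over the residues you intend to target."""
    scores = plddt_by_residue(pdb_path, chain)
    vals = [scores[r] for r in patch_residues if r in scores]
    return sum(vals) / len(vals)

# Example (hypothetical file and residues): flag a patch as
# designable only if its local confidence is high.
# mean = patch_plddt("target_model.pdb", [55, 58, 59, 62, 83])
# usable = mean >= 70
```

Note this only works on AlphaFold output; in an experimental PDB file the same column holds B-factors, where the interpretation is inverted (low is rigid).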
A second check: AlphaFold’s PAE (predicted aligned error) plot, produced by both AlphaFold 2 and AlphaFold 3. The PAE plot shows how confident the model is about the relative positions of different parts of the protein. Large off-diagonal values mean the model has high internal uncertainty about how two domains sit relative to each other. For binder design, you want low PAE in the region you are targeting.
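The same check can be done numerically from the PAE JSON file. The sketch below assumes the AlphaFold DB layout, where the file holds a list with one record containing an N x N `predicted_aligned_error` matrix in Ångstroms; other pipelines may write a different schema, and the 5 Å threshold is a judgment call, not a standard.

```python
"""Quick numeric check of PAE inside a target patch.

Assumes the AlphaFold DB JSON layout (a list with one record
holding an N x N `predicted_aligned_error` matrix in Ångstroms);
adjust the key if your pipeline writes a different schema.
"""
import json

def load_pae(path):
    with open(path) as fh:
        data = json.load(fh)
    record = data[0] if isinstance(data, list) else data
    return record["predicted_aligned_error"]

def patch_pae(pae, patch_idx):
    """Mean pairwise PAE (Å) between residues in the patch
    (0-based indices). Low values mean the model is internally
    consistent about that region."""
    vals = [pae[i][j] for i in patch_idx for j in patch_idx if i != j]
    return sum(vals) / len(vals)

# Hypothetical usage:
# pae = load_pae("target_pae.json")
# ok = patch_pae(pae, [54, 57, 58, 61, 82]) < 5.0
```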
Step 2: Choose Where on the Protein to Target
This is the single most consequential decision in the entire campaign. It is more consequential than which generative model you use, how many designs you generate, or which display platform you screen on.
A good binder target surface has five properties, drawn from the protein-protein interaction literature:
- Hydrophobic enough to drive desolvation. Binder-target interfaces form because hydrophobic surfaces prefer each other to water. Purely polar or charged patches are hard to bind de novo.
- Structurally ordered. Beta-strands and alpha-helices are rigid, preorganized, and predictable. Loop-dominated surfaces move too much.
- Rigid. Low B-factor in a crystal structure, high pLDDT in a model. Flexibility costs entropy at binding and is the single most common reason computational designs fail experimentally.
- Accessible. The designed binder has to physically reach the surface. Deep pockets, grooves, and buried sites are harder than flat, exposed surfaces.
- Populated with hot-spot residues. Trp, Tyr, Arg, and Phe disproportionately contribute to binding free energy at protein-protein interfaces. Patches that contain them are easier to engage.
Assessing these by eye is what a structural biologist would normally do with PyMOL. Epitope Scout automates exactly this analysis — upload a PDB file, select a chain, and get a ranked list of candidate epitope patches scored on these five criteria. It is free, and the output is a CSV of residue selections that can be fed directly into RFdiffusion as a hotspot specification.
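To make two of the five criteria concrete, here is a toy score for a patch based on mean Kyte-Doolittle hydropathy and hot-spot residue content. This is an illustration of the idea only, not the scoring Epitope Scout actually uses; rigidity, order, and accessibility require structural information and are omitted.

```python
"""Toy scoring of a surface patch on two of the five criteria:
hydrophobicity (Kyte-Doolittle hydropathy) and hot-spot residue
content. Illustrative only; not Epitope Scout's actual scoring.
"""

# Standard Kyte-Doolittle hydropathy values (positive = hydrophobic).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
      "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
      "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
      "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2}

# Residues over-represented at protein-protein interaction hot spots.
HOTSPOT = {"W", "Y", "R", "F"}

def score_patch(residues):
    """residues: one-letter codes of the patch, e.g. ['Y', 'L', 'R'].
    Returns (mean hydropathy, fraction of hot-spot residues)."""
    hydropathy = sum(KD[r] for r in residues) / len(residues)
    hotspot_frac = sum(r in HOTSPOT for r in residues) / len(residues)
    return hydropathy, hotspot_frac

# A Tyr/Trp/Leu-rich patch scores better on both axes than a
# charged patch:
# score_patch(["Y", "W", "L", "V"]) vs score_patch(["D", "E", "K", "K"])
```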
If there are existing antibodies or natural binding partners against your target with solved co-crystal structures, those contact surfaces are also valid targeting options: they are known to be bindable by construction. SAbDab and the RCSB PDB are the databases to check.
Step 3: Pick a Design Algorithm
You have three mature open-source options for de novo binder design:
- RFdiffusion. Generates protein backbones conditioned on target hotspots. You provide the target structure and the hotspot residues; it generates backbones predicted to make contacts at those positions. Needs ProteinMPNN for sequence design after backbone generation.
- BindCraft. Jointly optimizes binder structure and sequence against an AlphaFold 2 confidence objective. Produces binders that are simultaneously optimized for fold and binding.
- Boltzgen. Boltzmann-weighted conformational sampling, useful when target flexibility matters or when you want design-time structure ensembles.
For a first campaign, RFdiffusion + ProteinMPNN is the best-characterized path. BindCraft tends to give higher filtered hit rates when the target is rigid and well-structured. Boltzgen is specialized. If budget is constrained, pick one algorithm and run it well, rather than running all three poorly.
Step 4: Filter Before You Build
Generative models produce backbones that look plausible in silico. They do not all fold correctly in reality. Before committing any candidate to gene synthesis, run a self-consistency check:
- Take the designed sequence.
- Fold it with an independent structure predictor — ESMFold, ColabFold, or Boltz-2.
- Align the predicted structure back to the intended backbone.
- If the RMSD is above roughly 2 Å, the sequence does not encode the structure you designed. Discard it.
This filter typically removes 50-80% of raw RFdiffusion output and is the single most important computational quality gate. Campaigns that skip it synthesize candidates that were never going to fold and waste weeks of experimental time.
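The RMSD comparison at the heart of this filter is a rigid-body superposition of CA atoms. A minimal Kabsch implementation in numpy is sketched below; the `(N, 3)` coordinate arrays are assumed to be matched CA positions you have already extracted from the designed and re-predicted PDB files.

```python
"""Self-consistency RMSD between a designed backbone and the
re-predicted structure, on CA atoms. Minimal Kabsch superposition
with numpy; P and Q are (N, 3) arrays of matched CA coordinates
extracted from the two PDB files (extraction not shown here).
"""
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD of P onto Q after optimal rigid-body superposition."""
    # Center both coordinate sets.
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)
    # Optimal rotation via SVD of the covariance matrix (Kabsch).
    U, _, Vt = np.linalg.svd(P.T @ Q)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    P_rot = P @ R.T
    return float(np.sqrt(((P_rot - Q) ** 2).sum() / len(P)))

# The ~2 Å gate described above (designed_ca / predicted_ca are
# hypothetical arrays you would build from the two structures):
# keep = kabsch_rmsd(designed_ca, predicted_ca) <= 2.0
```

In practice most teams use an existing implementation (e.g. in Biopython or a superposition tool) rather than writing their own, but the operation is exactly this.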
Other filters worth applying before synthesis:
- Expression prediction. Very hydrophobic, very charged, or very long candidates often fail to express.
- Solubility prediction. SolubleMPNN or simple net-charge / hydrophobic-patch heuristics.
- Target binding prediction. AlphaFold 2 co-folding of designed binder + target with pLDDT on the interface residues.
A 1,500-design RFdiffusion run typically filters down to 200-500 candidates worth synthesizing.
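The simplest of the pre-synthesis filters, the net-charge and hydrophobic-patch heuristics, can be run at the sequence level in a few lines. This is a rough sketch in the spirit of those heuristics, not a substitute for SolubleMPNN or a real expression predictor; the thresholds are illustrative defaults, not validated cutoffs.

```python
"""Cheap sequence-level pre-synthesis flags: net charge and the
longest run of strongly hydrophobic residues. Rough heuristics
only; thresholds are illustrative, not validated cutoffs.
"""

HYDROPHOBIC = set("AILMFVWY")

def net_charge(seq):
    """Approximate net charge at neutral pH (ignores His and termini)."""
    return sum(seq.count(a) for a in "KR") - sum(seq.count(a) for a in "DE")

def longest_hydrophobic_run(seq):
    best = run = 0
    for aa in seq:
        run = run + 1 if aa in HYDROPHOBIC else 0
        best = max(best, run)
    return best

def flag_sequence(seq, max_abs_charge=10, max_run=8):
    """Return a list of red flags; an empty list means 'no obvious problem'."""
    flags = []
    if abs(net_charge(seq)) > max_abs_charge:
        flags.append("extreme net charge")
    if longest_hydrophobic_run(seq) > max_run:
        flags.append("long hydrophobic stretch")
    return flags
```

Running `flag_sequence` over a candidate pool before synthesis costs nothing and catches the most obvious non-expressers.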
Step 5: Decide How to Validate Experimentally
You have three broad options for experimental validation of designed binders:
- Yeast surface display. The workhorse for de novo binder validation. Well-validated for extracellular targets, high throughput (thousands to hundreds of thousands of variants), quantitative FACS readout, NGS-compatible.
- Mammalian display. Preferred when the displayed protein requires mammalian post-translational modifications. More expensive and slower than yeast.
- One-by-one purification. Express each top candidate as a purified protein and run SPR, BLI, or a direct binding assay. Low throughput but gives clean affinity measurements. Viable for 5-20 candidates from a filtered pool; infeasible for hundreds.
Most teams running a first campaign combine yeast display (for pool-level hit calling) with purification of the top 5-10 ranked hits (for orthogonal affinity validation).
If Your Team Does Not Have Display Infrastructure
This is the situation most seed-stage biotechs are in. Setting up yeast display, FACS, and NGS analysis from scratch is a six-month project and requires a bench scientist with the relevant training. For a first campaign, it is usually faster to outsource the experimental side.
The Binder Pilot program at Ranomics is scoped exactly for this case — target structure in, ranked hit list out. We do not replace the strategic decisions (target choice, epitope selection, downstream follow-up); we handle the experimental throughput. The scoping call is the right place to discuss whether your starting AlphaFold model is good enough and which epitope patches to prioritize.
For multi-target pipelines, or for teams that need the milestone structure and 100% binder guarantee of a flagship program, the AI Binder Sprint is the next scope up from a Pilot.
Common Pitfalls
A few failure modes to watch for when moving from an AlphaFold model to a first binder campaign:
- Targeting a disordered loop that looks plausible in cartoon view. Always check pLDDT or B-factors locally.
- Designing against an interface that is only bindable in a PTM-dependent form. If the native binding partner only engages a phosphorylated or glycosylated version of your target, a de novo binder on a non-modified recombinant protein will miss.
- Skipping self-consistency filtering. Trusting raw generative output directly into gene synthesis is the fastest way to burn a budget on non-folding candidates.
- Choosing a hotspot far from the functional site. A binder that binds the target is not necessarily a binder that inhibits or activates the target. If your downstream readout is a functional assay, the hotspot selection has to consider mechanism, not just bindability.
Summary
An AlphaFold model is a starting point, not a finished target. The work between “we have a model” and “we have a validated binder” is mostly about three decisions: is the model usable, where on the target should you bind, and how will you filter and validate. The tools to answer those questions are largely free and open-source. The experimental infrastructure is where most small teams run into bandwidth constraints — and that is usually the right place to outsource.
Related Ranomics services
- Binder Pilot: Scoped for academic / seed-biotech teams starting their first binder campaign.
- Epitope Scout: free tool to identify exposed, designable epitopes on your AlphaFold model.