Yeast display has become the default antibody discovery platform for campaigns where the input library is smaller than 10^9 variants and the readout needs to be quantitative on a per-clone basis. Phage display still wins for raw diversity; mammalian display still wins for full-length IgG and complex PTM dependence. Between them, yeast display occupies the practical center of the discovery workflow, and for AI-designed libraries especially, it is the platform that pulls the most useful information out of every clone screened.
This article is a methods primer for the choices that matter: which scaffold to display, how to build the library, how to sort, how to call hits from NGS, and how to triage candidates before committing to expression scale-up.
Why yeast display for antibody discovery
The structural argument is direct. Yeast surface display anchors antibody fragments to the cell wall via an Aga2p fusion and presents the fragment at 10,000–100,000 copies per cell. Each cell is then labeled with two independent fluorophores: one for display level (anti-myc or anti-HA against an epitope tag on the fusion), one for antigen binding. Flow cytometry reports both signals per cell, which means every clone passing through the sorter delivers a normalized affinity estimate, not just a binary “binds / doesn’t bind” call.
That normalization is the part that matters. Phage display enrichment reports population-level changes in clone abundance after panning; you learn which sequences enriched, but not their relative affinities without follow-up titrations. Yeast display reports, for each clone, the ratio of antigen-binding fluorescence to display fluorescence, which is a quantitative proxy for Kd within the linear range of the assay. For ranking the top 100 hits from a 10^4-hit pool, yeast saves roughly a quarter of the validation work.
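The per-clone normalization described above can be sketched in a few lines. This is a minimal illustration, not a cytometry pipeline: the function names, the `(display, antigen)` tuple layout, and the single `background` value are all assumptions made for the example.

```python
from statistics import mean

def normalized_binding(display_signal, antigen_signal, background=0.0):
    """Ratio of antigen fluorescence to display fluorescence for one cell.

    Returns None for cells at or below background display, where the
    ratio is undefined. `background` is a hypothetical autofluorescence
    value applied to both channels for simplicity.
    """
    if display_signal <= background:
        return None
    bound = max(antigen_signal - background, 0.0)
    return bound / (display_signal - background)

def clone_score(events, background=0.0):
    """Mean normalized binding over the displaying cells of one clone.

    `events` is a list of (display, antigen) fluorescence pairs.
    """
    ratios = [normalized_binding(d, a, background) for d, a in events]
    ratios = [r for r in ratios if r is not None]
    return mean(ratios) if ratios else None

# Two cells of the same clone at different display levels give the same ratio.
print(clone_score([(1000.0, 200.0), (2000.0, 400.0)]))
```

The ratio only tracks affinity while the antigen signal is within the linear range of the assay, as noted above; near saturation every clone looks alike.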
A second argument is folding fidelity. Yeast, like mammalian cells but unlike E. coli, has a eukaryotic secretory pathway, with BiP and PDI for folding and disulfide chaperoning and ER quality control that retains and degrades most misfolded proteins before they reach the surface. The displayed fraction of a yeast library is therefore largely the fraction that actually folded. Phage doesn’t enforce this filter; bacterial periplasmic chaperoning is a weaker quality gate, and aggregation-prone clones can still display.
Scaffold choices: scFv, Fab, IgG, VHH
The four common antibody-derived scaffolds present trade-offs you commit to at the library-design stage.
scFv (single-chain Fv) is the workhorse for yeast display. The VH and VL chains are joined by a flexible linker (typically (Gly₄Ser)₃), expressed as a single polypeptide, and display efficiently. The downside: scFvs are prone to thermal instability and aggregation, and the linker can interfere with the binding mode when the antibody must engage a specific epitope geometry. For early discovery, scFv format is the cheapest and most informative. For lead optimization beyond the discovery hit, conversion to Fab or IgG is required.
Fab (antigen-binding fragment) display expresses the two chains as separate polypeptides. One chain is fused to Aga2p; the other is secreted as a soluble chain and pairs with its partner in the secretory pathway. Fab display correlates better with the full IgG binding mode than scFv does, and the format is closer to a developable lead. The trade-off is library-construction complexity: with two chains to diversify, library size grows as the product of heavy-chain and light-chain diversity.
IgG in yeast is not practical at library scale. The molecule is too large, and assembly of the four chains in the yeast secretory pathway is inefficient. For full IgG discovery, mammalian display is the right platform.
VHH (single-domain antibodies from camelids) is the cleanest yeast display scaffold by far. One domain, one chain, ~110 residues, no light chain to coordinate, no linker artifact, intrinsically stable. Library sizes of 10^9 unique VHHs are routine, and the format delivers fully developable leads with minimal reformatting. For programs where the IgG-format constraint isn’t load-bearing (intracellular targets, multivalent constructs, bispecifics built on a VHH scaffold), VHH-on-yeast is the platform of choice.
We default to scFv or VHH for discovery, depending on the downstream format, and switch to Fab when full-length IgG developability is a near-term concern.
Library construction: natural, synthetic, AI-designed
The three library-construction strategies trade off diversity, design control, and cost. We covered the comparison in detail in “natural vs synthetic vs AI-designed libraries”; the summary applicable to yeast display:
- Natural libraries (cDNA amplified from immunized animals or naïve donor B cells) deliver evolved diversity at low cost. Library sizes reach 10^9 in yeast. The trade-off is no design control: you get what the immune repertoire produced, including liabilities (deamidation hotspots, oxidation-prone methionines, aggregation-prone CDR-H3 sequences).
- Synthetic libraries (germline frameworks plus combinatorial CDR diversification using NNK, NNS, or trimer codons) deliver designed diversity with controllable framework choice and CDR length distribution. Library sizes are also 10^9. Costs are higher but liabilities can be engineered out at the design stage.
- AI-designed libraries (ProteinMPNN-resampled, RFdiffusion-conditioned, BindCraft-generated) deliver pre-filtered diversity concentrated near the target’s binding mode. Library sizes are 10^4–10^6 — orders of magnitude smaller — but hit rates per clone are correspondingly higher.
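To make the library-size numbers above concrete, here is a small sketch of how quickly NNK-randomized positions outgrow a 10^9 yeast library. It relies only on the standard properties of the NNK scheme (32 codons covering all 20 amino acids, with TAG as the single stop codon); the function names are ours, and `max_coverage` is a deliberately crude upper bound that ignores sampling statistics and codon bias.

```python
def nnk_diversity(n_positions):
    """Theoretical diversity for n fully randomized NNK positions."""
    codon_variants = 32 ** n_positions        # N,N,K -> 4 * 4 * 2 codons
    protein_variants = 20 ** n_positions      # NNK encodes all 20 amino acids
    stop_free = (31 / 32) ** n_positions      # TAG is the only NNK stop codon
    return codon_variants, protein_variants, stop_free

def max_coverage(library_size, n_positions):
    """Upper bound on the fraction of codon space a library can contain."""
    codon_variants, _, _ = nnk_diversity(n_positions)
    return min(1.0, library_size / codon_variants)

for n in (4, 6, 8, 10):
    codons, proteins, stop_free = nnk_diversity(n)
    print(f"{n} positions: {codons:.2e} codon variants, "
          f"{proteins:.2e} protein variants, "
          f"{stop_free:.1%} stop-free, "
          f"max coverage at 1e9 clones: {max_coverage(1e9, n):.2e}")
```

Beyond roughly six randomized positions a 10^9 library can only sample the design space, which is the practical argument for concentrating diversity where it is most likely to matter.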
For naïve discovery against a new target, we run synthetic-on-yeast as the default. For targets where AI-designed candidates exist, we run AI-on-yeast as a second arm and union the hits.
Sorting: MACS pre-enrichment, then FACS rounds
A 10^9 yeast library cannot be FACS-sorted in one pass. Flow rates cap at ~10^7–10^8 cells per hour on a high-end sorter, and the volume of buffer to handle 10^9 cells is impractical. The solution is magnetic-activated cell sorting (MACS) pre-enrichment before the first FACS round.
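The throughput argument is simple arithmetic, sketched here using the sorter rates quoted above. The `passes` parameter (how many times the population is oversampled) is a hypothetical knob added for illustration, and the 10^7 figure is the MACS yield cited in the pattern that follows.

```python
def facs_hours(n_cells, cells_per_hour, passes=1.0):
    """Sorter hours needed to interrogate n_cells `passes` times over."""
    return n_cells * passes / cells_per_hour

for label, n_cells in [("full 1e9 library", 1e9), ("1e7 cells after MACS", 1e7)]:
    fast = facs_hours(n_cells, cells_per_hour=1e8)
    slow = facs_hours(n_cells, cells_per_hour=1e7)
    print(f"{label}: {fast:g} to {slow:g} hours per pass")
```

Oversampling the population several-fold, which is common practice so that rare clones are not lost by chance, multiplies these figures accordingly.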
The pattern:
- MACS round 1: incubate 10^9 yeast with biotinylated antigen at ~100 nM; capture on streptavidin microbeads. Yield: ~10^7 enriched cells. This single step removes the 99% of the library that is non-displaying or non-binding.
- FACS round 1: stain with biotinylated antigen at ~10 nM plus anti-myc-PE for display normalization. Sort the top ~1% of double-positive cells. Yield: ~10^5 enriched cells.
- FACS rounds 2–4: decrease antigen concentration each round (1 nM, 100 pM, 10 pM as the campaign proceeds). Each round tightens selection stringency. Yield: ~10^3–10^4 enriched cells per round.
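Why lowering the antigen concentration tightens stringency can be seen from a simple 1:1 equilibrium binding model. The sketch below assumes antigen is in excess over displayed fragment (no ligand depletion) and that staining has reached equilibrium; the two Kd values are hypothetical clones chosen for contrast, not campaign data.

```python
def fraction_bound(antigen_nM, kd_nM):
    """Equilibrium occupancy for 1:1 binding with antigen in excess."""
    return antigen_nM / (antigen_nM + kd_nM)

round_concentrations_nM = [10.0, 1.0, 0.1, 0.01]   # 10 nM, 1 nM, 100 pM, 10 pM

for kd_nM in (10.0, 0.1):                          # weak vs tight hypothetical clone
    row = ", ".join(f"{fraction_bound(c, kd_nM):.3f}"
                    for c in round_concentrations_nM)
    print(f"Kd = {kd_nM} nM -> occupancy per round: {row}")
```

At 100 pM antigen the 10 nM clone is about 1% occupied while the 0.1 nM clone is half occupied, so the later rounds separate clones that look similar at 10 nM.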
Total wall-clock: 3–4 weeks for a complete campaign. Total reagent burden: ~50 μg biotinylated antigen if titrations are conservative. Across rounds, the population shifts from 0.1% antigen-positive to >50% antigen-positive — a clear sorting signature.
The trick that matters: titrate display levels before staining, especially in late rounds. As selection narrows the population, the surviving clones may differ in display level by 5–10×, which inflates apparent affinity differences. Gating on a fixed display-level window normalizes this. We documented the protocol in correctly titrating display levels.
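A fixed display-level window can be expressed as a simple filter on per-cell events. This is an illustrative sketch only: the `(display, antigen)` tuple layout, the function name, and the window bounds are assumptions, and real gates would be drawn on compensated cytometry data.

```python
def gate_display_window(events, low, high):
    """Keep (display, antigen) events whose display signal lies in [low, high]."""
    return [(d, a) for d, a in events if low <= d <= high]

events = [
    (500.0, 100.0),    # low displayer
    (2000.0, 400.0),   # same antigen/display ratio at 4x the display
    (4000.0, 800.0),   # same ratio again at 8x the display
    (2200.0, 1100.0),  # higher ratio at a comparable display level
]

gated = gate_display_window(events, low=1500.0, high=2500.0)
for display, antigen in gated:
    print(f"display={display:.0f} antigen={antigen:.0f} ratio={antigen / display:.2f}")
```

Without the gate, the first three events span an 8-fold range of raw antigen signal despite identical ratios; inside the window, the remaining difference in antigen signal reflects binding rather than display.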
A second practical note from our campaigns: for a target or scaffold we have not screened before, our first pass is to incubate overnight. Most published protocols specify one to two hours, but the time a new yeast display assay takes to reach equilibrium is not always the time the protocol guide assumes. Overnight is a safe upper bound, and we titrate the incubation time back from there once we have a reliable signal.
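A rough way to see why one to two hours may fall short: for 1:1 binding with antigen in excess, the approach to equilibrium follows an observed rate k_obs = k_on × [antigen] + k_off. The sketch below uses that textbook relation; the rate constants are hypothetical values chosen for illustration, not measurements from any campaign.

```python
import math

def hours_to_equilibrium(kon_per_M_s, koff_per_s, antigen_M, fraction=0.95):
    """Hours to reach `fraction` of the equilibrium signal.

    Assumes pseudo-first-order kinetics (antigen in excess):
    k_obs = kon * [antigen] + koff, and signal(t) ~ 1 - exp(-k_obs * t).
    """
    k_obs = kon_per_M_s * antigen_M + koff_per_s
    seconds = -math.log(1.0 - fraction) / k_obs
    return seconds / 3600.0

# Hypothetical clone: kon = 1e5 /M/s, koff = 1e-4 /s (Kd = 1 nM), stained at 1 nM.
t = hours_to_equilibrium(kon_per_M_s=1e5, koff_per_s=1e-4, antigen_M=1e-9)
print(f"{t:.1f} hours to reach 95% of equilibrium")
```

Under these assumed constants the answer is a little over four hours, and it grows as the antigen concentration and off-rate drop, which is why incubation time deserves its own titration in later, lower-concentration rounds.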
NGS hit calling
After each FACS round, plasmid is recovered from the sorted population, the antibody coding region is PCR-amplified with NGS-compatible primers, and the population is sequenced on Illumina (typically 1–5 million reads per round per population). The output is a per-clone abundance trace across rounds.
What you look for:
- Enrichment factor: clones whose frequency increases ≥10× from round 2 to round 4 are the lineages of interest. Background clones (carry-over non-binders, contamination) typically stay flat or decrease.
- Lineage convergence: clones with shared CDR-H3 sequences (or VHH CDR3) are evidence the campaign is converging on a productive solution rather than enriching noise.
- Read-depth sufficiency: clones below 100 reads in the final round are statistically unreliable. Either re-sort or skip them.
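The enrichment and read-depth criteria above can be combined into a short filter over per-round read counts. This is a minimal sketch under stated assumptions: counts are normalized to per-round frequencies before comparison, a pseudocount of 1 is added so clones unseen in round 2 do not divide by zero, and the sequences are toy CDR3 strings. None of these choices come from a specific pipeline.

```python
def call_hits(counts_r2, counts_r4, min_fold=10.0, min_reads=100, pseudocount=1):
    """Return {sequence: fold_enrichment} for clones passing both filters.

    counts_r2 / counts_r4 map a CDR sequence to its read count in
    round 2 and round 4. Enrichment is the ratio of round-4 frequency
    to round-2 frequency, with a pseudocount added to both counts.
    """
    total_r2 = sum(counts_r2.values())
    total_r4 = sum(counts_r4.values())
    hits = {}
    for seq, reads_r4 in counts_r4.items():
        if reads_r4 < min_reads:
            continue  # too few reads in the final round to trust
        freq_r2 = (counts_r2.get(seq, 0) + pseudocount) / total_r2
        freq_r4 = (reads_r4 + pseudocount) / total_r4
        fold = freq_r4 / freq_r2
        if fold >= min_fold:
            hits[seq] = fold
    return hits

# Toy data: one enriching clone, one flat background clone, one low-depth clone.
round2 = {"CARDYGSSW": 9, "CAKGGYFDY": 990, "CTRSNWFAY": 1}
round4 = {"CARDYGSSW": 499, "CAKGGYFDY": 450, "CTRSNWFAY": 51}
for seq, fold in call_hits(round2, round4).items():
    print(seq, f"{fold:.1f}x")
```

Lineage convergence would be a further step on top of this, for example grouping the surviving sequences by shared CDR-H3 or by edit distance.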
For the full math on what library diversity and read depth deliver, see the numbers game on NGS library diversity.
The output is typically 50–200 unique CDR sequences that meet enrichment and read-depth criteria. From there, you triage to the top 20–50 for biochemical follow-up.
In our recent campaigns, we are increasingly running the same NGS step on the non-binding pool too. The training datasets behind the public AI design models skew heavily positive, and the negative pool from a sorted yeast library is some of the cleanest failure-labelled data anyone has access to. The marginal sequencing cost is small relative to the value of those labelled negatives for any downstream model work.
From hit to lead
The top NGS hits are reformatted to soluble Fab or IgG, expressed at small scale (typically 1–10 mL in HEK293 or CHO transient expression), and characterized for:
- Binding by BLI or SPR: confirm Kd is in the expected range (typically 0.1–10 nM for a successful campaign).
- Specificity: counter-screen against off-target proteins (related family members, serum albumin) and against PSR (polyspecificity reagent).
- Thermal stability: nanoDSF Tm. Below 65 °C is a yellow flag for downstream developability.
- Hydrodynamic behavior: SEC profile. Aggregation, dimers, fragments — all visible in this single readout.
- Production yield: mg/L from the small-scale prep. Below 10 mg/L predicts difficult scale-up.
Hits passing all five criteria advance to scaled expression and full in vitro characterization. Hits failing one criterion enter optimization (engineering, DMS-guided affinity maturation, or framework re-grafting). Hits failing multiple criteria are deprioritized.
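The triage rule above maps naturally onto a small decision function. In this sketch the Tm and yield thresholds come from the criteria listed; treating any Kd at or below 10 nM as a pass, and reducing specificity and SEC to booleans, are simplifying assumptions, and the `Candidate` fields are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    kd_nM: float            # from BLI or SPR
    specific: bool          # passed off-target and PSR counter-screens
    tm_C: float             # nanoDSF melting temperature
    sec_clean: bool         # SEC shows a clean monomer peak
    yield_mg_per_L: float   # small-scale expression yield

def failed_criteria(c):
    """List the names of the five criteria this candidate fails."""
    fails = []
    if c.kd_nM > 10.0:
        fails.append("binding")
    if not c.specific:
        fails.append("specificity")
    if c.tm_C < 65.0:
        fails.append("thermal stability")
    if not c.sec_clean:
        fails.append("hydrodynamic behavior")
    if c.yield_mg_per_L < 10.0:
        fails.append("production yield")
    return fails

def triage(c):
    """Advance on zero failures, optimize on one, deprioritize on more."""
    n_fails = len(failed_criteria(c))
    if n_fails == 0:
        return "advance"
    if n_fails == 1:
        return "optimize"
    return "deprioritize"

print(triage(Candidate("clone-07", 2.0, True, 61.5, True, 25.0)))
```

Returning the list of failed criteria, not just the verdict, keeps the record of which optimization route (engineering, affinity maturation, or framework re-grafting) a single-failure hit should enter.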
The conversion rate from “NGS-enriched clone” to “developable lead” in a well-run campaign is typically 30–60%. Below 20%, something is wrong with sorting stringency, library design, or target preparation.
Two-platform handoff
Yeast display is a discovery platform, not a developability platform. The high-mannose glycans yeast installs are not the complex glycans mammalian production cells will install, and a clone that displays well on yeast can still fail on mammalian expression for reasons yeast cannot model.
For programs heading to therapeutic development, the right workflow runs the two-platform sequence: yeast display for affinity discovery, mammalian display for developability validation. The yeast arm delivers the hit list; the mammalian arm filters out clones that fail PTM compatibility before committing to expression scale-up.
Decision summary
If your target is a soluble antigen, your library is smaller than 10^9, and your downstream format is scFv, Fab, or VHH: yeast display is the right discovery platform.
If your library must be larger than 10^9 (truly naïve diversity, no design control): start with phage display, then triage the top 10^4–10^5 hits on yeast for quantitative ranking.
If your target depends on PTMs yeast cannot install (complex glycans, sulfation, specific O-linked structures): skip yeast and run mammalian display from the start.
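The three rules above can be written as a short function for scoping discussions. The order in which the rules are checked, the treatment of a library of exactly 10^9 as yeast-compatible, and the fallback string for cases the summary does not address are assumptions of this sketch, as are the parameter names.

```python
YEAST_FORMATS = {"scFv", "Fab", "VHH"}

def choose_platform(library_size, downstream_format, soluble_antigen,
                    needs_unsupported_ptms):
    """Apply the decision summary; precedence of the rules is assumed."""
    if needs_unsupported_ptms:
        return "mammalian display"
    if library_size > 1e9:
        return "phage display, then yeast triage of top hits"
    if soluble_antigen and downstream_format in YEAST_FORMATS:
        return "yeast display"
    return "not covered by this summary"

print(choose_platform(1e6, "VHH", soluble_antigen=True,
                      needs_unsupported_ptms=False))
```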
If you’re scoping a yeast display campaign and want a second opinion on scaffold choice, library design, or sorting strategy, see our yeast surface display services or start a Binder Pilot. For multi-target programs and full IgG development, see the AI Binder Sprint.