NGS analysis for protein engineering
Next-generation sequencing pipeline for quantitative hit calling, enrichment analysis, and candidate ranking in protein engineering
Quantitative hit calling from next-generation sequencing
Every display screening and deep mutational scanning experiment at Ranomics includes next-generation sequencing as the final readout. Raw Illumina reads are processed through our standardized bioinformatics pipeline to produce variant-level enrichment scores, replicate-validated hit lists, and statistical confidence metrics.
This replaces colony-picking and Sanger sequencing — the historical bottleneck in directed evolution — with a quantitative, comprehensive readout. Instead of characterizing 96 clones and hoping the best variant is among them, NGS analysis evaluates every variant in the library simultaneously and ranks them by functional performance.
NGS analysis pipeline for enrichment-based screening
Read processing and QC
Raw reads are demultiplexed, quality-filtered (Q30 threshold), and adapter-trimmed. Paired-end reads are merged where applicable. Per-sample read count distributions and quality metrics are reported for every run.
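As a rough illustration of the quality-filtering step, the sketch below keeps a read when its mean Phred score is at least 30. The mean-score rule, the Phred+33 encoding, and the function names are assumptions made for this example; the page does not say how the Q30 threshold is applied.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string (Phred+33 assumed) into integer scores."""
    return [ord(ch) - offset for ch in quality_line]


def passes_quality(quality_line, min_mean_q=30):
    """Assumed rule: keep a read when its mean Phred score is >= min_mean_q."""
    scores = phred_scores(quality_line)
    return bool(scores) and sum(scores) / len(scores) >= min_mean_q


def filter_reads(records, min_mean_q=30):
    """records: iterable of (read_id, sequence, quality_line) tuples."""
    return [rec for rec in records if passes_quality(rec[2], min_mean_q)]


# Toy reads for illustration only.
reads = [
    ("read1", "ACGT", "IIII"),  # 'I' decodes to Phred 40
    ("read2", "ACGT", "!!II"),  # '!' decodes to Phred 0, so the mean is 20
]
kept = filter_reads(reads)
```

Other conventions, such as trimming low-quality tails or requiring a minimum fraction of bases above Q30, would be equally plausible readings of a "Q30 threshold".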
Variant calling and counting
Reads are aligned to the reference sequence and translated. Each unique amino acid variant is counted across pre-selection (input) and post-selection (output) populations. Synonymous codons are collapsed, so reads that encode the same protein sequence contribute to a single variant count.
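A minimal sketch of translation and counting with synonymous codon collapse, assuming reads have already been aligned and trimmed to the in-frame coding region (the alignment step is not shown). The standard genetic code is used; the function names and example reads are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA string; trailing partial codons are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def count_variants(reads):
    """Count reads per protein sequence, so synonymous DNA variants collapse."""
    return Counter(translate(read) for read in reads)


# Toy reads: the first two differ only by synonymous codons.
reads = ["ATGGCTAAA", "ATGGCCAAG", "ATGGGTAAA"]
counts = count_variants(reads)
```

Here the two reads encoding MAK are counted together, while MGK is counted separately.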
Enrichment ratio calculation
A log2 enrichment ratio is computed for each variant as the base-2 logarithm of its output frequency divided by its input frequency, with pseudocount correction for low-abundance variants. Ratios are normalized to the wild-type sequence as the internal reference.
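The calculation can be sketched as follows. The pseudocount value of 0.5, the way it is added to raw counts, and the function names are assumptions for illustration; the page does not specify them.

```python
import math


def log2_ratio(count_in, count_out, total_in, total_out, pseudocount=0.5):
    """log2(output frequency / input frequency) with a pseudocount on counts."""
    freq_in = (count_in + pseudocount) / total_in
    freq_out = (count_out + pseudocount) / total_out
    return math.log2(freq_out / freq_in)


def enrichment_scores(input_counts, output_counts, wild_type, pseudocount=0.5):
    """Wild-type-normalized log2 enrichment for every variant in the input."""
    total_in = sum(input_counts.values())
    total_out = sum(output_counts.values())
    wt_ratio = log2_ratio(
        input_counts[wild_type], output_counts.get(wild_type, 0),
        total_in, total_out, pseudocount,
    )
    scores = {}
    for variant, count_in in input_counts.items():
        count_out = output_counts.get(variant, 0)
        ratio = log2_ratio(count_in, count_out, total_in, total_out, pseudocount)
        scores[variant] = ratio - wt_ratio  # wild type scores exactly 0
    return scores


# Toy counts for illustration only.
input_counts = {"WT": 100, "A": 100, "B": 100}
output_counts = {"WT": 100, "A": 400, "B": 0}
scores = enrichment_scores(input_counts, output_counts, wild_type="WT")
```

With this normalization, positive scores indicate variants enriched more than wild type and negative scores indicate variants enriched less. The pseudocount keeps variant B, which has zero output reads, at a finite score.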
Statistical filtering and hit calling
Variants are filtered by minimum read count thresholds in both input and output populations. Replicate concordance is assessed. Final hit lists include enrichment scores, confidence intervals, and false discovery rate estimates.
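The three filtering ideas above can be sketched in plain Python. Everything specific here is an assumption for illustration: the read count thresholds of 10, Pearson correlation as the concordance measure, and Benjamini-Hochberg as the false discovery rate procedure. The p-values are taken as given, since the page does not name the statistical test.

```python
import math


def passes_min_counts(count_in, count_out, min_in=10, min_out=10):
    """Hypothetical thresholds: require enough reads in both populations."""
    return count_in >= min_in and count_out >= min_out


def pearson(xs, ys):
    """Pearson correlation between two replicates' enrichment scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy values for illustration only.
r = pearson([1.0, 2.0, 3.0], [1.1, 1.9, 3.2])
q = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

A hit list could then keep variants that pass the count filter and fall below a chosen adjusted p-value cutoff.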
Ranked candidate delivery
Top candidates are delivered as a ranked list with enrichment scores, read counts, replicate agreement, and recommended follow-up priority. For DMS experiments, full position-by-substitution fitness matrices and heatmaps are included.
Sequencing analysis for different experiment types
Display screening analysis
Enrichment analysis across multiple rounds of FACS or MACS selection. Tracks variant frequency trajectories to distinguish genuine hits from stochastic enrichment. Identifies convergent clones across independent selections.
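One simple way to look at frequency trajectories is to compute each variant's frequency per round and flag those that rise in every round. The strictly-increasing rule and the function names below are assumptions for this sketch, not a description of the criteria actually applied.

```python
def trajectories(round_counts):
    """round_counts: list of {variant: read count} dicts, one per round."""
    totals = [sum(counts.values()) for counts in round_counts]
    variants = set().union(*round_counts)
    return {
        v: [counts.get(v, 0) / total for counts, total in zip(round_counts, totals)]
        for v in variants
    }


def consistently_enriched(trajectory):
    """Assumed rule: frequency must increase in every successive round."""
    return all(later > earlier for earlier, later in zip(trajectory, trajectory[1:]))


# Toy counts for three selection rounds, for illustration only.
rounds = [
    {"A": 10, "B": 10, "C": 80},
    {"A": 30, "B": 5, "C": 65},
    {"A": 60, "B": 20, "C": 20},
]
traj = trajectories(rounds)
hits = sorted(v for v, t in traj.items() if consistently_enriched(t))
```

Variant B ends higher than it started but dips in the middle round, so a single-round comparison would flag it while the trajectory check does not.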
DMS fitness scoring
Complete saturation mutagenesis analysis producing position-by-substitution fitness matrices. Normalized enrichment scores, replicate averaging, and statistical significance testing for every measured variant.
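A position-by-substitution matrix can be assembled from per-variant scores roughly as follows. The mutation naming (wild-type residue, 1-based position, substituted residue, such as "K3R"), the amino acid column order, and the use of None for unmeasured variants are assumptions for this sketch.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def fitness_matrix(scores, length):
    """Rows are positions (1-based in names), columns follow AMINO_ACIDS."""
    matrix = [[None] * len(AMINO_ACIDS) for _ in range(length)]
    for mutation, score in scores.items():
        position = int(mutation[1:-1])  # digits between the two residues
        substitution = mutation[-1]
        matrix[position - 1][AMINO_ACIDS.index(substitution)] = score
    return matrix


# Toy scores for illustration only.
scores = {"M1A": -2.0, "K3R": 0.4}
matrix = fitness_matrix(scores, length=3)
```

Each row can then be drawn as one column of a heatmap, with unmeasured cells left blank.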
Library QC sequencing
Pre-selection library characterization: diversity coverage, positional bias detection, amino acid distribution analysis, and wild-type contamination assessment. Verifies library quality before committing to selection.
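Two of these metrics, diversity coverage and wild-type contamination, can be sketched from a table of pre-selection counts. The definitions used here (coverage as the fraction of designed variants seen at least once, contamination as the wild-type share of all reads) and the names are assumptions for illustration.

```python
def library_qc(counts, designed_variants, wild_type):
    """counts: {variant: read count} from the pre-selection library."""
    total_reads = sum(counts.values())
    observed = sum(1 for v in designed_variants if counts.get(v, 0) > 0)
    return {
        "coverage": observed / len(designed_variants),
        "wild_type_fraction": counts.get(wild_type, 0) / total_reads,
    }


# Toy library for illustration only: one designed variant is missing.
designed = ["A1G", "A1V", "A1L", "A1I"]
counts = {"WT": 20, "A1G": 30, "A1V": 40, "A1L": 10}
report = library_qc(counts, designed, wild_type="WT")
```

In this toy example three of the four designed variants are observed and wild type accounts for one fifth of the reads.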
Combinatorial variant analysis
Multi-mutation variant tracking for combinatorial libraries. Epistasis detection between co-occurring mutations. Identifies synergistic combinations that outperform single-mutation predictions.
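A common way to quantify pairwise epistasis is to compare the double mutant's log-scale score with the sum of the single-mutant scores. The additive null model below is an assumption for this sketch; the page does not state which model is used.

```python
def epistasis(score_double, score_a, score_b):
    """Deviation of a double mutant from the additive expectation.

    Scores are assumed to be wild-type-normalized log2 enrichment values,
    so independent effects are expected to add.
    """
    return score_double - (score_a + score_b)


# Toy scores for illustration only.
synergy = epistasis(score_double=2.5, score_a=1.0, score_b=0.5)
antagonism = epistasis(score_double=1.0, score_a=1.0, score_b=0.5)
```

A positive value suggests the combination performs better than the single mutations predict, and a negative value suggests it performs worse.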
Cross-campaign comparison
Normalized enrichment data enables comparison across different selection conditions, targets, or library designs within the same campaign. Identifies variants with broad activity profiles.
Custom bioinformatics
Non-standard analyses for specialized experimental designs. Custom reference sequences, non-standard codon tables, frame-shift detection, and integration with external datasets as needed.
Need sequencing analysis for your campaign?
Whether integrated with a Ranomics screening campaign or as standalone bioinformatics support, we deliver ranked candidates from your sequencing data.
Start a project →