ColabFold vs AlphaFold 2: When to Use Which

ColabFold and AlphaFold 2 are often described as the same model with a different frontend. That description is correct but easy to misread. The neural-network weights and inference graph are identical. The difference sits one step earlier in the pipeline, in how the multiple-sequence alignment (MSA) gets built before the weights ever see the input. That single substitution changes throughput by roughly an order of magnitude, and it changes which tool is the right one for a given task.

This post walks through what changes, what does not, and how to pick between them for a given task.

The Single Engineering Change

Classical AlphaFold 2, in its default full-database configuration, builds its MSA by running jackhmmer against UniRef90 and MGnify and HHblits against BFD and Uniclust30. On a single sequence this homology search typically takes thirty to ninety minutes on a modern server, and it dominates wall-clock time. The neural-network inference itself, which is the part most people associate with AlphaFold 2, takes a few minutes.

ColabFold replaces that homology-search frontend with an MMseqs2 search against precomputed UniRef30 and ColabFoldDB profile databases. The MMseqs2 servers are hot, the profile databases are clustered for fast lookup, and an MSA of comparable quality comes back in seconds to a couple of minutes. The AlphaFold 2 weights then run on that MSA exactly as they would on a jackhmmer-built one.

That is the entire architectural delta. There is no second model, no fine-tune, no distilled student network. ColabFold is AlphaFold 2 with a faster MSA pipeline bolted on the front.
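
To make the order-of-magnitude claim concrete, here is a back-of-the-envelope sketch. The per-sequence timings are illustrative assumptions picked from within the ranges quoted above, `batch_hours` is a hypothetical helper rather than part of either tool, and the batch is assumed to run one sequence after another.

```python
def batch_hours(n_sequences, msa_minutes, inference_minutes):
    """Wall-clock hours for a serial batch: (MSA + inference) per sequence."""
    return n_sequences * (msa_minutes + inference_minutes) / 60.0


# Illustrative per-sequence timings (assumptions, not measurements).
N = 100
full_msa = batch_hours(N, msa_minutes=60.0, inference_minutes=5.0)
colabfold = batch_hours(N, msa_minutes=1.0, inference_minutes=5.0)
speedup = full_msa / colabfold

print(f"full-MSA AF2: {full_msa:.1f} h")
print(f"ColabFold:    {colabfold:.1f} h")
print(f"speedup:      {speedup:.1f}x")
```

Because inference time is the same in both cases, the speedup is set almost entirely by the ratio of the MSA times; as the MSA step shrinks, inference becomes the floor.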

What Stays the Same

Because the weights are identical, several properties carry over without modification:

Accuracy on well-represented targets. For sequences with deep evolutionary support in UniRef and an MSA depth of a few hundred sequences or more, ColabFold and full-MSA AlphaFold 2 produce structures within sub-angstrom RMSD of each other on the high-pLDDT regions.
pLDDT calibration. Per-residue confidence scores are reported on the same 0 to 100 scale and mean the same thing. A pLDDT of 85 from ColabFold is the same signal as a pLDDT of 85 from full-MSA AF2.
PAE matrix interpretation. The predicted aligned error matrix is generated by the same head on the same weights. Off-diagonal blocks still tell you about inter-domain or inter-chain confidence.
Multimer support. Both pipelines support the AF2-multimer weights with paired and unpaired MSAs.
Template usage. PDB70 templates can be passed in either pipeline. The template featurization is the same.
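
As a rough illustration of reading the off-diagonal PAE blocks, the sketch below averages the two inter-chain blocks of a two-chain prediction. It assumes the PAE matrix has already been loaded as a square list of lists with chain A first; how it is loaded depends on the output format of the pipeline and is not shown. The function names and the toy matrix are made up for the example.

```python
def mean_block(pae, rows, cols):
    """Mean of pae[i][j] over i in rows and j in cols."""
    values = [pae[i][j] for i in rows for j in cols]
    return sum(values) / len(values)


def interchain_pae(pae, len_a):
    """Mean PAE over both off-diagonal blocks of a two-chain complex.

    Chain A is assumed to occupy indices [0, len_a); chain B is the rest.
    """
    n = len(pae)
    chain_a = range(0, len_a)
    chain_b = range(len_a, n)
    return 0.5 * (mean_block(pae, chain_a, chain_b) + mean_block(pae, chain_b, chain_a))


# Toy 4-residue complex (2 + 2): confident within chains, uncertain between them.
toy_pae = [
    [1, 1, 20, 22],
    [1, 1, 18, 20],
    [21, 19, 1, 1],
    [20, 20, 1, 1],
]
print(interchain_pae(toy_pae, len_a=2))          # inter-chain blocks
print(mean_block(toy_pae, range(2), range(2)))   # intra-chain block of chain A
```

A low intra-chain mean next to a high inter-chain mean, as in the toy matrix, suggests each chain is modeled confidently while their relative placement is not.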

Readers sometimes ask whether ColabFold outputs are publishable, or whether reviewers will want full-MSA AF2 instead. For well-represented sequences the two are interchangeable: both pipelines run the same network on functionally equivalent inputs. The ColabFold methods paper (Mirdita et al., Nat Methods 2022) covers the comparison in detail.

Where the Two Diverge

The divergence shows up in three places, all related to how the MSA itself differs.

MSA Depth on Rare Sequences

MMseqs2 searches UniRef30 (a 30% identity clustering of UniRef100) and ColabFoldDB (a curated environmental database). For sequences with hundreds or thousands of homologs in those databases, the MSA depth is plenty. For orphan proteins, very recently evolved proteins, or proteins from a narrow taxonomic clade with few sequenced relatives, the MMseqs2 search can come back with a shallow MSA where full jackhmmer plus HHblits would have built a deeper one. On those sequences, full-MSA AlphaFold 2 can give a more confident structure.

In practice this is rare. Most natural protein sequences are well-represented. The edge case shows up most often on:

De novo designed protein binders (which have no natural homologs by construction; see below for the right tool here).
Proteins from organisms whose genomes have not been deeply sequenced or whose families are taxonomically narrow.
Highly diverged viral proteins.
Engineered or synthetic enzymes far from natural sequence space.
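
A quick way to see which regime a sequence falls in is to count the entries in the MSA file. The sketch below assumes the MSA is available as A3M text (FASTA-style `>` headers, with the query as the first entry) and treats lines starting with `#` as comments. The helper names are hypothetical, and the cutoff of fifty is the rule of thumb used later in this post, not a hard limit.

```python
SHALLOW_CUTOFF = 50  # rule-of-thumb depth from this post, not a hard limit


def a3m_depth(a3m_text):
    """Count sequences in A3M text by counting '>' header lines (query included)."""
    return sum(1 for line in a3m_text.splitlines() if line.startswith(">"))


def is_shallow(a3m_text, cutoff=SHALLOW_CUTOFF):
    """True when the alignment holds fewer sequences than the cutoff."""
    return a3m_depth(a3m_text) < cutoff


# Toy alignment: the query plus two hits.
toy_a3m = """#8\t1
>query
MKTAYIAK
>hit_1
MKTAYLAK
>hit_2
MK-AYIAK
"""
print(a3m_depth(toy_a3m), is_shallow(toy_a3m))
```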

Single-Sequence Mode

ColabFold exposes a --msa-mode single_sequence flag that runs AlphaFold 2 with no MSA at all. This is the right mode for de novo binder designs, where there are no natural homologs to align against and feeding a shallow random MSA actively confuses the model. Full-MSA AlphaFold 2 does not have a clean single-sequence path built in.

For designed sequences, single-sequence ColabFold inference plus pLDDT on the designed region is the standard self-consistency check. Many design pipelines (RFdiffusion plus ProteinMPNN, BindCraft, BoltzGen) run exactly this check before promoting candidates to wet lab.
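
A minimal sketch of the pLDDT-on-the-designed-region check, assuming the prediction is a PDB file with per-residue pLDDT stored in the B-factor column (the convention used by AlphaFold 2 style outputs). `region_mean_plddt` and `make_ca_line` are hypothetical helpers; the latter only builds toy input so the example runs without a real prediction.

```python
def region_mean_plddt(pdb_text, chain, first, last):
    """Mean pLDDT over CA atoms of residues first..last (inclusive) on one chain.

    Reads the fixed-width PDB columns: atom name (13-16), chain ID (22),
    residue number (23-26), and B-factor (61-66), which holds pLDDT here.
    """
    values = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or len(line) < 66:
            continue
        if line[12:16].strip() != "CA" or line[21] != chain:
            continue
        if first <= int(line[22:26]) <= last:
            values.append(float(line[60:66]))
    if not values:
        raise ValueError("no CA atoms found in the requested region")
    return sum(values) / len(values)


def make_ca_line(serial, res_seq, plddt, chain="A"):
    """Build one fixed-width PDB ATOM record for a CA atom (toy input only)."""
    return (f"ATOM  {serial:5d}  CA  ALA {chain}{res_seq:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{plddt:6.2f}")


# Toy model: residues 1-2 are the designed region, 3-4 a floppy tail.
toy_pdb = "\n".join(
    make_ca_line(i, i, p) for i, p in enumerate([90.0, 80.0, 50.0, 40.0], start=1)
)
print(region_mean_plddt(toy_pdb, chain="A", first=1, last=2))
```

Restricting the average to the designed residues keeps a disordered tag or linker from dragging down the score of an otherwise well-folded design.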

Recycle Iterations and Inference Tweaks

ColabFold exposes inference knobs that most full-MSA AF2 pipelines hide. Recycle count is configurable from 1 to 12, with 3 as the default. Increasing recycles past 6 has diminishing returns on most targets but can pull marginal sequences toward higher pLDDT. The choice of which model weight checkpoint to use (model_1 through model_5) is also exposed.

These are inference-time controls. They do not change the underlying model, but they let you trade GPU time for marginal accuracy when a single high-stakes prediction warrants it.
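
The sketch below assembles a `colabfold_batch` command line with these knobs set explicitly. It only builds and prints the command; nothing is executed. The flag names `--num-recycle` and `--num-models` reflect my understanding of the local ColabFold CLI and should be checked against `colabfold_batch --help` for the installed version. The file names and the `colabfold_command` helper are hypothetical.

```python
import shlex


def colabfold_command(fasta, out_dir, num_recycle=3, num_models=5, msa_mode=None):
    """Build (but do not run) a colabfold_batch argument list.

    Flag names are assumptions about the installed CLI; verify with --help.
    """
    if num_recycle < 0:
        raise ValueError("num_recycle must be non-negative")
    if not 1 <= num_models <= 5:
        raise ValueError("num_models must be between 1 and 5")
    cmd = ["colabfold_batch",
           "--num-recycle", str(num_recycle),
           "--num-models", str(num_models)]
    if msa_mode is not None:
        cmd += ["--msa-mode", msa_mode]
    return cmd + [fasta, out_dir]


# A high-stakes run: more recycles, all five checkpoints.
careful = colabfold_command("target.fasta", "out_careful", num_recycle=6)
# A fast triage run on a designed sequence: one checkpoint, no MSA.
triage = colabfold_command("design.fasta", "out_triage",
                           num_models=1, msa_mode="single_sequence")
print(shlex.join(careful))
print(shlex.join(triage))
```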

When to Pick Which

A simple decision rule that holds up in practice:

Situation	Recommended tool
Triaging a pool of designed binder sequences	ColabFold (single-sequence mode)
Folding 50 to 5,000 natural sequences for a survey	ColabFold
Predicting a target structure for binder design	ColabFold first, full AF2 if MSA is shallow
One definitive final-model prediction for a manuscript	Full-MSA AlphaFold 2
Orphan or highly diverged sequence with few homologs	Full-MSA AF2 (or AlphaFold 3)
Multimer prediction with paired chain MSAs	Either; ColabFold for speed, full AF2 for depth
Quick triage to decide if a sequence is worth investigating	ColabFold

The pattern across that table is consistent: ColabFold wins anywhere throughput matters and the sequence has reasonable evolutionary support. Full-MSA AlphaFold 2 wins when you need the deepest possible MSA on a single sequence or when the sequence is in a region of protein space where the MMseqs2 frontend struggles.
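
The table can be condensed into a small function. This is one possible encoding under stated assumptions, not a definitive rule: the function name, argument names, and the ordering of the checks are my own, and the depth cutoff of fifty is the rule of thumb from the recommendations later in this post.

```python
def recommend_tool(designed, msa_depth=None, final_manuscript_model=False):
    """Return a pipeline recommendation following the decision table.

    designed: True for de novo or heavily engineered sequences.
    msa_depth: number of sequences in the MMseqs2 MSA, if already known.
    final_manuscript_model: True for one definitive high-stakes prediction.
    """
    if designed:
        return "ColabFold (single-sequence mode)"
    if final_manuscript_model:
        return "Full-MSA AlphaFold 2"
    if msa_depth is not None and msa_depth < 50:
        return "Full-MSA AlphaFold 2 (or AlphaFold 3)"
    return "ColabFold"


print(recommend_tool(designed=True))
print(recommend_tool(designed=False, msa_depth=2000))
print(recommend_tool(designed=False, msa_depth=12))
print(recommend_tool(designed=False, final_manuscript_model=True))
```

The designed-sequence check comes first on purpose: for a de novo design a deeper MSA search does not help, so the other criteria never apply.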

A Note on AlphaFold 3 and Newer Models

AlphaFold 3, Boltz-1, and Boltz-2 expand the scope to nucleic acids, small molecules, and explicit handling of post-translational modifications. They are not direct replacements for the AlphaFold 2 weights; they are different models for different problems. For protein-only structure prediction, the AlphaFold 2 weights (whether fronted by jackhmmer or by MMseqs2) remain the gold standard for accuracy per GPU-hour.

If you need to fold a protein in the presence of a small-molecule ligand or with a glycan, Boltz-2 or AlphaFold 3 is the right choice. If you are folding a bare protein sequence and want the fastest path to a reliable structure, ColabFold is still it.

Practical Recommendations

A workflow that holds up across most design and analysis tasks:

Default to ColabFold for any sequence you want a structure for. The MMseqs2 frontend will work in the vast majority of cases.
Inspect pLDDT and MSA depth. If pLDDT is low across the model and the MSA is shallow (under fifty sequences), the issue is the MSA, not the network. Re-run with full-MSA AlphaFold 2 or AlphaFold 3.
For designed sequences, use single-sequence mode. Full MSA is the wrong input when there are no natural homologs.
Reserve full-MSA AlphaFold 2 for high-stakes single predictions. When a single structure is going into a paper, a patent, or a critical design decision, the extra wall-clock is cheap insurance.
Run a quick AlphaFold 3 cross-check if you have access to it and the prediction matters. AF3 plus ColabFold agreement is a strong signal.
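
One simple way to put a number on agreement between two predictions of the same sequence is a distance-matrix RMSD (dRMSD) over CA atoms, sketched below. This is a technique I am swapping in for illustration, not something the tools report: it compares all pairwise CA distances, so it needs no superposition, but it cannot tell a structure from its mirror image. The coordinates are toy values and `drmsd` is a hypothetical helper.

```python
import math
from itertools import combinations


def drmsd(coords_a, coords_b):
    """Distance-matrix RMSD between two equal-length lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of residues")
    if len(coords_a) < 2:
        raise ValueError("need at least two residues")
    squared = [
        (math.dist(coords_a[i], coords_a[j]) - math.dist(coords_b[i], coords_b[j])) ** 2
        for i, j in combinations(range(len(coords_a)), 2)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Toy CA traces: a straight segment, the same segment translated, and a bent one.
straight = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
shifted = [(x + 1.0, y + 2.0, z + 3.0) for x, y, z in straight]
bent = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]

print(round(drmsd(straight, shifted), 6))  # rigid shift: distances unchanged
print(round(drmsd(straight, bent), 3))     # different shape: larger value
```

Values near zero indicate the two models share the same internal geometry; what counts as close enough for a given target is a judgment call.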

You can run the ColabFold tool on tools.ranomics.com with no install; the same hosted environment runs ColabFold, ESMFold, and full AlphaFold 2 so you can compare outputs head to head.

Common Pitfalls

A few mistakes to avoid when picking between the two pipelines:

Using full MSA on a designed sequence. A shallow, contaminated MSA built by jackhmmer over distant homologs of a de novo design will degrade the prediction. Single-sequence ColabFold is the correct choice.
Comparing pLDDT across model checkpoints. Different model_1 through model_5 weights give slightly different pLDDT distributions on the same sequence. Pick one and stick with it for ranking within a batch.
Reading low MSA depth as a network failure. If ColabFold returns low pLDDT and the MMseqs2 hit count is in the tens, the MSA is the bottleneck. Falling back to full-MSA AlphaFold 2 or to AlphaFold 3 is the right next step.
Skipping templates when they help. PDB70 templates are optional in both pipelines. For sequences with close PDB homologs, including templates can pull marginal predictions over a confidence threshold. For de novo designed scaffolds with no PDB analogs, templates do nothing.
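
To avoid the checkpoint-mixing pitfall, filter to one checkpoint before ranking. The sketch below assumes each prediction has already been summarized as a record with a design name, the checkpoint used, and a mean pLDDT. The record layout, the helper name, and the numbers are all made up for illustration.

```python
def rank_within_checkpoint(records, checkpoint):
    """Keep only records from one checkpoint, sorted by mean pLDDT (highest first)."""
    same = [r for r in records if r["checkpoint"] == checkpoint]
    return sorted(same, key=lambda r: r["mean_plddt"], reverse=True)


# Toy batch; binder_02 was also run with a second checkpoint.
records = [
    {"design": "binder_01", "checkpoint": "model_1", "mean_plddt": 88.2},
    {"design": "binder_02", "checkpoint": "model_1", "mean_plddt": 91.5},
    {"design": "binder_02", "checkpoint": "model_3", "mean_plddt": 84.0},
    {"design": "binder_03", "checkpoint": "model_1", "mean_plddt": 76.9},
]

ranked = rank_within_checkpoint(records, "model_1")
print([r["design"] for r in ranked])
```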

Summary

ColabFold and AlphaFold 2 are the same model with two different ways of building the MSA that feeds the network. On natural sequences with reasonable evolutionary support, the MMseqs2 frontend gives equivalent accuracy at roughly tenfold higher throughput. On orphan sequences with shallow MSAs, the slower jackhmmer plus HHblits path is still better. For designed sequences, single-sequence ColabFold is the right tool and full MSA is the wrong one. The decision is rarely close once you know which regime your sequences sit in.

Related Ranomics services

ColabFold technology page: how the MMseqs2 frontend works and where it sits in our triage pipeline.
Binder Pilot: ranked hit list from your top folded designs, scoped for academic and seed-biotech teams.
AI Binder Sprint: multi-algorithm campaign with ColabFold sitting in the in-silico triage loop.
