Running ColabFold Locally: A Practical Guide

ColabFold ships in three forms: the original Google Colab notebook, a self-hosted install called LocalColabFold, and various hosted GPU services that wrap one of the two. Choosing between them is rarely about whether they produce the same output (given the same settings they do, since the underlying AlphaFold 2 weights and MMseqs2 frontend are identical). It is about who pays the maintenance cost and whether your throughput, privacy, and GPU constraints fit the chosen environment.

This post lays out what each path actually costs, where it breaks, and how to pick.

Option 1: The Original Google Colab Notebook

The ColabFold notebook (the AlphaFold2.ipynb and AlphaFold2_advanced.ipynb files in the sokrypton/ColabFold repository) is what put the project on the map. It runs in a free Google Colab session with no install, and a single sequence typically folds in about 5 to 10 minutes end to end.

Strengths:

Zero setup. Click the notebook, paste a sequence, run.
Free GPU access through the standard Colab tier (a T4 most of the time, sometimes a V100).
Always pinned to a recent ColabFold release.
Good for teaching, one-off folds, and a first look.

Weaknesses:

Colab session timeouts. Free Colab disconnects after about 90 minutes of inactivity or 12 hours of total runtime, whichever comes first. Long sequence sets get killed mid-batch.
GPU memory ceiling. Free-tier T4s have 16 GB VRAM. Multimer predictions on long chains can OOM.
No persistent storage. Outputs live in the Colab VM filesystem and disappear when the session ends. You need to manually download every PDB, every PAE plot, every JSON file.
Shared queue. Free Colab has no SLA; allocation can vary by hour of day.
Cannot be scripted from outside Colab without contortions.
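
The manual-download chore in the list above can be reduced to a single file. The sketch below uses only the Python standard library to zip a result directory; the function name and paths are hypothetical, and it assumes the archive is written outside the directory being zipped.

```python
import shutil
from pathlib import Path


def bundle_results(result_dir: str, archive_stem: str) -> Path:
    """Zip everything under result_dir into <archive_stem>.zip and return its path."""
    src = Path(result_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"no such result directory: {src}")
    # make_archive appends ".zip" itself and returns the final archive path.
    # Keep archive_stem outside result_dir so the zip does not try to include itself.
    return Path(shutil.make_archive(archive_stem, "zip", root_dir=src))
```

Inside a Colab session the resulting zip can then be downloaded once, or copied to a mounted drive, instead of fetching each PDB, plot, and JSON file by hand.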

A reasonable upper bound for the notebook path is around 20 to 50 sequences per session before the friction of session timeouts and manual downloads pushes you toward LocalColabFold or a hosted alternative.
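
If you do push a larger set through the notebook, splitting it into session-sized batches keeps a timeout from costing more than one batch. This is a minimal sketch with a hand-rolled FASTA reader; the helper names are hypothetical and the batch size is whatever you judge safe for one session.

```python
from typing import Iterator, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(text: str) -> List[Record]:
    """Parse FASTA text into (header, sequence) pairs, joining wrapped lines."""
    records: List[Record] = []
    header, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def batches(records: List[Record], size: int) -> Iterator[List[Record]]:
    """Yield consecutive slices of at most `size` records."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(records), size):
        yield records[start:start + size]


def to_fasta(records: List[Record]) -> str:
    """Serialize records back to FASTA text, one sequence per line."""
    return "".join(f">{header}\n{seq}\n" for header, seq in records)
```

Writing each batch to its own file with to_fasta gives one input per session, so a disconnect only forces a rerun of the batch that was in flight.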

Option 2: LocalColabFold

LocalColabFold is the same ColabFold codebase packaged for installation on a local Linux or macOS machine. The install script downloads the AlphaFold 2 weights, sets up a conda environment, and gives you a colabfold_batch command-line tool you can run on your own GPU.

Strengths:

No session timeouts. Run as many sequences as your storage holds.
Local data. Inference runs on your own hardware, and with local MSA generation (or single-sequence mode) sequences never leave your machine, which matters for IP-sensitive design work.
Scriptable. colabfold_batch input.fasta output_dir/ slots into any pipeline.
Full control over recycles, model choice, MSA mode, and template usage.
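
As a rough illustration of the scriptable and full-control points, the sketch below assembles a colabfold_batch command line from Python. The wrapper function and its defaults are hypothetical; the flags shown (--num-recycle, --msa-mode, --model-type, --templates) should be checked against colabfold_batch --help for your installed version before relying on them.

```python
from typing import List


def colabfold_cmd(
    fasta: str,
    out_dir: str,
    num_recycle: int = 3,
    msa_mode: str = "mmseqs2_uniref_env",
    model_type: str = "auto",
    use_templates: bool = False,
) -> List[str]:
    """Build an argument list for colabfold_batch without running it."""
    cmd = [
        "colabfold_batch",
        "--num-recycle", str(num_recycle),
        "--msa-mode", msa_mode,
        "--model-type", model_type,
    ]
    if use_templates:
        cmd.append("--templates")
    # Positional arguments: input FASTA (or directory), then output directory.
    cmd += [fasta, out_dir]
    return cmd


# To launch the job from a pipeline:
#   import subprocess
#   subprocess.run(colabfold_cmd("input.fasta", "output_dir/"), check=True)
```

Building the argument list separately from running it makes the settings easy to log next to the outputs, which helps when comparing runs later.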

Weaknesses:

GPU required. An NVIDIA GPU with at least 16 GB VRAM (an RTX 3090, RTX 4090, A4000, A5000, A6000, A100, or H100) is the practical floor. Multimers on long chains want 24 GB or more.
CUDA stack. The install pins specific JAX, CUDA, and cuDNN versions. Driver mismatches are the most common install failure.
Maintenance. ColabFold and JAX both move quickly. A six-month-old install will accumulate dependency drift.
MSA traffic. By default, LocalColabFold still sends each query to the public ColabFold MMseqs2 server (api.colabfold.com), a shared community resource, to build the MSA. For very high-volume use, the polite path is to spin up your own MMseqs2 server or generate MSAs offline against locally downloaded databases.
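
Before installing, it is worth confirming that the GPU clears the VRAM floor described above. The sketch below parses the output of nvidia-smi --query-gpu=name,memory.total --format=csv,noheader,nounits, which reports memory in MiB. The helper names and the 1 GiB tolerance are assumptions; the tolerance is there because cards usually report slightly less than their nominal capacity.

```python
import subprocess
from typing import List, Tuple

QUERY = ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"]


def parse_gpu_csv(text: str) -> List[Tuple[str, int]]:
    """Turn 'name, memory_mib' lines into (name, memory_mib) tuples."""
    gpus = []
    for line in text.strip().splitlines():
        name, mem = line.rsplit(",", 1)
        gpus.append((name.strip(), int(mem.strip())))
    return gpus


def meets_floor(memory_mib: int, floor_gb: int = 16) -> bool:
    """True if reported memory is within 1 GiB of the nominal floor."""
    return memory_mib / 1024 >= floor_gb - 1


def local_gpus() -> List[Tuple[str, int]]:
    """Query the local driver; raises if nvidia-smi is missing or fails."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_csv(out.stdout)
```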

A clean LocalColabFold install on a modern Linux box takes about an hour for someone comfortable with conda and CUDA, and longer if the box has any unusual driver setup. Once installed, it runs unattended overnight; the only manual step is monitoring disk usage as PDB files accumulate.
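
The disk check can be automated with the standard library. This is a minimal sketch; the function name and the 50 GB default threshold are arbitrary assumptions to adjust for your own output volume.

```python
import shutil


def low_on_disk(path: str = ".", min_free_gb: float = 50.0) -> bool:
    """Return True when free space on the filesystem holding `path` drops below the threshold."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb < min_free_gb
```

Calling this between batches, and pausing or alerting when it returns True, is usually enough to avoid a run dying halfway through writing results.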

Option 3: A Hosted GPU Service

The third path is a hosted ColabFold endpoint that runs the same colabfold_batch binary on a dedicated GPU and gives you an interface or API for submitting sequences. The Ranomics ColabFold tool at tools.ranomics.com is one such option; there are others, including AWS HealthOmics, NVIDIA BioNeMo, and various academic group instances.

Strengths:

No install or maintenance. Someone else handles JAX upgrades, GPU drivers, and the MSA backend.
Persistent storage. Outputs are saved against your account, with PAE plots and pLDDT colored PDB files available for download.
Predictable GPU. A dedicated A100 or H100 cuts a typical fold to 1 to 2 minutes versus 5 to 10 on a free Colab T4.
Batch submission. Drop a multi-sequence FASTA, get a results page.
Free tier available. Most hosted offerings let you fold a handful of sequences before requiring credits.

Weaknesses:

Data leaves your machine. For IP-sensitive sequences, you are trusting the host’s privacy policy and infrastructure.
Cost at scale. Hosted GPU time is sold per credit or per second. For a campaign folding tens of thousands of sequences, owning the GPU is eventually cheaper.
Less direct control over inference knobs, though most hosts expose the standard recycle and model-choice settings.

A Quick Comparison

Concern	Google Colab notebook	LocalColabFold	Hosted GPU
Setup time	Zero	About 1 hour (clean Linux box)	Sign-up only
Per-fold wall clock	5 to 10 min	1 to 5 min (own GPU)	1 to 2 min
Session limits	12 h free, 24 h Pro	None	Per-job, usually generous
GPU memory ceiling	16 GB (T4) free	Whatever you bought	40 to 80 GB (A100 / H100)
Data privacy	Google Colab terms	Your machine, your terms	Host’s policy
Throughput ceiling	About 20 to 50 sequences	Limited by GPU and disk	Whatever you pay for
Maintenance	None	You	Someone else

The right answer depends on volume and privacy. A graduate student folding one design a week should use the notebook. A team running quarterly campaigns on a few hundred sequences should pick LocalColabFold if they have a 24 GB+ GPU sitting around, or a hosted service otherwise. A company running continuous design pipelines at thousands of folds per week needs either dedicated hardware or a hosted plan sized accordingly.
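
The rule of thumb in the paragraph above can be written down as a small function. The numeric cutoffs here are assumptions chosen to match the examples in the text (roughly one fold a week at the low end, thousands per week at the high end), and treating any sensitive sequence as a reason to stay local is a simplification.

```python
def suggest_path(
    folds_per_month: int,
    has_24gb_gpu: bool,
    sequences_are_sensitive: bool = False,
) -> str:
    """Map volume, hardware, and privacy onto one of the three paths (illustrative only)."""
    if sequences_are_sensitive:
        # Privacy overrides convenience: keep inference on your own machine.
        return "LocalColabFold"
    if folds_per_month <= 5:
        return "Colab notebook"
    if folds_per_month >= 4000:
        return "dedicated hardware or a sized hosted plan"
    return "LocalColabFold" if has_24gb_gpu else "hosted service"
```

For example, suggest_path(300, has_24gb_gpu=False) lands on a hosted service, matching the quarterly-campaign case above.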

LocalColabFold Install in Practice

If you decide on LocalColabFold, the install path is documented in the YoshitakaMo/localcolabfold repository. Three details that catch most first-time installers:

Use a fresh conda environment. Mixing the ColabFold environment into an existing JAX or PyTorch environment almost always breaks one of the two.
Check the CUDA major version. The install script wants a specific JAX-CUDA pair. If the JAX build targets CUDA 12.x but the system driver only supports CUDA 11.x, the result is a confusing error at first run.
Pre-download the AlphaFold 2 weights once. The weights are about 4 GB. The installer downloads them automatically, but if you are setting up multiple machines or a cluster, fetch the weights once and mount them.

After install, a sanity test on a single sequence should complete in 1 to 5 minutes on an A100 or RTX 4090, including the MMseqs2 MSA call. If the first run takes 30 minutes, something is off: most often the MSA request is queued or rate-limited on the server, or JAX did not find the GPU and is running on CPU. Either is worth diagnosing before you launch a batch.
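
One quick check before timing a fold is whether JAX sees the GPU at all, since a CPU-only backend would also explain a very slow first run. The sketch below assumes it is executed with the Python interpreter from the LocalColabFold environment; the function name is hypothetical, and jax.default_backend() is the only JAX call used.

```python
def jax_backend() -> str:
    """Return the platform JAX will run on, or a marker string if JAX is missing."""
    try:
        import jax
    except ImportError:
        return "jax-not-installed"
    # Returns "cpu" when no accelerator was found.
    return jax.default_backend()


if jax_backend() == "cpu":
    print("JAX is running on CPU; check the driver and the JAX/CUDA pairing.")
```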

When the Hosted Path Pays for Itself

The break-even between LocalColabFold and a hosted service usually comes down to three numbers: GPU cost, your time, and how much you fold.

A consumer-grade RTX 4090 costs about 1,800 USD and gives 24 GB VRAM, which is enough for most monomer folds and many multimers. Amortized over two years that is about 75 USD per month before electricity and maintenance time. If you fold fewer than a few hundred sequences a month, hosted credits are likely cheaper. If you fold thousands per month, owning the GPU pays back inside a year, though only if your time installing and maintaining the stack is effectively free.
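
The arithmetic behind that estimate is simple enough to rerun with your own numbers. In the sketch below the GPU price and amortization period come from the paragraph above, while the hosted price per fold is a purely hypothetical placeholder; substitute your provider's real pricing. Electricity and maintenance time are ignored.

```python
def monthly_gpu_cost(price_usd: float, months: int) -> float:
    """Straight-line amortization of the hardware purchase."""
    return price_usd / months


def break_even_folds(monthly_cost_usd: float, hosted_usd_per_fold: float) -> float:
    """Folds per month at which hosted spend equals the amortized GPU cost."""
    return monthly_cost_usd / hosted_usd_per_fold


gpu_monthly = monthly_gpu_cost(1800, 24)  # 75.0 USD per month
# 0.25 USD per fold is an assumed figure for illustration, not a quoted price.
threshold = break_even_folds(gpu_monthly, hosted_usd_per_fold=0.25)
```

Below the threshold, hosted credits cost less than the amortized card; above it, owning the GPU starts to win, provided the maintenance time really is free.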

For most academic and seed-stage industrial groups, the math favors hosted services until the campaign size justifies a dedicated machine and a postdoc to keep it running.

What ColabFold Will Not Solve

A common misread of ColabFold is to treat it as a sequence-to-truth oracle. It is not. The same caveats that apply to AlphaFold 2 apply to ColabFold:

The model predicts a single plausible structure. Conformational dynamics, induced fit, and large-scale flexibility are mostly invisible to a single fold.
Low pLDDT regions are usually disordered or poorly predicted. Treat them as low-confidence regardless of which pipeline produced them.
Predicted structures of de novo designed sequences require single-sequence mode and a self-consistency check. A confident fold on a design is necessary but not sufficient for binding.
Bound conformations, ligand-induced states, and PTM-dependent folds are out of scope. AlphaFold 3 or Boltz-2 cover some of these cases.
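
To act on the pLDDT caveat, it helps to summarize confidence per model rather than eyeball the colored structure. AlphaFold-style PDB outputs, including ColabFold's, store per-residue pLDDT in the B-factor column, so a fixed-column parse is enough. The sketch below is a minimal reader, not a full PDB parser; the function name is hypothetical and the cutoff of 70 is a common convention rather than a hard rule.

```python
from typing import Tuple


def plddt_summary(pdb_text: str, cutoff: float = 70.0) -> Tuple[float, float]:
    """Return (mean pLDDT, fraction of residues below cutoff) using CA atoms only."""
    scores = []
    for line in pdb_text.splitlines():
        # PDB fixed columns: atom name in 13-16, B-factor (pLDDT here) in 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    if not scores:
        raise ValueError("no CA atoms found")
    mean = sum(scores) / len(scores)
    frac_low = sum(score < cutoff for score in scores) / len(scores)
    return mean, frac_low
```

A model with a high mean but a large low-confidence fraction deserves a closer look at where those residues sit before it goes into a ranked list.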

Picking ColabFold over AlphaFold 2 is a throughput decision. Picking ColabFold over wet-lab validation is a different decision and a much harder one.

Summary

The three paths to running ColabFold are not equivalent. The Google Colab notebook is for one-offs and learning. LocalColabFold is for groups with a dedicated GPU and the bandwidth to keep the stack patched. Hosted GPU services are for everyone in between, and for anyone who would rather spend campaign budget on wet-lab validation than on devops. Pick on volume, privacy, and maintenance bandwidth; the output PDBs are interchangeable.

You can run the ColabFold tool at tools.ranomics.com with a free account, no install, and a dedicated A100. It is the same colabfold_batch binary that LocalColabFold installs locally.

ColabFold technology page: how the MMseqs2 frontend works and where ColabFold sits in a design campaign.
Binder Pilot: ranked hit list from your top folded designs, scoped for academic and seed-biotech teams.
AI Binder Sprint: multi-algorithm campaign with ColabFold sitting in the in-silico triage loop.
