Document and wire the weight-caching mechanism:
- modal.Volume is a cloud-backed filesystem that persists independently of
  the GPU container; the first run downloads weights into /weights and later
  runs reuse them, so no GPU time is spent re-downloading.
- Point downloaders at the mount: HF_HOME/TORCH_HOME/boltz --cache; persist
  new files via weights.commit() and pick up committed changes via
  weights.reload().
- Volume storage is billed separately from GPU time and costs pennies, so
  caching is nearly free.
modal_app.py cofold(): set the cache env vars to /weights, then call
weights.reload() before and weights.commit() after the (stubbed) boltz call.
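
The reload/commit pattern above can be sketched roughly as follows. This is
an illustration, not the code in modal_app.py: `cached_weights`, `cache_env`,
and `boltz_command` are hypothetical helper names, the `hf`/`torch`
subdirectories are an assumed layout, and the command line assumes the
`boltz predict` CLI with its `--cache` option. The `volume` argument is any
object with `reload()` and `commit()` methods, such as a modal.Volume.

```python
import os
from contextlib import contextmanager

WEIGHTS_DIR = "/weights"  # mount point of the weights Volume


def cache_env(root=WEIGHTS_DIR):
    """Env vars that point the downloaders at the mounted Volume.

    The hf/ and torch/ subdirectories are an assumed layout, not a requirement.
    """
    return {"HF_HOME": f"{root}/hf", "TORCH_HOME": f"{root}/torch"}


def boltz_command(input_path, root=WEIGHTS_DIR):
    """Build a boltz invocation that keeps its weights on the Volume (assumed CLI shape)."""
    return ["boltz", "predict", input_path, "--cache", root]


@contextmanager
def cached_weights(volume, root=WEIGHTS_DIR, environ=os.environ):
    """Reload the Volume, set cache env vars, and commit only if the body succeeds."""
    volume.reload()                    # see weights committed by earlier runs
    environ.update(cache_env(root))    # downloaders now write under the mount
    yield root
    volume.commit()                    # skipped if the body raised, so no partial cache is persisted
```

Inside cofold() this would wrap the boltz call, for example
`with cached_weights(weights) as root: subprocess.run(boltz_command(path, root), check=True)`.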
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
docs/gpu_plan.md: cost-efficient plan for running AF3-class co-folding
(Boltz-2/DiffDock) on a GPU, then paying nothing when idle.
- Key insight: structure-track data is tiny (MB of PDBs/SMILES); only the
GPU + model weights are heavy -> serverless is ideal.
- Recommend Modal (per-second billing, scales to zero = nothing to kill);
RunPod as the SSH-box alternative with idle auto-terminate.
- Lifecycle: image -> weights Volume (cache, don't re-download) -> run ->
git push small results -> teardown automatic.
- Phase 1: validate on 3 known binders (~$1) before paying for a screen;
  run Boltz-2 (affinity) on an L4/A10 (24-48GB); estimated total ~$5-15.
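
The per-second billing argument in the list above reduces to simple
arithmetic, sketched here. The function names are hypothetical and the
hourly rate is a parameter, not a quoted price; substitute the provider's
current rate.

```python
def billed_cost(active_seconds, usd_per_hour):
    """Cost under per-second billing: only seconds the GPU is running are charged."""
    return active_seconds * usd_per_hour / 3600.0


def always_on_cost(wall_hours, usd_per_hour):
    """Cost of a box left running for the whole wall-clock period, idle or not."""
    return wall_hours * usd_per_hour
```

With scale-to-zero, idle time contributes zero active seconds, so the cost
tracks the work done and not the time the project stays open.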
gpu/modal_app.py: Modal app skeleton (image, weights volume, GPU cofold()
function, local entrypoint); boltz invocation stubbed with TODOs.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>