Commit Graph

4 Commits

Author SHA1 Message Date
07705a5884 GPU Phase 1: co-fold cofactors/metals (the binding-mode determinants)
Add metal/cofactor handling to the Boltz-2 YAML as CCD ligand entries, covering
the binding modes classical docking couldn't model:
- HDAC2 + catalytic Zn (vorinostat chelates it)
- PKR + FBP + Mg (allosteric activator + metal)
- hemoglobin + heme
The same cofactors are present when co-folding negatives into a target, so
the comparison stays fair.

build_boltz_yaml() gains a cofactor_ccds arg (emits `ligand: {ccd: ...}`
entries); TARGETS carries per-target cofactors; cofold()/main() thread them
through. Verified locally: YAML builds correctly with Zn / FBP+Mg.
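
A minimal sketch of how a `cofactor_ccds` argument could emit the extra
`ligand: {ccd: ...}` entries. This is not the repository's actual
build_boltz_yaml(): the chain-ID scheme (A = protein, B = affinity ligand,
C onward = cofactors), the hand-built YAML text, and the placeholder sequence
are all assumptions made for illustration.

```python
def build_boltz_yaml(sequence, smiles, cofactor_ccds=()):
    """Return Boltz-style YAML text: protein A, ligand B, then one CCD ligand per cofactor."""
    lines = [
        "version: 1",
        "sequences:",
        "  - protein:",
        "      id: A",
        f"      sequence: {sequence}",
        "  - ligand:",
        "      id: B",
        f"      smiles: '{smiles}'",
    ]
    # Each cofactor becomes its own ligand entry, identified by CCD code.
    # Chain IDs C, D, ... are an assumed naming scheme.
    for offset, ccd in enumerate(cofactor_ccds):
        chain_id = chr(ord("C") + offset)
        lines += ["  - ligand:", f"      id: {chain_id}", f"      ccd: {ccd}"]
    # Affinity is requested for the small-molecule binder only, not the cofactors.
    lines += ["properties:", "  - affinity:", "      binder: B"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder sequence (not a real target); vorinostat SMILES; catalytic Zn as cofactor.
    print(build_boltz_yaml("MKTAYIAKQR", "ONC(=O)CCCCCCC(=O)Nc1ccccc1", ["ZN"]))
```

Passing `["FBP", "MG"]` instead would add two cofactor entries (chains C and D
under this scheme), which corresponds to the PKR case above.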

Known limitation: Hb's voxelotor site sits at the tetramer centre and the
binding is covalent (Schiff base), so single-chain + heme only approximates
it. HDAC2 (Zn) and PKR (cofactor) are the real co-folding tests. Ready for
`modal run gpu/modal_app.py`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 17:16:06 +02:00
4022c0cb94 GPU Phase 1 runnable: real Boltz-2 co-folding + alignment review
Flesh out the Modal app into a runnable Phase-1 positive-control test and
reconcile it with the plan:
- cofold() GPU fn: build Boltz-2 YAML (protein+ligand+affinity), run
  `boltz predict --use_msa_server --cache /weights/boltz`, parse affinity
  JSON + predicted pose; weights persist via Volume.
- Local helpers (CPU, import-tested against our PDBs): binding_chain_sequence
  (gemmi -- correctly picks the binding chain, e.g. alpha-globin for 5E83),
  pubchem_smiles, build_boltz_yaml, fetch_pdb (RCSB).
- main(): fan out cofold.starmap over 3 targets x (known binder + 2
  negatives); tabulate; PASS if known binder has top P(binder) for its target.

Alignment fixes:
- Rank by P(binder) (higher=better), NOT raw affinity_pred_value whose sign
  (~log IC50) is version-dependent -- avoids a backwards positive-control test.
- gpu_plan.md Phase 1 updated to affinity/P(binder) ranking; pose-RMSD noted
  as a later refinement (needs receptor superposition).
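
The PASS rule described above (known binder has the top P(binder) for its
target) could be checked roughly as follows. This is an illustrative sketch,
not the code in main(): the row layout, the `p_binder` key, the function name,
and the example numbers are all hypothetical, and it assumes P(binder) has
already been read out of the affinity JSON.

```python
def positive_control_verdicts(results, known_binders):
    """Map each target to True if its known binder has the highest P(binder).

    results: list of dicts with keys 'target', 'ligand', 'p_binder' (assumed layout).
    known_binders: dict mapping target name -> known binder name.
    """
    verdicts = {}
    for target, binder in known_binders.items():
        rows = [r for r in results if r["target"] == target]
        # Higher P(binder) is better, so the top-ranked ligand is the max.
        top = max(rows, key=lambda r: r["p_binder"])
        verdicts[target] = top["ligand"] == binder
    return verdicts


if __name__ == "__main__":
    # Made-up numbers for illustration only; not results from any run.
    demo = [
        {"target": "HDAC2", "ligand": "vorinostat", "p_binder": 0.9},
        {"target": "HDAC2", "ligand": "negative_1", "p_binder": 0.3},
        {"target": "HDAC2", "ligand": "negative_2", "p_binder": 0.2},
    ]
    print(positive_control_verdicts(demo, {"HDAC2": "vorinostat"}))
```

Ranking on a probability keeps the comparison direction unambiguous, which is
the point of avoiding the raw affinity value here.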

Local half verified (sequence/SMILES/YAML); cofold() needs a live `modal run`
(account + `modal token new`) to validate end-to-end.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 16:56:27 +02:00
81d56b7a76 GPU plan: make weight persistence concrete (Modal Volume cache)
Document and wire the weight-caching mechanism:
- modal.Volume is a cloud-backed FS independent of the GPU/container;
  run 1 downloads weights into /weights, run 2+ reuses them (no GPU time
  wasted re-downloading).
- Point downloaders at the mount: HF_HOME/TORCH_HOME/boltz --cache; persist
  via weights.commit(), see updates via weights.reload().
- Volume storage costs pennies, separate from GPU = near-free caching.

modal_app.py cofold(): set cache env vars to /weights, reload()/commit()
around the (stubbed) boltz call.
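
A rough sketch of pointing the downloaders at the mount. The `hf` and `torch`
subdirectory names and both helper names are assumptions, not taken from
modal_app.py; the `boltz predict` flags are the ones quoted elsewhere in this
log. Inside the Modal function, weights.reload() would run before this and
weights.commit() after it.

```python
import os

WEIGHTS_MOUNT = "/weights"  # mount point of the Volume inside the container


def cache_env(mount=WEIGHTS_MOUNT):
    """Environment variables that send Hugging Face and torch caches to the Volume."""
    # Subdirectory names are hypothetical; any stable path under the mount works.
    return {"HF_HOME": f"{mount}/hf", "TORCH_HOME": f"{mount}/torch"}


def boltz_command(yaml_path, mount=WEIGHTS_MOUNT):
    """Argument list for a boltz run whose weight cache lives on the Volume."""
    return ["boltz", "predict", yaml_path, "--use_msa_server", "--cache", f"{mount}/boltz"]


if __name__ == "__main__":
    # Apply the cache locations to this process, then show the command that would run.
    os.environ.update(cache_env())
    print(" ".join(boltz_command("input.yaml")))
```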

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 16:48:50 +02:00
08ed713cc8 GPU plan: ephemeral serverless co-folding (Modal) + app skeleton
docs/gpu_plan.md: cost-efficient plan for running AF3-class co-folding
(Boltz-2/DiffDock) on a GPU, then paying nothing when idle.
- Key insight: structure-track data is tiny (MB of PDBs/SMILES); only the
  GPU + model weights are heavy -> serverless is ideal.
- Recommend Modal (per-second billing, scales to zero = nothing to kill);
  RunPod as the SSH-box alternative with idle auto-terminate.
- Lifecycle: image -> weights Volume (cache, don't re-download) -> run ->
  git push small results -> teardown automatic.
- Phase 1 validate on 3 known binders (~$1) before paying for a screen;
  Boltz-2 (affinity) on an L4/A10 (24-48GB); est total ~$5-15.

gpu/modal_app.py: Modal app skeleton (image, weights volume, GPU cofold()
function, local entrypoint); boltz invocation stubbed with TODOs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 16:45:04 +02:00