Commit Graph

6 Commits

Author SHA1 Message Date
9efdc0acf1 Pose-RMSD confirms HDAC2/vorinostat geometry: 0.21A (Vina was 7.9A)
Close the §12.4 validation loop. scripts/pose_rmsd.py superposes the
Boltz-2-predicted HDAC2 onto crystal 4LXZ, transforms the predicted
ligand, and scores pose RMSD (spyrmsd, in-place):
- protein fold: Ca RMSD 0.14A over 366 residues
- vorinostat pose: 0.21A (crystal-accurate) vs Vina 7.9A on this exact
  Zn-chelation case
- catalytic Zn ion: 2.73A off (ligand perfect, metal slightly less)

HDAC2 now validated on BOTH affinity (P(binder)=0.999) and geometry
(0.21A). The structure-binding modality is comprehensively validated on
its decisive metal-coordination case. Commits the predicted complex as
evidence (docs/results/HDAC2_vorinostat_pred.pdb).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 19:50:33 +02:00
c891a78541 Phase 1 co-folding WORKS: HDAC2/Zn validated (Boltz-2 on Modal)
First clear positive result in the project. Ran Phase 1 on Modal L4
(~$0.70). Boltz-2 P(binder), cofactors co-folded:
- HDAC2 (+Zn): vorinostat 0.9994 vs negatives ~0.1 -> PASS, decisive
- hemoglobin (+heme): voxelotor 0.46 -> PASS (weak; covalent/tetramer)
- PKR (+FBP/Mg): mitapivat 0.32 < hydroxyurea 0.40 -> FAIL (allosteric)

HDAC2/Zn is the exact case classical Vina failed (no metal term, 7.9A
redock). Co-folding handles the Zn-chelation chemistry -> the structure-
binding modality pivot (PLAN §12) is validated on its decisive test.

Engineering fixes that got it running: image needs cuequivariance kernels;
max_containers=1 so weights download once (parallel corrupted the shared-
Volume checkpoint); rank by P(binder) not affinity_pred_value (sign).

Adds docs/results/phase1_affinity.csv (committed; raw under data/ gitignored).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 19:41:19 +02:00
7c6cef1aef Production docking: prep helps (7.9->4.8A) but Vina wrong tool for sickle
§12.4 pushed to its limit. Meeko ligand prep + in-place symmetry RMSD
(spyrmsd, not obrms) on clean HDAC2/vorinostat: 7.9A -> 4.76A. Prep and
metric mattered, but still FAIL.

Residual cause is fundamental: vorinostat binds via hydroxamate-Zn
chelation and Vina has no metal-coordination term. Real finding: sickle's
druggable targets bind via non-classical chemistry classical docking
handles poorly -- Hb (covalent), PKR (allosteric+cofactor), HDAC (Zn
chelation). Vina is the wrong tool for this target landscape.

Redirect: data-driven AF3-class co-folding (Boltz-2/Chai-1/DiffDock)
handles these modes -- the indicated next tool, gated by the 24GB local
memory ceiling (cloud GPU needed). The "GPU breaks all-local" §12.6
prediction is now the binding constraint of the track.

Adds: scripts/dock_production.py; deps meeko, spyrmsd, gemmi.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 16:38:54 +02:00
51bd90df41 Redocking-RMSD validation fails 3/3: pipeline-quality issue
§12.4 de-biased validation (scripts/dock_validate.py).
Redock each co-crystal ligand into its own structure, RMSD vs crystal:
- voxelotor->Hb: NA (covalent binder, out of scope §12.7)
- mitapivat->PKR: 8.2A (allosteric, cofactors stripped)
- vorinostat->HDAC2 (4LXZ, zinc kept): 7.9A -- a CLASSICAL target that
  should have worked

The clean target also failing => systematic pipeline-quality problem,
not target choice. Cheap Vina + open-babel prep gives scores but doesn't
reproduce known geometry, so affinities aren't trustworthy. Ligand
efficiency over-corrects (ranks tiny hydroxyurea best). Fix needs
production prep (Meeko/AutoDockTools prepare_receptor + reduce) and an
in-place RMSD metric. Consistent with the project theme: the quick
version of every method runs but fails honest validation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 07:28:47 +02:00
75f5383961 Docking baseline: toolchain solved, raw affinity is size-biased
§12.3-12.4 first binding result on ARM Mac.
- Toolchain SOLVED: AutoDock Vina 1.2.5 mac binary (Rosetta) + open-babel
  (brew). No conda, no MLX. dock_positive_controls.py runs end-to-end.
- Cross-dock known binders + negatives into Hb (5E83) and PKR (8XFD),
  box centered on co-crystal ligands (5L7=voxelotor, WV2=mitapivat).

Finding: raw Vina affinity ranks almost perfectly by MOLECULAR SIZE
(mitapivat > voxelotor > decitabine/caffeine > hydroxyurea) in both
pockets — mitapivat wins even on hemoglobin it doesn't target. Raw score
can't distinguish target-specific binding: the docking analog of the
connectivity specificity problem. Next: redocking-RMSD validation +
ligand-efficiency normalization.

Note: machine is 24GB (not 96GB per PLAN §2), capping local AF3-class
inference. tools/ gitignored (vina binary).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 00:03:00 +02:00
817bcda7dc Structure-binding track: scaffold + ligand-retrieval baseline
Start the structure-based binding branch (PLAN §12), baseline-first.

- src/binding.py: validated RDKit ligand retrieval (morgan_fp, tanimoto,
  retrieve_nearest = the §12.9 engine) + dock() stub documenting the
  blocked ARM-Mac toolchain
- scripts/binding_ligand_baseline.py: 300 drugs vs known binders
- docs/structure_binding_notes.md: status, toolchain blocker, next steps
- pyproject: [structure] extra (rdkit); data/raw/structures/ for PDBs

Step-0 finding: retrieval engine VALIDATED on in-set classes
(decitabine->azacitidine 0.62; vorinostat->scriptaid/belinostat) but the
distinctive binders voxelotor/mitapivat have no analog in our 300-drug
set (Tanimoto ~0.2). Needs (a) bigger library, (b) real docking (§12.3),
which is blocked on the ARM-Mac docking toolchain (§12.6 pitfall 4).
Structures 5E83 (Hb+voxelotor) and 8XFD (PKR+mitapivat) fetched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 23:53:27 +02:00