Junior B. 51bd90df41 Redocking-RMSD validation fails 3/3: pipeline-quality issue

§12.4 de-biased validation (scripts/dock_validate.py).
Redock each co-crystal ligand into its own structure, RMSD vs crystal:
- voxelotor->Hb: NA (covalent binder, out of scope §12.7)
- mitapivat->PKR: 8.2A (allosteric, cofactors stripped)
- vorinostat->HDAC2 (4LXZ, zinc kept): 7.9A -- a CLASSICAL target that
  should have worked

The clean target also failing => systematic pipeline-quality problem,
not target choice. Cheap Vina + open-babel prep gives scores but doesn't
reproduce known geometry, so affinities aren't trustworthy. Ligand
efficiency over-corrects (ranks tiny hydroxyurea best). Fix needs
production prep (Meeko/AutoDockTools prepare_receptor + reduce) and an
in-place RMSD metric. Consistent with the project theme: the quick
version of every method runs but fails honest validation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-24 07:28:47 +02:00


Structure-based binding track — working notes

Branch structure-based-binding. Implements PLAN §12. Baseline-first, start with the two cleanest targets (Hemoglobin + PKR), de-risk the harness before scaling.

Status (2026-06-23)

Toolchain check (PLAN §12.6 pitfall 4, confirmed real):

✅ RDKit installs on ARM Mac — ligand side ready.
❌ AutoDock Vina does NOT pip-install on ARM Mac; no docking binary available. Docking (§12.3) is blocked on toolchain — must resolve via conda/micromamba (vina/smina), a GPU AF3-class model (Boltz-2/Chai-1/DiffDock), or an x86 Vina binary under Rosetta.
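A toolchain check like the one above can be scripted so it is repeatable on a new machine. This is a minimal sketch using only the standard library; the module and binary names in the defaults are assumptions about what this track might need, and a binary kept outside PATH (such as a local tools/ directory) would show up as missing here.

```python
import importlib.util
import shutil


def check_toolchain(modules=("rdkit", "vina", "meeko"),
                    binaries=("vina", "smina", "obabel", "obrms")):
    """Report which Python modules and command-line tools are available.

    The default names are illustrative assumptions, not a fixed requirement list.
    """
    report = {}
    for name in modules:
        # find_spec returns None when a top-level module cannot be imported.
        report[f"module:{name}"] = importlib.util.find_spec(name) is not None
    for name in binaries:
        # shutil.which returns None when the executable is not on PATH.
        report[f"binary:{name}"] = shutil.which(name) is not None
    return report


if __name__ == "__main__":
    for item, found in sorted(check_toolchain().items()):
        print(f"{'OK     ' if found else 'MISSING'}  {item}")
```

Running it before each docking session makes a silently missing dependency visible early rather than mid-pipeline.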

Structures obtained: 5E83 (hemoglobin + voxelotor), 8XFD (PKR + mitapivat) in data/raw/structures/.

Step 0 — ligand-based retrieval baseline (scripts/binding_ligand_baseline.py): RDKit Tanimoto of our 300 drugs vs known sickle binders.

Engine VALIDATED on in-set classes: decitabine→azacitidine (0.62); vorinostat→scriptaid (0.42), belinostat (0.28). Correctly clusters DNMT1 / HDAC HbF-inducers.
But voxelotor / mitapivat have no analog in our set (max Tanimoto ~0.20–0.26). A 300-drug library is too sparse to contain look-alikes of distinctive scaffolds.
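For reference, the similarity measure behind this retrieval step can be sketched in plain Python. The real script uses RDKit fingerprints; here a fingerprint is simply a set of on-bit indices, and the toy bit sets and names (query, ref_a, ref_b) are hypothetical, not data from the library.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention for two empty fingerprints
    return len(a & b) / union


def nearest_reference(query_bits, references):
    """Return (name, similarity) of the reference most similar to the query."""
    scored = ((name, tanimoto(query_bits, bits)) for name, bits in references.items())
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical toy fingerprints, for illustration only.
    query = {1, 2, 3, 4}
    refs = {"ref_a": {1, 2, 3, 5}, "ref_b": {7, 8}}
    print(nearest_reference(query, refs))
```

A low maximum over all references, as seen for voxelotor and mitapivat, means no library member shares enough substructure bits with the query, which is a statement about library coverage rather than about the engine.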

Takeaways:

Ligand retrieval works but needs a bigger drug library to be useful for distinctive targets.
The targets without in-set analogs (Hb, PKR) need actual docking (§12.3) — which scores binding directly, no look-alike required. That is the gating next step, and it needs the toolchain solved.

Step 1 — docking baseline (2026-06-24)

Toolchain SOLVED on ARM Mac: AutoDock Vina 1.2.5 mac binary (tools/vina, runs under Rosetta) + open-babel (brew) for prep. Docking runs end-to-end (scripts/dock_positive_controls.py). Co-crystal ligands identified: 5L7 = voxelotor (5E83), WV2 = mitapivat (8XFD).
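One way to set up such a run is to center the search box on the co-crystal ligand and build the Vina command line from it. The sketch below is not taken from scripts/dock_positive_controls.py; the 8 Å padding, the file names, and the helper names are assumptions, while the flags are standard Vina command-line options.

```python
def box_from_ligand(coords, padding=8.0):
    """Center and edge lengths (in Å) of an axis-aligned box around ligand atoms.

    padding is the total extra length added per axis (an assumed value).
    """
    xs, ys, zs = zip(*coords)
    center = tuple((min(v) + max(v)) / 2 for v in (xs, ys, zs))
    size = tuple((max(v) - min(v)) + padding for v in (xs, ys, zs))
    return center, size


def vina_command(vina, receptor, ligand, out, center, size,
                 exhaustiveness=8, seed=0):
    """Build an argv list for the Vina CLI; pass it to subprocess.run to execute."""
    cmd = [vina, "--receptor", receptor, "--ligand", ligand, "--out", out,
           "--exhaustiveness", str(exhaustiveness), "--seed", str(seed)]
    for axis, c, s in zip("xyz", center, size):
        cmd += [f"--center_{axis}", f"{c:.3f}", f"--size_{axis}", f"{s:.3f}"]
    return cmd


if __name__ == "__main__":
    # Hypothetical ligand coordinates and file names, for illustration only.
    center, size = box_from_ligand([(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)])
    print(" ".join(vina_command("tools/vina", "receptor.pdbqt", "ligand.pdbqt",
                                "docked.pdbqt", center, size)))
```

Fixing the seed keeps repeated runs comparable while the prep pipeline is still changing.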

Positive-control cross-docking — inconclusive, and instructively so. Affinities (kcal/mol):

ligand	hemoglobin	PKR
voxelotor	−8.1	−9.3
mitapivat	−10.0	−11.2
decitabine	−6.6	−7.0
hydroxyurea	−3.9	−3.6
caffeine	−6.1	−6.4

The scores rank almost perfectly by molecular size (mitapivat > voxelotor > decitabine/caffeine > hydroxyurea) in both pockets; mitapivat wins even on hemoglobin, which it doesn't target. So raw Vina affinity is confounded by ligand size and per-pocket stickiness; it cannot yet distinguish target-specific binding. This is the docking analog of the connectivity specificity problem: raw scores carry a systematic bias (size here, broadness there) that masquerades as signal. Voxelotor does dock to Hb (−8.1, a real score); the cross-target test just isn't the right validation.
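The size confound can be quantified as a rank correlation between heavy-atom count and affinity. This is a sketch under one assumption: the heavy-atom counts are not in these notes and are derived here from the standard molecular formulas (e.g. voxelotor C19H19N3O3 gives 25). The affinities are the values from the table above.

```python
# Heavy-atom counts derived from molecular formulas (assumption, not from the notes).
HEAVY_ATOMS = {"voxelotor": 25, "mitapivat": 32, "decitabine": 16,
               "hydroxyurea": 5, "caffeine": 14}

# Vina affinities in kcal/mol, copied from the cross-docking table.
AFFINITY = {
    "hemoglobin": {"voxelotor": -8.1, "mitapivat": -10.0, "decitabine": -6.6,
                   "hydroxyurea": -3.9, "caffeine": -6.1},
    "PKR": {"voxelotor": -9.3, "mitapivat": -11.2, "decitabine": -7.0,
            "hydroxyurea": -3.6, "caffeine": -6.4},
}


def ranks(values):
    """Rank values from 1 (smallest) upward; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation for tie-free data."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


if __name__ == "__main__":
    names = sorted(HEAVY_ATOMS)
    sizes = [HEAVY_ATOMS[n] for n in names]
    for pocket, scores in AFFINITY.items():
        rho = spearman(sizes, [scores[n] for n in names])
        print(f"{pocket}: Spearman(heavy atoms, affinity) = {rho:+.2f}")
```

A correlation near −1 in a pocket means that sorting ligands by size alone reproduces the affinity ranking there, which is the confound described above.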

Step 2 — redocking-RMSD validation FAILS across the board (2026-06-24)

Redocked each co-crystal ligand into its own structure (scripts/dock_validate.py); RMSD vs crystal pose via obrms:

redock	RMSD	note
voxelotor → Hb (5E83)	NA	covalent binder (Schiff base, αVal1) — out of scope §12.7
mitapivat → PKR (8XFD)	8.2 Å	allosteric, cofactor (FBP/Mg) stripped
vorinostat → HDAC2 (4LXZ, Zn kept)	7.9 Å	classical non-covalent target — should have worked

The clean target also failing means this is a systematic PIPELINE-QUALITY problem, not target choice. The cheap Vina + open-babel setup produces scores but does not reproduce known binding geometry, so its affinities are not yet trustworthy. Ligand efficiency (affinity / heavy atoms) doesn't fix it either: it over-corrects, ranking tiny hydroxyurea (−0.78 kcal/mol per heavy atom) "best".
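The ligand-efficiency over-correction is easy to reproduce from the hemoglobin column of the Step 1 table. A minimal sketch, assuming heavy-atom counts derived from the molecular formulas (they are not recorded in these notes):

```python
# Heavy-atom counts derived from molecular formulas (assumption, not from the notes).
HEAVY_ATOMS = {"voxelotor": 25, "mitapivat": 32, "decitabine": 16,
               "hydroxyurea": 5, "caffeine": 14}

# Vina affinities on hemoglobin in kcal/mol, from the Step 1 table.
HB_AFFINITY = {"voxelotor": -8.1, "mitapivat": -10.0, "decitabine": -6.6,
               "hydroxyurea": -3.9, "caffeine": -6.1}


def ligand_efficiency(affinity, heavy_atoms):
    """Affinity per heavy atom (kcal/mol per atom); more negative ranks higher."""
    return affinity / heavy_atoms


LE = {name: ligand_efficiency(HB_AFFINITY[name], HEAVY_ATOMS[name])
      for name in HB_AFFINITY}

if __name__ == "__main__":
    for name in sorted(LE, key=LE.get):
        print(f"{name:12s} {LE[name]:+.2f}")
```

Dividing by atom count rewards the smallest molecule most, so the ranking flips from "largest wins" to "smallest wins" without becoming any more target-specific.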

Likely causes (in priority order):

Low-quality receptor prep: open-babel -xr is not production docking prep. Need AutoDockTools prepare_receptor or Meeko, plus reduce/pdb2pqr, for protonation, charges, and proper AutoDock atom typing.
Ligand prep: should use Meeko (correct rotatable bonds and atom typing), not bare obabel --gen3d.
RMSD metric: obrms can superimpose before computing RMSD (its minimize mode), whereas redocking validation wants symmetry-corrected RMSD in place (receptor frame). Superposition can only lower the reported value, so this alone is unlikely to explain ~8 Å, but it is worth confirming with an in-place metric.
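An in-place, symmetry-corrected RMSD can be sketched in a few lines of plain Python. This is not the metric used in scripts/dock_validate.py; it assumes heavy-atom coordinates already in the receptor frame and a list of atom-index mappings (automorphisms) supplied by the caller, for example from a graph-matching tool.

```python
import math


def rmsd_in_place(ref, pose, mapping=None):
    """RMSD between matched atoms with no superposition.

    mapping[i] is the index in pose that corresponds to atom i of ref;
    the identity mapping is used when none is given.
    """
    if mapping is None:
        mapping = range(len(ref))
    squared = sum((r - p) ** 2
                  for i, j in enumerate(mapping)
                  for r, p in zip(ref[i], pose[j]))
    return math.sqrt(squared / len(ref))


def symmetry_corrected_rmsd(ref, pose, automorphisms):
    """Minimum in-place RMSD over all symmetry-equivalent atom mappings."""
    return min(rmsd_in_place(ref, pose, m) for m in automorphisms)


if __name__ == "__main__":
    # Toy two-atom symmetric "ligand" (hypothetical coordinates).
    ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    swapped = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    shifted = [(3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
    maps = [[0, 1], [1, 0]]
    print(symmetry_corrected_rmsd(ref, swapped, maps))  # same pose, atoms relabeled
    print(symmetry_corrected_rmsd(ref, shifted, maps))  # pose translated by 3 Å
```

The translated pose keeps its full 3 Å penalty because nothing is superimposed, which is the behavior a redocking check needs: a ligand docked in the wrong place must not be scored as correct just because its internal geometry matches.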

Honest takeaway: consistent with the whole project — the quick version of each method runs but doesn't survive honest validation. Credible structure-based docking needs production prep tooling (Meeko/ADFR), which is the real next investment for this track.

Next steps

Install Meeko (+ reduce / pdb2pqr) and redo receptor+ligand prep; re-run redocking RMSD.
Fix the RMSD metric (in-place, symmetry-corrected) to rule out a measurement artifact.
Only once redocking validates (<2 Å) are affinity scores trustworthy — then cross-dock / screen the library and revisit ligand-efficiency / pose-based scoring.
Later: AF3-class co-folding (Boltz-2/DiffDock via PyTorch-MPS — 24 GB ceiling) and the §12.9 generative beacon.

Hardware note: this machine is 24 GB unified memory (not the 96 GB PLAN §2 assumed), which caps local AF3-class model inference. Classical docking (above) is unaffected.
