Redocking-RMSD validation fails 2/2 in-scope redocks: pipeline-quality issue
§12.4 de-biased validation (scripts/dock_validate.py). Redock each co-crystal ligand into its own
structure and measure RMSD vs the crystal pose:

- voxelotor->Hb: NA (covalent binder, out of scope §12.7)
- mitapivat->PKR: 8.2 Å (allosteric, cofactors stripped)
- vorinostat->HDAC2 (4LXZ, zinc kept): 7.9 Å (a CLASSICAL target that should have worked)

The clean target also failing => systematic pipeline-quality problem, not target choice. Cheap
Vina + open-babel prep gives scores but doesn't reproduce known geometry, so affinities aren't
trustworthy. Ligand efficiency over-corrects (ranks tiny hydroxyurea best).

Fix needs production prep (Meeko/AutoDockTools prepare_receptor + reduce) and an in-place RMSD
metric. Consistent with the project theme: the quick version of every method runs but fails
honest validation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -51,15 +51,41 @@ problem** — raw scores carry a systematic bias (size here, broadness there) th
signal. voxelotor *does* dock to Hb (−8.1, a real score); the cross-target test just isn't the
right validation.

## Step 2: redocking-RMSD validation FAILS across the board (2026-06-24)

Redocked each co-crystal ligand into its own structure (`scripts/dock_validate.py`); RMSD vs
crystal pose via obrms:

| redock | RMSD | note |
|---|---|---|
| voxelotor → Hb (5E83) | NA | covalent binder (Schiff base, αVal1), out of scope per §12.7 |
| mitapivat → PKR (8XFD) | 8.2 Å | allosteric site; cofactors (FBP/Mg) stripped |
| **vorinostat → HDAC2 (4LXZ, Zn kept)** | **7.9 Å** | classical non-covalent target, should have worked |

**The clean target also failing means this is a systematic PIPELINE-QUALITY problem, not target
choice.** The cheap Vina + open-babel setup produces scores but does not reproduce known binding
geometry, so its affinities are not yet trustworthy. Ligand efficiency (affinity / heavy atoms)
doesn't fix it either: it over-corrects, ranking tiny hydroxyurea (−0.78 kcal/mol per heavy
atom) "best".

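The over-correction is easy to see with a little arithmetic. The sketch below is illustrative only and is not taken from `scripts/dock_validate.py`: the function name is hypothetical, the heavy-atom counts come from the molecular formulas (voxelotor C19H19N3O3 = 25, hydroxyurea CH4N2O2 = 5), and the hydroxyurea raw score is back-computed from the reported efficiency (−0.78 × 5 ≈ −3.9) instead of being read from a docking log.

```python
def ligand_efficiency(affinity_kcal_mol: float, heavy_atoms: int) -> float:
    """Docking score divided by heavy-atom count (more negative ranks higher)."""
    if heavy_atoms <= 0:
        raise ValueError("heavy_atoms must be positive")
    return affinity_kcal_mol / heavy_atoms


# (raw score in kcal/mol, heavy-atom count)
# voxelotor: -8.1 is the Hb score quoted in these notes; 25 heavy atoms is an
#            assumption derived from its formula C19H19N3O3.
# hydroxyurea: 5 heavy atoms (CH4N2O2); the raw score is back-computed from the
#            reported efficiency of -0.78, so it is illustrative, not a logged value.
ligands = {
    "voxelotor": (-8.1, 25),
    "hydroxyurea": (-0.78 * 5, 5),
}

# Rank by raw score and by ligand efficiency (most negative first).
by_raw = sorted(ligands, key=lambda name: ligands[name][0])
by_le = sorted(ligands, key=lambda name: ligand_efficiency(*ligands[name]))

print("best by raw score:        ", by_raw[0])
print("best by ligand efficiency:", by_le[0])
```

Under these assumptions the raw score favors voxelotor while the per-atom score favors hydroxyurea: dividing by heavy-atom count replaces a bias toward large ligands with a bias toward very small ones.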
Likely causes (in priority order):
1. **Low-quality receptor prep**: open-babel `-xr` is not production docking prep. Need
   AutoDockTools `prepare_receptor` or **Meeko** + `reduce`/pdb2pqr for protonation, charges, and
   proper AutoDock atom typing.
2. **Ligand prep**: should use Meeko (correct rotatable bonds / typing), not bare obabel `--gen3d`.
3. **RMSD metric**: redocking validation wants symmetry-corrected RMSD **in place** (receptor
   frame), with no superposition. obrms superimposes before computing RMSD only when run with
   `-m`/`--minimize`, so check how the script invokes it and confirm the numbers with an
   independent in-place metric.

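For cause 3, a minimal sketch of what "symmetry-corrected, in place" means, in pure Python. This is not the project's metric: the function names are hypothetical, coordinates are assumed to be heavy atoms already in the receptor frame, and the list of symmetry-equivalent atom orderings (automorphisms) is assumed to be supplied by the caller, for example from a cheminformatics toolkit.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def in_place_rmsd(ref: Sequence[Coord], pose: Sequence[Coord],
                  mapping: Sequence[int]) -> float:
    """RMSD between ref[i] and pose[mapping[i]] with NO superposition."""
    if not (len(ref) == len(pose) == len(mapping)):
        raise ValueError("ref, pose and mapping must have the same length")
    sq = sum(math.dist(ref[i], pose[j]) ** 2 for i, j in enumerate(mapping))
    return math.sqrt(sq / len(ref))


def symmetry_corrected_rmsd(ref: Sequence[Coord], pose: Sequence[Coord],
                            automorphisms: Sequence[Sequence[int]]) -> float:
    """Minimum in-place RMSD over the supplied atom permutations."""
    return min(in_place_rmsd(ref, pose, m) for m in automorphisms)


# Toy 3-atom fragment: atoms 1 and 2 are chemically equivalent.
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
automorphisms = [(0, 1, 2), (0, 2, 1)]  # identity, and the 1<->2 swap

# Same geometry, but the equivalent atoms are listed in the other order.
swapped = [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
# Same geometry and order, but translated 3 Å along z (wrong place in the pocket).
shifted = [(x, y, z + 3.0) for x, y, z in ref]

print(in_place_rmsd(ref, swapped, automorphisms[0]))         # naive: penalizes the relabeling
print(symmetry_corrected_rmsd(ref, swapped, automorphisms))  # relabeling is forgiven
print(symmetry_corrected_rmsd(ref, shifted, automorphisms))  # translation is not
```

The two properties that matter for redocking both show up here: relabeling equivalent atoms costs nothing, while a rigid translation of the pose is fully penalized. A metric that superimposes first would score the shifted pose as a perfect match.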
**Honest takeaway:** as elsewhere in this project, the *quick* version of each method runs
but doesn't survive honest validation. Credible structure-based docking needs production prep
tooling (Meeko/ADFR), which is the real next investment for this track.

## Next steps

- [ ] **Redocking-RMSD validation** (the gold-standard positive control): redock the crystal ligand
  5L7/WV2 into its own structure, compute pose RMSD vs crystal. <2 Å = geometry validated. This
  tests pose accuracy, which size bias doesn't corrupt.
- [ ] **Ligand-efficiency normalization** (affinity / heavy-atom count) to de-bias the size effect,
  the docking counterpart of the connectivity calibration work.
- [ ] Expand the ligand library (full ChEMBL/LINCS) for retrieval reach.
- [ ] Only then: AF3-class co-folding (Boltz-2/DiffDock via PyTorch-MPS; note 24 GB ceiling) vs the
  docking baseline; and §12.9 generative beacon.
- [ ] Install **Meeko** (+ reduce / pdb2pqr) and redo receptor+ligand prep; re-run redocking RMSD.
- [ ] Fix the RMSD metric (in-place, symmetry-corrected) to rule out a measurement artifact.
- [ ] Only once redocking validates (<2 Å) are affinity scores trustworthy; then cross-dock /
  screen the library and revisit ligand-efficiency / pose-based scoring.
- [ ] Later: AF3-class co-folding (Boltz-2/DiffDock via PyTorch-MPS; 24 GB ceiling) and the §12.9
  generative beacon.

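The "<2 Å before trusting affinities" rule can be written down as an explicit gate. This is a sketch under stated assumptions, not part of `scripts/dock_validate.py`: the function and constant names are hypothetical, `None` stands for an out-of-scope redock (the NA row), and requiring *every* scored redock to pass is one possible policy that these notes do not specify.

```python
from typing import Dict, Optional

RMSD_PASS_ANGSTROM = 2.0  # threshold quoted in the checklist above


def redocking_validated(rmsds: Dict[str, Optional[float]],
                        threshold: float = RMSD_PASS_ANGSTROM) -> bool:
    """True only if at least one redock was scored and all scored ones pass.

    None marks an out-of-scope redock (e.g. a covalent binder) and is skipped.
    The all-must-pass policy is an assumption made for this sketch.
    """
    scored = [value for value in rmsds.values() if value is not None]
    if not scored:
        return False  # nothing measured means nothing validated
    return all(value < threshold for value in scored)


# RMSD values from the Step 2 table.
step2 = {
    "voxelotor/Hb (5E83)": None,
    "mitapivat/PKR (8XFD)": 8.2,
    "vorinostat/HDAC2 (4LXZ)": 7.9,
}

if redocking_validated(step2):
    print("geometry validated: affinity scores can be used for screening")
else:
    print("redocking failed: do not rank the library by affinity yet")
```

With the current table the gate stays closed, which matches the ordering of the checklist: prep and metric fixes come first, and cross-docking or library screening only after a redock lands under 2 Å.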
> **Hardware note:** this machine is **24 GB** unified memory (not the 96 GB that PLAN §2 assumed),
> which caps local AF3-class model inference. Classical docking (above) is unaffected.