
# Structure-based binding track — working notes

Branch `structure-based-binding`. Implements PLAN §12. Approach: baseline first. Start with the two
cleanest targets (hemoglobin and PKR) and de-risk the harness before scaling.

## Status (2026-06-23)

**Toolchain check (PLAN §12.6 pitfall 4, confirmed real):**
- ✅ RDKit installs on ARM Mac — ligand side ready.
- ❌ AutoDock Vina does NOT pip-install on ARM Mac; no docking binary available. Docking (§12.3)
  is **blocked on toolchain**. Options: conda/micromamba (`vina`/`smina`), a GPU deep-learning
  model (AF3-class co-folding such as Boltz-2/Chai-1, or DiffDock), or an x86 Vina binary under
  Rosetta.
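
The toolchain check above can be repeated on any machine with a small probe. This is a minimal sketch, not the project's actual check: the function name `probe_toolchain` is hypothetical, and it only looks for importable modules and for executables on `PATH`, so a binary kept in a repo-local folder would be reported as missing.

```python
import importlib.util
import shutil


def probe_toolchain(modules=("rdkit", "vina"), binaries=("vina", "smina", "obabel")):
    """Report which Python modules and command-line binaries are discoverable."""
    report = {}
    for name in modules:
        # find_spec returns None when a top-level module cannot be imported.
        report[f"module:{name}"] = importlib.util.find_spec(name) is not None
    for name in binaries:
        # shutil.which returns None when the executable is not on PATH.
        report[f"binary:{name}"] = shutil.which(name) is not None
    return report


if __name__ == "__main__":
    for item, ok in probe_toolchain().items():
        print("OK     " if ok else "MISSING", item)
```

Running it before and after a conda/micromamba install gives a quick record of which route actually unblocked docking.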

**Structures obtained:** `5E83` (hemoglobin + voxelotor), `8XFD` (PKR + mitapivat) in
`data/raw/structures/`.

**Step 0 — ligand-based retrieval baseline (`scripts/binding_ligand_baseline.py`):**
RDKit Tanimoto similarity of our 300 drugs against known sickle-cell binders.
- Engine VALIDATED on in-set classes: `decitabine`→azacitidine (0.62); `vorinostat`→scriptaid
  (0.42), belinostat (0.28). Correctly clusters DNMT1 / HDAC HbF-inducers.
- But voxelotor / mitapivat have **no analog** in our set (max Tanimoto ~0.20–0.26). A 300-drug
  library is too sparse to contain look-alikes of distinctive scaffolds.
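
For reference, the retrieval logic behind Step 0 reduces to a nearest-neighbour search under Tanimoto similarity. The sketch below is pure Python and uses made-up bit sets rather than real fingerprints; in the actual script the on-bits would presumably come from an RDKit fingerprint (which fingerprint type is used is not recorded in these notes). The names `tanimoto` and `nearest_in_library` are illustrative only.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of on-bits."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def nearest_in_library(query_bits, library):
    """Return (name, similarity) of the library entry most similar to the query."""
    scored = ((name, tanimoto(query_bits, bits)) for name, bits in library.items())
    return max(scored, key=lambda pair: pair[1])


# Toy fingerprints: arbitrary bit indices, NOT real molecules.
library = {
    "drug_a": {1, 2, 3, 4, 5},
    "drug_b": {10, 11, 12},
}
query = {1, 2, 3, 4, 9}

name, similarity = nearest_in_library(query, library)
print(name, round(similarity, 2))
```

A query whose best hit stays around 0.2, as for voxelotor and mitapivat, means the maximum is taken over a library with no close neighbour, which is why library size rather than the engine is the limiting factor.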

**Takeaways:**
1. Ligand retrieval works but needs a **bigger drug library** to be useful for distinctive targets.
2. The targets without in-set analogs (Hb, PKR) need **actual docking** (§12.3) — which scores
   binding directly, no look-alike required. That is the gating next step, and it needs the
   toolchain solved.

## Step 1 — docking baseline (2026-06-24)

**Toolchain SOLVED on ARM Mac:** AutoDock Vina 1.2.5 mac binary (`tools/vina`, runs under Rosetta)
plus Open Babel (Homebrew `open-babel`) for prep. Docking runs end-to-end (`scripts/dock_positive_controls.py`).
Co-crystal ligands identified: 5L7 = voxelotor (5E83), WV2 = mitapivat (8XFD).
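
A docking run of this kind can be sketched as three small steps: centre the search box on the co-crystal ligand, build the Vina command line, and read the top score back from the output file. This is an assumption-laden sketch, not the contents of `scripts/dock_positive_controls.py`: the helper names, the 4 Å padding, and the file names in the example are all hypothetical. The command-line flags and the `REMARK VINA RESULT` line are standard Vina 1.2 usage.

```python
def box_from_ligand(coords, padding=4.0):
    """Axis-aligned search box around ligand atoms; returns (center, size) in Å."""
    mins = [min(c[i] for c in coords) for i in range(3)]
    maxs = [max(c[i] for c in coords) for i in range(3)]
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    size = tuple((hi - lo) + 2 * padding for lo, hi in zip(mins, maxs))
    return center, size


def vina_command(receptor, ligand, out, center, size, vina="tools/vina", exhaustiveness=8):
    """Build the argument list for one Vina run (nothing is executed here)."""
    cmd = [vina, "--receptor", receptor, "--ligand", ligand, "--out", out,
           "--exhaustiveness", str(exhaustiveness)]
    for axis, c, s in zip("xyz", center, size):
        cmd += [f"--center_{axis}", f"{c:.3f}", f"--size_{axis}", f"{s:.3f}"]
    return cmd


def best_affinity(pdbqt_text):
    """Score (kcal/mol) of the top-ranked pose: the first REMARK VINA RESULT line."""
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            return float(line.split()[3])
    return None


# Hypothetical coordinates and file names, for illustration only.
center, size = box_from_ligand([(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)])
print(" ".join(vina_command("receptor.pdbqt", "ligand.pdbqt", "out.pdbqt", center, size)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Keeping the box definition tied to the crystal ligand makes the same box reusable for every ligand docked into that pocket.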

**Positive-control cross-docking — inconclusive, and instructively so.** Affinities (kcal/mol):

| ligand | hemoglobin | PKR |
|---|---|---|
| voxelotor | −8.1 | −9.3 |
| mitapivat | −10.0 | −11.2 |
| decitabine | −6.6 | −7.0 |
| hydroxyurea | −3.9 | −3.6 |
| caffeine | −6.1 | −6.4 |

The scores rank almost perfectly by **molecular size** in *both* pockets (mitapivat > voxelotor >
decitabine/caffeine > hydroxyurea): mitapivat wins even on hemoglobin, which it doesn't target, and
voxelotor itself scores better on PKR (−9.3) than on its own target (−8.1). Four of the five
ligands score 0.3–1.2 kcal/mol better in the PKR pocket, so the pocket itself adds an offset. Raw
Vina affinity is therefore confounded by ligand size and per-pocket stickiness; it cannot yet
distinguish target-specific binding. This is the **docking analog of the connectivity specificity
problem**: raw scores carry a systematic bias (size here, broadness there) that masquerades as
signal. Voxelotor *does* dock to Hb with a favorable score (−8.1 kcal/mol); the cross-target test
just isn't the right validation.
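
The size confound can be quantified with a rank correlation between ligand size and docking score. The sketch below uses the affinities from the table; the heavy-atom counts are an assumption added here, derived from the standard molecular formulas, and should be recomputed with RDKit from the actual ligand files. The helper names are illustrative.

```python
def ranks(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; raises if either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# ASSUMPTION: heavy-atom counts from molecular formulas, not from the docked files.
HEAVY_ATOMS = {"voxelotor": 25, "mitapivat": 32, "decitabine": 16,
               "hydroxyurea": 5, "caffeine": 14}
# Affinities (kcal/mol) copied from the table above.
AFFINITY = {
    "hemoglobin": {"voxelotor": -8.1, "mitapivat": -10.0, "decitabine": -6.6,
                   "hydroxyurea": -3.9, "caffeine": -6.1},
    "PKR": {"voxelotor": -9.3, "mitapivat": -11.2, "decitabine": -7.0,
            "hydroxyurea": -3.6, "caffeine": -6.4},
}

for pocket, scores in AFFINITY.items():
    names = list(scores)
    rho = spearman([HEAVY_ATOMS[n] for n in names], [scores[n] for n in names])
    print(pocket, round(rho, 3))
```

With these assumed counts the correlation is −1.0 in both pockets (more heavy atoms, more negative score), which is the bias described above expressed as a single number. Five ligands is far too few for statistics; the value is useful mainly as a regression check once more ligands are docked.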

## Next steps
- [ ] **Redocking-RMSD validation** (the gold-standard positive control): redock the crystal ligand
  5L7/WV2 into its own structure, compute pose RMSD vs crystal. <2 Å = geometry validated. This
  tests pose accuracy, which size bias doesn't corrupt.
- [ ] **Ligand-efficiency normalization** (affinity / heavy-atom count) to de-bias the size effect,
  the docking counterpart of the connectivity calibration work.
- [ ] Expand the ligand library (full ChEMBL/LINCS) for retrieval reach.
- [ ] Only then: AF3-class co-folding or ML docking (Boltz-2/DiffDock via PyTorch-MPS; note the
  24 GB ceiling) vs the docking baseline, then the §12.9 generative beacon.
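
The first two checklist items come down to two short calculations. This is a minimal sketch under stated assumptions: `pose_rmsd` compares coordinates in place (no superposition, since the receptor frame is fixed in redocking) and assumes both poses list the same heavy atoms in the same order, so it ignores symmetry-equivalent atoms. `ligand_efficiency` flips the sign so that larger means more efficient. Both function names and the example coordinates are hypothetical, and the heavy-atom counts are assumed from the molecular formulas.

```python
import math


def pose_rmsd(docked, crystal):
    """In-place RMSD (Å) between two poses with atoms in matching order."""
    if not docked or len(docked) != len(crystal):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((a - b) ** 2 for p, q in zip(docked, crystal) for a, b in zip(p, q))
    return math.sqrt(squared / len(docked))


def ligand_efficiency(affinity_kcal_mol, heavy_atoms):
    """Binding score per heavy atom, sign-flipped so higher is better."""
    return -affinity_kcal_mol / heavy_atoms


# Toy poses: every atom shifted by 1 Å along z, so the RMSD is 1 Å.
crystal_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked_pose = [(x, y, z + 1.0) for x, y, z in crystal_pose]
rmsd = pose_rmsd(docked_pose, crystal_pose)
print("RMSD", round(rmsd, 2), "validated" if rmsd < 2.0 else "failed")

# Hemoglobin scores from the table; heavy-atom counts are ASSUMED, not measured.
for name, score, atoms in [("voxelotor", -8.1, 25), ("mitapivat", -10.0, 32),
                           ("hydroxyurea", -3.9, 5)]:
    print(name, round(ligand_efficiency(score, atoms), 3))
```

With these assumed counts, hydroxyurea comes out on top per heavy atom, so the plain ratio may simply move the bias toward small ligands. It would be worth comparing it against a size-matched decoy baseline before adopting it as the de-biasing step.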

> **Hardware note:** this machine is **24 GB** unified memory (not the 96 GB PLAN §2 assumed),
> which caps local AF3-class model inference. Classical docking (above) is unaffected.