Post-hoc improvement after the pre-registered v1 recovery test failed. Two changes, diagnosing v1's failure:

- score on the full 12,328-gene LINCS space (week2_lincs_extract.py), lifting signature overlap from 12% to 85% (brings erythroid markers in)
- src/scoring.py: KS connectivity + per-drug specificity z-score (spec_z = SDs below a 1,000 random-query null). Primary ranking is now spec_z. (Textbook tau saturated at +/-100 for a coherent query: documented; needs a reference-signature library, a v2 item.)
- week3_scoring.py: spec_z primary + WTCS reference + prior-blended
- tests: tau/spec_z calibration test; 19 passing
- scripts/exp_genespace.py: the BING vs all-12,328 comparison

Result: hydroxyurea recovers (rank 40 -> 18, top 6%, passes top-10%), confirming the v1 failure was the landmark bottleneck not the algorithm. Overall STILL FAILS: L-glutamine does not reverse (rank 213, metabolite), and negative controls (norethindrone, ciprofloxacin) rank top-3: connectivity != therapeutic relatedness. v1.1 is post-hoc/exploratory, not a confirmatory test; reported as such in recovery_test_report.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

# Sickle Cell Repurposing — Recovery Test Report

> **Status: COMPLETE (v1 confirmatory + v1.1 exploratory).** Reproduce with `scripts/week1_*` →
> `week2_*` → `week3_scoring.py` → `week4_recovery_test.py`. ~2 pages, for a sceptical pharma scientist.

## Pre-registered success criteria

The MVP passes if:

- Hydroxyurea ranks in the **top 10%** (top 30 of 300), **AND**
- L-glutamine ranks in the **top 25%** (top 75) **OR** is documented as unscorable due to a
  missing LINCS signature, **AND**
- At least **4 of 5** negative-control drugs rank in the **bottom half**.

_Pre-registered in the scaffold commit (`b731478`) before any scoring. **Primary (confirmatory)
analysis = v1**: 978 landmark genes, weighted connectivity score (WTCS). The 5 negative controls
were pre-specified by category rule without inspecting ranks._

---

## Section 1 — Methodology

A sickle cell disease signature was built from **two whole-blood microarray studies** (GSE35007
Illumina SS-vs-AA; GSE16728 Affymetrix patient-vs-control), keeping the **671 genes concordant**
across both (q<0.05, same direction) → a cross-platform Tier-A signature (250 up / 227 down).
Profiles were built for **300 small molecules** (2 ground-truth; 32 related-mechanism; 26
negative controls; 240 random), each with a **LINCS L1000** consensus signature (mean Level-5
MODZ across cell lines, both CMap phases). Drugs were ranked by **CMap connectivity scoring**
(Kolmogorov-Smirnov, Lamb 2006 / Subramanian 2017): a negative score means the drug reverses the
disease signature, which marks it as a candidate.

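
To make the ranking rule concrete, here is a minimal sketch of the unweighted Kolmogorov-Smirnov connectivity score in the style of Lamb 2006. It is an illustration under stated assumptions, not the code in `src/scoring.py`: the function names `ks_enrichment` and `connectivity_score` are hypothetical, the drug profile is assumed to be a plain gene-to-value mapping, and the weighted variant (WTCS) used for v1 is not reproduced here.

```python
import numpy as np


def ks_enrichment(positions, n_genes):
    """Signed KS statistic for the 1-based positions of a gene set in a ranked list."""
    v = np.sort(np.asarray(positions, dtype=float))
    t = len(v)
    if t == 0:
        return 0.0
    j = np.arange(1, t + 1)
    a = np.max(j / t - v / n_genes)        # set concentrated near the top
    b = np.max(v / n_genes - (j - 1) / t)  # set concentrated near the bottom
    return float(a) if a > b else float(-b)


def connectivity_score(drug_profile, up_genes, down_genes):
    """drug_profile maps gene -> differential expression under the drug (assumed format)."""
    # Position 1 = most up-regulated by the drug, position n = most down-regulated.
    ranked = sorted(drug_profile, key=drug_profile.get, reverse=True)
    pos = {gene: i + 1 for i, gene in enumerate(ranked)}
    n = len(ranked)
    ks_up = ks_enrichment([pos[g] for g in up_genes if g in pos], n)
    ks_down = ks_enrichment([pos[g] for g in down_genes if g in pos], n)
    if ks_up * ks_down > 0:
        # Both sets enriched at the same end of the list: no coherent connection.
        return 0.0
    return ks_up - ks_down


# Toy drug profile: g0 is the most up-regulated gene, g9 the most down-regulated.
toy_drug = {f"g{i}": float(9 - i) for i in range(10)}
reversal = connectivity_score(toy_drug, up_genes=["g8", "g9"], down_genes=["g0", "g1"])
mimic = connectivity_score(toy_drug, up_genes=["g0", "g1"], down_genes=["g8", "g9"])
```

In the toy example the drug pushes the disease-up genes to the bottom of its ranked list and the disease-down genes to the top, so `reversal` is negative; swapping the two sets gives a positive `mimic` score.
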
**v1 (pre-registered/confirmatory):** scored on the 978 landmark genes with WTCS.

**v1.1 (post-hoc/exploratory):** after v1 failed, two changes were made to diagnose why: (a)
score on the full **12,328-gene** space (signature overlap 12% → 85%, bringing the erythroid
markers in); (b) add a **per-drug specificity z-score** (`spec_z`): the real connectivity
expressed in SDs relative to a null of 1,000 random queries of the same size against that drug,
so a more negative `spec_z` means a more specific reversal. Because these changes followed
inspection of the v1 result, **v1.1 is exploratory, not a confirmatory test of the
pre-registered hypothesis.**

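
The specificity z-score can be sketched as follows. This is a hedged illustration of the idea, not the repository's implementation: `specificity_z` and `mean_difference_score` are hypothetical names, and the scoring function is a simple stand-in (mean drug effect on the up set minus the down set) rather than the KS statistic, so that the block stays short and self-contained. Any connectivity function with the same signature could be passed as `score_fn`.

```python
import numpy as np


def mean_difference_score(profile, up_genes, down_genes):
    """Stand-in connectivity: mean drug effect on up genes minus down genes."""
    up = np.mean([profile[g] for g in up_genes])
    down = np.mean([profile[g] for g in down_genes])
    return float(up - down)


def specificity_z(profile, up_genes, down_genes, score_fn=mean_difference_score,
                  n_null=1000, seed=0):
    """Z-score of the real query against random queries of the same size for one drug."""
    rng = np.random.default_rng(seed)
    genes = sorted(profile)
    n_up, n_down = len(up_genes), len(down_genes)
    real = score_fn(profile, up_genes, down_genes)
    null = np.empty(n_null)
    for i in range(n_null):
        # Draw a random up/down query with the same set sizes as the real one.
        idx = rng.choice(len(genes), size=n_up + n_down, replace=False)
        draw = [genes[k] for k in idx]
        null[i] = score_fn(profile, draw[:n_up], draw[n_up:])
    sd = null.std(ddof=1)
    return float((real - null.mean()) / sd) if sd > 0 else 0.0


# Toy drug profile with 200 genes; the "disease" query is built so the drug reverses it.
toy_rng = np.random.default_rng(1)
toy_profile = {f"g{i}": float(x) for i, x in enumerate(toy_rng.normal(size=200))}
by_effect = sorted(toy_profile, key=toy_profile.get)
disease_up = by_effect[:10]     # genes the drug pushes down the most
disease_down = by_effect[-10:]  # genes the drug pushes up the most
z_reverser = specificity_z(toy_profile, disease_up, disease_down)
```

Because the null is rebuilt per drug, a drug that scores strongly against any random gene set gets no credit for it; only connectivity beyond that drug's own background yields a strongly negative z.
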
## Section 2 — Recovery test result

| Criterion | v1 (confirmatory) | v1.1 (exploratory) |
|---|---|---|
| Hydroxyurea top-10% (≤30) | rank **40** (13.3%) ❌ | rank **18** (6.0%) ✅ |
| L-glutamine top-25% (≤75) | rank 100, WTCS=0 ❌ | rank 213, spec_z=+0.98 ❌ |
| ≥4/5 neg controls bottom-half | 1/5 ❌ | 2/5 ❌ |
| **Overall** | **FAIL** | **FAIL** (but hydroxyurea recovered) |

v1.1 ranks of the five pre-specified negative controls (bottom half = rank > 150):
clotrimazole 258 ✅, astemizole 211 ✅, azithromycin 142 ❌, ethinyl-estradiol 114 ❌, caffeine 77 ❌.

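
The pass/fail logic can be written down explicitly. The sketch below applies the pre-registered criteria to the v1.1 ranks reported above; `evaluate_recovery` and its parameter names are hypothetical and are not taken from `week4_recovery_test.py`.

```python
def evaluate_recovery(hydroxyurea_rank, l_glutamine_rank, negative_control_ranks,
                      n_drugs=300, l_glutamine_scorable=True):
    """Apply the three pre-registered criteria; rank 1 is the strongest reverser."""
    # Integer arithmetic avoids float rounding at the 10% / 25% / 50% cut-offs.
    hydroxyurea_ok = hydroxyurea_rank * 10 <= n_drugs
    l_glutamine_ok = (not l_glutamine_scorable) or l_glutamine_rank * 4 <= n_drugs
    in_bottom_half = sum(2 * rank > n_drugs for rank in negative_control_ranks)
    controls_ok = in_bottom_half >= 4
    return {
        "hydroxyurea_top_10pct": hydroxyurea_ok,
        "l_glutamine_top_25pct_or_unscorable": l_glutamine_ok,
        "controls_in_bottom_half": in_bottom_half,
        "controls_ok": controls_ok,
        "overall_pass": hydroxyurea_ok and l_glutamine_ok and controls_ok,
    }


# v1.1 ranks as reported in this section.
v1_1 = evaluate_recovery(
    hydroxyurea_rank=18,
    l_glutamine_rank=213,
    negative_control_ranks=[258, 211, 142, 114, 77],
)
```

For these inputs the hydroxyurea criterion passes, the L-glutamine criterion fails, two of five controls land in the bottom half, and the overall result is a fail, matching the v1.1 column of the table.
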
**Honest reading.** The **pre-registered test FAILED (v1).** The post-hoc v1.1 changes
**recover hydroxyurea** (rank 40 → 18, passing top-10%), which is strong evidence that the v1
failure was driven by the 978-landmark bottleneck, not the algorithm. But two failures survive
into v1.1, and both are now precisely diagnosed:

1. **L-glutamine does not reverse the signature** (positive connectivity, spec_z=+0.98). This is
   intrinsic to its LINCS data (a metabolite with no reversal signal), not a coverage gap. More
   genes cannot fix it.
2. **The negative-control criterion is arguably invalid for connectivity scoring.** Two drugs
   from the negative-control category (norethindrone, ciprofloxacin; neither is among the five
   pre-specified controls listed above) rank in the top 3 by spec_z. Connectivity measures
   *expression reversal*, not *therapeutic relatedness*: an antibiotic or contraceptive can
   still down-regulate the inflammation genes that dominate the scorable signature. The test
   design conflates the two.

A note on the calibration: textbook CMap **tau** (percentile vs a reference population) was
implemented but **saturated at ±100** here, because a coherent real query always out-connects
random gene sets. Proper tau needs a library of *real* reference signatures, which this MVP
lacks. The continuous `spec_z` is the workable substitute.

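
The saturation is easy to reproduce on toy numbers. The sketch below uses a simplified signed-percentile form of tau and a z-score against the same null; the function names, the null distribution, and the two example scores are all illustrative assumptions, not values from this analysis.

```python
import numpy as np


def tau(score, reference_scores):
    """Signed percentile: share of reference scores with smaller magnitude, times 100."""
    ref = np.abs(np.asarray(reference_scores, dtype=float))
    return float(np.sign(score) * 100.0 * np.mean(ref < abs(score)))


def z_vs_null(score, null_scores):
    """Continuous alternative: distance from the null mean in null SDs."""
    null = np.asarray(null_scores, dtype=float)
    return float((score - null.mean()) / null.std(ddof=1))


# Toy null: connectivity of 1,000 random gene sets, tightly centred on zero.
rng = np.random.default_rng(0)
random_null = rng.normal(loc=0.0, scale=0.05, size=1000)

# Two coherent queries of different strength (made-up scores).
tau_a, tau_b = tau(-0.6, random_null), tau(-0.9, random_null)
z_a, z_b = z_vs_null(-0.6, random_null), z_vs_null(-0.9, random_null)
```

Both toy queries exceed every random score in magnitude, so `tau_a` and `tau_b` are both -100 and cannot be ranked against each other, while `z_b` is more negative than `z_a`. A reference population of real signatures would contain scores of comparable magnitude, which is what lets tau spread out.
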
## Section 3 — Top 10 candidates (v1.1 spec_z)

| Rank | Drug | spec_z | Inclusion | Read |
|---|---|---|---|---|
| 1 | reserpic-acid | −3.80 | random | reserpine metabolite; non-obvious |
| 2 | norethindrone | −3.78 | **negative control** | false positive (see §2) |
| 3 | ciprofloxacin | −3.61 | **negative control** | false positive |
| 4 | resveratrol | −3.46 | related-mechanism | antioxidant studied in SCD; coherent |
| 5 | BRD-K57490754 | −3.37 | random | tool compound |
| 6 | anastrozole | −3.27 | random | aromatase inhibitor |
| 7–10 | BRD-* / palmitoylethanolamide | ~−3.1 | random | mostly tool compounds |

That two negative controls outrank hydroxyurea is the single most informative result here (see §4).

## Section 4 — One non-obvious result worth investigating

The most useful finding is **not** a candidate drug but the **negative-control failure**:
unrelated drugs (norethindrone, ciprofloxacin) score as strong specific reversers. This is a
real, generalizable lesson: for a signature whose *scorable* portion is generic
inflammation/metabolic genes, connectivity rewards any broad transcriptional perturbation that
touches those genes. The honest implication: **this signature is not specific enough to
discriminate true repurposing candidates from incidental expression reversers.** Of the
plausibly-real hits, **resveratrol (rank 4)**, an antioxidant with prior sickle cell literature,
is the most defensible, but it is a hypothesis, not a discovery.

## Section 5 — Honest limitations

1. **Pre-registered test failed; the pass is post-hoc.** v1.1's hydroxyurea recovery is
   exploratory and must be re-validated on a held-out disease before any claim is made.
2. **Missing HbF axis.** HBG1/HBG2 are absent from LINCS entirely (not just landmarks), so the
   pathway hydroxyurea acts on can never be scored by this method.
3. **Signature specificity.** Scorable genes are inflammation/metabolic; negative controls
   reverse them too. Connectivity ≠ therapeutic relatedness.
4. **Cell-composition confound.** The whole-blood signature is reticulocyte-dominated.
5. **LINCS cancer-cell-line bias**, and **no reference-signature library** for proper tau.
6. **No mechanistic validation.** All hits are computational hypotheses.

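
The coverage limits in items 2 and 3 can be checked mechanically before any scoring. The following is a minimal sketch with a hypothetical helper, `signature_coverage`; the gene lists are toy placeholders (only HBG1/HBG2 are named in this report, and the `INFLAM_*` identifiers are invented for illustration), not the real signature or the real LINCS gene space.

```python
def signature_coverage(signature_genes, scorable_genes):
    """Return the fraction of signature genes that can be scored and the missing ones."""
    signature = set(signature_genes)
    measured = signature & set(scorable_genes)
    missing = sorted(signature - measured)
    fraction = len(measured) / len(signature) if signature else 0.0
    return fraction, missing


# Toy inputs (placeholders, not real data).
toy_signature = ["HBG1", "HBG2", "INFLAM_1", "INFLAM_2"]
toy_gene_space = ["INFLAM_1", "INFLAM_2", "OTHER_1"]

fraction, missing = signature_coverage(toy_signature, toy_gene_space)
```

Here `fraction` is 0.5 and `missing` is `["HBG1", "HBG2"]`: a gene that is absent from the scoring space contributes nothing to the connectivity score, however central it is to the disease biology.
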
## Section 6 — What v2 would fix

- **A reference-signature library** to make tau (proper specificity calibration) work: the
  single biggest fix to the negative-control problem, and a direct use of the curated-data moat.
- **Cell-type deconvolution** + a non-globin-depleted RNA-seq study to recover a more specific,
  HbF-containing signature.
- **Signature prediction / mechanism graph** to score genes with no LINCS measurement.
- **A second disease** to test generalization and to honestly re-validate the v1.1 method
  (PLAN §9.5).

---

### Bottom line

The pre-registered recovery test **failed**. Post-hoc diagnosis shows the dominant cause was a
fixable gene-space bottleneck (correcting it **recovers hydroxyurea**) but also surfaces a
deeper, genuine limitation: this whole-blood signature is **not specific enough** for
connectivity scoring to separate real candidates from incidental reversers (negative controls
rank at the top). The MVP's real deliverable is a precise, honest map of *what it takes to make
this method work*: a more specific (deconvolved, HbF-containing) signature and a reference library
for calibration, exactly the curated-data investments the platform thesis is built on.