Reverso/docs/recovery_test_report.md
Junior B. 3417f85eb1 v1.1: full gene space + specificity z-score; hydroxyurea recovers
Post-hoc improvement after the pre-registered v1 recovery test failed.
Two changes, diagnosing v1's failure:
- score on the full 12,328-gene LINCS space (week2_lincs_extract.py),
  lifting signature overlap from 12% to 85% (brings erythroid markers in)
- src/scoring.py: KS connectivity + per-drug specificity z-score
  (spec_z = SDs below a 1,000 random-query null). Primary ranking is
  now spec_z. (Textbook tau saturated at +/-100 for a coherent query —
  documented; needs a reference-signature library, a v2 item.)
- week3_scoring.py: spec_z primary + WTCS reference + prior-blended
- tests: tau/spec_z calibration test; 19 passing
- scripts/exp_genespace.py: the BING vs all-12,328 comparison

Result: hydroxyurea recovers (rank 40 -> 18, top 6%, passes top-10%),
confirming the v1 failure was the landmark bottleneck not the algorithm.
Overall STILL FAILS: L-glutamine does not reverse (rank 213, metabolite),
and negative controls (norethindrone, ciprofloxacin) rank top-3 —
connectivity != therapeutic relatedness. v1.1 is post-hoc/exploratory,
not a confirmatory test; reported as such in recovery_test_report.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 22:57:30 +02:00


# Sickle Cell Repurposing — Recovery Test Report
> **Status: COMPLETE (v1 confirmatory + v1.1 exploratory).** Reproduce with `scripts/week1_*` →
> `week2_*` → `week3_scoring.py` → `week4_recovery_test.py`. ~2 pages, for a sceptical pharma scientist.
## Pre-registered success criteria
The MVP passes if:
- Hydroxyurea ranks in the **top 10%** (top 30 of 300), **AND**
- L-glutamine ranks in the **top 25%** (top 75) **OR** is documented as unscorable due to a
missing LINCS signature, **AND**
- At least **4 of 5** negative-control drugs rank in the **bottom half**.
_Pre-registered in the scaffold commit (`b731478`) before any scoring. **Primary (confirmatory)
analysis = v1**: 978 landmark genes, weighted connectivity score (WTCS). The 5 negative controls
were pre-specified by category rule without inspecting ranks._
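
The three criteria above reduce to simple rank thresholds. As a minimal sketch, assuming ranks are 1-based over the 300 scored drugs, they could be checked as follows. The names `RecoveryResult` and `evaluate_recovery` are hypothetical and are not taken from `week4_recovery_test.py`.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class RecoveryResult:
    hydroxyurea_pass: bool
    l_glutamine_pass: bool
    negative_controls_pass: bool

    @property
    def overall(self) -> bool:
        # All three pre-registered criteria must hold (they are joined by AND).
        return (self.hydroxyurea_pass
                and self.l_glutamine_pass
                and self.negative_controls_pass)


def evaluate_recovery(hydroxyurea_rank: int,
                      l_glutamine_rank: Optional[int],
                      negative_control_ranks: Sequence[int],
                      n_drugs: int = 300,
                      l_glutamine_unscorable: bool = False) -> RecoveryResult:
    """Apply the pre-registered thresholds to 1-based ranks (1 = best reverser)."""
    top_10_cutoff = n_drugs // 10      # 30 of 300
    top_25_cutoff = n_drugs // 4       # 75 of 300
    bottom_half_start = n_drugs / 2    # ranks strictly above 150 are bottom half

    hydroxyurea_pass = hydroxyurea_rank <= top_10_cutoff
    # L-glutamine passes on rank OR on a documented missing LINCS signature.
    l_glutamine_pass = l_glutamine_unscorable or (
        l_glutamine_rank is not None and l_glutamine_rank <= top_25_cutoff)
    in_bottom_half = sum(1 for r in negative_control_ranks if r > bottom_half_start)
    negative_controls_pass = in_bottom_half >= 4

    return RecoveryResult(hydroxyurea_pass, l_glutamine_pass, negative_controls_pass)
```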
---
## Section 1 — Methodology
A sickle cell disease signature was built from **two whole-blood microarray studies** (GSE35007
Illumina SS-vs-AA; GSE16728 Affymetrix patient-vs-control), keeping the **671 genes concordant**
across both (q<0.05, same direction): a cross-platform Tier-A signature (250 up / 227 down).
Profiles were built for **300 small molecules** (2 ground-truth; 32 related-mechanism; 26
negative controls; 240 random), each with a **LINCS L1000** consensus signature (mean Level-5
MODZ across cell lines, both CMap phases). Drugs were ranked by **CMap connectivity scoring**
(Kolmogorov-Smirnov, Lamb 2006 / Subramanian 2017): negative = reversal = candidate.
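
The concordance filter described above (significant in both studies, same direction) can be sketched as below. This is an illustration only: the input layout (a dict of gene to `(log_fold_change, q_value)` per study) and the name `concordant_signature` are assumptions, not the pipeline's actual data structures.

```python
from typing import Dict, Set, Tuple

# Assumed layout: gene symbol -> (log fold change, FDR-adjusted q-value).
StudyResult = Dict[str, Tuple[float, float]]


def concordant_signature(study_a: StudyResult,
                         study_b: StudyResult,
                         q_max: float = 0.05) -> Tuple[Set[str], Set[str]]:
    """Return (up, down) gene sets that are significant and same-direction in both studies."""
    up: Set[str] = set()
    down: Set[str] = set()
    # Only genes measured in both studies can be concordant.
    for gene in study_a.keys() & study_b.keys():
        lfc_a, q_a = study_a[gene]
        lfc_b, q_b = study_b[gene]
        if q_a >= q_max or q_b >= q_max:
            continue  # not significant in at least one study
        if lfc_a > 0 and lfc_b > 0:
            up.add(gene)
        elif lfc_a < 0 and lfc_b < 0:
            down.add(gene)
        # Opposite directions (or a zero fold change) are discarded.
    return up, down
```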
**v1 (pre-registered/confirmatory):** scored on the 978 landmark genes with WTCS.
**v1.1 (post-hoc/exploratory):** after v1 failed, two changes were made to diagnose why: (a)
score on the full **12,328-gene** space (signature overlap rises from 12% on the landmarks to 85%, bringing the erythroid
markers in); (b) add a **per-drug specificity z-score** (`spec_z`): how many SDs the real
connectivity is below a null of 1,000 random queries of the same size against that drug. Because
these changes followed inspection of the v1 result, **v1.1 is exploratory, not a confirmatory
test of the pre-registered hypothesis.**
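
To make the `spec_z` idea concrete, here is a minimal sketch under stated assumptions. It is not the code in `src/scoring.py`. It uses an unweighted Lamb-2006-style KS connectivity score (not WTCS), draws the null from random gene sets of the same sizes taken from the drug's own measured genes, and adopts the sign convention that a positive `spec_z` means the real query sits below (more reversing than) the null. The function names are hypothetical.

```python
import random
import statistics
from typing import List, Set


def ks_enrichment(ranked_genes: List[str], gene_set: Set[str]) -> float:
    """Signed KS statistic of gene_set within a drug's genes, ranked most-up first."""
    n = len(ranked_genes)
    positions = sorted(i + 1 for i, g in enumerate(ranked_genes) if g in gene_set)
    t = len(positions)
    if t == 0:
        return 0.0
    a = max((j + 1) / t - v / n for j, v in enumerate(positions))
    b = max(v / n - j / t for j, v in enumerate(positions))
    return a if a > b else -b


def connectivity(ranked_genes: List[str], up: Set[str], down: Set[str]) -> float:
    """Negative = the drug reverses the query signature; 0 if up/down enrichments agree in sign."""
    ks_up = ks_enrichment(ranked_genes, up)
    ks_down = ks_enrichment(ranked_genes, down)
    if (ks_up > 0) == (ks_down > 0):
        return 0.0
    return ks_up - ks_down


def spec_z(ranked_genes: List[str], up: Set[str], down: Set[str],
           n_null: int = 1000, seed: int = 0) -> float:
    """SDs by which the observed connectivity lies below a same-size random-query null."""
    rng = random.Random(seed)
    observed = connectivity(ranked_genes, up, down)
    null_scores = []
    for _ in range(n_null):
        # Random query with the same up/down sizes, drawn without replacement.
        sample = rng.sample(ranked_genes, len(up) + len(down))
        null_scores.append(connectivity(ranked_genes,
                                        set(sample[:len(up)]),
                                        set(sample[len(up):])))
    sd = statistics.pstdev(null_scores)
    if sd == 0:
        return 0.0
    return (statistics.fmean(null_scores) - observed) / sd
```

Because the null is rebuilt per drug, a compound whose profile gives extreme connectivity to almost any query is penalised relative to one that reverses this query specifically. That is the property the per-drug z-score is meant to capture.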
## Section 2 — Recovery test result
| Criterion | v1 (confirmatory) | v1.1 (exploratory) |
|---|---|---|
| Hydroxyurea top-10% (≤30) | rank **40** (13.3%) | rank **18** (6.0%) |
| L-glutamine top-25% (≤75) | rank 100, WTCS=0 | rank 213, spec_z=+0.98 |
| 4/5 neg controls bottom-half | 1/5 | 2/5 |
| **Overall** | **FAIL** | **FAIL** (but hydroxyurea recovered) |
v1.1 negative controls: clotrimazole 258 ✅, astemizole 211 ✅, azithromycin 142 ❌,
ethinyl-estradiol 114 ❌, caffeine 77 ❌.
**Honest reading.** The **pre-registered test FAILED (v1).** The post-hoc v1.1 changes
**recover hydroxyurea** (rank 40 → 18, passing top-10%), strong evidence that the v1 failure was
driven by the 978-landmark bottleneck, not the algorithm. But two failures survive into v1.1, and
both are now precisely diagnosed:
1. **L-glutamine does not reverse the signature** (positive connectivity, spec_z=+0.98). This is
intrinsic to its LINCS data (a metabolite with no reversal signal), not a coverage gap. More
genes cannot fix it.
2. **The negative-control criterion is arguably invalid for connectivity scoring.** Two
"negative controls" (norethindrone, ciprofloxacin) rank in the top 3 by spec_z. Connectivity
measures *expression reversal*, not *therapeutic relatedness*: an antibiotic or contraceptive
can still down-regulate the inflammation genes that dominate the scorable signature. The test
design conflates the two.
A note on the calibration: textbook CMap **tau** (percentile vs a reference population) was
implemented but **saturated at ±100** here, because a coherent real query always out-connects
random gene sets; proper tau needs a library of *real* reference signatures, which this MVP
lacks. The continuous `spec_z` is the workable substitute.
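
The saturation can be illustrated with a simplified percentile-style tau. The sketch below assumes tau is the signed percentage of reference scores whose magnitude is smaller than the query's; the function name `tau` and all numbers are illustrative, not values from this analysis.

```python
from typing import Sequence


def tau(score: float, reference_scores: Sequence[float]) -> float:
    """Signed percentile (in [-100, 100]) of |score| within the reference |scores|."""
    n = len(reference_scores)
    if n == 0:
        raise ValueError("tau needs at least one reference score")
    weaker = sum(1 for r in reference_scores if abs(r) < abs(score))
    sign = (score > 0) - (score < 0)
    return sign * 100.0 * weaker / n


# Hypothetical numbers: a coherent query against a random-gene-set reference.
random_reference = [0.0, 0.3, -0.4, 0.5]
saturated = tau(-1.9, random_reference)   # every reference is weaker, so tau pins at the limit

# Hypothetical numbers: the same query against a reference of real signatures.
real_reference = [-2.1, 1.5, -1.95, 0.2]
graded = tau(-1.9, real_reference)        # only some references are weaker, so tau is informative
```

With a random reference every coherent query lands at the extreme, so tau cannot rank drugs against each other. A reference of real signatures restores a usable spread.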
## Section 3 — Top 10 candidates (v1.1 spec_z)
| Rank | Drug | spec_z | Inclusion | Read |
|---|---|---|---|---|
| 1 | reserpic-acid | 3.80 | random | reserpine metabolite; non-obvious |
| 2 | norethindrone | 3.78 | **negative control** | false positive (see §2) |
| 3 | ciprofloxacin | 3.61 | **negative control** | false positive |
| 4 | resveratrol | 3.46 | related-mechanism | antioxidant studied in SCD; coherent |
| 5 | BRD-K57490754 | 3.37 | random | tool compound |
| 6 | anastrozole | 3.27 | random | aromatase inhibitor |
| 7-10 | BRD-* / palmitoylethanolamide | ~3.1 | random | mostly tool compounds |
That two negative controls outrank hydroxyurea is the single most informative result here; see §4.
## Section 4 — One non-obvious result worth investigating
The most useful finding is **not** a candidate drug but the **negative-control failure**:
unrelated drugs (norethindrone, ciprofloxacin) score as strong specific reversers. This is a
real, generalizable lesson: for a signature whose *scorable* portion is generic
inflammation/metabolic genes, connectivity rewards any broad transcriptional perturbation that
touches those genes. The honest implication: **this signature is not specific enough to
discriminate true repurposing candidates from incidental expression reversers.** Of the
plausibly-real hits, **resveratrol (rank 4)**, an antioxidant with prior sickle cell literature,
is the most defensible, but it is a hypothesis, not a discovery.
## Section 5 — Honest limitations
1. **Pre-registered test failed; the pass is post-hoc.** v1.1's hydroxyurea recovery is
exploratory and must be re-validated on a held-out disease before any claim is made.
2. **Missing HbF axis.** HBG1/HBG2 are absent from LINCS entirely (not just from the landmarks), so the
pathway hydroxyurea acts on can never be scored by this method.
3. **Signature specificity.** The scorable genes are inflammation/metabolic; negative controls
reverse them too. Connectivity ≠ therapeutic relatedness.
4. **Cell-composition confound.** The whole-blood signature is reticulocyte-dominated.
5. **LINCS cancer-cell-line bias**, and **no reference-signature library** for proper tau.
6. **No mechanistic validation.** All hits are computational hypotheses.
## Section 6 — What v2 would fix
- **A reference-signature library** to make tau (proper specificity calibration) work: the
single biggest fix to the negative-control problem, and a direct use of the curated-data moat.
- **Cell-type deconvolution** + a non-globin-depleted RNA-seq study to recover a more specific,
HbF-containing signature.
- **Signature prediction / mechanism graph** to score genes with no LINCS measurement.
- **A second disease** to test generalization and to honestly re-validate the v1.1 method
(PLAN §9.5).
---
### Bottom line
The pre-registered recovery test **failed**. Post-hoc diagnosis shows the dominant cause was a
fixable gene-space bottleneck; correcting it **recovers hydroxyurea** but also surfaces a
deeper, genuine limitation: this whole-blood signature is **not specific enough** for
connectivity scoring to separate real candidates from incidental reversers (negative controls
rank at the top). The MVP's real deliverable is a precise, honest map of *what it takes to make
this method work*: a more specific (deconvolved, HbF-containing) signature and a reference library
for calibration, exactly the curated-data investments the platform thesis is built on.