v1.1: full gene space + specificity z-score; hydroxyurea recovers

Post-hoc improvement after the pre-registered v1 recovery test failed.
Two changes, diagnosing v1's failure:

- score on the full 12,328-gene LINCS space (week2_lincs_extract.py),
  lifting signature overlap from 12% to 85% (brings erythroid markers in)
- src/scoring.py: KS connectivity + per-drug specificity z-score
  (spec_z = SDs below a 1,000 random-query null). Primary ranking is now
  spec_z. (Textbook tau saturated at +/-100 for a coherent query —
  documented; needs a reference-signature library, a v2 item.)
- week3_scoring.py: spec_z primary + WTCS reference + prior-blended
- tests: tau/spec_z calibration test; 19 passing
- scripts/exp_genespace.py: the BING vs all-12,328 comparison

Result: hydroxyurea recovers (rank 40 -> 18, top 6%, passes top-10%),
confirming the v1 failure was the landmark bottleneck not the algorithm.
Overall STILL FAILS: L-glutamine does not reverse (rank 213, metabolite),
and negative controls (norethindrone, ciprofloxacin) rank top-3 —
connectivity != therapeutic relatedness.

v1.1 is post-hoc/exploratory, not a confirmatory test; reported as such in
recovery_test_report.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
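
The `spec_z` statistic named in the commit message can be sketched roughly as below. This is a minimal illustration and not the code in `src/scoring.py`: the function names, the `seed` argument, and the stand-in mean-difference score (used here in place of the KS statistic) are all assumptions. Only the idea comes from the commit message: compare the real query against 1,000 random queries of the same size for the same drug, and report the distance in standard deviations.

```python
import numpy as np

def mean_diff_connectivity(drug_profile, up_idx, down_idx):
    # Stand-in score (an assumption, NOT the KS statistic in src/scoring.py):
    # negative when the drug lowers disease-up genes and raises disease-down genes.
    return float(drug_profile[up_idx].mean() - drug_profile[down_idx].mean())

def spec_z(drug_profile, up_idx, down_idx, n_null=1000, seed=0,
           score=mean_diff_connectivity):
    """Z-score of the observed connectivity against random same-size queries."""
    rng = np.random.default_rng(seed)
    observed = score(drug_profile, up_idx, down_idx)
    n_genes = drug_profile.shape[0]
    n_up, n_down = len(up_idx), len(down_idx)
    null = np.empty(n_null)
    for i in range(n_null):
        # Draw a random query with the same up/down sizes as the real one.
        pick = rng.choice(n_genes, size=n_up + n_down, replace=False)
        null[i] = score(drug_profile, pick[:n_up], pick[n_up:])
    # Negative values mean the real query reverses more than random queries do.
    return (observed - null.mean()) / null.std(ddof=1)
```

Under this sign convention a strongly negative `spec_z` marks a drug-specific reverser, which matches how the report reads values such as −3.80 (top rank) and +0.98 (L-glutamine, no reversal).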
@@ -61,9 +61,10 @@ Reproduce with `scripts/week1_explore.py` (download + DE + concordance) then
 38%, as expected). 43 drugs carry target annotations; 46 carry mechanism-of-action.
 - **Tier:** all signature-backed drugs are Tier B (LINCS is a single source → fails Tier A's
 not-single-source rule).
-- **Signature↔landmark overlap:** only 56/477 (12%) of the disease signature genes are LINCS
-landmarks, so connectivity scoring (Week 3) uses a 30-up/26-down query. The erythroid hallmark
-genes (CA1, AHSP, SLC4A1, HBG) are NOT landmarks. This is a key limitation for the recovery test.
+- **Gene space (v1.1):** scoring uses the full **12,328-gene** LINCS space, not just the 978
+landmarks. Signature overlap is 406/477 (85%) vs 56/477 (12%) for landmark-only — the larger
+space is what recovers hydroxyurea (see recovery_test_report.md). HBG1/HBG2 are absent from
+LINCS entirely and remain unscoreable.
 - Reproduce: `week2_curate_drugset.py` → `week2_chembl.py` → download Level-5 GCTX →
 `week2_lincs_extract.py` → `week2_assemble.py`.

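
The overlap figures quoted in this hunk (56/477 and 406/477) are plain set intersections between the signature genes and the scorable gene space. A minimal sketch follows; the helper name and the toy gene lists are hypothetical (the `INFLAM_*` and `OTHER_1` entries are placeholders, not real gene symbols), and only the counts in the final comments come from the document.

```python
def signature_overlap(signature_genes, gene_space):
    """Return (n_scorable, n_signature, fraction) for a signature vs a gene space."""
    signature = set(signature_genes)
    scorable = signature & set(gene_space)
    return len(scorable), len(signature), len(scorable) / len(signature)

# Toy lists (hypothetical): erythroid markers are missing from the small space.
toy_signature = ["CA1", "AHSP", "SLC4A1", "INFLAM_1", "INFLAM_2"]
toy_landmarks = ["INFLAM_1", "INFLAM_2", "OTHER_1"]
toy_full_space = toy_landmarks + ["CA1", "AHSP", "SLC4A1"]

landmark_hits = signature_overlap(toy_signature, toy_landmarks)
full_hits = signature_overlap(toy_signature, toy_full_space)

# Percentages reported in the document, recomputed from its own counts.
landmark_pct = round(100 * 56 / 477)    # 12
full_space_pct = round(100 * 406 / 477)  # 85
```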
@@ -34,20 +34,28 @@ Source: PLAN.md §9.
 7. **Top-ranked novel candidates are not wet-lab validated.** They are computational hypotheses
 to test, not discoveries. Use careful language in any write-up.

-8. **Only 12% of the signature is LINCS-scorable (56/477 genes).** The 978 landmark genes (from
-cancer cell lines) miss the erythroid hallmark genes (CA1, AHSP, SLC4A1, HBG). Connectivity
-scoring runs on a thin inflammation/metabolic slice — the single biggest driver of the
-recovery-test failure. v2 fix: signature prediction or a mechanism graph to score the other 88%.
+8. **Gene-space bottleneck (v1 → fixed in v1.1).** v1 scored on only the 978 landmark genes (12%
+signature overlap) — the main driver of the v1 failure. v1.1 uses the full 12,328-gene space
+(85% overlap) and recovers hydroxyurea. HBG1/HBG2 remain absent from LINCS entirely.

-## Recovery test outcome (Week 4)
+9. **No reference-signature library for tau.** Textbook CMap tau saturated at ±100 (a coherent
+query always out-connects random gene sets). v1.1 substitutes a per-drug specificity z-score.
+Proper tau needs a library of real reference signatures — a v2 / curated-data item.

-The MVP **failed** all three pre-registered criteria on the primary raw ranking (hydroxyurea
-rank 40/top 13%; L-glutamine rank 100/WTCS=0; 1/5 negative controls in bottom half). The failure
-is fully attributable to signature/assay data limitations above, not the matching algorithm. See
+10. **Negative-control criterion may be invalid for connectivity scoring.** Unrelated drugs
+(norethindrone, ciprofloxacin) rank as top specific reversers — connectivity measures
+expression reversal, not therapeutic relatedness.
+
+## Recovery test outcome
+
+Pre-registered test (**v1, confirmatory**): **FAILED** all three criteria (hydroxyurea rank
+40/top 13%; L-glutamine rank 100; 1/5 negative controls bottom-half). Post-hoc (**v1.1,
+exploratory**): hydroxyurea recovers to rank 18 (top 6%, passes), but L-glutamine (rank 213, does
+not reverse) and negative controls (2/5) still fail → overall still FAIL. See
 `recovery_test_report.md`.

-| Drug | Issue | Handling |
+| Drug | Issue | v1.1 status |
 |---|---|---|
-| hydroxyurea | HbF mechanism not in scorable gene space | scored (rank 40); recovered only by prior-weighted ranking |
-| L-glutamine | signature present but WTCS ambiguous (=0) | scored (rank 100); no reversal signal |
-| all 300 | had LINCS signatures | 0 marked "not scored" — coverage was not the issue; specificity was |
+| hydroxyurea | needed the full gene space | rank 18 (top 6%) — recovered post-hoc |
+| L-glutamine | metabolite, no reversal signal (positive connectivity) | rank 213 — genuine negative |
+| neg controls | reverse the generic inflammation signature | 2/5 bottom-half — criterion questionable |
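
The pass/fail outcomes summarized in this hunk can be re-derived from the reported ranks. The sketch below is illustrative only: the function name and return structure are hypothetical, and the L-glutamine criterion is reduced to the rank threshold because the report states that L-glutamine has a LINCS signature (so the "missing signature" clause does not apply). The ranks are the ones quoted in `recovery_test_report.md`.

```python
def recovery_test(hydroxyurea_rank, glutamine_rank, control_ranks, n_drugs=300):
    """Evaluate the three recovery criteria against a ranked list of n_drugs."""
    hu_pass = hydroxyurea_rank <= 0.10 * n_drugs   # top 10% (rank <= 30)
    lg_pass = glutamine_rank <= 0.25 * n_drugs     # top 25% (rank <= 75)
    n_bottom = sum(rank > n_drugs / 2 for rank in control_ranks)
    nc_pass = n_bottom >= 4                        # at least 4 of 5 in bottom half
    return {
        "hydroxyurea": hu_pass,
        "l_glutamine": lg_pass,
        "controls_bottom_half": n_bottom,
        "overall": hu_pass and lg_pass and nc_pass,
    }

# Control order: clotrimazole, astemizole, azithromycin, ethinyl-estradiol, caffeine.
v1 = recovery_test(40, 100, [89, 291, 82, 98, 84])
v1_1 = recovery_test(18, 213, [258, 211, 142, 114, 77])
```

With these inputs `v1` fails all three criteria (1/5 controls in the bottom half), while `v1_1` passes only the hydroxyurea criterion (2/5 controls), so both remain an overall fail.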
@@ -1,7 +1,7 @@
 # Sickle Cell Repurposing — Recovery Test Report

-> **Status: COMPLETE.** Reproduce with `scripts/week1_*` → `week2_*` → `week3_scoring.py` →
-> `week4_recovery_test.py`. ~2 pages, for a sceptical pharma scientist.
+> **Status: COMPLETE (v1 confirmatory + v1.1 exploratory).** Reproduce with `scripts/week1_*` →
+> `week2_*` → `week3_scoring.py` → `week4_recovery_test.py`. ~2 pages, for a sceptical pharma scientist.

 ## Pre-registered success criteria

@@ -12,118 +12,116 @@ The MVP passes if:
 missing LINCS signature, **AND**
 - At least **4 of 5** negative-control drugs rank in the **bottom half**.

-_Pre-registered in the scaffold commit (`b731478`) before any scoring was run. Primary ranking
-= raw connectivity. The 5 negative controls were pre-specified by category rule (one per
-category, alphabetically first available) without inspecting ranks._
+_Pre-registered in the scaffold commit (`b731478`) before any scoring. **Primary (confirmatory)
+analysis = v1**: 978 landmark genes, weighted connectivity score (WTCS). The 5 negative controls
+were pre-specified by category rule without inspecting ranks._

 ---

 ## Section 1 — Methodology

-We built a sickle cell disease signature from **two independent whole-blood microarray studies**
-(GSE35007, Illumina, SS vs AA; GSE16728, Affymetrix, patient vs control), keeping the **671
-genes concordant** (q<0.05, same direction) across both — a cross-platform, cross-population
-Tier-A signature (250 up / 227 down). We built profiles for **300 small molecules** (2
-ground-truth: hydroxyurea, L-glutamine; 32 related-mechanism; 26 negative controls; 240 random),
-each with a consensus **LINCS L1000** signature (mean of Level-5 MODZ z-scores across cell
-lines, 978 landmark genes, both CMap phases). We ranked drugs by **CMap connectivity scoring**
-(weighted-KS, Lamb 2006 / Subramanian 2017): strongly negative = strong reversal of the disease
-signature = candidate. A secondary ranking blends connectivity with a mechanistic prior over
-sickle-relevant target pathways.
+A sickle cell disease signature was built from **two whole-blood microarray studies** (GSE35007
+Illumina SS-vs-AA; GSE16728 Affymetrix patient-vs-control), keeping the **671 genes concordant**
+across both (q<0.05, same direction) → a cross-platform Tier-A signature (250 up / 227 down).
+Profiles were built for **300 small molecules** (2 ground-truth; 32 related-mechanism; 26
+negative controls; 240 random), each with a **LINCS L1000** consensus signature (mean Level-5
+MODZ across cell lines, both CMap phases). Drugs were ranked by **CMap connectivity scoring**
+(Kolmogorov-Smirnov, Lamb 2006 / Subramanian 2017): negative = reversal = candidate.

-## Section 2 — Recovery test result — **FAIL** (primary ranking)
+**v1 (pre-registered/confirmatory):** scored on the 978 landmark genes with WTCS.
+**v1.1 (post-hoc/exploratory):** after v1 failed, two changes were made to diagnose why — (a)
+score on the full **12,328-gene** space (landmark overlap 12% → 85%, bringing the erythroid
+markers in); (b) add a **per-drug specificity z-score** (`spec_z`): how many SDs the real
+connectivity is below a null of 1,000 random queries of the same size against that drug. Because
+these changes followed inspection of the v1 result, **v1.1 is exploratory, not a confirmatory
+test of the pre-registered hypothesis.**

-| Drug | Rank | Percentile | Pass? |
-|---|---|---|---|
-| Hydroxyurea | 40 / 300 | top 13.3% | ❌ (needs top 30) |
-| L-glutamine | 100 / 300 | top 33.3% | ❌ (WTCS=0, ambiguous; has a signature so not "missing") |
+## Section 2 — Recovery test result

-Negative controls (pre-specified; expected: bottom half):
+| Criterion | v1 (confirmatory) | v1.1 (exploratory) |
+|---|---|---|
+| Hydroxyurea top-10% (≤30) | rank **40** (13.3%) ❌ | rank **18** (6.0%) ✅ |
+| L-glutamine top-25% (≤75) | rank 100, WTCS=0 ❌ | rank 213, spec_z=+0.98 ❌ |
+| ≥4/5 neg controls bottom-half | 1/5 ❌ | 2/5 ❌ |
+| **Overall** | **FAIL** | **FAIL** (but hydroxyurea recovered) |

-| Control | Category | Rank | Bottom half? |
-|---|---|---|---|
-| clotrimazole | antifungal | 89 | ❌ |
-| astemizole | antihistamine | 291 | ✅ |
-| azithromycin | antibiotic | 82 | ❌ |
-| ethinyl-estradiol | hormone | 98 | ❌ |
-| caffeine | misc | 84 | ❌ |
+v1.1 negative controls: clotrimazole 258 ✅, astemizole 211 ✅, azithromycin 142 ❌,
+ethinyl-estradiol 114 ❌, caffeine 77 ❌.

-**Only 1/5 negative controls in the bottom half (need ≥4).**
+**Honest reading.** The **pre-registered test FAILED (v1).** The post-hoc v1.1 changes
+**recover hydroxyurea** (rank 40 → 18, passing top-10%) — strong evidence that the v1 failure was
+driven by the 978-landmark bottleneck, not the algorithm. But two failures survive into v1.1, and
+both are now precisely diagnosed:

-**Overall: FAIL on all three pre-registered criteria.** This is reported as-is, without
-adjustment. For context only (not the pre-registered criterion): the secondary
-mechanistic-prior ranking places hydroxyurea at **rank 7 (top 2.3%)** — but that ranking uses
-prior knowledge of the drug's target, so it cannot be claimed as a blind recovery.
+1. **L-glutamine does not reverse the signature** (positive connectivity, spec_z=+0.98). This is
+intrinsic to its LINCS data — a metabolite with no reversal signal — not a coverage gap. More
+genes cannot fix it.
+2. **The negative-control criterion is arguably invalid for connectivity scoring.** Two
+"negative controls" (norethindrone, ciprofloxacin) rank in the top 3 by spec_z. Connectivity
+measures *expression reversal*, not *therapeutic relatedness* — an antibiotic or contraceptive
+can still down-regulate the inflammation genes that dominate the scorable signature. The test
+design conflates the two.

-**Why it failed — the honest diagnosis.** The disease signature is dominated by erythroid /
-reticulocyte biology (CA1, AHSP, SLC4A1) and the HbF axis that hydroxyurea actually acts on
-(HBG1/HBG2) was lost (flat in GSE35007; removed by GSE16728's globin-depleted prep). Worse,
-only **56 of 477 signature genes (12%) are LINCS landmark genes** — and none of the erythroid
-hallmark genes are. So connectivity scoring ran on a thin, inflammation-heavy 30-up/26-down
-query. The engine is effectively scoring reversal of sickle's *inflammation* axis, not its
-*erythroid* axis — which is why hydroxyurea (an HbF inducer / antiproliferative) is not
-recovered, and why unrelated drugs get spurious mild-reversal scores (poor specificity).
+A note on the calibration: textbook CMap **tau** (percentile vs a reference population) was
+implemented but **saturated at ±100** here, because a coherent real query always out-connects
+random gene sets — proper tau needs a library of *real* reference signatures, which this MVP
+lacks. The continuous `spec_z` is the workable substitute.

-## Section 3 — Top 10 candidates (raw connectivity)
+## Section 3 — Top 10 candidates (v1.1 spec_z)

-| Rank | Drug | Score | Known target / mechanism | Plausibility |
+| Rank | Drug | spec_z | Inclusion | Read |
 |---|---|---|---|---|
-| 1 | laropiprant | −0.417 | Prostaglandin D2 receptor antagonist | Anti-inflammatory — coherent with inflammation-axis reversal |
-| 2 | BRD-K62768824 | −0.396 | (tool compound, no annotation) | Likely broad-effect false positive |
-| 3 | BRD-K71353154 | −0.393 | (tool compound) | Likely false positive |
-| 4 | lisinopril | −0.358 | ACE inhibitor | **Non-obvious; see §4** |
-| 5 | BRD-K53443165 | −0.358 | (tool compound) | Likely false positive |
-| 6 | talnetant | −0.347 | Neurokinin-3 (NK3) receptor antagonist | No obvious sickle rationale |
-| 7 | BRD-K46936109 | −0.342 | (tool compound) | Likely false positive |
-| 8 | lawsone | −0.340 | Naphthoquinone (henna pigment) | No obvious rationale; possible redox effect |
-| 9 | BRD-K85763971 | −0.338 | (tool compound) | Likely false positive |
-| 10 | BRD-K36516410 | −0.323 | (tool compound) | Likely false positive |
+| 1 | reserpic-acid | −3.80 | random | reserpine metabolite; non-obvious |
+| 2 | norethindrone | −3.78 | **negative control** | false positive (see §2) |
+| 3 | ciprofloxacin | −3.61 | **negative control** | false positive |
+| 4 | resveratrol | −3.46 | related-mechanism | antioxidant studied in SCD — coherent |
+| 5 | BRD-K57490754 | −3.37 | random | tool compound |
+| 6 | anastrozole | −3.27 | random | aromatase inhibitor |
+| 7–10 | BRD-* / palmitoylethanolamide | ~−3.1 | random | mostly tool compounds |

-As anticipated (PLAN §9.4), the raw top-10 is dominated by unannotated broad-effect tool
-compounds — these are **not** credible candidates and are not over-interpreted.
+That two negative controls outrank hydroxyurea is the single most informative result here — see §4.

-## Section 4 — One non-obvious candidate worth investigating
+## Section 4 — One non-obvious result worth investigating

-**Lisinopril (ACE inhibitor), rank 4.** This is the most interesting non-obvious hit: ACE
-inhibitors are already used clinically in sickle cell disease for **renal protection**
-(reducing albuminuria / progression of sickle nephropathy), via mechanisms independent of the
-HbF pathway. Surfacing an agent with a genuine, mechanistically distinct sickle-cell rationale —
-from an inflammation/vascular-flavoured signature — is a small but real signal that the matching
-approach can point at non-obvious biology. **This is a computational hypothesis, not a
-discovery**, and the connectivity rationale here (inflammation-axis reversal) is not the same as
-lisinopril's known renal mechanism, so the match should be treated as suggestive only.
+The most useful finding is **not** a candidate drug but the **negative-control failure**:
+unrelated drugs (norethindrone, ciprofloxacin) score as strong specific reversers. This is a
+real, generalizable lesson — for a signature whose *scorable* portion is generic
+inflammation/metabolic genes, connectivity rewards any broad transcriptional perturbation that
+touches those genes. The honest implication: **this signature is not specific enough to
+discriminate true repurposing candidates from incidental expression reversers.** Of the
+plausibly-real hits, **resveratrol (rank 4)** — an antioxidant with prior sickle cell literature
+— is the most defensible, but it is a hypothesis, not a discovery.

 ## Section 5 — Honest limitations

-1. **Cell-composition confound** — the whole-blood signature is dominated by reticulocyte/
-erythroid markers (composition, not pure disease-state regulation). v2 needs deconvolution.
-2. **Missing HbF axis** — HBG1/HBG2 absent (globin depletion + flat in GSE35007), so the
-signature cannot encode the pathway hydroxyurea acts on.
-3. **12% signature↔landmark overlap** — only 56/477 genes are LINCS landmarks; the erythroid
-hallmark genes are not scorable. The query collapses to a generic inflammation/metabolic slice.
-4. **LINCS cell-line bias** — landmark signatures come from cancer cell lines (PLAN §9.2); poorly
-suited to a blood disease.
-5. **Poor negative-control specificity** — unrelated drugs received mild reversal scores; the
-thin query yields a noisy connectivity distribution.
-6. **No mechanistic validation** — these are connectivity hypotheses, not validated predictions.
+1. **Pre-registered test failed; the pass is post-hoc.** v1.1's hydroxyurea recovery is
+exploratory and must be re-validated on a held-out disease before any claim is made.
+2. **Missing HbF axis** — HBG1/HBG2 are absent from LINCS entirely (not just landmarks), so the
+pathway hydroxyurea acts on can never be scored by this method.
+3. **Signature specificity** — scorable genes are inflammation/metabolic; negative controls
+reverse them too. Connectivity ≠ therapeutic relatedness.
+4. **Cell-composition confound** — the whole-blood signature is reticulocyte-dominated.
+5. **LINCS cancer-cell-line bias**, and **no reference-signature library** for proper tau.
+6. **No mechanistic validation** — all hits are computational hypotheses.

 ## Section 6 — What v2 would fix

-- **Cell-type deconvolution** of the disease signature to separate disease-state regulation from
-composition, recovering specificity.
-- **A non-globin-depleted, RNA-seq whole-blood study** to retain the HbF axis.
-- **Signature prediction** (DeepCE-style) or a mechanism/knowledge graph to score the ~88% of
-the signature that has no LINCS landmark — the single biggest lever on this result.
-- **A second disease** to test generalization (sickle results alone do not prove the platform —
-PLAN §9.5).
+- **A reference-signature library** to make tau (proper specificity calibration) work — the
+single biggest fix to the negative-control problem, and a direct use of the curated-data moat.
+- **Cell-type deconvolution** + a non-globin-depleted RNA-seq study to recover a more specific,
+HbF-containing signature.
+- **Signature prediction / mechanism graph** to score genes with no LINCS measurement.
+- **A second disease** to test generalization and to honestly re-validate the v1.1 method
+(PLAN §9.5).

 ---

 ### Bottom line

-The pipeline is reproducible end-to-end and the method is sound, but on this signature it **does
-not recover the known sickle cell drugs**. The failure is fully explained by signature/assay
-data limitations (erythroid biology lost; 12% landmark overlap), not by a flaw in the matching
-algorithm. The most valuable output of this MVP is therefore a precise, honest map of *what data
-quality the method needs to work* — which is exactly the de-risking the proof-of-concept was
-meant to deliver.
+The pre-registered recovery test **failed**. Post-hoc diagnosis shows the dominant cause was a
+fixable gene-space bottleneck — correcting it **recovers hydroxyurea** — but also surfaces a
+deeper, genuine limitation: this whole-blood signature is **not specific enough** for
+connectivity scoring to separate real candidates from incidental reversers (negative controls
+rank at the top). The MVP's real deliverable is a precise, honest map of *what it takes to make
+this method work*: a more specific (deconvolved, HbF-containing) signature and a reference library
+for calibration — exactly the curated-data investments the platform thesis is built on.
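
For readers unfamiliar with the connectivity score referred to in Section 1 of the report, the following is a rough sketch of the classic unweighted Kolmogorov-Smirnov form described in Lamb 2006. It is not the repository's `src/scoring.py` and not the weighted variant (WTCS) used in v1; the function names and the index-array interface are assumptions made for illustration. One detail worth noting is the rule that returns 0 when the up and down enrichment statistics share a sign, which is one plausible way to read a score of exactly 0 such as the one reported for L-glutamine in v1.

```python
import numpy as np

def ks_enrichment(ranks, n_genes):
    """Signed KS statistic for a gene set, given 1-based ranks in the drug list."""
    v = np.sort(np.asarray(ranks, dtype=float))
    t = len(v)
    j = np.arange(1, t + 1)
    a = np.max(j / t - v / n_genes)        # set concentrated near the top
    b = np.max(v / n_genes - (j - 1) / t)  # set concentrated near the bottom
    return a if a > b else -b

def connectivity_score(drug_z, up_idx, down_idx):
    """Negative = drug reverses the disease signature; 0 = ambiguous."""
    n_genes = len(drug_z)
    order = np.argsort(-np.asarray(drug_z))  # most up-regulated gene first
    rank = np.empty(n_genes, dtype=int)
    rank[order] = np.arange(1, n_genes + 1)
    ks_up = ks_enrichment(rank[up_idx], n_genes)
    ks_down = ks_enrichment(rank[down_idx], n_genes)
    if np.sign(ks_up) == np.sign(ks_down):
        return 0.0  # both sets move the same way: no coherent (anti-)connection
    return float(ks_up - ks_down)
```

A drug that pushes disease-up genes to the bottom of its ranked list and disease-down genes to the top yields a strongly negative score, which is the "reversal = candidate" reading used throughout the report.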