Week 4: recovery test (FAIL, reported honestly) + 2-page report

Run the formal recovery test against the pre-registered criteria and
write the deliverable report (PLAN §6 Week 4):
- week4_recovery_test.py: evaluate hydroxyurea/L-glutamine + 5
  pre-specified negative controls vs the committed criteria
- recovery_test_report.md: methodology, FAIL result with diagnosis,
  top-10, lisinopril as the non-obvious candidate, limitations, v2
- known_limitations.md: L-glutamine coverage resolved, 12%-overlap
  driver, recovery outcome table

Outcome: FAIL on all 3 criteria (hydroxyurea top 13%, L-glutamine
WTCS=0, 1/5 negative controls bottom-half). Root cause is signature/
assay data limitations (lost erythroid+HbF axis, 12% landmark overlap),
not the matching algorithm — reported straight per the project ethos.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 22:38:56 +02:00
parent fd4591949c
commit 72f1a49de6
3 changed files with 192 additions and 36 deletions


@@ -12,9 +12,11 @@ Source: PLAN.md §9.
cell lines (MCF7, A375, PC3, …). Signatures for non-oncology diseases may be noisy. A
field-wide limitation, not unique to Reverso.
3. **L-glutamine probably has no LINCS signature.** Amino acids and metabolites weren't LINCS
priorities. If true, the ground-truth test effectively rests on hydroxyurea alone, which is
weaker. _Status: TBD — record the actual finding here once LINCS is pulled (Week 2)._
3. **L-glutamine LINCS coverage — RESOLVED, opposite of expected.** L-glutamine DOES have a
Phase I signature (hydroxyurea is Phase-II-only) — both ground-truth drugs are scorable. But
L-glutamine's connectivity is **ambiguous (WTCS=0)**: its up- and down-set enrichments share
a sign, so it shows no reversal. It ranks 100/300. So the ground-truth test effectively rests
on hydroxyurea, which itself only reaches top 13% (raw) — see the recovery test report.
4. **Connectivity scoring surfaces broad-effect drugs as false positives.** HDAC inhibitors and
broad kinase inhibitors often top connectivity rankings simply because they perturb many
@@ -32,8 +34,20 @@ Source: PLAN.md §9.
7. **Top-ranked novel candidates are not wet-lab validated.** They are computational hypotheses
to test, not discoveries. Use careful language in any write-up.
## Drug-specific gaps (fill in during Week 2-3)
8. **Only 12% of the signature is LINCS-scorable (56/477 genes).** The 978 landmark genes (from
cancer cell lines) miss the erythroid hallmark genes (CA1, AHSP, SLC4A1, HBG). Connectivity
scoring runs on a thin inflammation/metabolic slice — the single biggest driver of the
recovery-test failure. v2 fix: signature prediction or a mechanism graph to score the other 88%.
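
To make items 3 and 8 concrete, here is a minimal sketch of the two quantities they rely on. It assumes the project uses the standard CMap definition of the weighted connectivity score (WTCS), which is zero whenever the up-set and down-set enrichment scores share a sign; the function names and the example enrichment values are hypothetical and not taken from `week4_recovery_test.py`.

```python
def sign(x: float) -> int:
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def wtcs(es_up: float, es_down: float) -> float:
    """Weighted connectivity score from up-/down-set enrichment scores.

    Assumed (standard CMap) rule: if both enrichment scores share a sign,
    the connection is ambiguous and the score is 0; otherwise it is half
    the difference. A negative value indicates reversal of the signature.
    """
    if sign(es_up) == sign(es_down):
        return 0.0
    return (es_up - es_down) / 2.0


def scorable_fraction(n_scorable: int, n_signature: int) -> float:
    """Fraction of signature genes that can be scored against LINCS."""
    return n_scorable / n_signature


# Hypothetical enrichment scores, chosen only to illustrate the two cases.
ambiguous = wtcs(0.4, 0.2)    # same sign, so no reversal signal
reversal = wtcs(-0.6, 0.4)    # opposite signs, so a negative (reversing) score

# Figures from item 8: 56 of 477 signature genes are scorable.
overlap = scorable_fraction(56, 477)
print(f"ambiguous={ambiguous}, reversal={reversal:.2f}, overlap={overlap:.1%}")
```

Under this rule an L-glutamine-style result (both enrichments pointing the same way) collapses to exactly 0 regardless of magnitude, which is why a WTCS of 0 carries no reversal information.
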
## Recovery test outcome (Week 4)
The MVP **failed** all three pre-registered criteria on the primary raw ranking (hydroxyurea
rank 40/top 13%; L-glutamine rank 100/WTCS=0; 1/5 negative controls in bottom half). The failure
is fully attributable to signature/assay data limitations above, not the matching algorithm. See
`recovery_test_report.md`.
| Drug | Issue | Handling |
|---|---|---|
| TBD | e.g. no LINCS signature | flagged "not scored, no signature available" |
| hydroxyurea | HbF mechanism not in scorable gene space | scored (rank 40); recovered only by prior-weighted ranking |
| L-glutamine | signature present but WTCS ambiguous (=0) | scored (rank 100); no reversal signal |
| all 300 | had LINCS signatures | 0 marked "not scored" — coverage was not the issue; specificity was |
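
The rank figures quoted above can be checked with a small sketch. It assumes "top N%" means rank divided by the number of ranked drugs, and that "bottom half" means a rank strictly beyond the midpoint; the helper names are hypothetical, and the pre-registered pass thresholds are not restated here, so none are encoded.

```python
N_DRUGS = 300  # all 300 drugs had LINCS signatures and were ranked


def top_percent(rank: int, n: int = N_DRUGS) -> float:
    """Percentile position of a 1-based rank (smaller is better)."""
    return 100.0 * rank / n


def in_bottom_half(rank: int, n: int = N_DRUGS) -> bool:
    """True if the rank falls strictly past the midpoint of the list."""
    return rank > n / 2


# Ranks reported in the table above.
ranks = {"hydroxyurea": 40, "L-glutamine": 100}

for drug, rank in ranks.items():
    print(f"{drug}: rank {rank}/{N_DRUGS}, top {top_percent(rank):.0f}%, "
          f"bottom half: {in_bottom_half(rank)}")
```

The same `in_bottom_half` check would apply to each of the five negative controls once their ranks are filled in from the recovery test output.
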