Run the formal recovery test against the pre-registered criteria and write the deliverable report (PLAN §6 Week 4):

- week4_recovery_test.py: evaluate hydroxyurea/L-glutamine + 5 pre-specified negative controls vs the committed criteria
- recovery_test_report.md: methodology, FAIL result with diagnosis, top-10, lisinopril as the non-obvious candidate, limitations, v2
- known_limitations.md: L-glutamine coverage resolved, 12%-overlap driver, recovery outcome table

Outcome: FAIL on all 3 criteria (hydroxyurea top 13%, L-glutamine WTCS=0, 1/5 negative controls bottom-half). Root cause is signature/assay data limitations (lost erythroid+HbF axis, 12% landmark overlap), not the matching algorithm; reported straight per the project ethos.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Known Limitations
The honest list of what would break this MVP at scale or in a different disease. Useful for the next pharma conversation: "yes, we know these are limitations, here's how v2 addresses them." Source: PLAN.md §9.
- Cell-composition confound in sickle cell expression data. Whole-blood differential expression partly reflects different blood cell ratios, not disease biology. v1 acknowledges this; v2 should deconvolve cell types.
- LINCS L1000 cell-line limitations. The 978 landmark genes were measured mostly in cancer cell lines (MCF7, A375, PC3, …). Signatures for non-oncology diseases may be noisy. A field-wide limitation, not unique to Reverso.
- L-glutamine LINCS coverage: RESOLVED, opposite of expected. L-glutamine does have a Phase I signature (hydroxyurea is Phase-II-only), so both ground-truth drugs are scorable. But L-glutamine's connectivity is ambiguous (WTCS=0): its up- and down-set enrichments share a sign, so it shows no reversal, and it ranks 100/300. The ground-truth test therefore effectively rests on hydroxyurea, which itself only reaches the top 13% on the raw ranking; see the recovery test report.
- Connectivity scoring surfaces broad-effect drugs as false positives. HDAC inhibitors and broad kinase inhibitors often top connectivity rankings simply because they perturb many genes. The mechanistic prior (Week 3) helps filter, but does not eliminate this.
- Hydroxyurea was expected to pass the recovery test by construction. Sickle cell + hydroxyurea is a well-studied pair, so passing would have been necessary but not sufficient to claim the platform generalizes. In the event, hydroxyurea did not pass on the raw ranking (see "Recovery test outcome (Week 4)" in this file). The next disease is still the real test: do not sell sickle cell results as proving the platform.
- No mechanistic validation layer. Pure ML matching is not sufficient for extrapolation (flagged by multiple experts). The MVP knowingly omits the mechanistic layer; it is a phase-2 addition. Position the MVP as "discovery hypothesis generation," not "validated prediction."
- Top-ranked novel candidates are not wet-lab validated. They are computational hypotheses to test, not discoveries. Use careful language in any write-up.
- Only 12% of the signature is LINCS-scorable (56/477 genes). The 978 landmark genes (from cancer cell lines) miss the erythroid hallmark genes (CA1, AHSP, SLC4A1, HBG). Connectivity scoring therefore runs on a thin inflammation/metabolic slice, which is the single biggest driver of the recovery-test failure. v2 fix: signature prediction or a mechanism graph to score the other 88%.
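The 12% figure in the last item is a set overlap between the disease signature and the landmark gene list. A minimal sketch of that calculation follows; the function name `scorable_fraction` and the `GENE_*` placeholder symbols are hypothetical and are not taken from the project code. Only the erythroid gene names and the 56/477 count come from this file.

```python
def scorable_fraction(signature_genes, landmark_genes):
    """Return (n_scorable, n_signature, fraction) for a disease signature.

    A signature gene counts as scorable only if it is also a landmark gene.
    """
    signature = set(signature_genes)
    if not signature:
        raise ValueError("signature is empty")
    scorable = signature & set(landmark_genes)
    return len(scorable), len(signature), len(scorable) / len(signature)


# Toy illustration only: the erythroid hallmark genes named in this file,
# plus hypothetical placeholder symbols standing in for landmark genes.
toy_signature = ["CA1", "AHSP", "SLC4A1", "HBG", "GENE_A", "GENE_B"]
toy_landmarks = ["GENE_A", "GENE_B", "GENE_C"]

n_hit, n_sig, frac = scorable_fraction(toy_signature, toy_landmarks)
print(f"toy example: {n_hit}/{n_sig} scorable ({frac:.0%})")

# The reported figure: 56 of 477 signature genes are landmark genes.
print(f"reported: 56/477 = {56 / 477:.1%}")
```

In the toy example the four erythroid genes drop out because none of them is in the landmark list, which is the same mechanism the item describes at full scale.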
Recovery test outcome (Week 4)
The MVP failed all three pre-registered criteria on the primary raw ranking (hydroxyurea rank 40, top 13%; L-glutamine rank 100, WTCS=0; 1/5 negative controls in the bottom half). The failure is fully attributable to the signature/assay data limitations above, not the matching algorithm. See recovery_test_report.md.
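The rank figures quoted above can be checked with simple arithmetic. The sketch below is illustrative only: the helper names are hypothetical, "bottom half" is assumed to mean a rank strictly greater than half the list length, and the pass thresholds themselves are not restated in this file, so none are encoded here.

```python
def rank_percentile(rank, n_scored):
    """Fraction of the ranked list at or above `rank` (rank 1 is best)."""
    if not 1 <= rank <= n_scored:
        raise ValueError("rank must be between 1 and n_scored")
    return rank / n_scored


def in_bottom_half(rank, n_scored):
    """Assumed definition: the rank falls below the midpoint of the list."""
    return rank > n_scored / 2


N_SCORED = 300  # all 300 drugs had LINCS signatures

for drug, rank in [("hydroxyurea", 40), ("L-glutamine", 100)]:
    pct = rank_percentile(rank, N_SCORED)
    print(f"{drug}: rank {rank}/{N_SCORED} = top {pct:.0%}")
```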
| Drug | Issue | Handling |
|---|---|---|
| hydroxyurea | HbF mechanism not in scorable gene space | scored (rank 40); recovered only by prior-weighted ranking |
| L-glutamine | signature present but WTCS ambiguous (=0) | scored (rank 100); no reversal signal |
| all 300 drugs | all had LINCS signatures | 0 marked "not scored"; coverage was not the issue, specificity was |
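The WTCS=0 entry for L-glutamine follows from the sign rule in the weighted connectivity score. A minimal sketch, assuming the project uses the standard CMap definition (the average of the two enrichment scores when they have opposite signs, zero otherwise); the function name `wtcs` and the numeric inputs are hypothetical illustration values, not results from the recovery test.

```python
def wtcs(es_up, es_down):
    """Weighted connectivity score from two enrichment scores.

    es_up:   enrichment of the disease up-regulated set in the drug profile
    es_down: enrichment of the disease down-regulated set in the drug profile

    If both scores share a sign the result is ambiguous and set to 0.
    Otherwise the score is (es_up - es_down) / 2, so a drug that pushes
    disease-up genes down and disease-down genes up gets a negative score.
    """
    if es_up * es_down > 0:  # same sign: no coherent direction
        return 0.0
    return (es_up - es_down) / 2


# Hypothetical values for illustration only.
print(wtcs(-0.5, 0.25))  # opposite signs, reversal: -0.375
print(wtcs(0.25, 0.5))   # same sign, ambiguous: 0.0
```

Under this rule a score of exactly 0 says nothing about the size of either enrichment, only that the two disagree on direction, which is why the table records L-glutamine as "no reversal signal" rather than as a weak hit.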