Set up the project skeleton per PLAN.md §4:

- src/ package: identifiers, disease, drugs, scoring, provenance with pydantic schemas and confidence-tier logic (working); data-pull/compute functions stubbed per their build week
- 5 starter notebooks (01-05) with PLAN-referenced steps
- tests/test_scoring.py: tier-assignment tests pass; scoring reference test xfail until Week 3
- docs/: recovery_test_report, data_sources, known_limitations skeletons
- pyproject.toml (requires-python >=3.11,<3.14), .gitignore, README
- data/ tree preserved via .gitkeep; raw/processed/results gitignored

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
50 lines
1.5 KiB
Plaintext
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 02 \u2014 Disease signature\n",
    "\n",
    "Week 1 (PLAN.md \u00a76). Pull Open Targets + a GEO expression study, run differential expression, and build `sickle_cell_signature_v1.json` (Tier A) with full provenance.\n\nSteps: (1) Open Targets associations, (2) choose + download GEO dataset, (3) differential expression, (4) build + persist signature.\n\n**Pitfall to document:** whole-blood expression is partly driven by cell-composition differences, not disease state (PLAN.md \u00a79.1)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys\n",
    "sys.path.insert(0, '..')  # import the src package from notebooks/"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from src import disease\n",
    "from src.provenance import ConfidenceTier\n",
    "\n",
    "# Step 1: Open Targets associations for MONDO:0011382 -> data/raw/open_targets/\n",
    "# Step 2: choose + download GEO study (GSE53441 / GSE35007 / newer) -> data/raw/geo/\n",
    "# Step 3: disease.compute_differential_expression(...)\n",
    "# Step 4: disease.build_signature(...) then disease.persist_signature(...)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
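
Steps 3 and 4 of the notebook (differential expression, then signature selection) are still placeholders, so a rough sketch may help show the shape of the computation. The commit stubs the project's own `disease.compute_differential_expression` and `disease.build_signature`, and their signatures are not shown here, so every function name, threshold, gene label, and data value below is hypothetical. The sketch uses a per-gene Welch t-statistic on log2-scale values, which is one simple option and not necessarily the method PLAN.md calls for. It also does nothing about the cell-composition pitfall noted in the notebook.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two samples with unequal variances.

    Assumes each group has at least two values and that the combined
    standard error is nonzero.
    """
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def differential_expression(expr, cases, controls):
    """Return {gene: (log2 fold change, Welch t)}.

    expr maps gene -> {sample_id: log2 expression}, so the log2 fold
    change is simply the difference of group means.
    """
    results = {}
    for gene, values in expr.items():
        case_vals = [values[s] for s in cases]
        ctrl_vals = [values[s] for s in controls]
        log2fc = mean(case_vals) - mean(ctrl_vals)
        results[gene] = (log2fc, welch_t(case_vals, ctrl_vals))
    return results


def select_signature(results, min_abs_log2fc=1.0, min_abs_t=3.0):
    """Split genes into up/down lists using illustrative thresholds."""
    up = sorted(
        g for g, (fc, t) in results.items()
        if fc >= min_abs_log2fc and t >= min_abs_t
    )
    down = sorted(
        g for g, (fc, t) in results.items()
        if fc <= -min_abs_log2fc and t <= -min_abs_t
    )
    return {"up": up, "down": down}


# Synthetic toy data (not from any GEO study): three cases, three controls.
expr = {
    "GENE_A": {"c1": 8.0, "c2": 8.4, "c3": 8.2, "n1": 6.0, "n2": 6.1, "n3": 5.9},
    "GENE_B": {"c1": 4.0, "c2": 4.2, "c3": 3.8, "n1": 6.1, "n2": 5.9, "n3": 6.0},
    "GENE_C": {"c1": 7.0, "c2": 7.1, "c3": 6.9, "n1": 7.0, "n2": 6.9, "n3": 7.1},
}
results = differential_expression(expr, ["c1", "c2", "c3"], ["n1", "n2", "n3"])
signature = select_signature(results)
print(signature)
```

With real data, a multiple-testing correction and a covariate for cell composition would likely be needed before any gene list is persisted as a Tier A signature.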